IBM Site Reliability Engineer
India, Karnataka, Bengaluru
917243935

24.06.2024

Your Role and Responsibilities

Automation: Develop and maintain automation tools and scripts to streamline deployment, monitoring, and management of the infrastructure and applications.
Monitoring and Alerting: Set up and maintain monitoring and alerting systems to proactively identify and resolve issues before they impact customers or services. This includes participating in on-call rotations to respond promptly to high-priority incidents.
Performance Optimization: Identify opportunities for performance optimization and work with development teams to implement improvements.
Documentation: Maintain up-to-date documentation for the infrastructure, processes, and procedures.
Collaboration: Work closely with development teams, product managers, and other stakeholders to understand requirements and ensure the reliability of the platform.
Continuous Improvement: Participate in post-incident reviews, retrospectives, and other forums to identify areas for improvement and drive continuous improvement initiatives.

Required Technical and Professional Expertise

Strong Linux systems engineering background with CentOS/RHEL or Debian, including experience building, maintaining, and troubleshooting these systems.
Automation and Scripting: Strong scripting skills (e.g., Bash, Python) and experience with configuration management tools (e.g., Ansible, Chef, Puppet) to automate deployment and management tasks.
Excellent Git skills (merging, branching, forking).
Experience with Cloud Platforms: Strong experience with cloud platforms such as IBM, AWS, Azure, or Google Cloud Platform, including expertise in:
- Deploying and managing services in these environments.
- Managing and troubleshooting containerized applications.
Troubleshooting and Problem Solving: Strong troubleshooting skills and the ability to quickly identify and resolve complex issues in a production environment, including experience with incident response and post-incident analysis.

Preferred Technical and Professional Expertise

DevOps Culture: Experience working in a DevOps culture, including a strong understanding of how development and operations teams collaborate to achieve business goals.
Container Orchestration: Proficiency in container orchestration tools such as Nomad or Kubernetes, including experience with HashiCorp Consul/Vault or equivalents.
Monitoring and Logging: Experience with monitoring and logging tools (e.g., ELK stack, Grafana, Prometheus) to monitor the health and performance of infrastructure and applications, including experience building and maintaining these tools.
Security: Knowledge of implementing security best practices and maintaining compliance standards (Center for Internet Security (CIS) Benchmarks, FedRAMP).
Security: Ability to patch software or adjust configurations to mitigate Common Vulnerabilities and Exposures (CVE) in a timely fashion.
Experience with clustered time series database technologies such as InfluxDB as well as experience with distributed event streaming platforms using Kafka and Telegraf.
CI/CD: Experience with application deployment using CI/CD tools such as Jenkins and Tekton.
Working knowledge of GitHub, JIRA, Confluence, and ServiceNow.
