Expoint – all jobs in one place

IBM Network Observability SRE 
India, Karnataka, Bengaluru 
63674698

Today
Your role and responsibilities

Automation: Develop and maintain automation tools and scripts to streamline deployment, monitoring, and management of the infrastructure and applications.

Monitoring and Alerting: Set up and maintain monitoring and alerting systems to proactively identify and resolve issues before they impact customers or services. This includes participating in on-call rotations to respond promptly to high-priority incidents.

Performance Optimization: Identify opportunities for performance optimization and work with development teams to implement improvements.

Documentation: Maintain up-to-date documentation for the infrastructure, processes, and procedures.

Collaboration: Work closely with development teams, product managers, and other stakeholders to understand requirements and ensure the reliability of the platform.

Continuous Improvement: Participate in post-incident reviews, retrospectives, and other forums to identify areas for improvement and drive continuous improvement initiatives.

Required education
Bachelor's Degree
Preferred education
Master's Degree
Required technical and professional expertise

· Strong Linux systems engineering background with CentOS/RHEL or Debian, including experience building, maintaining, and troubleshooting these systems.

· Automation and Scripting: Strong scripting skills (e.g., Bash, Python) and experience with configuration management tools (e.g., Ansible, Chef, Puppet) to automate deployment and management tasks.

· Excellent Git skills (merges, branching, forking)

· Experience with Cloud Platforms: Strong experience with cloud platforms such as IBM, AWS, Azure, or Google Cloud Platform, including expertise in:

o Deploying and managing services in these environments.

o Managing and troubleshooting containerized applications.

· Troubleshooting and Problem Solving: Strong troubleshooting skills and the ability to quickly identify and resolve complex issues in a production environment, including experience with incident response and post-incident analysis.

Preferred technical and professional experience

Container Orchestration: Proficiency in container orchestration tools such as Nomad or Kubernetes, including experience with HashiCorp Consul/Vault or equivalents.

Monitoring and Logging: Experience with monitoring and logging tools (e.g., ELK stack, Grafana, Prometheus) to monitor the health and performance of infrastructure and applications, including experience building and maintaining these tools.

Security: Knowledge of implementing security best practices and maintaining compliance standards (Center for Internet Security (CIS) Benchmarks, FedRAMP).

Security: Ability to patch software or adjust configurations to mitigate Common Vulnerabilities and Exposures (CVE) in a timely fashion.

· Experience with clustered time series database technologies such as InfluxDB, as well as experience with distributed event streaming platforms using Kafka and Telegraf.

CI/CD: Experience with application deployment using CI/CD tools such as Jenkins and Tekton.

· Working knowledge of GitHub, JIRA, Confluence, and ServiceNow.

Being an IBMer means you’ll be able to learn and develop yourself and your career. You’ll be encouraged to be courageous and experiment every day, all whilst having continuous trust and support in an environment where everyone can thrive, whatever their personal or professional background.

OTHER RELEVANT JOB DETAILS

When applying to jobs of interest, we recommend that you choose those that match your experience and expertise. Our recruiters advise that you apply to no more than 3 roles in a year for the best candidate experience. For additional information about location requirements, please discuss with the recruiter after submitting your application.