In this role, you’ll work in one of our IBM Consulting Client Innovation Centers (Delivery Centers), where we deliver deep technical and industry expertise to a wide range of public and private sector clients around the world. Our delivery centers offer our clients locally based skills and technical expertise to drive innovation and adoption of new technology.
Your Role and Responsibilities
The Site Reliability Engineer (SRE) is a critical role in cloud-based projects. An SRE works with the development squads to build automation for platform and infrastructure management and provisioning, as well as service monitoring, using the same methods that software development uses to support application development. SREs bridge development and operations by applying a software engineering mindset to system administration topics. They split their time between operations and on-call duties and developing systems and software that increase site reliability and performance.
Required Technical and Professional Expertise
- 12+ years of overall experience required.
- Good exposure to operational aspects (monitoring, automation, remediation), including monitoring tools such as New Relic, Prometheus, ELK, and AppDynamics, as well as distributed tracing and APM tooling.
- Troubleshooting incidents, documenting root cause analysis, and automating incident remediation.
- Understands the architecture, the SRE mindset, and the data model.
- Platform architecture and engineering: ability to design and architect a cloud platform that meets client SLAs and NFRs such as availability and system performance. The SRE will define the environment provisioning framework, identify potential performance bottlenecks, and design the cloud platform.
Preferred Technical and Professional Expertise
- Communicates effectively with business and technical team members.
- Creative problem-solving skills and superb communication skills.
- Telecom domain experience is a plus.