Work in our SRE team to design, implement and maintain monitoring solutions using tools like Grafana, Kibana, Prometheus and AppDynamics.
Contribute to projects and sprints that firm up SLOs/SLIs for developer pipeline applications, ensuring high availability and performance.
Proactively identify bottlenecks in application performance, bring your findings to the sprint stand-up, and help implement solutions.
Use Python/JavaScript to write scripts that automate and streamline operational tasks.
Take ownership of problems and work with the larger team to drive resolution. A bias for action is preferred, and upskilling is encouraged.
Analyze system logs and metrics to identify the root cause of issues.
Effectively communicate technical details to both technical and non-technical audiences.
Work collaboratively with development, operations, and other teams to ensure alignment and smooth execution.
Contribute to a culture of continuous improvement and knowledge sharing.
As an SRE, you will:
Get exposure to the cutting-edge technologies in use at Citi, a market leader, such as Tekton and Harness, involving OpenShift.
Get hands-on experience with Gen AI, developing use cases from idea inception to product delivery.
Work with agile tools to manage tasks.
Learn from the ground up how to build observability for our supported applications, and be part of the team that designs SLOs/SLIs to improve application performance.
Skills Required:
Proficiency in Python or JavaScript, or a willingness to learn.
A good understanding of the software development lifecycle and pipeline management. Working experience is an added advantage.
Basic understanding of observability principles and SLOs/SLIs.
Experience in using monitoring tools like Grafana, Kibana, Prometheus, and AppDynamics.
Basic working knowledge of data visualization tools like Tableau.
Understanding of Agile concepts and related processes.
Working or theoretical knowledge of OpenShift, Tekton and Harness pipelines.