Limitless High-tech career opportunities - Expoint

Apple Site Reliability Engineering Lead SAP Global Systems
India, Telangana, Hyderabad
426221165

03.08.2025

RESPONSIBILITIES:
Build up, lead and improve existing processes to provide 24x7 operational response for applications in public cloud platforms.
Maintain services once they are live by setting up monitoring, alerting and measuring availability, latency, and overall system health.
Own and review work for accuracy, quality, application performance and completeness.
Review release readiness through activities such as system design consulting, reviewing all observability and monitoring, capacity planning, and launch reviews.
Understanding of core principles of DevSecOps.
Partner with architects and engineers to design and implement automation, operations, and support solutions.
Partner management.
Proficient in designing and implementing end-to-end observability frameworks using tools such as Prometheus, Grafana, CloudWatch, ELK/EFK, and OpenTelemetry, ensuring service reliability through dashboard design, SLOs/SLIs, and alerting systems.

8-14 years of experience with a track record of building and leading Cloud Native SRE and Operations for AWS or GCP Hyperscalers.
Solid experience supporting customer-facing applications in a 24x7 uptime environment of distributed systems.
Bachelor's degree or equivalent experience in Computer Science, Engineering or other relevant major.
Collaborate with security, development, and infrastructure teams to implement a Zero Trust Architecture, handle secrets securely, and establish secure CI/CD pipelines.

Expertise in SRE principles, production-scale system design, and DevOps practices.
Design and architect solutions for multi-cloud and on-prem environments.
Solid understanding of core cloud services such as IAM, EC2/GCE, RDS/CloudSQL, EKS/GKE, CloudWatch/Cloud Monitoring, S3/GCS, etc.
Understand complex landscape architectures. Have working knowledge of on-prem and cloud-based hybrid architectures and infrastructure concepts such as Regions, Availability Zones, VPCs/Subnets, Load balancers, API Gateways, etc.
Good understanding of common authentication schemes, certificates, secrets and protocols.
Implement infrastructure-as-code practices applying tools such as Terraform, Helm, or Pulumi.
Scripting and/or coding skills needed for automation, triaging and troubleshooting. Experience with a language such as Python, Go, or Java.
Experience planning and designing disaster recovery for BCP and non-BCP applications.
Core knowledge of standard security and governance processes.
Expertise handling production incidents, with experience working towards resolution and collaborator communication during incidents.
Track record of improving service reliability and efficiency.
Ability to implement and coordinate telemetry using monitoring and observability tools.
Adept at prioritizing multiple issues in a high-stress environment. Good experience in designing and improving response processes.
Mentor and foster professional development of junior SREs, thereby contributing to operational excellence across diverse environments.
Automation focus for operational efficiency: designing and implementing automation processes for repeatable and consistent service deployment.
A solid sense of ownership, critical thinking, and interpersonal skills to work effectively across diverse, multi-functional teams.
Certifications such as AWS Solutions Architect, AWS DevOps Professional, or GCP Professional Architect are a plus.
