Expoint - all jobs in one place

Apple - Site Reliability Engineering Manager, Enterprise Technology Services
Singapore 
461199662

18.11.2024
Description
  • Lead, mentor, and develop a team of SREs.
  • Foster a culture of reliability and excellence within the team.
  • Promote continuous learning and knowledge sharing.
  • Help the team build and maintain robust, highly available systems.
  • Automate CI/CD processes.
  • Ensure the availability and performance of production systems.
  • Oversee incident response, post-mortem analysis, and root cause investigations.
  • Implement and maintain service level objectives (SLOs) and service level indicators (SLIs).
  • Work closely with development, product, and other engineering teams to ensure reliability is prioritized in the development lifecycle.
  • Communicate effectively with stakeholders regarding reliability metrics, incident reports, and team progress.
  • Develop and execute a strategic roadmap for the SRE team.
  • Identify areas for improvement and propose solutions that align with business goals.
  • Optimize resource allocation and usage for operational efficiency.
  • Identify and assess risks to production systems and work to mitigate them.
  • Establish and maintain disaster recovery and business continuity plans.
Minimum Qualifications
  • BS degree or higher in Computer Science or a related field.
  • 5+ years in a site reliability engineering, DevOps, or related role, with at least 2 years in a lead capacity.
  • Strong understanding of systems architecture, cloud infrastructure, and monitoring tools.
  • Proficiency in one or more programming languages, in particular Java.
  • Proven experience in leading and mentoring engineering teams.
  • Strong analytical skills and the ability to troubleshoot complex systems.
  • Knowledge of the fundamentals of networking, databases, system administration, version control, and CI/CD automation.
  • Strong problem-solving and communication skills.
Preferred Qualifications
  • Knowledge of container-based technologies such as Docker, Kubernetes, or EKS.
  • Knowledge of modern web services architectures and cloud platforms such as AWS or GCP.
  • Exceptional analytical and troubleshooting skills in complex Unix/Linux system environments and application implementations.
  • Ability to build tools from scratch.
  • Ability to work in a collaborative environment.