Expoint – all jobs in one place
The place where the best experts and companies meet

Apple SRE Manager - Private Cloud Compute iCloud 
United Kingdom, England, London 
995709076

Yesterday
As a Site Reliability Engineering Manager, your responsibilities include managing staging and production environments with the goal of maximizing availability; promoting observability of systems for monitoring, alerting, and metrics reporting; and advocating best practices of reliability engineering.
  • Experience with large-scale distributed systems, especially ML infrastructure and services including LLMs, Generative AI, and transformers
  • Demonstrable success leading engineering teams, ideally in SRE or Production Engineering
  • Knowledge of core operating system principles, networking fundamentals, and systems management
  • Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts
  • Bachelor's or Master's degree in computer science or an equivalent field
  • Experience with hiring and leading engineers
  • Professional experience in an engineering leadership position
  • Bachelor's degree or equivalent