Expoint - all jobs in one place

Finding a high-tech job at the best companies has never been easier

Apple Site Reliability Engineering (SRE) Manager, iCloud
United Kingdom, England, London 
925404440

21.12.2024
Description
As a Site Reliability Engineering Manager, your responsibilities include:
  • Manage staging and production environments with the goal of maximizing availability
  • Promote observability of systems for monitoring, alerting, and metrics reporting
  • Advocate best practices of reliability engineering
Minimum Qualifications
  • Experience with large scale distributed systems, especially ML infrastructure and services including LLMs, Generative AI, and transformers
  • Demonstrable success leading engineering teams - ideally SRE or Production Engineering
  • Knowledge of core operating system principles, networking fundamentals, and systems management
  • Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts
Preferred Qualifications
  • Experience with hiring and leading engineers
  • Professional experience in an engineering leadership position
  • Bachelor's degree or equivalent
Education & Experience
Bachelor's or Master's degree in computer science or an equivalent field.