Being the cybersecurity partner of choice, protecting our digital way of life.
Your Career
You will play a critical role in ensuring the stability, scalability, and efficiency of our high-scale production systems. You will lead teams responsible for maintaining production reliability, managing on-call processes, and driving automation for incident remediation.
Your Impact
Ensure the ongoing resilience, stability, and availability of production environments.
Lead the design and implementation of on-call processes that reduce noise and improve incident response times.
Drive the development of remediation automation to prevent recurring issues and reduce manual intervention.
Collaborate with engineering teams to align reliability goals.
Establish best practices for system upgrades, maintenance processes, and operational playbooks.
Foster a culture of continuous improvement through post-incident reviews and proactive problem-solving.
Empower your team to build self-service tools that enhance operational efficiency and reduce toil.
Promote and enhance observability practices to improve system monitoring, alerting, and diagnostics.
Your Experience
Strong management experience: 7+ years of managing SRE and operational groups, driving stability and efficiency improvements.
Proven hands-on background in the DevOps domain: 5+ years of experience, including on-call shifts, system upgrades, incident management, and driving reliability improvements through automation.
Demonstrated ability to work cross-functionally in a matrix organizational structure.
Ability to communicate complex technical concepts to both technical and non-technical stakeholders, ensuring alignment across the organization.
Proven experience managing large-scale, complex systems and ensuring stability and performance at scale.
Strong analytical and troubleshooting skills, with a proactive approach to identifying and resolving issues before they impact production.
Strong foundation in cloud infrastructure (AWS, GCP, Azure; GCP preferred), Kubernetes, and monitoring tools (Prometheus, Grafana, etc.).
All your information will be kept confidential according to EEO guidelines.