Expoint – all jobs in one place
The place where the best experts and companies meet

Monday Senior Incident Manager 
Israel, Tel Aviv District, Tel Aviv-Yafo 
600315605

Yesterday

About The Role

This isn't just about fighting fires; it's about building a world-class fire department.

You are the calm in the storm, the coach for our commanders, and the driving force behind our continuous improvement.

  • Own the Program: Own and evolve our end-to-end incident management framework, including all associated policies, processes, and tooling for our 600+ person engineering organization.
  • Train Our Responders: Develop, manage, and lead a comprehensive training program for our rotational, on-call Incident Commanders. You will be responsible for ensuring our commanders are confident, capable, and ready to lead under pressure.
  • Champion Blameless Learning: Drive a healthy, blameless post-mortem culture. You will facilitate post-incident reviews for major incidents, ensuring root causes are identified and that actionable, high-quality follow-up items are tracked to completion.
  • Drive with Data: Define, track, and report on key reliability metrics (MTTR, MTTA, incident frequency, etc.). Use this data to identify trends, pinpoint systemic risks, and advocate for strategic reliability initiatives.
  • Refine Communication: Partner with our technical and corporate communications teams to refine and execute our internal and external communication strategies during incidents, including the effective use of our public status page.
  • Improve Readiness: Proactively improve our operational readiness by designing and facilitating "Game Day" drills, chaos engineering experiments, and other readiness exercises.
  • Manage the Toolchain: Own the administration and optimization of our incident management toolchain (e.g., PagerDuty, Incident.io, Statuspage).
  • Be the Strategic Leader: During major incidents, you will act as a strategic advisor and coach to the on-duty Incident Commander, ensuring the process is followed and removing organizational roadblocks.

Your Experience & Skills

  • Experienced Hand: 5+ years of experience in a relevant field such as Site Reliability Engineering (SRE), Technical Program Management (TPM), DevOps, or a dedicated Incident Management role.
  • Proven in the Trenches: You have direct, hands-on experience managing and participating in major technical incidents for a large-scale SaaS or cloud-based platform.
  • A Natural Leader and Coach: You have experience leading under pressure and a passion for training and mentoring others. You lead with influence, not just authority.
  • Process-Driven: You excel at creating, documenting, and implementing scalable processes that reduce cognitive load for teams in crisis.
  • Calm and Communicative: You possess exceptional communication and interpersonal skills and have a proven ability to remain calm, focused, and effective in high-pressure situations.
  • Culturally Savvy: You are deeply committed to fostering a blameless, learning-oriented culture and understand the human factors involved in incident response.
  • Technically Credible: You have sufficient technical depth to understand complex distributed systems and facilitate deep technical conversations between Subject Matter Experts without needing to be the expert yourself.

Bonus Points

  • Experience managing or contributing to a formal Change Management / Change Enablement process.
  • Experience building an Incident Commander training program from the ground up.
  • Familiarity with modern incident management automation tools like Incident.io, FireHydrant, or Blameless.