As a Senior SRE, you will:
- Design, build, and maintain scalable, fault-tolerant systems.
- Define and enforce SLOs, SLIs, and SLAs — and drive improvements based on real data.
- Build automation and tooling to enhance observability, testing, and deployments.
- Lead complex incident response, take part in on-call rotations, and run postmortems.
- Collaborate closely with engineering, product, and support teams to embed reliability into everything we do.
- Mentor engineers and promote operational excellence across the organization.
You should apply if you:
- Have 7+ years of experience in SRE, DevOps, or Production Engineering roles, ideally in SaaS environments.
- Bring deep expertise in resilience engineering, monitoring, and building fault-tolerant systems.
- Are hands-on with monitoring tools like Datadog, Dynatrace, OpenSearch, Coralogix, or Sentry.
- Are experienced with CI/CD tools like Jenkins or ArgoCD.
- Are proficient with infrastructure-as-code tools like Terraform or Crossplane.
- Have strong knowledge of Linux systems and networking fundamentals.
- Have solid experience with cloud platforms (AWS preferred).
- Are an advanced coder in Java (Python or Go is a plus).
- Know Kubernetes and the broader CNCF ecosystem inside out.
- Excel at debugging and root cause analysis.
- Are fluent in Hebrew and English.
- Bring a strong sense of ownership and accountability to everything you do.
*We operate in a flexible hybrid work model.