Expoint - all jobs in one place

Fortinet Site Reliability Engineer 
United Kingdom, England, London 
845556108

15.08.2024

Our team develops and supports the infrastructure layers spanning our cloud accounts, network/connectivity, workload management, observability, and storage services. We build tooling that automates operations so that the Lacework infrastructure and service can scale. To be successful, you will design, define, develop, deploy, and operate internal tooling, APIs, and frameworks that streamline our workflows and automate our infrastructure.


Responsibilities:
• Automation. Automation. Automation.
• Automate as much as reasonable to significantly improve operational efficiency.
• Design, build and improve our infrastructure to enhance service scalability, resiliency, and efficiency across the company.
• Identify mission-critical problems and solve them via automation, tooling, communication, and informed design.
• Build and improve monitoring and instrumentation to predict future scalability or failure risks and solve them before they manifest as customer-facing issues.
• Facilitate company-wide visibility into key metrics, SLAs, and milestones so that scale and resiliency are a part of every conversation.
• Develop best practices alongside engineering/operations teams to improve the scalability and reliability of internal processes.
• Participate in an on-call rotation.


Requirements:
• Strong development and automation skills.
• Extensive experience with CI/CD pipelines and Infrastructure as Code (Terraform, CloudFormation, etc.), as well as supporting tooling (Atlantis, ArgoCD, etc.).
• Extensive experience with a variety of cloud providers and their managed services.
• Experience building production-quality cloud infrastructure that enables reliable and rapid deployment of microservices with effective monitoring and resilient operations.
• Strong passion for improving the lives of coworkers while ensuring a stable, reliable experience for customers.
• Strong cross-team communication skills.
• Experience with the building blocks of large-scale systems including load balancing, distributed/cloud computing, containers, instrumentation, and monitoring.
• Knowledge of cloud networking, including VPC configuration and cross-cloud connectivity.
• Familiarity with one or more programming languages (Python, Ruby, Golang, etc.).


Nice to have:
• Desire to "build for lazy": to build systems and computers that reason for us.
• Experience with monitoring and observability systems such as Prometheus and Grafana, applications such as New Relic or Datadog, and tools or frameworks such as Telegraf and OpenTracing.
• Belief that everything should be "as code".
• Experience in Systems, Operations, or Full-Stack Development is a major bonus.
• Experience with Java application servers and JVM configuration.