Expoint - all jobs in one place

Fortinet Site Reliability Engineer 
United Kingdom, England, London 
396586954

28.11.2024

The Role:

  • Automate as much as is reasonable to significantly improve the operational efficiency of the Lacework platform.
  • Design, build, and improve our infrastructure to enhance service scalability, resiliency, and efficiency across the company.
  • Identify mission-critical problems and solve them via automation, tooling, communication, and informed design.
  • Build and improve monitoring and instrumentation to predict future scalability or failure risks and resolve them before they become customer-facing issues.
  • Facilitate company-wide visibility into key metrics, SLAs, and milestones so that scale and resiliency are a part of every conversation.
  • Develop best practices alongside engineering and operations teams to improve the scalability and reliability of internal processes.
  • Participate in an on-call rotation.

Minimum Qualifications:

  • 3 years of SRE experience with production systems (depending on level).
  • Strong development and automation skills.
  • Extensive experience with Infrastructure as Code (Terraform, etc.), as well as supporting tooling (Atlantis, ArgoCD, etc.).
  • Extensive experience with Kubernetes and supporting tooling (Helm, operators, etc.).
  • Extensive experience with a variety of cloud managed services and providers:
      ○ AWS: EKS, EC2, S3, RDS, Secrets Manager, etc.
  • Experience building production-quality cloud infrastructure that enables reliable and rapid deployment of microservices, with effective monitoring and built-in high availability and/or fault tolerance.
  • Strong passion for using automation to create simple, repeatable dev and ops patterns that ensure a stable, reliable experience for customers.
  • Strong cross-team communication skills.
  • Experience with the building blocks of large-scale systems, including load balancing, distributed/cloud computing, containers, instrumentation, and monitoring.
  • Knowledge of cloud networking, including VPC configuration and cross-cloud connectivity.
  • Familiarity with one or more programming languages (Python, Golang, etc.).

Preferred Qualifications:

  • Experience with monitoring and observability systems and tools (Prometheus, Grafana, New Relic, DataDog, etc.).
  • A belief that everything should be "as code".
  • Experience in Systems, Operations, or Full-Stack Development is a major bonus.
  • Experience with Java application servers and JVM configuration.