JPMorgan Site Reliability Engineer III
United Kingdom, Scotland
540222207

24.04.2025

As a Site Reliability Engineer III at JPMorgan Chase, you will solve complex and broad business problems with simple and straightforward solutions. Through code and cloud infrastructure, you will configure, maintain, monitor, and optimize applications and their associated infrastructure, independently decomposing and iteratively improving on existing solutions. You will be a significant contributor to your team, sharing your knowledge of the end-to-end operations, availability, reliability, and scalability of your application or platform.

Job responsibilities

Drive the continuous improvement of reliability, monitoring, and alerting for our mission-critical microservices.
Reduce toil through automation, creating reliable infrastructure and tooling to expedite feature development.
Develop and add metrics to microservices, define user journeys, SLOs, and error budgets, and configure dashboards and alerts based on these.
Facilitate blameless post-mortems and ensure permanent closure of incidents.
Engage with the development team throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes; design self-healing and resiliency patterns.
Collaborate and influence across the organization on behalf of your application portfolio.
Respond to incidents alongside developers and infrastructure engineers where required, providing support and insight.
Collaborate with other software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines.
Implement infrastructure, configuration, and network as code for the applications and platforms in your remit.
Understand service level indicators and use service level objectives to proactively resolve issues before they impact customers.
Support the adoption of site reliability engineering best practices within your team (metrics, alerting, logging, automation, resiliency, capacity, performance).

Required qualifications, capabilities, and skills

Formal training or certification on site reliability engineering concepts and proficient applied experience in a public cloud such as AWS, Azure, or GCP.
Proficiency in at least one programming language such as Python, Go, or Java/Spring Boot.
Expertise in designing, coding, testing, and delivering software in at least one technology stack.
Experience with Kubernetes.
Experience in cloud computing (preferably AWS).
Proficiency in one or more technology domains; may be a cross-domain expert able to solve complex, mission-critical problems within a business or across the firm.
Excellent debugging and troubleshooting skills.
Ability to contribute to large, collaborative teams, proactively recognize roadblocks, and demonstrate interest in learning technology that facilitates innovation.
Experience with continuous integration and continuous delivery tools such as Jenkins, GitLab, and Terraform.
Experience with at least one observability tool such as Dynatrace, Datadog, New Relic, CloudWatch, AppDynamics, or Splunk.

Preferred Qualification