As a Site Reliability Engineer III at JPMorgan Chase within the Technology Department, you will be a key player in solving complex business problems with simple and straightforward solutions. Your role will involve configuring, maintaining, monitoring, and optimizing applications and their associated infrastructure. You will be a significant contributor to your team, sharing your knowledge and ensuring the reliability and scalability of our applications.
Job responsibilities
- Guide and assist others in building appropriate-level designs and gaining consensus from peers where needed.
- Collaborate with other software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines.
- Collaborate with other software engineers and teams to design, develop, test, and implement availability, reliability, and scalability solutions in their applications.
- Implement infrastructure, configuration, and network as code for the applications and platforms in your remit.
- Collaborate with technical experts, key stakeholders, and team members to resolve complex problems.
- Understand service level indicators and utilize service level objectives to proactively resolve issues before they impact customers.
- Support the adoption of site reliability engineering best practices within your team.
Required qualifications, capabilities, and skills
- Hold formal training or certification in site reliability culture and principles, with 3+ years of applied experience.
- Become proficient in site reliability culture and principles, and understand how to implement site reliability within an application or platform.
- Master at least one programming language such as Python, Java/Spring Boot, or .Net.
- Develop proficiency in software applications and technical processes within a given technical discipline (e.g., cloud, artificial intelligence, Android).
- Gain experience in observability, including white-box and black-box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk.
- Utilize continuous integration and continuous delivery tools like Jenkins, GitLab, or Terraform.
- Familiarize yourself with container and container orchestration technologies such as ECS, Kubernetes, and Docker.
- Acquire experience in cloud computing, preferably with Cloud Foundry or AWS.
- Understand and troubleshoot common networking technologies and issues.
- Gain experience in cloud data lakes, such as Databricks or Snowflake.
Preferred qualifications, capabilities, and skills
- Contribute to large and collaborative teams by presenting information in a logical and timely manner with compelling language and limited supervision.
- Proactively recognize roadblocks and demonstrate interest in learning technology that facilitates innovation.
- Identify new technologies and relevant solutions to ensure design constraints are met by the software team.
- Initiate and implement ideas to solve business problems.
- Certifications in AWS, Splunk, Dynatrace, and Terraform are preferred.