Expoint - all jobs in one place

JPMorgan SRE & Observability Engineer 
Argentina, Autonomous City of Buenos Aires, Buenos Aires 
164852976

08.02.2025

Join the CIB Capacity Management team as a Capacity Management Support Engineer at JPMorgan Chase, where you will play a pivotal role in enhancing, building, and delivering top-notch technology frameworks. In this role, you will encourage a culture of continuous improvement as you troubleshoot, maintain, identify, escalate, and resolve capacity-planning issues for all internally and externally developed systems, leading to a seamless user experience.

Job responsibilities

  • Analyze and troubleshoot capacity compliance issues across multiple platforms to ensure optimal performance and availability.
  • Evaluate information, assess situations, and collaborate effectively to address and resolve capacity issues, ensuring efficient remediation of business and production problems.
  • Research, resolve, and analyze root causes for capacity issues related to major incidents, associating them with aligned control domains.
  • Support day-to-day adherence to control procedures related to capacity management.
  • Monitor production environments for anomalies and address issues using standard observability tools.
  • Identify issues for escalation and communication, providing solutions to business and technology stakeholders.
  • Analyze complex situations and trends to anticipate and solve tickets raised by application teams.
  • Build and maintain strong relationships with JPMC business partners and technology teams to identify process improvement opportunities.
  • Lead assigned project activities and ensure they are completed within established timelines.
  • Guide and assist others in building designs at the appropriate level and gaining consensus from peers where needed.
  • Collaborate with other software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines.
  • Collaborate with other software engineers and teams to design, develop, test, and implement availability, reliability, and scalability solutions in their applications.
  • Implement infrastructure, configuration, and network as code for the applications and platforms in your remit.
  • Collaborate with technical experts, key stakeholders, and team members to resolve complex problems.
  • Understand service level indicators and use service level objectives to proactively resolve issues before they impact customers.
  • Support the adoption of site reliability engineering best practices within your team.

Required qualifications, capabilities, and skills

  • Formal training or degree in computer science/engineering or related field with 5+ years of applied experience.
  • Experience or equivalent expertise in troubleshooting, resolving, and maintaining information technology services.
  • Basic skillset in Shell scripting, Python, Oracle/SQL, and familiarity with monitoring and observability tools (e.g., ITRS Geneos, Grafana, Datadog, Dynatrace).
  • Demonstrated knowledge of applications or infrastructure in a large-scale technology environment, both on-premises and in the public cloud.
  • Experience with observability and monitoring tools and techniques.
  • Proficiency in site reliability culture and principles, and familiarity with implementing site reliability within an application or platform.
  • Proficiency in at least one programming language, such as Python, Java/Spring Boot, or .NET.
  • Proficient knowledge of software applications and technical processes within a given technical discipline (e.g., cloud, artificial intelligence, Android).
  • Experience in observability, such as white-box and black-box monitoring, service level objective alerting, and telemetry collection, using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk.
  • Experience with continuous integration and continuous delivery tools such as Jenkins, GitLab, or Terraform.
  • Familiarity with containers and container orchestration, such as ECS, Kubernetes, and Docker.
  • Familiarity with troubleshooting common networking technologies and issues.
  • Ability to contribute to large and collaborative teams by presenting information in a logical and timely manner with compelling language and limited supervision.
  • Ability to proactively recognize roadblocks, and demonstrated interest in learning technology that facilitates innovation.
  • Ability to identify new technologies and relevant solutions to ensure design constraints are met by the software team.
  • Ability to initiate and implement ideas to solve business problems.

Preferred qualifications, capabilities, and skills

  • Practical cloud-native experience, primarily in AWS.
  • Knowledge of industry-wide technology trends and best practices.