JPMorgan Lead Infrastructure Engineer - Network Operation
Singapore
125015993

05.02.2025

Job responsibilities

Guides and assists others in building designs at an appropriate level and gaining consensus from peers where appropriate
Collaborates with other software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines
Implements infrastructure, configuration, and network as code for the applications and platforms in your remit
Collaborates with technical experts, key stakeholders, and team members to resolve complex problems
Understands service level indicators and utilizes service level objectives to proactively resolve issues before they impact customers
Improve the reliability-related nonfunctional aspects of network products, such as logging, monitoring, observability, performance, scalability, capacity, and resiliency
Perform research and discovery on industry tools and lead build-versus-buy decisions
Collaborate with other network and software engineering teams to automate processes, reduce toil and modernize operations
Participate in on-call rotation as an escalation contact for production issues
Turn theory into practice and navigate through ambiguity to build a plan
Accomplish common goals using Scrum practices

Required qualifications, capabilities, and skills

Bachelor’s Degree in Computer Science, Engineering, Mathematics or other related disciplines
At least 5 years of site reliability engineering or related experience
At least 3 years of network engineering or related experience
Ability to contribute to large and collaborative teams by presenting information in a logical and timely manner with compelling language and limited supervision
Ability to proactively recognize roadblocks and demonstrate interest in learning technology that facilitates innovation
Experience in observability, such as white-box and black-box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others
Familiarity with troubleshooting common networking technologies and issues
Experience with one or more application performance management technologies (AppDynamics, Dynatrace, Riverbed SteelCentral, Prometheus)
Ability to initiate and implement ideas to solve business problems
Experience triaging and diagnosing issues in complex distributed architectures leveraging infrastructure and application telemetry
Experience with one or more infrastructure automation technologies (Ansible, Terraform, Puppet, building APIs and services using REST, SOAP, etc.)
