Bank Of America Site Reliability Engineering Lead
United States, North Carolina, Charlotte
54970561

19.11.2025

Job Description:

This job is responsible for partnering with engineering and technology teams to implement measures prescribed by the Site Reliability Engineer teams it leads. Key responsibilities include ensuring appropriate instrumentation, tooling, ticketing, alerting, and on-call routines are in place for key services, demonstrating technical expertise within domains, and decomposing objectives into work units. Job expectations include advancing efficient solution delivery practices and promoting exceptional design, engineering, and organizational practices.

Responsibilities:

Collaborates with Development and Infrastructure teams to understand technical solutions and implement monitoring capabilities outlined in the application and system monitoring designs put forward by the Senior Site Reliability Engineer (SRE)
Develops and maintains reliability scripts, tools, and libraries, and uses them both for common instrumentation, automation, and operational needs and when mentoring SRE resources on reliability practices and established tools/capabilities
Partners to implement code changes to make use of common reliability libraries and tools and helps Application Production Services and Application Development teammates understand how to use them
Participates regularly in architecture community of practice meetings and communication via other channels
Identifies vulnerabilities and opportunities for reliability improvement, such as investigating low-level error rates and 'noise' in monitoring, and defines solutions to reduce manual support effort and/or improve system reliability
Engages as a subject matter expert in major incident triage, failure scenario modelling, and root cause diagnosis with Problem Managers during major incident and problem management investigations

Key Responsibilities:

Own and drive production issue triage, including expert-level heap and thread dump analysis, memory profiling, garbage collection investigations, and CPU/thread diagnostics.
Work closely with performance testing teams to monitor system behavior pre- and post-release, ensuring consistent throughput and low-latency service delivery.
Develop and maintain monitoring and alerting solutions tailored for performance testing infrastructure and production-like environments.
Collaborate with developers, QA, and performance engineers to interpret telemetry data, identify failure patterns, and implement self-healing mechanisms.
Act as a technical enabler for multiple teams, providing tooling, insights, and best practices around observability, reliability engineering, and performance.
Build internal tools that integrate with existing monitoring platforms (e.g., Splunk, Dynatrace, DNT) to collate and derive insights from performance testing and production metrics.
Work alongside in-house development teams to enhance internal platforms that aggregate observability data, provide root cause analysis views, and enable smart test reporting.
Champion reliability and stability by guiding incident response practices, postmortem reviews, and service-level objectives (SLOs) tracking.

Required Qualifications:

7-8 years’ hands-on experience with heap dump, thread dump, and GC log analysis, and with JVM internals.
Proficiency in scripting and application development (e.g., Python, Java, Shell, Node.js) to create diagnostic and observability tools.
7-8 years’ experience with logging and monitoring platforms (e.g., Splunk, Dynatrace, DNT).
7-8 years’ experience working with distributed systems, microservice architecture, and container orchestration platforms (e.g., Kubernetes, Docker).
Experience with performance testing environments and tools like JMeter, LoadRunner, Gatling, or custom test harnesses.
Ability to identify systemic reliability issues and implement resilient patterns (e.g., circuit breakers, graceful degradation, retry logic).
Exceptional debugging and root cause analysis skills across application, infrastructure, and network layers.
Demonstrated ability to build observability tooling or integrations that serve multiple internal teams.
Familiarity with CI/CD practices and infrastructure-as-code (e.g., Terraform, Ansible) is a plus.

Desired Qualifications:

Self-starter with a service-oriented mindset and a relentless drive to improve system reliability.
Comfortable working in a cross-functional, high-impact environment supporting dev, ops, and test teams.
Strong communication and mentoring abilities to influence engineering culture around reliability and performance.
Experience contributing to or designing internal engineering platforms or toolkits to scale team capabilities.

Skills:

Automation
Collaboration
Influence
Production Support
Result Orientation
Analytical Thinking
Application Development
Architecture
Solution Design
Stakeholder Management
Adaptability
DevOps Practices
Project Management
Risk Management
Solution Delivery Process

1st shift (United States of America)
