Candidates are expected to be available for critical production issues as required. This may include off-shift hours and weekends.
You will be responsible for driving operational excellence, improving documentation, and ensuring resilience across midrange platforms, while mentoring a distributed team and influencing change across engineering, SRE, and production support.
Skills Desired:
• Proven experience leading midrange or infrastructure operations in a high-availability environment
• Deep knowledge of Linux/RedHat systems, patching, and vulnerability remediation
• Familiarity with observability and APM tools (e.g., Dynatrace, vROps, Big Panda, Logscale)
• Strong incident management and SRE-aligned thinking (e.g., proactive issue identification, toil reduction)
• Excellent communication and documentation skills
• A collaborative approach to cross-functional engagement and knowledge transfer
Manage Daily Operations & Incident Management
• Review and manage alerts and events in Big Panda
• Track and prioritize R1/R2/P1 incidents, escalating to appropriate SRC Engineering teams
• Oversee Midrange requests via SRC Chat, ensuring timely and accurate responses
• Drive adoption of SRE practices, identifying systemic issues and remediating proactively
• Monitor and manage open issues via RedHat case management
Weekly Midrange Knowledge Transfer & Operational Reviews
• Lead regular knowledge transfer sessions across teams
o Review and present RedHat TAM updates and open cases
o Discuss recent escalated incidents, excessive resolution time cases, and recurring issues
o Coordinate patching updates (IDS Patching): what’s being patched, known issues, and current vulnerabilities
o Track open vulnerabilities by Mnemonic owner with a focus on remediation
Platform Operations
• Manage and escalate issues including:
o Syslog NG restarts
o Disk and path failures
o Access, login, and account security issues
o Remote access and server unavailability
o Patching defects and requests
• Collaborate closely with Midrange Engineering leads and other platform SMEs
Manage Accountability for Documentation & Knowledge Management
• Drive the creation and upkeep of Linux system documentation, targeting at least one publish-ready doc every 3 weeks
• Maintain and enhance tooling documentation, including:
Job Description
- Leads a team of Site Reliability Engineers in implementing, maintaining, and improving robust monitoring response sites and infrastructure applications.
- Recommends and facilitates the implementation of infrastructure enhancements as required to maintain the performance of sites in response to business growth and strategy.
- Streamlines the deployment process by introducing automated configuration management tools, resulting in a reduction in deployment time and increased efficiency.
- Oversees robust technical solutions for complex business and application challenges, while helping to define and communicate technical standards and best practices. Manages and oversees proactive reviews and audits of production sites, issue triage and follow up.
- Leads in the collaboration with cross-functional teams to design and implement scalable and highly available infrastructure.
- Maximizes staff contribution through professional growth and development, to increase teamwork and more effectively meet business needs.
PNC employees take pride in our reputation, and to continue building upon it, we expect our employees to be:
- Customer Focused - Knowledgeable of the values and practices that align customer needs and satisfaction as primary considerations in all business decisions and able to leverage that information in creating customized customer solutions.
- Managing Risk - Assessing and effectively managing all of the risks associated with their business objectives and activities to ensure they adhere to and support PNC's Enterprise Risk Management Framework.
PNC also has fundamental expectations of our people managers. As a manager of talent in PNC, you will be expected to:
- Include Intentionally - Cultivates diverse teams and inclusive workplaces to expand thinking.
- Live the Values - Role models our values with transparency and courage.
- Enable Change - Takes action to drive change and innovation that will transform our business.
- Achieve Results - Takes personal ownership to deliver results. Empowers and trusts others in decision making.
- Develop the Best - Raises the bar with every talent decision and guides the achievement of all employees and customers.
Qualifications
Successful candidates must demonstrate appropriate knowledge, skills, and abilities for the role. Listed below are the skills, competencies, work experience, education, and required certifications/licenses needed to be successful in this position.
Analytical Thinking, Application Design, Architecture, Application Maintenance, Application Testing, Emerging Technologies, Innovation, IT Industry: Trends & Directions, IT Standards, Procedures & Policies, Software Process Improvement (SPI), Software Reliability Management, Technical Troubleshooting
Roles at this level typically require a university / college degree, with 5+ years of industry-relevant experience. At least 3 years of prior management experience is typically required. In lieu of a degree, a comparable combination of education, job specific certification(s), and experience (including military service) may be considered.
No Required Certification(s)
No Required License(s)
This position is subject to the requirements of Section 19 of the Federal Deposit Insurance Act (FDIA) and, for any registered role, the Secure and Fair Enforcement for Mortgage Licensing Act of 2008 (SAFE Act) and/or the Financial Industry Regulatory Authority (FINRA), which prohibit the hiring of individuals with certain criminal history.
California Residents
Refer to the