Identification and resolution of production incidents
Problem Management following incident resolution: determining root cause, identifying mitigating actions, and driving and tracking permanent resolution
Management of incident and problem tickets through the enterprise ITSM tool
Understanding and ensuring compliance with the Incident Management and Problem Management policies and procedures
Capacity and Performance Management
Working with development teams for take-on and training of new services or significant upgrades
Providing support for Audits (internal and external)
Stakeholder management, working closely with Business and Operations partners to understand the KPI metrics used to measure service stability and prioritize defect fixes and
Responsibilities:
Leads end-to-end maintenance of all production services related to technology. Work activities specific to Production Shared Services include: Problem/Incident Management, Release/Deployment, Operational Readiness, Capacity/Availability Management, Application Monitoring, Service Analytics and Reporting, Production Governance, Triage, Associate Support, Change/Configuration Management
Actively engages in and leads the resolution of production support issues/incidents
Takes ownership of escalations and performs troubleshooting, analysis, research and resolution
Ensures production and performance SLAs are met and escalates issues that need attention
Performs analytical, technical, and administrative work in planning, installing, and supporting new and existing equipment and software under moderate supervision
Identifies vulnerabilities and opportunities for improvement, and maintains metrics to support analysis that drives improvement in all areas of Production Shared Services
Creates and enhances administrative, operational and technical policies and procedures, adopting best practice guidelines, process improvements, standards and procedures
Exercises judgment within defined procedures and practices to determine appropriate action
Serves as an escalation point between the client/business area and internal management for the resolution of moderately complex unresolved problems, complaints and service requests
Should have increased awareness and exposure to basic technical principles, concepts and techniques
Resolves complex issues. Works on problems of minimal to moderate scope where analysis of the situation or data requires a review of identifiable factors
Supports the on-call rotation for off-hours and weekend coverage as needed
Skills:
Adaptability
Analytical Thinking
Influence
Production Support
Automation
Collaboration
Innovative Thinking
Result Orientation
Solution Design
Required Qualifications:
5+ years of relevant IT experience, e.g. Production or Release Support, ITIL, Technical Implementations, or equivalent
Ability to demonstrate flexibility, navigate ambiguity, and quickly establish credibility among technical peers
Excellent written and verbal communication skills (English)
Proven knowledge of some or all of the following: Java/J2EE (Core Java, JDBC, EJB, and Java Web Services) and server-side technologies (SOAP/RESTful services, XML/XSLT, AOP, MQ, Microservices, and MuleSoft)
Good knowledge of Middleware components such as Message Broker, IBM WebSphere MQ, JBoss application server, MuleSoft
Strong operating system knowledge in Unix and Windows including strong scripting skills
Experience with database technologies (for example Oracle, DB2, and PL/SQL), including writing queries to support incident resolution
Knowledge of event-driven and schedule-driven batch processes
Experience in various production support roles (technical L1/L2/L3) and hands-on experience with at least two or three widely used monitoring/scheduling tools
Ability to work as part of an IT production support team providing front-line technical support to end users and responding to issues related to Incident/Problem Management, Release/Deployment, Operational Readiness, Application Monitoring, and Production Governance
Experience with troubleshooting, analysis, research, and resolution using advanced query and programming skills; conducting root cause analysis and identifying mitigations and risks; and timely real-time restoration and triage of issues impacting technical services (application/infrastructure) provided to bank customers and partners, while keeping partners advised of significant progress or challenges during the restoration period
Ability to work closely with Technology Infrastructure, Development & Testing Teams in supporting Integrated / Independent releases, software/hardware upgrades, server upgrades, etc.
Ability to assess initial severity, gather impacts, engage necessary supporting teams, and escalate as necessary to ensure timely restoration
Experience with on-call support: triaging problems, coordinating with various support teams across the organization, and carrying out activities related to incident and problem management
Ability to communicate the overall status and health of the application to all lines of business and management, contribute to automation and causal analysis, develop shared/common solutions, and proactively identify cross-functional or technical issues
Working on some weekends and bank holidays as part of a 5 day/week shift pattern
Desired Qualifications:
Knowledge of event-driven and schedule-driven batch processes
Experience with database technologies (for example Oracle, DB2, and PL/SQL), including writing queries to support incident resolution
Good knowledge of Middleware components such as Message Broker, IBM WebSphere MQ, JBoss application server, MuleSoft