Job responsibilities
- Provides end-to-end application or infrastructure service delivery to enable successful business operations of the firm.
- Assists in the monitoring of production environments for anomalies and addresses issues utilizing standard observability tools.
- Identifies issues for escalation and communication, and provides solutions to the business and technology stakeholders.
- Analyzes complex situations and trends to anticipate and resolve incident, problem, and change management issues in support of full stack technology systems, applications, or infrastructure.
- Coordinates the resolution of critical Major Incidents utilizing technical and business resources in a 24x7x365 enterprise organization.
- Drives Major Incidents to resolution authoritatively and confidently.
- Sends executive communications to a global audience giving details of the incident and impacts to the business, including next steps and root cause analysis.
- Assists with root cause analysis for all critical Major Incidents and drives resolution of the issues.
- Partners with peers to assist in coordination and identification of “Air Traffic Control” across the various technical estates during the incident.
- Handles change and problem management related to Incident and overall Production Management.
Required qualifications, capabilities, and skills
- Bachelor’s degree in Computer Science/Information Systems/Engineering or related disciplines
- 3+ years of experience or equivalent expertise troubleshooting, resolving, and maintaining information technology services
- Demonstrated knowledge of applications or infrastructure in a large-scale technology environment, both on premises and in the public cloud
- Experience in observability and monitoring tools and techniques
- Exposure to processes in scope of the Information Technology Infrastructure Library (ITIL) framework
Preferred qualifications, capabilities, and skills
- Experience with Incident Management tools, specifically ServiceNow
- Experience with one or more general purpose programming languages and/or automation scripting
- AWS Certified Cloud Practitioner certification
- Working understanding of public cloud