Job responsibilities
- Coordinates the resolution of critical Major Incidents utilizing technical and business resources in a 24x7x365 enterprise organization.
- Leads Major Incidents to resolution authoritatively and confidently; in the heat of the moment may be called upon to make decisions on behalf of Employee Platforms that could have production-impacting implications.
- Sends executive communications to a global audience giving details of the incident and impacts to the business, including next steps and root cause analysis.
- Leads large senior management conference calls to advise lines of business (LOBs) of Major Incidents occurring in the environment.
- Performs root cause analysis for all critical Major Incidents and drives resolution of the issues.
- Ensures that the firm’s monitoring and automation platforms are actively leveraged to drive continuous improvement of business data, identify systemic issues, and eliminate them at the root.
- Leads teams of technologists that provide end-to-end application or infrastructure service delivery for the successful business operations of Employee Platforms.
- Executes policies and procedures that ensure operational stability and availability.
- Monitors production environments for anomalies, addresses issues, and drives evolution in the use of standard observability tools.
- Escalates and communicates issues and solutions to the business and technology stakeholders, actively participating from incident resolution to service restoration.
- Leads incident, problem, and change management in support of full stack technology systems, applications, or infrastructure.
Required qualifications, capabilities, and skills
- Bachelor’s degree in Computer Science/Information Systems/Engineering or related disciplines.
- 5+ years of experience or equivalent expertise troubleshooting, resolving, and maintaining information technology services.
- Experience managing applications or infrastructure in a large-scale technology environment, both on premises and in public cloud.
- Proficient in observability and monitoring tools and techniques.
- Experience executing on processes in scope of the Information Technology Infrastructure Library (ITIL) framework.
- Expertise in ServiceNow, including building reports and dashboards.
Preferred qualifications, capabilities, and skills
- Working knowledge of one or more general-purpose programming languages and/or automation scripting.
- Practical experience with public cloud.
- Agile experience, with JIRA and Align.