The Major Incident Manager / Outage Manager is a critical role responsible for managing and resolving high-impact incidents that affect the services and operations of our organization. The individual in this position will lead and coordinate the response to major incidents, ensuring minimal disruption to business operations and maintaining communication with all stakeholders throughout the incident lifecycle. The ideal candidate will have a strong background in IT service management, exceptional problem-solving skills, and the ability to work under pressure, navigating high-stress situations and delivering results. If you have the experience, skills, and certifications required, and are ready to take on this challenging yet rewarding position, we encourage you to apply.
Key responsibilities
- During a major incident, work collaboratively across IT Services to ensure that major incidents are identified consistently and managed efficiently, and that service is restored as quickly as possible, minimizing the impact to the business across the full incident lifecycle.
- Lead the major incident management process from detection through resolution and post-incident review, ensuring timely participation of stakeholders and quick resolution that minimizes the impact to the business.
- Quickly identify and classify major incidents based on severity and impact to the business.
- Mobilize and lead the incident response team, which may include IT support staff, developers, network engineers, external vendors, management, business representatives, and other relevant personnel.
- Communicate effectively with all levels of management and stakeholders during major incidents.
- Keep all stakeholders, including management, IT teams, and potentially customers, informed about the status of the incident, actions being taken, and expected resolution times.
- Write high-quality incident and outage communications with zero defects.
- Ensure that the necessary resources and personnel are available and allocated efficiently to resolve the incident as quickly as possible.
- Act as a bridge between technical teams and business stakeholders to ensure clear understanding and communication of technical issues and their business implications.
- Know when to escalate the incident to higher levels of management or to external vendors for additional support.
- Keep detailed records of the incident's timeline, actions taken, resources used, and any communications sent out during the incident management process, in accordance with EY’s Incident Management policies and procedures.
- Conduct thorough analysis after the incident has been resolved to identify root causes, document lessons learned, and develop improvement plans to prevent recurrence.
- Continuously review and improve incident management processes and procedures based on insights gained from past incidents and industry best practices.
- Provide training and guidance to the incident management team and other relevant staff to ensure they are prepared to respond effectively to future incidents.
- Develop and maintain the Major Incident Management Plan and ensure it is followed during incidents.
This individual should possess a combination of technical skills, analytical abilities, and leadership attributes.
To qualify for the role you must have
- Proficiency in ITIL (Information Technology Infrastructure Library) practices, particularly in incident management and response procedures.
- A strong understanding of IT systems, networks, and applications to effectively communicate with technical teams and understand the implications of incidents.
- Excellent verbal and written communication skills to coordinate with internal teams, stakeholders, and possibly customers during major incidents.
- Ability to lead and motivate a team under high-pressure situations, ensuring efficient incident resolution and maintaining team morale.
- Demonstrable ability to lead bridge calls for high-impact incidents and to collaborate with stakeholders at all levels, with the aim of restoring services as soon as possible.
- Strong analytical and problem-solving skills to quickly assess incident reports, identify root causes, and determine the most effective course of action.
- Ability to make critical decisions rapidly, often with incomplete information, to mitigate the impact of incidents.
- Resilience and the ability to remain calm and focused under pressure, as major incidents can be highly stressful and demanding.
- Proficiency in organizing resources, managing time effectively, and prioritizing tasks to handle multiple incidents simultaneously if needed.
- Understanding of customer service principles to ensure that communication and incident resolution are handled in a way that maintains customer trust and satisfaction.
- Commitment to learning from incidents to improve processes and prevent future occurrences, including conducting post-incident reviews and implementing lessons learned.
- Good understanding of ServiceNow or a similar ITSM tool, particularly the Incident and Major Incident Management modules.
Ideally, you’ll also have
- Excellent English language skills (verbal and written)
- Excellent communication, collaboration and basic project management skills
- Good presentation skills with ability to present material clearly and concisely
- Excellent awareness of different cultures and working practices across the regions
- Proven experience in working in, and basic management of, diverse and geographically dispersed teams
What we look for
- A minimum of 5 years of industry experience in a Major Incident Management or Outage Management role.
- Flexible work schedule, including availability to work outside of standard business hours, on weekends, and during holidays as needed to manage and resolve major incidents.
- Willingness to be on-call to respond to high-priority incidents as they occur.
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- ITIL Foundation certification required; intermediate ITIL certification preferred.
What we offer
As part of this role, you'll work in a highly integrated, global team with the opportunity and tools to grow, develop and drive your career forward. Here, you can combine global opportunity with flexible working. The EY benefits package goes above and beyond too, focusing on your physical, emotional, financial and social well-being. Your recruiter can talk to you about the benefits available in your country. Here’s a snapshot of what we offer:
- Continuous learning: You’ll develop the mindset and skills to navigate whatever comes next.
- Success as defined by you: We’ll provide the tools and flexibility, so you can make a meaningful impact, your way.
- Transformative leadership: We’ll give you the insights, coaching and confidence to be the leader the world needs.
- Diverse and inclusive culture: You’ll be embraced for who you are and empowered to use your voice to help others find theirs.
EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets.