Is it your desire to be part of a premier group of technology specialists who can make a difference in IBM Cloud? Individuals performing this role optimize the availability of IBM's Cloud infrastructure, systems, and services to meet the commitments IBM has made to its clients in a cost-effective manner. Availability Managers (AVMs) act as Incident Commanders, leading and driving resolution of Critical Impacting Events (CIEs), and as Problem Management experts for IBM Cloud’s platforms and services.
Key Responsibilities:
- Lead and steer Incident Management driving to Service resolution.
- Perform situational appraisal, assess CIE severity/priority and appraise User impact extent from client-facing teams
- Mobilize and coordinate recovery efforts across necessary support functions, personnel and leadership to expedite end-to-end troubleshooting, fault domain isolation and urgent resolution.
- Maintain a multi-tiered Plan of action tracking time-bound deliverables and actions.
- Maintain a heightened level of sensitivity for future / potential business impact and risk to customers
- Record and maintain incident record with recovery process noting areas of improvement.
- Provide clear communications centered on clients, internal leadership and stakeholders.
- Train, coach, and review proper Problem Management with the problem owners.
- Identify areas of improvement for problem owners to include in order to target problem resolution, and identify additional areas to improve the overall time to resolution.
- Utilize tooling and technical knowledge to assure services and components are designed and delivered to meet their availability targets.
- Provide a holistic view of the cloud environment and make recommendations to improve overall service.
- Identify and/or lead Service Improvement Programs (SIP) for chronic conditions
- Maintain focus on time-bound deliverables and actions
- Focus on individual/team objectives and development of professional effectiveness.
- Lead strategic areas of importance to the service team.
- Be recognized as an incident and problem management thought leader and subject matter expert
Required Technical and Professional Expertise
- Enterprise incident command and control.
- Understanding of industry methodologies (5 Whys Root Cause Methodology, Failure Modes and Effects Analysis, Kepner-Tregoe, etc.)
- Fundamental and/or working knowledge of Cloud technologies
- Knowledge and experience working with any number of enterprise technologies including but not limited to Compute (Server/OS), Database, Network, Storage, Middleware, Perimeter Security (Firewall, VPN, Host / Application Security)
- Working knowledge and experience with ServiceNow
- ITIL V4 proficient.
Preferred Technical and Professional Expertise
- ITIL V4 certification.
- Kepner-Tregoe certification
- Working knowledge of Financial Services
Soft skills / abilities required for you to be successful in this role include:
- Critical Thinking, Problem Solving, Active Listening and Deductive Reasoning
- Leadership – Capacity, Capability and Competency (“Leaders inspire others to take action”); Command and Control presence
- Ability to “command the room” in a professional manner
- Ability and confidence to act decisively and take constructive feedback onboard
- Exercise influence over others across various levels of the organization (manage up, down and across)
- Ability to multi-task effectively and make sound judgments in a dynamic and high impact setting
- Capable of constructively challenging assumptions and information that do not accurately reflect the situation at hand
- Excellent phone / video presence and written / verbal communication skills
- Strong relationship management and client-centric mindset
- Ability to work on-call rotations during business hours and, occasionally, on weekends, holidays and after hours.
- Fluent verbal and written communication in English