Lead Major Incident Support efforts, coordinating with cross-functional teams to ensure timely resolution and communication during critical network incidents.
Conduct Post-Mortem/Root Cause Analysis (RCA) on major incidents to identify underlying causes and implement corrective actions.
Monitor and maintain network infrastructure to ensure optimal performance and availability.
Troubleshoot and resolve complex network issues escalated from Tier-1 and Tier-2 support teams.
Perform network upgrades and maintenance activities, including hardware and software updates.
Manage the Permit-To-Operate (PTx) process, ensuring compliance with operational standards and requirements.
Engage with Product Engineering teams to align network operations with product development and enhancements.
Work with developers to create automation workflows that enhance network operations and efficiency.
Develop and maintain network documentation, including diagrams, configurations, and procedures.
Provide training and mentorship to team members, fostering a culture of continuous learning and improvement.
Participate in the on-call rotation for after-hours support and emergency response.
Required qualifications, capabilities, and skills
Formal training or certification in networking and infrastructure concepts and 5+ years of experience as a technical leader.
10+ years of proven experience managing Major Incident Support, with the ability to lead incident resolution efforts and communicate effectively with stakeholders.
Ability to collaborate with cross-functional teams to design and implement network solutions.
Strong knowledge of network protocols, routing, and switching (e.g., BGP, OSPF, MPLS).
Experience with network monitoring and management tools (e.g., SevOne, Splunk, Wireshark).
Proficiency in configuring and managing network devices (e.g., routers, switches, firewalls).
Strong experience with Bluecoat proxies, F5 load balancers, and Fortinet firewalls.
Excellent problem-solving skills and the ability to work under pressure.