Job responsibilities
- Develop, communicate, and present results to senior leadership, both verbally and in writing.
- Lead continuous improvement in areas such as processes, runbooks, automation, monitoring, postmortems, and reliability.
- Lead incident, change, and problem management, including leading calls, developing reporting and analytics, defining service level objectives, and managing vendors.
- Create lasting relationships with our clients, customers, and stakeholders.
- Give peers and team members feedback on how to upskill as part of employee development.
- Lead and/or participate in the organization's strategy, including developing and tracking goals.
- Analyze, track, and manage incidents for day-to-day operations. Summarize findings for leadership to hold peers in other teams accountable for delivery dates.
- Ensure the quality of our support meets our customers’ expectations. Manage escalations and find ways to create self-service opportunities for our clients.
- Drive results around continued improvement of our network support systems, monitoring, automation, processes, documentation, runbooks, and training.
Required qualifications, capabilities, and skills
- Minimum 7 years of experience in a senior leadership role leading an engineering, SRE, or operations team.
- Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related discipline.
- Expertise in multiple infrastructure technologies and skills from the following list:
  - Data centers: Cisco ACI, or VXLAN with Cisco and/or Arista
  - Server load balancing: F5 local or global traffic management
  - Firewalls: Fortinet
  - Presenting metrics and accomplishments using data analytics
  - Network management and tooling: SevOne, Splunk, Cisco Nexus Dashboard and/or DNA (Digital Network Architecture), SNMP (Simple Network Management Protocol)
  - Ability to read and explain TCP/IP traces
- Ability to organize work for teams of employees so there is clearly documented accountability.
- Strong understanding of infrastructure components and how they are tracked and managed in various systems.
- Ability to facilitate postmortem meetings and document next steps for problems.
- Experience with data analytics, reporting, and identifying patterns in incidents.
- Experience with technological, organizational, and/or operational change management.
- Superior communication skills to lead major incidents with clients, engineers, and vendors.
- Proven ability to drive results with Agile methodologies and Jira (Align).
Preferred qualifications, capabilities, and skills
- Experience with SD-WAN solutions or service-provider-level WAN.
- CI/CD pipelines, Ansible, and Python
- Experience with AWS, Azure, or GCP operations.