"VAST's data management vision is the future of the market." - Forbes
As a Technical Escalation Engineer, you will be responsible for monitoring and maintaining the health and performance of our fleet of installed clusters. You will work in a 24/7 network operations center-style environment, ensuring the availability, reliability, and security of services. This role involves real-time monitoring, incident detection, incident management, incident resolution, and clear written and verbal communication with other teams and stakeholders.
The Role
- Monitor clusters using internal monitoring tools to detect and troubleshoot issues promptly.
- Respond to alerts and incidents in a timely manner, following standard operating procedures (SOPs) and escalation processes.
- Perform initial investigation and diagnosis of problems, escalating complex issues to Support.
- Document incidents, including their details, troubleshooting steps, and resolutions in the incident tracking system.
- Collaborate with customers and other teams, including Support, R&D, and Account teams, to ensure effective incident resolution and communication.
- Conduct routine checks and audits to identify potential problems or vulnerabilities.
- Assist with the implementation of changes and updates to the infrastructure as directed by team leads.
- Assist with writing Root Cause Analysis documentation and delivering it to customers within prescribed timelines.
- Participate in shift-based work schedules, including nights, weekends, and holidays, to provide 24/7 coverage.
- Maintain up-to-date knowledge of VAST Data Platform technologies via prescribed hands-on training modules.
- Adhere to security protocols and ensure the confidentiality, integrity, and availability of network and system data.
- Provide excellent customer service to internal and external stakeholders during incident resolution and communication.
Requirements
- Proven experience as a NOC Operator or in a similar network monitoring role is preferred.
- Superior communication skills, both written and verbal, to interact with technical and non-technical stakeholders.
- Strong understanding of networking concepts, protocols, and technologies (TCP/IP, SNMP, DHCP, DNS, etc.).
- Ability to work independently and collaboratively in a team-based environment.
- Excellent problem-solving and analytical skills, with the ability to multitask effectively.
- Willingness to work in a 24/7 shift-based environment, including nights, weekends, and holidays. Shift options are Wednesday – Saturday, Sunday – Wednesday, or Monday – Friday.
- Detail-oriented and committed to maintaining accurate documentation.
- Demonstrated commitment to continuous learning and self-improvement.