Expoint – all jobs in one place
The place where the best experts and companies meet

NVIDIA Director, Global Network Reliability Engineering
United States, California 
262667862

Location: US, CA, Santa Clara
Time type: Full time
Posted: 4 days ago

What you will be doing:

  • Your main focus will be maturing the current support model and processes into a more data-driven, automated SRE model.

  • Build an in-house team of reliability experts for networking support and operations from the existing outsourced SMEs, providing leadership, direction, and strategy for a growing team.

  • Set the technical vision, strategy, and roadmap for network operations in partnership with the key infrastructure and partner teams.

  • Partner with Network Architecture and Network Engineering to establish runbooks, hold regular training sessions, and ensure we build the network to be self-healing.

  • Understand root cause analyses (RCAs) from events and incidents, and work with our AI operations to enrich our observability tooling for a better full-stack view from the network to applications.

  • Influence the architecture of NVIDIA's networks, both on-prem and in the cloud.

What we need to see:

  • Bachelor’s degree in Computer Science, related technical field, or equivalent experience

  • Experience building and growing geographically distributed teams that appreciate local operations and bring a global perspective while following standards.

  • Ability to do technical deep-dives into code, networking, operating systems, and storage, as well as being verbally and cognitively agile enough to hold your own in strategy discussions with NVIDIA’s executive team and peer SMEs

  • Ability to identify trends and promote solutions that solve challenges efficiently across multiple product areas

  • Excellent innovative thinking, collaboration, and problem-solving skills.

  • 12+ years of overall experience with system design, network architecture, network engineering, and network operations, and 7+ years of leadership experience

Ways to stand out from the crowd:

  • Experience transforming network operations using software-driven methods

  • Experience in a Hyperscale Cloud Service Provider (public facing or not)

  • Knowledge of SRE principles (observability, SLOs, SLIs, logging, etc.)

  • Knowledge of software interface design & documentation for less technical end-users

You will also be eligible for equity and .