What you will be doing:
Lead the architecture, design, and deployment of global-scale backbone and fabric for HPC, AI, and GPU computing clusters.
Develop high-performance data center fabric using InfiniBand, high-throughput Ethernet, and related technologies.
Optimize carrier interconnects, backbone routing, and dark fiber deployments to ensure low latency and high reliability.
Partner with system, OS, GPU, and HPC teams to deliver scalable, highly available networks for extreme-performance workloads.
Implement network monitoring, telemetry, troubleshooting, and continuous performance improvement processes.
Drive technology selection, vendor engagement, and lifecycle management for backbone networking hardware and software.
Ensure security, compliance, and reliability across all backbone components to support sensitive compute loads and business requirements.
Collaborate with internal product managers to develop NVIDIA-on-NVIDIA solutions.
What we need to see:
MS or PhD in Electrical Engineering, Computer Science, Computer Engineering, Artificial Intelligence, Data Science, Mathematics, Statistics, or equivalent experience.
12+ years of experience building, managing, and supporting large-scale hybrid networks and developing automation pipelines with Python, Ruby, Go, or other languages used in infrastructure automation.
Expert in networking technologies: TCP/UDP, IPv4/IPv6, BGP/MP-BGP, VPN, L2 switching, EVPN, VXLAN, Segment Routing, MPLS, IS-IS, DWDM.
Experience automating SDN/NFV/NFVI infrastructure.
Experience using an automated configuration management system (Terraform, Chef, Puppet, Ansible, Salt, etc.).
You will also be eligible for equity.