
Nvidia: Senior Software Engineer, Compute Infrastructure, Robotics Research
United States, Texas 
362105695

Today
US, CA, Santa Clara
US, Remote
Time type: Full time
Posted: 13 days ago
job requisition id

What you’ll be doing:

  • Develop mechanisms to launch and manage large compute jobs that support multi-modal foundation models for robotics, including data, training, and evaluation jobs.

  • Optimize GPU and cluster utilization for efficient model training, fine-tuning, and evaluation on massive datasets.

  • Develop robust observability tools and procedures for this compute infrastructure to ensure reliability and performance.

  • Collaborate with researchers to integrate innovative compute technologies into scalable training and evaluation pipelines.

What we need to see:

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience

  • 12+ years of full-time industry experience in large-scale MLOps and AI infrastructure

  • Experience with ML frameworks like PyTorch, JAX, or TensorFlow.

  • Deep understanding of Kubernetes and experience with Ray

  • Experience with data frameworks and standards such as SQL, Apache Spark, and LanceDB

  • Experience with GPU acceleration and CUDA programming

  • Strong programming skills in Python and a high-performance language such as C++ for efficient system development.

Ways to stand out from the crowd:

  • Master’s or PhD degree in Computer Science, Robotics, Engineering, or a related field

  • Demonstrated Tech Lead experience, coordinating a team of engineers and driving projects from conception to deployment

  • Deep background in building and operating large-scale data infrastructure

  • Strong experience and curiosity in frontier AI research

You will also be eligible for equity and benefits.