Expoint – all jobs in one place

Nvidia Senior Software Engineer, Compute Infrastructure, Robotics Research
United States, Texas 
305038702

Today
US, CA, Santa Clara
US, Remote
Time type: Full time
Posted on: 6 days ago
job requisition id

What you’ll be doing:

  • Develop mechanisms to launch and manage large compute jobs to support multi-modal foundation models for robotics. These will include data jobs, training jobs, evaluation jobs, and so forth.

  • Optimize GPU and cluster utilization for efficient model training, fine-tuning, and evaluation on massive datasets.

  • Develop robust observability tools and procedures for this compute infrastructure to ensure reliability and performance.

  • Collaborate with researchers to integrate innovative compute technologies into scalable training and eval pipelines.

What we need to see:

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience

  • 5+ years of full-time industry experience in large-scale MLOps and AI infrastructure

  • Experience with ML frameworks like PyTorch, JAX, or TensorFlow.

  • Deep understanding of Kubernetes and experience with Ray

  • Experience with data frameworks and standards such as SQL, Apache Spark, and LanceDB

  • Experience with GPU acceleration and CUDA programming

  • Strong programming skills in Python and a high-performance language such as C++ for efficient system development.

Ways to stand out from the crowd:

  • Master’s degree or PhD in Computer Science, Robotics, Engineering, or a related field

  • Demonstrated Tech Lead experience, coordinating a team of engineers and driving projects from conception to deployment

  • Deep background in building and operating large-scale data infrastructure

  • Strong experience in, and curiosity about, frontier AI research

You will also be eligible for equity and .