Expoint – all jobs in one place

Nvidia Senior ML Software Engineer 
United States, California 
282359416

02.07.2025
US, CA, Santa Clara
Time type: Full time
Posted on: 15 days ago

If you care deeply about software craftsmanship, maintainability, and performance—and have hands-on experience building ML systems—this role is for you.

What you’ll be doing:

  • Develop and maintain high-quality, modular, and well-tested Python code for large-scale ML infrastructure. See https://github.com/nvidia-cosmos

  • Design and optimize post-training, inference, and data processing pipelines used by groundbreaking ML models.

  • Collaborate with research and product teams to bring ML systems from prototype to production.

  • Contribute to open-source projects and build internal tools that enable scalable AI experimentation.

  • Improve performance, reliability, and observability of large distributed systems.

  • Mentor and support teammates through design reviews, code reviews, and collaborative development.

What we need to see:

  • Expert-level proficiency in Python and a track record of delivering production-quality software.

  • Strong experience with PyTorch (or similar frameworks such as JAX or TensorFlow), especially in real-world applications.

  • Deep understanding of ML system design, training loops, data loaders, evaluation, and model serving.

  • Familiarity with containerization, CI/CD, and maintaining software in production environments.

  • Comfortable working with large codebases, building reusable libraries, and writing documentation and tests.

  • BSc degree or equivalent experience in Computer Science, Engineering, or a related field is preferred.

  • 5+ years of relevant software development experience.

Ways to stand out in the crowd:

  • Contributions to open-source ML or Python infrastructure projects.

  • Background in scaling ML training and inference systems across GPUs as well as experience building libraries that wrap or extend PyTorch functionality.

  • Prior exposure to multimodal models or simulation environments (vision, language, audio).

  • Familiarity with NVIDIA’s GPU compute stack or high-performance computing clusters.

  • Experience with distributed computing tools like DDP, FSDP, ZeRO, or Ray.

You will also be eligible for equity and benefits.