Expoint – all jobs in one place

Nvidia Senior GenAI Algorithms Engineer — Post-Training Optimizations 
United States, California 
376667705

Location: US, CA, Santa Clara
Time type: Full time
Posted: 28 days ago

What you’ll be doing:

  • Design and build modular, scalable model optimization software platforms that deliver exceptional user experiences while supporting diverse AI models and optimization techniques to drive widespread adoption.

  • Explore, develop, and integrate innovative deep learning optimization algorithms (e.g., quantization, speculative decoding, sparsity) into NVIDIA's AI software stack, including TensorRT Model Optimizer, NeMo/Megatron, and TensorRT-LLM.

  • Construct and curate large, problem-specific datasets for post-training, fine-tuning, and reinforcement learning.

  • Deploy optimized models into leading OSS inference frameworks and contribute specialized APIs, model-level optimizations, and new features tailored to the latest NVIDIA hardware capabilities.

  • Partner with NVIDIA teams to deliver model optimization solutions for customer use cases, ensuring optimal end-to-end workflows and balanced accuracy-performance trade-offs.

  • Drive continuous innovation in deep learning inference performance to strengthen NVIDIA platform integration and expand market adoption across the AI inference ecosystem.

What we need to see:

  • Master’s, PhD, or equivalent experience in Computer Science, Artificial Intelligence, Applied Mathematics, or a related field.

  • 5+ years of relevant work or research experience in deep learning.

  • Strong software design skills, including debugging, performance analysis, and test development.

  • Proficiency in Python, PyTorch, and modern ML frameworks/tools.

  • Proven foundation in algorithms and programming fundamentals.

  • Strong written and verbal communication skills, with the ability to work both independently and collaboratively in a fast-paced environment.

Ways to stand out from the crowd:

  • Contributions to PyTorch, Megatron-LM, NeMo, TensorRT-LLM, vLLM, SGLang, or other machine learning training and inference frameworks.

  • Hands-on experience training, fine-tuning, or applying reinforcement learning to LLMs or VLMs on large-scale GPU clusters.

  • Proficiency in GPU architectures and compilation stacks, with the ability to analyze and debug end-to-end performance.

  • Familiarity with NVIDIA’s deep learning SDKs (e.g., NeMo, TensorRT, TensorRT-LLM).

You will also be eligible for equity and benefits.