Expoint – all jobs in one place

Nvidia Senior AI System Engineer 
United States, Texas 
92009857

31.08.2025
US, CA, Santa Clara
US, OR, Hillsboro
US, WA, Redmond
Time type: Full time
Posted: 3 days ago

What You’ll Be Doing:

  • Optimize inference deployment by pushing the Pareto frontier of Accuracy, Throughput and Interactivity at datacenter scale

  • Develop high-fidelity performance models to prototype emerging algorithmic techniques & hardware optimizations to drive model-hardware co-design for Generative AI.

  • Prioritize features to guide future software and hardware roadmap based on detailed performance modeling and analysis

  • Model the end-to-end performance impact of emerging GenAI workflows, such as agentic pipelines and inference-time compute scaling, to understand future datacenter needs

  • Keep up with the latest DL research and collaborate with diverse teams, including DL researchers, hardware architects, and software engineers.

What we need to see:

  • A Master's degree (or equivalent experience) in Computer Science, Electrical Engineering or related fields.

  • 3+ years of hands-on experience in system evaluation of AI/ML workloads or performance analysis, modeling and optimizations for AI

  • Strong background in computer architecture, roofline modeling, queuing theory and statistical performance analysis techniques.

  • Solid understanding of ML fundamentals, model parallelism and inference serving techniques.

  • Proficiency in Python (and optionally C++) for simulator design and data analysis.

  • Experience with GPU computing (CUDA)

  • Experience with deep learning frameworks like PyTorch, TRT-LLM, vLLM and SGLang

  • Growth mindset and pragmatic “measure, iterate, deliver” approach.

Ways to Stand Out from the Crowd:

  • Comfortable defining metrics, designing experiments and visualizing large performance datasets to identify resource bottlenecks.

  • Proven track record of working in cross-functional teams, spanning algorithms, software and hardware architecture.

  • Ability to distill complex analyses into clear recommendations for both technical and non-technical stakeholders.

You will also be eligible for equity.