NVIDIA Senior Deep Learning Manager, LLM Inference
United States, California
961878289

09.09.2025
US, CA, Santa Clara
Time type: Full time
Posted on: Posted 6 Days Ago

What you'll be doing

  • Manage a team that characterizes the latest LLMs and inference servers such as TensorRT-LLM, vLLM, and SGLang to ensure that NVIDIA maintains its leadership position.

  • Join forces with the performance marketing team to build engaging content, including blog posts and other written materials, that highlight TensorRT-LLM's outstanding achievements.

  • Collaborate with engineers from AI startup companies to debug and establish standard methodologies.

  • Profile GPU kernel-level performance to identify hardware and software optimization opportunities.

  • Develop profiling and analysis software tools that can keep up with the rapid pace of network scaling.

  • Contribute to deep learning software projects, such as PyTorch, TRT-LLM, vLLM, and SGLang, to drive advancements in the field.

  • Verify that TRT-LLM's performance meets expectations for new GPU product launches.

  • Collaborate across the company to guide the direction of inference serving, working with software, research, and product teams to ensure world-class performance.

What we need to see

  • Master's or PhD degree in Computer Science, Computer Engineering, or related fields, or equivalent experience.

  • 10+ years of overall software development experience and at least 3 years of management experience.

  • Detailed knowledge of deep learning inference serving, PyTorch programming, profiling, and compiler optimizations.

  • Proficiency in Python and C++ programming languages and familiarity with CUDA.

  • Experience with LLMs and their performance challenges and opportunities.

  • Solid understanding of CPU and GPU microarchitecture and performance characteristics.

  • Experience with complex software projects like frameworks, compilers, or operating systems.

  • Good written and verbal communication skills and the ability to work independently and collaboratively in a fast-paced environment.

Ways to stand out from the crowd

  • Demonstrate a drive to continuously improve software and hardware performance.

  • Showcase examples of novel use cases for agentic AI tools in the workplace.

  • Experience with database and visualization tools like D3.js will set you apart.

You will also be eligible for equity and benefits.