What you will be doing:
Understand, analyze, profile, and optimize deep learning training and inference workloads on state-of-the-art hardware and software platforms.
Collaborate with researchers and engineers across NVIDIA, providing guidance on improving the performance of workloads.
Implement production-quality software across NVIDIA's deep learning platform stack.
Build tools to automate workload analysis, workload optimization, and other critical workflows.
What we want to see:
5+ years of experience.
MSc or PhD in CS, EE, or CSEE, or equivalent experience.
Strong background in deep learning and neural networks, covering both training and inference.
Deep understanding of computer architecture and familiarity with the fundamentals of GPU architecture.
Proven experience analyzing, modeling, and tuning application performance.
Programming skills in C++ and Python.
Ways to stand out from the crowd:
Experience with modern LLM inference frameworks (TensorRT-LLM, vLLM, Ollama, etc.).
Strong fundamentals in algorithms.
Experience with production deployment of Deep Learning models.
Proven experience with processor- and system-level performance modeling.
GPU programming experience (CUDA or OpenCL).