Expoint – all jobs in one place

NVIDIA Senior Software Engineer, Machine Learning Inference
United States, Texas 
994831017

US, CA, Santa Clara
US, CA, Remote
Time type: Full time
Posted: 15 days ago
job requisition id

What you’ll be doing:

  • Design, develop, and optimize NVIDIA TensorRT and TensorRT-LLM to supercharge inference applications for datacenters, workstations, and PCs.

  • Develop software in C++, Python, and CUDA for seamless and efficient deployment of state-of-the-art LLMs and Generative AI models.

  • Collaborate with deep learning experts and GPU architects throughout the company to influence hardware and software design for inference.

What we need to see:

  • BS, MS, PhD or equivalent experience in Computer Science, Computer Engineering or a related field.

  • 8+ years of software development experience on a large codebase or project.

  • Strong programming proficiency in C++ (required), Rust, or Python.

  • Experience developing deep learning frameworks, compilers, or system software.

  • Excellent problem-solving skills, a passion for learning, and the ability to work effectively in a fast-paced, collaborative environment.

  • Strong communication skills and the ability to articulate complex technical concepts.

Ways to stand out from the crowd:

  • Experience in developing inference backends and compilers for GPUs.

  • Knowledge of Machine Learning techniques and GPU programming with CUDA or OpenCL.

  • Background in working with LLM inference frameworks like TensorRT-LLM, vLLM, SGLang.

  • Experience working with deep learning frameworks like TensorRT, PyTorch, JAX.

  • Knowledge of close-to-metal performance analysis, optimization techniques, and tools.

You will also be eligible for equity.