Expoint – all jobs in one place

NVIDIA Senior Software Engineer, TensorRT Inference
United States, California 
160934721

20.05.2025
Location: US, CA, Santa Clara
Time type: Full time
Posted: 7 days ago

What you’ll be doing:

Key responsibilities include:

  • Design, develop, and optimize NVIDIA TensorRT to deliver tightly coordinated, responsive inference applications for datacenters, workstations, and PCs.

  • Develop software in C++, Python, and CUDA to enable seamless and efficient deployment of state-of-the-art LLM and Generative AI models.

  • Collaborate with deep learning experts and GPU architects across the company to influence hardware and software strategy for inference.

What we need to see:

  • BS, MS, or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience.

  • 8+ years of software development experience on a large codebase or project.

  • Strong proficiency in C++ and Python programming languages.

  • Experience developing deep learning frameworks, compilers, or system software.

  • Foundational knowledge of machine learning techniques or GPU optimization.

  • Excellent problem-solving skills and the ability to learn and work effectively in a fast-paced, collaborative environment.

  • Strong communication skills and the ability to articulate complex technical concepts.

Ways to stand out from the crowd:

  • Background in developing inference backends and compilers for GPUs.

  • Knowledge of GPU programming and optimizations using CUDA or OpenCL.

  • Experience working with LLM inference frameworks such as TRT-LLM, vLLM, or SGLang.

  • Experience working with deep learning frameworks such as TensorRT, PyTorch, or JAX.

  • Knowledge of CUDA performance analysis, optimization techniques, and tools.

You will also be eligible for equity and benefits.