
Nvidia Senior Software Development Engineer TensorRT-LLM 
United States, Texas 
681718503

10.11.2025
US, CA, Santa Clara
US, CA, Remote
Time type: Full time
Posted on: 9 days ago

What you'll be doing:

  • Craft and develop robust inferencing software that can be scaled to multiple platforms for functionality and performance

  • Perform benchmarking, profiling, and system-level programming for GPU applications.

  • Closely follow academic developments in the field of artificial intelligence and bring relevant feature updates to TensorRT

  • Provide code reviews, design docs, and tutorials to facilitate collaboration among the team.

  • Conduct unit tests and performance tests for different stages of the inference pipeline.

  • Collaborate across the company to guide the direction of machine learning inferencing, working with software, research and product teams

  • Write safe, scalable, modular, and high-quality (C++/Python) code for our core backend software for LLM inference.

  • Improve the usability of the TensorRT-LLM library and build systems (CMake)

What we need to see:

  • Master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience)

  • 4+ years of relevant software development experience.

  • Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design.

  • Strong curiosity about artificial intelligence and awareness of the latest developments in deep learning, such as LLMs and generative and recommender models

  • Experience working with deep learning frameworks like TensorFlow and PyTorch

  • Self-starter who consistently takes initiative to drive projects forward

  • Excellent written and oral communication skills in English

Ways to stand out from the crowd:

  • Prior experience with an LLM framework or a DL compiler in inference, deployment, algorithms, or implementation

  • Prior experience with performance modeling, profiling, debugging, and code optimization of a DL/HPC/high-performance application

  • Architectural knowledge of CPU and GPU

  • GPU programming experience (CUDA or OpenCL)

You will also be eligible for equity and benefits.