What you’ll be doing:
Develop components of TensorRT-LLM, NVIDIA’s best-in-class library for optimizing LLM inference performance on NVIDIA GPUs.
Provide expert solutions to internal and external TensorRT-LLM users across GitHub and internal forums, and help manage TensorRT-LLM’s Open Source Software (OSS) repo on GitHub.
Collaborate across diverse teams of deep learning experts, GPU architects and DevOps engineers within NVIDIA, as well as the larger deep learning community, in an open-source development process.
What we need to see:
A Bachelor's, Master's, or PhD degree, or equivalent experience, in Computer Science, Computer Engineering, Electrical Engineering, or a related field.
6+ years of software development experience.
Strong experience with Python.
Strong grasp of Machine Learning concepts, especially related to Large Language Models.
Excellent communication skills, and an aptitude for collaboration and teamwork.
Ways to stand out from the crowd:
Strong experience with C++11/C++14/C++17.
Background with OSS development (prior contributions to related deep learning projects a big plus!).
Background in working with vLLM, TensorRT, PyTorch, JAX, or other ML frameworks.
Experience collaborating with external customers and end users to effectively disambiguate and resolve complex technical issues.
Experience in software performance benchmarking, profiling, and optimizations.
You will also be eligible for equity and .