
NVIDIA Principal Deep Learning Software Engineer, LLM Performance
United States, Texas 
505217568

Today
US, CA, Santa Clara
US, CA, Remote
Time type: Full time
Posted: 6 days ago

What you'll be doing:

  • Performance optimization, analysis, and tuning of LLM, VLM and GenAI models for DL inference, serving and deployment in NVIDIA/OSS LLM frameworks.

  • Scale performance of LLM models across different architectures and types of NVIDIA accelerators.

  • Scale performance for maximum throughput, minimum latency, and throughput under latency constraints.

  • Contribute features and code to NVIDIA/OSS LLM frameworks, inference benchmarking frameworks, TensorRT, and Triton.

  • Collaborate with cross-functional teams across generative AI, automotive, image understanding, and speech understanding to develop innovative solutions.

What we need to see:

  • Bachelor's, Master's, PhD, or equivalent experience in a relevant field (Computer Engineering, Computer Science, EECS, AI).

  • At least 12 years of relevant software development experience.

  • Excellent Python/C/C++ programming, software design, and software engineering skills.

  • Experience with a DL framework such as PyTorch, JAX, or TensorFlow.

Ways to stand out from the crowd:

  • Prior experience with an LLM framework or a DL compiler in inference, deployment, algorithms, or implementation

  • Prior experience with performance modeling, profiling, debugging, and code optimization of a DL/HPC/high-performance application

  • Architectural knowledge of CPU and GPU

  • GPU programming experience (CUDA or OpenCL)

You will also be eligible for equity and benefits.