Expoint - all jobs in one place

Nvidia: Senior Deep Learning Architect, LLM Inference
United States, Texas 
289876131

18.08.2024

What you'll be doing:

  • You will be responsible for characterizing the latest LLMs and inference servers like vLLM and DeepSpeed-MII to ensure that TRT-LLM maintains its leadership position.

  • Join forces with the performance marketing team to build engaging content, including blog posts and other written materials, that highlight TRT-LLM's outstanding achievements.

  • Collaborate with engineers from AI startup companies to debug and establish standard methodologies.

  • Profile GPU kernel-level performance to identify hardware and software optimization opportunities.

  • Develop profiling and analysis software tools that can keep up with the rapid pace of network scaling.

  • Contribute to deep learning software projects, such as PyTorch, vLLM, and LLMPerf, to drive advancements in the field.

  • Verify that TRT-LLM's performance meets expectations for new GPU product launches.

  • Collaborate across the company to guide the direction of inference serving, working with software, research, and product teams to ensure world-class performance.

What we need to see:

  • Master's or PhD degree in Computer Science, Electrical Engineering, or related fields, or equivalent experience.

  • 3+ years of relevant / meaningful work experience.

  • Detailed knowledge of deep learning inference serving, PyTorch programming and profiling, and compiler optimizations.

  • Proficiency in C++ and Python programming languages and familiarity with CUDA.

  • Experience with LLMs and their performance challenges and opportunities.

  • Solid understanding of CPU and GPU microarchitecture and performance characteristics.

  • Experience with complex software projects like compilers, operating systems, or frameworks.

  • Good written and verbal communication skills and the ability to work independently and collaboratively in a fast-paced environment.

Ways to stand out from the crowd:

  • Demonstrate a drive to continuously improve software and hardware performance.

  • Showcase a proven history of developing workplace efficiency tools.

  • Experience with database and visualization tools like D3.js will set you apart.

You will also be eligible for equity and benefits.