Expoint – all jobs in one place

NVIDIA AI Computing Software Development Engineer, TensorRT-LLM
Taiwan, Taipei 
106622889

Taiwan, Taipei
Taiwan, Hsinchu
Time type: Full time
Posted: 29 days ago

What you'll be doing:

  • Craft and develop robust inference software that can be scaled to multiple platforms for functionality and performance

  • Analyze, optimize, and tune the performance of Large Language Models (LLMs)

  • Closely follow academic developments in the field of artificial intelligence and update TensorRT-LLM with new features

  • Provide feedback on architecture and hardware design and development

  • Collaborate across the company to guide the direction of deep learning inference, working with software, research and product teams

  • Publish key results in scientific conferences

What we need to see:

  • Master's degree or higher in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience)

  • 3+ years of relevant software development experience

  • Excellent Python programming, software design, and software engineering skills

  • Awareness of the latest developments in LLM architectures and LLM inference techniques

  • Experience working with deep learning frameworks like PyTorch and HuggingFace

  • Proactive and able to work without supervision

  • Excellent written and oral communication skills in English

Ways to stand out from the crowd:

  • Prior experience with an LLM inference framework (TensorRT-LLM, SGLang, vLLM, llama.cpp, MLC-LLM, etc.) or a DL compiler in inference, deployment, algorithms, or implementation

  • Prior experience with performance modeling, profiling, debugging, and code optimization of a DL/HPC/high-performance application

  • Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design

  • Architectural knowledge of CPU and GPU

  • GPU programming experience (CUDA or OpenCL)