

What You’ll Be Doing:
Lead and grow a team responsible for specialized kernel development, runtime optimizations, and frameworks for LLM inference.
Drive the design, development, and delivery of production inference software, targeting NVIDIA's next-generation enterprise and edge hardware platforms.
Integrate cutting-edge technologies developed at NVIDIA and offer an intuitive developer experience for LLM deployment.
Lead software development execution, with responsibility for project planning, milestone delivery, and cross-functional coordination.
What We Need to See:
MS, PhD, or equivalent experience in Computer Science, Computer Engineering, AI, or a related technical field.
7+ years of overall software engineering experience, including 3+ years of technical leadership experience.
Proven ability to lead and scale high-performing engineering teams, especially across distributed and cross-functional groups.
Strong background in C++ or Python, with expertise in software design and delivering production-quality software libraries.
Demonstrated expertise in large language models (LLMs) and/or vision language models (VLMs).
Ways to Stand Out from the Crowd:
Deep understanding of GPU architecture, CUDA programming, and system-level performance tuning.
Background in LLM inference or working with frameworks such as TensorRT-LLM, vLLM, or SGLang.
Passion for building scalable, user-friendly APIs and enabling developers in the AI ecosystem.
Proven track record of growing and managing a team that encourages idea sharing, empowers team members, and provides opportunities for professional growth.
You will also be eligible for equity and .