What you'll be doing:
Research and Development: Explore and incorporate contemporary research on generative AI, agents, and inference systems into the NVIDIA LLM software stack.
Workload Analysis and Optimization: Conduct in-depth analysis, profiling, and optimization of agentic LLM workloads to significantly reduce request latency and increase request throughput while maintaining workflow fidelity.
System Design and Implementation: Design and implement scalable systems to accelerate agentic workflows and efficiently handle sophisticated datacenter-scale use cases.
Collaboration and Communication:
What we need to see:
BS, MS, or PhD in Computer Science, Electrical Engineering, Computer Engineering, or a related field (or equivalent experience).
8+ years of experience in deep learning and deep learning systems design.
Proficiency in Python and C++ programming.
Strong understanding of computer architecture and GPU/parallel datacenter computing fundamentals.
Proven interest in analyzing, modeling, and tuning application performance.
Ways to stand out from the crowd:
Experience in building large-scale LLM inference systems, especially those involving compound AI.
Experience with processor and system-level performance modeling.
GPU programming experience with CUDA or OpenCL.
You will also be eligible for equity.