What you’ll be doing:
Understand, analyze, profile, and optimize AI training workloads on state-of-the-art hardware and software platforms.
Guide development of future generations of artificial intelligence accelerators and systems.
Develop detailed performance models and simulator infrastructure for computing systems that accelerate AI training, and implement and evaluate hardware feature proposals.
Collaborate across the company to guide the direction of machine learning at NVIDIA, spanning teams from hardware to software and from research to production.
Drive HW/SW co-design of NVIDIA’s full deep learning platform stack, from silicon to DL frameworks.
What we'd like to see:
PhD in CS, EE, or CSEE and 5+ years of relevant work experience; or MS (or equivalent experience) and 8+ years.
Strong background in computer architecture, with a proven track record of architecting features in shipping high-performance processors.
Background in artificial intelligence and large language models, in particular training algorithms and workloads.
Experience analyzing and tuning application performance on state-of-the-art hardware.
Experience with processor- and system-level performance modeling, simulation, and pre-silicon evaluation.
Programming skills in C++ and Python.
Familiarity with GPU computing across all layers of the AI stack, from DL frameworks like PyTorch down to CUDA.
You will also be eligible for equity and benefits.