What you will be doing:
Understand, analyze, profile, and optimize AI training workloads on new hardware and software platforms, identifying fundamental performance limiters.
Prioritize and solve performance issues across all new neural networks, keeping the big picture of training performance on GPUs in mind.
Implement production-quality software across multiple layers of NVIDIA's deep learning platform stack, from drivers to DL frameworks.
Build and support NVIDIA submissions for MLPerf Training benchmarks.
Implement key DL training workloads in NVIDIA's proprietary processor and system simulators to enable future architecture studies.
Develop tools to automate workload analysis, optimization, and other critical workflows.
What we want to see:
PhD in CS, EE or CSEE (or equivalent experience) with 5+ years of relevant experience; or MS with 8+ years of experience.
Strong background in deep learning and neural networks, particularly in training.
Solid understanding of computer architecture and familiarity with GPU fundamentals.
Proven background in analyzing and tuning application performance.
Proven experience with processor and system-level performance modeling.
Proficiency in programming with C++, Python, and CUDA.
You will also be eligible for equity.