Expoint - all jobs in one place

Nvidia Senior Deep Learning Architect 
United States, Texas 
795848171

02.05.2024

What You'll Be Doing:

  • The software architecture group at NVIDIA has openings for a Deep Learning Architect. We scale DNN models and frameworks to systems with hundreds of thousands of nodes.

  • This position involves tracking advancements in deep learning frameworks, models, and the latest deep learning research, both internally and externally. You will then translate these findings into scaling and networking requirements.

  • Collaborate across NVIDIA to provide guidance on scaling deep learning frameworks and models, working with teams spanning from hardware to software, and research to production.

  • Understand, analyze, profile, and optimize deep learning training workloads using new hardware and software platforms.

  • Scale deep learning training workloads on systems equipped with deep learning accelerators, InfiniBand, and RoCE networks.

  • Provide guidance for the development of future generations of deep learning processors and network hardware.

What We Need to See:

  • A Ph.D., Master's, or BS in Computer Science (CS), Electrical Engineering (EE), Computer Science and Electrical Engineering (CSEE), or a closely related field, or equivalent experience.

  • 5+ years of experience with DNNs, scaling of DNNs, parallelism in DNN frameworks, or deep learning training workloads.

  • Background in machine learning and neural networks, particularly in the area of training.

  • Experience in analyzing and optimizing application performance on cutting-edge hardware.

  • Deep understanding of parallelism techniques, including Data Parallelism, Pipeline Parallelism, Tensor Parallelism, and Fully Sharded Data Parallelism (FSDP).

  • Proficiency in developing code for one or more deep neural network (DNN) training frameworks, such as Caffe, TensorFlow, or Torch.

  • Strong programming skills in C++ and Python.

  • Familiarity with GPU computing, including CUDA and OpenCL, and with InfiniBand and RoCE networks.

Ways to Stand Out from the Crowd:

  • Prior contributions to one or more DNN training frameworks as part of your previous work experience.

  • Deep understanding of, and contributions to, the scaling of large-scale language models.

You will also be eligible for equity.