What you’ll be doing:
Develop and refine infrastructure for scalable AI workloads optimized specifically for NVIDIA hardware and software.
Debug and troubleshoot AI jobs running on NVIDIA hardware, using tools such as Nsight, CUDA, and NCCL to improve efficiency and performance.
Collaborate with internal teams, including research, engineering, and data science, to identify areas for efficiency improvement across NVIDIA's AI workloads.
Optimize AI frameworks, particularly PyTorch, to fully utilize NVIDIA hardware capabilities.
Create and maintain NVIDIA-specific performance analysis tools and scripts to monitor, assess, and improve AI infrastructure reliability and performance.
Utilize NVIDIA's profiling tools (e.g., Nsight Systems, Nsight Compute, DCGM) to gain deep insights into bottlenecks and optimize AI task performance on NVIDIA hardware.
What we need to see:
8+ years of experience in debugging and optimizing AI workloads on NVIDIA GPUs.
Bachelor's or Master's degree in Computer Science or Software Engineering, or equivalent experience.
Strong programming experience with Python, especially using AI frameworks like PyTorch, with a focus on NVIDIA optimization.
Proficiency in C++, with experience in GPU programming, CUDA, and familiarity with NVIDIA-specific optimizations.
Knowledge of NVIDIA hardware architectures, including GPU memory hierarchy, CUDA cores, and performance tuning techniques.
Ways to stand out from the crowd:
Experience with large-scale distributed AI systems on cloud or on-premises infrastructure using NVIDIA hardware.
Extensive experience with the Slurm workload manager and Kubernetes (K8s).
Understanding of parallel programming and experience with multi-threaded or distributed computing on NVIDIA platforms.
You will also be eligible for equity.