NVIDIA Deep Learning Performance Architect Intern
China, Shanghai
400116155

24.06.2024

What you'll be doing:

Analyze the performance of machine learning and deep learning algorithms on existing and new architectures
Identify performance bottlenecks and propose creative solutions to address them
Develop high-performance operators on NVIDIA GPUs for the cuBLAS, TensorRT, cuDNN, cuSPARSE, and cuTENSOR libraries
Design and develop software for shipping and testing the GPU operators
Build scalable automation for testing, integration, and release processes for publicly distributed deep learning libraries
Configure, maintain, and build upon deployments of industry-standard tools (e.g., Kubernetes, Jenkins, Docker, CMake, GitLab, Jira)

What we need to see:

Pursuing a B.S., M.S., or Ph.D. degree in computer science or a similar field
Strong programming skills in C/C++ development
Familiarity with the GPU programming model and CUDA
Good understanding of AI compilation technologies and experience with MLIR and TVM development
Excellent problem-solving skills, plus good communication and teamwork skills
