NVIDIA Deep Learning Performance Architect
China, Shanghai 
670845494

24.06.2024
What you'll be doing:
  • Design and develop the architecture, interface and features of the GPU kernel library

  • Keep improving the quality and performance of the library and its GPU kernels

  • Explore and expand the boundary of innovative technologies like GPU code generation and fusion

  • Contribute to NVIDIA's AI business by collaborating closely with DL product teams as well as kernel development teams

What we need to see:
  • MS, PhD or equivalent in relevant fields (CS, EE, Math)

  • 2+ years of relevant work or research experience

  • Strong programming skills in C, C++, and Python

  • Excellent problem-solving skills and learning capability

  • Experience with designing software architecture, interfaces, and building testing infrastructures

  • Good communication skills and a great teammate

Ways to stand out from the crowd:
  • Familiar with CUDA programming and GPU architecture

  • Familiar with TensorRT, cuDNN, cuBLAS, etc.

  • Background in DL fundamentals, frameworks, graph compilers, LLVM, MLIR, etc.

  • Hands-on development experience on Linux and Windows platforms, with C++ build tools such as CMake and DevOps tools such as Docker, Jenkins, and Kubernetes

  • Track record of mentoring junior engineers and leading projects and teams