Nvidia Deep-Learning Software Engineer Performance Optimization
Japan, Tokyo
895387066

29.04.2025

What you'll be doing:

Analyze, profile and optimize the latest DNN AI algorithms, and implement them as production-quality software libraries for latency-critical use cases on next-generation hardware.
Push the boundaries of the state of the art in DNN performance and efficiency, including model compression, quantization and architecture search techniques.
Collaborate with researchers and engineers across NVIDIA to improve the architecture of future NVIDIA chips and ensure that they are ready to support the latest advances in AI.
Help NVIDIA customers bring ground-breaking products to life on the foundation of NVIDIA AI technology.

What we need to see:

University degree, or equivalent knowledge, in Computer Science, Electrical Engineering, Physics or Mathematics.
5+ years of work experience in related fields, such as HPC, numeric computing, machine learning or AI, with responsibility for software optimization.
Proficiency in C++, Python, data structures, algorithms, computer architecture and operating system concepts.
Knowledge of deep-learning toolchains (PyTorch, TensorFlow, Keras, ONNX, TensorRT, numeric libraries, containers, etc.).
Experience with neural network training, pruning and quantization, and with deploying DNN inference in production systems.
Experience optimizing and implementing compute algorithms on accelerated hardware, such as SIMD instruction sets, GPUs, FPGAs or DNN ASICs.
Familiarity with CNN, LLM and ViT architectures, as well as the latest progress in the field.
Experience creating DNN models for solving production problems in any domain, including computer vision, speech recognition, natural language processing, optimization or generative AI.

Ways to stand out from the crowd:

Experience implementing DNN inference natively using C++, CUDA kernels or low-level libraries, such as BLAS.
Experience with distributed deep-learning infrastructure, HPC or cloud programming.
Contributions to open-source projects, including personal projects published as open source (please provide a link to the GitHub repository).
Papers published at relevant conferences or in journals (e.g. NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, SIGGRAPH).
Achievements in programming or machine learning competitions, such as Kaggle, HackerRank or TopCoder.
