Expoint - all jobs in one place

NVIDIA Deep-Learning Software Engineer, Performance Optimization
Japan, Tokyo 
895387066

01.12.2024

What you'll be doing:

  • Analyze, profile and optimize the latest DNN AI algorithms, and implement them as production-quality software libraries for latency-critical use cases on next-generation hardware.

  • Push the boundaries of the state of the art in DNN performance and efficiency, including model compression, quantization and architecture search techniques.

  • Collaborate with researchers and engineers across NVIDIA to improve the architecture of future NVIDIA chips and ensure that they are ready to support the latest advances in AI.

  • Help NVIDIA customers bring ground-breaking products to life on the foundation of NVIDIA AI technology.

What we need to see:

  • University degree, or equivalent knowledge, in Computer Science, Electrical Engineering, Physics or Mathematics.

  • 5+ years of work experience in related fields, such as HPC, numerical computing, machine learning or AI, with responsibility for software optimization.

  • Proficiency in C++, Python, data structures, algorithms, computer architecture and operating system concepts.

  • Knowledge of deep-learning toolchains (PyTorch, TensorFlow, Keras, ONNX, TensorRT, numeric libraries, containers, etc.).

  • Experience with neural network training, pruning and quantization, and with deploying DNN inference in production systems.

  • Experience optimizing and implementing compute algorithms on accelerated hardware, such as SIMD instruction sets, GPUs, FPGAs or DNN ASICs.

  • Familiarity with CNN, LLM and ViT architectures, as well as the latest progress in the field.

  • Experience creating DNN models for solving production problems in any domain, including computer vision, speech recognition, natural language processing, optimization or generative AI.

Ways to stand out from the crowd:

  • Experience implementing DNN inference natively using C++, CUDA kernels or low-level libraries, such as BLAS.

  • Experience building distributed deep-learning infrastructure, or with HPC or cloud programming.

  • Contributions to open-source projects, including personal projects published as open source (please provide a link to the GitHub repository).

  • Papers published at relevant conferences or in journals (e.g. NIPS, ICML, ICLR, CVPR, ICCV, ECCV, SIGGRAPH).

  • Achievements in programming or machine learning competitions, such as Kaggle, HackerRank or TopCoder.