Expoint - all jobs in one place

NVIDIA Senior Developer Technology Engineer - Germany, North Rhine-Westphalia
449735582

07.04.2024
What you’ll be doing:
  • Work closely with internal engineering teams and external partners to solve challenges in local, end-to-end GPU deployment of LLMs and Generative AI.

  • Apply powerful profiling and debugging tools to analyze the most demanding GPU-accelerated end-to-end AI applications and detect insufficient GPU utilization that results in suboptimal runtime performance.

  • Conduct hands-on training sessions, develop sample code, and host presentations that give clear guidance on efficient end-to-end AI deployment targeting optimal runtime performance.

  • Guide AI application developers in applying methodologies for efficient adoption of DL frameworks, targeting maximal utilization of GPU Tensor Cores for the best possible inference performance.

  • Collaborate with GPU driver and architecture teams, as well as NVIDIA research, to influence next-generation GPU features by providing real-world workflows and giving feedback on partner and customer needs.

What we need to see:
  • Deep theoretical knowledge of Transformer architectures, specifically LLMs and Generative AI, and of convolutional neural networks.

  • 8+ years of professional experience in local GPU deployment, profiling, and optimization.

  • BS or MS degree in Computer Science, Engineering, or a related field.

  • Strong proficiency in C/C++, Python, software design, and programming techniques.

  • Experience working with AI inference frameworks.

  • Experience with CUDA and NVIDIA's Nsight GPU profiling and debugging suite.

  • Strong verbal and written communication skills in English, good organizational skills, and a logical approach to problem solving, time management, and task prioritization.

  • Excellent interpersonal skills.

  • Some travel is required for conferences and for on-site visits with external partners.

Ways to stand out from the crowd:
  • Proficiency in GPU-accelerated AI inference driven by NVIDIA APIs, specifically cuDNN, TensorRT, and TensorRT-LLM.

  • Experience with AI deployment on NPUs and ARM architectures.

  • Proven expert knowledge of Vulkan and/or DX12.

  • Detailed knowledge of the latest-generation GPU architectures.