Expoint - all jobs in one place

Nvidia Senior Deep Learning Performance Engineer 
Poland, Masovian Voivodeship, Warsaw 
612055350

01.12.2024

What you’ll be doing:

  • Profile, analyze, and optimize the performance of deep learning models and workloads on groundbreaking hardware and software platforms.

  • Develop tooling for profiling and microbenchmarking DL workloads that run compiled models, in order to uncover optimization opportunities.

  • Collaborate with teams across NVIDIA to provide performance insights and recommendations that improve the design and efficiency of DL frameworks and workloads.

  • Own the development and implementation of standard methodologies for compiling, testing, and deploying high-performance deep learning models.

  • Conduct performance benchmarking on enterprise-grade GPU clusters and pre-release hardware, driving improvements to NVIDIA’s DL software stack and hardware roadmap.

What we need to see:

  • 5+ years of experience in deep learning model implementation, software development, and performance optimization.

  • BSc, MS, or PhD in Computer Science, Computer Engineering, Electrical Engineering, Mathematics, Physics, or a related technical field, or equivalent practical experience.

  • Proficiency in Python, with extensive hands-on experience using at least one major deep learning framework (e.g., PyTorch, TensorFlow, JAX).

  • Strong problem-solving and analytical skills, with a proven track record in debugging, performance tuning, and workload optimization.

  • Experience with deep learning compilers (e.g., PyTorch’s torch.compile, XLA, or other similar technologies).


Ways to stand out from the crowd:

  • Experience running large-scale workloads on HPC clusters.

  • Knowledge of and passion for DevOps/MLOps practices in the development of deep learning-based products.

  • Solid understanding of Linux environments and containerization technologies such as Docker.

  • Familiarity with GPU programming or parallel computing.