What you’ll be doing:
Profile, analyze, and optimize the performance of deep learning models and workloads on groundbreaking hardware and software platforms.
Develop tooling for profiling and microbenchmarking DL workloads that run compiled models, uncovering optimization opportunities.
Collaborate with teams across NVIDIA to provide performance insights and recommendations that improve the design and efficiency of DL frameworks and workloads.
Own the development and implementation of standard methodologies for compiling, testing, and deploying high-performance deep learning models.
Conduct performance benchmarking on enterprise-grade GPU clusters and pre-release hardware, driving improvements to NVIDIA’s DL software stack and hardware roadmap.
What we need to see:
5+ years of experience in deep learning model implementation, software development, and performance optimization.
BSc, MS, or PhD in Computer Science, Computer Engineering, Electrical Engineering, Mathematics, Physics, or a related technical field, or equivalent practical experience.
Proficiency in Python, with extensive hands-on experience using at least one major deep learning framework (e.g., PyTorch, TensorFlow, JAX).
Strong problem-solving and analytical skills, with a proven track record in debugging, performance tuning, and workload optimization.
Experience with deep learning compilers (e.g., PyTorch’s torch.compile, XLA, or other similar technologies).
Ways to stand out from the crowd:
Experience running large-scale workloads on HPC clusters.
Knowledge of, and passion for, DevOps/MLOps practices in deep learning-based product development.
Solid understanding of Linux environments and containerization technologies such as Docker.
Familiarity with GPU programming or parallel computing.