What you'll be doing:
Understand, analyze, profile, and optimize deep learning workloads on state-of-the-art hardware and software platforms.
Build tools to automate workload analysis, workload optimization, and other critical workflows.
Collaborate with cross-functional teams to analyze and optimize cloud application performance on diverse GPU architectures.
Identify bottlenecks and inefficiencies in application code and propose optimizations to enhance GPU utilization.
Drive end-to-end platform optimization from the hardware level to the application and service levels.
Design and implement performance benchmarks and testing methodologies to evaluate application performance.
Provide guidance and recommendations on optimizing cloud-native applications for speed, scalability, and resource efficiency.
Share knowledge and best practices with domain expert teams as they transition applications to distributed environments.
What we need to see:
Master's degree in CS, EE, or CSEE, or equivalent experience
5+ years of experience in application performance engineering
Experience using large-scale, multi-node GPU infrastructure on premises or in CSPs
Background in deep learning model architectures and experience with PyTorch and large-scale distributed training
Experience with application profiling tools such as NVIDIA Nsight and Intel VTune.
Deep understanding of computer architecture, and familiarity with the fundamentals of GPU architecture. Experience with NVIDIA's infrastructure and software stacks.
Proven experience analyzing, modeling and tuning DL application performance.
Proficiency in Python and C/C++ for analyzing and optimizing application code
Ways to stand out from the crowd:
Strong fundamentals in algorithms and GPU programming experience (CUDA or OpenCL)
Understanding of NVIDIA's server and software ecosystem
Hands-on experience in performance optimization and benchmarking on large-scale distributed systems
Hands-on experience with NVIDIA GPUs, HPC storage, networking, and cloud computing.
In-depth understanding of storage systems, Linux file systems, and RDMA networking