Expoint - all jobs in one place

Tesla Staff Performance Optimization Engineer, AI Infrastructure
United States, California, Palo Alto 
871180934

06.04.2025
What You’ll Do
  • Work with a wide variety of teams at Tesla to accelerate time-to-market for new ML models
  • Design, implement, and deploy low-overhead instrumentation methods for troubleshooting performance issues
  • Analyze collected telemetry, identify bottlenecks, and design practical solutions to overcome them
  • Develop data-driven performance improvements to existing software pipelines
  • Work with each team to validate these performance improvements and incorporate them into production training runs
  • Help make Tesla's ambitious AI-related products and services a reality
What You’ll Bring
  • A deep understanding of the internals of GPU-based training and inference workloads, especially handoffs of data and computation between host CPUs and GPUs
  • Real-world knowledge of supporting languages and libraries used in large-scale AI training runs (CUDA/ZLUDA, OpenCL, PyTorch, TensorFlow, GPUDirect and other RDMA-enabling services)
  • Experience developing and tuning low-level software using languages like C, x86 assembly, and Rust
  • Practical experience with different performance analysis techniques (profiling, tracing, simulative analysis) and an understanding of when each should be applied
  • Excellent spoken and written communication skills, including the ability to concisely communicate data-driven root causes of performance issues and how they can be remedied
  • An irrational love for high-performance computing and extracting the maximum number of productive FLOPs from modern AI-oriented architectures