Tesla Software Engineer Performance AI Infrastructure 
United States, California, Palo Alto 
622212994

23.06.2024
What to Expect

As a Software Engineer within the AI group, you will work on reinforcing, optimizing, and scaling our neural network training and auto-labeling infrastructure for both Autopilot and the Humanoid robot.

What You’ll Do
  • Reduce wall-clock time to convergence of our training jobs by identifying bottlenecks across the ML stack, from data loading up to the GPU
  • Integrate efficient, low-level code with the overall high-level training framework
  • Profile our workloads and implement solutions to increase training efficiency
  • Optimize workloads for efficient hardware utilization (e.g. CPU and GPU compute, data throughput, networking)
What You’ll Bring
  • Extensive experience in CUDA kernel programming and pushing GPUs to their limits
  • Experience programming in Python
  • Experience with at least one deep learning framework (ideally PyTorch)
  • Demonstrated experience in profiling CPU/GPU code
  • Proficiency in system-level software, in particular hardware-software interactions and resource utilization
  • Good knowledge of CUDA kernels used in training state-of-the-art deep learning models
  • Experience with high-performance networking (e.g. InfiniBand, RDMA, NCCL)
  • Experience with Triton (preferred)