Tesla - HPC Engineer, AI Infrastructure
United States, California, Palo Alto
78093135

23.04.2025

What You’ll Do

Support the AI/ML cluster infrastructure on both GPU and Dojo platforms, focusing on systems automation, configuration management and deployment at scale
Improve our monitoring & self-healing pipelines, as well as security posture
Work with hardware and storage vendors to tune and optimize our server, storage and network performance
Performance tuning & OS provisioning on Linux systems
Manage HPC clusters, workloads and applications
Automation and systems engineering
Participate in 24x7 on-call rotation

What You’ll Bring

Proficiency with scripting languages such as Python or Bash
Proficiency with Linux & network fundamentals
Experience with configuration management software (Ansible, etc.) and systems monitoring & alerting tools (Prometheus, Grafana, Telegraf, Splunk, etc.) is a plus
Experience with high-throughput low-latency networks, GPU-based computing systems, and/or high performance storage systems is a plus
Experience with Slurm, LSF, and storage management of parallel file systems is a plus
Bachelor's Degree in Computer Science, Computer Engineering, Electrical Engineering, Physics, or proof of exceptional skills in a related field
3+ years of additional equivalent experience or evidence of exceptional ability related to the position
