Bachelor's degree in Computer Science, Mathematics, a related technical field, or equivalent practical experience.
7 years of experience with cloud infrastructure (hardware shapes, sizes, auto-scaling, auto-provisioning, etc.), working with infrastructure as a service, platform as a service, or software as a service.
Experience with distributed training and optimizing performance versus costs.
Experience coding in Python, Bash scripting, and using OSS frameworks such as TensorFlow, PyTorch, JAX, etc.
Experience with orchestrators such as Slurm or Kubernetes.
Experience building and operationalizing machine learning models.
Preferred qualifications:
Experience training and fine-tuning large models (e.g., image, language, segmentation, recommendation, genomics) with accelerators.
Experience with containerization and with running Kubernetes (K8s) in the cloud.