5+ years of experience leading and driving complex, ambiguous projects.
Experience with high-throughput services, particularly at supercomputing scale.
Proficient in running applications in the cloud (AWS, Azure, or equivalent) using Kubernetes, Docker, etc.
Familiar with GPU programming concepts using CUDA.
Familiar with one of the popular ML frameworks, such as PyTorch or TensorFlow.
Proficient in building and maintaining systems written in modern languages (e.g., Golang, Python).
Familiar with fundamental Deep Learning architectures such as Transformers, Encoder/Decoder models.
Familiarity with NVIDIA TensorRT-LLM, vLLM, DeepSpeed, NVIDIA Triton Server, etc.
Experience writing custom GPU kernels using CUDA or OpenAI Triton.
Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.