* Work alongside the Foundation Model Research team to prototype and develop inference for cutting-edge model architectures.
* Build tools to understand inference bottlenecks across different hardware and use cases.
* Mentor and guide engineers in the organization.
Bachelor’s degree or higher in Computer Science or a related technical field.
8+ years of experience leading and driving complex, ambiguous projects.
Industry background and experience in ML technologies (LLMs, Machine Learning, NLP, Information Retrieval, Statistics).
Experience with high-throughput services, particularly at supercomputing scale.
Proficient in running applications in the cloud (AWS, Azure, or equivalent) using Kubernetes, Docker, etc.
Proficient in building and maintaining systems written in modern languages (e.g., Go, Python).
Familiar with at least one popular ML framework, such as PyTorch or TensorFlow.
Familiar with fundamental deep learning architectures such as Transformers and encoder/decoder models.
Familiarity with NVIDIA TensorRT-LLM, vLLM, DeepSpeed, NVIDIA Triton Inference Server, etc.
Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.