We develop, prototype, and productize state-of-the-art algorithms for neural network model compression. Our algorithms are implemented in PyTorch, and our optimizations are geared toward efficient deployment via Core ML. We optimize models across domains, including NLP, vision, and text and image generative models.

Key responsibilities of this role are:

* Setting up and/or streamlining CI and automation pipelines, adopting best practices and integrating with the latest Apple internal CI services.
* Enhancing the release process: automating nightly builds, setting up scheduled CI runs for different levels of testing, etc.
* Innovating in model testing and benchmarking (accuracy and latency) across combinations of model types in different domains (vision, text, audio, etc.) and compression algorithms (quantization, pruning, palettization, etc.), and discovering trends and the effects of various hyperparameters.
* Being passionate about engineering efficiency and finding innovative ways to reduce test time while maintaining a high bar for test coverage.
* Obsessing about user experience and improving it. You are excited to fix bugs, understand user pain points, and actively participate in supporting users.
* Developing integrations of the model optimization library with other training engines and data platforms at Apple.
* Keeping the code base up to date with the latest versions of Python, PyTorch, NumPy, etc.
* Setting up and debugging training jobs, datasets, evaluation, and performance benchmarking pipelines. This includes ramping up quickly on new training code bases and running detailed experiments and ablation studies to profile algorithms on various models and tasks, across different model sizes.
* Improving model optimization documentation and writing tutorials and guides.
* Self-prioritizing and adjusting to changing priorities and asks.