You will work with a multidisciplinary team and actively participate in Apple Intelligence’s data-model co-design and co-development practice. Your responsibilities will extend to the design and development of a comprehensive data generation and curation framework for Apple Intelligence foundation models. You will also be responsible for building robust model evaluation pipelines, which are integral to the continuous improvement and assessment of these models. Furthermore, you will have the opportunity to showcase your groundbreaking research by publishing and presenting at premier academic venues.

Your work may span a variety of directions, including but not limited to:

- Develop and implement techniques for creating high-quality synthetic datasets across a variety of domains, including vision, text, and audio data.
- Innovate and experiment with new approaches to synthetic data generation to improve the diversity, realism, and representativeness of datasets.
- Collaborate with cross-functional teams to understand data requirements and ensure that synthetic datasets are optimized for training foundation models.
- Craft and implement semi-supervised and self-supervised representation learning techniques that harness the power of both limited labeled data and large-scale unlabeled data.
- Develop pipelines and tools to automate synthetic data generation for large-scale AI experiments.
- Stay up to date with the latest research and industry trends in synthetic data generation, foundation model training, and large-scale data engineering.