During the internship, you must be enrolled in an M.S. or PhD program in Electrical Engineering/Computer Science or a related field (mathematics, physics, or computer engineering), with a focus on computer vision and/or machine learning
Extensive experience in video machine learning, covering one of the following topics: Video Understanding, Video Foundation Models, or Multi-modal LLMs
Proven prototyping skills and proficiency in coding (C, C++, Python)
Excellent written and verbal communication skills, comfort presenting research to large audiences, and the ability to work hands-on in multi-functional teams
Preferred Qualifications
Publication record in relevant venues (e.g. NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, SIGGRAPH).
Industry experience with multi-modal foundation models and frameworks.
Knowledge and understanding of generative AI, multi-modal large language models, and video captioning.
Solid understanding of the state of the art in Video Understanding, and familiarity with the challenges of developing algorithms that run efficiently on resource-constrained platforms.
Team-oriented, results-oriented, and self-motivated.
Apple is an equal opportunity employer that is committed to inclusion and diversity, and we treat all applicants fairly and equally. Apple is committed to working with and providing reasonable accommodation to applicants with physical and mental disabilities.