Computer Vision/Machine Learning Intern (Multi-modality LLM) at Apple

Beijing, Beijing, China

Full Time

Start Date

Immediate

Expiry Date

02 Feb, 26

Salary

0.0

Posted On

04 Nov, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Computer Vision, Machine Learning, Video Quality Assessment, Multi-Modal LLM, Post-Training, Prototyping, C, C++, Python, Written Communication, Verbal Communication, Teamwork, Generative AI, Large Language Model, Video Understanding

Industry

Computers and Electronics Manufacturing

Description

If you are the kind of person who is passionate about pursuing excellence, embraces challenges, enjoys working with others, and learns new things along the way, Apple is the right place for you. The ideal candidate will possess the self-motivation, curiosity, and initiative to achieve those goals. Likewise, the candidate is a lifelong learner who passionately seeks to improve themselves and the quality of their work.

DESCRIPTION

The computer vision algorithm intern will work in a dynamic team within the Video Engineering org, which develops multi-modality-based video quality assessment technologies on Apple platforms. We balance research and product to deliver the highest-quality, state-of-the-art experiences, innovating through the full stack and partnering with cross-functional teams to influence what brings our vision to life and into customers' hands.

Keywords: Multi-Modal LLM; Video Quality Assessment; Post-Training

MINIMUM QUALIFICATIONS

M.S. or PhD in Electrical Engineering/Computer Science or a related field (mathematics, physics, or computer engineering), with a focus on computer vision and/or machine learning
Extensive experience in video machine learning covering one of the following topics: Multi-Modal LLM, Video Quality Assessment, Post-Training
Proven prototyping skills and coding proficiency (C, C++, Python)
Excellent written and verbal communication skills, comfort presenting research to large audiences, and the ability to work hands-on in multi-functional teams

PREFERRED QUALIFICATIONS

Publication record in relevant venues (e.g. NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, SIGGRAPH)
Industry experience with multi-modal foundation models and frameworks
Knowledge and understanding of generative AI, multi-modal large language models, and video quality assessment
Solid understanding of the state of the art in video understanding
Team-oriented, results-oriented, and self-motivated

Responsibilities

The intern will work in a dynamic team within the Video Engineering organization, focusing on multi-modality-based video quality assessment technologies. They will balance research and product development to deliver high-quality experiences and collaborate with cross-functional teams.