AIML - Sr Data Scientist, Evaluation at Apple

Cupertino, California, United States

Full Time

Start Date

Immediate

Expiry Date

19 Jun, 26

Salary

0.0

Posted On

21 Mar, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Data Science, Machine Learning, Evaluation, Human Evaluation, A/B Testing, UX Studies, Automated Testing, Metrics, LLMs, Large Language Models, Data Curation, Roadmap Development, Pipeline Productionalization, Data Training, Foundation Models

Industry

Computers and Electronics Manufacturing

Description

At Apple, we blend world-class hardware, software, and services to produce a tight ecosystem of products used by billions of people around the world. Machine Learning and AI models are integrated throughout our ecosystem to enhance the user experience. Ensuring a delightful end-to-end user experience requires well-thought-out data curation, rigorous evaluation methodology, and a high-signal north star metric. Guiding feature development across the entire product life cycle requires partnering closely with modeling teams, engineers, design, and others. At Apple’s scale, it’s paramount that your evaluation methods scale across platforms, user-model mediums, and languages.

DESCRIPTION

Advance the performance of Apple Intelligence Large Language Models across Apple platforms, languages, and modalities.
Advise on the data training and product roadmap for Foundation Models.
Design detailed evaluation methodologies in close partnership with engineering and product teams to support critical go/no-go decisions.
Develop and evangelize robust metrics for hill-climbing.
Prototype impactful analyses and productionalize novel pipelines.
Regularly present to partner teams and executive sponsors, at both a technical and a high level.

MINIMUM QUALIFICATIONS

Bachelor's degree and 7+ years of work experience in Data Science, Machine Learning, or Analytics.
Expertise in evaluation, such as Human Evaluation, A/B Testing, UX Studies, Automated Testing, or similar.
Strong intuition for metrics, including their strengths and shortcomings.
A track record of positive impact on shipped products.

PREFERRED QUALIFICATIONS

Experience with LLMs, especially as it relates to evaluation and data.
Strong communication skills and experience persuading both technical audiences and high-level executive audiences.
Passion for Apple products.

Responsibilities

Advance the performance of Apple Intelligence Large Language Models across platforms, languages, and modalities, while advising on data training and product roadmaps for Foundation Models. Design detailed evaluation methodologies in partnership with engineering and product teams to support critical go/no-go decisions and develop robust metrics for improvement.