AIML - Staff Machine Learning Engineer - ML Platform & Technology at Apple

Seattle, Washington, USA

Full Time

Start Date

Immediate

Expiry Date

27 May, 25

Salary

166600.0

Posted On

27 Feb, 25

Experience

7 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Platforms, Distributed Systems, Data Structures, RDMA, Python, Azure, Algorithms, Kafka, Kubernetes, Design Principles, CUDA, Computer Science, AWS

Industry

Information Technology/IT

Description

SUMMARY

Posted: Jan 23, 2025
Role Number: 200587726
Want to build the machine learning platform that engineers rely on to develop next-generation Apple Intelligence products and services? As a machine learning engineer on our team, you will create platforms and tools that enable performant, scalable ML workloads for Apple’s AI-driven experiences. Join our team of highly skilled, impact-focused engineers! This role also includes opportunities to open source your work and publish at top ML conferences.

DESCRIPTION

We’re searching for strong machine learning engineers to help build a next-generation platform for training deep learning models at scale. You’ll be part of a team of distributed systems and machine learning experts, focusing on the reliability, scalability, and performance of ML-related workloads, including model training, inference, and data processing. We’re looking for candidates with polished coding skills as well as a passion for distributed systems and machine learning. In exchange, we offer a respectful work environment, flexible responsibilities, and access to world-class experts and growth opportunities, all at one of the best companies in the world.

Design and develop components for our centralized, scalable ML platform.
Push the limits of existing solutions for large-scale ML workloads.
Develop novel techniques to circumvent the limitations of these solutions.
Deploy your techniques on high-impact tasks from our partners across the company building new Apple Intelligence products and services.

We encourage publishing novel work at top ML conferences and releasing contributions as open source.

MINIMUM QUALIFICATIONS

Strong programming skills in Python or Go
Understanding of data structures, software design principles, and algorithms
Experience building large-scale distributed systems with tools such as Kubernetes, Kafka, Prometheus, etc.
Experience with deep learning frameworks such as PyTorch or JAX
Minimum of 7 years of industry experience
Bachelor's degree in Computer Science or a related domain, or equivalent

PREFERRED QUALIFICATIONS

Experience building large-scale deep learning infrastructure or platforms for distributed model training
Experience with large-scale AI training infra components, such as accelerators, network fabrics, CUDA, NCCL, RDMA
Experience working with public cloud vendors such as AWS, GCP, or Azure
Experience developing model parallel and data parallel training solutions and other training optimizations
Familiarity with recent developments in foundation model architectures for language and multimodal models
Publication record at ML conferences such as MLSys, NeurIPS, etc.
Master's or PhD in Computer Science or a related domain, or equivalent

Responsibilities

Please refer to the job description for details.