AIML - Sr Software Data Engineer, Evaluation at Apple
Cupertino, California, United States
Full Time


Start Date

Immediate

Expiry Date

29 May, 26

Salary

Not specified

Posted On

28 Feb, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Data Engineering, Stream Processing, Batch Processing, Flink, Spark, Kafka, Airflow, Iceberg, Trino, Java, Scala, Python, Data Quality, Algorithms, Data Modeling, SQL

Industry

Computers and Electronics Manufacturing

Description
Are you excited about using data to shape the experience of products used by hundreds of millions of people around the world? The Evaluation Data Engineering team, part of Apple’s SWE organization, builds the scalable and reliable data platform that powers Siri, Search, and Machine Learning across Apple. We’re looking for collaborative and mission-driven software engineers who care deeply about data quality, user impact, and building at scale. If you’re passionate about tackling complex data challenges, eager to work with petabytes of data, and inspired by Apple’s commitment to privacy and innovation, we’d love to hear from you.

DESCRIPTION

In this role, you’ll work cross-functionally with product and data science teams to build large-scale stream and batch processing data pipelines that power Analytics, Experimentation, and Machine Learning. You will design a unified data processing framework using Flink and/or Spark. Your work will focus on optimizing performance, ensuring data quality, and contributing to a long-term vision that extends the framework’s capabilities to new user scenarios and groundbreaking machine learning applications. You will collaborate closely with Siri, Search, and other teams to design solutions that transform raw data into datasets that drive innovation. You’ll automate dataset lifecycles with strong quality standards and help partners confidently use the data for product insights.

MINIMUM QUALIFICATIONS

- 7+ years of experience designing, building, and maintaining distributed data processing systems at scale.
- 5+ years of hands-on experience with stream and/or batch processing technologies such as Flink, Spark, Kafka, Airflow, Iceberg, and Trino.
- 2-3 years of experience in full-stack development.
- Proficiency in at least one modern programming language (e.g., Java, Scala, or Python).
- MS or BS in Computer Science, Engineering, Math, Statistics, or a related field, or equivalent practical experience in data engineering.
PREFERRED QUALIFICATIONS

- Strong in algorithms, data structures, data modeling, and SQL, with experience working on large-scale, complex, and high-dimensional datasets.
- Experience with machine learning algorithms or pipelines, particularly in the context of data engineering; experience supporting ML engineers or data scientists with feature engineering or model data pipelines is a plus.
- Familiarity with testing tools and methodologies for validating large-scale, distributed data systems (e.g., data quality checks, pipeline testing frameworks, fault tolerance testing).
- Proven software engineering fundamentals, including experience with design, testing, version control, and CI/CD best practices.
- Comfortable working independently in a fast-paced, ambiguous environment.
- Excellent communication and problem-solving skills.
Responsibilities
This role involves building large-scale stream and batch processing data pipelines that support Analytics, Experimentation, and Machine Learning across products such as Siri and Search. The engineer will design a unified data processing framework, focusing on performance optimization and on ensuring high data quality for derived datasets.