AIML - Senior Embedded Machine Learning Engineer - Edge ML at Apple

Seattle, Washington, USA

Full Time

Start Date

Immediate

Expiry Date

13 Jul, 25

Salary

296300.0

Posted On

13 Apr, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Analytical Skills, Python, SIMD, Performance Analysis, Distillation, Computer Architecture, Pruning, Sensors, Compilers, Microcontrollers, Computer Science, Zephyr, Software

Industry

Information Technology/IT

Description

We are looking for a highly motivated and experienced Senior Embedded ML Engineer to join our team focused on enabling cutting-edge machine learning capabilities on resource-constrained edge devices. In this role, you will be at the forefront of innovation as we bridge the gap between state-of-the-art ML models and the realities of low-power, real-time embedded hardware.

DESCRIPTION

You will play a key role in designing, implementing, and optimizing ML solutions for highly constrained compute environments. This is a cross-disciplinary role that blends expertise in embedded systems, computer architecture, and machine learning to unlock new applications in areas such as IoT, wearables, robotics, and autonomous systems.

RESPONSIBILITIES

Design and implement embedded ML pipelines on microcontrollers and custom SoCs with tight compute, memory, and power constraints.
Optimize and quantize deep learning models for real-time inference on edge platforms.
Develop and maintain low-level firmware in C/C++ to integrate ML models with custom hardware accelerators and sensors.
Conduct performance benchmarking, memory profiling, and bottleneck analysis across various embedded platforms.
Collaborate closely with ML researchers, hardware architects, and product engineers to co-design efficient ML solutions from model training to deployment.
Evaluate new edge ML techniques, compilers (e.g., TVM, TFLite Micro, CMSIS-NN), and toolchains to advance the team’s capabilities.
Contribute to the overall system architecture with a deep understanding of embedded compute, memory hierarchies, and data flow optimization.

MINIMUM QUALIFICATIONS

Strong proficiency in C/C++ and Python, with a solid foundation in embedded firmware development.
Deep understanding of computer architecture, particularly ARM Cortex-M/A cores, SIMD, caches, memory alignment, and DMA usage.
Proficiency in model deployment tools and compilers such as TensorFlow Lite for Microcontrollers, TVM, ONNX Runtime, and custom model conversion pipelines.
Demonstrated expertise in performance analysis, using tools like perf, valgrind, gprof, or hardware-specific profilers.
Experience working with hardware interfaces such as SPI, I2C, UART, and integrating with sensors or custom accelerators.
Bachelor’s, Master’s, or PhD in Computer Science or a related field, or equivalent experience.

PREFERRED QUALIFICATIONS

Hands-on experience with deep learning concepts, including model architectures (CNNs, RNNs, Transformers), training workflows, and post-training optimization (quantization, pruning, distillation).
Familiarity with embedded RTOSes (e.g., FreeRTOS, Zephyr) and real-time application constraints.
Comfort with debugging low-level issues across software and hardware boundaries.
Excellent problem-solving and analytical skills with a thorough approach.
Most importantly: a strong curiosity, willingness to dive deep into unfamiliar problems, and an eagerness to learn and grow in a fast-evolving field.

Responsibilities

Please refer to the job description for details.