AIML - Full Stack ML Engineer, LLM Optimization
at Apple
Cupertino, California, USA
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 09 Sep, 2024 | USD 300,200 Annual | 10 Jun, 2024 | N/A | Programming Languages, C++, Natural Language, Python | No | No |
Description:
SUMMARY
Posted: Jun 7, 2024
Weekly Hours: 40
Role Number: 200554477
As a Machine Learning Engineer in the LLM Optimization team at Apple, you will have the opportunity to be part of an innovative ML organization that enables LLMs for Apple products. The LLM Optimization team focuses on designing and implementing ML-based solutions to improve runtime latency, training time, memory usage, time to first token, and decoding speed across all Apple applications. The team is strategically positioned for significant contributions both in the short term (on well-known Apple products) and in the long term (on highly ambitious, high-risk, high-reward projects). This role emphasizes shipping ML-based features and products.

As a Full Stack ML Engineer, you will innovate across the entire end-to-end ML production pipeline. Your responsibilities will include but are not limited to:
- Designing new neural network architectures
- Developing efficient model training and fine-tuning methods
- Enhancing on-device and server-side inference

Our ideal team member is fearless in trying new things and willing to iterate on ideas. We value team members who can quickly prototype and iterate towards high-quality implementations.
DESCRIPTION
As a Full Stack ML Engineer on our team, you will leverage your background to:
- Design and implement ML-based solutions to improve runtime latency, training time, memory usage, time to first token, and decoding speed for Apple applications
- Innovate across the entire end-to-end ML production pipeline, including dataset creation, neural network architecture design, model training, fine-tuning methods, training time optimization, and on-device and server-side inference
- Quickly prototype and iterate to achieve high-quality implementations for pioneering machine learning algorithms
- Collaborate with hardware and software teams to integrate research findings into market-ready solutions
- Translate theoretical ideas into tangible innovations, demonstrating their industrial applicability
PREFERRED QUALIFICATIONS
- Strong ML background
- Proficiency in programming languages and frameworks: Python, C++, PyTorch/TensorFlow/JAX
- Experience with Natural Language Processing (NLP) and ML optimization, with a focus on LLMs
- Outstanding communication and technical writing skills, capable of conveying complex concepts clearly and efficiently
- Preferred: notable achievements validated by quality publications in ML optimization, with a focus on LLMs
Responsibilities:
Please refer to the job description for details.
REQUIREMENT SUMMARY
Min: N/A, Max: 5.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
Graduate
Proficient
1
Cupertino, CA, USA