ML Framework (MetalLM) Engineer
at Apple
Cupertino, California, USA
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 03 Dec, 2024 | USD 264200 Annual | 05 Sep, 2024 | N/A | Triton,Optimization Techniques,Machine Learning,Design,Architecture | No | No |
Required Visa Status:
Citizen | GC |
US Citizen | Student Visa |
H1B | CPT |
OPT | H4 Spouse of H1B |
GC Green Card |
Employment Type:
Full Time | Part Time |
Permanent | Independent - 1099 |
Contract – W2 | C2H Independent |
C2H W2 | Contract – Corp 2 Corp |
Contract to Hire – Corp 2 Corp |
Description:
SUMMARY
Posted: Aug 2, 2024
Weekly Hours: 40
Role Number: 200561493
Apple’s ML Frameworks team in the GPU, Graphics and Displays org provides GPU acceleration for popular machine learning libraries such as TensorFlow, PyTorch and JAX using the Metal runtime and device backend. It optimizes compute performance with kernels and computational graphs that are fine-tuned for the unique characteristics of each Metal GPU family. We are always looking for exceptionally dedicated individuals to grow our outstanding team.
DESCRIPTION
Our team is seeking extraordinary machine learning and GPU programming engineers who are passionate about providing robust compute solutions for accelerating machine learning libraries on Apple Silicon. This role has the opportunity to influence the design of compute and programming models in next-generation GPU architectures.
Responsibilities:
- Design and develop compiler-based optimizations for the Metal backend in ML frameworks, such as torch.compile for PyTorch
- Work on a cutting-edge ML inference framework project and optimize code for efficient and scalable ML inference using distributed techniques
- Implement features of the Metal device backend for ML training acceleration technologies
- Work with the core teams of PyTorch, JAX or TensorFlow to provide Metal runtime and device backend support
- Tune GPU-accelerated training across products
- Perform in-depth analysis and compiler- and kernel-level optimizations to ensure the best possible performance across hardware families
Intended deliverables:
- GPU-accelerated ML frameworks technology
- Optimized ML training across products
If this sounds of interest, we would love to hear from you!
MINIMUM QUALIFICATIONS
- 3+ years of programming and problem-solving experience with C/C++/ObjC
- Experience with distributed training or inference techniques
- GPU compute programming models & optimization techniques
- Experience with system-level programming and computer architecture
PREFERRED QUALIFICATIONS
- Contributions to an AI framework such as PyTorch, JAX or TensorFlow are a plus
- Experience with graph compilers such as Triton, OpenXLA or LLVM/MLIR is a plus
- Good understanding of machine learning fundamentals
REQUIREMENT SUMMARY
Min: N/A | Max: 5.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
Graduate
Proficient
1
Cupertino, CA, USA