Software Engineer - Systems ML - PyTorch at Meta
Bellevue, WA 98005, USA
Full Time


Start Date

Immediate

Expiry Date

16 Oct, 25

Salary

56.25

Posted On

17 Jul, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Runtime Analysis, HIP, SIMD, LLVM, Computer Science, Computer Engineering, Vectorization, OpenMP, Kernel Programming

Industry

Computer Software/Engineering

Description

In this role, you will be a member of the PyTorch Core Systems team. The PyTorch team develops the open source software stack powering AI models and systems. The Systems team builds and optimizes high-performance software to train and serve AI architectures. You will work closely with AI researchers to analyze deep learning models and optimize their performance within PyTorch. You will also partner with researchers to understand modern advances in AI-guided software development and apply them directly to PyTorch code and device optimization.

MINIMUM QUALIFICATIONS:

  • Currently has, or is in the process of obtaining, a Bachelor’s degree in Computer Science, Computer Engineering, or a relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
  • Proven C/C++ programming skills
  • Experience in AI framework development or accelerating deep learning models on hardware architectures

PREFERRED QUALIFICATIONS:

  • Knowledge of GPU, CPU, or AI hardware accelerator architectures
  • Experience working with frameworks like PyTorch, Caffe2, TensorFlow, ONNX, TensorRT
  • OR AI high-performance kernels: Experience with CUDA programming, OpenMP/OpenCL programming, or AI hardware accelerator kernel programming. Experience in accelerating libraries on AI hardware, similar to cuBLAS, cuDNN, CUTLASS, HIP, ROCm, etc.
  • OR AI compiler: Experience with compiler optimizations such as loop optimizations, vectorization, and parallelization, and with hardware-specific optimizations such as SIMD. Experience with MLIR, LLVM, IREE, XLA, TVM, or Halide is a plus
  • OR AI frameworks: Experience in developing training and inference framework components. Experience in system performance optimizations such as runtime analysis of latency, memory bandwidth, I/O access, and compute utilization, and in developing the associated tooling

RESPONSIBILITIES:

  • Improve PyTorch’s state-of-the-art training, post-training, and inference on modern AI hardware accelerators
  • Develop PyTorch’s software stack with a focus on AI frameworks and high-performance kernel development
  • Tune and optimize the performance of deep learning framework and software components
  • Collaborate with AI research scientists to accelerate the next generation of deep learning models, such as recommendation systems, generative AI, computer vision, and NLP