Platform ML Engineering Manager, Model Graph
at OpenAI
San Francisco, California, USA
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 16 Feb, 2025 | USD 530000 Annual | 19 Nov, 2024 | N/A | Good communication skills | No | No |
Description:
ABOUT THE TEAM
The Platform ML team builds the ML side of our state-of-the-art internal training framework used to train our cutting-edge models. We work on distributed model execution as well as the interfaces and implementation for model code, training, and inference.
Our priorities are to maximize training throughput (how quickly we can train a new model) and researcher throughput (how quickly we can develop new models) with the goal of accelerating progress towards AGI. We frequently collaborate with other teams to speed up the development of new capabilities.
Responsibilities:
ABOUT THE ROLE
We are looking for an experienced engineering manager to help lead critical work on model definition and efficient distributed execution within our shared internal training stack. Research uses this stack for both large-scale and small-scale runs.
IN THIS ROLE, YOU WILL:
- Reduce the time it takes to try out new architecture ideas for training new models and increase the robustness of model code.
- Collaborate closely with researchers and other systems engineers to maximize the benefits of our shared internal training stack.
- Make it feasible to get SOTA throughput for our most important research models.
- Hire world-class AI systems engineers in one of the most competitive hiring markets.
- Coordinate the training needs of OpenAI’s research teams.
- Create a diverse, equitable, and inclusive culture that makes everyone feel welcome while enabling radical candor and the challenging of groupthink.
YOU MIGHT THRIVE IN THIS ROLE IF YOU:
- Have 3+ years of experience in engineering management and 7+ years as an IC working with high-scale distributed systems and ML systems.
- Have experience with ML systems, particularly high-scale distributed training or inference for modern LLMs.
- Have familiarity with the latest AI research and working knowledge of how these systems are efficiently implemented.
- Care deeply about diversity, equity, and inclusion, and have a track record of building inclusive teams.
REQUIREMENT SUMMARY
Min: N/A | Max: 5.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
Graduate
Proficient
1
San Francisco, CA, USA