Senior Software Engineer, Data Acquisition
at OpenAI
San Francisco, California, USA
| Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Immediate | 31 Jan, 2025 | USD 385,000 Annual | 01 Nov, 2024 | N/A | Communication Skills, Software Development, Kubernetes, Code, Computer Science, Data Processing, Distributed Systems | No | No |
Required Visa Status:
US Citizen, Green Card (GC), H1B, OPT, CPT, Student Visa, H4 (Spouse of H1B)
Employment Type:
Full Time, Part Time, Permanent, Independent (1099), Contract (W2), Contract (Corp-to-Corp), Contract to Hire (W2), Contract to Hire (Corp-to-Corp), Contract to Hire (Independent)
Description:
OVERVIEW:
The Data Acquisition team within the Foundations organization at OpenAI is responsible for all aspects of data collection that support our model training operations. The team manages web crawling and GPTBot services and works closely with the Data Processing, Architecture, and Scaling teams. We are looking for a skilled Senior Software Engineer to join the Data Acquisition team.
QUALIFICATIONS:
- BS/MS/PhD in Computer Science or a related field.
- 6+ years of industry experience in software development.
- Experience with large-scale web crawlers is a plus.
- Strong expertise in large stateful distributed systems and data processing.
- Proficiency in Kubernetes and Infrastructure-as-Code concepts.
- Willingness and enthusiasm to try new approaches and technologies.
- Ability to handle multiple tasks and adapt to changing priorities.
- Strong communication skills, both written and verbal.
RESPONSIBILITIES:
- Own and lead engineering projects in the area of data acquisition, including web crawling, data ingestion, and search.
- Collaborate with other sub-teams, such as Data Processing, Architecture, and Scaling, to ensure smooth data flow and system operability.
- Work closely with the legal team to handle any compliance or data privacy-related matters.
- Develop and deploy highly scalable distributed systems capable of handling petabytes of data.
- Architect and implement algorithms for data indexing and search capabilities.
- Build and maintain backend services for data storage, including work with key-value databases and synchronization.
- Deploy solutions in a Kubernetes Infrastructure-as-Code environment and perform routine system checks.
- Conduct and analyze experiments on data to provide insights into system performance.
REQUIREMENT SUMMARY
Min: N/A | Max: 5.0 year(s)
Industry: Information Technology/IT
Category: IT Software - Other
Functional Area: Software Engineering
Education: Graduate
Proficiency: Proficient
Openings: 1
Location: San Francisco, CA, USA