Member of Technical Staff at Microsoft
Redmond, Washington, United States - Full Time


Start Date

Immediate

Expiry Date

17 Feb, 26

Salary

0.0

Posted On

19 Nov, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Dataset Design, Model Training, Reinforcement Learning, Data Infrastructure, Data Quality Assessment, Tool Development, Collaboration, Research, Multimodal Models, Data Preprocessing, Benchmarking, Ablation Studies, Visualization, Versioning, Optimization, Impact Measurement

Industry

Software Development

Description
Design & Evaluate Datasets - Build high-quality datasets and benchmarks for training AI models; run ablation studies to measure impact and optimize data effectiveness.
Advance Model Training - Apply deep expertise in pre-training, post-training, and reinforcement learning (RL) for both language and multimodal models.
Develop Data Infrastructure - Create and maintain scalable pipelines for ingestion, preprocessing, filtering, and annotation of large, complex datasets.
Data Quality & Analysis - Assess real-world multimodal datasets (text, image, video, audio, code) for quality, diversity, and relevance; identify gaps and propose improvements.
Tooling & Workflows - Build lightweight tools for dataset auditing, visualization, and versioning to streamline experimentation.
Research & Innovation - Collaborate with cross-functional teams to push research and product boundaries, delivering models that make a real-world impact.
Other - Embody our Culture and Values.
Doctorate OR equivalent experience.
Responsibilities
The role involves designing and evaluating datasets for AI model training and advancing model training through deep expertise in pre-training, post-training, and reinforcement learning. The candidate will also develop data infrastructure and assess the quality of multimodal datasets, identifying gaps and proposing improvements.