Senior Applied Scientist at Microsoft
Suzhou, Jiangsu, China
Full Time


Start Date

Immediate

Expiry Date

17 Feb, 26

Salary

0.0

Posted On

19 Nov, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Statistics, Predictive Analytics, Research, Synthetic Data Generation, Data Ingestion, Evaluation Metrics, Generative AI Systems, Multi-Agent Frameworks, Azure Machine Learning, Collaboration, Analytical Mindset, Problem-Solving, Communication, Quality Assurance, Engineering Rigor

Industry

Software Development

Description
Design and implement offline evaluation strategies that capture real-world usage and reflect end-user preferences. Develop scientifically sound metrics that diagnose model regressions, benchmark against baselines (e.g., ChatGPT, Glean), and validate product improvements. Manufacture synthetic yet realistic user activity data using LLMs to simulate diverse usage scenarios. Collaborate on multi-agent systems or agentic workflows to automate evaluation flows and generate high-signal insights. Analyze evaluation outputs to identify gaps in coverage, quality, and usability across Copilot canvases. Partner with engineering and PMs to ensure insights are integrated into product workflows and experimentation pipelines. Publish learnings in internal forums and at external conferences, and contribute to best practices in applied science.

Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 4+ years related experience (e.g., statistics, predictive analytics, research) OR Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 3+ years related experience (e.g., statistics, predictive analytics, research) OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 1+ year(s) related experience (e.g., statistics, predictive analytics, research).

Excellent communication and collaboration skills, with the ability to work across engineering and product management.
These requirements include but are not limited to the following specialized security screenings:

Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 6+ years related experience (e.g., statistics, predictive analytics, research) OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 3+ years related experience (e.g., statistics, predictive analytics, research) OR equivalent experience.

3+ years' experience conducting research as part of a research program (in academic or industry settings). 1+ year(s) experience developing and deploying live production systems, as part of a product team. Experience with synthetic data generation, data ingestion, and management, especially for evaluation or training purposes. Experience designing or implementing evaluation metrics and methodologies for LLMs or generative AI systems. Experience developing agentic solutions using LLMs or multi-agent frameworks. Familiarity with SEVAL and its application in offline evaluation pipelines. Proficient in using Azure Machine Learning (AML) for model development, pipeline orchestration, experiment tracking, and compute/resource management.

Solid analytical mindset with a data-driven approach to problem-solving, consistently upholding high standards of quality and engineering rigor. Collaborative and team-oriented, skilled at working across disciplines, levels, and product areas to drive alignment and shared success.
Responsibilities
Design and implement offline evaluation strategies that capture real-world usage and reflect end-user preferences. Collaborate on multi-agent systems to automate evaluation flows and generate high-signal insights.