Principal Applied Scientist at Microsoft
Beijing, Beijing, China
Full Time


Start Date

Immediate

Expiry Date

03 Mar, 26

Salary

0.0

Posted On

03 Dec, 25

Experience

10 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Applied Science, Machine Learning, Information Retrieval, Data-Driven Solutions, Experimentation, A/B Testing, Citation Systems, Provenance Systems, LLM Development, RAG Systems, Cross-Functional Collaboration, Mentoring, Error Analysis, Privacy Compliance, Ethics, Negotiation

Industry

Software Development

Description
Owns the science roadmap for grounding (retrieval, re-ranking, attribution, and reasoning), driving initiatives from problem framing to production impact.
Designs and evolves state-of-the-art retrieval and RAG orchestration across documents, tables, code, and images.
Builds citation and provenance systems (e.g., passage highlighting, quote-level alignment, confidence scoring) to reduce hallucinations and increase user trust.
Leads experimentation and evaluation using A/B testing, interleaving, NDCG, MRR, precision/recall, and calibration curves to guide measurable trade-offs.
Advances tool-augmented grounding through schema-aware retrieval, function calling, knowledge graph joins, and real-time connectors to databases, cloud object stores, search indexes, and the web.
Partners with platform engineering to productionize models with scalable inference, embedding services, feature stores, caching, and privacy-compliant multi-tenant systems.
Authors white papers, contributes to internal tools and services, and may publish research to generate intellectual property.
Leads high-stakes negotiations to ensure cutting-edge technologies are applied practically and effectively.
Identifies and solves significant business problems using novel, scalable, and data-driven solutions.
Mentors applied scientists and data scientists, establishing best practices in experimentation, error analysis, and incident review.
Collaborates cross-functionally with PMs, research, infrastructure, and security teams to align on milestones, SLAs, and safety protocols.
Communicates clearly through design documentation, progress updates, and presentations to executives and customers.
Contributes to ethics and privacy policies, identifies bias in product development, and proposes mitigation strategies.
Required Qualifications
Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 6+ years related experience (e.g., statistics, predictive analytics, research)
OR Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 4+ years related experience (e.g., statistics, predictive analytics, research)
OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 3+ years related experience (e.g., statistics, predictive analytics, research).
Minimum of 4 years of hands-on experience designing and building search, retrieval, or ranking systems.
Proven track record of shipping LLM-powered or Retrieval-Augmented Generation (RAG) systems into production environments.
Solid coding skills and a solid foundation in machine learning, with the ability to implement and optimize models effectively.
Demonstrated ability to lead through ambiguity, make principled trade-offs, and deliver measurable impact in cross-functional, fast-paced settings.
Preferred Qualifications
Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 9+ years related experience (e.g., statistics, predictive analytics, research)
OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 6+ years related experience (e.g., statistics, predictive analytics, research)
OR equivalent experience.
Demonstrated expertise in information retrieval, with publications in top-tier conferences or journals such as NeurIPS, ICML, ICLR, SIGIR, or ACL.
Hands-on experience in large language model (LLM) development, including pretraining, supervised fine-tuning (SFT), and reinforcement learning (RL).
Proven track record in optimizing LLM inference, or active contributions to open-source frameworks like vLLM, SGLang, or related projects.
Responsibilities
Owns the science roadmap for grounding, driving initiatives from problem framing to production impact. Leads experimentation and evaluation to guide measurable trade-offs and collaborates cross-functionally to align on milestones and safety protocols.