AI Research Scientist - Evaluation, Handshake AI at Handshake
United States, USA
Full Time


Start Date

Immediate

Expiry Date

03 Dec, 25

Salary

$375,000

Posted On

03 Sep, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Research Projects, Python, Computer Science, Machine Learning, Evaluation Methodologies, Cognitive Science

Industry

Information Technology/IT

Description

ABOUT HANDSHAKE AI

Handshake is building the career network for the AI economy. Our three-sided marketplace connects 18 million students and alumni, 1,500+ academic institutions across the U.S. and Europe, and 1 million employers to power how the next generation explores careers, builds skills, and gets hired.
Handshake AI is a human data labeling business that leverages the scale of the largest early career network. We work directly with the world’s leading AI research labs to build a new generation of human data products. From PhDs in physics to undergrads fluent in LLMs, Handshake AI is the trusted partner for domain-specific data and evaluation at scale.
This is a unique opportunity to join a fast-growing team shaping the future of AI through better data, better tools, and better systems—for experts, by experts.

Now’s a great time to join Handshake. Here’s why:

  • Leading the AI Career Revolution: Be part of the team redefining work in the AI economy for millions worldwide.
  • Proven Market Demand: Deep employer partnerships across Fortune 500s and the world’s leading AI research labs.
  • World-Class Team: Leadership from Scale AI, Meta, xAI, Notion, Coinbase, and Palantir, just to name a few.
  • Capitalized & Scaling: $3.5B valuation from top investors including Kleiner Perkins, True Ventures, Notable Capital, and more.

DESIRED CAPABILITIES

  • PhD or equivalent research experience in machine learning, computer science, cognitive science, or a related field, with a focus on AI evaluation or understanding
  • Strong background in LLM research, model evaluation methodologies, interpretability, or foundational AI assessment techniques
  • Demonstrated ability to independently lead post-training and evaluation research projects from theoretical framework to empirical validation
  • Proficiency in Python and deep experience with PyTorch for large-scale model analysis and evaluation
  • Experience designing and conducting experiments with large language models, benchmark development, or systematic model assessment
  • Strong publication record in post-training, AI evaluation, model understanding, interpretability, or related areas that advance our comprehension of AI capabilities
  • Ability to clearly communicate complex insights about model behavior, evaluation methodologies, and their implications for AI development

Responsibilities
  • Design and conduct original research in LLM understanding, evaluation methodologies, and the dynamics of human-AI knowledge interaction
  • Develop novel evaluation frameworks and assessment techniques that reveal deep insights into model capabilities and limitations
  • Collaborate with engineers to transform research breakthroughs into scalable benchmarks and evaluation systems
  • Pioneer new approaches to measuring model understanding, reasoning capabilities, and alignment with human knowledge
  • Write high-quality code to support large-scale experimentation, evaluation, and knowledge assessment workflows
  • Publish findings in top-tier conferences and contribute to advancing the field’s understanding of AI capabilities
  • Work with cross-functional teams to establish new standards for responsible AI evaluation and knowledge alignment