Lead Data Scientist, Generative AI Products, Digital Transformation

at Harvard University

Boston, Massachusetts, USA

Start Date: Immediate
Expiry Date: 28 Sep, 2024
Salary: Not Specified
Posted On: 28 Jun, 2024
Experience: 3 year(s) or above
Skills: Cloud, Physics, Corrections, SQL, Computer Science, Reach, Information Retrieval, Journals, Data Science, R, Business Opportunities, Secondary Education, Machine Learning, Documentation, Product Management, Data Visualization, Design, Technical Vision, Buy In, Publications
Telecommute: No
Sponsor Visa: No

Description:

POSITION DESCRIPTION

As our Lead Data Scientist, you will collaborate with and shepherd the Data Science and Machine Learning team and will create data science, machine learning, and AI solutions to better address the needs of our constituents (students, alumni, faculty, researchers, staff, and community at large). You will have the chance to guide and continuously improve the ways in which we engage, educate, and empower people around the world, combining the best of human touch and technology scale, experimenting with everything from the latest AI algorithms and techniques to blended and immersive environments, multi-modal and varied-form content, and the most innovative research and teaching methodologies. You will be highly influential in advancing our LLM applications and in guiding teams towards impactful and ethical AI. We seek an expert who is eager to grow and disseminate GenAI model expertise across the organization.

In this role, you will translate the needs of our cross-functional stakeholders into user-facing applications that leverage NLP techniques and large language models (LLMs). As a Lead Data Scientist on our GenAI applications team, you will work on products like conversational search interfaces, chatbots, text summarizers, recommender engines, and more, based on the needs of our constituents. You will partner with Product Managers, Machine Learning Engineers, Cloud Platform Engineers, and cross-functional partners to develop production-grade algorithms. Your innovations will drive value creation through personalized engagement, expanded reach, and experimental ways of learning that will continue Harvard Business School's leadership in education, business, and societal impact.

  • Architect the overall framework and infrastructure for GenAI products like search interfaces, bots, summarizers, etc. Develop and implement techniques to optimize model performance to meet specific product goals.
  • Collaborate closely with product management and engineering leads to align on the technical roadmap. Guide engineering teams to effectively leverage LLM capabilities in product implementations.
  • Establish protocols and systems for building fair, accountable and transparent LLM-based applications. Lead efforts to proactively assess and mitigate risks due to model biases or failures.
  • Implement robust feedback pipelines, monitoring and corrections to ensure model safety.
  • Design and oversee curation of high-quality datasets tailored for LLM training for each product. Build data science pipelines spanning feature generation, data visualization, and model evaluation; design the solution, build initial code, and provide documentation with ways of working to shorten time to value and maximize re-usability.
  • Communicate clearly and effectively to technical and non-technical audiences, verbally and visually, to create understanding, engagement, and buy-in. Contribute novel research and analyses to leading academic conferences and journals.

Additional responsibilities are listed in the Additional Qualifications section below.

BASIC QUALIFICATIONS

  • Minimum of seven years’ post-secondary education or relevant work experience

ADDITIONAL QUALIFICATIONS AND SKILLS

Other Required Qualifications:

  • Bachelor's or advanced degree in Mathematics, Physics, Computer Science, Engineering, Statistics, or 8+ years of equivalent work experience
  • 3-5 years' experience developing a variety of machine learning models and algorithms in a commercial environment, with a track record of creating meaningful business impact
  • Experience with production RAG pipelines and agentic information retrieval and search systems, with the ability to write production-level code
  • Strong Python skills required
  • Minimum of three years’ experience building production NLP and deep learning models using PyTorch/TensorFlow, along with using large language model architectures (BERT, GPT-3, etc.)
  • Experience building advanced workflows such as retrieval augmented generation, model chaining, dynamic prompting, PEFT/SFT, etc. using LangChain and similar tools
  • Proficiency with various prompting techniques, with a clear understanding of tradeoffs between prompting and finetuning
  • Experience with finetuning embedding models and tuning vector databases to improve performance of semantic search and retrieval systems
  • Experience with cloud computing platforms (AWS)
  • Prior experience in leading data science and machine learning focused on solving business problems and seizing business opportunities

Desired/Preferred Qualifications:

  • Proficiency in at least one open-source programming language (R, Java, C++ or another) and SQL desirable
  • Experience establishing model guardrails and developing bias detection and mitigation techniques for AI applications
  • Ability to mentor and lead others; provide hands-on technical guidance; conduct code reviews
  • Ability to simultaneously coordinate and track multiple deliverables, tasks and dependencies across multiple stakeholders / business areas
  • Experience working in agile methodology

Additional duties and responsibilities include, but are not limited to, the following:

  • Identify trends and opportunities to drive innovation, both in what we do and how we do it; evaluate new data science, machine learning, and AI technologies and tools that can boost team performance, innovation and business value. Proactively analyze the latest developments in large language models to deeply understand model capabilities, limitations, and best practices. Develop techniques to continually improve language understanding and model training.
  • Mentor and develop junior data scientists in state-of-the-art GenAI methods
  • Set technical vision and lead initiatives to accelerate product impact through cutting-edge LLM innovations
  • Manage, coach and mentor a team of data scientists, serving as the predominant technical data science and machine learning expert
  • Actively contribute to and re-use community best practices
  • Embody the values and passions that characterize Harvard Business School, with empathy to engage with colleagues from a wide range of backgrounds
  • Promote data science, machine learning, AI, and digital and emerging technologies at Harvard Business School in relevant channels through community engagement, networking, speeches, and publications as applicable
  • This role is responsible for other duties as assigned

ABOUT US

Founded in 1908 as part of Harvard University, Harvard Business School (www.hbs.edu) is located on a 40-acre campus in Boston. The School offers two full-time MBA and PhD programs, more than 175 Executive Education programs, and certificates and courses through Harvard Business School Online. For more than a century, Harvard Business School faculty have drawn on their research, connection to practice, global expertise, and passion for teaching to educate leaders who make a difference in the world. The School and its curriculum attract the boldest thinkers and the most collaborative learners who will shape the practice of business and entrepreneurship around the globe.

REQUIREMENT SUMMARY

Min: 3.0, Max: 8.0 year(s)

Education Management

IT Software - Other

Education, Teaching

Diploma

Proficient

1

Boston, MA, USA