Data Science Co-op

at Authenticate

Remote, Oregon, USA

Start Date: Immediate
Expiry Date: 21 Jul, 2024
Salary: Not Specified
Posted On: 28 Apr, 2024
Experience: N/A
Skills: Interpersonal Skills, Statistics, Python, Normalization, Numpy, Computer Science, Pandas, Mathematics, Data Science, Matplotlib
Telecommute: No
Sponsor Visa: No

Description:

ABOUT US:

Authenticate.com is a leading provider of identity verification and background check solutions. Our innovative platform helps businesses prevent fraud, ensure compliance, and build trust with their users. We offer a wide range of verification services, including document verification, facial recognition, database checks, and continuous monitoring.

JOB SUMMARY:

We are seeking a highly motivated and detail-oriented Data Scientist Co-op to join our team. As a Data Scientist Co-op, you will play a critical role in developing and maintaining our data infrastructure, with a focus on creating vector databases and utilizing Large Language Models (LLMs) to normalize data for criminal history and employment history. This is an excellent opportunity to apply your data science skills to real-world problems and contribute to the development of innovative solutions in the identity verification and background screening space.

REQUIREMENTS:

  • Currently enrolled in a Bachelor’s or Master’s degree program in Computer Science, Data Science, Mathematics, Statistics, or a related field
  • Strong programming skills in Python, with experience in data science libraries such as NumPy, Pandas, and scikit-learn
  • Familiarity with vector databases and Large Language Models (LLMs) such as BERT, RoBERTa, or DistilBERT
  • Experience with data preprocessing, normalization, and feature engineering
  • Knowledge of data visualization tools such as Matplotlib, Seaborn, or Plotly
  • Excellent problem-solving skills, with the ability to work independently and collaboratively as part of a team
  • Strong communication and interpersonal skills, with the ability to explain technical concepts to non-technical stakeholders

Responsibilities:

  • Design, develop, and maintain vector databases to store and query large datasets related to criminal history and employment history
  • Utilize Large Language Models (LLMs) to normalize and standardize data from various sources, ensuring consistency and accuracy
  • Collaborate with cross-functional teams to integrate vector databases and LLM-based data normalization into our background screening and identity verification products
  • Develop and implement data quality control processes to ensure data accuracy, completeness, and integrity
  • Analyze and visualize data to identify trends, patterns, and insights that can inform product development and improvement
  • Stay up-to-date with industry trends and advancements in natural language processing, machine learning, and data science
  • Communicate technical results and insights to non-technical stakeholders through clear and concise reporting


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - DBA / Datawarehousing

Software Engineering

Graduate

Computer Science, Data Science, Mathematics, Statistics, or a related field

Proficient

1

Remote, USA