Data Science Co-op
at Authenticate
Remote, Oregon, USA
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 21 Jul, 2024 | Not Specified | 28 Apr, 2024 | N/A | Interpersonal Skills, Statistics, Python, Normalization, NumPy, Computer Science, Pandas, Mathematics, Data Science, Matplotlib | No | No |
Description:
ABOUT US:
Authenticate.com is a leading provider of identity verification and background check solutions. Our innovative platform helps businesses prevent fraud, ensure compliance, and build trust with their users. We offer a wide range of verification services, including document verification, facial recognition, database checks, and continuous monitoring.
JOB SUMMARY:
We are seeking a highly motivated and detail-oriented Data Scientist Co-op to join our team. As a Data Scientist Co-op, you will play a critical role in developing and maintaining our data infrastructure, with a focus on creating vector databases and utilizing Large Language Models (LLMs) to normalize data for criminal history and employment history. This is an excellent opportunity to apply your data science skills to real-world problems and contribute to the development of innovative solutions in the identity verification and background screening space.
REQUIREMENTS:
- Currently enrolled in a Bachelor’s or Master’s degree program in Computer Science, Data Science, Mathematics, Statistics, or a related field
- Strong programming skills in Python, with experience in data science libraries such as NumPy, Pandas, and scikit-learn
- Familiarity with vector databases and Large Language Models (LLMs) such as BERT, RoBERTa, or DistilBERT
- Experience with data preprocessing, normalization, and feature engineering
- Knowledge of data visualization tools such as Matplotlib, Seaborn, or Plotly
- Excellent problem-solving skills, with the ability to work independently and collaboratively as part of a team
- Strong communication and interpersonal skills, with the ability to explain technical concepts to non-technical stakeholders
How To Apply:
In case you would like to apply to this job directly from the source, please click here
Responsibilities:
- Design, develop, and maintain vector databases to store and query large datasets related to criminal history and employment history
- Utilize Large Language Models (LLMs) to normalize and standardize data from various sources, ensuring consistency and accuracy
- Collaborate with cross-functional teams to integrate vector databases and LLM-based data normalization into our background screening and identity verification products
- Develop and implement data quality control processes to ensure data accuracy, completeness, and integrity
- Analyze and visualize data to identify trends, patterns, and insights that can inform product development and improvement
- Stay up to date with industry trends and advancements in natural language processing, machine learning, and data science
- Communicate technical results and insights to non-technical stakeholders through clear and concise reporting
REQUIREMENT SUMMARY
Min: N/A, Max: 5.0 year(s)
Information Technology/IT
IT Software - DBA / Datawarehousing
Software Engineering
Graduate
Computer Science, Data Science, Mathematics, Statistics, or a related field
Proficient
1
Remote, USA