Senior Data Scientist at GPTZero

Toronto, ON, Canada

Full Time

Start Date

Immediate

Expiry Date

18 Apr, 25

Salary

0.0

Posted On

19 Jan, 25

Experience

3 year(s) or above

Remote Job

Telecommute

Sponsor Visa

Skills

Pandas, NumPy, Machine Learning, Hive, Data Manipulation, SciPy, Visualization

Industry

Information Technology/IT

Description

WHO ARE WE?

GPTZero’s mission is to restore information quality and transparency on the internet.
Our team comes from high-performing engineering cultures, including Uber, Meta, Microsoft, and Affirm, and from leading AI research labs, including Princeton, Caltech, Vector, and MILA. We build novel models and bring cutting-edge research to production, including AI detection, AI hallucination detection, retrieval-augmented generation, and writing stylometry, serving over 3 million active users and enterprise clients that include Fortune 1000 and unicorn AI companies.
We are backed by some of the best investors in the Valley, including Uncork, Neo, and Altman Capital, and by leaders in journalism, including Mark Thompson (CEO of NYT, BBC, CNN) and Tom Glocer (CEO of Reuters), who defined a generation of quality digital information.

QUALIFICATIONS

3+ years of experience with Pandas, NumPy, and SciPy for data manipulation, visualization, and statistical analysis of complex datasets.
Experience with large-scale data pipelines and feature engineering, using tools such as PySpark, BigQuery, AWS EMR, or Hive.
Strong storytelling ability to communicate complex findings to technical and non-technical audiences.
Self-starter who can pitch, plan, and implement as a project owner in a fast-paced team.
Highly motivated to make a positive societal impact.
Ability to wear multiple hats and be a leader as our team grows.
Ability to work in Canada.
Bonus:
Strong open-source portfolio.
Experience with PyTorch for machine learning.
Experience with Databricks and Amplitude for product analytics.
Experience working in an early-stage startup environment.

RESPONSIBILITIES

Build our understanding of user behaviour and needs through product analytics, using Pandas, NumPy, SciPy, and PySpark.
Write clean, efficient Python and SQL to extract insights, identify patterns, and evaluate the performance of our machine learning models.
Build robust, scalable pipelines to ingest, preprocess, and analyze large-scale datasets using tools like Databricks and Amplitude.
Collaborate with both machine learning engineers and product teams to validate datasets, interpret model outputs, and recommend improvements.