Senior Data Scientist - Revenue Intelligence at GITHUB INC

Remote, Oregon, USA

Full Time

Start Date

Immediate

Expiry Date

12 Nov, 25

Salary

299300.0

Posted On

13 Aug, 25

Experience

1 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Mathematics, Causal Inference, Unstructured Data, Revenue, Dashboards, Computer Science, Utilization, Storytelling, SQL, Statistics, Airflow, Economics, Classification, Experimental Design, Python, Physics, Spark, Time Series Analysis, R, Programming Languages

Industry

Information Technology/IT

Description

About GitHub: As the global home for all developers, GitHub is the complete AI-powered developer platform to build, scale, and deliver secure software. More than 150 million developers, including more than 90% of the Fortune 100 companies, use GitHub to collaborate and experiment across 420+ million repositories. With all the collaborative features of GitHub, it has never been easier for individuals and teams to write faster, better code.
Locations: This role is remote and can be performed from anywhere in the United States
Overview:
GitHub Revenue is growing its Data Science team, and we’re seeking experienced professionals to elevate our data and analytics efforts. As a Senior Data Scientist in Revenue, you will leverage your deep expertise and knowledge of data science, machine learning, and business to lead data acquisition efforts, conduct thorough reviews of data analysis and data quality, form hypotheses, and discover insights in the data that support business stakeholders and their decision making. You will provide feedback to the engineering team to identify potential future business opportunities, and track advances in industry and academia to adapt algorithms and techniques to drive innovation and develop new solutions. The ideal candidate will contribute to the impact of our Data Science initiatives and gain deep insights into the latest advancements in AI, machine learning, and data science.

Responsibilities:

Lead data acquisition efforts and ensure data is properly formatted and accurately described, while adhering to GitHub’s privacy policies
Mentor others in data cleaning and data analysis best practices. Identify gaps in current data sets and drive onboarding of new data sets from production systems or third-party vendors.
Resolve data integrity problems in collaboration with relevant teams to promote upstream change and long-term quality
Leverage broad and deep knowledge of modeling techniques, AI/ML tools, programming languages, and query languages to create models, conduct experiments, and analyze results. Evaluate the methodology and performance of team members’ models and recommend improvements. Anticipate the risks of data leakage, bias/variance tradeoff, and methodological limitations.
Drive best practices for model validation, implementation, and application, and partner with teams across the organization to identify and explore new opportunities for driving transformative solutions for our stakeholders and customers.
Develop and articulate data-driven strategies in consideration of business priorities and lead conversations with end customers and/or internal stakeholders to understand, define, and solve business problems.
Track advances in industry and academia, and adapt algorithms and/or techniques to drive innovation and develop new solutions. Serve as a subject matter expert and mentor for team members.
Communicate complex statistics and machine learning topics to diverse audiences (e.g., multidisciplinary teams, customers, technical and non-technical audiences)
Independently write efficient, readable, extensible code that spans multiple features/solutions. Contribute to the code/model review process by providing feedback and suggestions for implementation and improvement.
Drive operational excellence for model deployment (i.e. performance, scalability, monitoring, maintenance, integration into engineering production system, stability)
Produce project plans to define necessary steps required for completion, leading to a measurable improvement in business performance metrics over time. Utilize project results to decide on next steps (e.g., deployment, further iterations, new projects).

Qualifications:

REQUIRED QUALIFICATIONS:

Bachelor’s Degree in Data Science, Mathematics, Physics, Statistics, Economics, Operations Research, Computer Science, or related field AND 5+ years experience in data science (e.g., managing structured and unstructured data, applying statistical techniques) or related field
OR Master’s Degree in Data Science, Mathematics, Physics, Statistics, Economics, Operations Research, Computer Science, or related field AND 3+ years experience in data science (e.g., managing structured and unstructured data, applying statistical techniques) or related field
OR Doctorate in Data Science, Mathematics, Physics, Statistics, Economics, Operations Research, Computer Science, or related field AND 1+ year(s) experience in data science (e.g., managing structured and unstructured data, applying statistical techniques) or related field
OR equivalent experience
3+ years of experience in programming languages such as Python or R, plus experience with query languages such as SQL and KQL and with data manipulation tools like Spark and Airflow

PREFERRED QUALIFICATIONS:

Technical understanding of data science techniques for regression, classification, time-series analysis, experimental design, and causal inference
Able to clearly communicate findings to non-technical stakeholders through storytelling and visualization with tools like Jupyter notebooks or Azure Data Explorer / PowerBI dashboards
Compensation Range: The base salary range for this job is USD $112,800.00 - USD $299,300.00 /Yr.
These pay ranges are intended to cover roles based across the United States. An individual’s base pay depends on various factors, including geographical location and a review of the applicant’s experience, knowledge, skills, and abilities. At GitHub, certain roles are eligible for benefits and additional rewards, including annual bonus and stock. These rewards are allocated based on individual impact in role. In addition, certain roles also have the opportunity to earn sales incentives based on revenue or utilization, depending on the terms of the plan and the employee’s role.
GitHub Leadership Principles:

How To Apply:

In case you would like to apply to this job directly from the source, please click here
