Software Engineer II, Copilot Data
at GITHUB INC
Remote, Oregon, USA
| Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
|---|---|---|---|---|---|---|---|
| Immediate | 24 Nov, 2024 | USD 198,900 Annual | 29 Aug, 2024 | 1 year(s) or above | Azure, Ruby, Go, Utilization, Communication Skills, Base Pay, Data Analysis, Revenue, Business Requirements, Collaboration, Spark, Software Coding, Addition, Languages, Airflow, Python, Data Governance, Computer Engineering, SQL, Data Models, Physics, Computer Science, AWS | No | No |
Required Visa Status:
- Citizen
- Green Card (GC)
- US Citizen
- Student Visa
- H1B
- CPT
- OPT
- H4 (Spouse of H1B)
Employment Type:
- Full Time
- Part Time
- Permanent
- Independent (1099)
- Contract (W2)
- C2H Independent
- C2H W2
- Contract (Corp-to-Corp)
- Contract to Hire (Corp-to-Corp)
Description:
About GitHub: As the global home for all developers, GitHub is the complete AI-powered developer platform to build, scale, and deliver secure software. Over 100 million people, including developers from 90 of the Fortune 100 companies, use GitHub to build amazing things together across 330+ million repositories. With all the collaborative features of GitHub, it has never been easier for individuals and teams to write faster, better code.
Locations: This role is remote within the United States.
Overview:
The Copilot Metrics team is a unique group composed of software engineers, data engineers, and data analysts. We build a service to support our customers’ metrics needs and help them understand product usage patterns. The analysts will assist with data modeling, support product managers and leadership in understanding how customers are using our product, and identify pain points. This valuable analysis will also be used to help the product team develop better features.
As a Data Software Engineer on the Copilot Metrics team, you will be responsible for designing, developing, and maintaining efficient and reliable data pipelines. You will work closely with stakeholders across the company to gather business requirements, build data models, and ensure data quality and accessibility. Your expertise in Python, SQL, Airflow, and Spark will be crucial in optimizing our data infrastructure and enabling data-driven decision-making.
Responsibilities:
- Data Pipeline Development: Design, build, and maintain scalable data pipelines using Python, SQL, Airflow, and Spark.
- Business Requirements Gathering: Collaborate with stakeholders to understand and translate business requirements into technical specifications.
- Data Modeling: Develop and implement data models that support analytics and reporting needs, ensuring alignment with business goals.
- Data Quality and Governance: Ensure data accuracy, consistency, and reliability by implementing robust data validation and quality checks.
- Stakeholder Collaboration: Work with cross-functional teams, including data analysts, data scientists, and business leaders, to deliver high-quality data solutions.
- Performance Optimization: Continuously monitor and optimize data pipelines for performance, scalability, and cost-efficiency.
- Monitoring and Observability: Build and implement monitoring and observability metrics to ensure data quality and detect anomalies in data pipelines.
- Documentation and Communication: Maintain clear and comprehensive documentation of data processes and effectively communicate technical concepts to non-technical stakeholders.
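The data quality responsibility above often begins as simple row-level validation before a batch enters a pipeline. A minimal sketch in Python (the field names and functions below are illustrative assumptions, not GitHub's actual data model or code):

```python
from datetime import datetime

# Hypothetical schema for a usage-metrics record; these field names are
# illustrative assumptions, not GitHub's actual data model.
REQUIRED_FIELDS = {"user_id", "event_type", "timestamp"}

def validate_record(record: dict) -> list[str]:
    """Return a list of data-quality problems found in one record."""
    problems = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    ts = record.get("timestamp")
    if ts is not None:
        try:
            datetime.fromisoformat(ts)
        except (TypeError, ValueError):
            problems.append(f"bad timestamp: {ts!r}")
    return problems

def partition_batch(records):
    """Split a batch into valid rows and (row, problems) rejects."""
    valid, rejected = [], []
    for rec in records:
        problems = validate_record(rec)
        if problems:
            rejected.append((rec, problems))
        else:
            valid.append(rec)
    return valid, rejected

batch = [
    {"user_id": 1, "event_type": "completion", "timestamp": "2024-08-29T12:00:00"},
    {"user_id": 2, "event_type": "completion", "timestamp": "not-a-date"},
    {"user_id": 3},
]
valid, rejected = partition_batch(batch)
print(len(valid), len(rejected))  # 1 valid row, 2 rejected
```

In a real pipeline, checks like these would typically run inside an Airflow task or a Spark job, with rejected rows routed to a quarantine table and surfaced through the monitoring metrics the posting describes.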
Qualifications:
REQUIRED/MINIMUM QUALIFICATIONS:
- 2+ years of experience in Software Engineering, Computer Science, or a related technical discipline, with proven experience maintaining production software in languages including, but not limited to, C, C++, C#, Java, JavaScript, Go, Ruby, Rust, or Python
  - OR Associate's Degree in Computer Science, Electrical Engineering, Electronics Engineering, Math, Physics, Computer Engineering, or a related field AND 1+ year(s) of experience
  - OR Bachelor's Degree in Computer Science or a related field
  - OR equivalent experience
- 2+ years of experience in data engineering or analytics engineering roles, with strong proficiency in Python, SQL, Airflow, and Spark, and extensive expertise in building and maintaining robust data pipelines and ETL processes
- Experience gathering business requirements and translating them into effective data models that support comprehensive data analysis and reporting
PREFERRED QUALIFICATIONS:
- Familiarity with Go and Ruby
- Experience with cloud platforms such as AWS, GCP, or Azure
- Familiarity with data warehousing solutions (e.g., Snowflake, Redshift, BigQuery)
- Knowledge of data governance and data security best practices
- Communication: Excellent verbal and written communication skills, with the ability to convey technical information to non-technical audiences
- Collaboration: Proven ability to work effectively in a collaborative, cross-functional environment
Compensation Range: The base salary range for this role is USD $75,000.00 to USD $198,900.00 per year.
These pay ranges are intended to cover roles based across the United States. An individual's base pay depends on various factors, including geographical location and a review of the applicant's experience, knowledge, skills, and abilities. At GitHub, certain roles are eligible for benefits and additional rewards, including an annual bonus and stock; these rewards are allocated based on individual impact in the role. In addition, certain roles have the opportunity to earn sales incentives based on revenue or utilization, depending on the terms of the plan and the employee's role.
GitHub Leadership Principles:
REQUIREMENT SUMMARY
Min: 1.0 | Max: 2.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
Graduate
Computer Science, Electrical Engineering, Engineering, Math
Proficient
1
Remote, USA