Senior Data Engineer at Collective Health

San Francisco, California, USA

Full Time

Start Date

Immediate

Expiry Date

15 Nov, 25

Salary

187,500

Posted On

16 Aug, 25

Experience

8 years or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Good communication skills

Industry

Information Technology/IT

Description

At Collective Health, we’re transforming how employers and their people engage with their health benefits by seamlessly integrating cutting-edge technology, compassionate service, and world-class user experience design.
As a Lead Data Engineer, you will drive the development of robust, scalable, and efficient data solutions, collaborating closely with cross-functional teams. You will provide thought leadership on data architecture, mentor junior engineers, and optimize our data ecosystem for performance and reliability.

Responsibilities

WHAT YOU’LL DO:

Architect Scalable Data Solutions - Design, develop, and optimize large-scale data pipelines using Spark (PySpark, Scala), Databricks, and distributed data processing frameworks.
Advance Data Modeling & Architecture - Lead the design and evolution of data models to support analytical, operational, and machine-learning requirements.
Enhance Data Performance & Reliability - Improve data processing performance, scalability, and reliability, while ensuring data quality and governance.
Drive Cross-Functional Collaboration - Partner with Product, Engineering, Data Science, and Analytics teams to deliver high-impact data solutions that generate actionable business and clinical insights.
Mentor & Provide Technical Leadership - Guide junior and mid-level engineers, conduct code reviews, and establish best practices in data engineering.
Ensure Data Governance & Security - Implement robust security, privacy, and compliance measures for sensitive healthcare data, ensuring adherence to industry regulations.
Influence Data Strategy - Provide input on data infrastructure decisions, emerging technologies, and process improvements.

TO BE SUCCESSFUL IN THIS ROLE, YOU’LL NEED:

8+ years of data engineering experience in fast-paced, data-driven environments.
Expertise in building scalable ETL pipelines with Spark (PySpark or Scala) and SQL.
Deep understanding of data architecture, schema design, and dimensional modeling for analytics and machine learning.
Proficiency in distributed systems such as Spark, Databricks, or Snowflake.
Experience with event-driven architectures and streaming platforms like Kafka or Kinesis.
Excellent communication skills – ability to collaborate cross-functionally and translate complex technical concepts into business impact.
Mentorship experience – guiding engineers and fostering a collaborative, inclusive team culture.
Security-first mindset – familiarity with data privacy, encryption, and compliance in healthcare or other regulated industries is a plus.