Senior Data Engineer at Collective Health
San Francisco, California, USA
Full Time


Start Date

Immediate

Expiry Date

15 Nov 2025

Salary

187,500

Posted On

16 Aug 2025

Experience

8 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Good communication skills

Industry

Information Technology/IT

Description

At Collective Health, we’re transforming how employers and their people engage with their health benefits by seamlessly integrating cutting-edge technology, compassionate service, and world-class user experience design.
As a Lead Data Engineer, you will drive the development of robust, scalable, and efficient data solutions, collaborating closely with cross-functional teams. You will provide thought leadership on data architecture, mentor junior engineers, and optimize our data ecosystem for performance and reliability.

Responsibilities

WHAT YOU’LL DO:

  • Architect Scalable Data Solutions - Design, develop, and optimize large-scale data pipelines using Spark (PySpark, Scala), Databricks, and distributed data processing frameworks.
  • Advance Data Modeling & Architecture - Lead the design and evolution of data models to support analytical, operational, and machine-learning requirements.
  • Enhance Data Performance & Reliability - Improve data processing performance, scalability, and reliability, while ensuring data quality and governance.
  • Drive Cross-Functional Collaboration - Partner with Product, Engineering, Data Science, and Analytics teams to deliver high-impact data solutions that generate actionable business and clinical insights.
  • Mentor & Provide Technical Leadership - Guide junior and mid-level engineers, conduct code reviews, and establish best practices in data engineering.
  • Ensure Data Governance & Security - Implement robust security, privacy, and compliance measures for sensitive healthcare data, ensuring adherence to industry regulations.
  • Influence Data Strategy - Provide input on data infrastructure decisions, emerging technologies, and process improvements.

TO BE SUCCESSFUL IN THIS ROLE, YOU’LL NEED:

  • 8+ years of data engineering experience in fast-paced, data-driven environments.
  • Expertise in building scalable ETL pipelines with Spark (PySpark or Scala) and SQL.
  • Deep understanding of data architecture, schema design, and dimensional modeling for analytics and machine learning.
  • Proficiency with distributed data platforms such as Spark, Databricks, or Snowflake.
  • Experience with event-driven architectures and streaming platforms like Kafka or Kinesis.
  • Excellent communication skills – ability to collaborate cross-functionally and translate complex technical concepts into business impact.
  • Mentorship experience – guiding engineers and fostering a collaborative, inclusive team culture.
  • Security-first mindset – familiarity with data privacy, encryption, and compliance in healthcare or other regulated industries is a plus.