Staff Data Engineer at Karius
Redwood City, California, USA
Full Time


Start Date

Immediate

Expiry Date

16 Sep 2025

Salary

$248,000

Posted On

17 Jun 2025

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Regulatory Standards, SQL, Stream Processing, Timelines, Collaboration, Data Models, Usability, Deliverables, Algorithms, Continuous Improvement, Architecture, Business Requirements, Version Control, Data Governance, Data Integration, Azure, Cloud Services

Industry

Information Technology/IT

Description

ABOUT KARIUS

Karius is a venture-backed life science startup that is transforming the way pathogens and other microbes are observed throughout the body. By unlocking the information present in microbial cell-free DNA, we’re helping doctors quickly solve their most challenging cases, providing industry partners with access to thousands of biomarkers to accelerate clinical trials, discovering new microbes, and reducing patient suffering worldwide.

POSITION SUMMARY

We are seeking a seasoned Staff Data Engineer to drive data platform initiatives across the data value chain at Karius. We develop and operate AI-driven data analytics pipelines to deliver life-saving results in the highly complex infectious disease landscape. In this role, you will have the opportunity to develop and optimize the data platform so that our users can extract insights from large amounts of commercial, operational, genomic, and clinical data, ultimately delivering actionable findings that serve the business and patients. You will be able to incorporate the next generation of AI technologies and tools (e.g. Generative AI) into the Karius data platform to significantly increase the value delivered to internal stakeholders and our customers.

PHYSICAL REQUIREMENTS

Subject to extended periods of sitting and/or standing, vision to monitor, and moderate noise levels. Work is generally performed in an office environment.

POSITION REQUIREMENTS

We are seeking a data engineer with exceptional systems thinking. Critical to this role is the ability to grasp business needs, identify the complexity and interconnections of data elements, and determine the desired insights to extract. The ideal candidate will excel at translating business requirements into a technical roadmap and developing remarkable solutions to satisfy those needs.

EDUCATIONAL BACKGROUND

  • B.S. degree in Computer Science, Software Engineering, Electrical Engineering, Bioengineering, or related technical fields involving algorithms or coding (e.g., Physics or Mathematics).

PROFESSIONAL EXPERIENCE

  • 10+ years of data engineering / software development experience with at least 5 years of relevant experience in building enterprise-scale data platforms.

TECHNICAL SKILLS

  • Data Platforms and Cloud Services: Hands-on experience with data platforms (e.g. Databricks - strongly preferred, Snowflake) and cloud services (e.g. AWS - strongly preferred, GCP, Azure).
  • Data Integration and Pipelines:
    • ETL/ELT Tooling: Experience with ETL/ELT tools (e.g. Fivetran, Stitch, Airbyte) for integrating internal and third-party data sources.
    • Batch and Stream Processing: Experience in building scalable infrastructure for batch processing (e.g., Spark, Hadoop) and stream processing (e.g., Kafka, Kinesis) for large volumes of data.

  • Developer Toolset: Proficiency in programming languages for data engineering (i.e. Python and SQL) applied in conjunction with SDLC principles and developer practices (e.g. code/data version control, containerization, CI/CD, IaC, automated testing, monitoring/alerting).

  • Data Modeling and Architecture: Strong conceptual understanding of data modeling and practical experience with enterprise data models and data architecture components (e.g. databases, warehouse, lake, lakehouse, catalog).
  • Reporting and Visualization: Experience with reporting and dashboard tools (e.g. Looker, Streamlit, Tableau, PowerBI, Hex, Dash).
  • ML Tooling: Familiarity with data science tooling such as notebooks, standard data processing/visualization libraries (e.g. pyspark, pandas, numpy, scipy, plotly, seaborn, matplotlib, altair), and ML tooling (e.g. MLflow, SageMaker).
  • Generative AI: Working knowledge of generative AI concepts and hands-on experience with frameworks and tooling (e.g. LangChain, LlamaIndex, OpenAPI, RAG, vector databases, agents, Bedrock).
  • Data Governance and Compliance: Demonstrated experience in implementing and maintaining data governance and compliance frameworks, including handling Protected Health Information (PHI) and adhering to regulatory standards.

NON-TECHNICAL SKILLS

  • Ability to work in a fast-paced, dynamic startup environment.
  • Ability to balance quality and speed when building engineering systems.
  • Strong organizational and time management abilities.
  • Excellent communication and collaboration skills.
  • Attention to detail and commitment to delivering high-quality solutions.

PERSONAL QUALIFICATIONS

We want to add a humble, curious, and collaborative member to our team. On the Karius engineering team, we highly value deep domain expertise, a drive for innovation, a desire to collaborate, openness to learning and unlearning, and a passion for solving hard problems with a meaningful impact on the world. A sense of ownership and personal/group accountability allows us to be a productive and high-performing team. If you share our vision, we would like to have you on board.

DISCLAIMER

The above job description is intended to describe the general nature and level of work being performed by individuals assigned to this position. It is not intended to be an exhaustive list of all duties, responsibilities, and skills required. Responsibilities and duties may change or be adjusted to meet the needs of the company, and additional duties may be assigned as necessary. The job description is subject to change at any time at the discretion of Karius.

RESPONSIBILITIES

  • Data Platform Development: Design, develop, test, deploy, and maintain production-grade platforms and tooling that add value across the data lifecycle (ingest, transform, serve) for various use cases such as reporting, data analytics, machine learning, and bioinformatics. Examples of projects you will own and drive:

    • Data Architecture: Ensure centralized, standardized, and secure data access across business domains.
    • Data Pipelines and Reporting: Combine internal (including PHI) and external data sources via ETL/ELT tooling to calculate operational and commercial KPIs for data analysis and reporting use-cases, catalyzing insight generation.
    • ML/AI Data Platform Capabilities: Provide support for computational teams to drive value from clinical and genomic data assets through the ML lifecycle and data science tooling.

    • Generative AI Integrations: Implement generative AI solutions to increase the value and usability of Karius data assets across use-cases such as data pipelining, data discovery, knowledgebase search, and conversational analytics.

  • Project Management: Proactively interface with cross-functional technical and non-technical stakeholders to identify unmet needs, ensure alignment with data initiatives (scope, timelines, deliverables), and communicate results and outcomes.

  • Collaboration: Coordinate with the engineering and IT domains to understand, interface with, and extend production data and software systems using engineering best practices.
  • Data Governance: Inform, implement, and follow data governance best practices and policies in conjunction with the Security and Compliance team to meet regulatory and legal requirements.
  • Continuous Improvement: Foster a growth mindset and self-starter attitude, continually seeking opportunities for process and system improvements with a focus on quality, practicality, and delivered value.
  • Note that job duties and responsibilities may evolve based on company needs and technological advancements.