Software Engineer - Data Platform & GenAI at CVS Health
Washington, DC 20001, USA
Full Time


Start Date

Immediate

Expiry Date

27 Nov, 25

Salary

$83,430

Posted On

27 Aug, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Computer Science, Kafka, Azure, Google Cloud Platform, Fine-Tuning, AWS, Python, Kubernetes, Docker

Industry

Information Technology/IT

Description

At CVS Health, we’re building a world of health around every consumer and surrounding ourselves with dedicated colleagues who are passionate about transforming health care.
As the nation’s leading health solutions company, we reach millions of Americans through our local presence, digital channels and more than 300,000 purpose-driven colleagues – caring for people where, when and how they choose in a way that is uniquely more connected, more convenient and more compassionate. And we do it all with heart, each and every day.

POSITION SUMMARY

We are seeking a talented and motivated Software Engineer to join our HR Tech Data Platform team. In this role, you will design, implement, and deploy production-grade data pipelines and GenAI/LLM applications in a cloud environment. This is an exciting opportunity to work with cutting-edge technologies and contribute to impactful projects in a collaborative setting.

Key Responsibilities:

  • Design, develop, and maintain scalable data pipelines for processing large datasets and events using distributed computing frameworks (e.g., Spark, Pub/Sub), adhering to the highest privacy standards.
  • Design and implement snowflake and star schema models for efficient data retrieval and analytics.
  • Build, deploy, and fine-tune LLM-based GenAI applications leveraging Google Cloud services (e.g., GKE, Cloud Functions, Vector Search).
  • Develop solutions using NoSQL and SQL databases.
  • Collaborate with stakeholders, engineers, and product teams to deliver robust and efficient analytics solutions.
  • Implement best practices for logging, data quality, reliability, and security in cloud-based environments.
  • Monitor, troubleshoot, and optimize pipeline performance and application scalability.
  • Write clean, maintainable, modular, and well-documented code.
  • Participate in code reviews and contribute to the continuous improvement of engineering processes.
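To illustrate the dimensional-modeling responsibility above, here is a minimal star-schema sketch using Python's built-in sqlite3 module. All table names, columns, and data are hypothetical examples, not part of the role's actual systems:

```python
import sqlite3

# Hypothetical star schema: one fact table keyed to two dimension tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_employee (employee_id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, iso_date TEXT);
CREATE TABLE fact_hours (
    employee_id INTEGER REFERENCES dim_employee(employee_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    hours REAL
);
""")
cur.execute("INSERT INTO dim_employee VALUES (1, 'Ada', 'HR'), (2, 'Lin', 'IT')")
cur.execute("INSERT INTO dim_date VALUES (10, '2025-08-27')")
cur.execute("INSERT INTO fact_hours VALUES (1, 10, 8.0), (2, 10, 7.5)")

# Analytics queries join the central fact table out to its dimensions.
cur.execute("""
SELECT e.dept, SUM(f.hours)
FROM fact_hours f JOIN dim_employee e USING (employee_id)
GROUP BY e.dept ORDER BY e.dept
""")
result = cur.fetchall()
print(result)  # [('HR', 8.0), ('IT', 7.5)]
```

The same shape scales up in a warehouse: narrow fact tables of events or measures, joined to wider dimension tables for slicing and aggregation.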

REQUIRED QUALIFICATIONS

  • 2+ years’ hands-on experience developing and deploying production-grade data pipelines and machine learning models in a cloud environment.
  • 2+ years’ experience building applications using Large Language Models (e.g., OpenAI, Gemini), including fine-tuning.
  • 2+ years’ experience with Python and distributed data processing frameworks (e.g., PySpark, Kafka, Flink).
  • 2+ years’ experience working with cloud platforms (preferably Google Cloud Platform, AWS, or Azure).
  • 2+ years’ experience working with CI/CD pipelines and containerization technologies (Docker, Kubernetes).
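The distributed-processing experience asked for above boils down to keyed transformations over event streams. As a toy illustration (plain Python, with made-up events; frameworks like PySpark or Flink apply the same logic partitioned across workers):

```python
from collections import defaultdict

# Hypothetical click events; in production these would arrive via Kafka or Pub/Sub.
events = [
    {"user": "u1", "ts": 1}, {"user": "u2", "ts": 2},
    {"user": "u1", "ts": 3}, {"user": "u1", "ts": 61},
]

def count_per_user(stream):
    """Toy stand-in for a keyed aggregation (e.g. a groupBy-and-count stage)."""
    counts = defaultdict(int)
    for event in stream:
        counts[event["user"]] += 1
    return dict(counts)

print(count_per_user(events))  # {'u1': 3, 'u2': 1}
```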

PREFERRED QUALIFICATIONS

  • Strong problem solving skills and attention to detail.
  • Excellent communication and teamwork abilities.
  • Experience with vector databases, Pub/Sub, and Kubernetes.
  • Exposure to MLOps practices and tools.
  • Experience building agentic workflows.

EDUCATION

  • Bachelor’s degree in computer science or a related field required.

How To Apply:

In case you would like to apply to this job directly from the source, please click here.
