Senior Data Engineer at CodaMetrix
Boston, Massachusetts, USA
Full Time


Start Date

Immediate

Expiry Date

20 Aug, 2025

Salary

$170,000

Posted On

21 May, 2025

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Data Transformation, Information Systems, Apache Spark, Licensure, MongoDB, REST, Analytical Skills, Scala, Informatics, Programming Languages, Python, PII, Relational Databases, Data Engineering, NoSQL, Kanban, IT, Java, GitHub, Metadata, Working Experience, Scrum, JSON

Industry

Information Technology (IT)

Description

CodaMetrix is revolutionizing Revenue Cycle Management with its AI-powered autonomous coding solution, a multi-specialty AI platform that translates clinical information into accurate sets of medical codes. CodaMetrix’s autonomous coding drives efficiency under fee-for-service and value-based care models and supports improved patient care. We are passionate about getting physicians and healthcare providers away from the keyboard and back to clinical care.

REQUIREMENTS

  • Required
  • BS or MS degree in Computer Science, Informatics, Information Systems, or another related field, or equivalent work experience
  • 5+ years of hands-on experience with the Databricks platform using PySpark/Scala
  • 5+ years of experience with big data technologies on data ingestion using Apache Spark, Apache Kafka, and other distributed computing tools.
  • 5+ years of strong SQL experience across relational and non-relational databases (SQL and NoSQL systems such as MongoDB)
  • Experience with object-oriented or functional programming languages: Scala, Java, and Python are all preferred
  • Experience with both structured and unstructured data formats such as Parquet, CSV, JSON, XML
  • Experience working with Terraform to provision cloud infrastructure.
  • Experience with GitHub for version control, collaborative development, and CI/CD pipelines.
  • Hands-on experience building and managing data pipelines in large-scale, cloud-based environments.
  • Good knowledge of BI tools; Tableau is a huge plus
  • Agile Development (SDLC, Scrum, Kanban)
  • Experience building and optimizing ‘big data’ pipelines, architectures, and data sets; strong analytical skills for working with both structured and unstructured datasets; a track record of building processes supporting data transformation, data structures, metadata, dependency, and workload management; strong project management and interpersonal skills; and experience supporting and working with cross-functional teams in a dynamic environment
  • Preferred
  • Knowledge of HIPAA compliance requirements, as well as other security/compliance practices such as PII handling and SOC 2, is a big plus
  • Experience with Streaming workloads and integrating Spark with Apache Kafka
  • Experience with consuming or authoring REST and/or SOAP web service APIs
  • Familiarity with machine learning concepts or AI applications in the context of data engineering
  • You understand what IaC means and have experience with common tools to implement it
    The estimated hiring range for this role is $115,000 - $170,000 (plus applicable bonus/plus equity). This hiring range could vary by region based upon local market data. Final salary is ultimately determined after taking into account a wide range of factors, including but not limited to: skills and experience, licensure and certifications, education, specific location, and dynamic market data.
RESPONSIBILITIES
  • Create, maintain, populate and optimize the CodaMetrix data platform and analytics architecture.
  • Assemble large, complex data sets that meet functional / non-functional business requirements using the Databricks platform.
  • Develop and manage ETL processes using Spark and Kafka to ingest, clean, and transform data from different sources (databases, APIs, external feeds, etc.) into usable formats for downstream analysis and reporting.
  • Implement data quality checks and ensure that data is accurate, consistent, and free from errors.
  • Implement data governance in compliance with data privacy and security regulations (e.g., GDPR, HIPAA).
  • Identify, design, and implement internal process improvements such as automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Collaborate with software engineers to ensure that data infrastructure is compatible with applications and services that rely on data.
  • Optimize data processing workflows for speed, efficiency, and scalability.
  • Work with stakeholders including the Analytics, Machine Learning, Executive and Product teams to assist with data-related technical issues and support their data infrastructure needs.
  • Ensure that data infrastructure supports real-time and batch data processing.
  • Work with structured, semi-structured, and unstructured data, managing large volumes of data and ensuring its accessibility.
  • Review code, provide constructive feedback, and ensure high standards of engineering excellence within the team.
  • Lead and mentor junior and mid-level data engineers, providing guidance and training on best practices, architecture design, and data pipeline management.
  • Establish best practices for data engineering and promote their adoption across teams.