Azure Database & Data Lake Engineer at Northramp LLC
Bethesda, Maryland, USA
Full Time


Start Date

Immediate

Expiry Date

15 Nov, 25

Salary

0.0

Posted On

15 Aug, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

SQL, Scripting Languages, Communication Skills, Eligibility

Industry

Information Technology/IT

Description

OPPORTUNITY OVERVIEW

Northramp is seeking an experienced Azure Database & Data Lake Engineer to design, build, and sustain modern data pipelines and storage solutions in a federal government environment. As a consultant, you will help solve complex data‑ingestion challenges, strengthen client relationships, and expand work under existing contracts through insight, quality, and collaboration.
This role offers a hybrid work arrangement, with up to 50% of work performed remotely.
The ideal candidate will have a strong understanding of cloud‑based data architectures, ETL processes, and Azure services including Data Lake Storage, Cosmos DB, Databricks, and vector databases. The engineer will be responsible for ingesting clinical, genomic, and imaging data into an Azure Data Lake; implementing scalable ETL pipelines; and ensuring that data is properly linked and accessible for downstream analytics. Familiarity with health‑care datasets, de‑identification requirements, and federal IT standards (e.g., FISMA and NIST 800‑53) is essential.
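
As a loose illustration of the kind of ingestion work described above, the sketch below shows a minimal Databricks-style PySpark job that lands a raw clinical extract in Azure Data Lake Storage Gen2 as partitioned Parquet. The storage account, container, and column names are hypothetical placeholders, not details of the actual program.

    # Minimal PySpark ingestion sketch; account, container, and column names are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("clinical-ingest").getOrCreate()

    # Hypothetical ADLS Gen2 paths (abfss scheme) for raw and curated zones.
    source_path = "abfss://raw@exampleaccount.dfs.core.windows.net/clinical/encounters/"
    target_path = "abfss://curated@exampleaccount.dfs.core.windows.net/clinical/encounters/"

    # Read raw CSV extracts, stamp an ingestion date for lineage, and write as
    # partitioned Parquet so downstream analytics can prune by ingest date.
    df = (
        spark.read.option("header", "true").csv(source_path)
        .withColumn("ingest_date", F.current_date())
    )
    df.write.mode("append").partitionBy("ingest_date").parquet(target_path)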

REQUIRED QUALIFICATIONS

  • US Citizenship
  • Ability to successfully pass a federal background check and maintain eligibility for continued access authorizations
  • Public Trust / IT-I / IT-II eligibility
  • Bachelor’s degree
  • Minimum of four (4) years of experience building cloud-based search or AI solutions
  • Hands‑on experience with Azure Data Lake Storage, Azure Cosmos DB, Azure Databricks, and data‑integration tools
  • Proficiency with ETL scripting languages (e.g., Python, Spark) and familiarity with SQL
  • Strong analytical and problem‑solving skills; ability to work independently and proactively
  • Excellent oral and written communication skills for both technical and non-technical audiences
  • Flexibility to adapt to changes in work scope to meet organizational goals

RESPONSIBILITIES

  • Design and implement ETL processes to ingest CRISPI datasets and other structured sources into Azure Data Lake Storage.
  • Develop pipelines to load genomic and imaging data, ensuring cross‑referential linkage via patient and encounter identifiers (a join sketch appears after this list).
  • Create and maintain vector‑based index tables for clinical notes using NLP/embedding techniques to support AI‑driven search (an embedding sketch appears after this list).
  • Collaborate with OMOP ETL engineers, search developers, and program managers to align data ingestion with downstream models and interfaces.
  • Optimize data‑storage structures for performance and cost, and implement data governance controls in compliance with NIH and federal policies.
  • Document data architectures, schemas, and transformation logic for future maintenance and knowledge transfer.
  • Attend stakeholder meetings to solicit feedback and adjust data‑ingestion strategies as requirements evolve.
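
For the linkage responsibility above, a rough PySpark sketch of joining a genomic results table to an encounters table on shared identifiers; every path, table, and column name here is an assumption for illustration, not the program's actual schema.

    # Hedged sketch: linking genomic results to clinical encounters by shared identifiers.
    # All paths and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("genomic-linkage").getOrCreate()

    encounters = spark.read.parquet(
        "abfss://curated@exampleaccount.dfs.core.windows.net/clinical/encounters/"
    )
    genomic = spark.read.parquet(
        "abfss://curated@exampleaccount.dfs.core.windows.net/genomic/results/"
    )

    # Join on patient and encounter identifiers so downstream analytics can
    # cross-reference genomic findings with the clinical context of each visit.
    linked = genomic.join(encounters, on=["patient_id", "encounter_id"], how="inner")

    linked.write.mode("overwrite").parquet(
        "abfss://curated@exampleaccount.dfs.core.windows.net/linked/genomic_encounters/"
    )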
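For the vector-index responsibility, a small sketch of computing clinical-note embeddings and running a cosine-similarity lookup; the embedding model, sample notes, and in-memory index are assumptions, since the production vector store (Cosmos DB or another service) would be determined by the program.

    # Hedged sketch: building an embedding index table for clinical notes and
    # querying it by cosine similarity. Model choice and data are hypothetical.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    notes = [
        {"note_id": "n001", "text": "Patient reports intermittent chest pain on exertion."},
        {"note_id": "n002", "text": "Follow-up visit after genomic panel; no new findings."},
    ]

    model = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical model choice
    vectors = model.encode([n["text"] for n in notes], normalize_embeddings=True)

    # Simple in-memory index table pairing note IDs with their embeddings.
    index = [{"note_id": n["note_id"], "embedding": v} for n, v in zip(notes, vectors)]

    # Cosine-similarity search against the index for a free-text query
    # (dot product of normalized vectors equals cosine similarity).
    query_vec = model.encode(["chest pain during exercise"], normalize_embeddings=True)[0]
    scores = [(row["note_id"], float(np.dot(row["embedding"], query_vec))) for row in index]
    print(sorted(scores, key=lambda s: s[1], reverse=True))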