Data Engineer at SAIC
Full Time


Start Date

Immediate

Expiry Date

23 Jan, 26

Salary

Not specified

Posted On

26 Oct, 25

Experience

10 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Data Engineering, AI/ML, Databricks, Spark, Delta Lake, MLflow, Python, SQL, ETL Orchestration, CI/CD, AWS, Data Validation, Feature Engineering, Governance, Security, Collaboration

Industry

Defense and Space Manufacturing

Description

We are seeking a Data Engineer with hands-on AI/ML project experience in Databricks to join the Databricks Solution Team within the IRS Advanced Analytics Program (AAP). This role is responsible for building, optimizing, and maintaining data pipelines and feature engineering workflows that directly support model training, deployment, and monitoring for IRS mission teams. As part of the AAP common services mission, the Data Engineer will deliver scalable, reusable, and compliant data engineering solutions using Databricks and AWS. The ideal candidate brings a strong background in data engineering for AI/ML use cases, ensuring data readiness and accessibility across the entire AI/ML lifecycle.

Key Responsibilities

- Design, build, and maintain data pipelines in Databricks (Spark, Delta Lake, MLflow) tailored for AI/ML and GenAI use cases.
- Implement data ingestion, transformation, and feature engineering workflows that feed model training and inference processes.
- Collaborate with mission data scientists to ensure datasets are optimized for model development and experimentation.
- Integrate pipelines into CI/CD workflows for automated, repeatable, and compliant model operations.
- Optimize data workflows for performance, scalability, and cost efficiency across multi-tenant workloads.
- Apply governance and security controls (Unity Catalog, IAM, audit logging) to protect sensitive IRS data.
- Support data validation, schema enforcement, and quality checks to ensure reliable model outcomes.
- Partner with the Product Manager and Chief Architect to align data engineering capabilities with roadmap priorities and platform evolution.

Required Qualifications

- Bachelor's degree in Computer Science, Data Engineering, or a related field and 14 or more years of experience; or a Master's degree and 12 or more years of experience.
- Must be a U.S. Citizen with the ability to obtain and maintain a Public Trust security clearance.
- 5+ years of data engineering experience on AI/ML-focused projects.
- Hands-on expertise with Databricks, Spark, Delta Lake, and MLflow in the context of AI/ML pipelines.
- Proficiency in Python, SQL, and data transformation frameworks.
- Experience delivering feature engineering and data preparation for model development and operationalization.
- Familiarity with ETL orchestration tools (Airflow, Databricks Workflows, or similar).
- Knowledge of CI/CD integration for data pipelines (Terraform, Git-based workflows).
- Awareness of AI/ML lifecycle data needs (training, validation, inference, retraining).

Desired Skills

- Certifications: Databricks Certified Data Engineer Associate/Professional.
- Experience in federal or regulated data environments (FedRAMP, NIST 800-53).
- Familiarity with AWS data services (S3, Glue, Lambda, Redshift) integrated with Databricks.
- Exposure to Trustworthy AI practices (bias monitoring, lineage, explainability).
- Strong problem-solving and collaboration skills with architects, MLOps engineers, and mission data scientists.
Responsibilities

The Data Engineer will design, build, and maintain data pipelines in Databricks specifically for AI/ML use cases. This role involves collaborating with data scientists and integrating pipelines into CI/CD workflows to ensure efficient model operations.