Data Engineer SME at General Dynamics Information Technology
Arlington, Virginia, United States
Full Time


Start Date

Immediate

Expiry Date

29 Mar, 2026

Salary

$187,000 - $253,000

Posted On

29 Dec, 2025

Experience

5 year(s) or above

Remote Job

No

Telecommute

No

Sponsor Visa

No

Skills

Apache Airflow, ElasticSearch, Python, Kafka, Data Pipeline Orchestration, Cloud Services, Data Processing, Data Quality, Data Governance, Event-Driven Architecture, Microservices Development, Linux, Database Optimization, Version Control, AI Applications, ETL Processes

Industry

IT Services and IT Consulting

Description
Type of Requisition: Regular
Clearance Level Must Currently Possess: Top Secret/SCI
Clearance Level Must Be Able to Obtain: Top Secret/SCI
Public Trust/Other Required: None
Job Family: Data Science and Data Engineering

Job Qualifications:
Skills: Apache Airflow, ElasticSearch, Python
Certifications: None
Experience: 5+ years of related experience
US Citizenship Required: Yes

Job Description:
Iron EagleX (IEX), a wholly owned subsidiary of General Dynamics Information Technology, delivers agile IT and Intelligence solutions. Combining small-team flexibility with global scale, IEX leverages emerging technologies to provide innovative, user-focused solutions that empower organizations and end users to operate smarter, faster, and more securely in dynamic environments.

We are seeking a Data Engineering SME to design, build, and operate data pipelines that ingest, store, and process high-volume, multi-source data, primarily for modern AI/ML processes. You will partner with software, analytics, and product teams to create model-ready datasets (features, embeddings, and prompts), implement scalable storage layers (data lakehouse and vector stores), and enable low-latency retrieval for query, inference, and RAG. Responsibilities include orchestrating streaming and batch pipelines, optimizing compute for GPU/CPU workloads, enforcing data quality and governance, and instrumenting observability. This role is ideal for someone passionate about turning raw data into reliable, performant inputs for AI models and other analytics while right-sizing technologies and resources for scale and speed. This is an onsite position in Crystal City, VA.

Job Duties Include (but are not limited to):
- Design, develop, and implement scalable data pipelines and ETL processes using Apache Airflow, with a focus on data for AI applications.
- Develop messaging solutions using Kafka to support real-time data streaming and event-driven architectures.
- Build and maintain high-performance data retrieval solutions using ElasticSearch/OpenSearch.
- Implement and optimize Python-based data processing solutions.
- Integrate batch and streaming data processing techniques to enhance data availability and accessibility.
- Ensure adherence to security and compliance requirements when working with classified data.
- Work closely with cross-functional teams to define data strategies and develop technical solutions aligned with mission objectives.
- Deploy and manage cloud-based infrastructure to support scalable and resilient data solutions.
- Optimize data storage, retrieval, and processing efficiency.

Required Skills & Experience:
- Experience with Apache Airflow for workflow orchestration.
- Strong programming skills in Python.
- Experience with ElasticSearch/OpenSearch for data indexing and search functionality.
- Understanding of vector databases, embedding models, and vector search for AI applications.
- Expertise in event-driven architecture and microservices development.
- Hands-on experience with cloud services (e.g., MinIO), including data storage and compute resources.
- Strong understanding of data pipeline orchestration and workflow automation.
- Working knowledge of Linux environments and database optimization techniques.
- Strong understanding of version control with Git.
- Due to US Government contract requirements, only US citizens are eligible for this role.

Nice-to-Have Skills:
- Proficiency in Kafka for messaging and real-time data processing.
- Understanding of LLM prompt engineering and associated ETL applications.
- Knowledge of Superset for data visualization and analytics.
- Familiarity with Kubernetes for container orchestration.
- Exposure to Apache Spark for large-scale data processing.

Education & Certifications:
- Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field (or equivalent experience). Advanced degrees are a plus.
Security Clearance:
An active TS/SCI security clearance is REQUIRED, and candidates must have or be willing to obtain a CI Poly. Candidates without this clearance will not be considered.

Equal Opportunity Employer / Individuals with Disabilities / Protected Veterans

The likely salary range for this position is $187,000 - $253,000. This is not, however, a guarantee of compensation or salary. Rather, salary will be set based on experience, geographic location, and possibly contractual requirements, and could fall outside of this range.

Scheduled Weekly Hours: 40
Travel Required: Less than 10%
Telecommuting Options: Onsite
Work Location: USA VA Arlington
Additional Work Locations:

Total Rewards at GDIT:
Our benefits package for all US-based employees includes a variety of medical plan options, some with Health Savings Accounts, dental plan options, a vision plan, and a 401(k) plan offering the ability to contribute both pre- and post-tax dollars up to the IRS annual limits and receive a company match. To encourage work/life balance, GDIT offers employees full flex work weeks where possible and a variety of paid time off plans, including vacation, sick and personal time, holidays, and paid parental, military, bereavement, and jury duty leave. To ensure our employees are able to protect their income, other offerings such as short- and long-term disability benefits, life, accidental death and dismemberment, personal accident, critical illness, and business travel and accident insurance are provided or available. We regularly review our Total Rewards package to ensure our offerings are competitive and reflect what our employees have told us they value most.

We are GDIT. A global technology and professional services company that delivers consulting, technology and mission services to every major agency across the U.S. government, defense and intelligence community.
Our 30,000 experts extract the power of technology to create immediate value and deliver solutions at the edge of innovation. We operate across 50 countries worldwide, offering leading capabilities in digital modernization, AI/ML, Cloud, Cyber, and application development. Together with our clients, we strive to create a safer, smarter world by harnessing the power of deep expertise and advanced technology. Join our Talent Community to stay up to date on our career opportunities and events at gdit.com/tc.

For more information about GDIT's Privacy Policy, click here: https://www.gdit.com/privacy-policy/notices/
Responsibilities
Design, build, and operate data pipelines for high-volume, multi-source data primarily for AI/ML processes. Collaborate with cross-functional teams to create model-ready datasets and implement scalable storage solutions.