MLOps Engineer — AI/ML Systems & Deployment (TS/SCI Preferred) at Rackner

Dayton, Ohio, United States

Full Time

Start Date

Immediate

Expiry Date

22 Jun, 26

Salary

0.0

Posted On

24 Mar, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

MLOps, AI/ML Systems, Kubernetes, Python, PyTorch, TensorFlow, Kubeflow, Airflow, Argo, MLflow, Docker, CI/CD, Prometheus, Grafana, OpenTelemetry, lakeFS

Industry

IT Services and IT Consulting

Description

Dayton, OH (On-site Preferred) | Remote Eligible (CAC-Ready Candidates)
Mission Environment | AI/ML Infrastructure | National Security Impact

About the Role

At Rackner, we are building the operational backbone that turns AI/ML capability into real-world mission outcomes. We are seeking an MLOps Engineer to own the lifecycle of AI/ML systems, from experimentation to deployment, within a mission-critical, classified environment supporting Air Force and NASIC-aligned programs.

This is not a research role. This is where models become reliable, deployable, auditable systems.

You will operate at the intersection of:
- Machine learning
- Distributed systems
- Cloud-native infrastructure

…and ensure that AI/ML systems work in the environments where failure is not an option.

What You’ll Do

Own the ML Lifecycle (End-to-End)
- Build and operate production-grade ML pipelines
- Orchestrate workflows using Kubeflow, Airflow, or Argo
- Implement model versioning, lineage, and reproducibility standards

Operationalize AI/ML Systems
- Deploy models into mission environments (including constrained or classified systems)
- Transition workflows from Jupyter experimentation → containerized pipelines → production systems
- Enable both batch and real-time inference architectures

Engineer for Reliability, Not Just Performance
- Design systems for reproducibility, auditability, and stability
- Implement monitoring for:
  - model performance & drift
  - system health & latency
- Use tools like Prometheus, Grafana, and OpenTelemetry

Build Cloud-Native ML Infrastructure
- Deploy and manage Kubernetes-based ML workloads
- Containerize pipelines using Docker / OCI standards
- Scale compute for training and inference workloads

Establish Data Discipline
- Enable data versioning and governance (lakeFS or similar)
- Support feature engineering and dataset preparation pipelines
- Apply metadata standards (e.g., STAC) where applicable

Create Repeatable Systems
- Develop runbooks, playbooks, and deployment standards
- Build systems that can be operated by others, not just understood by you

What You Bring

Core Experience
- Experience deploying ML systems into production environments
- Strong background in Python and ML frameworks (PyTorch, TensorFlow, etc.)
- Hands-on experience with:
  - ML pipeline orchestration tools (Kubeflow, Airflow, Argo)
  - Experiment tracking (MLflow, ClearML)

Infrastructure & Systems
- Experience with Kubernetes and containerized workloads
- Familiarity with CI/CD for ML systems
- Understanding of distributed systems and scalable architectures

ML Application Exposure
- Experience working with:
  - LLMs or transformer-based models
  - Computer vision systems (YOLO, Faster R-CNN)
- Focus on deployment and integration, not pure research

Mindset
- Systems thinker who values reliability over novelty
- Comfortable operating in ambiguous, high-stakes environments
- Able to translate experimental work into operational capability

Why This Role Matters (What You Get)

This role is a career accelerator for engineers who want to:
- Move beyond experimentation
  Own systems that actually get deployed and used
- Operate at the systems level
  Work across ML, infrastructure, and mission integration
- Build in high-trust environments
  Where correctness, auditability, and reliability matter
- Develop rare, high-demand expertise
  MLOps in constrained / classified environments is a differentiated skillset
- Shape how AI is operationalized, not just built

Who We Are

Rackner is a software consultancy that builds cloud-native solutions for startups, enterprises, and the public sector. We are an energetic, growing consultancy with a passion for solving big problems across industries.

We enable digital transformation through:
- Distributed systems
- DevSecOps
- AI/ML
- Cloud-native architecture

Our approach is cloud-first, cost-effective, and outcome-driven, focused on delivering real capability, not just code.
Benefits & Perks
- 100% covered certifications & training aligned to your role
- 401(k) with 100% match up to 6%
- Highly competitive PTO
- Comprehensive Medical, Dental, Vision coverage
- Life Insurance + Short & Long-Term Disability
- Home office & equipment plan
- Industry-leading weekly pay schedule

Apply

If you’re an engineer who wants to move from building models → owning systems, we want to talk.

#MLOps #MachineLearning #Kubernetes #AIEngineering #CloudNative #DevSecOps #ArtificialIntelligence #DataEngineering #DefenseTech #NationalSecurity #AIInfrastructure #Hiring #TechCareers

Responsibilities

The MLOps Engineer will own the end-to-end ML lifecycle, building and operating production-grade ML pipelines using tools like Kubeflow or Airflow, and ensuring model versioning and reproducibility standards are met. This role involves operationalizing AI/ML systems by deploying models into mission environments, enabling both batch and real-time inference architectures.