MLOps Engineer (Kubernetes, Cloud, ML Workflows) at FitNext Co
London, England, United Kingdom
Full Time


Start Date

Immediate

Expiry Date

27 Nov, 25

Salary

0.0

Posted On

27 Aug, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Code, Python, Bedrock, Kubernetes, AWS, Infrastructure, Docker, Incident Response

Industry

Information Technology/IT

Description

ABOUT THE COMPANY

One of the world’s fastest-growing AI companies is seeking an MLOps Engineer to help scale and optimize its machine learning infrastructure. The company collaborates with leading AI labs to advance frontier model capabilities in reasoning, coding, multimodality, multilinguality, and STEM knowledge, while also delivering mission-critical AI systems for global enterprises.
Headquartered in the United States, the leadership team includes experts from Meta, Google, Microsoft, Apple, Amazon, and top universities such as Stanford, Caltech, and MIT. Recognized as one of the most promising B2B companies shaping the future of AI, the organization operates at the forefront of innovation.

REQUIRED SKILLS

  • 7+ years of experience as a DevOps Engineer in large-scale, cloud-based environments (AWS preferred).
  • 2+ years of hands-on experience in MLOps environments.
  • Strong expertise with Kubernetes (including GPU clusters) and Docker.
  • Proficiency in Python, Go, or similar languages, focused on automation/tooling.
  • Experience with CI/CD tools such as ArgoCD or GitHub Actions for ML workflows.
  • Knowledge of Infrastructure-as-Code, particularly Terraform.
  • Familiarity with observability tools such as Prometheus and Grafana.
  • Background in incident response, including on-call rotations.

BONUS SKILLS

  • Practical experience with AWS ML services such as SageMaker or Bedrock.
  • Knowledge of emerging MLOps frameworks and tools.
  • Experience enabling Data Scientists with scalable ML experimentation infrastructure.

Responsibilities

The MLOps Engineer will be responsible for implementing best practices, managing high-scale ML workflows, and evolving a robust MLOps platform that supports the full ML lifecycle. The position involves close collaboration with ML engineers and product teams to deliver scalable, reliable, and secure infrastructure for machine learning systems.

Key responsibilities include:

  • Managing GPU-enabled Kubernetes clusters for distributed ML workloads.
  • Building automation and tooling in Python or Go.
  • Developing and maintaining CI/CD pipelines tailored for ML workflows.
  • Implementing observability solutions to ensure performance and reliability.
  • Driving innovation in infrastructure for large-scale, production-grade ML systems.

This is an on-site role in London (Soho, near Tottenham Court Road). Only candidates available to work on-site will be considered.