Senior Cloud Infrastructure / DevOps Engineer (m/f/d) at Agile Robots SE

81369 München, Germany

Full Time

Start Date

Immediate

Expiry Date

08 Nov, 25

Salary

0.0

Posted On

09 Aug, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Pipeline Design, Docker, Service Providers, Containerization, AWS, Information Systems, Orchestration, DevOps, Code, Artifactory, Backup Solutions, Firewalls, Oracle, Python, Computer Science, Automation

Industry

Information Technology/IT

Description

ESSENTIAL SKILLS

Bachelor’s or Master’s degree in Computer Science, Information Systems, or related field (or equivalent professional experience)
5+ years in DevOps, SRE, or infrastructure engineering, including substantial work with hybrid cloud/on-premise environments
Proven success in designing, deploying, and maintaining large-scale Kubernetes clusters
Experience managing mission-critical services in production environments
Proficiency in scripting and automation languages (Python, Bash) for infrastructure automation and tooling development
Strong proficiency in containerization (Docker) and orchestration (Kubernetes)
Expertise in CI/CD pipeline design and automation (GitLab CI/CD, Jenkins, or similar)
Hands-on experience with artifact repositories (Artifactory including self-hosted and proxy caching), monitoring tools (Prometheus, Grafana, ELK), and backup solutions (Borg, Restic, Veeam)
Solid Linux administration skills, including networking, firewalls, and system security
Familiarity with cloud service providers (AWS, Oracle, GCP) and infrastructure-as-code tools (Terraform, Kustomize, Helm)
Knowledge of storage architectures, RAID configurations, data durability, and database management in distributed environments

ABOUT US

Agile Robots SE is an international high-tech company based in Munich, Germany with a production site in Kaufbeuren and more than 2300 employees worldwide. Our mission is to bridge the gap between artificial intelligence and robotics by developing systems that combine state-of-the-art force-moment-sensing and world-leading image-processing technology. This unique combination of technologies allows us to provide user-friendly and affordable robotic solutions that enable intelligent precision assembly.
This is made possible by our employees, who bring their best every day with creativity and enthusiasm. Become part of this team and shape the future of robotics with us!
We are proud of our diversity and welcome your application regardless of gender and sexual identity, nationality, ethnicity, religion, age, or disability.

Responsibilities

ABOUT THE ROLE

As a Senior Cloud Infrastructure / DevOps Engineer, you will lead the design, implementation, and management of our hybrid cloud and on-premise infrastructure that powers our data collection systems with robotics components. You’ll architect and maintain scalable infrastructure solutions, implement robust CI/CD pipelines, and ensure high availability across diverse environments. Your work will directly enable reliable data collection and processing workflows by providing resilient, automated infrastructure that spans multiple cloud providers and regions. This role is ideal for someone who thrives at the intersection of infrastructure automation, reliability engineering, and distributed systems.

YOUR RESPONSIBILITIES

Design, implement and scale robust on-premise and cloud infrastructure to support high-volume workloads
Deploy and manage container orchestration platforms (Kubernetes, cloud and on-prem), ensuring high availability, scalability, and performance
Oversee procurement, deployment, and lifecycle management of hardware resources in coordination with vendors
Deploy, maintain, and optimize key infrastructure services
Build and maintain CI/CD pipelines for development and operational workflows, including runners and caching strategies
Develop and enforce backup and disaster recovery strategies for critical systems (e.g., Confluence, project repositories, station-collected data, and field-collected data)
Implement centralized logging, monitoring, and alerting using stacks such as Prometheus, ELK, or Grafana
Proactively manage system health, performance, and capacity planning
Establish failover, redundancy, and disaster recovery procedures across environments
Set up and manage infrastructure in region-specific cloud providers (including China/Malaysia), ensuring compliance with local regulations and data transfer practices
Orchestrate data ingestion, migration, and synchronization between edge, on-premise, and central cloud systems
Work closely with software, data, and ML teams to ensure infrastructure supports evolving product and analytics needs
Lead initiatives to improve system reliability, scalability, and operational efficiency