Data Site Reliability Engineer at Adroit People Ltd
London, England, United Kingdom - Full Time


Start Date

Immediate

Expiry Date

15 Aug, 25

Salary

0.0

Posted On

16 May, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Python, GCS, Kubernetes, Data Infrastructure

Industry

Information Technology/IT

Description

Data Infrastructure SRE Team
The SRE team is responsible for managing the largest multi-cloud storage abstraction and caching
platform, which supports critical machine learning training workloads that power user-facing
features across the Apple ecosystem. Operating across both first-party and third-party cloud
environments brings complex and unique challenges.
The team is expected to address these challenges through a strong foundation in cloud object
storage, data analysis, automation, collaboration, and advanced expertise in Kubernetes.
Description
The requirement is to have a seed team in the UK led by a strong SRE Lead/hands-on technical manager.

  • A genuine passion for Infrastructure as a Service (IaaS)
  • A commitment to automation and operational efficiency
  • Ownership of projects from design through delivery
  • A solutions-oriented approach, coupled with the ability to gain alignment on technical direction
  • Consistent and timely execution of design implementations aligned with project objectives
  • The ability to provide constructive technical feedback, fostering team-wide growth and continuous improvement

About the Overall Data SRE Team structure:

  • Participates in a rotating on-call schedule, including occasional weekend coverage when necessary
  • Currently headquartered in Cupertino, with active expansion in Bangalore and London to support global operations across time zones
  • Leverages a diverse stack including open-source tools, commercial solutions, and internally developed systems
  • Encourages open dialogue, values strong ideas, and recognizes impactful results

Minimum Qualifications for the team:

  • 5+ years of experience building, operating, and scaling a large application in a private, public, or hybrid cloud environment
  • Experience hiring and leading a team of engineers in their respective time zones
  • Deep expertise in Kubernetes, with hands-on experience using platforms such as Google Kubernetes Engine (GKE) or Amazon Elastic Kubernetes Service (EKS)
  • Proficient in designing, developing, and releasing code in languages such as Python, Go, or Rust
  • Practical experience with object storage technologies, including Amazon S3 or Google Cloud Storage (GCS)
  • Strong background in designing and troubleshooting complex networking issues in both public and private cloud infrastructures
  • Solid understanding of Linux internals, standard networking protocols, and distributed systems architecture

Preferred Qualifications

  • Proven drive to automate manual operations and enhance processes through continuous iteration
  • Strong understanding of best practices for deploying large-scale, distributed applications
  • Hands-on experience managing diverse system environments using configuration management tools or software delivery platforms such as Spinnaker, Helm, or Flux
  • Demonstrated expertise in deploying, supporting, and monitoring both new and existing services, platforms, and application stacks
  • Solid familiarity with container orchestration and management using Kubernetes

Job Type: Temporary
Contract length: 6 months

Experience:

  • Data Infrastructure: 8 years (preferred)
  • Kubernetes: 8 years (preferred)
  • Python, GCS: 8 years (preferred)

Work authorisation:

  • United Kingdom (preferred)
Responsibilities

Please refer to the job description for details.