Senior Site Reliability Engineer at Rithum

London, England, United Kingdom

Full Time

Start Date

Immediate

Expiry Date

28 Nov, 25

Salary

0.0

Posted On

28 Aug, 25

Experience

3 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Storage, Security, Cost Planning, Infrastructure, Continuous Improvement, Object Oriented Design, Scripting Languages, Containerization, ECS, Computer Science, Communication Skills, TypeScript, Collaborative Environment, Coding Practices, Python, Bash

Industry

Information Technology/IT

Description

Rithum™ is the world’s most trusted commerce network, accelerating how brands, suppliers, and retailers work together to deliver seamless e-commerce experiences. We provide an unmatched platform for brands and retailers, enabling them to accelerate growth, optimise operations across channels, scale product offerings and enhance margins.
Today, more than 40,000 companies trust Rithum to grow their business across hundreds of channels, representing over $50 billion in annual GMV. Using our commerce, marketing, and delivery solutions, our customers create optimised consumer shopping journeys from beginning to end.

QUALIFICATIONS

Minimum Qualifications

3+ years’ experience working as an SRE, DevOps Engineer or in a related role
Experience with logging and monitoring systems like CloudWatch, Grafana or Prometheus
Experience with AWS foundations, including compute, storage, and security
Good AWS knowledge including application design, migration support, cost planning, capacity allocation, and application resiliency
Expertise in creating multi-region cloud systems with a solid disaster recovery plan
Experience with both high-level and scripting languages like Python, Bash or TypeScript
Experience troubleshooting and debugging complex, distributed applications
IaC experience automating infrastructure with CDK, Terraform or Ansible
Experience with continuous deployment pipelines and container platforms such as EKS or ECS
Strong understanding of software engineering fundamentals, including object-oriented design, modular architecture, and maintainable coding practices.

Preferred Qualifications

You have a bachelor’s degree, or higher, in Computer Science or related field; or equivalent practical experience demonstrating strong software engineering fundamentals.
Experience working in a highly collaborative environment with both platform and product teams.
Excellent collaboration and communication skills; you consistently learn new technologies and help foster an environment of continuous improvement and innovation.
Client satisfaction focus.

Responsibilities

Collaborate with developers, Client Support, and cross-functional teams to build production automation and analysis tools, and to improve reliability and performance.
Design, implement, and maintain robust application monitoring and observability systems for a distributed, highly available, and scalable software stack, leveraging AI/ML to detect anomalies and assist with incidents.
Analyse and resolve problems in legacy environments while designing and implementing modern, scalable solutions from the ground up.
Participate in the rotating on-call schedule, ensuring that user emergencies, platform alerts, and support requests are addressed.
Drive automation and operational efficiency.

OTHER DUTIES

Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities required of the employee for this job. Duties, responsibilities, and activities may change at any time with or without notice.