Senior Site Reliability Engineer at Woolworths Group

Surry Hills, New South Wales, Australia

Full Time

Start Date

Immediate

Expiry Date

20 Nov, 25

Salary

0.0

Posted On

21 Aug, 25

Experience

5 years or more

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

User Experience, SQL Server, Reliability, Microsoft Azure, Elasticsearch, Integration, Working Environment, Google Cloud, MongoDB, PowerShell, Platforms, Infrastructure, Automation Tools, Performance Analysis, Components, NoSQL, Operations, Engineers, Bash, Digital Services

Industry

Information Technology/IT

Description

The opportunity to collaborate with some of the brightest and best minds in Australia
Be part of a great culture with a team that loves to have fun
Surry Hills based, with a hybrid working model along with an “on call” component

WHAT YOU’LL EXPERIENCE

Our Team Members are at the heart of everything we do, and we’re always looking for ways to support your career journey and reward great work:

A flexible hybrid working environment
Team discounts across the Woolworths Group brands you know and love, plus a robust rewards program that celebrates and incentivises purpose-driven work.
A global business with endless career possibilities around every corner and across every discipline – with valuable exposure to a vast and exciting business network.
A range of programs to help you prioritise and manage your well-being, including 24/7 access to the Sonder app.
A progressive and competitive leave policy that gives you more space for what matters to you.

Responsibilities

As a Senior Site Reliability Engineer (SRE) you will be responsible for ensuring the reliability, availability, and performance of our critical digital services and systems. You will bridge the gap between software engineering and operations, employing a blend of coding skills and operational expertise to design and maintain scalable and resilient applications and infrastructure. You will establish and uphold service level objectives (SLOs), automate processes, conduct performance analysis, and respond swiftly to incidents, aiming to minimise downtime and enhance the overall user experience. Your focus on reliability and efficiency will ensure a stable and optimised technology environment, aligning with the organisation’s business objectives.

Work with product teams to design, implement and enhance highly available and scalable systems, ensuring reliability and performance of applications.
Collaborate with cross-functional teams to define and establish service level objectives (SLOs) and service level agreements (SLAs) for critical applications and components.
Maintain observability and monitoring tools, alerts, and dashboards to provide visibility into system health and performance, proactively identifying and resolving any performance bottlenecks or availability issues.
Alongside the Incident Commander, play a lead remediation role on major incident bridges.
Play a key role in post-incident reviews to identify root causes and implement preventive measures to avoid future incidents.
Automate repetitive tasks and processes to improve efficiency and reduce manual intervention.