Lead Site Reliability Engineer at Department for Work and Pensions
Manchester, England, United Kingdom
Full Time


Start Date

Immediate

Expiry Date

21 Jun, 25

Salary

72,664

Posted On

21 Mar, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Certificate Management, Scripting, Performance Management, Security, Code, Reliability Engineering, Infrastructure, Logging, Port Management

Industry

Civil Engineering

Description

JOB SUMMARY

Please note this role requires you to pass Security Check clearance. For further information, please see ‘Selection process details’.
Are you someone who has excellent stakeholder management and problem-solving skills?
Do you enjoy finding the root cause of a problem and building automated solutions to ensure it doesn’t happen again?
If so, we’d love to hear from you.
As a Lead Site Reliability Engineer (SRE), you will drive the adoption of SRE best practices across the teams you work with.
You will collaborate with application development and operations engineers in the practice of Site Reliability Engineering.
You will be accountable for the reliability of the applications you support.
Working with Delivery Managers, Product Managers, and other SREs as part of a multidisciplinary team, you will actively manage the work backlog and develop reliability improvements. You will also lead initiatives to automate low-value tasks while balancing project delivery demands.
You will provide technical leadership to wider operational teams, along with technical oversight of the products and services they support.
You will also help develop and support the engineers in your team, introducing new technologies or practices to improve team knowledge, skills, and capability.

JOB DESCRIPTION

As a Lead Site Reliability Engineer, you will play a pivotal role in ensuring the reliability and performance of our applications and infrastructure. You will lead by example, providing technical direction and supporting the development and progression of SREs within your team.

Key Responsibilities:

  • Lead by example, provide technical direction, and support the development and progression of SREs within your team.
  • Work across multiple teams as an engineering specialist, implementing organizational engineering standards.
  • Support teams in building reusable, repeatable, observable, and reliable infrastructure.
  • Design and develop techniques for improving application reliability, including run books, knowledge transfer, and ongoing SRE strategy within the wider engineering community.
  • Collaborate with teams to investigate and resolve major or complex incidents, ensuring the right skills and expertise are available to respond effectively.
  • Assess the impact of change requests in consultation with stakeholders, providing technical expertise and advice.

There is a contractual requirement to join an on-call rota providing night cover (18:00-08:00), with occasional weekend day shifts (08:00-18:00 on Saturday or Sunday). Cover is shared across the team and normally equates to one shift per week.

NATIONALITY REQUIREMENTS

This job is broadly open to the following groups:

  • UK nationals
  • nationals of the Republic of Ireland
  • nationals of Commonwealth countries who have the right to work in the UK
  • nationals of the EU, Switzerland, Norway, Iceland or Liechtenstein and family members of those nationalities with settled or pre-settled status under the European Union Settlement Scheme (EUSS)
  • nationals of the EU, Switzerland, Norway, Iceland or Liechtenstein and family members of those nationalities who have made a valid application for settled or pre-settled status under the European Union Settlement Scheme (EUSS)
  • individuals with limited leave to remain or indefinite leave to remain who were eligible to apply for EUSS on or before 31 December 2020
  • Turkish nationals, and certain family members of Turkish nationals, who have accrued the right to work in the Civil Service

