Site Reliability Engineer

at WTW

Reigate RH2, England, United Kingdom

Start Date: Immediate
Expiry Date: 20 Jan, 2025
Salary: Not Specified
Posted On: 21 Oct, 2024
Experience: N/A
Skills: Ownership, Puppet, Programming Languages, Powershell, Devops, Python, Scripting Languages, Communication Skills, Reliability Engineering, External Clients, Docker, Kubernetes, Security, Interpersonal Skills
Telecommute: No
Sponsor Visa: No

Description:

SUMMARY:

We are seeking a Site Reliability Engineer to join our SRE team based in Reigate. The ideal candidate will have excellent communication skills, experience working with multiple stakeholders, and a track record in Azure and Observability platforms.
You will be joining Insurance Consulting and Technology (ICT) at an exciting time of transformation as we work on improving the delivery of value for customers and the business. You will work in flexible agile squads on multiple greenfield workstreams in the delivery family, building core foundational functionality that will be used by multiple SaaS product offerings across the business.
You will be working with other Site Reliability and Response teams as well as with the core Applications Teams, whose responsibility is to deliver and manage business-critical services that are used 24×7 by our clients and colleagues around the world. This role is open to flexible and hybrid working arrangements, with presence in the Reigate office up to two days per week.

THE REQUIREMENTS:

The essential skills/experience for this position are:

  • Solid experience in Site Reliability Engineering or a similar role such as DevOps
  • Experience of running 24x7 services in a public cloud, ideally Azure
  • Deep understanding of cloud infrastructure and services, including best practices for monitoring, scaling, and security
  • Experience with observability platforms such as Datadog or similar tools
  • Strong interpersonal skills, with the ability to work effectively with many stakeholders
  • Solid verbal and written communication skills, and the ability to present technical information clearly and concisely
  • Previous experience working with external clients is needed
  • Experience with conducting Post-mortems or Post Incident Reviews
  • Confidence in making decisions and taking ownership of projects
  • Experience with Azure DevOps pipelines (or similar) and scripting languages, such as Python or PowerShell
  • Customer centric, passionate about delivering great services
  • You’re collaborative, enjoy problem solving and mentoring others

Other highly desirable, but not essential skills are:

  • Azure certifications, such as Azure Administrator, Azure Developer, or Azure DevOps Engineer
  • Familiarity with Infrastructure as Code (IaC) tools like Pulumi, Terraform, ARM Templates, or Azure Bicep
  • Knowledge of containerization and orchestration technologies, such as Docker and Kubernetes
  • Familiarity with programming languages such as C# would be welcome
  • Previous experience working with Configuration as Code technologies such as Puppet or Ansible
  • Familiarity with high-volume Web APIs
  • Familiarity with PagerDuty

Responsibilities:

THE ROLE:

  • Collaborate with cross-functional teams to ensure the reliability, availability, and performance of our client-facing services
  • Maintain and configure observability platforms such as Datadog
  • Proactively monitor production and other environments to ensure stability, availability, security and integrity
  • Design and implement automation and processes to improve the efficiency and effectiveness of the teams and other support functions
  • Engage with business stakeholders to gather requirements, address concerns, and provide updates on projects and system status
  • Contribute to the design, build and operational management of the services
  • Lead incident response, troubleshooting, and root cause analysis to mitigate and prevent future issues
  • Work closely with engineering, support and operations teams to upskill and promote knowledge transfer, producing training materials and articles
  • Participate in on-call rotation to provide support and ensure system uptime


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - Other

Software Engineering

Graduate

Proficient

1

Reigate RH2, United Kingdom