Lead Site Reliability Engineer

at Dexory

Wallingford, England, United Kingdom

Start Date: Immediate
Expiry Date: 28 Dec, 2024
Salary: Not Specified
Posted On: 29 Sep, 2024
Experience: N/A
Skills: Good communication skills
Telecommute: No
Sponsor Visa: No

Description:

BENEFITS

Joining our team and company isn’t just about expertise: it’s about an attitude that embraces uncertainty, a desire to solve challenging problems, and an opportunity to contribute to a technology platform that is genuinely state of the art. As a company, we’re still in the scale-up phase, so you’ll have a significant role in shaping the future of our products, culture, and engineering team. We offer a fun, flexible, and fast-paced environment that’s a great match for people looking for something out of the norm.

Responsibilities:

WHAT DOES THIS ROLE INVOLVE?

As the SRE (Site Reliability Engineering) Lead at Dexory, you will lead efforts to support the safety and reliability of our overall platform. This position involves providing SRE support for a globally distributed, hardware-oriented product that integrates autonomous robot systems and data insights. Your role will be pivotal in developing and maintaining company-wide monitoring, alerting, and management systems. You will work across teams to implement robust incident management strategies and support engineering teams in collecting and publishing critical metrics and alerts. You will also prepare comprehensive documentation and runbooks so that changes and incidents are handled efficiently.

YOUR KEY RESPONSIBILITIES WILL INCLUDE:

  • Monitoring and maintaining our systems for metrics collection, alerting, and incident management.
  • Working across teams to ensure a robust incident management strategy is in place.
  • Preparing documentation and runbooks for handling changes and incidents.
  • Providing support for all engineering teams to collect and publish useful metrics and alerts.
  • Building infrastructure to report on key operational and uptime metrics and integrating these into the company’s OKR process.
  • Preparing and maintaining a robust security posture and working with internal and external stakeholders to explain and validate it.


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - Other

Software Engineering

Graduate

Proficient

1

Wallingford, United Kingdom