DevOps Engineer - Site Reliability Engineering team at SAP
Montréal, QC, Canada
Full Time


Start Date

Immediate

Expiry Date

08 Nov, 25

Salary

74600.0

Posted On

09 Aug, 25

Experience

3 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Good communication skills

Industry

Information Technology/IT

Description

WE HELP THE WORLD RUN BETTER

At SAP, we keep it simple: you bring your best to us, and we’ll bring out the best in you. We’re builders touching over 20 industries and 80% of global commerce, and we need your unique talents to help shape what’s next. The work is challenging – but it matters. You’ll find a place where you can be yourself, prioritize your wellbeing, and truly belong. What’s in it for you? Constant learning, skill growth, great benefits, and a team that wants you to grow and succeed.

Responsibilities

We are looking for an engineer to join an already established SRE team for the SAP Business Technology Platform.
As a Site Reliability Engineer, you will operate and support business-critical cloud services. As part of your daily job, you will proactively monitor service behavior and identify areas for improvement. You will also help develop tools for monitoring and troubleshooting cloud services built on the latest open-source and SAP technologies, following SRE principles.

  • Act as the technical expert during live-site incidents (downtime of supported services in scope), investigating and resolving them at a deep technical level.
  • Drive root cause analysis and follow-up improvements to prevent issues from recurring.
  • Perform in-depth troubleshooting and log analysis to identify and solve complex issues in accordance with internal and external SLAs.
  • Build software-based solutions to address improvements in service reliability and stability.
  • Enhance infrastructure and platform monitoring by gathering system metrics (the four golden signals: latency, traffic, errors, and saturation) and implementing tools for recovery.
  • Integrate and collaborate closely with development teams, and work with them on postmortem outputs and product improvements.
  • Learn new technologies and keep up to date with the latest development increments.
  • Create and maintain technical documentation.
  • Define, advocate, and apply SRE best practices.
  • Participate in the on-call rotation (follow-the-sun approach) to react to major incidents. On-call duty has a special compensation package.