(Senior) Site Reliability Engineer (SRE) - Sovereign Cloud (f/m/d) at SAP
10557 Berlin, Moabit, Germany
Full Time


Start Date

Immediate

Expiry Date

25 Aug 2025

Salary

0.0

Posted On

26 May 2025

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Good communication skills

Industry

Information Technology/IT

Description

WE HELP THE WORLD RUN BETTER

At SAP, we enable you to bring out your best. Our company culture is focused on collaboration and a shared passion to help the world run better. How? We focus every day on building the foundation for tomorrow and creating a workplace that embraces differences, values flexibility, and is aligned to our purpose-driven and future-focused work. We offer a highly collaborative, caring team environment with a strong focus on learning and development, recognition for your individual contributions, and a variety of benefit options for you to choose from.

Responsibilities

We are seeking a highly skilled (Senior) Site Reliability Engineer (SRE) to join our Sovereign Cloud Automation & Tooling (SAT) Team in Berlin. This role is pivotal in ensuring the stability, performance, and scalability of our cloud infrastructure. The ideal candidate will have extensive experience in observability tools, automation, and cloud technologies across AWS and Azure environments.

Your tasks will include:

  • Design, implement, and manage robust observability solutions using tools such as Dynatrace, Grafana, Prometheus, and Site24x7.
  • Develop, maintain, and improve monitoring, alerting, and incident response processes to ensure system reliability and minimize downtime.
  • Collaborate with development teams to enhance application performance, scalability, and reliability through proactive monitoring insights.
  • Manage and optimize cloud infrastructure across AWS and Azure using Infrastructure as Code (IaC) tools like Terraform and Bicep.
  • Develop and maintain CI/CD pipelines to automate software deployments and infrastructure updates.
  • Write and maintain automation scripts in Bash, Shell, and Python to support operational efficiency.
  • Identify performance bottlenecks, implement improvements, and support post-incident reviews to drive continuous improvement.
  • Work closely with security teams to ensure compliance, security best practices, and data protection in cloud environments.
  • Maintain comprehensive documentation for observability configurations, automation processes, and cloud infrastructure standards.