Senior Site Reliability Engineer

at Zalando

Berlin, Berlin, Germany

Start Date: Immediate
Expiry Date: 02 Aug, 2024
Salary: Not Specified
Posted On: 04 May, 2024
Experience: N/A
Skills: Good communication skills
Telecommute: No
Sponsor Visa: No

Description:

Responsibilities:

THE ROLE & THE TEAM

Lounge by Zalando is an online shopping club for fashion and lifestyle products. We offer our customers daily, time-limited sale campaigns with discounts of up to 75% off the recommended retail price. We aim to connect with millions of members across 17 European markets, inviting them to engage in an exciting treasure hunt.
As a (Senior) Site Reliability Engineer in the Lounge Site Reliability Engineering team, you will bring 22 engineering teams together around efficient, standardised practices. You will enable them by developing software systems and providing infrastructure and automated solutions, keeping an ever-watchful eye on the platform to ensure it stays reliable and highly available for Lounge by Zalando customers.
You’ll use your knowledge of distributed systems and architecture to improve the reliability and performance of Lounge by Zalando engineering platforms and services. You’ll also build internal services and tools that empower our in-house customers (peer engineering teams) to develop, deploy, and operate their own services at scale. You will have the opportunity to solve challenges in large-scale distributed systems using technologies such as Kubernetes, React, Node.js, TypeScript, Golang, Scala, Java, PostgreSQL, and parts of the AWS stack such as Redis and DynamoDB. Check out our Tech Blog for more information on the technologies we use.

WHAT WE’D LOVE YOU TO DO (AND LOVE DOING)

  • Keep our mission-critical systems up and running, proactively spotting areas where automation can help and prioritising tasks accordingly.
  • Leverage your knowledge of distributed systems to identify and fix network, system, and service-level challenges. Practise sustainable incident response, and drive structural improvement with blameless postmortems.
  • Communicate effectively with our engineering teams about incidents and the improvements that follow from them.
  • Carry out capacity planning, define service level indicators, and analyse system performance.
  • Optionally participate in paid on-call rotations, taking responsibility for addressing incidents and challenges in production.
  • Plan, challenge, and conduct fire drills and production readiness reviews with engineering teams.


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - Application Programming / Maintenance

Software Engineering

Graduate

Proficient

1

Berlin, Germany