Site Reliability Engineer (m/f/d) at Lager und Transport

Vienna, Austria

Full Time

Start Date

Immediate

Expiry Date

09 Sep, 26

Salary

From EUR 40,000 gross per year (full-time basis)

Posted On

11 Jun, 26

Experience

5 years or more

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

GCP, Terraform, Ansible, SaltStack, Kubernetes, Docker, Java, Python, Go, Bash, Prometheus, Grafana, CI/CD, Linux, Incident Management, Networking

Industry

Retail

Description

Company Description

We, the IT of REWE Group International, shape modern IT solutions for retail with over 700 dedicated colleagues, both nationally and internationally. Together, we develop IT products and services that enrich the everyday lives of our customers and partners.

We are looking for a skilled Senior SRE (m/f/d) to join our team. The ideal candidate will ensure the reliability, availability, and performance of our critical infrastructure and services. This role involves collaborating with cross-functional teams to build and maintain scalable and efficient systems, implement automation, and drive improvements in system reliability.

Job Description

- Design, implement, and maintain highly reliable and scalable infrastructure and services using cloud platforms (e.g. GCP).
- Automate repetitive tasks using tools such as Terraform, Ansible, and SaltStack.
- Collaborate with development and operations teams to ensure smooth deployment and operation of services using CI/CD pipelines (e.g. GitLab).
- Establish and monitor Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to ensure system reliability, using monitoring tools such as Prometheus and Grafana.
- Perform capacity planning and optimization to handle growth and scale.
- Lead incident management and post-mortem processes to ensure continuous improvement.
- Conduct root cause analysis of system failures.

Qualifications

- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
- Strong understanding of cloud infrastructure (specifically GCP) and containerization technologies (Docker, Kubernetes).
- Proficiency in scripting and programming languages (Java, Python, Go, Bash).
- Experience with monitoring and observability tools (Prometheus, Grafana, ELK Stack, Fluentd, Splunk).
- Solid knowledge of networking (DNS, TCP/IP, HTTP), security best practices (SSL/TLS, firewalls, IAM), and system administration (Linux, Windows).
- Experience with incident management tools (Jira, ServiceNow), version control systems (Git, SVN), and CI/CD.

Additional Information

- Long-term, interesting, and varied work for a reliable employer in a supportive team
- A family-friendly company culture with flexible working hours and remote working options available according to your individual needs
- Staff shopping and travel discounts
- Numerous training and further development opportunities within the Group (5% of working time for self-organized training and education)
- Easy access by public transport
- A wide variety of tasks combined with the flexibility you need to plan your personal life
- A lunch allowance
- An industry-standard, attractive, performance-based annual gross salary starting at EUR 40,000 (on a full-time basis), with the possibility of higher pay according to experience and qualifications

No matter where you are in your career, we have a path for you, whether you're looking for your first job, advancement in your field, or a career change. We're proud to employ great people who are passionate about their jobs, and they are all different. No matter who you are, what you need, and where you're going, REWE Group can be a part of it.

Apply now! Please upload your resume to give us insight into your work experience, anonymously if you like.

We promote a diverse and inclusive work environment. We therefore welcome applications from people of any gender, age, cultural or social background, or sexual identity, as well as from people with disabilities. In addition, we would like to increase the proportion of women in technical professions and are particularly pleased to receive applications from women for this position.

Job level (career site): Position requiring professional experience

Responsibilities

Design and maintain scalable cloud infrastructure on GCP while automating repetitive tasks using tools like Terraform and Ansible. Establish system reliability through SLIs/SLOs and lead incident management and post-mortem processes.