Associate Site Reliability Engineer at Vodafone United States

Athens, Attica, Greece

Full Time

Start Date

Immediate

Expiry Date

13 Feb, 26

Salary

0.0

Posted On

15 Nov, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Site Reliability Engineering, Data Infrastructure, Monitoring Strategies, Database Systems, Data Pipelines, ETL Processes, Incident Management, Problem Management, Cloud Solutions, Automation, Database Management, Scripting, CI/CD Tools, Communication Skills, Analytical Skills, Performance Improvement

Industry

Telecommunications

Description

You will be responsible for designing and implementing robust monitoring strategies, ensuring the performance and scalability of data systems, and actively participating in continuous improvement of service reliability. Additionally, you will support incident and problem management, provide feedback to development teams, and contribute to the design and review of data-centric IT solutions.

More specifically, you will:

Support the development of an effective, inclusive SRE team with a focus on data infrastructure, ensuring alignment with SRE principles and practices

Design optimization strategies for data infrastructure, including database systems (e.g., Oracle, MySQL), data pipelines, and ETL processes

Define monitoring requirements and implement alerting mechanisms for data services and infrastructure, using tools such as Prometheus, Grafana, and the ELK stack

Provide feedback to data engineers and data platform teams regarding system malfunctions and potential areas for performance improvement

Participate actively in Incident & Problem Management for both data infrastructure and business-related incidents, focusing on root cause analysis and future-proofing solutions

Manage operational costs (OPEX) for data-related services, covering aspects like capacity planning, database licensing, support, and other related expenses

Collaborate closely with internal and external teams, including on-shore and off-shore units, to drive alignment on data service performance and reliability

Mentor and guide new SRE team members, helping them understand data-related aspects of SRE and best practices for reliability engineering

Lead automation initiatives to streamline data processing, reduce manual interventions, and increase business SLAs for data services

Ensure resilience, redundancy, and high availability for critical data services and workflows (both new and existing)

Oversee the successful transition of data solutions from development to production, ensuring compliance with SRE principles

Review the functional design of new data-centric deliverables, offering recommendations to optimize performance and reliability

Requirements:

Bachelor's degree in Computer Science, Engineering, or a related field

5+ years of experience in IT Operations, focusing on data technologies and platforms (e.g., AWS, GCP, Oracle, MySQL)

Proficient in CI/CD tools (e.g., Jenkins, GitOps), scripting (e.g., Python, Bash), and cloud-based data solutions

Expertise in monitoring tools like Prometheus, Grafana, ELK, and experience with containerization and Kubernetes

Strong skills in automation (e.g., Ansible, Terraform) and database management (Oracle PL/SQL, large-scale databases)

Analytical mindset, strong problem-solving skills, and excellent communication abilities in Greek and English

How To Apply:

If you would like to apply for this job directly from the source, please click here

Responsibilities

You will design and implement robust monitoring strategies while ensuring the performance and scalability of data systems. Additionally, you will support incident and problem management and contribute to the continuous improvement of service reliability.