Site Reliability Engineer at Chevron

Kavala, Macedonia and Thrace, Greece

Full Time

Start Date

Immediate

Expiry Date

06 Jan, 26

Salary

0.0

Posted On

08 Oct, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Site Reliability Engineering, Full Stack Infrastructure, Network Administration, Security, Identity Management, Access Management, Cybersecurity, Cloud Architecture, Windows OS, Linux OS, Performance Monitoring, Troubleshooting, Change Management, API Integration, Automation, Agile Methodologies

Industry

Chemical Manufacturing

Description

Improves and protects the software and systems behind all of the organization's IT services, including management of scalability, availability, latency, performance, security, and capacity, and delivery of software faster, better, and cheaper.
Use broad full-stack knowledge and experience for proactive incident prevention by baselining against expected service performance, improving processes by applying lessons learned, and using data analytics to proactively identify problem areas and operational gaps.
Effectively and efficiently lead product line agile teams in troubleshooting and resolving system problems, including analyzing application and critical system performance.
Serve as a technical resource during critical and major incidents supporting multiple technologies.
Find opportunities to avoid future issues by improving logging and creating automated resolutions based on triggers.
Develop automation scripts for repetitive tasks to eliminate toil / operations support activities.
Oversee production environments by monitoring availability and maintaining a holistic view of system health.
Measure and optimize system performance, continuously seeking innovation and improvement to meet customer needs.
Align, collaborate, and build relationships with peers, company leadership, subject matter experts, and users to improve knowledge of end-to-end DevOps / Site Reliability Engineering best practices.
Collaborate with SRE Community of Practice thought leaders to define SRE capabilities and best practices and integrate the capability framework throughout the organization.

Hands-on experience as an IT professional with knowledge of full stack infrastructure and experience troubleshooting incidents and production issues.
Working knowledge in several technology disciplines required for the full end-to-end service operations stack: network administration & security (Cisco/Juniper), identity & access management (Active Directory, Azure AD, SAML, OpenID Federation, certificates, and keys), cybersecurity, on-prem & cloud architecture, Windows & Linux OS, performance monitoring & management, troubleshooting (application & database), change management, API integration, and automation (Ansible, PowerShell, KQL, or Shell scripting).
Strong communication (written/verbal) and facilitation skills with solid ability to convey business and technical information to a diverse audience.
Strong analytical and problem-solving skills with the ability to work through difficulties with persistence.
Basic understanding of the software development lifecycle and software engineering best practices, including code management (Git/GitHub) and CI/CD pipelines.
Experience working in an agile team (Scrum / Kanban) is considered a plus.

Responsibilities

The Site Reliability Engineer improves and protects the software and systems behind the organization's IT services, focusing on scalability, availability, and performance. The role leads agile teams in troubleshooting system problems and develops automation scripts to eliminate repetitive tasks.