SRE Observability Engineer - London at Photon Career Site
London, England, United Kingdom
Full Time


Start Date

Immediate

Expiry Date

17 Jul, 26

Salary

0.0

Posted On

18 Apr, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Openshift, Kubernetes, Grafana, Geneos Itrs, Mimir, Loki, Tempo, Prometheus, Promql, Helm, Bash, Python, Technical Documentation, Observability, Containerization, Automation

Industry

IT Services and IT Consulting

Description
SRE Observability Engineer

Key Responsibilities:
The Monitoring and Observability team is responsible for managing:
* Operating with a global footprint.
* Collaborating across various organizations within Citi to understand and develop observability solutions for enterprise-wide deployment at scale.
* Managing the legacy monitoring stack across the Production Management organization within Citi.
* Driving the strategic delivery of end-to-end Observability solutions in Citi.
* Providing in-depth analysis with interpretive thinking to define problems and develop innovative solutions.
* Directly impacting the business by influencing strategic functional decisions through advice, counsel, or provided services.
* Persuading and influencing others through strong and comprehensive communication and diplomacy skills.
* Performing other duties and functions as assigned.

Essential Skills:
* OpenShift/Kubernetes Administration: Experience deploying, managing, and troubleshooting containerized applications on OpenShift/Kubernetes, including resource management and networking.
* Grafana & Observability Stack:
  * Proficiency in administering Geneos ITRS at scale.
  * Proficiency in administering Grafana (user management, data sources, dashboards, alerts).
  * Working knowledge of Grafana backend components: Mimir (metrics), Loki (logs), and Tempo (traces).
  * Experience with Prometheus for metric collection and PromQL for querying.
* Helm Chart Management: Experience with Helm for deploying applications, including creating, modifying, and managing Helm charts, library charts, and dependencies.
* Technical Documentation: Ability to create clear and concise documentation for systems and processes.

Desired Skills:
* Application Deployment: Ability to deploy applications using Lightspeed Enterprise.
* Google Cloud Operations: Experience with Google Cloud operations.
* Scripting & Automation: Experience with Bash or Python scripting for automating operational tasks.
Responsibilities
The SRE Observability Engineer will manage and develop enterprise-wide observability solutions while maintaining legacy monitoring stacks. They will drive strategic delivery of end-to-end observability and influence functional decisions through expert analysis and communication.