Director PCS Cloud Operations SRE at GE HealthCare
Bengaluru, Karnataka, India
Full Time


Start Date

Immediate

Expiry Date

21 Jun, 26

Salary

0.0

Posted On

23 Mar, 26

Experience

10 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

SRE, CloudOps, SLI/SLO/SLA, Automation, Observability, APM/RUM, AIOps, Incident Management, Change Management, DR/BCP, FinOps, CI/CD, Infrastructure-as-Code, DevSecOps, AWS, Team Leadership

Industry

Hospitals and Health Care

Description
Job Description Summary
The Director – Cloud Operations provides leadership, innovation, and oversight for SRE and CloudOps across PCS. The role establishes the operating foundations, metrics, and automation needed to run mission-critical, greenfield applications with high reliability and security. It is accountable for meeting product SLAs while scaling Cloud Operations and institutionalizing modern SRE practices in close partnership with product, platform, and security teams.

Essential Responsibilities
Serve as the functional leader for the PCS Digital Cloud Operations team. Define the operating model, governance, and KPIs; drive automation and observability; and ensure secure, reliable deployments across environments with continuous improvement and tight collaboration with security. This role reports to the VP of Engineering – PCS Apps & Platform.

Key responsibilities include:
Own Cloud Operations for PCS cloud applications; stand up and scale CloudOps capabilities to support multiple products while adhering to committed SLAs.
Institutionalize SRE practices: implement SLI/SLO/SLA frameworks, error budgets, incident/post-mortem processes, and reliability runbooks; champion automation to reduce toil and improve service health and monitoring.
Build end-to-end observability (APM/RUM, logs, metrics, traces, health dashboards, proactive alerting) and evolve toward auto-healing and AIOps for anomaly detection and closed-loop remediation.
Drive change, incident, and problem management with clear RACI and stakeholder communications; reduce MTTR through streamlined L1–L4 escalation.
Establish and test DR/BCP posture; conduct AWS Well-Architected and operational readiness reviews for services (AWS-first, with multi-cloud considerations as needed).
Lead FinOps practices: cost allocation and accountability, right-sizing, savings plans/reserved instances, spend governance, and unit-economics optimization.
Evolve the operating model in partnership with platform and application teams; standardize CI/CD templates and "everything-as-code" for speed and repeatability.
Build and develop a high-performing team: hire, coach, and grow CloudOps/SRE talent and the next set of leaders; uphold high standards for quality and customer satisfaction.

Core KPIs & outcome metrics:
Service availability versus SLA/SLO and error-budget burn rate.
MTTD/MTTR and incident recurrence; % of incidents with post-mortems completed.
Change failure rate and lead time for changes for production deployments.
% automated runbooks/toil reduction; % of services with complete SLI/SLO coverage.

Basic Qualifications
Bachelor's degree in computer science or a STEM field.
A minimum of 10 years' experience leading technical teams in complex, fast-paced environments, including 5+ years in Cloud Ops and SRE leadership roles.
Proven expertise in DevSecOps, Day-2 Ops, APM/RUM, and Cloud Operations.
Proficiency building and operating services on public cloud (AWS-first) with CI/CD and Infrastructure-as-Code (e.g., Terraform/CloudFormation).
Track record of establishing SLIs/SLOs/SLAs, observability, and incident/change management at scale.
Strong leadership and team management skills, with the ability to inspire and motivate a team of engineers.
Excellent project management skills, with the ability to manage multiple complex projects simultaneously.
In-depth knowledge of SaaS technologies, cloud computing, and medical device development processes.

Desired Characteristics
Technical competencies:
Experience scaling CloudOps/SRE for multiple products and customer deployments.
Deep fluency in SLI/SLO/SLA design, error budgets, runbooks, and auto-healing patterns.
Strong AWS architecture and operations; Well-Architected reviews; capacity and cost optimization (FinOps).
Modern observability (APM/RUM/logs/metrics/traces) and AIOps for predictive analytics/anomaly detection.
Security by design (DevSecOps, policy-as-code) and DR/BCP planning/testing.

Leadership competencies:
Clear, decisive communicator able to influence across product, platform, and security stakeholders.
Builder-coach mindset: hire, mentor, and grow managers and ICs; create leaders of leaders.
Change agent who challenges the status quo while maintaining high standards for quality and customer satisfaction.
Operates with ownership, bias for action, and strong judgment in an ambiguous, high-growth environment.

Top 5 Critical Competencies & Skills
SRE & Reliability Leadership: SLI/SLO/SLA management, error budgets, disciplined post-mortems.
Cloud Operations at Scale (AWS-first): operational readiness, DR/BCP, change/incident/problem management, and Well-Architected operations.
Observability & AIOps: end-to-end telemetry, APM/RUM, automated remediation to reduce MTTR and toil.
DevSecOps & Policy-as-Code: secure-by-default pipelines and vulnerability management with measurable SLAs.
FinOps & Cost Governance: cost allocation, right-sizing, and spend optimization to improve unit economics while scaling.

Additional Information
Relocation Assistance Provided: No

At GE HealthCare, we see possibilities through innovation. We're partnering with our customers to fulfill healthcare's greatest potential through groundbreaking medical technology, intelligent devices, and care solutions. Better tools enabling better patient care. Together, we are not only building a healthier future but living our purpose to create a world where healthcare has no limits.
Responsibilities
This director role provides leadership and oversight for SRE and Cloud Operations across PCS, establishing operating foundations, metrics, and automation for mission-critical applications to ensure high reliability and security. Key duties involve owning Cloud Operations, institutionalizing SRE practices like SLI/SLO frameworks, building end-to-end observability, and leading FinOps practices.