GCP Principal Engineer SRE

at HSBC

Sheffield, England, United Kingdom

Start Date: Immediate
Expiry Date: 03 Dec, 2024
Salary: Not Specified
Posted On: 05 Sep, 2024
Experience: N/A
Skills: Kubernetes, Virtualisation, Databases, High Pressure, Bash, Incident Response, Infrastructure, Participation, Jenkins, Code, Docker, Storage, Automation, Communication Skills, Information Technology, Operating Systems, Computer Science, Google Cloud Platform
Telecommute: No
Sponsor Visa: No

Description:

JOIN A DIGITAL FIRST BANK THAT’S POWERED BY PEOPLE.

Our technology team builds innovative digital solutions rapidly and at scale to deliver the next generation of banking services for our customers around the world.
We have an entrepreneurial mindset. Our people work together, creating an agile, collaborative, and innovative culture. You’ll learn and expand your skills, plus we will support you every step of the way as you grow your career.
Our CTO practice is seeking a talented, highly motivated and experienced GCP Principal Engineer with a strong emphasis on Site Reliability Engineering (SRE) to join the HSBC GCP team. This role reports into the Google Cloud Platform Lead. As an SRE-focused Principal Engineer, you will play a critical role in ensuring the reliability, performance, and scalability of our infrastructure and applications, with a particular focus on GCP Shared Services. Your expertise will be instrumental in supporting the team in delivering a seamless and highly available service to our customers.

Requirements

  • Bachelor’s degree in Computer Science, Information Technology, or a related field. (Or equivalent experience)
  • Expert-level knowledge of Google Cloud Platform
  • Proven mastery of core technologies associated with Cloud infrastructure operations and automation including Storage, Networks, Virtualisation, Databases and Operating Systems
  • Deep and broad understanding of resilience with associated continuous improvement approaches
  • Significant, practical experience of major incident response and troubleshooting methodologies. Demonstrable experience of leading a team to successful remediation of high-pressure, materially impacting incidents.
  • Practical, enterprise experience with implementing containerization and orchestration technologies, such as Docker and Kubernetes.
  • Outstanding influencing and communication skills, with experience of communicating complex subject matter to senior executives.
  • Strong programming and automation skills (e.g. Python and Bash)
  • Strong experience with selecting and optimising monitoring and alerting tools (e.g., Cloud Monitoring, Cloud Logging, ELK).
  • Deep experience in infrastructure as code (IaC) tools such as Terraform and Ansible
  • Deep experience working with CI/CD tooling such as Jenkins and Cloud Build
  • Expert-level industry certifications across a broad range of technical subjects.
  • This role may require participation in an on-call rotation to address critical incidents outside of regular business hours

Responsibilities:

  • Provide strategic direction to the GCP Service Resilience Engineering practices across the range of key technologies and infrastructure components used, leveraging deep enterprise experience and current or emerging industry best practices.
  • Perform hands-on development and coding using the GCP Foundation Platform at HSBC.
  • Provide technical direction for a small engineering pod, setting the standards for and delivering on core SRE practices such as infrastructure reliability, capacity planning, security, automation, incident management, performance optimisation, monitoring and alerting.
  • Collaborate closely and maintain strong relationships with a broad range of senior stakeholders across many functions, including Architecture, Security and Compliance, Cyber Security and other Cloud platform teams.
  • Proactively identify and remediate risks and issues before they have a significant impact on the platform and its customers. Support the definition of epics, stories or tasks required for the service resilience engineering team.
  • Be a champion for safe change. Contribute to process improvements and help to promote a culture of safety and accountability through change-related ceremonies and individual opportunities for feedback.
  • As a senior technical leader on the team, use your experience to contribute regularly and constructively to more strategic matters, including but not limited to culture, operating models and cross-cloud consistency.
  • In the event of a major incident, assume operational control and be the ultimate technical escalation point, owning the technical resolution and associated follow-up actions, focusing on organisational learning and avoidance of recurrence.
  • Engage regularly with partners. Influence enterprise partners to drive changes to their roadmaps that favour HSBC priorities, and strongly influence relevant product developments and shared initiatives to achieve stronger business outcomes.
  • Develop yourself and others. Maintain and make progress against an ambitious personal development plan while seizing opportunities for day-to-day and more strategic technical development of the wider GCP team and its customers.
  • Be a voice in the industry and represent HSBC at a senior, technical level at appropriate conferences, workshops and seminars.

This role is based in Sheffield/Hybrid


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - Other

Software Engineering

Graduate

Computer science information technology or a related field

Sheffield, United Kingdom