Platform Owner, AIOps/SREngineering
at National Grid
Warwick CV34 6DA, United Kingdom
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 17 Nov, 2024 | GBP 95000 Annual | 26 Oct, 2024 | 7 year(s) or above | Good communication skills | No | No |
Description:
ABOUT US
Every day, we deliver safe and secure energy to homes, communities, and businesses, connecting people to the energy they need for their lives. Our expertise and track record position us uniquely to shape the sustainable future of our industry as the pace of change accelerates. To succeed, we must anticipate customer needs, reduce energy delivery costs, and pioneer flexible energy systems. This requires delivering on our promises and seeking opportunities for growth.
In IT and Digital, we collaborate closely with the diverse energy businesses within the National Grid group, revolutionizing operations through technology. Embracing Agile methodologies and Digital mindsets, we drive efficiency and bring new capabilities to internal and external customers as we lead the charge towards a carbon-free future.
Our work is critical, as National Grid powers millions of homes and businesses in the UK and US, and the technology we employ is vital to this task. The successful applicant for this position will play a crucial role in our mission, supported by our multicultural, customer-centric global team, with opportunities for professional development.
Responsibilities:
JOB PURPOSE
As a Platform Owner of AI Ops and SRE, your primary objective is to design and oversee the implementation of complex systems that meet functional and non-functional requirements. You will play a key role in developing system design policies, standards, and innovation processes specific to AI Ops and SRE. Additionally, you will actively monitor emerging technologies and assess their potential impact on the organization. Your responsibilities will include driving the strategic vision for AI Ops and SRE within the platform, ensuring alignment among stakeholders, and promoting a cohesive approach to AI Ops and SRE implementation.
WHAT YOU’LL DO
As a Platform Owner of AI Ops and SRE, your primary responsibility is to develop comprehensive strategies for implementing AI Ops and SRE practices within the organization. This involves understanding business requirements, assessing technical capabilities, and identifying areas where AI and automation can be leveraged to enhance reliability, performance, and operational efficiency.
Your key responsibilities as a Platform Owner of AI Ops and SRE include:
- Developing AI Ops and Site Reliability Engineering (SRE) Strategies: You will be responsible for developing strategies that incorporate AI Ops and SRE practices within the data center and cloud domain. This involves understanding business requirements, assessing technical capabilities, and identifying opportunities to leverage AI and automation for improved reliability and performance.
- Designing Cloud Architecture Solutions: You will design cloud and on-premise architecture solutions that integrate AI technologies and SRE principles. This includes designing scalable and resilient systems, implementing monitoring and alerting mechanisms, and ensuring high availability and fault tolerance.
- Collaborating with Development and Operations Teams: You will work closely with development and operations teams to provide technical guidance and ensure the successful implementation of AI Ops and SRE practices. This involves reviewing designs, providing recommendations, and promoting best practices for building and operating reliable and efficient cloud-based applications.
- Implementing AI-Driven Monitoring and Analytics: You will implement AI-driven monitoring and analytics solutions within the cloud domain. This includes leveraging machine learning and data analysis techniques to identify and predict system anomalies, performance bottlenecks, and potential failures.
- Establishing Incident Response and Resolution Processes: You will define and establish incident response and resolution processes aligned with SRE practices. This includes setting up incident management frameworks, defining escalation paths, and implementing effective incident response strategies to minimize downtime and ensure quick resolution.
- Driving Continuous Improvement and Optimization: You will drive continuous improvement and optimization efforts within the cloud domain. This involves analyzing system metrics, conducting root cause analysis, and implementing changes to optimize cloud performance, reliability, and efficiency. Automation and self-healing mechanisms will be employed to enhance system resilience and reduce manual intervention.
- Staying Current with Industry Trends: You will stay up to date with the latest industry trends, technologies, and best practices related to AI Ops, SRE, cloud, and on-premises computing. This includes attending conferences, participating in relevant communities, and continuously exploring new tools and techniques to enhance the organization’s AI Ops and SRE capabilities across the cloud and on-premises domains.
- Creating and delivering traceable and auditable customer success metrics for the platform services/products.
- Monitoring and analyzing platform performance metrics and reporting on the overall health of the platform to senior leadership.
- Managing the infrastructure platform within budget guardrails to ensure alignment with company priorities and goals.
- Collaborating with Transversal Teams to align Non-Functional Requirements (NFRs) and prioritize them jointly.
REQUIREMENT SUMMARY
Min: 7.0, Max: 10.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
Graduate
Proficient
1
Warwick CV34 6DA, United Kingdom