Problem & Major Incident Manager at The AA
London, England, United Kingdom
Full Time


Start Date

Immediate

Expiry Date

26 Nov, 25

Salary

0.0

Posted On

26 Aug, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Good communication skills

Industry

Information Technology/IT

Description

THIS IS THE JOB

We’re looking for a dynamic and experienced Problem & Major Incident Manager to lead the charge in maintaining service stability and resilience across our IT operations. This role is primarily focused on Problem Management, where you’ll take ownership of identifying root causes, driving long-term solutions, and proactively improving service availability. You’ll also play a key role in managing Major Incidents, ensuring swift resolution of high-impact issues through effective coordination and communication.
This is a hands-on, high-visibility role that blends strategic thinking with operational execution. During office hours, your focus will be on Problem Management, with occasional support for Major Incidents. Out-of-hours, you’ll participate in a one-week-in-three rotation to lead Major Incident Management, ensuring continuity and rapid response across our critical services.

WHAT DO I NEED?

We’re looking for someone who:

  • Has strong Problem Management experience, with a proven ability to lead the end-to-end lifecycle of problem records and drive service stability.
  • Is a confident collaborator, able to work effectively with technical teams and manage challenging personalities with empathy, resilience, and tenacity.
  • Holds a minimum of ITIL V3 Foundation certification (V4 preferred), with a solid understanding of ITIL best practices.
  • Communicates clearly and empathetically, and is able to translate technical language into business impact.
  • Thrives in high-pressure situations and can juggle multiple priorities with ease.
  • Is passionate about service excellence and continuous improvement.

Responsibilities

  • Leading the lifecycle of Problem Management - identifying root causes, driving resolutions, and improving service availability.
  • Managing Major Incidents with confidence and clarity - facilitating bridge calls, coordinating with technical teams, and ensuring timely communication.
  • Hosting reviews and governance meetings with senior stakeholders to drive continuous improvement.
  • Creating knowledge articles and capturing lessons learned to support a culture of shared learning.
  • Working collaboratively across teams to ensure service continuity and operational integrity.