Incident Commander at Caseware
Toronto, ON, Canada
Full Time


Start Date

Immediate

Expiry Date

23 Nov, 25

Salary

0.0

Posted On

23 Aug, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Good communication skills

Industry

Information Technology/IT

Description

Caseware is one of Canada’s original Fintech companies and has led the global audit and accounting software industry for over 30 years. Our software has more than 500,000 users across 130 countries and is available in 16 languages. While you might not have heard of us (yet), over 36,000 accounting and audit professionals list Caseware as a skill on their LinkedIn profiles!
We are seeking a proactive, calm-under-pressure Incident Commander to manage incident response within our SaaS operations. In this role, you’ll serve as the authoritative voice during incidents, driving resolution, ensuring effective communication across teams, performing root cause analysis (RCA), and producing clear post-incident documentation for both internal teams and customers.
Contact: Dana Liulica – Talent Acquisition Partner

Responsibilities
  • Initiate and oversee incident response efforts, acting as the primary bridge upon detection within a 24/7 SaaS environment.
  • Collaborate with cross-functional teams, including engineering, product management, and support. Leverage and implement integrations across tools such as JIRA, PagerDuty, New Relic, AWS, and Microsoft Teams to monitor, manage, and coordinate incident handling.
  • Drive teams to resolve incidents quickly and efficiently.
  • Understand the software and infrastructure landscape to guide resolution strategies.
  • Ensure the appropriate stakeholders are involved in active incidents to support rapid recovery.
  • Communicate clearly and effectively with both internal and external stakeholders to provide timely updates and resolution plans.
  • Track and report uptime metrics to internal and external audiences, promoting transparency in system reliability and performance.
  • Coordinate and lead post-mortem sessions after significant events, documenting root causes, lessons learned, and action items. Follow up on action items to ensure implementation and prevent recurrence.
  • Create comprehensive post-incident reviews (PIRs) and RCA documents that outline the timeline, impact, remediation, root cause, and preventive steps.
  • Implement proactive strategies and tools to reduce risks and strengthen system resilience.