Site Reliability Engineer at Motorola Solutions
Gatineau, QC J8Z 3H6, Canada
Full Time


Start Date

Immediate

Expiry Date

19 Oct, 25

Salary

0.0

Posted On

20 Jul, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Software Development, Disaster Recovery, DevOps, Root Cause, Incident Response, Computer Engineering, Hybrid Cloud, Systems Programming, Network Architecture, Kanban, Communication Skills, Creativity, High Availability Architecture, Capacity Planning, Load, Architecture

Industry

Information Technology/IT

Description

COMPANY OVERVIEW

At Motorola Solutions, we believe that everything starts with our people. We’re a global close-knit community, united by the relentless pursuit to help keep people safer everywhere. Our critical communications, video security and command center technologies support public safety agencies and enterprises alike, enabling the coordination that’s critical for safer communities, safer schools, safer hospitals and safer businesses. Connect with a career that matters, and help us build a safer future.

JOB DESCRIPTION

As a software engineer on the Emergency Call Management site reliability engineering (ECM-SRE) team, you will join talented software engineers who work directly with product and engineering teams to continually improve reliability across our suite of public safety products.

Your responsibilities will include:

  • Architecture and implementation of Monitoring/Observability objectives. This includes maintenance of Alert response playbooks.
  • Creation and reinforcement of the HA and reliability strategy.
  • Triage of customer-reported incidents and problems to the proper software team, requiring troubleshooting and problem management skills.
  • Maintenance and reporting of SLOs and error budget.
  • Facilitation of Chaos Engineering activities with multiple engineering teams.
  • Developing the SRE culture and sharing best practices across Motorola Solutions’ Emergency Call Management organization.
  • On-call support alongside multiple engineering teams for products and services in production. This role focuses on Incident Command to maintain focus and direction of the incident process. This is essential to meet regulatory reporting requirements.
  • Assisting Motorola Solutions’ customer support teams in creating customer-facing communication documents, requiring strong communication skills.
  • Facilitation of Failure Mode and Effects Analysis with multiple engineering teams.

The right individual will have a passion for observability, reliability, automation, incident response, and enabling innovation.

BASIC REQUIREMENTS

  • BS in Computer Engineering (or equivalent degree)
  • 4+ years of professional software development experience
  • Excellent communication skills
  • Experience developing cloud-based applications
  • Experience developing REST-based APIs and implementing microservice principles and architectures
  • Experience with modern DevOps tooling (including CI/CD pipelines)
  • Familiarity with the concepts involved in designing a high availability architecture
  • Familiarity with observability and monitoring
  • Familiarity with automated testing
  • Creativity and persistence when solving complex problems
  • Enthusiasm for learning key technologies, architectures, processes, and best practices

PREFERRED SKILLS

  • Familiarity with SRE or DevOps
  • Familiarity with container deployment and orchestration technologies at scale
  • Familiarity with SLOs and SLIs
  • Familiarity with incident response, disaster recovery, root cause analysis, and postmortems
  • Familiarity with IaC
  • Familiarity with chaos engineering
  • Familiarity with redundancy and failovers
  • Familiarity with capacity planning and load balancing
  • Familiarity with service mesh
  • Familiarity with feature flags, canary releases, or blue/green deployments
  • Familiarity with hybrid cloud architecture
  • Familiarity with developing cloud-based applications with a multi-tenant database architecture
  • Familiarity with systems programming (network stack, file system, OS services) and networking (L2 vs. L3, network architecture, VLANs, etc.)
  • Experience working in Agile teams leveraging Scrum, Kanban, or other methodologies and/or understanding of Agile development concepts
  • Experience being on-call for a product in production

TRAVEL REQUIREMENTS

Under 10%
