Site Reliability Engineer

at TrustFlight

Vancouver, BC, Canada

Start Date: Immediate
Expiry Date: 25 Dec, 2024
Salary: Not Specified
Posted On: 27 Sep, 2024
Experience: N/A
Skills: Good communication skills
Telecommute: No
Sponsor Visa: No

Description:

TrustFlight is at the forefront of digitizing the aviation industry with the creation of intelligent workflow applications that automate operating and maintenance processes, enabling our customers to focus on the data and insights that matter. TrustFlight has bases in both England (London & Leamington Spa) and Canada (Vancouver). Our business is rapidly expanding, and we’re proud to share that we’re entirely self-funded and consistently profitable.
Not only are we disrupting the sector, we are also creating a great place to work that gives our people the freedom to create, innovate and influence how we do this. We continue to build an amazing group of people who are all here to make our products, services and culture the most envied in the industry!
We are seeking a talented Site Reliability Engineer (SRE) to join our Operations team. In this role, you will focus on ensuring the reliability, scalability, and performance of our systems and services across multiple cloud platforms. You’ll work with a variety of technologies, including cloud environments, containerized applications, and automation tools to continuously improve our infrastructure.

YOU IDEALLY NEED THE FOLLOWING TO QUALIFY:

  • A passion for delivering modern, scalable, performant, efficient, and resilient cloud-hosted systems
  • Demonstrable experience of building and supporting internal platforms (pipelines, tooling, etc.) that allow engineering teams to build, deploy, and operate software systems efficiently and effectively
  • Experience or knowledge as an SQL Database Administrator (DBA), including database backup and recovery processes
  • Hands-on experience with the Azure platform, including resource provisioning and cost management
  • Excellent troubleshooting skills, with a focus on diagnosing and resolving complex issues across distributed systems
  • A focus on building security and quality into development processes (i.e. DevSecOps)
  • Demonstrable understanding of the following:
      • Virtualization, Kubernetes, and containerisation (Docker, containerd)
      • Azure App Service, SQL Database, Front Door/CDN/WAF, Cognitive Search
      • CI/CD pipelines (preferably GitLab CI and Azure DevOps)
      • Infrastructure-as-Code (preferably Terraform)
      • Release processes and configuration management
      • Web application and microservice architecture
      • Networking, including DNS, NGINX, firewalls, routing, load balancing, and VPNs
  • Strong scripting skills for automation using PowerShell, Azure CLI, and Bash
  • Master-level organisation and detailed documentation skills
  • Excellent collaboration and team-working skills
  • Familiarity with the capabilities and practical applications of current AI technologies, and experience leveraging AI tools in operational processes

Responsibilities:

  • Maintain high reliability, availability, and scalability across cloud platforms (Azure, GCP) and resilient shared services (e.g., CI/CD)
  • Automate infrastructure and resource provisioning, improving system efficiency and streamlining workflows
  • Monitor system performance and capacity, implementing optimizations to ensure smooth operations
  • Respond to incidents, and investigate and resolve issues in operational environments
  • Manage database backups and recovery processes, ensuring data integrity and availability
  • Implement disaster recovery plans and regularly perform failover testing to ensure operational readiness
  • Ensure security is a first-class priority across all areas of the platform, adhering to industry best practices and compliance requirements
  • Contribute to blameless post-mortems for production incidents
  • Provide support to our UK teams, and in emergencies, outside of regular business hours on a rota or similar basis

In this role, you will be energized and guided by our experienced Operations team, fostering a dynamic environment for continuous learning and improvement. At TrustFlight, we deeply value teamwork and are committed to the personal and professional growth of each team member. We are looking for professionals who are confident in their ability to acquire new skills and grow their expertise through dedicated mentorship and a supportive work culture.


REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - Application Programming / Maintenance

Software Engineering

Graduate

Proficient

1

Vancouver, BC, Canada