Senior Site Reliability Engineer at Lightspeed
Montréal, QC, Canada - Full Time


Start Date

Immediate

Expiry Date

02 Jun, 25

Salary

0.0

Posted On

02 Mar, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Kubernetes, MySQL, PostgreSQL, Google Cloud Platform, Computer Science, System Administration, Docker, Code, Metrics, Data Systems, Problem Solving, Excel, Bash

Industry

Information Technology/IT

Description

Data is the new gold at Lightspeed.
We’re on the hunt for a Senior Site Reliability Engineer to join our dynamic SRE team! Our mission? Empower data teams with a scalable, secure, and high-performance infrastructure that keeps data flowing seamlessly across Lightspeed.
If you’re passionate about data security, reliability, and high availability, and love building robust infrastructure and governance frameworks, we’d love to connect!

Role:

  • Collaborate with the Data teams to design and implement scalable, reliable, secure, and cost-efficient cloud infrastructure.
  • Ensure security holistically, covering infrastructure, the supply chain, and interactions with third-party systems.
  • Contribute to the development of data and infrastructure self-service workflows.
  • Advocate for best practices in terms of Infrastructure as Code, High Availability, Disaster Recovery and Security.
  • Perform competitive analysis of infrastructure frameworks and data processing solutions.

And a little bit of…

  • Participate in day-to-day support and troubleshooting.
  • Contribute as part of the wider team to achieve organisational objectives, even if this means doing things that aren’t strictly within the scope of your role.

What will make you succeed:

  • The Data teams we support count on us, and our KPI is their happiness.
  • Infrastructure health, cost, and security are kept in check and continuously improved.

Experience:

  • Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
  • Strong experience managing production environments.
  • Strong experience with Google Cloud Platform.
  • Strong experience managing infrastructure with code.

Skills and Attributes:

  • Make Security Non-Negotiable: Integrate security into every decision you make.
  • Demonstrate Proficiency in Bash, Go, or Python: Show you can script and automate tasks with confidence.
  • Master Infrastructure as Code: Prove your expertise in IaC, preferably Terraform.
  • Understand GCP Data and Security Components: Know Google Cloud’s offerings and how to secure them.
  • Excel in Linux/Unix and Networking: Showcase robust system administration and networking skills.
  • Know Your Containers and Databases: Bring hands-on experience with Docker, Kubernetes, MySQL, and PostgreSQL.
  • Get Hands-On with Networking Tools: From VPN to VPC and VPC-SC, keep data flowing securely.
  • Embrace Problem-Solving and Adaptability: Tackle challenges head-on and pivot as needed.
  • Stay Eager to Learn and Step Out of Your Comfort Zone: Continuous improvement and curiosity drive success.
  • Own Your Deliverables: Take full accountability for your work; quality and follow-through matter.
  • Optimize Observability, Monitoring, and Alerting: Implement robust logging, metrics, and tracing solutions to detect issues before they become incidents, and to provide real-time insights into the health of your data systems.