Staff Site Reliability Engineer

at Lightspeed

Toronto, ON, Canada

Start Date: Immediate
Expiry Date: 19 Nov, 2024
Salary: Not Specified
Posted On: 22 Aug, 2024
Experience: 9 year(s) or above
Skills: Career Opportunities
Telecommute: No
Sponsor Visa: No

Description:

Hi there! Thanks for stopping by.
Are you actively looking for a new opportunity? Or just checking the market? Well… you might just be in the right place!
We’re looking for a Staff Site Reliability Engineer to join our NuORDER by Lightspeed team in North America. NuORDER by Lightspeed builds software solutions that help merchants grow the size and profitability of their business. You’ll join a team responsible for supporting the group in cross-cutting concerns, such as cloud infrastructure, reliability and incident management, data warehousing and analytics, cost transparency and efficiency, and much more. You will also support our growing Dev teams with the infrastructure and tools needed to continue scaling. You will build and support multi-region infrastructures and networks, and help run our products in a reliable, efficient and secure manner by implementing, advising on and advocating for well-known DevOps principles.

What you’ll be doing:

  • Work closely with development teams to empower them with the necessary tools and practices for monitoring software health in production, defining and measuring reliability metrics (SLI, SLO), and managing error budgets.
  • Design, build and maintain robust infrastructure on GCP, leveraging cloud native technologies such as GKE, Cloud SQL, BigQuery, etc.
  • Develop and manage CI/CD pipelines for efficient deployment and release using a number of technologies (GitLab, GitHub, Helm, Terraform, etc.).
  • Drive incident management process and conduct post-mortem analysis to prevent future outages.
  • Mentor junior SREs and developers, providing guidance on best practices in cloud architecture, data management, and software development.
  • Conduct system performance benchmarks and implement enhancements to improve system reliability and throughput.
  • Collaborate with cross-functional teams to identify, design, and implement internal process improvements in a cost-efficient manner.
  • Design and build robust, scalable, and highly available systems.
  • Build platform solutions and apply software engineering principles to improve the reliability of our software and accelerate software delivery.
  • Manage infrastructure change through infrastructure as code (IaC).
  • Be part of our on-call rotation.
  • Stay current with industry trends and emerging technologies, advocating for the adoption of new technologies and practices that improve product quality and team efficiency.

What you need to bring:

  • Bachelor’s degree in Computer Science or Engineering, or an equivalent level of real-world experience.
  • 7-9+ years of experience across site reliability engineering, systems administration, and/or software engineering.
  • Strong expertise in container orchestration platforms, specifically Kubernetes.
  • Strong understanding of both relational (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB, Cassandra, Redis).
  • Deep understanding of network protocols and IP networking, as well as experience with network troubleshooting.
  • Proficiency in programming languages such as Java, Python, Go, etc.
  • Proven track record of managing large-scale infrastructure in cloud environments, such as Google Cloud, AWS or Azure.
  • Experience with monitoring tools (e.g., Prometheus, Grafana, Datadog) and logging solutions (e.g., ELK stack).
  • Strong understanding of security best practices.
  • Exceptional problem-solving skills and the ability to work under pressure to troubleshoot and resolve complex issues.
  • Excellent communication skills to effectively collaborate with cross-functional teams.
  • Strong leadership skills, capable of leading projects and influencing engineering decisions across the organization.

We know that people are more than what’s on their CV. If you’re unsure that you have the right profile for the role… hit the ‘Apply’ button and give it a try!

What’s in it for you?

Come live the Lightspeed experience…

  • Ability to do your job in a truly flexible environment;
  • Genuine career opportunities in a company that’s creating new jobs every day;
  • Work in a team big enough for growth but lean enough to make a real impact.

… and enjoy a range of benefits that’ll keep you happy, healthy and (not) hungry:

  • Lightspeed share scheme (we are all owners)
  • Lightspeed RSU program (we are all owners)
  • Unlimited paid time off policy
  • Flexible working policy
  • Health insurance
  • Health and wellness benefits
  • Paid leave assistance for new parents
  • LinkedIn Learning
  • Volunteer day

    To all recruitment agencies: Lightspeed does not accept unsolicited agency resumes. If we have not directly engaged your company in writing to supply candidates for a specific vacancy, Lightspeed will not be responsible for any fees related to unsolicited resumes.
    Lightspeed is a proud equal opportunity employer and we are committed to creating an inclusive and barrier-free workplace. Lightspeed welcomes and encourages applications from people with disabilities. Accommodations are available on request for candidates taking part in all aspects of the selection process.
    Where to from here?
    Obviously, this has to be mutually beneficial: we want you to step into a role you love, and we want to offer you a place you’re proud to come to every day.
    Lightspeed is building communities through commerce, and we need people from all backgrounds and lived experiences to do that. We were founded in 2005, in Montreal’s gay village and our original members were all part of the LGBTQ+ community. The ethos of our business has been about inclusion from the very beginning, and we strive to provide a workplace where everyone belongs.
    Who we are:
    Powering the businesses that are the backbone of the global economy, Lightspeed’s one-stop commerce platform helps merchants innovate to simplify, scale, and provide exceptional customer experiences. Our cloud commerce solution transforms and unifies online and physical operations, multichannel sales, expansion to new locations, global payments, financial solutions, and connection to supplier networks.
    Founded in Montréal, Canada in 2005, Lightspeed is dual-listed on the New York Stock Exchange (NYSE: LSPD) and Toronto Stock Exchange (TSX: LSPD). With teams across North America, Europe, and Asia Pacific, the company serves retail, hospitality, and golf businesses in over 100 countries.
    Lightspeed handles your information in accordance with our Applicant Privacy Statement.

REQUIREMENT SUMMARY

Min: 9.0, Max: 14.0 year(s)

Information Technology/IT

IT Software - Application Programming / Maintenance

Software Engineering

Graduate

Computer science engineering or possess a related level of real-world experience

Proficient

1

Toronto, ON, Canada