Software Engineer, Observability at Lyft
Toronto, ON, Canada - Full Time


Start Date

Immediate

Expiry Date

18 Oct, 25

Salary

108000.0

Posted On

19 Jul, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Computer Science, Systems Engineering, Automation, AWS, Kubernetes, Software Development, Python, Kibana, Managed Services, Go, Teams

Industry

Information Technology/IT

Description

At Lyft, our purpose is to serve and connect. We aim to achieve this by cultivating a work environment where all team members belong and have the opportunity to thrive.

EXPERIENCE:

  • 3+ years of experience working on teams responsible for software development, automation, and systems engineering.
  • Bachelor’s Degree or equivalent experience in Computer Science or a relevant discipline.
  • Proficiency in creating production-ready code in one or more high-level languages, such as Go or Python.
  • Experience operating infrastructure in public cloud environments, such as AWS, including familiarity with Managed Services.
  • Experience in building and maintaining observability infrastructure to support robust monitoring and analysis.
  • Familiarity with Kubernetes and managing multi-cluster environments in production settings.
  • Experience using monitoring, alerting, and logging systems such as Prometheus, Grafana and Kibana.

RESPONSIBILITIES:

  • Maintain and analyze metrics from operating systems, control planes, and applications to assist in fault detection and performance enhancement.
  • Maintain, improve, and develop tooling and systems that enhance the reliability, scalability, and efficiency of our platform.
  • Assist engineering teams in defining service-level objectives (SLOs) and provide the tooling needed to monitor them and to balance feature development speed against reliability.
  • Collaborate with cross-functional engineering teams to enhance Lyft’s observability and meet developers’ needs, ensuring alignment with design and production readiness reviews, platform management, and capacity planning.
  • Keep our documentation at a world-class level by recording infrastructure operations processes and insights.
  • Identify repeatable actions and automate repetitive tasks.
  • Participate in our team’s on-call rotations, respond to incidents, and support other teams to mitigate customer-impacting events.