Software Engineer, Observability at Lyft

Toronto, ON, Canada

Full Time

Start Date

Immediate

Expiry Date

18 Oct, 25

Salary

108000.0

Posted On

19 Jul, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Computer Science, Systems Engineering, Automation, AWS, Kubernetes, Software Development, Python, Kibana, Managed Services, Go, Teams

Industry

Information Technology/IT

Description

At Lyft, our purpose is to serve and connect. We aim to achieve this by cultivating a work environment where all team members belong and have the opportunity to thrive.

EXPERIENCE:

3+ years of experience working on teams responsible for software development, automation, and systems engineering.
Bachelor’s Degree or equivalent experience in Computer Science or a relevant discipline.
Proficiency in creating production-ready code in one or more high-level languages, such as Go or Python.
Experience operating infrastructure in public cloud environments, such as AWS, including familiarity with Managed Services.
Experience in building and maintaining observability infrastructure to support robust monitoring and analysis.
Familiarity with Kubernetes and managing multi-cluster environments in production settings.
Experience using monitoring, alerting, and logging systems such as Prometheus, Grafana, and Kibana.

Responsibilities

Maintain and analyze metrics from operating systems, control planes, and applications to assist in fault detection and performance enhancement.
Maintain, improve, and develop tooling and systems that enhance the reliability, scalability, and efficiency of our platform.
Assist engineering teams in defining service-level objectives (SLOs) and provide the tooling needed to monitor them and to balance feature development speed against reliability.
Collaborate with cross-functional engineering teams to enhance Lyft’s observability and meet developers’ needs, ensuring alignment with design and production readiness reviews, platform management, and capacity planning.
Keep our documentation at a world-class level by documenting infrastructure operations processes and insights.
Identify repeatable actions and automate repetitive tasks.
Participate in our team’s on-call rotations, respond to incidents, and support other teams to mitigate customer-impacting events.