Staff Infrastructure Network Engineer – Kubernetes, Cloud & Service Mesh at Lyft

Toronto, ON, Canada

Full Time

Start Date

Immediate

Expiry Date

10 Dec, 25

Salary

172,000

Posted On

12 Sep, 25

Experience

8 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Load Balancing, Incident Response, Large Scale Systems, Infrastructure, AWS, Code, HTTPS, Computer Science

Industry

Information Technology/IT

Description

Applications for this position will be accepted until September 24, 2025. Lyft reserves the right to close the application process early or extend the deadline at its discretion.
At Lyft, our purpose is to serve and connect. We aim to achieve this by cultivating a work environment where all team members belong and have the opportunity to thrive.
Lyft’s Infrastructure teams are responsible for building the foundational systems that engineers rely on to build stable, scalable, and efficient services. We build standardized infrastructure that helps software developers move fast, while still providing them with the flexibility they need to innovate within their teams. We are looking for experienced leaders to guide these teams as they create an exceptional development experience for all of Lyft. These are high-leverage roles: your work will have a multiplicative effect across Engineering and contribute directly to Lyft’s overall stability.
We’re looking for a Staff Infrastructure Network Engineer to join our Infrastructure team and play a pivotal role in designing, scaling, and evolving the systems that power Lyft’s entire engineering platform.
As a staff-level engineer, you’ll be a technical leader. You’ll set direction, solve deeply complex infrastructure challenges, and deliver solutions that are reliable, scalable, and secure. This role is ideal for someone who thrives on impact, collaboration, and working at scale.
You’ll partner with teams across Lyft to ensure our foundational network infrastructure supports high performance, reliability, and developer velocity.

Experience

Bachelor’s degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience
8+ years of hands-on experience operating Kubernetes in production and working in cloud environments such as AWS or GCP
Proficient in Infrastructure as Code (IaC) using tools like Terraform, and experienced in automating deployments in large-scale environments
Strong background in Linux systems, networking, DNS, load balancing, and protocols such as HTTP, HTTPS, and gRPC
Proven experience designing or operating distributed network management systems, including service mesh or service proxies (e.g., Envoy, Istio, NGINX, Cilium)
Deep understanding of AWS networking concepts, including VPCs, subnets, NAT gateways, NLBs, and security groups
Demonstrated ability to debug complex, multi-layer infrastructure issues and lead incident response across highly available, large-scale systems

Responsibilities

Design, implement, and maintain Lyft’s service mesh and edge routing using Envoy Proxy and Kubernetes on AWS, ensuring secure and reliable service-to-service communication
Scale and operate Lyft’s Kubernetes networking stack, including Ingress controllers, CNI plugins, and service discovery, to support high-availability microservices
Develop and optimize load balancing algorithms and traffic policies across control and data planes to improve latency, resiliency, and cost efficiency
Build and maintain observability infrastructure (e.g., Prometheus, OpenTelemetry, Grafana, Datadog) to enable real-time visibility and incident response
Lead incident investigation and resolution for production network issues, debugging across layers including Kubernetes, VPC networking, service mesh, and L4/L7 traffic flow