Site Reliability Engineer - Observability at Rivian and Volkswagen Group Technologies
Palo Alto, California, USA - Full Time


Start Date

Immediate

Expiry Date

13 Dec, 25

Salary

$194,610

Posted On

16 Sep, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Kubernetes, Python, Decision Making, Docker, Reliability Engineering, Soft Skills, Computer Science, Automation, Containerization, Cloud

Industry

Information Technology/IT

Description
  • Palo Alto, California
  • Software Engineering
    About Us
    Rivian and Volkswagen Group Technologies is a joint venture between two industry leaders with a clear vision for automotive’s next chapter. From operating systems to zonal controllers to cloud and connectivity solutions, we’re addressing the challenges of electric vehicles through technology that will set the standards for software-defined vehicles around the world.
    The road to the future is uncharted. By combining our expertise across connectivity, AI, security and more, we’ll map a new way forward. Working together, we’ll create a future that’s more connected, more intelligent, more sustainable for everyone.
    Role Summary
    We are seeking a Senior Site Reliability Engineer (SRE) specializing in Observability to join RivianVW’s Data Platform - Production Engineering team. In this role, you will design, implement, and scale robust observability systems to ensure the health, performance, and reliability of our production environment. You will collaborate closely with cross-functional teams to create telemetry solutions that provide actionable insights into our distributed systems.
    Responsibilities
  • Observability Platform Design: Architect, implement, and maintain observability systems, leveraging tools like Datadog, the LGTM stack, OpenTelemetry, and Vector to enable real-time performance monitoring, logging, and alerting.
  • Telemetry Optimization: Evolve and scale telemetry pipelines to ensure low latency and high availability for metrics, logs, and traces across multi-cloud environments.
  • Performance Engineering: Proactively identify performance bottlenecks, optimize systems, and provide recommendations for reliability improvements.
  • Scalable Automation: Implement automation solutions to scale systems sustainably while driving improvements in reliability and deployment velocity.
  • Incident Management: Collaborate with the incident response team to establish data-driven debugging and troubleshooting processes using observability data.
  • Tooling Development: Create and maintain self-service observability tools and dashboards to empower teams across the organization.
  • Cross-functional Collaboration: Partner with development, DevOps, and infrastructure teams to define SLOs/SLIs and ensure observability is embedded throughout the software lifecycle.

Qualifications
  • Educational Background: Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.
  • Experience: 5+ years in Site Reliability Engineering or a related role with a strong emphasis on observability.
  • Technical Expertise:
      • Proficiency in designing and operating observability platforms with tools like Prometheus, Grafana, Loki, Jaeger, or Datadog.
      • Experience with OpenTelemetry and distributed tracing in microservices architectures.
      • Deep knowledge of Kubernetes (e.g., EKS), ArgoCD, and Crossplane.
  • Programming Skills: Strong proficiency in Python, Go, or similar languages for building automation and custom telemetry solutions.
  • Cloud & Systems: Familiarity with multi-cloud setups, containerization (Docker), and Linux system fundamentals.
  • Soft Skills: Exceptional problem-solving, communication, and a data-driven approach to decision-making.

Pay Disclosure
Salary Range/Hourly Rate for California-Based Applicants: $146,900 - $194,610 USD
Actual compensation will be determined based on experience, location, and other factors permitted by law.
Benefits Summary: Rivian and Volkswagen Group Technologies provides robust medical, prescription, dental and vision insurance packages for full-time employees, their spouse or domestic partner, and their children up to age 26. Coverage is effective on the first day of employment.
