Senior Site Reliability Engineer (Mobile) at PayPal
San Jose, California, United States
Full Time


Start Date

Immediate

Expiry Date

14 Jan, 26

Salary

0.0

Posted On

16 Oct, 25

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Site Reliability Engineering, Mobile Development, iOS Development, Android Development, Performance Profiling, Observability Platforms, Automation, Python, Go, Swift, Kotlin, Incident Response, Root Cause Analysis, CI/CD, Gradle, Bazel

Industry

Software Development

Description
Delivers complete solutions spanning all phases of the Software Development Lifecycle (SDLC) (design, implementation, testing, delivery, and operations), based on definitions from more senior roles.
Advises immediate management on project-level issues.
Guides junior engineers.
Operates with little day-to-day supervision, making technical decisions based on knowledge of internal conventions and industry best practices.
Applies knowledge of technical best practices in making decisions.

Reliability & Observability
Implement mobile-specific SLIs and SLOs (e.g., crash-free sessions, ANRs, app startup time, network success rates).
Build and maintain Datadog dashboards, alerts, and playbooks for mobile services.
Ensure instrumentation in mobile apps produces actionable telemetry.

Tooling & Automation
Develop automation and tools under the guidance of the Staff SRE, including crash/ANR triage workflows, automated regression detection for performance and reliability, and dashboards and bots to surface release health.
Contribute to libraries and scripts for consistent instrumentation across iOS/Android apps.

Incident Response
Participate in on-call rotations for mobile incidents.
Drive root cause analysis, document lessons learned, and contribute to blameless postmortems.
Collaborate with backend SREs and service owners on cross-system issues.

Collaboration & Enablement
Partner with mobile developers to embed reliability practices into development and testing.
Provide guidance on instrumentation, monitoring, and alerting best practices.
Advocate for operational readiness of features before release.

Continuous Improvement
Identify areas of operational toil and automate repetitive tasks.
Contribute to the evolution of reliability processes and tooling across mobile teams.

Minimum of 5 years of relevant work experience and a Bachelor's degree or equivalent experience.
Required: 6+ years of experience in software engineering, SRE, or mobile development roles.
Strong understanding of iOS and/or Android development and performance profiling.
Hands-on experience with observability platforms (e.g., Datadog, Firebase Crashlytics, Sentry).
Experience building automation, scripts, or tools for reliability (Python, Go, or similar).
Working knowledge of Swift/Kotlin for instrumentation in mobile apps.
Strong problem-solving and debugging skills, especially in cross-system issues.
Experience in incident response and root cause analysis.
Familiarity with CI/CD for mobile (Harness, Fastlane, Jenkins).
Knowledge of Gradle and Bazel build systems.
Previous exposure to setting up on-call models or alerting frameworks.
Experience collaborating in a distributed, global engineering team.

Mobile apps have reliable monitoring and alerting with meaningful signals in Datadog.
Automation and tools reduce manual toil and improve incident detection and resolution speed.
Mobile development teams are better equipped to handle on-call and operational ownership.
Issues impacting mobile reliability are identified early and mitigated quickly.
End-user experience improves through reduced crashes, ANRs, and performance regressions.
Responsibilities
The Senior Site Reliability Engineer (Mobile) delivers complete solutions across all phases of the Software Development Lifecycle, guiding junior engineers and making technical decisions. They implement mobile-specific SLIs and SLOs, build and maintain dashboards, and participate in incident response.