Senior Engineer, Site Reliability at LinkedIn

Bengaluru, Karnataka, India

Full Time

Start Date

Immediate

Expiry Date

28 Jun, 26

Salary

0.0

Posted On

30 Mar, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Site Reliability Engineering, System Design, SLOs, SLAs, SLIs, Incident Response, Root Cause Analysis, Automation, Observability, Monitoring, Logging, Tracing, Alerting, Performance Tuning, Capacity Planning, Cloud Migrations

Industry

Software Development

Description

Company Description

LinkedIn is the world’s largest professional network, built to create economic opportunity for every member of the global workforce. Our products help people make powerful connections, discover exciting opportunities, build necessary skills, and gain valuable insights every day. We’re also committed to providing transformational opportunities for our own employees by investing in their growth. We aspire to create a culture that’s built on trust, care, inclusion, and fun – where everyone can succeed. Join us to transform the way the world works.

Job Description

At LinkedIn, our approach to flexible work is centered on trust and optimized for culture, connection, clarity, and the evolving needs of our business. The work location of this role is hybrid, meaning it will be performed both from home and from a LinkedIn office on select days, as determined by the business needs of the team.

At LinkedIn, the Productivity Engineering Site Reliability Engineering (SRE) team plays a critical role in ensuring our enterprise business applications are reliable, scalable, secure, and highly automated. We are seeking a Senior Engineer, Site Reliability (Sr. SRE) to design, build, and operate highly reliable systems that power mission-critical financial and enterprise applications.

In this role, you will partner closely with Development and SRE teams from early design through production, driving improvements in reliability, performance, and scalability across complex application ecosystems. You will collaborate with cross-functional infrastructure teams to modernize and scale financial systems platforms. You will lead and contribute to key initiatives across application, database, and middleware layers, including performance optimization, observability, and modernization from legacy to modern platforms.

This is an exciting opportunity for an engineer who thrives on solving complex production challenges, driving automation at scale, and building resilient systems.

Responsibilities

Partner with development teams to influence system design for reliability, scalability, and performance from early stages.
Define and implement SRE practices including SLOs, SLAs, SLIs, error budgets, and operational readiness.
Own end-to-end production and development operations, including monitoring, alerting, incident response, and root cause analysis.
Troubleshoot and resolve issues across application, database, middleware, and infrastructure layers.
Automate operational workflows to reduce manual effort and improve system reliability.
Contribute to observability strategy across monitoring, logging, tracing, and alerting platforms.
Build and improve internal tools and platforms to increase engineering productivity and operational efficiency.
Improve deployment, release, and change management processes through automation and standardization.
Perform performance tuning, capacity planning, and system optimization across distributed systems.
Support modernization initiatives, including cloud migrations and adoption of new technologies.
Collaborate across development, infrastructure, and business teams to deliver scalable solutions.
Participate in on-call rotations and support 24x7 production operations through shift-based coverage.

Qualifications

Basic Qualifications

BA/BS in Computer Science, Engineering, or related field, or equivalent practical experience.
5+ years of experience in SRE, DevOps, Production Engineering, or similar roles.
Experience operating and troubleshooting large-scale production systems.
Experience with Unix/Linux systems and networking fundamentals.
Proficiency in at least one programming or scripting language (e.g., Python, Shell).
Experience with monitoring, alerting, and incident management systems.
Experience with Oracle ERP (EBS/Fusion), Oracle Database, or enterprise financial systems.
Experience with Oracle technologies (e.g., RAC, Data Guard) or similar database platforms.

Preferred Qualifications

Understanding of SRE principles and building systems aligned with SLA/SLO/SLI objectives.
Experience with database recovery and disaster recovery setup.
Experience with MS SQL Server and MySQL.
Familiarity with Java/J2EE-based architectures and performance tuning.
Experience with configuration management tools (e.g., Ansible, Chef).
Familiarity with observability platforms (e.g., Oracle EM, Grafana, Azure Log Analytics).
Exposure to SOX-compliant environments and governance frameworks.
Understanding of distributed systems, storage systems, and web architectures.
Experience working with diverse teams and communicating in fast-paced environments.
Experience using AI-assisted coding platforms.

Suggested Skills

Site Reliability
Database (Oracle, MS SQL Server, MySQL)
Oracle Cloud Infrastructure (OCI)
Business Applications (Oracle ERP/Fusion)

You will Benefit from our Culture

We strongly believe in the well-being of our employees and their families. That is why we offer generous health and wellness programs and time away for employees of all levels.

Additional Information

India Disability Policy

LinkedIn is an equal employment opportunity employer offering opportunities to all job seekers, including individuals with disabilities. For more information on our equal opportunity policy, please visit https://legal.linkedin.com/content/dam/legal/Policy_India_EqualOppPWD_9-12-2023.pdf

Global Data Privacy Notice for Job Candidates

Please follow this link to access the document that provides transparency around the way in which LinkedIn handles personal data of employees and job applicants: https://legal.linkedin.com/candidate-portal.

Workplace Type: Hybrid
Career Track & Grade: IC3/8
Department: Engineering

Responsibilities

The Senior Site Reliability Engineer will partner with development teams to influence system design for reliability, and will define and implement SRE practices such as SLOs, SLAs, and SLIs. The role also owns end-to-end production operations, troubleshoots issues across the application, database, middleware, and infrastructure layers, and drives automation to improve system reliability and efficiency.