Senior Site Reliability Engineer at Scapia

Bengaluru, Karnataka, India

Full Time

Start Date

Immediate

Expiry Date

05 Jul, 26

Salary

0.0

Posted On

06 Apr, 26

Experience

5 years or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Site Reliability Engineering, Linux System Administration, Networking, AWS, Golang, Bash, Python, Terraform, Prometheus, Grafana, Docker, Kubernetes, CI/CD, GitHub Actions, Jenkins, Infrastructure as Code

Industry

Financial Services

Description

About the Company

Scapia! A co-branded credit card that’s out there to make travel happen for people, by converting their everyday expenses into travel experiences. We’re a bunch of passionate people who work together, brainstorm, and debate with each other, and don’t stop until we’re proud of our work. Customer delight tops everything else! We’ve worked hard to create an environment of honesty and passion that sets everyone up for success.

Role Overview

We’re looking for a Site Reliability Engineer to join our engineering team and help build and maintain our infrastructure platform. You’ll design and operate scalable platforms, establish best practices for infrastructure management, and drive reliability and scalability initiatives that shape our platform and engineering practices. You’ll work closely with product and development teams to ensure our systems are reliable, performant, and scalable as we grow and ship features.

Key Responsibilities

Build and maintain platforms and tooling to enable development teams to deploy and operate services efficiently
Design and manage cloud infrastructure using Infrastructure as Code for scalability, reliability, and security
Establish best practices for infrastructure management, reliability, and scalability across the organization
Build and maintain CI/CD pipelines and developer tooling to enhance productivity and reduce deployment friction
Build and maintain observability solutions (monitoring, logging, tracing) to ensure system health and performance
Collaborate with development teams to embed reliability, performance, and scalability into services from the start

Required Qualifications

4-6 years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering
Strong knowledge of Linux system administration, networking, and OS fundamentals
Deep understanding of AWS services
Programming experience in Golang, Bash, or Python
Hands-on experience with Infrastructure as Code (Terraform or similar)
Experience with monitoring tools (Prometheus, Grafana) and observability platforms
Production experience with Docker and Kubernetes (EKS, GKE, or self-managed)
Strong troubleshooting and problem-solving skills for complex distributed systems
Experience with CI/CD tools (GitHub Actions, Jenkins, or similar)
Understanding of security best practices in cloud environments

Nice to have

Experience with GCP or Azure
Preferred certifications: AWS Certified Solutions Architect, Certified Kubernetes Administrator (CKA), Google Cloud Professional Cloud Architect

Responsibilities

You will design and operate scalable cloud infrastructure while building tools to enable efficient service deployment. Additionally, you will collaborate with development teams to implement observability solutions and ensure system reliability and performance.