Senior DevOps Engineer / Site Reliability Engineer - 6+ years (4+ in DevOps) at Prometteur Solutions Pvt. Ltd.

Surat, Gujarat, India

Full Time

Start Date

Immediate

Expiry Date

02 Jun, 26

Salary

0.0

Posted On

04 Mar, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

AWS, Docker, Kubernetes, Terraform, GitHub Actions, Prometheus, Grafana, Python, Bash, ECS, EKS, CloudFormation, Linux, Networking, SSL/TLS, Incident Response

Industry

IT Services and IT Consulting

Description

Company Description

We at Prometteur Solutions Pvt. Ltd. are a team of IT experts, who came with a promise of delivering technology-empowered business solutions. We provide world-class software and web development services that focus on playing a supportive role to your business and its holistic growth. Our highly-skilled associates and global delivery capabilities ensure the accessibility and scale to align client's technology solutions with their business needs. Our offerings span the entire IT lifecycle: from Consulting through Packaged, Custom, and Cloud Applications as well as a variety of Infrastructure Services.

Job Description

● Own production infrastructure reliability, uptime, and performance across all environments
● Architect, deploy, and maintain scalable, secure AWS infrastructure and dedicated servers
● Design and manage containerized workloads using Docker and orchestration platforms (ECS / EKS / Kubernetes)
● Build, optimize, and enforce CI/CD pipelines using GitHub Actions and modern Git workflows
● Implement infrastructure as code (IaC) using Terraform or CloudFormation
● Establish and maintain monitoring, logging, and alerting systems for proactive incident detection
● Define and track SLOs, SLIs, and error budgets aligned with business priorities
● Lead incident response, root cause analysis (RCA), and post-mortems
● Optimize AWS cloud costs through right-sizing, usage analysis, and architectural improvements
● Support high-traffic and multi-tenant SaaS systems with a focus on isolation and scalability
● Ensure security, compliance, access control, secrets management, and audit readiness
● Collaborate closely with backend, frontend, and mobile teams to improve reliability and deployment workflows
● Maintain clear documentation for infrastructure, operational processes, and runbooks

Qualifications

Core DevOps & Cloud

● 6+ years of experience in DevOps / SRE / Platform Engineering
● Strong hands-on expertise with AWS (EC2, VPC, IAM, RDS, S3, Lambda, CloudWatch, etc.)
● Experience managing dedicated / bare-metal servers and Linux-based systems
● Deep understanding of networking, DNS, SSL/TLS, load balancing, and firewalls

Containers & Orchestration (Mandatory)

● Strong experience with Docker
● Hands-on experience with container orchestration platforms:
  ○ ECS / EKS / Kubernetes
● Ability to design scalable and fault-tolerant container-based systems

Monitoring, Logging & Observability (Mandatory)

● Proven experience with monitoring and observability stacks, such as:
  ○ Prometheus, Grafana
  ○ ELK / OpenSearch, Loki
● Ability to design alerting strategies that reduce noise and improve response time

CI/CD & Automation

● Strong experience with GitHub and Git-based workflows
● Hands-on expertise in building CI/CD pipelines using GitHub Actions
● Strong scripting skills (Bash, Python preferred)

Additional Information

All your information will be kept confidential according to EEO guidelines.

Responsibilities

The role involves owning production infrastructure reliability, uptime, and performance, which includes architecting, deploying, and maintaining scalable AWS infrastructure and dedicated servers. Responsibilities also cover designing and managing containerized workloads, building CI/CD pipelines, and leading incident response and root cause analysis.