AWS - Site Reliability Engineer

at EPAM Systems

Remote (work from home), Río Negro, Argentina

Start Date: Immediate
Expiry Date: 29 Jan, 2025
Salary: Not Specified
Posted On: 30 Oct, 2024
Experience: 5 year(s) or above
Skills: Python, Software Development
Telecommute: No
Sponsor Visa: No
Required Visa Status:
Citizen, GC, US Citizen, Student Visa, H1B, CPT, OPT, H4 Spouse of H1B, GC Green Card
Employment Type:
Full Time, Part Time, Permanent, Independent - 1099, Contract – W2, C2H Independent, C2H W2, Contract – Corp 2 Corp, Contract to Hire – Corp 2 Corp

Description:

Join EPAM as an AWS SRE. In this role, you’ll collaborate with service teams to improve the reliability and efficiency of workloads and services using SRE practices. If you’re a senior engineer with a good track record of highly scalable, distributed systems projects in the past 5 years, we’d love to hear from you.
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Requirements:

  • Senior Engineers with a good track record of highly scalable, distributed systems projects in the past 5 years
  • Previous experience working as an SRE engaged with active development teams is a must, and the candidate should have a good understanding of SRE methodologies and philosophies
  • AWS cloud expertise
  • Ideally, experience running multi-region workloads and in-depth knowledge of most commonly used AWS services
  • Observability experience with distributed services, for example, distributed tracing and similar concepts
  • Independent and self-directed, able to work alongside client engineering teams under minimal supervision
  • Strong programming and automation experience: Python, Golang
  • Understanding of the software development lifecycle

Responsibilities:

  • Collaborate with service teams to improve the reliability and efficiency of workloads and services using SRE practices
  • Develop and improve CI/CD processes to enhance release cadence and success
  • Build and work through a toil backlog, automating toilsome tasks
  • Document knowledge and processes
  • Practice and promote sustainable incident response and blameless postmortems
  • Write code that improves scalability, performance, maintainability, and security
  • Implement distributed monitoring practices
  • Refine monitoring processes, configurations, and thresholds
  • Contribute towards the identification and implementation of service level indicators and objectives for workloads and services


REQUIREMENT SUMMARY

Min: 5.0, Max: 10.0 year(s)

Information Technology/IT

IT Software - Other

Software Engineering

Graduate

Proficient

1

Remote (work from home), Argentina