Reliability Engineer at Epam Systems
Work from home, Yucatán, Mexico
Full Time


Start Date

Immediate

Expiry Date

19 Mar, 25

Salary

0.0

Posted On

14 Feb, 25

Experience

2 year(s) or above

Remote Job

No

Telecommute

No

Sponsor Visa

No

Skills

Continuous Integration, Communication Skills, Powershell, Python

Industry

Information Technology/IT

Description

We are seeking a Reliability Engineer to join our remote team. In this role, you will ensure the stability, integrity, and efficiency of the information systems that support core organizational functions. You will also play a key part in identifying and resolving issues that affect the reliability of our systems and services. A successful candidate will thrive in a fast-paced environment and be committed to proactive service optimization and issue prevention.

REQUIREMENTS

  • Minimum of 2 years of experience as a Reliability Engineer
  • Proven scripting skills in Python and PowerShell to automate tasks and processes
  • Strong knowledge of cloud platforms, specifically Azure and GCP
  • Experience with Azure DevOps pipelines for continuous integration and deployment
  • Proficient in debugging and troubleshooting complex software and hardware issues
  • Familiarity with monitoring tools such as GCP Cloud Logging, Grafana, and Azure Logs
  • Solid understanding of Site Reliability Engineering (SRE) principles and practices
  • Fluent English communication skills at a B2 level or higher

RESPONSIBILITIES

  • Monitor system performance and reliability, identifying and resolving issues before they impact users
  • Develop and implement maintenance procedures to reduce system downtime and increase overall efficiency
  • Collaborate with development teams to enhance system design and architecture with a focus on reliability and scalability
  • Conduct root cause analysis on incidents to prevent recurrence
  • Optimize system configurations and settings for improved performance and reliability
  • Implement and manage monitoring tools and software to provide critical operational metrics and insights