Service Reliability Engineer - Catalyst
at IO Global
Remote, Scotland, United Kingdom
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 24 Aug, 2024 | Not Specified | 25 May, 2024 | N/A | DevOps, Infrastructure, Rust, AWS, Computer Science, Ansible, Code | No | No |
Description:
SUMMARY
The Service Reliability Engineer (SRE) at Project Catalyst plays a crucial role in ensuring the reliability, availability, and performance of our production systems supporting our open-source projects. Reporting to the Senior Service Reliability Engineer, this role engages closely with development teams and key stakeholders to integrate software engineering principles with systems engineering. The responsibilities include creating and maintaining tools, automations, and infrastructure code to enhance platform efficiency and resilience. Successful candidates will contribute significantly to our mission by improving service scalability and performance while fostering a culture of collaboration and continuous improvement.
EDUCATION / EXPERIENCE
- BS degree in Computer Science or related technical field, or equivalent practical experience.
- Extensive experience in DevOps, SysAdmin, or a similar role, with a strong background in Infrastructure as Code (using Terraform and Ansible).
- Prior experience with Rust and with one or more cloud providers (AWS preferred; GCP or Azure also relevant) is advantageous. Cloud certifications are a plus.
SPECIALIST SKILLS
- Deep knowledge of Infrastructure as Code (IaC) principles.
- Practical experience in designing and implementing cloud-based solutions.
- Familiarity with Rust as a software development tool is a plus.
Responsibilities:
- Design, write, and deliver tools and software using Go, Python, and Bash to enhance the availability, scalability, and efficiency of our services.
- Manage the entire lifecycle of services—from inception and design, through deployment, operation, and refinement.
- Conduct sustainable incident response and lead blameless postmortems.
- Participate in on-call rotations, addressing service interruptions and technical challenges promptly.
- Collaborate with development teams to design solutions that prioritize customer experience, scalability, and performance.
- Analyze system performance and reliability to provide enhancement recommendations.
- Establish and maintain service-level objectives (SLOs), service-level indicators (SLIs), and error budgets.
- Implement and advocate for security best practices.
The above list of responsibilities is not exhaustive, and you will be expected to perform different tasks as your role within the organisation changes.
REQUIREMENT SUMMARY
Min: N/A, Max: 5.0 year(s)
Information Technology/IT
IT Software - Application Programming / Maintenance
Software Engineering
BSc
Computer Science
Proficient
1
Remote, United Kingdom