Site Reliability Engineer at Camlin Group
Lisburn BT28 2EX, United Kingdom
Full Time


Start Date

Immediate

Expiry Date

01 May, 25

Salary

0.0

Posted On

01 Feb, 25

Experience

4 year(s) or above

Remote Job

No

Telecommute

No

Sponsor Visa

No

Skills

Good communication skills

Industry

Information Technology/IT

Description

COMPANY DESCRIPTION:

Camlin is a global technology leader with a vision of bringing revolutionary products to life for a wide range of industries, including power and rail. It also has interests in a number of R&D projects across a variety of scientific sectors.
At Camlin, we believe in high-quality engineering and design, allowing us to develop market-leading products and services. In short, we love creating value for our customers by solving difficult problems. Today, Camlin operates in over 20 countries across the globe.

JOB OVERVIEW:

We are seeking a dedicated and experienced Site Reliability Engineer (SRE) to join our dynamic team. The SRE will be responsible for ensuring the reliability, performance, and availability of our critical systems and services. This role requires a blend of software engineering and operations skills to build and run large-scale, distributed, fault-tolerant systems.

EQUAL EMPLOYMENT OPPORTUNITY STATEMENT:

Individuals seeking employment at Camlin are considered without regard to race, colour, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, gender identity, or sexual orientation.

Responsibilities

System Reliability and Performance:

  • Design, implement, and maintain scalable and reliable infrastructure.
  • Monitor system performance, detect issues, and ensure maximum uptime.
  • Develop and implement strategies for disaster recovery and data backup.

Automation and Tooling:

  • Automate repetitive tasks to improve efficiency and reduce human error.
  • Build and maintain tools for deployment, monitoring, and operations.
  • Create and maintain CI/CD pipelines to streamline application delivery.

Incident Management:

  • Respond to and resolve incidents, minimizing impact on customers.
  • Conduct post-incident reviews to identify root causes and prevent recurrence.
  • Develop and maintain incident response protocols and playbooks.

Collaboration and Communication:

  • Work closely with development teams to integrate reliability into the software development lifecycle.
  • Communicate effectively with stakeholders about system status and health.
  • Provide guidance and mentorship to junior team members.

Security and Compliance:

  • Ensure systems comply with security standards and best practices.
  • Implement and maintain security measures, including patch management and vulnerability assessments.
  • Assist in audits and compliance initiatives as required.