Systems Reliability Engineer - EMEA
at IQGeo
Cambridge CB2, England, United Kingdom
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 30 Nov, 2024 | Not Specified | 02 Sep, 2024 | N/A | Pipelines, Platforms, Graphs, Customer Experience, Bash, GitHub, Python, Kubernetes, Scalability, Mitigation Strategies, AWS, Reliability, Integration, MTTR, Docker, PostgreSQL, Code, Implementation Services | No | No |
Description:
JOB SUMMARY
The Systems Reliability Engineering (SRE) department supports IQGeo’s hosted customers in AWS and provides consultative services alongside Implementation Services for external deployments in all three cloud providers and some on-prem scenarios. SRE provides a CI/CD pipeline to deliver solutions from Engineering and Implementation Services to customers. The SRE team works closely with the Engineering, Support, Implementation Services, and Pre-Sales departments to ensure that they are successful with cloud infrastructure and Kubernetes deployments.
JOB DESCRIPTION
We are looking for a Systems Reliability Engineer who will work closely with Engineering, Implementation Services, Support, and the rest of the Operations Team to enable rapid delivery of capabilities to production systems through a secure Continuous Integration/Continuous Delivery (CI/CD) pipeline. You will be responsible for building and maintaining the entire customer pipeline and meeting the infrastructure needs of IQGeo’s hosted offerings. You will be a key member of the team, where the rubber meets the road, doing work that matters for our customers and using the latest technologies in cloud computing, GIS, containers, and Kubernetes. You will provide backend support and ensure that Support and Implementation Services are set up for success from an infrastructure and pipeline perspective. You will also be available for an on-call rotation.
REQUIRED SKILLS & ABILITIES
- Expert with GitHub or a similar source repository and CI/CD collaboration platform
- Expert with Kubernetes, EKS, or another container orchestration platform
- Expert with a scripting language such as Python or Bash; sed/awk experience is preferred
- Experience with Docker or similar container technology
- Experience with PostgreSQL
- Understanding of a well-architected cloud infrastructure deployment in AWS, including:
- Compute
EDUCATION AND EXPERIENCE
- Bachelor’s degree from a four-year college or university; an engineering degree is preferred
- 5+ years of experience in the telecom industry, the wireless industry, customer management, or a relevant field
- An equivalent combination of education and experience will be considered
Responsibilities:
Actively and consistently support all efforts to simplify and enhance the customer experience.
- Build resilient, self-healing systems that scale seamlessly (high performance) and improve system reliability (always available)
- Monitor system health using charts, graphs, and logs; detect and trace problems; and react to issues at scale
- Write post-mortems and participate in forensic root cause analysis to implement corrective measures that prevent issues from recurring
- Create, modify, evolve, and document risk-mitigation strategies to eliminate potential risks that could impact the performance, scalability, and reliability of systems and services
- Create, modify, evolve, repair, and maintain scripts to secure CI/CD pipelines across multiple domains
- Create, modify, or evolve current processes for source control, build, integration, automated testing, security scanning, and delivery of applications
- Interact with Product House and Implementation Services to deliver end-to-end software delivery pipelines
- Create automated development and operations scripts and processes to ensure pipelines are reliable, scalable, and repeatable, with no errors or bugs or with very minimal customer impact
- Leverage Infrastructure as Code and Configuration as Code to automate deployments
- Be held to MTTR or other SLAs
REQUIREMENT SUMMARY
Min: N/A, Max: 5.0 year(s)
Information Technology/IT
IT Software - Application Programming / Maintenance
Information Technology
Graduate
Engineering
Proficient
1
Cambridge CB2, United Kingdom