Site Reliability Engineer (Contract - Outside of IR35)
at TwinStream
Bristol, England, United Kingdom
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 29 Nov, 2024 | GBP 500 - 550 | 30 Aug, 2024 | N/A | Go, Ansible, Jenkins, SQL, Performance Tuning, Security Protocols, Disaster Recovery, AWS, OpenShift, Python, Hosting Services, Java, Troubleshooting, Relational Databases, Automation Tools, Kubernetes, Shell Scripting | No | No |
Description:
WHO ARE WE:
In 2019, our founders were working as engineers solving complex cross-domain problems in defence and security organisations.
TwinStream was formed to consolidate their collective expertise and experience into one business, providing technical excellence and exceptional service to their clients. We have teams working both on-site with clients and remotely from home.
DETAILS:
Rate: £500 - £550
Location: Onsite in Bristol with possible 24/7 call out when on rota.
As some of our clients work in specific industries, any offer will be conditional upon successful completion of DV clearance.
About the role:
We are looking for skilled engineers to join a new team that will deploy and maintain our established cross-domain system for a customer. The system uses an AMQP event-driven microservices architecture and makes extensive use of Docker container services. As a team member, you will maintain a continuous deployment pipeline, work with feature delivery teams to promote component releases into production, and apply configuration management tools to ensure all deployments are consistent and correctly configured.
The system is designed to be highly observable and available. The team will use monitoring tools to verify that all components are meeting SLA/SLO requirements. If any problems are identified, the team will take preventative or remedial action to minimise customer impact and restore service as quickly as possible.
This role is perfect for an experienced engineer who is comfortable working in a managed service environment and wants to gain more experience with best-of-breed DevOps tools and techniques.
Key Responsibilities of the Site Reliability Engineer:
- Collaborate with Feature Development teams to promote new component versions into production as efficiently as possible.
- Maintain the system to agreed service level and availability objectives using real-time monitoring tools and system-generated metrics.
- Instrument new system metrics and alerts to pre-empt issues and improve performance.
- Respond to monitoring alerts and customer incidents, taking preventative/remedial action to minimise customer impact.
- Liaise with key customer stakeholders to schedule capability changes and capture new service requirements as they arise.
- Apply automation techniques to reduce manual operations burden.
Skills & Experience Required:
- Must be eligible and willing to undergo DV clearance.
- Experience with infrastructure automation tools (CloudFormation, Terraform or Ansible)
- Experience working with Docker containers & container orchestration tools (such as Kubernetes, OpenShift or Docker Swarm)
- Experience using and maintaining CI/CD tools (such as Jenkins or GitHub Actions)
- Good understanding of relational databases and SQL
- Linux command line, administration and shell scripting
- Solid understanding of monitoring, auto-scaling, performance tuning, troubleshooting and disaster recovery best practices
- Working knowledge of network security protocols
- Working knowledge of AWS
- Experience with monitoring tools such as InfluxDB, Prometheus or Grafana
DESIRABLE SKILLS:
- Experience of working in a managed service environment
- Experience using, developing with and maintaining cloud hosting services (ideally AWS EC2, RDS, S3, Lambda)
- Experience of event-driven integration with MQ messaging (RabbitMQ or similar AMQP solution)
- Knowledge of cross-domain principles & technologies
- Industry experience writing well-tested code in one of our platform languages (Java, Go, Python or similar)
REQUIREMENT SUMMARY
Min: N/A, Max: 5.0 year(s)
Information Technology/IT
IT Software - Application Programming / Maintenance
Software Engineering
Graduate
Proficient
1
Bristol, United Kingdom