Site Reliability Engineer (Contract - Outside of IR35)

at TwinStream

Bristol, England, United Kingdom

Start Date: Immediate
Expiry Date: 29 Nov, 2024
Salary: GBP 550 Annual
Posted On: 30 Aug, 2024
Experience: N/A
Skills: Go, Ansible, Jenkins, SQL, Performance Tuning, Security Protocols, Disaster Recovery, AWS, OpenShift, Python, Hosting Services, Java, Troubleshooting, Relational Databases, Automation Tools, Kubernetes, Shell Scripting
Telecommute: No
Sponsor Visa: No

Description:

WHO ARE WE:

In 2019, our founders were working as engineers solving complex cross-domain problems in defence and security organisations.
TwinStream was formed to consolidate their collective expertise and experience into one business, providing technical excellence and exceptional service to their clients. We have teams working both on-site with clients and remotely from home.

DETAILS:

Rate: £500 - £550
Location: Onsite in Bristol with possible 24/7 call out when on rota.
As some of our clients work in specific industries, any offer will be conditional upon successful completion of DV clearance.

About the role:

We are looking for skilled engineers to join a new team that will deploy and maintain our established cross-domain system for a customer. The system uses an AMQP event-driven microservices architecture and makes extensive use of Docker container services. As a team member, you will maintain a continuous deployment pipeline, work with feature delivery teams to promote component releases into production, and apply configuration management tools to ensure all deployments are consistent and correctly configured.
The system is designed to be highly observable and available. The team will use monitoring tools to verify that all components are meeting SLA/SLO requirements. If any problems are identified, the team will take preventative or remedial action to minimise customer impact and restore service as quickly as possible.
This role is perfect for an experienced engineer who is comfortable working in a managed service environment and wants to gain more experience with best-of-breed DevOps tools and techniques.

Key Responsibilities of the Site Reliability Engineer:

  • Collaborate with Feature Development teams to promote new component versions into production as efficiently as possible.
  • Maintain the system to agreed service level and availability objectives using real-time monitoring tools and system generated metrics.
  • Instrument new system metrics and alerts to pre-empt issues and improve performance.
  • Respond to monitoring alerts and customer incidents, taking preventative/remedial action to minimise customer impact.
  • Liaise with key customer stakeholders to schedule capability changes and capture new service requirements as they arise.
  • Apply automation techniques to reduce manual operations burden.

Skills & Experience Required:

  • Must be eligible and willing to undergo DV clearance.
  • Experience in infrastructure automation tools (CloudFormation, Terraform or Ansible)
  • Experience working with Docker containers & container orchestration tools (such as Kubernetes, OpenShift or Docker Swarm)
  • Experience using and maintaining CI/CD tools (such as Jenkins or GitHub Actions)
  • Good understanding of relational databases and SQL
  • Linux command line, administration and shell scripting
  • Solid understanding of monitoring, auto-scaling, performance tuning, troubleshooting and disaster recovery best practices
  • Working knowledge of network security protocols
  • Working knowledge of AWS
  • Experience with monitoring tools such as InfluxDB, Prometheus or Grafana

DESIRABLE SKILLS:

  • Experience of working in a managed service environment
  • Experience using, developing with and maintaining cloud hosting services (ideally AWS EC2, RDS, S3, Lambda)
  • Experience of event-driven integration with MQ messaging (RabbitMQ or similar AMQP solution)
  • Knowledge of cross-domain principles & technologies
  • Industry experience writing well-tested code in one of our platform languages (Java, Go, Python or similar)

REQUIREMENT SUMMARY

Min: N/A, Max: 5.0 year(s)

Information Technology/IT

IT Software - Application Programming / Maintenance

Software Engineering

Graduate

Proficient
