LatAm Site Reliability Engineer

at Sporty Group

Remote, Scotland, United Kingdom

Start Date: Immediate
Expiry Date: 22 Nov, 2024
Salary: Not Specified
Posted On: 28 Aug, 2024
Experience: 4 year(s) or above
Skills: Logging, Scalability, Python, High Availability, Security, Latin America, Orchestration, Networking Protocol
Telecommute: No
Sponsor Visa: No

Description:

We consistently top the charts as one of the most used, if not the most used, sports betting websites in the countries where we operate.
With millions of weekly active users, we strive to be the best in the industry for our users.
In addition to our DevOps Team, we are building a Site Reliability Team whose purpose is to focus on site reliability and security. The role will also involve deployment, configuration, and monitoring, as well as the availability, latency, change management, emergency response, and capacity management of services in production.

REQUIREMENTS

4+ years SRE/DevOps experience
Be based in Latin America
Experience independently leading the planning and deployment of a project
Experienced with cloud platforms, especially AWS, including solid knowledge of how to utilize cloud resources to fulfill the demand from other teams and production
Familiar with one programming or scripting language (Python, Java, …)
Experience managing multiple Kubernetes clusters in production (virtualization, orchestration, scalability, security, and high availability), with tools such as Helm, Rancher, ArgoCD
Solid knowledge of networking protocols and cybersecurity, especially the TCP/IP stack and HTTP
A strong understanding of caching, including CDNs and HTTP caching (Cloudflare, AWS CloudFront)
Experienced with cloud-native monitoring solutions for large distributed systems using the observability model (traces, metrics, logs), with tools such as Prometheus, Jaeger, Loki, ELK, Grafana
Excellent troubleshooting skills, including Linux OS issue diagnosis and OS parameter optimization

Responsibilities:

Work with a team of DevOps/SRE and DBA professionals
Improve the infrastructure and processes currently deployed, and streamline the processes for deploying to new countries in the future
Holistically improve all aspects of our current infrastructure, including reducing costs, streamlining environment provisioning, lowering response times, and incorporating the latest techniques and technologies
Monitor and maintain the existing cloud infrastructure via autoscaling, automated alerts, and OpsWorks and Grafana dashboards
Take ownership and responsibility for our cloud operation activities
Liaise with external security agencies for annual audits as well as perform our own internal security sweeps
Aid in reconfiguring existing architecture to allow for rapid deployments to new countries
Mentor less experienced team members


REQUIREMENT SUMMARY

Min: 4.0, Max: 9.0 year(s)

Information Technology/IT

IT Software - Network Administration / Security

Software Engineering

Graduate

Proficient

1

Remote, United Kingdom