Platform Engineer at The Casumo Company Limited

Swieqi, Northern Region, Malta

Full Time

Start Date

Immediate

Expiry Date

06 Aug, 26

Salary

0.0

Posted On

08 May, 26

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Kubernetes, Google Cloud Platform, Terraform, RabbitMQ, Kafka, Cloudflare, CI/CD, Java, JVM Performance, Elasticsearch, Kibana, Prometheus, Grafana, Linux, Bash, Python

Industry

Gambling Facilities and Casinos

Description

Casumo is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. Join us at Casumo, where you are invited to be your authentic YOU-MO!

Company Overview:

Welcome to Casumo, your passport to a world of fun, excitement, and responsible gaming. We're an international online casino company with a knack for creating unforgettable gaming experiences. Our secret sauce? A blend of innovation, security, and a dash of playful charm. Right now, we're on the hunt for a curious, problem-solving-oriented Platform Engineer!

Position Overview:

As a Platform Engineer, you'll be responsible for operating and improving our production infrastructure while championing reliability engineering practices across the organisation. This role combines hands-on platform engineering with incident management, performance optimisation, and systems reliability. You'll collaborate closely with development teams, security, and infrastructure stakeholders to ensure our services remain scalable, secure, and highly available.

Responsibilities:

Platform Engineering

- Operate, scale, and continuously improve our production Kubernetes clusters running on Google Cloud Platform (GCP).
- Build, manage, and evolve cloud infrastructure using Infrastructure as Code (Terraform).
- Maintain and optimise critical messaging and event-streaming infrastructure, including RabbitMQ and Kafka.
- Manage edge networking, security, and traffic routing through Cloudflare.
- Automate operational processes and improve CI/CD pipelines to support safe, fast, and reliable releases.
- Collaborate with development teams to optimise Java services, JVM performance, container resource allocation, and connection pooling.
- Maintain and troubleshoot observability and logging platforms such as Elasticsearch and Kibana.

Reliability & Incident Management

- Lead real-time incident response, structured triage, and mitigation during production outages.
- Coordinate cross-functional teams and communicate effectively with both technical and non-technical stakeholders during incidents.
- Design and execute load testing strategies to validate system resilience under peak traffic conditions.
- Conduct blameless postmortems, identify systemic risks, and implement long-term improvements to prevent recurring incidents.
- Improve monitoring, alerting, and observability using Prometheus and Grafana, helping reduce alert fatigue and improve Mean Time To Recovery (MTTR).

Requirements:

- 3+ years of experience in Platform Engineering, DevOps, or Site Reliability Engineering
- Strong hands-on experience operating and troubleshooting Kubernetes in production environments
- Experience running infrastructure on GCP or another major cloud provider
- Solid experience with Infrastructure as Code, particularly Terraform
- Experience managing message brokers and event-streaming platforms such as RabbitMQ and Kafka
- Experience managing production incidents, conducting load testing, and driving structured postmortems
- Strong Linux systems knowledge with excellent troubleshooting skills across infrastructure, networking, and application layers
- Scripting and automation experience using Bash and/or Python
- Experience troubleshooting outages related to Kubernetes scaling, performance degradation, or network and edge misconfigurations

Preferred Experience:

- Experience operating MySQL and ClickHouse environments, including high availability, replication, backups, failover, and capacity planning
- Experience working within high-traffic, high-concurrency environments such as iGaming, fintech, or SaaS
- Experience implementing and evolving SLO/SLI frameworks

Think we're a good match? Apply now!

The Perks (Malta Office)

Being a part of the Casumo group provides an unparalleled experience. You'll find yourself surrounded by the brightest minds within the most inspiring and collaborative office spaces!

In addition to that, you'll enjoy:

- Private health insurance
- Wellness incentives, including a fitness allowance and mental well-being services
- Flexible national holidays: public holidays mean more time off, and you choose how and when to enjoy them!
- 2 weeks Work From Anywhere (10 days), increased to 4 weeks (20 days) after a longer duration of employment within the Company: explore the world while working remotely
- Gourmet lunches and healthy snacks prepared by our in-house chef
- Variety of discounts from local vendors
- Access to some of the greatest tools and platforms for developing your professional skills and building success within your role
- A range of training courses, known as Casumo College, for continuous learning and growth
- Social events for building strong relationships with colleagues from all across the organisation

Our ABC values:

ASPIRE

At Casumo, "aspire" means pushing beyond the ordinary and transforming obstacles into stepping stones. Challenges are our breakfast of champions, and comfort zones are out of bounds. Mediocrity? Left behind. Our mantra? Dream big, aim high, and always be ready for the next adventure in innovation.

BELIEVE

Belief at Casumo isn't just a feel-good sticker; it's the glue that binds us. Turning "me" achievements into "we" victories, we're a tight-knit crew of dreamers, doers, and relentless supporters. With a high-five arsenal and a trusty cheerleading squad, we're on a mission to prove that together, we're not just strong; we're Casumo strong.

CARE

Care is our secret ingredient, the cherry on top of our game. It's not only about ensuring our players have a blast (responsibly, of course); it's about weaving a fabric of support so tight, even the toughest challenges can't tear us apart. From tailoring player experiences to being there for each other, we're all about creating memorable moments.

Responsibilities

Operate and scale production Kubernetes clusters on GCP while managing infrastructure as code via Terraform. Lead incident response, performance optimisation, and reliability engineering to ensure high availability of services.