DevOps Engineer — HPC & GPU Platform (Remote, Paris-based) at GECI Int.
Full Time


Start Date

Immediate

Expiry Date

14 Jul, 26

Posted On

15 Apr, 26

Experience

5 years or more

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Python, Go, AWS, GPU, HPC, Prometheus, Grafana, Distributed tracing, CI/CD, Slurm, Benchmarking, Observability, Compute clusters, Pulsar, Infrastructure

Industry

IT Services and IT Consulting

Description
We are looking for a DevOps engineer with a strong software development background to join a distributed GPU compute platform project for a leading UK SaaS company (fintech/enterprise planning space).

The context

You will be embedded in a senior engineering team building a greenfield GPU-accelerated compute platform on AWS. The central SRE team manages the underlying infrastructure; your role is to build the tooling on top of it.

What you will work on

- Build GPU benchmarking frameworks on AWS: scheduling benchmark runs, collecting and storing results, enabling performance comparison across versions
- Develop correctness validation tooling: automated testing of numerical accuracy of GPU compute outputs against reference results
- Implement distributed observability across all platform services: structured logging, distributed tracing (Pulsar), performance metrics
- Contribute to broader HPC coding tasks alongside the engineering team

What we are looking for

- Strong Python or Go developer: you write real application code, not just scripts
- Experience with observability tooling (Prometheus, Grafana, distributed tracing)
- Comfortable with AWS (EC2, IAM, VPC) and CI/CD pipelines
- HPC or GPU environment experience is a strong plus: Slurm, compute clusters, GPU workloads
- ENSIMAG, Centrale, INSA, X or equivalent engineering background preferred
- Fluent English: the team is distributed across France and the UK

Modalities

- 100% remote, 1 day/week in London
- Start: May 2026
- Contract: freelance/portage
- Rate: competitive, based on experience

Responsibilities
You will build GPU benchmarking frameworks and correctness validation tooling on an AWS-based compute platform. Additionally, you will implement distributed observability and contribute to broader HPC coding tasks within the engineering team.