Software Engineer (AI Platform) at Satori Analytics

Greece

Full Time

Start Date

Immediate

Expiry Date

07 Sep, 26

Salary

0.0

Posted On

09 Jun, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Python, System Design, API Design, Data Modeling, Concurrency/Async, Distributed Systems, Relational Databases, LLM APIs, FastAPI, Pydantic, Docker, CI/CD, Testing Strategy, Debugging, Code Review, Data Pipelines

Industry

IT Services and IT Consulting

Description

Are you passionate about AI? 🤖 At Satori Analytics, we aim to change the world one algorithm at a time by bringing clarity to global brands through Data & AI. From cloud-based ecosystems for fintech to predictive models for airlines, our cutting-edge solutions cover the entire data lifecycle, from ingestion to AI applications. As a fast-growing scale-up, our team of 100+ tech specialists, including Data Engineers, Data Scientists, and more, delivers innovative analytics solutions across industries like FMCG, retail, manufacturing and FSI. Join us as we lead the data revolution in South-Eastern Europe and beyond!

Together with a partnering company, we're looking for a strong Software Engineer to build the systems behind an AI agent evaluation platform: systems that test, grade, and stress AI agents at scale, and turn the results into actionable signal about how those agents behave.

This is a software engineering role first (designing services, data models, and pipelines that have to be correct, tested, and maintainable), applied to an AI problem domain. You'll work with LLMs regularly, but the core of the job is engineering, not prompt-tuning.

What Your Day Might Look Like:

- Backend services & APIs: Build and maintain the services, data models, and APIs that power the platform, designed for correctness, testability, and scale.
- Simulation & orchestration: Work on the systems that coordinate complex, multi-step interactions between AI agents and external systems, improving their reliability and throughput.
- Evaluation & scoring: Design systems that grade agent outputs, combining deterministic checks with model-assisted judgment, and make scoring reliable, explainable, and reproducible.
- Data pipelines: Build pipelines that generate, transform, and quality-check large volumes of structured data and benchmark content.
- Quality & reliability: Add the tests, instrumentation, and safeguards needed to trust outputs from systems that are inherently non-deterministic.
Your Superpowers 🚀

- 4+ years building and shipping production software, with strong proficiency in Python.
- Deep software engineering fundamentals: system and API design, data modeling, concurrency/async, testing strategy, debugging, and code review. You can own a non-trivial service end-to-end.
- Experience designing and operating distributed or service-oriented systems (queues, workers, APIs), not just calling them.
- Comfort designing schemas and working with relational databases, plus the migrations and performance concerns that come with them.
- Working knowledge of LLM APIs: orchestration, structured outputs, and handling non-determinism. We expect you to use LLMs effectively, but this is not a prompt-engineering role.
- Ability to reason about correctness of probabilistic systems: how to test, measure, and trust outputs that aren't byte-for-byte deterministic.
- High quality bar: you write tests, types, and docs by default, and you keep changes small and reviewable.

Bonus points for:

- Experience building agentic or multi-agent systems, tool-use, or orchestration frameworks.
- Background in evaluation / benchmarking of ML or LLM systems (rubrics, golden datasets, model-as-judge, inter-rater reliability).
- Experience with distributed task queues and async workloads.
- Modern Python tooling and typed codebases (e.g. type checkers, linters, Pydantic, FastAPI).
- Retrieval / search experience and working with data ingest pipelines.
- Some comfort with the infra side (Docker, CI/CD) so you can ship what you build.

Perks on Perks

- Competitive salary.
- Training budget to level up your skills from top tech partners like Microsoft, AWS, Salesforce, and Databricks. Whether it’s certifications or courses, we’ve got you covered.
- Private insurance, top-tier tech gear, and the chance to work with a stellar crew.

Ready to create some data magic with us? Hit that apply button and let’s get started. ✨

Responsibilities

Design and build the backend services, data models, and pipelines for an AI agent evaluation platform. Coordinate complex interactions between AI agents and external systems while ensuring scoring is reliable and reproducible.