Data Engineer at Fetcherr
Warsaw, Masovian Voivodeship, Poland
Full Time


Start Date

Immediate

Expiry Date

27 Jun, 26

Salary

0.0

Posted On

29 Mar, 26

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Python, SQL, ETL/ELT Pipelines, Airflow, Dagster, Spark, Dask, Beam, AWS, GCP, Azure, BigQuery, Docker, Kubernetes, CI/CD, Pytest

Industry

Software Development

Description
Fetcherr is an AI-driven company specializing in deep learning, algorithmic trading, and large-scale data solutions. Our core technology, the Large Market Model (LMM), enables accurate demand forecasting and real-time, data-driven decision-making. Originally focused on the airline industry, Fetcherr is expanding its AI solutions across additional industries.

We are seeking a Data Engineer to join our growing data team to design, build, and maintain data pipelines and infrastructure. You will work on distributed systems, orchestration frameworks, cloud environments, and modern data technologies to ensure that data flows reliably and efficiently across the organization.

Key Responsibilities:

- Optimize processes for scalability and efficiency in distributed environments.
- Ensure data quality, integrity, and performance across workflows.
- Work with cloud-native solutions and containerized environments (Docker/Kubernetes).
- Implement and manage orchestration frameworks (Airflow, Dagster, etc.).
- Collaborate with cross-functional teams to support business and research needs.
- Develop and enforce CI/CD processes and data testing frameworks (pytest, Great Expectations, or similar).
- Monitor, troubleshoot, and continuously improve pipeline reliability and performance.

Requirements

You'll be a great fit if you have:

- 4+ years of professional experience with Python and SQL.
- Proven experience building and maintaining ETL/ELT pipelines (batch/streaming) with orchestration frameworks (Airflow, Dagster, or similar).
- Hands-on experience with distributed computing frameworks (Spark, Dask, Beam) and large-scale data processing.
- Experience with major cloud platforms (AWS, GCP, or Azure); GCP/BigQuery is an advantage.
- Proficiency with Docker/Kubernetes and CI/CD pipelines (GitLab CI, GitHub Actions, or similar).
- Solid understanding of software engineering practices (data structures, algorithms, TDD, code quality).
- Familiarity with data testing and monitoring frameworks (pytest, Great Expectations, observability tools).

Nice to Have:

- Experience with dbt, ClickHouse, or Ray.
- Familiarity with Python data libraries (pandas, NumPy, Apache Arrow, Jinja).
- Background in functional programming.
Responsibilities
The Data Engineer will design, build, and maintain data pipelines and infrastructure, with a focus on distributed systems, orchestration frameworks, and cloud environments. Key tasks include optimizing processes for scalability, ensuring data quality, and implementing orchestration frameworks such as Airflow or Dagster.