Data Engineer at Qode

Vietnam

Full Time

Start Date

Immediate

Expiry Date

18 Aug, 26

Salary

0.0

Posted On

20 May, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Databricks, AWS, Apache Airflow, Airbyte, dbt, PySpark, SQL, ETL/ELT, Delta Lake, Docker, Kubernetes, Python, Data Modeling, CDC, Data Lakehouse, Data Pipelines

Industry

Software Development

Description

About the role

We are looking for a Data Engineer to join our Data Platform team, focusing on building scalable data pipelines and enabling analytics across the organization.

In this role, you will work with modern data stack tools like Databricks, AWS, Airflow, Airbyte, and dbt to design and maintain data workflows that support reporting, analytics, and data-driven decisions.

This is a good fit if you enjoy working with large-scale data systems, building reliable pipelines, and optimizing performance in a cloud-based environment.

Your Responsibilities

- Design and build scalable ETL/ELT pipelines using both batch and streaming approaches
- Develop ingestion workflows from multiple sources such as databases, APIs, and event streams
- Implement ingestion strategies including full load, incremental load, and CDC
- Orchestrate data workflows using Apache Airflow
- Manage data connectors using Airbyte
- Work with Databricks Lakehouse to build and optimize data processing pipelines
- Write and optimize complex SQL queries for analytics and transformation
- Build modular and testable data models using dbt (staging → intermediate → marts)
- Maintain data quality, observability, and reliability across the platform
- Work with AWS services such as S3, Lambda, EC2, and IAM
- Containerize data services using Docker and Kubernetes (EKS) when needed
- Document pipelines, data models, and data dictionaries for long-term maintainability

Requirements

- At least 6 years of experience in Data Engineering
- Strong understanding of data architectures such as Data Lake, Data Warehouse, and Lakehouse
- Hands-on experience with ETL/ELT pipelines, including batch and streaming processing
- Familiarity with ingestion patterns: full load, incremental, CDC, event-driven
- Experience working with Databricks (Delta Live Tables, Jobs, Notebooks)
- Strong skills in PySpark or Spark SQL for large-scale data processing
- Solid understanding of Delta Lake (ACID, time travel, schema evolution)
- Experience with Apache Airflow (DAGs, scheduling, monitoring)
- Experience with Airbyte or similar ingestion tools
- Strong SQL skills (CTEs, joins, window functions, query optimization)
- Experience with dbt for transformation, testing, and documentation
- Hands-on experience with AWS (S3, Lambda, IAM, etc.)
- Proficient English communication skills (at least C1 level)

Nice to Have

- Experience with Docker and Kubernetes (EKS)
- Experience running Airflow or Airbyte on Kubernetes
- Familiarity with data quality tools such as Great Expectations or Soda
- Experience with Terraform or Infrastructure as Code
- Exposure to data governance or catalog tools (e.g., Databricks Catalog)
- Experience with CI/CD pipelines (e.g., GitHub Actions)
- Strong Python skills for automation and pipeline scripting

👉 Our Benefit Packages:

- Attractive salary range, and we are open to negotiation if you're a strong fit
- Hybrid/Remote-friendly culture, work where you grow best!
- Flexible hours, async teamwork (we respect your focus time)
- Work equipment support
- Allowance for Certification & Skill Development
- Year-end bonus & performance-based rewards
- 22 days of paid leave from your 5th year - take a full month off
- Career growth with personal coaching sessions
- Open, collaborative team culture - no micromanagement, only trust
- Tools & AI-powered workflows that make remote work easier

About CoderPush

CoderPush is a remote-first technology company that partners with startups and global businesses to build scalable, high-quality software products. We focus on long-term collaboration, clear communication, and delivering real impact through strong engineering and product thinking.

Find out more at: https://coderpush.com/

Responsibilities

Design and build scalable ETL/ELT pipelines using batch and streaming approaches to support organizational analytics. Maintain data quality and observability while managing workflows with tools like Airflow, Airbyte, and Databricks.