Senior / Lead Data Engineer (Python, Kafka, Iceberg, ClickHouse) at VANGUARD SOFTWARE PTE LTD
Singapore, Singapore
Full Time


Start Date

Immediate

Expiry Date

16 Sep, 25

Salary

7000

Posted On

17 Jun, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Mandarin, Information Systems, English, Technical Discussions, Python, SQL, Data Systems, Java, Data Engineering, Computer Science, Performance Tuning

Industry

Information Technology/IT

Description

JOB SUMMARY

We are looking for a talented and experienced Senior/Lead Data Engineer to join our innovative team. The Senior/Lead Data Engineer will play a critical role in leading the design, development, and maintenance of our data infrastructure, pipelines, and analytical tools. The ideal candidate will have a strong technical background in data engineering, expertise in analytical tools, and proven leadership skills to mentor and guide a team of data engineers.

JOB REQUIREMENTS

  • Bachelor’s degree in Computer Science, Information Systems, or equivalent qualification.
  • 6+ years of experience in data engineering, including 2+ years in a technical lead or senior IC capacity.
  • Proficient in Python, Java, and SQL, with strong expertise in schema design, performance tuning, and warehouse modeling.
  • Hands-on experience with lakehouse architectures (e.g., Iceberg, Delta Lake), data warehouses (e.g., Hive, ClickHouse), and object storage (e.g., AWS S3, MinIO).
  • Strong knowledge of orchestration tools (Airflow, DolphinScheduler), ETL/ELT design, and streaming frameworks (Kafka, Flink, Spark).
  • Proven experience independently setting up and managing end-to-end data architecture in on-premise environments.
  • Demonstrated success mentoring engineers in high-performance, cross-functional teams.
  • Familiarity with Git-based workflows, CI/CD pipelines, and observability tools for production-grade data systems.
  • Self-motivated with a strong ownership mindset, adaptability, and willingness to travel when needed.
  • Fluent in English and Mandarin (written/spoken), with experience leading technical discussions in multilingual and cross-regional settings.
Responsibilities
  • Design and implement scalable batch and streaming data pipelines using tools such as Flink, Spark, Debezium, and SeaTunnel (an open-source data integration tool).
  • Ingest and process high-volume data from APIs, operational databases, and semi-structured formats (e.g., JSON, CSV, XML, logs) to support diverse analytical use cases.
  • Build reusable transformation pipelines to consolidate cross-domain data (e.g., user behavior, transactions) into analytics-ready marts.
  • Architect and optimize data storage and modeling layers using MinIO/S3, Iceberg, ClickHouse, and other OLAP or object storage platforms to improve query performance and data reliability.
  • Maintain multi-layered data warehouse architecture (staging, core, mart) aligned with business needs.
  • Ensure robust CI/CD, lineage, observability, and compliance through tools like OpenMetadata and DataHub.
  • Mentor junior engineers, conduct rigorous code reviews, and promote engineering best practices across the data team.
  • Work cross-functionally with product managers, analysts, and business stakeholders to translate data needs into scalable pipelines and business insights.
  • Stay current with data engineering trends and technologies, and continuously drive platform improvements.