Company Description
Technology is our how, and people are our why. For over two decades, we have been harnessing technology to drive meaningful change.
By combining world-class engineering, industry expertise, and a people-centric mindset, we consult and partner with leading brands across industries to create dynamic platforms and intelligent digital experiences that drive innovation and transform businesses.
From prototype to real-world impact - be part of a global shift by doing work that matters.
Job Description
As a Senior Data Engineer specialising in Python and Databricks, you will design, build, and optimise data pipeline solutions on Azure Databricks and related cloud platforms. Working closely with data scientists, analysts, and engineers, you will ensure our data infrastructure supports advanced analytics and business insights across industries (including energy, resources, and mining). You will join a collaborative, agile team where continuous improvement, innovation, and knowledge sharing are part of the culture.
- Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver effective pipeline solutions.
- Design, develop, and maintain robust ETL/ELT pipelines on Azure using Databricks (Spark) along with services like Azure Data Factory and Synapse, to ingest, process, and transform large datasets.
- Implement data validation, cleansing, and governance procedures to guarantee data quality, integrity, and security. This includes enforcing data standards and addressing data quality issues proactively.
- Continuously improve the scalability, efficiency, and cost-effectiveness of data pipelines. Identify opportunities to enhance performance, reliability, and cost-efficiency across our data systems.
- Monitor data pipeline performance and promptly troubleshoot any issues or failures to ensure high data availability and consistency. Leverage observability tools and best practices to maintain reliable pipelines.
- Develop streaming or event-driven data processes as needed for real-time analytics, leveraging frameworks like Apache Kafka and Spark Structured Streaming.
- Maintain clear documentation of data pipelines, data models, and processes for transparency and team knowledge sharing. Follow best practices in coding, testing, and version control to ensure maintainable and auditable workflows.
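To make the validation and cleansing responsibility above concrete, here is a minimal pure-Python sketch. In practice this logic would run as PySpark transformations on Databricks; the field names and thresholds here are hypothetical, purely for illustration.

```python
# Minimal sketch of a validation/cleansing step. Field names are
# hypothetical; a real pipeline would express this as PySpark
# transformations on Databricks and quarantine rejects for review.

REQUIRED_FIELDS = {"sensor_id", "timestamp", "value"}

def validate_record(record: dict) -> bool:
    """Accept a record only if all required fields exist and value is a non-negative number."""
    if not REQUIRED_FIELDS <= record.keys():
        return False
    return isinstance(record["value"], (int, float)) and record["value"] >= 0

def cleanse(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split records into (valid, rejected) so bad rows are quarantined, not silently dropped."""
    valid, rejected = [], []
    for r in records:
        (valid if validate_record(r) else rejected).append(r)
    return valid, rejected

raw = [
    {"sensor_id": "A1", "timestamp": "2024-01-01T00:00:00Z", "value": 3.2},
    {"sensor_id": "A2", "timestamp": "2024-01-01T00:00:00Z"},               # missing value
    {"sensor_id": "A3", "timestamp": "2024-01-01T00:00:00Z", "value": -5},  # out of range
]

valid, rejected = cleanse(raw)
print(len(valid), len(rejected))  # → 1 2
```

Routing rejected rows to a quarantine set rather than discarding them is what makes proactive handling of data quality issues possible: the rejects can be inspected, counted, and alerted on.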
Qualifications
- Proficiency in Python for data engineering (including PySpark and libraries like pandas/Polars) and in SQL for data querying and transformation.
- Solid understanding of data warehousing concepts and dimensional data modelling (e.g. star schema, Kimball methodology).
- Hands-on experience with relational database systems and SQL (e.g. SQL Server, PostgreSQL) and familiarity with NoSQL databases (e.g. MongoDB, Cassandra) for varied data storage needs.
- Strong experience designing and implementing ETL/ELT processes and integrating data from multiple sources.
- Proven experience working with cloud data platforms, especially Microsoft Azure.
- Expertise in Azure Databricks and the Spark ecosystem for large-scale data processing is required (experience with Delta Lake, Azure Data Lake Storage, Azure Data Factory/Synapse is highly valued).
- Familiarity with data pipeline orchestration and automation tools (e.g. Azure Data Factory, Apache Airflow, or Azure Functions) and with CI/CD pipelines for deploying data workflows.
- Experience monitoring data pipeline performance and using observability tools to ensure data reliability is a plus.
- Exposure to event-driven architectures and streaming data tools (such as Apache Kafka or Spark Structured Streaming) is beneficial for handling real-time data flows.
- Experience working in Agile teams with iterative development, and a collaborative approach to problem-solving.
- Holding a current Databricks certification (e.g. Databricks Certified Data Engineer) is a strong advantage.
- Background in or understanding of data from the energy, resources, or mining industry is a plus, as it will help in delivering business-focused insights in these sectors.
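As an illustration of the dimensional-modelling expectation above, the sketch below builds a tiny star schema with Python's built-in sqlite3 module. Table and column names are hypothetical; a production warehouse would use Delta Lake tables on Databricks rather than SQLite.

```python
import sqlite3

# Minimal star-schema sketch: one fact table referencing one dimension.
# Names are hypothetical and chosen to echo the mining/resources domain.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_site (
        site_key  INTEGER PRIMARY KEY,
        site_name TEXT,
        region    TEXT
    );
    CREATE TABLE fact_production (
        fact_key        INTEGER PRIMARY KEY,
        site_key        INTEGER REFERENCES dim_site(site_key),
        produced_tonnes REAL
    );
""")
conn.executemany("INSERT INTO dim_site VALUES (?, ?, ?)",
                 [(1, "Mine A", "West"), (2, "Mine B", "East")])
conn.executemany("INSERT INTO fact_production VALUES (?, ?, ?)",
                 [(1, 1, 120.0), (2, 1, 80.0), (3, 2, 200.0)])

# A typical analytical query: aggregate the fact table by a dimension attribute.
rows = conn.execute("""
    SELECT d.region, SUM(f.produced_tonnes)
    FROM fact_production f
    JOIN dim_site d USING (site_key)
    GROUP BY d.region
    ORDER BY d.region
""").fetchall()
print(rows)  # → [('East', 200.0), ('West', 200.0)]
```

Keeping descriptive attributes (site name, region) in the dimension and numeric measures in the fact table is the core of the star-schema pattern: analytical queries join once and aggregate.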
Additional Information
Discover some of the global benefits that empower our people to become the best version of themselves:
- Finance: Competitive salary package, share plan, company performance bonuses, value-based recognition awards, referral bonus;
- Career Development: Career coaching, global career opportunities, non-linear career paths, internal development programmes for management and technical leadership;
- Learning Opportunities: Complex projects, rotations, internal tech communities, training, certifications, coaching, online learning platforms subscriptions, pass-it-on sessions, workshops, conferences;
- Work-Life Balance: Hybrid work and flexible working hours, employee assistance programme;
- Health: Global internal wellbeing programme, access to wellbeing apps;
- Community: Global internal tech communities, hobby clubs and interest groups, inclusion and diversity programmes, events and celebrations.