Engineer (A2 DES, Databricks, PySpark, Python) at KPMG Nederland
Bengaluru, Karnataka, India
Full Time


Start Date

Immediate

Expiry Date

10 May, 26

Salary

Not specified

Posted On

09 Feb, 26

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Databricks, PySpark, Python, Spark SQL, ETL, Data Engineering, Azure Data Factory, Azure Data Lake Storage, Data Governance, Data Quality, Machine Learning, Data Pipelines, Generative AI, Microsoft Fabric, Cloud Services, Problem Solving

Industry

Business Consulting and Services

Description
Roles & responsibilities

Role Overview:

The Associate 2 - "Data Engineer with Databricks/Python skills" will be part of the GDC Technology Solutions (GTS) team, working in a technical role in the Audit Data & Analytics domain that requires developing expertise in KPMG proprietary D&A (Data and Analytics) tools and audit methodology. He/she will be part of the team responsible for extracting and processing datasets from client ERP systems (SAP/Oracle/Microsoft Dynamics) or other sources to provide insights through data warehousing, ETL, and dashboarding solutions to Audit and internal teams, and will be involved in developing solutions using a variety of tools and technologies.

The Associate 2 - "Data Engineer" will be predominantly responsible for:

Data Engineering

- Understand requirements, validate assumptions, and develop solutions using Azure Databricks, Azure Data Factory, or Python; handle data mapping changes and customizations within Databricks using PySpark
- Build Azure Databricks notebooks to perform data transformations, create tables, and ensure data quality and consistency; leverage Unity Catalog for data governance and a unified data view across the organization (a minimal PySpark sketch follows this list)
- Analyze large volumes of data using Azure Databricks and Apache Spark; create pipelines and workflows to support data analytics, machine learning, and other data-driven applications
- Integrate Azure Databricks with ERP or third-party systems using APIs, and build Python or PySpark notebooks to apply business transformation logic per the common data model
- Debug, optimize, performance-tune, and resolve issues with limited guidance when processing large datasets, and propose possible solutions
- Apply partitioning, optimization, and performance-tuning concepts to improve process performance (must-have experience)
- Implement Azure Databricks best practices for design, development, testing, and documentation
- Work with Audit engagement teams to interpret results and provide meaningful audit insights from the reports
- Participate in team meetings, brainstorming sessions, and project planning activities
- Stay up to date with the latest advancements in Azure Databricks, cloud, and AI development to drive innovation and maintain a competitive edge
- Be enthusiastic about learning, adapting, and integrating Generative AI and Azure AI services into business processes; experience working with Azure AI services is expected
- Write production-ready code
- Design, develop, and maintain scalable and efficient data pipelines to process large datasets from various sources using Azure Data Factory (ADF)
- Integrate data from multiple sources and ensure data consistency, quality, and accuracy, leveraging Azure Data Lake Storage (ADLS)
- Design and implement ETL (Extract, Transform, Load) processes to ensure seamless data flow across systems using Azure
- Work experience with Microsoft Fabric is an added advantage
- Optimize data storage and retrieval processes to enhance system performance and reduce latency
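For illustration only, here is a minimal PySpark sketch of the kind of Databricks notebook work described above: reading an ERP extract from a Unity Catalog table, applying a business transformation, and writing a partitioned Delta table. All catalog, table, and column names (erp_raw.sales.orders, amount, fx_rate, posting_date, erp_curated.sales.orders_curated) are hypothetical, not taken from the posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# In a Databricks notebook, `spark` is provided automatically.
spark = SparkSession.builder.getOrCreate()

# Read a raw ERP extract registered in Unity Catalog (hypothetical names).
orders = spark.table("erp_raw.sales.orders")

# Apply business transformation logic: convert amounts and derive a fiscal period.
transformed = (
    orders
    .withColumn("amount_eur", F.col("amount") * F.col("fx_rate"))
    .withColumn("fiscal_period", F.date_format(F.col("posting_date"), "yyyy-MM"))
    .filter(F.col("amount_eur").isNotNull())
)

# Persist as a Delta table, partitioned by fiscal period so period-scoped
# audit queries only scan the relevant partitions (a basic tuning technique).
(
    transformed.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("fiscal_period")
    .saveAsTable("erp_curated.sales.orders_curated")
)
```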
Technical Skills

Primary Skills:

- 2-4 years of experience in data engineering, with a strong focus on Databricks, PySpark, Python, and Spark SQL
- Proven experience implementing ETL processes and data pipelines
- Hands-on experience with Azure Databricks, Azure Data Factory (ADF), and Azure Data Lake Storage (ADLS)
- Ability to write reusable, testable, and efficient code (see the sketch after this list)
- Ability to develop low-latency, high-availability, and high-performance applications
- Understanding of the fundamental design principles behind a scalable application
- Good knowledge of Azure cloud services
- Familiarity with Generative AI and its applications in data engineering
- Knowledge of Microsoft Fabric and Azure AI services is an added advantage

Enabling Skills:

- Excellent analytical and problem-solving skills
- Quick learning ability and adaptability
- Effective communication skills
- Attention to detail and good team player
- Willingness and ability to deliver within tight timelines
- Flexibility in work timings and willingness to work on different projects/technologies
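As a hedged illustration of "reusable, testable, and efficient code" in this stack, one common pattern is to express transformations as pure DataFrame-in/DataFrame-out functions so they can be unit-tested with small in-memory fixtures. The function and column names below are hypothetical:

```python
from pyspark.sql import DataFrame
from pyspark.sql import functions as F

def add_amount_eur(df: DataFrame, fx_col: str = "fx_rate") -> DataFrame:
    """Derive an EUR amount column. Pure DataFrame -> DataFrame, no I/O,
    so the logic can be unit-tested in isolation."""
    # Hypothetical business rule: local-currency amount times stored FX rate.
    return df.withColumn("amount_eur", F.col("amount") * F.col(fx_col))

# Example test with a tiny fixture (e.g. under pytest with a local SparkSession):
# df = spark.createDataFrame([(10.0, 2.0)], ["amount", "fx_rate"])
# assert add_amount_eur(df).first()["amount_eur"] == 20.0
```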
Responsibilities
The Associate 2 - Data Engineer will be responsible for extracting and processing datasets from client ERP systems to provide insights through data warehousing, ETL, and dashboarding solutions. They will also develop solutions using Azure Databricks and Python, ensuring data quality and consistency.