Consultant - Cloud Data Engineer at KPMG India
Bangalore, Karnataka, India
Full Time


Start Date

Immediate

Expiry Date

12 Aug, 26

Salary

0.0

Posted On

14 May, 26

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Python, SQL, Azure, GCP, Apache Spark, Apache Airflow, dbt, Apache Kafka, ETL/ELT, Data Modeling, Docker, Kubernetes, Terraform, Azure Synapse Analytics, Google BigQuery, PySpark

Industry

Business Consulting and Services

Description
Cloud Engineering – Consulting

The Consulting business at KPMG Global Services (KGS) is a diverse team of more than 6,400 professionals. We work with KPMG Firms worldwide to transform the businesses of clients across industries through the latest technology and innovation. Our technology professionals combine deep industry knowledge with strong technical experience to navigate complex challenges and deliver real value for our clients.

The Role

We are looking for a highly skilled Senior Data & Cloud Engineer with 3-5 years of experience to design, build, and optimize next-generation data platforms for our clients. The ideal candidate is a master of core data engineering principles, such as distributed computing and advanced data modelling, and possesses deep, hands-on expertise in at least one major cloud ecosystem (Azure or GCP). You will be responsible for the entire data lifecycle, from real-time ingestion to providing analytics-ready datasets for our Machine Learning and Business Intelligence teams.

Your key responsibilities will include:

* Architect Pipelines: Design and build scalable ETL/ELT pipelines to ingest high-velocity telemetry and log data into our cloud warehouse.
* Model Implementation: Partner with Data Scientists to build and maintain Feature Stores and automated ML training pipelines.
* Infrastructure as Code: Utilize Docker, Kubernetes, and Terraform to deploy and manage containerized data services.
* Optimization: Continuously monitor and optimize query performance, sharding strategies, and cloud costs across the data stack.
* Data Governance: Ensure high data quality, integrity, and security standards are met across all production environments.

Mandatory Skills

* Programming: Expert-level Python and Advanced SQL (Window functions, CTEs, and performance tuning).
* Data Modeling: Proven experience with Star/Snowflake Schemas and managing Slowly Changing Dimensions (SCD Type 1, 2, & 3).
* Distributed Computing: Hands-on experience with Apache Spark (PySpark/Scala) for processing multi-terabyte datasets.
* Orchestration & Transformation: Proficiency with Apache Airflow for workflow management and dbt for modular data modeling.
* Streaming: Experience with Apache Kafka or similar message brokers for event-driven architectures.
* Key tech stacks – Cloud Expertise (either Azure or GCP):

Feature | Azure | GCP
Data Warehouse | Azure Synapse Analytics | Google BigQuery
Storage / Lake | Azure Data Lake (ADLS Gen2) | Google Cloud Storage
Serverless ETL | Azure Data Factory (ADF) | Google Cloud Dataflow
Managed Spark | Azure Databricks | Google Cloud Dataproc
Streaming / Pub-Sub | Azure Event Hubs | Google Cloud Pub/Sub
Compute / NoSQL | Azure Functions / Cosmos DB | Cloud Functions / Bigtable
Governance & Lineage | Data Catalog / Microsoft Purview | Lake Formation / Google Dataplex

The person

* Outstanding interpersonal skills with the ability to inspire teams and develop excellent client relationships.
* Previous experience in consulting and in presenting complex solutions to business executives.
* Embraces a growth mindset for business development, with an ability to influence senior stakeholders.
* Excellent written and oral communication skills, capable of discussing complex ideas with various groups, ranging from the boardroom to business unit leads.
* Good team player with a strong commitment to professional and client service excellence.
Responsibilities
Design and build scalable ETL/ELT pipelines to ingest high-velocity data into cloud warehouses. Optimize query performance and manage containerized data services using Infrastructure as Code.