Data Engineer at EXL Talent Acquisition Team
Gurugram, Haryana, India
Full Time


Start Date

Immediate

Expiry Date

19 May, 26

Salary

0.0

Posted On

18 Feb, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Python, Databricks, PySpark, Plotly Dash, Data Analysis, SQL, Query Optimization, ETL, Data Modeling, Data Warehousing, Pandas, NumPy, Git, Data Governance, Data Quality, Apache Spark

Industry

Business Consulting and Services

Description
EXL Decision Analytics

We are looking for a skilled Data Engineer with strong expertise in Python, Databricks, PySpark, Plotly Dash, data analysis, SQL, and query optimization. The ideal candidate will be responsible for developing scalable data pipelines, performing complex data analysis, and building interactive dashboards to support business decision-making.

Key Responsibilities:

· Design, develop, and maintain scalable and efficient data pipelines using PySpark and Databricks
· Perform data extraction, transformation, and loading (ETL) from diverse structured and unstructured data sources
· Develop and maintain data models, data warehouses, and data marts in Databricks
· Apply strong proficiency in Python, Apache Spark, and PySpark across pipeline development
· Integrate third-party data from multiple sources with internal data
· Write and optimize complex SQL queries for high performance and scalability across large datasets
· Collaborate closely with data scientists, analysts, and business stakeholders to gather and understand data requirements
· Ensure data quality, consistency, and integrity throughout the data lifecycle using validation and monitoring techniques
· Develop and maintain modular, reusable, and well-documented code and technical documentation for data workflows and processes
· Implement data governance, security, and compliance best practices

Key Skills:

· Strong hands-on experience with PySpark, Databricks, and distributed data processing
· Proficiency in Python, with deep knowledge of Pandas and NumPy for data manipulation and analysis
· Strong foundation in data structures and algorithms for writing efficient and optimized code
· Solid understanding of ETL processes, data modeling, and data warehousing concepts
· Experience with SQL and performance tuning for large-scale datasets
· Familiarity with data visualization tools such as Power BI or Tableau
· Knowledge of version control systems such as Git and collaborative development workflows
· Strong problem-solving skills, attention to detail, and the ability to work independently or in a team

Candidate Profile:

· 5+ years of relevant experience with data engineering tools
· Programming languages: Python and SQL
· Data processing tools: Pandas, NumPy, PySpark
· Cloud platforms: Databricks (for scalable computing resources)
· Version control and collaboration: Git, GitHub, GitLab
· Deployment and monitoring: Databricks, GitHub Actions
Responsibilities
The role involves designing, developing, and maintaining scalable and efficient data pipelines using PySpark and Databricks, alongside performing complex data extraction, transformation, and loading (ETL) from various sources. Key duties also include building interactive dashboards and ensuring data quality and integrity throughout the lifecycle.