Databricks Architect at NTT DATA
Bengaluru, Karnataka, India - Full Time


Start Date

Immediate

Expiry Date

04 Mar, 26

Salary

Not disclosed

Posted On

04 Dec, 25

Experience

10 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Databricks, Delta Lake, Unity Catalog, Apache Spark, PySpark, ETL, ELT, CI/CD, DevOps, Machine Learning, Data Governance, Data Quality, Data Modeling, Cloud Platforms, Performance Tuning, Cost Optimization

Industry

IT Services and IT Consulting

Description
Responsibilities:
* Define and own the end-to-end architecture of data platforms built on Databricks, including ingestion, transformation, storage, and consumption layers.
* Design and implement Lakehouse architectures using Delta Lake, Unity Catalog, and structured streaming.
* Architect scalable ETL/ELT pipelines using Apache Spark, PySpark, and Databricks Workflows.
* Lead the integration of Databricks with enterprise systems such as data catalogs, data quality frameworks, ML platforms, and BI tools.
* Guide teams in implementing CI/CD pipelines, version control, and automated testing for Databricks notebooks and jobs.
* Collaborate with data scientists, engineers, and business stakeholders to support ML model lifecycle management using MLflow.
* Integrate data quality frameworks (e.g., Great Expectations, OpenMetadata) into Databricks workflows to ensure accuracy and reliability.
* Implement data governance using Unity Catalog, including fine-grained access control, column-level lineage, data classification, audit logging, and centralized metadata management across workspaces and cloud environments.
* Establish data validation and cleansing processes across the Bronze, Silver, and Gold layers of a Medallion architecture (a minimal sketch follows this list).
* Demonstrated experience or familiarity with AI systems such as Databricks DBRX, Mosaic AI, and AutoML, including their integration into data platforms.
* Provide technical leadership in performance tuning, cost optimization, and cluster configuration.
* Conduct architectural reviews, code audits, and mentoring sessions to ensure adherence to standards and scalability.
* Stay current with Databricks innovations and advocate for adoption of new features and capabilities.

Minimum Skills Required:
* Bachelor's or Master's degree in Computer Science, Software Engineering, Information Technology, or a related field.
* 10+ years of experience in data architecture and engineering, with 5+ years in Databricks and Apache Spark.
* Deep expertise in Delta Lake, structured streaming, PySpark, and SQL.
* Strong understanding of Lakehouse architecture, data mesh, and modern data stack principles.
* Experience with Unity Catalog, Databricks Repos, the Jobs API, and Workflows.
* Proven ability to design and implement secure, governed, and highly available data platforms.
* Familiarity with cloud platforms (Azure, AWS, GCP) and their integration with Databricks.
* Experience with data modeling, dimensional modeling, and temporal data structures.
* Experience with CI/CD, DevOps, and infrastructure-as-code tools (Terraform, GitHub Actions, Azure DevOps).
* Knowledge of the machine learning lifecycle, MLflow, and model deployment strategies.
* Understanding of E-R data models (conceptual, logical, and physical).
* Understanding of advanced data warehouse concepts.
* Strong analytical skills, including the ability to interpret customer business requirements and translate them into technical designs and solutions.
* Ability to collaborate effectively across a variety of IT and business groups, regions, and roles, and to interact effectively with all levels.
* Ability to identify …
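For illustration, here is a minimal PySpark sketch of the Bronze-to-Silver hop described above, assuming a Databricks notebook with Unity Catalog enabled. Catalog, schema, table, column, and checkpoint names are placeholders, and the simple filter rules stand in for a real data quality framework such as Great Expectations.

# Hypothetical Bronze -> Silver step in a Medallion pipeline.
# Assumes a Databricks notebook where `spark` is predefined and
# Unity Catalog three-level names (catalog.schema.table) are available.
from pyspark.sql import functions as F

BRONZE = "main.raw.orders_bronze"      # placeholder table name
SILVER = "main.curated.orders_silver"  # placeholder table name

# Incrementally read new rows from the Bronze Delta table.
bronze_stream = spark.readStream.table(BRONZE)

# Basic validation and cleansing: normalize types, drop malformed
# rows, and deduplicate. A production job would add a watermark to
# bound the deduplication state.
silver_stream = (
    bronze_stream
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .filter(F.col("order_id").isNotNull() & (F.col("amount") >= 0))
    .dropDuplicates(["order_id"])
)

(
    silver_stream.writeStream
    .option("checkpointLocation",
            "/Volumes/main/curated/_checkpoints/orders_silver")
    .trigger(availableNow=True)  # run as an incremental batch job
    .toTable(SILVER)
)

# Fine-grained access control in Unity Catalog is ordinary SQL:
spark.sql(f"GRANT SELECT ON TABLE {SILVER} TO `analysts`")

The availableNow trigger lets the same structured-streaming code run as a scheduled incremental job under Databricks Workflows, and the closing GRANT shows how Unity Catalog access control attaches to the resulting table.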
Responsibilities
Define and own the end-to-end architecture of data platforms built on Databricks. Lead the integration of Databricks with enterprise systems and guide teams in implementing CI/CD pipelines.
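As a rough sketch of the ML-platform side of that integration work, the following hypothetical snippet trains a toy model, tracks it as an MLflow run, and registers it in the Unity Catalog model registry. The model name, metrics, and data are placeholders.

# Minimal MLflow lifecycle sketch for a Databricks workspace.
# The registered model name (catalog.schema.model) is an assumption.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

mlflow.set_registry_uri("databricks-uc")  # Unity Catalog as registry

X, y = make_classification(n_samples=200, n_features=5, random_state=42)

with mlflow.start_run(run_name="orders-churn-baseline"):
    model = LogisticRegression().fit(X, y)
    mlflow.log_param("model_type", "logistic_regression")
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # input_example lets MLflow infer the signature that Unity
    # Catalog requires when registering a model.
    mlflow.sklearn.log_model(
        model,
        artifact_path="model",
        input_example=X[:5],
        registered_model_name="main.ml.orders_churn",
    )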