Cloud Data Engineer at Jefferies
Pune, Maharashtra, India
Full Time


Start Date

Immediate

Expiry Date

15 Jul, 26

Salary

0.0

Posted On

16 Apr, 26

Experience

5 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

AWS, Python, SQL, PySpark, ETL/ELT, Terraform, Data Pipelines, Data Modeling, Data Governance, CI/CD, Cloud Architecture, Data Warehousing, Infrastructure as Code, Streaming Platforms, Agile, GenAI

Industry

Investment Banking

Description
Innovation Hub Overview

Jefferies is creating a Technology Innovation Hub in Pune, a greenfield opportunity to build the systems that power global markets. As our first India technology center, this hub brings together hands-on builders who engineer the platforms behind Jefferies’ growth across capital markets, investment banking, and institutional securities. We’re scaling toward an elite team of 500 engineers while maintaining the agility, ownership, and meritocratic spirit that defines Jefferies. From cloud and data to AI, risk, and core business technologies, teams in Pune will lead high-impact work with a global mandate.

Team Overview: IT Infrastructure Technology

Jefferies’ IT Infrastructure Technology team builds and runs the core technology backbone that enables the firm to operate globally. It covers enterprise infrastructure engineering and operations while driving modernization initiatives such as Cloud Adoption and Network Resiliency. The group supports critical platforms including networking, end-user computing, cloud, databases, server/Unix engineering, communications, and global data centers (including low-latency colocations near exchanges).

Role Summary

Build and operate scalable, reliable data pipelines and infrastructure on AWS to power analytics, reporting, and data-driven decision-making. As a Cloud Data Engineer, you will design and implement data ingestion, transformation, and orchestration workflows, optimize data storage and processing, and ensure data quality and governance. You will partner with Analytics Engineers, Data Analysts, ML Engineers, Data Platform Engineers, Cloud Security, Database, and business stakeholders across all divisions to deliver high-quality data products that meet business needs. Strong teamwork, a customer-service orientation, and the ability to translate business requirements into technical solutions are essential.
Experience working in Agile teams using Jira and Confluence is expected.

Key Responsibilities

* Design, build, and maintain scalable data pipelines and ETL/ELT workflows on AWS using services such as Glue, EMR, Lambda, Step Functions, Kinesis, and S3
* Implement data ingestion from diverse sources (databases, APIs, streaming platforms, third-party providers); ensure reliability, performance, and error handling
* Develop and optimize data transformation logic using SQL, Python, PySpark, or similar frameworks; ensure data quality, consistency, and lineage
* Design and implement data storage solutions on AWS (S3 data lakes, Redshift data warehouses, DynamoDB, RDS, and Aurora); optimize for cost, performance, and access patterns
* Build and maintain Infrastructure as Code (Terraform) for data infrastructure; follow team standards for modules, state management, and Terraform Enterprise workflows
* Implement CI/CD pipelines for data workflows using GitHub, Bamboo, GitLab, or similar tools; ensure automated testing, deployment, and monitoring
* Establish data governance and security controls: encryption at rest and in transit, IAM policies, data classification, audit logging, and compliance with regulatory requirements
* Collaborate with Analytics Engineers, Data Analysts, and ML Engineers to understand data requirements and deliver datasets optimized for downstream consumption
* Monitor and troubleshoot data pipeline performance, failures, and data quality issues; implement proactive alerting and remediation
* Partner with Cloud Architecture, Cloud Security, and Database teams to ensure data infrastructure aligns with enterprise standards and best practices
* Document data pipelines, data models, and operational procedures; contribute to the team knowledge base
* Drive automation and toil reduction; leverage GenAI and agentic workflows to improve engineering productivity
* Use GenAI assistants (ChatGPT, Claude, GitHub Copilot) proficiently for SQL generation, pipeline code development, troubleshooting, and documentation
* Implement AI agent-driven automation (agentic workflows) for data engineering tasks (e.g., data quality checks, anomaly detection, pipeline optimization) with enterprise safety controls: human-in-the-loop validation, comprehensive logging and auditability, guardrails to prevent data corruption, rollback mechanisms, and secure credential handling
* Proactively identify data engineering toil; ship automation that measurably reduces manual work and improves pipeline reliability
* Monitor data pipeline health and SLAs; respond to failures and data quality incidents promptly
* Participate in an on-call rotation as needed for critical data workflows
* Conduct post-incident reviews for data pipeline failures; track corrective actions to completion
* Maintain runbooks and operational documentation for data infrastructure
* Continuously improve pipeline performance, cost efficiency, and data quality

Requirements

* 7+ years in data engineering, data platform, or analytics engineering roles with a strong cloud focus
* Deep expertise in AWS data services: S3, Glue, EMR, Redshift, Athena, Lambda, Kinesis, Step Functions, and Lake Formation
* Strong programming skills in Python and SQL; experience with PySpark or similar big data frameworks
* Proficiency building ETL/ELT pipelines at scale; experience with data orchestration tools (Airflow, Step Functions, Prefect)
* Solid understanding of data modeling, data warehousing, and dimensional design (star schema, snowflake schema)
* Experience with Infrastructure as Code (Terraform) and CI/CD pipelines for data infrastructure
* Strong understanding of data governance, security, and compliance best practices
* Familiarity with streaming data platforms (Kafka, Kinesis, MSK) is a plus
* Excellent problem-solving skills and attention to data quality and reliability
* Strong communication and collaboration skills; ability to work with technical and non-technical stakeholders

Preferred

* AWS Certified Data Analytics or Big Data Specialty certification
* Experience with Snowflake or Databricks for data processing and analytics
* Familiarity with dbt (data build tool) for transformation workflows
* Background in software engineering or site reliability engineering
* Experience with data catalog and metadata management tools (AWS Glue Catalog, Collibra, Alation)
* Knowledge of DataOps practices and data quality frameworks

We have been made aware of bad actors falsely claiming to be associated with Jefferies Group, soliciting individuals to attend virtual job interviews, complete online tests or courses, and sending fictitious employment offer letters. Please note that any email contact from Jefferies personnel will come from an “@jefferies.com” email address. Further, Jefferies will not notify shortlisted candidates through social media platforms (e.g., WhatsApp or Telegram) or ask candidates to make a payment to participate in the hiring process.

#LI-MF1

Jefferies is a leading global, full-service investment banking and capital markets firm that provides advisory, sales and trading, research, and wealth and asset management services. With more than 40 offices around the world, we offer insights and expertise to investors, companies, and governments. At Jefferies, we believe that diversity fosters creativity, innovation, and thought leadership through the infusion of new ideas and perspectives. We have made a commitment to building a culture that provides opportunities for all employees regardless of our differences and supports a workforce that is reflective of the communities where we work and live. As a result, we are able to pool our collective insights and intelligence to provide fresh and innovative thinking for our clients.
Jefferies is an equal employment opportunity employer, and takes affirmative action to ensure that all qualified applicants will receive consideration for employment without regard to race, creed, color, national origin, ancestry, religion, gender, pregnancy, age, physical or mental disability, marital status, sexual orientation, gender identity or expression, veteran or military status, genetic information, reproductive health decisions, or any other factor protected by applicable law. We are committed to hiring the most qualified applicants and complying with all federal, state, and local equal employment opportunity laws. As part of this commitment, Jefferies will extend reasonable accommodations to individuals with disabilities, as required by applicable law.
Responsibilities
Design, build, and maintain scalable data pipelines and ETL/ELT workflows on AWS to support analytics and reporting. Collaborate with cross-functional teams to implement data governance, security controls, and infrastructure automation.