PySpark Engineer at Citi

Gurugram, Haryana, India

Full Time

Start Date

Immediate

Expiry Date

10 Mar, 26

Salary

0.0

Posted On

10 Dec, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

PySpark, BigData, Hive, Hadoop, Spark, Python, Scala, Unix Scripting, SQL, Data Warehousing, ETL, Machine Learning, Data Governance, Performance Tuning, Code Re-engineering, Container Technology

Industry

Financial Services

Description

Engineering degree with 1-2 years of experience in Big Data systems, Hive, Hadoop, Spark (Python/Scala), and cloud-based data management technologies.
Hands-on experience in Unix scripting, Python, and Scala programming, along with strong experience in SQL.
Comfortable working with completely unstructured, undocumented code and turning it into best-in-class code, redesigning costly compute and data processes and aligning them with best development standards.
Experienced in working with large and multiple datasets and data warehouses, with the ability to pull data using relevant programs and coding.
Well versed in the necessary data preprocessing and application engineering skills.
At least 3 years of experience designing software systems with intense computational needs across real-time and batch processing.
Experience with and understanding of supervised and unsupervised machine learning techniques.
Exposure to data ingestion, ETL tools such as Talend, modeling tools, performance management tooling such as Pepperdata, and the Cloudera stack will be a plus.
Knowledge of data management, data governance, data security, and regulatory practices.
Ability to identify, clearly articulate, and solve complex business problems and present them to management in a structured and simpler form.
Experience working in an onsite/offsite delivery model.
Experience in Credit Cards and Retail Banking.
Excellent communication and interpersonal skills.
Strong process/project management skills.
Multiple stakeholder management.
Control-oriented and risk-aware.
Fast learner with a desire to excel and an attitude to partner and solve problems in complex environments, placing business objectives at the center of all activity.
Experience in performance tuning and code re-engineering is preferred.
Experience in broad IT architecture and design across data and channels is preferred.
Experience in query tuning and automation technologies (Autosys, Jenkins, ServiceNow) is preferred.
Exposure to container technology and machine learning will be a plus.
Bachelor's/University degree or equivalent experience.
Python (Programming Language), Spark SQL.

Responsibilities

The PySpark Engineer will be responsible for redesigning unstructured code into best-in-class code and optimizing data processes. They will work with large datasets and data warehouses, using relevant programming skills to pull and preprocess data.