Senior Data Engineer / Data Architect (Riyadh, on-site)
at Oivan Group Oy
Riyadh, Riyadh Region, Saudi Arabia
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 20 Feb, 2025 | Not Specified | 21 Nov, 2024 | 8 year(s) or above | Good communication skills | No | No |
Description:
WHAT WE NEED
We are seeking an experienced Senior Data Engineer / Data Expert with a strong background in data streaming, Apache Kafka, and Airflow to join our data engineering team. The role involves managing and enhancing a large-scale infrastructure for extensive social media data scraping, integrating that data into a data lakehouse, and coordinating across multiple data-centric teams. It requires expertise in handling complex data pipelines, identifying and resolving security vulnerabilities, and ensuring efficient storage and retrieval of data within our systems.
WHO WE ARE
PDPL Statement
By submitting your application and CV, you consent to us handling and storing your personal information in our information systems in accordance with the Saudi Arabian Personal Data Protection Law. This information will be processed in line with legal requirements and the principles of data privacy and protection.
Responsibilities:
- Design, develop, and maintain scalable data pipelines to support AI model development.
- Design, build, and maintain efficient, scalable, and reliable data pipelines using Apache Kafka, streaming services, and Airflow.
- Coordinate with the data acquisition team to ensure seamless data flow while addressing and resolving any security vulnerabilities identified in the Airflow and Kubernetes (K8s) setup.
- Implement data solutions to handle large volumes of structured and unstructured data, including videos, audio, images, and text.
- Collaborate with AI researchers, machine learning engineers, and software engineers to ensure data is available and ready for model training.
- Ensure data quality, integrity, and security throughout the data lifecycle.
- Optimize data processing workflows for performance and scalability.
- Lead initiatives to validate and manage multilingual data, specifically for Arabic and English datasets, with a focus on YouTube data.
- Coordinate model training and data validation efforts to enhance multilingual processing capabilities.
- Oversee the completion of the Phase 01 data scraping project, which includes managing data collection from social media platforms under API constraints.
- Coordinate with the team to finalize infrastructure setup for scraping 2 million hours of video from YouTube and 10 million photos from other sources.
REQUIREMENT SUMMARY
Min: 8.0, Max: 13.0 year(s)
Information Technology/IT
IT Software - DBA / Datawarehousing
Software Engineering
Graduate
Computer Science, Engineering
Proficient
1
Riyadh, Saudi Arabia