Data Engineer/Scientist
at GECO Asia
Jurong, Southwest, Singapore
| Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
|---|---|---|---|---|---|---|---|
| Immediate | 07 Sep, 2024 | Not Specified | 08 Jun, 2024 | N/A | Python, ETL Tools | No | No |
Required Visa Status:
US Citizen | Green Card (GC) | H1B | OPT | CPT | Student Visa | H4 (Spouse of H1B)
Employment Type:
Full Time | Part Time | Permanent | Independent - 1099 | Contract – W2 | C2H W2 | C2H Independent | Contract – Corp 2 Corp | Contract to Hire – Corp 2 Corp
Description:
We are seeking a highly skilled Data Engineer/Scientist to join our dynamic team. The ideal candidate will have experience in implementing data ingestion and transformation pipelines (ETL) and model training and serving pipelines at scale, particularly leveraging GPU resources. Additionally, expertise in model API development and containerization is essential. You will play a crucial role in building and optimizing the infrastructure that supports our advanced data analytics and machine learning initiatives.
QUALIFICATIONS:
- Proven experience as a Data Engineer/Scientist or in a similar role.
- Strong proficiency in Python and experience with data manipulation libraries (e.g., Pandas).
- Hands-on experience with ETL tools and frameworks.
- Experience with GPU-based model training frameworks (e.g., TensorFlow, PyTorch).
- Proficiency in developing and deploying model APIs.
- Experience with containerization technologies (e.g., Docker) and orchestration platforms (e.g., Kubernetes).
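The ETL proficiency asked for above boils down to small extract/transform/load steps. A minimal, dependency-free sketch of such a pipeline (all data and function names here are illustrative, not from the posting):

```python
import csv
import io

# Illustrative raw input; in practice this would come from a file or API.
RAW = """id,name,score
1,Alice,90
2,,75
3,Carol,
"""

def extract(text):
    """Extract: parse CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows missing a name; cast types, defaulting score to 0."""
    cleaned = []
    for row in rows:
        if not row["name"]:
            continue  # basic data-quality rule: name is required
        cleaned.append({
            "id": int(row["id"]),
            "name": row["name"],
            "score": int(row["score"] or 0),
        })
    return cleaned

def load(rows):
    """Load: index cleaned rows by id (stand-in for a database write)."""
    return {row["id"]: row for row in rows}

warehouse = load(transform(extract(RAW)))
```

Real pipelines add incremental loads, schema validation, and retries, but the extract → transform → load shape stays the same.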
Responsibilities:
- Design, develop, and maintain robust ETL pipelines to ingest, transform, and load data from various sources.
- Ensure data quality, consistency, and reliability throughout the ETL process.
- Work with both structured and unstructured data sources to enable comprehensive data integration.
- Implement scalable model training pipelines utilizing GPU resources to accelerate performance.
- Develop and optimize model serving pipelines to ensure efficient deployment of machine learning models.
- Monitor and manage the lifecycle of machine learning models, including versioning and updates.
- Create and maintain APIs for serving machine learning models to various applications and services.
- Ensure APIs are robust, scalable, and secure to handle production workloads.
- Collaborate with software engineers to integrate model APIs into existing systems.
- Containerize data processing and machine learning workflows using Docker or similar technologies.
- Deploy and manage containerized applications in orchestration platforms like Kubernetes.
- Ensure high availability, scalability, and fault tolerance of containerized solutions.
- Work closely with data scientists, analysts, and software engineers to understand requirements and deliver solutions.
- Document data pipelines, model training workflows, and deployment processes.
- Stay updated with the latest trends and technologies in data engineering and data science.
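The serving and lifecycle duties above (versioning, updates, serving the current model) can be sketched as a tiny in-process model registry. This is illustrative only; a production setup would expose this behind an HTTP API inside a container:

```python
class ModelRegistry:
    """Keep multiple versions of a model; serve the latest by default."""

    def __init__(self):
        self._versions = {}  # version number -> callable model

    def register(self, version, model):
        """Add or update a model under a version number."""
        self._versions[version] = model

    def predict(self, features, version=None):
        """Run the requested version, or the latest one if unspecified."""
        if version is None:
            version = max(self._versions)
        return self._versions[version](features)


registry = ModelRegistry()
registry.register(1, lambda x: sum(x))           # v1: naive baseline
registry.register(2, lambda x: sum(x) / len(x))  # v2: mean instead of sum
```

Pinning a version lets callers roll back without redeploying, which is the core of the model-lifecycle responsibility described above.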
REQUIREMENT SUMMARY
Experience: Min: N/A | Max: 5.0 year(s)
Industry: Information Technology/IT
Category: IT Software - Other
Function: Software Engineering
Education: Graduate
Proficiency: Proficient
Vacancies: 1
Location: Jurong, Singapore