Data Engineer at Prodapt Solutions
Chennai, Tamil Nadu, India
Full Time


Start Date

Immediate

Expiry Date

14 Jul, 26

Salary

0.0

Posted On

15 Apr, 26

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Python, ETL Pipelines, NLP, Data Processing, Machine Learning, PyTorch, TensorFlow, REST APIs, FastAPI, Flask, LLMs, Embeddings, Fine-tuning, Hugging Face, Vector Databases, Data Cleaning

Industry

Technology; Information and Internet

Description
Overview
Job Description: Senior Software Engineer/LE (NLP / Data Pipelines)
Experience: 2–5 years total

Key Responsibilities
- Develop and maintain ETL pipelines for unstructured data (logs, documents, tickets)
- Preprocess and transform data into model-ready formats (JSONL, embeddings, chunks)
- Assist in SLM fine-tuning workflows (dataset prep, training, evaluation)
- Build and integrate APIs for model inference
- Support data cleaning, deduplication, and validation
- Collaborate with Tech Lead on model experiments and improvements

Must-Have Skills
- Strong proficiency in Python (pandas, data processing, scripting) with experience in building AI/ML solutions
- Experience with unstructured text processing / NLP basics
- Experience in designing and implementing ETL pipelines for data cleaning, transformation, and batch processing
- Experience in dataset creation and curation for model training, including instruction tuning, supervised fine-tuning, and evaluation datasets
- Familiarity with machine learning frameworks (PyTorch / TensorFlow)
- Experience in developing and integrating REST APIs using frameworks like FastAPI or Flask
- Basic understanding of LLMs / embeddings / fine-tuning concepts

Good to Have
- Exposure to Hugging Face / Transformers
- Experience with vector databases (FAISS / pgvector)
- Understanding of LoRA / QLoRA (conceptual)
- Exposure to cloud platforms (Azure / AWS / GCP)

How To Apply:

In case you would like to apply to this job directly from the source, please click here

Responsibilities
Develop and maintain ETL pipelines for unstructured data while preprocessing it into model-ready formats. Collaborate with the technical lead to assist in SLM fine-tuning workflows and build APIs for model inference.