AI Engineer – Hybrid RAG Solution (LLM & RAG)
at GSB SOLUTIONS
Bogotá, Cundinamarca, Colombia
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 07 Feb, 2025 | Not Specified | 09 Nov, 2024 | 3 year(s) or above | Elasticsearch, Cloud Services, Distributed Systems, Infrastructure, Computer Science, English, Fine Tuning, Artificial Intelligence, Analytical Skills, Python | No | No |
Description:
JOB SUMMARY:
We are looking for an experienced AI Engineer specializing in Retrieval-Augmented Generation (RAG) to build and optimize hybrid AI solutions leveraging Large Language Models (LLMs). This role involves working with cutting-edge language models and retrieval systems to deliver highly accurate, context-aware, and responsive AI applications. You’ll collaborate with cross-functional teams to develop scalable solutions that enhance information retrieval, comprehension, and generation capabilities in real-world applications.
QUALIFICATIONS:
- Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, or a related field, or equivalent practical experience.
- 3+ years of experience in AI/NLP, with a focus on LLMs, transformer-based architectures, and retrieval systems.
- Proven experience building and deploying RAG solutions or other hybrid AI architectures.
- Strong understanding of information retrieval methods, including dense retrieval, sparse retrieval, and embeddings-based techniques.
- Proficiency in Python, TensorFlow or PyTorch, and experience with libraries and tools related to LLMs, such as Hugging Face Transformers.
- Familiarity with retrieval frameworks like Elasticsearch, FAISS, or OpenSearch.
- Knowledge of prompt engineering, fine-tuning, and deployment of language models for production environments.
- Strong analytical skills, with experience in optimizing LLM and retrieval model performance.
- English proficiency is required.
PREFERRED SKILLS:
- Experience with cloud services and infrastructure (AWS, GCP, Azure) and MLOps tools for model deployment and monitoring.
- Contributions to open-source RAG projects or experience working with OpenAI, LangChain, or similar frameworks.
- Knowledge of vector databases, memory-augmented networks, and distributed systems.
RESPONSIBILITIES:
- Design, develop, and deploy hybrid RAG architectures integrating LLMs with retrieval-based systems for improved relevance and contextual responses.
- Fine-tune and optimize large language models, enhancing their performance and adaptability to domain-specific requirements.
- Implement and manage RAG pipelines that effectively combine retrieval mechanisms with generative capabilities, ensuring high accuracy and efficiency.
- Develop custom plugins, adapters, or APIs to integrate retrieval systems (e.g., Elasticsearch, FAISS) with generative models, facilitating seamless information retrieval.
- Monitor and troubleshoot issues within RAG pipelines, fine-tuning retrieval parameters and model hyperparameters to optimize performance.
- Work closely with data engineers to manage and preprocess large datasets for training, ensuring high-quality and diverse data coverage.
- Evaluate and benchmark the performance of RAG solutions, using metrics such as response accuracy, latency, and user satisfaction.
- Stay up to date with advancements in NLP, LLMs, and RAG methodologies, continually improving existing architectures and recommending new techniques.
REQUIREMENT SUMMARY
Min: 3.0, Max: 8.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
Graduate
Computer Science
Proficient
1
Bogotá, Cundinamarca, Colombia