Machine Learning Engineer (LLM Infrastructure) - Toronto
at Prodigy Labs

Toronto, ON, Canada

Start Date	Expiry Date	Salary	Posted On	Experience	Skills	Telecommute	Sponsor Visa
Immediate	20 Sep, 2024	Not Specified	21 Jun, 2024	2 year(s) or above	Optimization Techniques, Big Data, Training, Cloud, AWS, Distributed Systems, Python, Computer Science, Fine Tuning, CUDA, Azure, Code	No	No

Required Visa Status:

Citizen	GC
US Citizen	Student Visa
H1B	CPT
OPT	H4 Spouse of H1B
GC Green Card

Employment Type:

Full Time	Part Time
Permanent	Independent - 1099
Contract – W2	C2H Independent
C2H W2	Contract – Corp 2 Corp
Contract to Hire – Corp 2 Corp

Description:

We are looking for a Machine Learning Engineer with strong experience building systems that accelerate the development and deployment of machine learning models, especially large language models (LLMs). You will partner closely with machine learning researchers and internal users to understand requirements, apply strong ML fundamentals to build high-performance, reusable APIs, and put those APIs to work in real production settings.

REQUIREMENTS

5-6 years of AI, Big Data and cloud expertise
3-4 years of Alternative data experience
2+ years of experience building machine learning training pipelines or inference services in a production setting
Experience with LLM deployment, fine-tuning, training, prompt engineering, etc.
Experience with LLM inference latency optimization techniques, e.g. kernel fusion, quantization, dynamic batching, etc.
Experience with CUDA, model compilers, and other model-specific optimizations
Experience building, deploying, and monitoring complex microservice architectures.
Degree in Computer Science or Engineering
Prior experience with: Docker, Kubernetes and containerization, infrastructure as code (Terraform), Agile methodology, distributed systems, Databricks ML, cloud (Azure preferred, or AWS)
Expert-level Python and SQL
Experience with (or knowledge of) MosaicML and the Ray framework
Experience with LangChain or LlamaIndex
Experience with any vector database

Responsibilities:

Architect and enable distributed compute, aligning workloads to small, mid-range, and high-end GPUs
Leverage appropriate storage hardware and data formats to improve read/re-read efficiency
Identify and remediate latency contributors, especially IO bottlenecks, inefficient data shuffling, and under- or over-utilized compute
Scale models by employing distributed training using data and model parallelism techniques
Parallelize inference processing to improve prediction latency
Provide subject matter expertise in graph and vector databases for various use cases, including knowledge graphs, RAG, etc.
Implement LLM observability and monitoring solutions
Manage infrastructure and large-scale system design, and diagnose both model and system failures
Mitigate reputation risk through AI-driven data quality to ensure the highest-quality data and services are offered to clients

REQUIREMENT SUMMARY

Experience: Min: 2.0, Max: 6.0 year(s)

Industry: Information Technology/IT

Functional area of job: IT Software - Other

Domain: Software Engineering

Qualifications: LLM

English Proficiency: Proficient

Number of posts: 1

Address of job: Toronto, ON, Canada
