Research Engineer (Contract) / Research Interns - MLLM Serving Optimization at PBY Ventures
Vancouver, BC V6B 5S3, Canada
Full Time


Start Date

Immediate

Expiry Date

16 Nov, 25

Salary

35.0

Posted On

17 Aug, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Distributed Systems, Computer Science, Cloud Services

Industry

Information Technology/IT

Description

PBY Ventures, a Vancouver-based venture studio, is seeking a contract MLLM Research Engineer to support the work of one of our Scholars in Residence. This is a flexible remote contract role, either full- or part-time, paying $35–$50 per hour. Ideally you would be located in or around Vancouver, BC, but we are open to candidates anywhere in Canada.
All applications should be sent to apply@pbyventures.com

QUALIFICATIONS

- Bachelor’s degree or higher in Computer Science, Electrical and Computer Engineering (ECE), or a related field.
- Experience with one or more state-of-the-art (SOTA) LLM serving frameworks such as vLLM, SGLang, or LMDeploy.
- Strong proficiency in PyTorch.
- Familiarity with distributed systems, serverless architectures, and cloud computing platforms.
- Experience with inference optimization for large-scale AI models.
- Familiarity with multimodal architectures and their serving requirements.
- Previous experience deploying AI platforms on cloud services.
Job Types: Full-time, Part-time, Fixed term contract, Internship / Co-op
Contract length: 78 weeks
Pay: $35.00-$50.00 per hour
Work Location: Hybrid remote in Vancouver, BC V6B 5S3

Responsibilities

- Design, implement, and optimize a high-performance serving platform for MLLMs.
- Integrate SOTA open-source serving frameworks such as vLLM, SGLang, or LMDeploy.
- Develop techniques for efficient resource utilization and low-latency MLLM inference in serverless environments.
- Optimize memory usage, scalability, and throughput of the serving platform.
- Conduct experiments to evaluate and benchmark MLLM serving performance.
- Contribute novel ideas to improve serving efficiency and publish findings when applicable.
