Technical Research Assistant – LLM for Clinical Decision Support at M31 AI
Toronto, ON M5S 1A8, Canada
Full Time


Start Date

Immediate

Expiry Date

12 Nov, 25

Salary

$30.00 - $40.00 per hour

Posted On

12 Aug, 25

Experience

0 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Computer Science, Biomedical Engineering, Publications, Calibration, Unix, Resume, Git, Graduate Students, Physics

Industry

Information Technology/IT

Description

Overview
Are you passionate about large language models (LLMs) and healthcare? Do you want to translate cutting-edge NLP into tools that clinicians can safely use at the point of care? We’re hiring a full-time Technical Research Assistant to help design, build, and evaluate LLM-powered clinical decision support systems.
This role is ideal for graduate students, recent grads, or practicum/capstone students who want hands-on experience applying LLMs to clinical workflows. You’ll work with clinical text and knowledge sources (guidelines, reports, EHR notes) and contribute directly to retrieval-augmented generation (RAG), fine-tuning, and rigorous evaluation of models intended for real-world clinical impact.

What You’ll Do

  • Build RAG pipelines that ground LLM outputs in trusted clinical sources (guidelines, formularies, structured knowledge bases).
  • Curate and preprocess datasets from clinical text and semi-structured data (PHI de-identification, normalization, ontology mapping to SNOMED CT / ICD-10 / LOINC).
  • Design prompts, tools, and function-calling schemas for CDS tasks (summarization, guideline lookup, order suggestion, discharge instructions).
  • Train and evaluate models (instruction-tuning, preference optimization) using frameworks like PyTorch/Transformers and orchestration libraries (e.g., LangChain/LlamaIndex).
  • Implement vector search and indexing (FAISS/Milvus/pgvector) and optimize retrieval quality (chunking, hybrid search, re-ranking).
  • Stand up robust evaluation: automatic metrics (exact match/F1, citation accuracy, groundedness) and human-in-the-loop reviews with clinicians using structured rubrics.
  • Build guardrails and safety checks (PHI leakage tests, hallucination detection, contraindication checks, prompt-injection resilience).
  • Maintain reproducible, well-documented code and experiments (Git, Jupyter, Docker; experiment tracking).
  • Collaborate closely with clinicians, data stewards, and AI researchers; participate in study design, IRB/REB documentation support, and write-ups for publications.
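To make the retrieval-grounding responsibilities above concrete, here is a minimal sketch of a RAG step in pure Python. It uses simple term-overlap scoring and fixed-size chunking as stand-ins for embeddings, a vector index, and a re-ranker; the guideline text is invented for illustration.

```python
# Minimal RAG sketch: chunk a source document, retrieve the best-matching
# chunks for a query, and assemble a grounded prompt with citations.
# Term overlap stands in for embedding similarity in a real pipeline.
from collections import Counter

def chunk(text, size=40):
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(query, passage):
    """Term-overlap score between query and passage."""
    q, p = Counter(query.lower().split()), Counter(passage.lower().split())
    return sum((q & p).values())

def retrieve(query, chunks, k=2):
    """Return the top-k chunks by overlap score."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

def build_prompt(query, passages):
    """Ground the LLM call in retrieved sources, with numbered citations."""
    context = "\n".join(f"[{i+1}] {p}" for i, p in enumerate(passages))
    return f"Answer using ONLY the cited sources.\n{context}\nQuestion: {query}"

guideline = ("Adults with community-acquired pneumonia should receive empiric "
             "antibiotics promptly after diagnosis. Dose adjustments are "
             "required for renal impairment.")
top = retrieve("antibiotic timing for pneumonia", chunk(guideline, size=12))
print(build_prompt("antibiotic timing for pneumonia", top))
```

In production, `score` would be replaced by dense retrieval over a vector index (FAISS/Milvus/pgvector), and the numbered citations feed the citation-accuracy and groundedness checks described above.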

Why Join Us

  • Gain hands-on experience with one of the most promising areas of healthcare AI: large language models for clinical decision support
  • May count as practicum or experiential learning credit for graduate or professional programs (check with your program advisor)
  • Learn to develop, deploy, and validate AI models in a healthcare research environment
  • Receive mentorship from clinicians, AI researchers, and imaging scientists
  • Contribute to peer-reviewed publications and high-impact research
  • Work with large-scale clinical data from leading hospitals and research centers

Required Skills & Background

  • Bachelor’s degree in Computer Science, Biomedical Engineering, Physics, or a related field (graduate students are encouraged to apply)
  • Strong Python skills and experience with deep learning/LLM stacks (PyTorch; Hugging Face Transformers).
  • Familiarity with retrieval and LLM application patterns (RAG, tool use/function calling, prompt engineering).
  • Experience building data pipelines and parsing clinical or technical text (regex/NLP, basic SQL).
  • Comfort with experiment design and evaluation for QA/summarization tasks (EM/F1, calibration, citation/attribution checks).
  • Proficiency with Git, UNIX, and Jupyter; attention to reproducibility and clean code.
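The EM/F1 evaluation mentioned above can be sketched in a few lines. This follows the common SQuAD-style token-level definition; production evaluation code would also strip punctuation and articles during normalization.

```python
# Exact-match (EM) and token-level F1 for QA/summarization evaluation.
from collections import Counter

def normalize(text):
    """Lowercase and collapse whitespace (a minimal normalizer)."""
    return " ".join(text.lower().split())

def exact_match(pred, gold):
    """1.0 if normalized prediction equals the normalized reference."""
    return float(normalize(pred) == normalize(gold))

def token_f1(pred, gold):
    """Harmonic mean of token precision and recall against the reference."""
    p, g = normalize(pred).split(), normalize(gold).split()
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)

print(exact_match("within 4 hours", "Within 4 hours"))  # 1.0
print(token_f1("give antibiotics within 4 hours", "within 4 hours"))
```

Averaging these per-example scores over a held-out set gives the aggregate EM/F1 numbers used to compare model or retrieval configurations.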

Nice-to-Have

  • Exposure to healthcare data standards or ontologies (HL7 FHIR, SNOMED CT, ICD-10, LOINC).
  • Experience with vector databases, re-rankers, or hybrid search.
  • Knowledge of privacy, security, and compliance in Canadian contexts (PHIPA/PIPEDA) and general best practices for handling sensitive data.
  • Prior work on safety/guardrails, red-teaming, or evaluation frameworks (e.g., rubric-based human evals, RAGAS checks).
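For a sense of the guardrail work involved, here is an illustrative PHI-leakage check using a small set of invented regex patterns (phone-number-like, health-card-like, and date-of-birth-like strings). Real deployments would use validated de-identification tooling rather than ad-hoc regexes.

```python
# Toy guardrail: flag model outputs that appear to contain PHI-like strings.
import re

PHI_PATTERNS = {
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "health_card_like": re.compile(r"\b\d{4}[-\s]\d{3}[-\s]\d{3}\b"),
    "dob": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}

def phi_findings(text):
    """Return the names of all patterns that match the text."""
    return [name for name, pat in PHI_PATTERNS.items() if pat.search(text)]

def guard(output):
    """Block an output that appears to leak PHI; pass it through otherwise."""
    hits = phi_findings(output)
    return ("BLOCKED: possible PHI (" + ", ".join(hits) + ")") if hits else output

print(guard("Patient can be reached at 416-555-0199."))
print(guard("Recommend follow-up in two weeks."))
```

Checks like this typically run both as unit tests over red-team prompt suites and as a runtime filter before any model output reaches a clinician-facing interface.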

Application Requirements

  • Resume/CV
  • Brief cover letter outlining technical experience and interest in the position
  • Unofficial transcript
  • GitHub portfolio, code samples, or publications (optional but encouraged)

Job Type: Full-time
Pay: $30.00-$40.00 per hour
Expected hours: 37.5 per week
Work Location: Hybrid remote in Toronto, ON M5S 1A8

