(Senior) Machine Learning Scientist, AI Foundation Model Specialist at Deep Genomics

America, Limburg, Netherlands

Full Time

Start Date

Immediate

Expiry Date

27 Jul, 25

Salary

0.0

Posted On

27 Apr, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Machine Learning, Computer Science, Statistics, Infrastructure, Deep Learning, Physics, Communication Skills

Industry

Information Technology/IT

Description

ABOUT US

Deep Genomics is at the forefront of using artificial intelligence to transform drug discovery. Our proprietary AI platform decodes the complexity of genome biology to identify novel drug targets, mechanisms, and therapeutics inaccessible through traditional methods. We co-develop drug programs and AI models with partners and internally, and pursue major technology builds with pharmaceutical partners. With expertise spanning machine learning, bioinformatics, data science, engineering, and drug development, our multidisciplinary team, located in Toronto, Cambridge, MA, and select other sites, is revolutionizing how new medicines are created.

WHERE YOU FIT IN

We are seeking a passionate and highly skilled Machine Learning Scientist with expertise in Foundation Models to join our core AI research team, focusing on developing and scaling our next-generation foundation models for biology. You will work with domain experts to explore fundamental research questions at the intersection of deep learning and biology, and apply the latest in AI research to unique, large-scale biological datasets (many terabytes) to develop models that push the state of the art. If you are excited by using your strong ML knowledge to work with others and to solve frontier problems in drug discovery, this is a unique opportunity.

BASIC QUALIFICATIONS

PhD or MSc with a strong research focus in Machine Learning, Computer Science, Statistics, Physics, or a related quantitative field.
Deep understanding of the theoretical underpinnings and practical application of modern deep learning, including architectures like Transformers and related sequence models (e.g., state-space models).
Proven ability to implement, train, and debug highly performant deep learning models using frameworks like PyTorch or JAX.
Experience working with large datasets and understanding the challenges associated with scale (even if not directly managing infrastructure).
Strong analytical and problem-solving skills, with the ability to translate ambiguous scientific problems into tractable ML formulations.
Excellent communication skills, capable of discussing complex ideas with both technical and scientific audiences.

PREFERRED QUALIFICATIONS

2+ years of relevant post-graduate experience in an industrial R&D or applied science setting, applying advanced ML to solve complex scientific or technical problems.
Experience technically leading projects or mentoring junior researchers/engineers.
Demonstrated potential through strong PhD/MSc research, impactful projects, relevant internships, or open-source contributions.
Track record of impactful research demonstrated through first or senior author publications in top-tier ML or relevant scientific journals/conferences (e.g., NeurIPS, ICML, ICLR).
Proficiency with cloud computing platforms (e.g., AWS, GCP) for large-scale model training and experimentation.

Responsibilities

Collaborate closely with computational biologists to integrate domain knowledge, define scientifically meaningful tasks, and translate biological challenges into ML frameworks. You will work together to formulate key scientific questions in biology that can be addressed through innovative ML/AI approaches and design computational experiments to test hypotheses.
Contribute to, and potentially lead (depending on level), research into novel deep learning architectures, training paradigms (e.g., self-supervised, multi-modal), and algorithms tailored for large-scale biological sequence data and related modalities.
Rigorously implement, train, debug, and evaluate models to demonstrate scientific validity and potential for downstream application.
Stay current with advancements in both machine learning and computational biology literature, identifying cross-disciplinary opportunities to solve real-world challenges.
Contribute to team knowledge sharing and code quality through documentation and code reviews.
Share research findings through internal presentations, and potentially external publications or conference presentations.