Bioinformatician (Spatial & Single-Cell) at Deep Science Ventures

United Kingdom

Full Time

Start Date

Immediate

Expiry Date

30 Jun, 26

Salary

0.0

Posted On

01 Apr, 26

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

Skills

Bioinformatics, Spatial Transcriptomics, Single-Cell Proteomics, Python, Statistics, Data Analysis, Machine Learning, Cloud Computing, Genomics, Systems Biology, Tumour Biology, Cancer Immunology, Pipeline Building, Quality Control, Normalization, Integration

Industry

Venture Capital and Private Equity Principals

Description

Big Picture Bio is a seed-stage techbio company, backed by DSV, building a computational drug discovery platform that constructs causal biological networks from two primary sources: structured extraction of published experimental literature and large-scale primary human single-cell omics data. Our multi-agent AI system reasons over these networks to generate, simulate, and rank mechanistic hypotheses for combination therapies, with system accuracy verified against top-tier researchers at the Allen Institute. We are initially focused on oncology.

The Role (remote, timezone-restricted)

You will design and build production bioinformatics pipelines for new modalities (spatial transcriptomics, single-cell proteomics, and spatial proteomics), extending our existing scRNA-seq infrastructure. These pipelines feed directly into an agentic hypothesis generation system: the quality of what goes in determines the quality of every therapeutic hypothesis that comes out.

You'll work closely with our Head of AI & Technology (Dr. Francesco Moramarco) and Head of Platform (Dr. Moustafa Khedr) to:

- Build end-to-end pipelines (ingestion, QC, normalization, integration, annotation, differential analysis)
- Design modality-specific statistics: spot deconvolution, spatial autocorrelation, ADT normalization, protein-RNA joint embedding, segmentation, spillover correction
- Extend hierarchical cell type annotation across modalities
- Codify best-practice workflows into reusable templates for agent execution
- Sanity-check outputs to catch batch effects and artifacts before they propagate

Requirements

- PhD in computational biology, bioinformatics, genomics, systems biology, or a related quantitative field
- 2–6 years' experience in early-stage or high-growth startups
- Experience building spatial transcriptomics pipelines (Visium, MERFISH, Xenium) from scratch
- Experience with single-cell or spatial proteomics (CITE-seq, CyTOF, CODEX, IMC)
- Strong Python engineering in the anndata ecosystem (scanpy/squidpy/muon)
- Deep knowledge of single-cell and spatial statistics (pseudobulk, multiple testing correction, mixed-effects models, compositional analysis)
- Strong grounding in biology: able to distinguish biology from confounders and assess mechanistic plausibility
- Timezone: at least 5 hours of overlap with UK working hours (UTC−4 through UTC+4 preferred)

Strong desirables

- Tumour biology / cancer immunology (TME, immune evasion, resistance)
- Comfort working in an AI-mediated workflow and writing analysis plans executed by agents
- Experience building pipelines or tools consumed by others
- Cloud compute (GCP preferred)
- R proficiency

Nice to have

- Wet lab experience and familiarity with the 10x Genomics ecosystem

We know job descriptions like this can read as a wish list. If you don't tick every box but believe you can build what we need, apply anyway! We care more about what you've built and how you think than whether your CV maps perfectly to every bullet point.

Competitive compensation commensurate with experience and profile, plus equity participation. Flexible arrangement depending on location and preference (full-time employment or long-term consultancy). Small, technically intense team with high autonomy and ownership. Remote-first with meaningful overlap with UK hours; minimal management layers and direct impact on decisions.

Responsibilities

You will design and build production bioinformatics pipelines for new modalities, extending existing infrastructure. This includes building end-to-end pipelines and ensuring the quality of data for therapeutic hypothesis generation.