Genomics · Precision Medicine · Vol. IV · March 12, 2024

Why Single-Cell Sequencing Is Transforming Precision Medicine

Bulk RNA-seq has long averaged away the very signal researchers need most. A new class of single-cell methods is now resolving individual cell states — and it is changing how we think about disease, drug targets, and patient stratification.

By Ananya G. Bhaya

Microsoft Research · NBU

In the canonical experiment, a researcher grinds up tissue, extracts RNA from millions of cells simultaneously, and sequences the result. The output is a single averaged profile: one number per gene, per sample. For twenty years, this was enough. It told us which genes were up-regulated in tumour versus normal tissue and which pathways were perturbed in disease. But it told us nothing about the individual players in that orchestra.

Single-cell RNA sequencing (scRNA-seq) changes the resolution entirely. Instead of a symphony recording, you receive the sheet music for every individual instrument. Each cell yields its own transcriptomic profile — tens of thousands of measurements per cell, across hundreds of thousands of cells per experiment.

The numbers are staggering. The Human Cell Atlas project, launched in 2016, aims to map every cell type in the human body. As of early 2024, it has characterised over 50 million cells across 38 organs. The pancreatic tumour atlas from our own lab resolved 120,000 cells from 40 biopsies — revealing immunosuppressive myeloid subtypes that bulk-seq had never detected, subtypes that now explain why checkpoint inhibitors fail in pancreatic cancer.

"The averaging effect of bulk RNA-seq doesn't just smooth noise — it systematically erases the rarest, most clinically relevant cell states."

— Nature Methods, 2023

The Averaging Problem

Consider a tumour biopsy that is 5% regulatory T cells (Tregs). In bulk-seq, a transcript expressed only in those Tregs contributes at about one-twentieth of its per-cell level, and is easily swamped by signal from the other 95% of cells. Yet those Tregs may be entirely responsible for the immunosuppressive microenvironment preventing the patient from responding to immunotherapy. Bulk analysis is unlikely to flag them.

This averaging problem is not merely a sensitivity issue. It can produce actively misleading conclusions. A gene that appears uniformly "down-regulated" across a tumour may in fact be highly expressed in a small sub-clone driving metastasis — while being silenced in the bulk of slow-growing cells. Bulk methods report the average; single-cell methods report the distribution.
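To make the gap between average and distribution concrete, here is a minimal Python sketch. All expression values, the 5% sub-clone fraction, and the function names are invented for illustration; nothing here comes from a real dataset.

```python
import statistics

# Hypothetical expression values (arbitrary units) for one gene; illustrative only.
NORMAL_LEVEL = 10.0     # assumed per-cell level in matched normal tissue
SUBCLONE_LEVEL = 40.0   # assumed level in a small, high-expressing sub-clone
SILENCED_LEVEL = 1.0    # assumed level in the remaining tumour cells


def simulate_tumour(n_cells=1000, subclone_fraction=0.05):
    """Return per-cell expression for a tumour with a small high-expressing sub-clone."""
    n_subclone = round(n_cells * subclone_fraction)
    return [SUBCLONE_LEVEL] * n_subclone + [SILENCED_LEVEL] * (n_cells - n_subclone)


def bulk_view(cells):
    """What a bulk assay effectively reports: one averaged number."""
    return statistics.fmean(cells)


def single_cell_view(cells, threshold):
    """Fraction of cells above a threshold, plus the maximum per-cell value."""
    high = [x for x in cells if x > threshold]
    return {"fraction_high": len(high) / len(cells), "max": max(cells)}


tumour = simulate_tumour()
bulk = bulk_view(tumour)
sc = single_cell_view(tumour, threshold=NORMAL_LEVEL)

print(f"bulk average: {bulk:.2f} (normal tissue: {NORMAL_LEVEL})")
print(f"cells above normal level: {sc['fraction_high']:.0%}, max per-cell value: {sc['max']}")
```

Under these assumed numbers the bulk average sits well below the normal-tissue level, so the gene looks down-regulated, even though one cell in twenty expresses it at four times the normal level.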

Cell States vs. Cell Types

One of the most conceptually important shifts enabled by scRNA-seq is the distinction between cell types and cell states. A macrophage in a healthy liver is classified as the same cell type as a macrophage in the core of a pancreatic tumour, yet their transcriptomes diverge dramatically. The tumour macrophage has been co-opted, switching from inflammatory sentinel to immunosuppressive protector of the malignancy.

This distinction has immediate therapeutic relevance. Several Phase II trials are now targeting not cell types but specific cell states defined by scRNA-seq signatures. The goal is to re-programme tumour-associated macrophages rather than deplete them — a strategy that would have been inconceivable without single-cell resolution.

"120,000 cells. 40 biopsies. Three myeloid subtypes invisible to bulk sequencing — each predicting a different survival outcome."

— Bhaya et al., Nature Communications 2024

Implications for Drug Targeting

The drug development pipeline has long suffered from a fundamental mismatch: targets identified in cell lines or bulk-seq analyses often fail in clinical trials because the target cell population was never properly characterised. scRNA-seq is beginning to close this gap.

Mapping the precise cell state in which a drug target is expressed — and identifying which patients harbour sufficient numbers of those cells — transforms target validation from a population-level average into a patient-specific assessment. This is precision medicine in its most literal sense.

Beyond oncology, single-cell methods are being applied to autoimmune disease (mapping synovial cell states in rheumatoid arthritis), neurodegeneration (resolving microglial subtypes in Alzheimer's), and cardiovascular disease (characterising fibroblast heterogeneity post-infarction). Each application reveals the same pattern: the most actionable biology is hidden in the minority.

Disease Stratification

Perhaps the most immediate clinical application is patient stratification. Two patients with histologically identical pancreatic adenocarcinomas may have entirely different tumour microenvironments. One may harbour an inflamed, immune-infiltrated subtype susceptible to checkpoint blockade; the other a cold, desmoplastic microenvironment that physically excludes T cells. A bulk biopsy would classify them identically. A single-cell atlas would not.

Our lab's work on the pancreatic tumour atlas identified three distinct myeloid subtypes with independent prognostic value. Patients enriched for the SPP1-high immunosuppressive subtype had a median survival of 9.4 months; those dominated by FOLR2-high resident macrophages survived 22.1 months. Both groups were indistinguishable by standard pathology.

Computational Challenges

The biological signal in scRNA-seq is extraordinary, and so is the noise. Single cells contain far less RNA than bulk samples, and the dropout problem (where a gene is expressed but not detected due to low capture efficiency) complicates analysis substantially. Modern pipelines such as Seurat, Scanpy, and scVI address this through probabilistic models, dimensionality reduction, and graph-based clustering, but they introduce their own assumptions and artefacts.
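Dropout can be illustrated with a toy simulation. The sketch below assumes the simplest possible model, in which each mRNA molecule is captured independently with a fixed probability; the 10% capture efficiency, the molecule counts, and the function names are assumptions chosen for illustration, and this is not the noise model used by Seurat, Scanpy, or scVI.

```python
import random


def capture(true_count, capture_efficiency, rng):
    """Each mRNA molecule is captured independently with the given probability."""
    return sum(1 for _ in range(true_count) if rng.random() < capture_efficiency)


def dropout_rate(true_count, capture_efficiency, n_cells=5000, seed=0):
    """Fraction of simulated cells in which an expressed gene yields zero counts."""
    rng = random.Random(seed)
    zeros = sum(
        1 for _ in range(n_cells)
        if capture(true_count, capture_efficiency, rng) == 0
    )
    return zeros / n_cells


CAPTURE_EFFICIENCY = 0.10  # assumed value, for illustration only

for true_count in (2, 10, 50):
    simulated = dropout_rate(true_count, CAPTURE_EFFICIENCY)
    # Under this model the chance of observing zero is (1 - p) ** n.
    expected = (1 - CAPTURE_EFFICIENCY) ** true_count
    print(f"true molecules={true_count:>2}  simulated={simulated:.2f}  expected={expected:.2f}")
```

Even in this simplified setting, a lowly expressed gene is recorded as zero in most cells that do express it, which is why an observed zero cannot be read as true absence.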

Batch effects present a second major challenge. Cells sequenced in different laboratories, using different protocols or sequencing depths, cluster by technical artefact rather than biology unless carefully corrected. Harmony, scANVI, and related integration methods have become indispensable parts of any multi-study analysis.
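A toy example shows why uncorrected batches cluster by laboratory. The sketch below uses invented one-gene values with an assumed additive shift between two hypothetical labs, and corrects it by per-batch mean-centring. This is a deliberately naive stand-in, not the algorithm used by Harmony or scANVI, and it only works here because both batches have the same cell-type composition.

```python
import statistics

# Hypothetical one-gene expression values; the +5.0 batch shift is invented.
BATCH_SHIFT = 5.0
cells = [
    # (batch, cell_type, expression)
    ("lab_A", "T_cell", 1.0), ("lab_A", "T_cell", 1.2),
    ("lab_A", "macrophage", 3.0), ("lab_A", "macrophage", 3.2),
    ("lab_B", "T_cell", 1.0 + BATCH_SHIFT), ("lab_B", "T_cell", 1.2 + BATCH_SHIFT),
    ("lab_B", "macrophage", 3.0 + BATCH_SHIFT), ("lab_B", "macrophage", 3.2 + BATCH_SHIFT),
]


def centre_per_batch(rows):
    """Subtract each batch's mean (assumes similar composition across batches)."""
    batch_means = {
        batch: statistics.fmean(x for b, _, x in rows if b == batch)
        for batch in {b for b, _, _ in rows}
    }
    return [(b, t, x - batch_means[b]) for b, t, x in rows]


def mean_of(rows, batch, cell_type):
    """Mean expression for one cell type within one batch."""
    return statistics.fmean(x for b, t, x in rows if b == batch and t == cell_type)


def gaps(rows):
    """Return (T-cell gap between batches, cell-type gap within lab_A)."""
    batch_gap = abs(mean_of(rows, "lab_A", "T_cell") - mean_of(rows, "lab_B", "T_cell"))
    type_gap = abs(mean_of(rows, "lab_A", "T_cell") - mean_of(rows, "lab_A", "macrophage"))
    return batch_gap, type_gap


raw_batch_gap, raw_type_gap = gaps(cells)
corrected_batch_gap, corrected_type_gap = gaps(centre_per_batch(cells))

print(f"raw:       batch gap={raw_batch_gap:.2f}, cell-type gap={raw_type_gap:.2f}")
print(f"corrected: batch gap={corrected_batch_gap:.2f}, cell-type gap={corrected_type_gap:.2f}")
```

Before correction, T cells from the two labs differ more from each other than T cells differ from macrophages, so the technical shift dominates. After centring, the batch gap vanishes while the cell-type gap is preserved. When batches differ in composition, simple centring would distort the biology, which is one reason dedicated integration methods exist.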

The field is moving fast. Spatial transcriptomics now adds positional information — where each cell sits within tissue architecture. CITE-seq simultaneously measures RNA and protein. Multiome assays co-profile transcriptome and chromatin accessibility from the same nucleus. Each additional modality multiplies both the biological insight and the computational complexity.

What Comes Next

The bottleneck is no longer data generation: sequencing costs have fallen by roughly five orders of magnitude since the Human Genome Project. The bottleneck is interpretation, that is, turning atlases of millions of cells into actionable clinical knowledge. This requires better foundation models trained on single-cell data, tighter integration with clinical phenotypes and outcomes, and, crucially, prospective trials designed around single-cell-defined patient subgroups.

The next decade will determine whether single-cell sequencing remains a research instrument or becomes a clinical diagnostic. The biology suggests it should be the latter. The question is whether the infrastructure — computational, regulatory, and economic — can follow fast enough.

Ananya G. Bhaya is a computational biologist at Microsoft Research and NBU, working on single-cell multi-omics integration and precision oncology. Contact: ananya.gb@research.microsoft.com

The Computational Biology Review · Vol. IV