PROPOSE: Predictive and robust probe selection for spatial transcriptomics
Abstract
Fluorescence in situ hybridization (FISH) is a widely used method for visualizing gene expression in cells and tissues. A key challenge is selecting a small panel of genes (typically less than 1% of the genome) to probe in a FISH experiment so that the panel is as informative as possible about gene expression, either in general or for a specific experimental objective. We introduce predictive and robust probe selection (PROPOSE), a method that uses deep learning to identify informative marker genes from single-cell RNA sequencing (scRNA-seq) data. Using datasets spanning different brain regions, species, and scRNA-seq technologies, we show that PROPOSE reliably identifies gene panels that predict the genome-wide expression profile more accurately, capturing more information with fewer probes. Furthermore, PROPOSE can be readily adapted to specific experimental goals, such as classifying cell types or discerning neuronal electrical properties from scRNA-seq data. Finally, using a recent MERFISH dataset, we demonstrate that PROPOSE’s binarization of gene expression levels enables models trained on scRNA-seq data to generalize to input data obtained via FISH, despite the complex domain shift between these technologies.
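To make the binarization idea concrete, the following is a minimal sketch and not the PROPOSE implementation. It assumes that binarization means thresholding each expression value into expressed versus not expressed; the helper name `binarize_expression`, the threshold of zero, and the toy matrices are all hypothetical and chosen only for illustration. The point is that measurements on very different scales can map to the same binary input.

```python
import numpy as np


def binarize_expression(x, threshold=0.0):
    """Hypothetical helper: 1.0 where expression exceeds `threshold`, else 0.0.

    The threshold value is an assumption; the abstract does not specify it.
    """
    return (np.asarray(x, dtype=float) > threshold).astype(np.float32)


# Toy data (made up): the same 2 cells x 3 genes measured on two scales.
scrna_counts = np.array([[0, 5, 120], [3, 0, 0]])            # read counts
fish_spots = np.array([[0.0, 1.0, 14.0], [2.0, 0.0, 0.0]])   # spot counts

scrna_binary = binarize_expression(scrna_counts)
fish_binary = binarize_expression(fish_spots)

# After thresholding, both technologies yield the same on/off pattern,
# even though the raw magnitudes differ by an order of magnitude.
same_pattern = np.array_equal(scrna_binary, fish_binary)
print(scrna_binary)
print(same_pattern)
```

In this toy case the two binary matrices coincide, which illustrates why a model trained on binarized scRNA-seq inputs might tolerate a change of measurement scale. Real data would also differ in detection sensitivity and noise, which a fixed threshold alone does not address.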