STEM: A Method for Mapping Single-cell and Spatial Transcriptomics Data with Transfer Learning

Minsheng Hao, Erpai Luo, Yixin Chen, Yanhong Wu, Chen Li, Sijie Chen, Haoxiang Gao, Haiyang Bian, Lei Wei, Xuegong Zhang
bioRxiv (2023)


Profiling spatial variations in cellular composition and transcriptomic characteristics is important for understanding the physiology and pathology of tissues in health and disease. Spatial transcriptomics (ST) data are powerful for depicting spatial gene expression, but the currently dominant high-throughput technologies do not yet reach single-cell resolution. In contrast, single-cell RNA-sequencing (SC) data provide high-throughput transcriptomic information at the single-cell level but lack spatial information. Integrating these two types of data would be ideal for revealing transcriptomic landscapes at single-cell resolution. We developed the method STEM (SpaTially aware EMbedding) for this purpose. It uses deep transfer learning to encode both ST and SC data into a unified spatially aware embedding space, and then uses the embeddings to infer the SC-ST mapping and predict pseudo-spatial adjacency between cells in the SC data. Semi-simulation and real-data experiments verified that the embeddings preserved spatial information and eliminated technical biases between SC and ST data. In addition, attribution analysis in STEM can reveal the genes whose expression dominates the spatial information. We applied STEM to data from human squamous cell carcinoma and from hepatic lobule to uncover the spatial localization of rare cell types and to reveal cell-type-specific gene expression variation along a spatial axis. STEM is a powerful tool for mapping SC and ST data to build single-cell-level spatial transcriptomic landscapes, and it can provide mechanistic insights into the spatial heterogeneity and microenvironments of tissues.
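
To make the final step described above concrete, the following is a minimal sketch of how embeddings in a shared space could be turned into an SC-ST mapping and a pseudo-spatial adjacency between cells. It is not the STEM implementation: the cosine similarity, the softmax temperature, the k-nearest-neighbor rule, the random toy embeddings, and all function names (map_cells_to_spots, pseudo_spatial_adjacency) are assumptions chosen for illustration, and the embeddings are taken as given rather than learned by transfer learning.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def softmax_rows(scores):
    """Numerically stable softmax applied independently to each row."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def map_cells_to_spots(sc_emb, st_emb, temperature=0.1):
    """Return a (cells x spots) matrix whose rows are mapping probabilities.

    Assumption: similarity in the shared embedding space is cosine similarity,
    converted to probabilities with a temperature-scaled softmax.
    """
    sim = l2_normalize(sc_emb) @ l2_normalize(st_emb).T
    return softmax_rows(sim / temperature)


def pseudo_spatial_adjacency(sc_emb, k=3):
    """Return a boolean (cells x cells) matrix linking each cell to its
    k most similar cells in the embedding space (assumed adjacency rule)."""
    sim = l2_normalize(sc_emb) @ l2_normalize(sc_emb).T
    np.fill_diagonal(sim, -np.inf)  # a cell is never its own neighbor
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    adjacency = np.zeros(sim.shape, dtype=bool)
    rows = np.repeat(np.arange(sim.shape[0]), k)
    adjacency[rows, neighbors.ravel()] = True
    return adjacency


# Toy inputs standing in for learned embeddings (hypothetical data).
rng = np.random.default_rng(0)
st_emb = rng.normal(size=(5, 8))    # 5 ST spots, 8-dimensional embeddings
sc_emb = rng.normal(size=(12, 8))   # 12 single cells in the same space
st_xy = rng.uniform(size=(5, 2))    # 2-D coordinates of the ST spots

mapping = map_cells_to_spots(sc_emb, st_emb)
cell_xy = mapping @ st_xy           # probability-weighted location per cell
adjacency = pseudo_spatial_adjacency(sc_emb, k=3)
```

Each row of mapping sums to one, so cell_xy places every cell at a weighted average of spot coordinates. In this sketch the adjacency matrix is directed; symmetrizing it (adjacency | adjacency.T) is one possible choice if an undirected graph is needed.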