SEraster: a rasterization preprocessing framework for scalable spatial omics data analysis
Abstract
Motivation: Spatial omics data demand computational analysis, but many analysis tools have computational resource requirements that increase with the number of cells analyzed. This presents scalability challenges as researchers use spatial omics technologies to profile millions of cells.
Results: To enhance the scalability of spatial omics data analysis, we developed a rasterization preprocessing framework called SEraster that aggregates cellular information into spatial pixels. We apply SEraster to both real and simulated spatial omics data prior to spatially variable gene expression analysis to demonstrate that such preprocessing can reduce resource requirements while maintaining high performance. We further integrate SEraster with existing analysis tools to characterize cell-type spatial co-occurrence. Finally, we apply SEraster to enable analysis of a mouse pup spatial omics dataset with over a million cells to identify tissue-level and cell-type-specific spatially variable genes as well as co-occurring cell types that recapitulate expected organ structures.
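As a rough illustration of what aggregating cellular information into spatial pixels can look like, the following is a minimal Python sketch. It is not the SEraster implementation or its API: the function name `rasterize`, the parameters `resolution` and `fun`, the square grid anchored at the minimum coordinate, and the choice between mean and sum aggregation are all assumptions made for this example.

```python
import numpy as np

def rasterize(coords, counts, resolution, fun="mean"):
    """Aggregate per-cell values into square pixels (hypothetical helper).

    coords: (n_cells, 2) array of x, y positions.
    counts: (n_cells, n_genes) array of per-cell expression values.
    resolution: pixel side length, in the same units as coords.
    fun: "mean" or "sum" (assumed aggregation options).
    """
    coords = np.asarray(coords, dtype=float)
    counts = np.asarray(counts, dtype=float)

    # Map each cell to an integer pixel index relative to the minimum corner.
    origin = coords.min(axis=0)
    idx = np.floor((coords - origin) / resolution).astype(int)

    # Keep only occupied pixels; `inverse` gives each cell's pixel row.
    pixels, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.ravel()

    # Count cells per pixel and accumulate their expression values.
    n_cells = np.bincount(inverse, minlength=len(pixels))
    agg = np.zeros((len(pixels), counts.shape[1]))
    np.add.at(agg, inverse, counts)

    if fun == "mean":
        agg = agg / n_cells[:, None]
    elif fun != "sum":
        raise ValueError("fun must be 'mean' or 'sum'")

    # Report each pixel by the coordinates of its center.
    centers = origin + (pixels + 0.5) * resolution
    return centers, agg, n_cells


# Toy data: four cells, two genes, pixels of side 5.
coords = [[0, 0], [1, 1], [10, 10], [11, 0]]
counts = [[2, 0], [4, 2], [1, 1], [3, 5]]
centers, agg, n_cells = rasterize(coords, counts, resolution=5)
```

In this toy case the two cells near the origin fall into the same pixel, so the output has three rows instead of four. Downstream analyses then operate on pixels rather than cells, which is where the reduction in resource requirements would come from.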
Availability and implementation: Source code is available on GitHub (https://github.com/JEFworks-Lab/SEraster) with additional tutorials at https://JEF.works/SEraster.