A Unified Pipeline for FISH Spatial Transcriptomics
Abstract
In recent years, high-throughput spatial transcriptomics has emerged as a powerful tool for investigating the spatial distribution of mRNA expression and its possible effects on cellular function. However, standardized tools for analyzing spatial transcriptomics data are lacking, leading many groups to write in-house tools that are often poorly documented and do not generalize to other datasets. Currently, the only publicly available tools for extracting annotated transcript locations from raw multiplexed fluorescent in situ hybridization (FISH) images are starfish, which is lacking in some key areas, and MERlin, which is restricted to MERFISH data. To address this, we have expanded and improved the starfish library and used the resulting tools to create PIPEFISH, a semi-automated and generalizable pipeline that performs transcript annotation for FISH-based spatial transcriptomics. PIPEFISH has options for image processing, decoding, and cell segmentation, and calculates quality control metrics on its output so that users can assess the pipeline’s performance on their data. We used this pipeline to annotate transcript locations in three real datasets, one from each of three common types of FISH image-based experiments: MERFISH, seqFISH, and targeted in situ sequencing (ISS). We verified that the results were of high quality using the pipeline’s internal quality metrics and a comparison to an orthogonal method of measuring RNA expression in a similar tissue sample. We have made PIPEFISH publicly available through GitHub for anyone interested in analyzing data from FISH-based spatial transcriptomic assays.