You have run the Space Ranger pipeline and, with your Visium data in hand, are ready to continue your analysis journey. Whether you are a new Visium user or an experienced bioinformatician, you may find yourself overwhelmed by the wide variety of software tools and unsure of the best approach for your particular experiment:
- How do I evaluate the quality of my data?
- How do I integrate single cell RNAseq with my spatial data?
- Can I aggregate my data?
- Can I identify the cells that make up a spot?
- What is the best way to present my data?
Here, we will provide some resources that may help you explore your results and address some of these questions. Typically, the next steps will involve checking data quality, preparing for further analyses (e.g. aggregating or integrating data), and exploring your data using Loupe Browser or community-developed tools.
Space Ranger’s web summary file (web_summary.html) is an easy first stop to assess the quality of your data. Several metrics, encompassing sequencing, mapping and spot quality, are useful in determining the overall success of an experiment.
In general, it is desirable to have accurate base calling with the majority of reads containing valid barcodes and UMIs. A high percentage of tissue-associated reads confidently mapped to the transcriptome or probe set (depending on your Visium assay type) is also preferable.
For further reading, consult the Technical Note for Visium FFPE, which discusses web summary file interpretation and expected metric values in greater detail.
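The same kind of check can be scripted when you have many samples. The sketch below is a minimal illustration, not a 10x-provided tool: it assumes the run wrote a one-row metrics CSV (such as `metrics_summary.csv` in the `outs` folder), that the column names match those shown, and that values are stored either as fractions or as percentages. The thresholds are hypothetical placeholders chosen for the example, not recommended values; use the expected ranges from the Technical Note for your assay.

```python
import csv
import io

# Hypothetical minimum fractions for illustration only.
# These are NOT official 10x Genomics recommendations, and the column
# names are assumptions that should be checked against your own file.
MIN_FRACTION = {
    "Valid Barcodes": 0.75,
    "Reads Mapped Confidently to Transcriptome": 0.50,
    "Fraction Reads in Spots Under Tissue": 0.50,
}


def parse_fraction(text):
    """Convert a value such as '0.97' or '97.0%' to a float fraction."""
    text = text.strip().replace(",", "")
    if text.endswith("%"):
        return float(text[:-1]) / 100.0
    return float(text)


def flag_low_metrics(csv_file, minimums=MIN_FRACTION):
    """Return {metric: value} for metrics present in the file and below their minimum."""
    row = next(csv.DictReader(csv_file))  # the file is assumed to hold a single data row
    flagged = {}
    for name, minimum in minimums.items():
        if name in row:  # silently skip metrics that this assay does not report
            value = parse_fraction(row[name])
            if value < minimum:
                flagged[name] = value
    return flagged


# Made-up example content standing in for a real metrics file.
example = io.StringIO(
    "Valid Barcodes,Reads Mapped Confidently to Transcriptome,"
    "Fraction Reads in Spots Under Tissue\n"
    "0.98,0.41,87.5%\n"
)
low = flag_low_metrics(example)
print(low)
```

With a real run you would pass an open file handle instead of the in-memory example. A flagged metric is a prompt to look at the web summary more closely, not a verdict on the experiment.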
Once you have completed your initial processing of the data using Space Ranger, there are a number of intermediate steps you might need to take depending on your experimental design and final analysis destination. There are 10x and community-developed tools (discussed later) that will aid in this process.
You might be interested in studying gene expression from consecutive sections of the same tissue block, or samples of the same tissue under different conditions, such as healthy vs. diseased. You can combine samples using Space Ranger aggr in a process commonly referred to as aggregation. An advantage of using this approach is that you can now make gene expression comparisons between the tissues, assuming similar technical variability. However, you may observe a batch effect in the aggregated data. There are tools suitable for correcting a batch effect in spatial data and we describe one approach here. Note that Space Ranger aggr currently does not correct for batch or chemistry effects.
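Aggregation is driven by a small CSV file that lists the runs to combine. The following sketch builds such a file in Python. The column names (`library_id`, `molecule_h5`, `cloupe_file`, `spatial_folder`) and the file names under `outs` are assumptions based on commonly documented Space Ranger layouts and may differ between versions, so check the Space Ranger aggr documentation for your release. The run directories are hypothetical.

```python
import csv
import io
from pathlib import PurePosixPath

# Assumed aggregation CSV columns; verify against your Space Ranger version.
AGGR_COLUMNS = ["library_id", "molecule_h5", "cloupe_file", "spatial_folder"]


def build_aggr_rows(run_dirs):
    """Map {library_id: run_directory} to one CSV row per sample."""
    rows = []
    for library_id, run_dir in run_dirs.items():
        outs = PurePosixPath(run_dir) / "outs"
        rows.append(
            {
                "library_id": library_id,
                "molecule_h5": str(outs / "molecule_info.h5"),
                "cloupe_file": str(outs / "cloupe.cloupe"),
                "spatial_folder": str(outs / "spatial"),
            }
        )
    return rows


def write_aggr_csv(rows, destination):
    """Write the rows, with a header line, to an open text file or buffer."""
    writer = csv.DictWriter(destination, fieldnames=AGGR_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical run directories for a healthy vs. diseased comparison.
rows = build_aggr_rows(
    {"healthy": "/data/runs/healthy", "diseased": "/data/runs/diseased"}
)
buffer = io.StringIO()
write_aggr_csv(rows, buffer)
aggr_csv_text = buffer.getvalue()
print(aggr_csv_text)
```

In practice you would write to a file on disk and pass it to `spaceranger aggr` through its `--csv` option. Adding a column that records the condition or batch of each sample can make later comparisons easier, if your version supports extra columns.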
In your analysis journey, you might also be interested in studying cell-cell communication or gene expression at single cell resolution. The data generated by Space Ranger preserves the spatial location of the gene transcripts. However, the resolution of the data is potentially lower than a single cell, because a spot can cover multiple cells depending on cell size. If you have a single cell gene expression dataset derived from the same tissue block or a similar sample as your spatial data, you can increase the resolution of your spatial data through mapping or deconvolution techniques. These techniques seek to assign the dominant cell type or the relative proportion of cell types in a spot. This process of using single cell data to improve spatial data resolution is commonly referred to as integration. We review common tools used for integration here and describe the use of one tool here.
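To make the idea of deconvolution concrete, here is a toy sketch rather than the method of any particular tool. It assumes you already have a reference matrix of average expression per cell type (derived from single cell data) and models a spot's counts as a non-negative mixture of those signatures, solved with simple multiplicative updates for non-negative least squares. The signature values, cell type names, and function name are all made up for illustration; published integration tools use considerably richer statistical models.

```python
import numpy as np


def estimate_proportions(signatures, spot_counts, n_iter=500, eps=1e-12):
    """Estimate cell type proportions for one spot.

    signatures  : genes x cell_types array of non-negative reference expression
    spot_counts : length-genes array of non-negative counts for the spot
    Returns non-negative weights normalised to sum to 1.
    """
    A = np.asarray(signatures, dtype=float)
    b = np.asarray(spot_counts, dtype=float)
    AtA = A.T @ A
    Atb = A.T @ b
    x = np.ones(A.shape[1])  # strictly positive starting point
    for _ in range(n_iter):
        # Multiplicative update: keeps x non-negative while reducing ||A x - b||.
        x *= Atb / (AtA @ x + eps)
    return x / x.sum()


# Hypothetical reference: 6 genes (rows) x 3 cell types (columns).
cell_types = ["type_A", "type_B", "type_C"]
signatures = np.array(
    [
        [10.0, 1.0, 0.0],
        [8.0, 0.0, 1.0],
        [1.0, 9.0, 0.0],
        [0.0, 7.0, 1.0],
        [1.0, 0.0, 12.0],
        [0.0, 1.0, 6.0],
    ]
)

# Simulate a noise-free spot containing a known mixture of the three types.
true_proportions = np.array([0.5, 0.3, 0.2])
spot = signatures @ (4 * true_proportions)

estimated = estimate_proportions(signatures, spot)
dominant = cell_types[int(np.argmax(estimated))]
print(dict(zip(cell_types, np.round(estimated, 3))), dominant)
```

The normalised weights correspond to the "relative proportion of cell types in a spot", and taking the largest weight corresponds to assigning a dominant cell type. Real data is noisy and sparse, which is why dedicated tools are preferable to this sketch for actual analyses.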
Loupe Browser is a 10x-developed desktop application for visualization, image alignment and analysis of 10x data. It opens cloupe files generated by the Space Ranger count and aggr pipelines. Loupe Browser is an excellent place to stop on your data analysis journey after reviewing the web summary file. You can visualize the spatial context of the clustering information overlaid on the tissue image, along with dimensionally reduced representations of your Visium data (e.g. UMAP and t-SNE). Loupe Browser lets you explore your data intuitively: you can subset and filter spots, run differential expression analysis, and identify features in the tissue to help generate or test primary hypotheses. Consult the Loupe Browser tutorials for spatial data for further exploration.
There is a strong and evolving ecosystem of community-developed tools that support the analysis of 10x Genomics Visium data. These tools require some programming knowledge; R and Python are two commonly used languages in the field. They are often used for downstream analyses beyond those enabled by 10x’s Space Ranger or Loupe Browser. Here is a short list of popular 10x-compatible community-developed tools:
- Seurat and Giotto are R packages for the analysis and visualization of spatial gene expression data. Tutorials using a Visium mouse brain dataset for Seurat and Giotto are good places to start. Giotto also has a tutorial using a Visium mouse kidney dataset.
- Bioconductor is a collection of R packages that includes tools for analyzing and visualizing spatial data. An online book here demonstrates the use of a variety of Bioconductor packages for common spatial data workflows.
- Squidpy is part of scverse, a collection of Python tools for the analysis of omics data. It supports the analysis and visualization of spatial data, with tutorials that cover topics such as data processing, visualization, nuclei segmentation and spot deconvolution. It is designed around modularity and scalability.
You can find further inspiration by visiting 10x Genomics’ publication page to find articles related to spatial gene expression, which cover broad areas of interest ranging from immunology to computational methods.
Another resource is The Museum of Spatial Transcriptomics. It is a curated list of spatial gene expression publications. The publication list includes computational methods as well as papers that provide a comprehensive historical perspective on the development and technologies within the field.
If you would like to start with review articles, here are a few examples:
- “Analysis and Visualization of Spatial Transcriptomic Data”
- “Advances in spatial transcriptomic data analysis”
- “Deciphering tissue structure and function using spatial transcriptomics”
These reviews cover common topics such as spatial data preprocessing, visualization, and cell and spatial feature identification. They also touch upon advanced topics like enhancing spot resolution and cell-cell communication.
If this is your first journey into analyzing big data (e.g. spatial data), or you are in need of a refresher on the Unix command line or computer programming, there are plenty of online resources to aid you.
- Data Carpentry and Software Carpentry offer online lessons on programming (R and Python) and the Unix command line.
- R for Data Science and Python Data Science Handbook are two introductory online books that cover data processing and visualization in their respective programming languages.
There are also online materials that specifically cover spatial transcriptomic data analysis. Users of all levels may find these materials useful:
- “Analyzing and Visualizing Visium Gene Expression Data” is a 10x webinar highlighting Visium with H&E-stained fresh frozen tissue.
- “Introducing Visium Spatial Gene Expression with Immunofluorescence” is a 10x webinar that showcases an immunofluorescence-stained fresh frozen sample.
- “Spatial Biology Without Limits” is a 10x webinar series with a section called “Data-driven discovery with Visium Spatial Gene Expression for FFPE” that gives a nice introduction to the analysis journey.
- “Orchestrating Spatially-Resolved Transcriptomics Analysis with Bioconductor” is an online book that covers common spatial analysis workflows.
- “Spatial transcriptomics data analysis in Python” is a GitHub repository collecting Python Jupyter notebook tutorials for common spatial gene expression analysis tools that were presented at the SCOG Virtual Workshop (May 23-24, 2022).
This article is not meant to be an exhaustive list of all the stops you can make on your spatial data analysis journey. New spatial gene expression analysis tools, articles, courses, and other resources continue to be released at a rapid pace, and the landscape of spatial data analysis is continually evolving. We hope this article helps you get started on your journey!