Space Ranger Spot Deconvolution Outputs

The spaceranger count pipeline outputs several CSV files and dendrogram plots which contain automated LDA-based spot deconvolution results for many values of K topics. The algorithm starts by performing LDA with an initial K = N graph-based clusters + 2 to model the Visium spot data as a mixture of K topics (inferred cell types). It then hierarchically clusters topics and collapses branches for decrementing K until K =2. See the algorithms page for more details.

Space Ranger's spot deconvolution is not supported for Visium HD Spatial Gene Expression.

The results are organized in the deconvolution folder as follows.

File Name	Description
deconvolution/dendrogram_k.png	Dendrogram showing how topics are hierarchically clustered and collapsed down to K = 2
deconvolution/dendrogram_k_distances.png	Dendrogram showing how topics are hierarchically clustered by distance between topics
deconvolution/deconvolution_k/	Folders containing deconvolution results for every K
deconvolution/deconvolution_k/deconvolution_topic_features_k.csv	Contains pseudocount and log2 fold change of features in each topic for annotating topics with cell type
deconvolution/deconvolution_k/deconvolution_spots_k.csv	Contains the estimated proportions of each topic within every spot

For a dataset with 10 graph-based clusters (largest K = 12), the deconvolution_topic_features_k12.csv is a table of topics and features that shows the relevance of every feature to each topic. It can be used for annotating topics to their biological cell type.

The column headers are: Feature ID, Feature Name, Feature count topic 1, Feature log2 fold change topic 1, Feature count topic 2, Feature log2 fold change topic 2, …

The feature count for each topic is a pseudocount of the number of UMIs assigned to each topic after LDA over the entire tissue. The log2 fold change calculates the increase of the feature in that topic relative to all other topics and quantifies the relevance of the gene to a topic.

The deconvolution_topic_features CSV for smaller K’s have topics named to indicate which topics were collapsed together. For example topic 1to2 is a topic that combines topic 1 and topic 2 together from the highest level of K.

The deconvolution_spots_k12.csv is a table of topics and spot barcodes that show the proportion of topics found in each spot under tissue. The sum of topic proportions for each spot is always equal to 1.

The column headers are: Barcode, Topic 1, Topic 2, …

The deconvolution_spots CSV for smaller K values have topics named to indicate which topics were collapsed together. For example topic 1to2 is a topic that combines topic 1 and topic 2 together from the highest level of K.

The dendrogram_k12.png shows the order of branching. X axis has the topics for largest initial K and the Y axis shows where the branches are cut for every level of K. For example, at K = 10, Topics 1 and 2 are collapsed to Topic 1to2 and Topics 5 and 6 are collapsed to Topic 5to6.

The dendrogram_k12_distances.png uses height to convey similarity of topics in addition to the order of branching. For example, in this plot it is apparent that topics 10, 11, and 12 are almost equally similar to each other even though from the other dendrogram it may seem like topics 11 and 12 are much more similar to each other than to topic 10.

Output organization

Deconvolution topic features CSV

Deconvolution spots CSV

Dendrograms