
Space Ranger Aggr Outputs

For convenient multi-sample analysis, the spaceranger aggr pipeline generates output files that contain all the data from the individual input jobs, aggregated into single output files. The capture area suffix of each barcode is updated to prevent barcode collisions.
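The suffix update can be illustrated with a minimal Python sketch. It assumes that barcodes have the form SEQUENCE-N with a numeric suffix, and that the N-th library in the aggregation CSV receives suffix N. Both points are assumptions made for illustration, and retag_barcode is a hypothetical helper, not part of Space Ranger.

```python
def retag_barcode(barcode, library_index):
    """Replace the numeric suffix of a barcode with the given library index."""
    sequence, sep, _old_suffix = barcode.rpartition("-")
    if not sep:
        # No suffix present: treat the whole string as the sequence.
        sequence = barcode
    return f"{sequence}-{library_index}"


# Assumed numbering: libraries are indexed from 1 in aggregation CSV order.
libraries = ["LV123", "LB456"]
barcode = "AAACAAGTATCTCCCA-1"  # same spot barcode in both capture areas
retagged = {
    lib: retag_barcode(barcode, index)
    for index, lib in enumerate(libraries, start=1)
}
print(retagged)
```

After retagging, the same spot barcode from two capture areas maps to two distinct identifiers, which is what prevents collisions in the aggregated matrix.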

A successful spaceranger aggr run should conclude with a message similar to this:

Outputs:
- Aggregation metrics summary HTML: /opt/runs/AGG123/outs/web_summary.html
- Aggregation metrics summary JSON: /opt/runs/AGG123/outs/summary.json
- Secondary analysis output CSV: /opt/runs/AGG123/outs/analysis
- Filtered feature-barcode matrices MEX: /opt/runs/AGG123/outs/filtered_feature_bc_matrix
- Filtered feature-barcode matrices HDF5: /opt/runs/AGG123/outs/filtered_feature_bc_matrix.h5
- Copy of the input aggregation CSV: /opt/runs/AGG123/outs/aggregation.csv
- Loupe Browser file: /opt/runs/AGG123/outs/cloupe.cloupe
- Aggregated tissue positions list: /opt/runs/AGG123/outs/aggr_tissue_positions.csv
- Spatial folder containing spatial images and scalefactors: /opt/runs/AGG123/outs/spatial

Pipestance completed successfully!

Upon completion, all the main pipeline outputs can be found in the outs/ subfolder. For the example run above, it has the following structure:

outs
├── aggregation.csv
├── aggr_tissue_positions.csv
├── analysis
│   ├── clustering
│   ├── diffexp
│   ├── pca
│   ├── tsne
│   └── umap
├── cloupe.cloupe
├── filtered_feature_bc_matrix
│   ├── barcodes.tsv.gz
│   ├── features.tsv.gz
│   └── matrix.mtx.gz
├── filtered_feature_bc_matrix.h5
├── spatial
│   ├── LV123
│   │   ├── scalefactors_json.json
│   │   ├── tissue_hires_image.png
│   │   └── tissue_lowres_image.png
│   ├── LB456
│   │   ├── scalefactors_json.json
│   │   ├── tissue_hires_image.png
│   │   └── tissue_lowres_image.png
│   └── LP789
│       ├── scalefactors_json.json
│       ├── tissue_hires_image.png
│       └── tissue_lowres_image.png
├── summary.json
└── web_summary.html
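As a quick sanity check on this layout, the following Python sketch reports any per-library spatial folder that lacks one of the three files shown in the tree. The helper name missing_spatial_files is hypothetical, and the expected file list is taken only from the example structure above, so it may need adjusting for other runs.

```python
from pathlib import Path

# File names taken from the example outs/spatial tree.
EXPECTED_SPATIAL_FILES = (
    "scalefactors_json.json",
    "tissue_hires_image.png",
    "tissue_lowres_image.png",
)


def missing_spatial_files(outs_dir):
    """Map each library folder under outs/spatial to the expected files it lacks."""
    spatial_dir = Path(outs_dir) / "spatial"
    missing = {}
    for library_dir in sorted(p for p in spatial_dir.iterdir() if p.is_dir()):
        absent = [
            name
            for name in EXPECTED_SPATIAL_FILES
            if not (library_dir / name).is_file()
        ]
        if absent:
            missing[library_dir.name] = absent
    return missing
```

For a complete run, calling missing_spatial_files("/opt/runs/AGG123/outs") would be expected to return an empty dictionary.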

The files are similar to the spaceranger count pipeline outputs. The aggr pipeline also outputs a copy of the input aggregation.csv.

Important
spaceranger aggr does not perform a spot-calling step; it simply aggregates the spot calls from each input job into a final set of spot calls.

The outputs for a multi-library spaceranger aggr run are identical to the outputs for a single library, with the addition of the Antibody-specific metrics in the web_summary.html and summary.json files and in the filtered feature-barcode matrices.

The spaceranger aggr pipeline outputs a summary.json file that contains metrics relating to the aggregated datasets. Note: Square brackets denote a variable that depends on the pipeline input. For example, if your aggregation contains two libraries with IDs sample123 and sample456, the metric [library_id]_frac_reads_kept appears as two output metrics: sample123_frac_reads_kept and sample456_frac_reads_kept.
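The bracket convention amounts to a simple string substitution, sketched below in Python. The function expand_metric is a hypothetical helper written for this illustration; it is not something Space Ranger provides.

```python
def expand_metric(template, library_ids):
    """Expand a [library_id] metric template into one metric name per library."""
    placeholder = "[library_id]"
    if placeholder not in template:
        # Metrics without the placeholder are reported once for the whole run.
        return [template]
    return [template.replace(placeholder, lib) for lib in library_ids]


names = expand_metric("[library_id]_frac_reads_kept", ["sample123", "sample456"])
print(names)  # ['sample123_frac_reads_kept', 'sample456_frac_reads_kept']
```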

For aggregated datasets that contain both Gene and Protein Expression libraries, there are additional metrics for Protein Capture that include [Antibody] in the metric name.

| Metric | Description |
| --- | --- |
| filtered_bcs_transcriptome_union | The estimated number of barcodes associated with a spot under tissue, summed across all input libraries. |
| [pre/post]_total_reads | Total number of sequenced reads, summed across all input libraries. |
| [pre/post]_multi_transcriptome_total_raw_reads_per_filtered_bc | total_reads divided by filtered_bcs_transcriptome_union. |
| [library_id]_pre_normalization_raw_reads_per_filtered_bc | The mean total reads per spot prior to depth normalization, for the library denoted by library_id. |
| [library_id]_pre_normalization_cmb_reads_per_filtered_bc | The mean confidently mapped and barcoded (CMB) reads per spot prior to depth normalization, for the library denoted by library_id. |
| [library_id]_frac_reads_kept | The fraction of reads that were retained after depth normalization, for the library denoted by library_id. |
| lowest_frac_reads_kept | The lowest fraction of reads retained, corresponding to the library which lost the most reads during normalization. A low value may indicate a large disparity in the initial depth of the input libraries. |
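To see which library is behind lowest_frac_reads_kept, the per-library fractions can be compared directly. The Python sketch below assumes that summary.json is a flat JSON object whose top-level keys are the metric names listed above; that layout and the numeric values are illustrative assumptions, not output from a real run, and the helper names are hypothetical.

```python
import json

SUFFIX = "_frac_reads_kept"


def frac_reads_kept_by_library(summary):
    """Collect {library_id: fraction} from [library_id]_frac_reads_kept keys."""
    fractions = {}
    for key, value in summary.items():
        # Skip the run-level minimum, which shares the same suffix.
        if key.endswith(SUFFIX) and key != "lowest" + SUFFIX:
            fractions[key[: -len(SUFFIX)]] = value
    return fractions


def most_downsampled_library(summary):
    """Return (library_id, fraction) for the library that kept the fewest reads."""
    fractions = frac_reads_kept_by_library(summary)
    library_id = min(fractions, key=fractions.get)
    return library_id, fractions[library_id]


# Illustrative values only; in practice, load outs/summary.json with json.load.
example_summary = json.loads(
    '{"sample123_frac_reads_kept": 1.0,'
    ' "sample456_frac_reads_kept": 0.62,'
    ' "lowest_frac_reads_kept": 0.62}'
)
print(most_downsampled_library(example_summary))
```

In this made-up example, sample456 is the library that lost the most reads, so it would be the first place to look when the initial sequencing depths differ widely.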

If one or more of the aggregated samples was a Targeted Gene Expression sample, these additional metrics will also appear:

| Metric | Description |
| --- | --- |
| [library_id]_pre_normalization_targeted_reads_per_filtered_bc | The mean targeted reads per spot prior to depth normalization, for the library denoted by library_id. |
| [library_id]_frac_targeted_reads_kept | The fraction of reads mapped uniquely and confidently to targeted genes that were retained after depth normalization, for the library denoted by library_id. This field will be shown instead of the metric [library_id]_frac_reads_kept above. |
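Because the targeted metric is shown in place of [library_id]_frac_reads_kept, code that reads the summary may need to check for both names. A minimal Python sketch follows, assuming the summary has already been loaded into a dictionary keyed by metric name; kept_fraction is a hypothetical helper and the values are illustrative only.

```python
def kept_fraction(summary, library_id):
    """Return (metric_name, value), preferring the targeted metric when present."""
    for suffix in ("_frac_targeted_reads_kept", "_frac_reads_kept"):
        name = library_id + suffix
        if name in summary:
            return name, summary[name]
    raise KeyError(f"no reads-kept metric found for library {library_id!r}")


# Illustrative values only: one targeted library and one whole-transcriptome library.
example_summary = {
    "sample123_frac_targeted_reads_kept": 0.8,
    "sample456_frac_reads_kept": 1.0,
}
print(kept_fraction(example_summary, "sample123"))
print(kept_fraction(example_summary, "sample456"))
```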