Understanding Outputs

Cell Ranger ARC uses standard output file formats wherever possible to stay compatible with common analysis tools. Chromium-specific data, including cellular and molecular barcodes, can be read by any third-party tool or script that parses the additional elements used by Cell Ranger ARC.

All pipeline outputs are written to a single pipeline output directory, named by the --id argument to cellranger-arc. When --id is not specified, the name defaults to the flow cell serial number (e.g., HAWT7ADXX).

Output files are saved in the outs/ subdirectory within this pipeline output directory. The contents include:

Multiomic outputs:
- Web summary HTML, run summary metrics, and per barcode summary metrics
- Secondary analysis including feature linkage
- Feature-barcode matrices in MEX and HDF5 formats
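
ATAC outputs:
- ATAC position-sorted alignments BAM
- ATAC per-fragment information TSV
- ATAC peak locations BED and peak annotations TSV
- ATAC cut sites track BIGWIG

Gene Expression outputs: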
- GEX position-sorted alignments BAM
- GEX molecule info HDF5
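
As a minimal sketch of how a script might locate the outs/ directory and check which outputs are present: the helper names `outs_dir` and `missing_outputs` are hypothetical, and the list of expected files is an illustrative subset of the names shown on this page, not an authoritative manifest.

```python
from pathlib import Path

# Illustrative subset of top-level output files named on this page.
# A given run may not contain every file, so treat this as an assumption.
EXPECTED_OUTPUTS = (
    "web_summary.html",
    "summary.csv",
    "per_barcode_metrics.csv",
    "raw_feature_bc_matrix.h5",
    "filtered_feature_bc_matrix.h5",
    "gex_molecule_info.h5",
    "atac_fragments.tsv.gz",
    "atac_peaks.bed",
    "cloupe.cloupe",
)


def outs_dir(parent, run_id):
    """Return <parent>/<run_id>/outs, where run_id is the value passed to --id."""
    return Path(parent) / run_id / "outs"


def missing_outputs(outs, expected=EXPECTED_OUTPUTS):
    """Return the expected file names that are absent from the outs/ directory."""
    outs = Path(outs)
    return [name for name in expected if not (outs / name).is_file()]
```

For example, `missing_outputs(outs_dir(".", "sample1"))` would list the files from `EXPECTED_OUTPUTS` that a run with the hypothetical ID `sample1` did not produce.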

Example structure of the outs/ directory:

├── analysis
│   ├── clustering
│   │   ├── atac
│   │   │   ├── peaks_graphclust
│   │   │   │   ├── clusters.csv
│   │   │   │   ├── differential_accessibility.csv
│   │   │   │   └── differential_expression.csv
│   │   │   ├── peaks_kmeans_2_clusters
│   │   │   │   ├── clusters.csv
│   │   │   │   ├── differential_accessibility.csv
│   │   │   │   └── differential_expression.csv
│   │   │   ├── peaks_kmeans_3_clusters
│   │   │   │   ├── clusters.csv
│   │   │   │   ├── differential_accessibility.csv
│   │   │   │   └── differential_expression.csv
│   │   │   ├── peaks_kmeans_4_clusters
│   │   │   │   ├── clusters.csv
│   │   │   │   ├── differential_accessibility.csv
│   │   │   │   └── differential_expression.csv
│   │   │   └── peaks_kmeans_5_clusters
│   │   │       ├── clusters.csv
│   │   │       ├── differential_accessibility.csv
│   │   │       └── differential_expression.csv
│   │   └── gex
│   │       ├── gene_expression_graphclust
│   │       │   ├── clusters.csv
│   │       │   ├── differential_accessibility.csv
│   │       │   └── differential_expression.csv
│   │       ├── gene_expression_kmeans_2_clusters
│   │       │   ├── clusters.csv
│   │       │   ├── differential_accessibility.csv
│   │       │   └── differential_expression.csv
│   │       ├── gene_expression_kmeans_3_clusters
│   │       │   ├── clusters.csv
│   │       │   ├── differential_accessibility.csv
│   │       │   └── differential_expression.csv
│   │       ├── gene_expression_kmeans_4_clusters
│   │       │   ├── clusters.csv
│   │       │   ├── differential_accessibility.csv
│   │       │   └── differential_expression.csv
│   │       └── gene_expression_kmeans_5_clusters
│   │           ├── clusters.csv
│   │           ├── differential_accessibility.csv
│   │           └── differential_expression.csv
│   ├── dimensionality_reduction
│   │   ├── atac
│   │   │   ├── lsa_components.csv
│   │   │   ├── lsa_dispersion.csv
│   │   │   ├── lsa_features_selected.csv
│   │   │   ├── lsa_projection.csv
│   │   │   ├── lsa_variance.csv
│   │   │   ├── tsne_projection.csv
│   │   │   └── umap_projection.csv
│   │   └── gex
│   │       ├── pca_components.csv
│   │       ├── pca_dispersion.csv
│   │       ├── pca_features_selected.csv
│   │       ├── pca_projection.csv
│   │       ├── pca_variance.csv
│   │       ├── tsne_projection.csv
│   │       └── umap_projection.csv
│   ├── feature_linkage
│   │   ├── feature_linkage.bedpe
│   │   └── feature_linkage_matrix.h5
│   └── tf_analysis
│       ├── filtered_tf_bc_matrix
│       │   ├── barcodes.tsv.gz
│       │   ├── matrix.mtx.gz
│       │   └── motifs.tsv
│       ├── filtered_tf_bc_matrix.h5
│       └── peak_motif_mapping.bed
├── cell_types
│   ├── 10x_Cloud # for cloud-based annotation models
│   │   ├── cell_annotation_differential_expression.csv
│   │   ├── cell_annotation_results.json.gz
│   │   └── cell_types.csv
│   └── Azimuth
│       ├── cell_annotation_differential_expression.csv
│       └── cell_types.csv
├── atac_cut_sites.bigwig
├── atac_fragments.tsv.gz
├── atac_fragments.tsv.gz.tbi
├── atac_peak_annotation.tsv
├── atac_peaks.bed
├── cloupe.cloupe
├── filtered_feature_bc_matrix
│   ├── barcodes.tsv.gz
│   ├── features.tsv.gz
│   └── matrix.mtx.gz
├── filtered_feature_bc_matrix.h5
├── gex_molecule_info.h5
├── per_barcode_metrics.csv
├── raw_feature_bc_matrix
│   ├── barcodes.tsv.gz
│   ├── features.tsv.gz
│   └── matrix.mtx.gz
├── raw_feature_bc_matrix.h5
├── summary.csv
└── web_summary.html
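
The clustering directories in the tree above follow a regular naming pattern, so a script can enumerate them rather than hard-coding each path. The sketch below assumes the directory names match the example tree exactly (`peaks_*` under `atac`, `gene_expression_*` under `gex`); the function name `list_clusterings` is hypothetical.

```python
import re
from pathlib import Path

# Matches names such as peaks_graphclust or gene_expression_kmeans_3_clusters,
# as they appear in the example tree (an assumption about other runs).
_CLUSTERING_NAME = re.compile(
    r"^(peaks|gene_expression)_(graphclust|kmeans_(\d+)_clusters)$"
)


def list_clusterings(outs):
    """Return (modality, method, k, clusters_csv_path) for each clustering found.

    k is None for graph-based clustering and an integer for k-means.
    """
    results = []
    clustering_root = Path(outs) / "analysis" / "clustering"
    for modality_dir in sorted(clustering_root.iterdir()):
        if not modality_dir.is_dir():
            continue
        for run_dir in sorted(modality_dir.iterdir()):
            match = _CLUSTERING_NAME.match(run_dir.name)
            if match is None:
                continue  # skip anything that does not follow the pattern
            k = int(match.group(3)) if match.group(3) else None
            method = "graphclust" if k is None else "kmeans"
            results.append(
                (modality_dir.name, method, k, run_dir / "clusters.csv")
            )
    return results
```

Each returned path points at the `clusters.csv` file for that clustering; the sibling `differential_accessibility.csv` and `differential_expression.csv` files sit in the same directory.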

More information about the contents of the pipeline output directory can be found in the Pipestance Structure page.

Brief description of output files:

File Name	Description
`web_summary.html`	Run summary metrics and charts in HTML format.
`summary.csv`	Run summary metrics in CSV format.
`raw_feature_bc_matrix.h5`	Raw feature-barcode matrix stored as a CSC sparse matrix in HDF5 format. The rows are all the gene and peak features concatenated together, and the columns are all observed barcodes with non-zero signal for either ATAC or gene expression.
`raw_feature_bc_matrix`	Raw feature-barcode matrix stored as a sparse matrix in MEX format. The rows are all the gene and peak features concatenated together, and the columns are all observed barcodes with non-zero signal for either ATAC or gene expression.
`per_barcode_metrics.csv`	ATAC and GEX read count summaries generated for every barcode observed in the experiment. For more details, see Per-barcode metrics.
`gex_possorted_bam.bam`	GEX reads aligned to the genome and transcriptome and annotated with barcode information, in BAM format.
`gex_possorted_bam.bam.bai`	Index for gex_possorted_bam.bam.
`gex_molecule_info.h5`	Count and barcode information for every GEX molecule observed in the experiment, in HDF5 format.
`filtered_feature_bc_matrix.h5`	Filtered feature-barcode matrix stored as a CSC sparse matrix in HDF5 format. The rows are all the gene and peak features concatenated together (identical to the raw feature-barcode matrix), and the columns are restricted to barcodes identified as cells.
`filtered_feature_bc_matrix`	Filtered feature-barcode matrix stored as a sparse matrix in MEX format. The rows are all the gene and peak features concatenated together (identical to the raw feature-barcode matrix), and the columns are restricted to barcodes identified as cells.
`cloupe.cloupe`	Loupe Browser visualization file with all the analysis outputs.
`atac_possorted_bam.bam`	ATAC reads aligned to the genome and annotated with barcode information, in BAM format.
`atac_possorted_bam.bam.bai`	Index for atac_possorted_bam.bam.
`atac_peaks.bed`	Locations of open-chromatin regions identified in this sample. These regions are referred to as "peaks".
`atac_peak_annotation.tsv`	Annotations of peaks based on genomic proximity alone. Note that these are not functional annotations and they do not make use of linkage with GEX data.
`atac_fragments.tsv.gz`	Count and barcode information for every ATAC fragment observed in the experiment in TSV format.
`atac_fragments.tsv.gz.tbi`	Index for atac_fragments.tsv.gz.
`atac_cut_sites.bigwig`	Genome track of observed transposition sites in the experiment smoothed at a resolution of 400 bases in BIGWIG format.
`analysis`	Secondary analyses that use the ATAC data, the GEX data, and their linkage: dimensionality reduction and clustering results for the ATAC and GEX data, differential expression and differential accessibility for each of those clustering results, and linkage between ATAC and GEX data. See Analysis Overview for more information.
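
The table states that the filtered matrix has the same rows as the raw matrix and a restricted set of columns. One way to sanity-check that relationship on the MEX directories, using only the Python standard library, is sketched below. The function names are hypothetical, and the code assumes only the file layout shown in the example tree (`features.tsv.gz` and `barcodes.tsv.gz`, one entry per line).

```python
import gzip
from pathlib import Path


def read_lines(path):
    """Read a gzip-compressed text file into a list of lines without newlines."""
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n") for line in handle]


def compare_matrices(outs):
    """Compare the raw and filtered MEX directories under an outs/ directory."""
    outs = Path(outs)
    raw = outs / "raw_feature_bc_matrix"
    filtered = outs / "filtered_feature_bc_matrix"

    raw_barcodes = read_lines(raw / "barcodes.tsv.gz")
    filtered_barcodes = read_lines(filtered / "barcodes.tsv.gz")

    return {
        # Rows: the feature lists should be identical in both matrices.
        "same_features": read_lines(raw / "features.tsv.gz")
        == read_lines(filtered / "features.tsv.gz"),
        # Columns: every cell barcode should also appear in the raw matrix.
        "filtered_subset_of_raw": set(filtered_barcodes) <= set(raw_barcodes),
        "n_raw_barcodes": len(raw_barcodes),
        "n_cell_barcodes": len(filtered_barcodes),
    }
```

If either boolean in the result is `False`, the two directories probably do not come from the same run.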
