Cell Ranger multi Filtered Outputs

The per_sample_outs/ directory is produced after a successful run of the multi pipeline and contains filtered data, i.e., data from cell-associated barcodes in this sample. These are the main outputs of interest.

The contents of the following folders within the per_sample_outs/ directory are described here:

count
vdj_t or vdj_t_gd
vdj_b
antigen_analysis

The count/ folder contains the results of 5' Single Cell Gene Expression analysis:


├── count
    ├── analysis
    │   ├── clustering
    │   ├── diffexp
    │   ├── pca
    │   ├── tsne
    │   └── umap
    ├── aggregate_barcodes.csv
    ├── feature_reference.csv
    ├── sample_cloupe.cloupe
    ├── sample_filtered_barcodes.csv
    ├── sample_filtered_feature_bc_matrix
    │   ├── barcodes.tsv.gz
    │   ├── features.tsv.gz
    │   └── matrix.mtx.gz
    ├── sample_filtered_feature_bc_matrix.h5
    ├── sample_molecule_info.h5
    ├── sample_alignments.bam
    └── sample_alignments.bam.bai

File/Folder	Description
`analysis`	Folder containing the results of graph-based clustering and K-means clustering (K = 2 to 10); differential gene expression analysis between clusters; and PCA, t-SNE, and UMAP dimensionality reduction.
`aggregate_barcodes.csv`	Contents from both antibody and antigen aggregate barcode algorithms. If Antibody and Antigen Capture Libraries are included, and a specific barcode has been determined to be both an antigen and an antibody aggregate, this file contains two lines for that barcode. The first line is the antibody UMI count and the second line is the antigen UMI count associated with that aggregate barcode. The `library_type` column distinguishes antibody vs. antigen aggregate barcodes.
`feature_reference.csv`	A copy of the input `feature_reference.csv`
`sample_cloupe.cloupe`	A Loupe Browser readable file.
`sample_filtered_barcodes.csv`	File containing a list of barcodes associated with aligned reads. The barcode sequence ends in a suffix with a dash separator followed by a number. The number denotes a GEM well, and is used to virtualize barcodes in order to achieve a higher effective barcode diversity when combining samples generated from separate GEM chip channel runs. The number should be “1” across all barcodes when analyzing a sample from a single GEM well. The suffix-based preservation of GEM well information is especially useful when running `cellranger aggr` on multiple libraries generated from different GEM chip channels.
`sample_filtered_feature_bc_matrix`	Folder containing the feature-barcode matrix for detected cell-associated barcodes only. Each element of the matrix is the number of UMIs associated with a feature (row) and a barcode (column). The matrix can be loaded into third-party packages for further processing (e.g., to filter outlier cells, run dimensionality reduction, or normalize gene expression). It is similar to the `filtered_feature_bc_matrix` output.
`sample_filtered_feature_bc_matrix.h5`	Same information as `sample_filtered_feature_bc_matrix` in H5 format.
`sample_molecule_info.h5`	Contains per-molecule information for all molecules that contain a valid barcode and valid UMI and were assigned with high confidence to a gene or Feature Barcode. This file is a required input to run `cellranger aggr`.
`sample_alignments.bam`	Indexed BAM file containing position-sorted reads aligned to the genome and transcriptome, as well as unaligned reads.
`sample_alignments.bam.bai`	Companion file to `sample_alignments.bam` that serves as an external index. In cases where the reference transcriptome is generated from a genome with very long chromosomes (>512 Mbp), Cell Ranger v7.0+ generates a `sample_alignments.bam.csi` index file instead.
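
As an illustration of the matrix layout described in the table above (features as rows, barcodes as columns, UMI counts as elements), the following is a minimal sketch that sums UMIs per barcode using only the Python standard library. It assumes that `matrix.mtx.gz` is in Matrix Market coordinate format with 1-based indices and that the first tab-separated field of each line in `barcodes.tsv.gz` is the barcode. The function names `umis_per_barcode` and `gem_well` are hypothetical helpers, not part of Cell Ranger.

```python
import gzip
from pathlib import Path


def read_gz_lines(path):
    """Return the non-empty lines of a gzip-compressed text file."""
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


def umis_per_barcode(matrix_dir):
    """Sum UMI counts over all features for each barcode (matrix column)."""
    matrix_dir = Path(matrix_dir)
    # Assumption: the first tab-separated field of each line is the barcode.
    barcodes = [line.split("\t")[0]
                for line in read_gz_lines(matrix_dir / "barcodes.tsv.gz")]
    totals = [0] * len(barcodes)
    with gzip.open(matrix_dir / "matrix.mtx.gz", "rt") as handle:
        # Lines starting with '%' are the Matrix Market banner and comments.
        entries = (line.split() for line in handle
                   if line.strip() and not line.startswith("%"))
        # The first data line holds the dimensions: rows, columns, non-zeros.
        n_features, n_barcodes, n_nonzero = map(int, next(entries))
        if n_barcodes != len(barcodes):
            raise ValueError("matrix columns do not match barcodes.tsv.gz")
        for feature_idx, barcode_idx, count in entries:
            totals[int(barcode_idx) - 1] += int(count)  # indices are 1-based
    return dict(zip(barcodes, totals))


def gem_well(barcode):
    """Return the numeric GEM well suffix of a barcode such as 'ACGT-1'."""
    return int(barcode.rsplit("-", 1)[1])
```

Calling `umis_per_barcode("sample_filtered_feature_bc_matrix")` from inside the count/ folder would return a dictionary keyed by barcode; `gem_well` can then be used to group those barcodes by GEM well. For large matrices, a sparse-matrix library is likely a better choice than this line-by-line reader.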

Important

TCR with gamma-delta chains: The cellranger multi pipeline allows users to analyze TCR libraries enriched for gamma (TRG) and delta (TRD) chains. However, gamma-delta analysis is not a supported workflow, and algorithm performance cannot be guaranteed. TRG/D outputs are located in the outs/multi/vdj_t_gd folder. Output files in the vdj_t_gd folder are similar to those in the vdj_t and vdj_b folders.

The vdj_t/ and vdj_b/ folders contain the results of V(D)J immune profiling analysis for T cells and B cells, respectively. The output file names and file structure in these folders are identical, so they are described only once:


├── vdj_b/t
    ├── airr_rearrangement.tsv
    ├── cell_barcodes.json
    ├── clonotypes.csv
    ├── concat_ref.bam
    ├── concat_ref.bam.bai
    ├── concat_ref.fasta
    ├── concat_ref.fasta.fai
    ├── consensus_annotations.csv
    ├── consensus.bam
    ├── consensus.bam.bai
    ├── consensus.fasta
    ├── consensus.fasta.fai
    ├── donor_regions.fa
    ├── filtered_contig_annotations.csv
    ├── filtered_contig.fasta
    ├── filtered_contig.fastq
    ├── vdj_contig_info.pb
    └── vloupe.vloupe

File/Folder	Description
`airr_rearrangement.tsv`	Annotated contigs and consensus sequences of V(D)J rearrangements in the AIRR format.
`cell_barcodes.json`	List of barcodes identified as T/B cells.
`clonotypes.csv`	High-level descriptions of each clonotype.
`concat_ref.bam`	For each clonotype consensus, each reference sequence is the annotated germline segments concatenated together. This file shows how both the per-cell contigs and the clonotype consensus contig relate to the germline reference. `concat_ref.bam` is expected to reveal polymorphisms, somatic mutations, and recombination-induced differences such as non-templated nucleotide additions.
`concat_ref.bam.bai`	Companion file to the `concat_ref.bam` that serves as an external index.
`concat_ref.fasta`	Concatenated V(D)J reference segments for the segments detected on each consensus sequence. These serve as an approximate reference for each consensus sequence.
`concat_ref.fasta.fai`	Companion file to the `concat_ref.fasta` that serves as an external index.
`consensus_annotations.csv`	High-level and detailed annotations of each clonotype consensus sequence.
`consensus.bam`	Each reference sequence is a clonotype consensus sequence, and each record is an alignment of a single cell's contig against this consensus. For a clonotype consensus sequence, this file shows how the constituent per-cell assemblies support the consensus.
`consensus.bam.bai`	Companion file to the `consensus.bam` that serves as an external index.
`consensus.fasta`	The clonotype consensus sequence is the consensus sequence of each assembled contig. It is identical to the sequence of the top (most frequent) exact subclonotype. The consensus sequence should be full-length (starting in the 5' UTR and ending at the C gene primer binding site). Poor data quality may result in a partial sequence.
`consensus.fasta.fai`	Companion file to the `consensus.fasta` that serves as an external index.
`filtered_contig_annotations.csv`	High-level annotations of each high-confidence, cellular contig. This is a subset of `all_contig_annotations.csv`.
`filtered_contig.fasta`	High-confidence contig sequences in cell barcodes in FASTA format.
`filtered_contig.fastq`	High-confidence contig sequences in cell barcodes in FASTQ format.
`vdj_contig_info.pb`	This file stores the contig annotations, V(D)J reference, and additional metadata in a protobuf binary file format. This file is required to run the `cellranger aggr` pipeline.
`vloupe.vloupe`	Loupe V(D)J Browser readable file.

The antigen_analysis/ folder contains the results of Antigen Capture analysis. It is only present if an Antigen Capture library is included in the analysis. The two files in this folder are antigen_specificity_scores.csv (present only if the [antigen-specificity] section was provided in the multi config CSV) and per_barcode.csv.

The primary outputs of the antigen specificity algorithm are located in antigen_specificity_scores.csv. The barcode column shows all cell-associated barcodes, the antigen and antigen_umi columns show on-target antigen IDs and per-barcode on-target antigen UMI counts, and the control and control_umi columns show the negative control antigen IDs and negative control antigen UMI counts. The antigen specificity score is calculated per barcode (described on the Antigen Algorithm page) and reported in the antigen_specificity_score column. For a TCR Antigen Capture (BEAM-T) library, the MHC allele ID is shown in the mhc_allele column. If a given barcode is associated with a clonotype, the clonotype and exact subclonotype IDs are reported in the raw_clonotype_id and exact_subclonotype_id columns, respectively.


barcode,antigen,antigen_umi,control,control_umi,antigen_specificity_score,mhc_allele,raw_clonotype_id,exact_subclonotype_id
AAACGGGAGCCCGAAA-1,BEAM01,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
AAACGGGAGCCCGAAA-1,BEAM02,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
AAACGGGAGCCCGAAA-1,BEAM03,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
AAACGGGAGCCCGAAA-1,BEAM04,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
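
Because the file has one row per barcode and antigen pair, a common first step is to reduce it to one row per barcode. The sketch below picks, for each barcode, the antigen with the highest antigen_specificity_score, using the example rows above as inline data. It assumes that a higher score indicates stronger specificity, and it keeps the first antigen encountered when scores tie. The function name `best_antigen_per_barcode` is hypothetical.

```python
import csv
import io

# Example rows copied from the excerpt above; in practice, open the CSV file.
SAMPLE = """\
barcode,antigen,antigen_umi,control,control_umi,antigen_specificity_score,mhc_allele,raw_clonotype_id,exact_subclonotype_id
AAACGGGAGCCCGAAA-1,BEAM01,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
AAACGGGAGCCCGAAA-1,BEAM02,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
AAACGGGAGCCCGAAA-1,BEAM03,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
AAACGGGAGCCCGAAA-1,BEAM04,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
"""


def best_antigen_per_barcode(handle):
    """Map each barcode to its (antigen, score) pair with the highest score."""
    best = {}
    for row in csv.DictReader(handle):
        score = float(row["antigen_specificity_score"])
        current = best.get(row["barcode"])
        # Strict comparison: on ties, the first antigen seen is kept.
        if current is None or score > current[1]:
            best[row["barcode"]] = (row["antigen"], score)
    return best


best = best_antigen_per_barcode(io.StringIO(SAMPLE))
for barcode, (antigen, score) in best.items():
    print(barcode, antigen, score)
```

In the example rows every score is 0.0, so the result is not informative on its own; a real analysis would likely also apply a minimum score threshold before calling a barcode antigen-specific.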

The per_barcode.csv is a barcode lookup table to find barcodes that are called as Gene Expression or V(D)J cells. The is_gex_cell column identifies barcodes called as cells based on the Gene Expression library, the is_vdj_cell column identifies barcodes called as cells based on the V(D)J library, the raw_clonotype_id column shows the clonotype ID assigned to that barcode (if one exists), and the exact_subclonotype_id column shows the exact subclonotype ID assigned to that barcode (if one exists).


barcode,is_gex_cell,is_vdj_cell,raw_clonotype_id,exact_subclonotype_id
AAACCTGAGGTAGCCA-1,true,true,clonotype1,1
AAACCTGAGTGCTGCC-1,true,true,clonotype1,1
AAACCTGCAATCCGAT-1,true,true,clonotype1,1
AAACCTGGTACCGGCT-1,true,true,clonotype1,1
AAACCTGTCAGTTAGC-1,true,true,clonotype1,1
AAACCTGTCCGAAGAG-1,true,false,,
AAACGGGAGCTCAACT-1,true,true,clonotype1,1
AAACGGGCAGTAACGG-1,true,true,clonotype1,1
AAACGGGTCATAACCG-1,true,true,clonotype1,1
AAAGATGCAAGCGAGT-1,true,true,clonotype1,1
AAAGATGCACGGACAA-1,true,false,,
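
The lookup table can be summarized with a few lines of standard-library Python. The sketch below counts how many barcodes were called as cells by each library, using a subset of the example rows above as inline data. It assumes the boolean columns are written as the lowercase strings `true` and `false`, as in the excerpt; the function name `summarize_per_barcode` is hypothetical.

```python
import csv
import io

# A subset of the example rows above; in practice, open per_barcode.csv.
SAMPLE = """\
barcode,is_gex_cell,is_vdj_cell,raw_clonotype_id,exact_subclonotype_id
AAACCTGAGGTAGCCA-1,true,true,clonotype1,1
AAACCTGAGTGCTGCC-1,true,true,clonotype1,1
AAACCTGTCCGAAGAG-1,true,false,,
AAACGGGAGCTCAACT-1,true,true,clonotype1,1
"""


def summarize_per_barcode(handle):
    """Count barcodes by cell-calling status and clonotype assignment."""
    counts = {"total": 0, "gex_cells": 0, "vdj_cells": 0,
              "both": 0, "with_clonotype": 0}
    for row in csv.DictReader(handle):
        is_gex = row["is_gex_cell"] == "true"
        is_vdj = row["is_vdj_cell"] == "true"
        counts["total"] += 1
        counts["gex_cells"] += is_gex
        counts["vdj_cells"] += is_vdj
        counts["both"] += is_gex and is_vdj
        # An empty raw_clonotype_id means no clonotype was assigned.
        counts["with_clonotype"] += bool(row["raw_clonotype_id"])
    return counts


summary = summarize_per_barcode(io.StringIO(SAMPLE))
print(summary)
```

The same loop could be extended to collect the barcodes that are Gene Expression cells without a V(D)J call, which is often the subset of interest when checking library concordance.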
