3'/5' Multiplex Outputs (Cell Ranger multi)

This page details the cellranger multi output structure for 3'/5' sample multiplexing data, using a 3' on-chip multiplexing (OCM) experiment as an example.

Your specific output files and directory structure may differ from this example depending on your experimental design and analysis parameters.

Upon completion, the cellranger multi pipeline will produce an outs/ directory with a structure similar to the following:


├── config.csv
├── filtered_feature_bc_matrix
│   ├── barcodes.tsv.gz
│   ├── features.tsv.gz
│   └── matrix.mtx.gz
├── filtered_feature_bc_matrix.h5
├── multiplexing_analysis
│   └── cells_per_tag.json
├── per_sample_outs
│   ├── OH1
│   ├── OH2
│   ├── OH3
│   └── OH4
├── qc_library_metrics.csv
├── qc_report.html
├── qc_sample_metrics.csv
├── raw_cloupe.cloupe
├── raw_feature_bc_matrix
│   ├── barcodes.tsv.gz
│   ├── features.tsv.gz
│   └── matrix.mtx.gz
├── raw_feature_bc_matrix.h5
└── raw_molecule_info.h5

The top level may contain the following files:

config.csv: a duplicate of the input config CSV file.
feature_reference.csv: a duplicate of the input Feature Reference CSV file, if provided.
filtered_feature_bc_matrix.h5: filtered feature-barcode matrix (containing only barcodes called as cells), concatenated across all samples, in HDF5 format.
qc_library_metrics.csv: quality control report in CSV format for the entire experiment.
qc_report.html: quality control report in HTML format for the entire experiment.
qc_sample_metrics.csv: quality control report in CSV format by sample.
raw_cloupe.cloupe: Loupe Browser file containing all samples and all cell-associated barcodes in the experiment.
raw_feature_bc_matrix.h5: raw feature-barcode matrix (containing all barcodes) in HDF5 format.
raw_molecule_info.h5: information about all molecules in the experiment. This file includes background and cell-associated barcodes from all samples in addition to valid barcodes that were not assigned to a sample.

Continue reading for descriptions of the directories contained within the top-level outputs.

In addition to the HDF5 files listed above, two directories contain the filtered (concatenated across all samples) and raw feature-barcode matrices in Market Exchange (MEX) Format:


└── filtered_feature_bc_matrix
    ├── barcodes.tsv.gz
    ├── features.tsv.gz
    └── matrix.mtx.gz
...
└── raw_feature_bc_matrix
    ├── barcodes.tsv.gz
    ├── features.tsv.gz
    └── matrix.mtx.gz
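As a rough illustration of how the three MEX files fit together, the sketch below parses them with only the Python standard library. It assumes the usual layout of these files: rows of matrix.mtx.gz index features, columns index barcodes, indices are 1-based, and the first tab-separated column of features.tsv.gz is the feature ID. The function names parse_mex and read_mex_dir are illustrative and not part of Cell Ranger.

```python
import gzip
from pathlib import Path


def parse_mex(matrix_lines, feature_lines, barcode_lines):
    """Build {barcode: {feature_id: count}} from the three MEX text streams."""
    # Assumption: the feature ID is the first tab-separated column.
    features = [ln.rstrip("\n").split("\t")[0] for ln in feature_lines if ln.strip()]
    barcodes = [ln.strip() for ln in barcode_lines if ln.strip()]

    # Skip the Matrix Market banner and comment lines, which start with '%'.
    entries = (ln.split() for ln in matrix_lines if ln.strip() and not ln.startswith("%"))
    # The first remaining line holds: number of rows, columns, and stored entries.
    n_rows, n_cols, _n_entries = (int(x) for x in next(entries))
    if (n_rows, n_cols) != (len(features), len(barcodes)):
        raise ValueError("matrix dimensions do not match features/barcodes")

    counts = {}
    for row, col, value in entries:
        # Indices are 1-based: row -> feature, column -> barcode.
        barcode = barcodes[int(col) - 1]
        counts.setdefault(barcode, {})[features[int(row) - 1]] = int(value)
    return counts


def read_mex_dir(mex_dir):
    """Open the three gzipped files in one MEX directory and parse them."""
    mex_dir = Path(mex_dir)
    with gzip.open(mex_dir / "matrix.mtx.gz", "rt") as m, \
         gzip.open(mex_dir / "features.tsv.gz", "rt") as f, \
         gzip.open(mex_dir / "barcodes.tsv.gz", "rt") as b:
        return parse_mex(m, f, b)
```

Barcodes with no non-zero entries do not appear in the returned dictionary. A nested dictionary is only practical for small matrices; for real datasets a sparse-matrix reader such as scipy.io.mmread is the more common choice.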

For 3'/5' multiplexing experiments, Cell Ranger outputs a multiplexing_analysis directory. The files in this directory apply to the entire multiplexing experiment rather than to an individual sample:


└── multiplexing_analysis
    ├── assignment_confidence_table.csv
    ├── barcode_sample_assignments.csv
    ├── cells_per_tag.json
    ├── tag_calls_per_cell.csv
    └── tag_calls_summary.csv

Output File	Description
`assignment_confidence_table.csv`	A table that contains all information from the tag assignment algorithm for each cell-associated barcode, in CSV format. This file is not output for on-chip multiplexing (OCM).
`barcode_sample_assignments.csv`	If barcode-sample-assignment was specified in the multi config CSV, a duplicate of the input barcode-sample assignment CSV file will be generated. This file is not output for on-chip multiplexing (OCM).
`cells_per_tag.json`	Lists the cell-associated barcodes that were assigned a given tag, for each tag, in JSON format.
`tag_calls_summary.csv`	A table that summarizes the multiplexing results, including the number of cells assigned no tag, one tag, and more than one tag, in CSV format. This file is not output for on-chip multiplexing (OCM).
`tag_calls_per_cell.csv`	A table of the assigned tags and the UMI counts per tag for all barcodes called as singlets or multiplets, in CSV format. Barcodes called as Unassigned or Blanks are not included in this table. This file is not output for on-chip multiplexing (OCM).

The assignment_confidence_table.csv file summarizes the output of the tag-assignment algorithm for each cell-associated barcode, including the probability that the barcode belongs to each possible state. Because the per-state probabilities are reported, you can apply a different confidence threshold for assigning tags to barcodes in a data-science environment such as Python or R before downstream analysis.


Row,CMO301,CMO302,Barcode,Multiplet,Blank,Assignment,Assignment_Probability,CMO301_cnts,CMO302_cnts
0,0.999721559521299,3.4331367105764475e-10,AAACCCAAGAGTGTGC-1,0.0002784107425317906,2.939285533268778e-08,CMO301,0.999721559521299,4.0744873049856905,2.444044795918076
1,2.1178952222935904e-09,0.9999759641250832,AAACCCAAGATGCTTC-1,2.3239920572975783e-06,2.1709764964310346e-05,CMO302,0.9999759641250832,2.6020599913279625,3.8196755199942927
2,0.999822627035718,3.287157579467574e-07,AAACCCAAGGTACAGC-1,8.871810165359213e-05,8.83261468704939e-05,CMO301,0.999822627035718,3.630834517828051,2.3404441148401185
3,0.8947739014808398,9.424785606470267e-09,AAACCCAAGTAGCTCT-1,0.10522608718358256,1.910792092383626e-09,Unassigned,0.8947739014808398,4.219767844658398,2.9916690073799486
4,0.033490762078236785,0.004751323355950833,AAACCCACACGGATCC-1,0.9617539697731167,3.944792695789878e-06,Multiplet,0.9617539697731167,3.6148972160331345,3.4896772916636984

Column descriptions:

CMO301, CMO302, etc.: one column per tag used in the experiment, giving the probability that the barcode belongs to that tag's singlet state.
Barcode: the cell-associated barcode.
Multiplet: the coarse-grained multiplet state, giving the probability that the barcode is some type of multiplet. Obtained by summing the probabilities across each possible multiplet state in the experiment.
Blank: the probability that the barcode contains untagged cells, i.e., that it has too few tag counts for every tag.
Assignment: the state assigned by the tag-assignment algorithm for the given barcode. Values: one of the singlet states, Multiplet, Blanks, or Unassigned (if every state has probability less than 90%, which is the default minimum confidence threshold).
Assignment_Probability: the probability of the most-likely state.
CMO301_cnts, CMO302_cnts, etc.: (starting from Cell Ranger v7.1) one column per tag used in the experiment, indicates the log-transformed UMI count for each tag.
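The re-thresholding described above could be sketched as follows, using only the Python standard library. This is not Cell Ranger's own implementation: the column names are taken from the example header, the tag columns are inferred as "everything that is not a fixed column or a _cnts column" (an assumption), and the state labels written by reassign are simply the column names.

```python
import csv
import io

# Non-tag columns, as they appear in the example header above.
FIXED_COLUMNS = {"Row", "Barcode", "Multiplet", "Blank",
                 "Assignment", "Assignment_Probability"}


def reassign(rows, threshold):
    """Yield (barcode, state) pairs using a custom minimum confidence."""
    for row in rows:
        # Assumption: remaining columns without a _cnts suffix are tag states.
        tags = [c for c in row if c not in FIXED_COLUMNS and not c.endswith("_cnts")]
        states = tags + ["Multiplet", "Blank"]
        best = max(states, key=lambda s: float(row[s]))
        call = best if float(row[best]) >= threshold else "Unassigned"
        yield row["Barcode"], call


# Rows 3 and 4 copied from the example table above.
example = io.StringIO(
    "Row,CMO301,CMO302,Barcode,Multiplet,Blank,Assignment,Assignment_Probability,CMO301_cnts,CMO302_cnts\n"
    "3,0.8947739014808398,9.424785606470267e-09,AAACCCAAGTAGCTCT-1,0.10522608718358256,1.910792092383626e-09,Unassigned,0.8947739014808398,4.219767844658398,2.9916690073799486\n"
    "4,0.033490762078236785,0.004751323355950833,AAACCCACACGGATCC-1,0.9617539697731167,3.944792695789878e-06,Multiplet,0.9617539697731167,3.6148972160331345,3.4896772916636984\n"
)
rows = list(csv.DictReader(example))
default_calls = dict(reassign(rows, threshold=0.9))
relaxed_calls = dict(reassign(rows, threshold=0.85))
```

With the default 0.9 threshold, AAACCCAAGTAGCTCT-1 stays Unassigned because its most likely state (CMO301) has probability of about 0.895; lowering the threshold to 0.85 assigns it to CMO301, while the Multiplet call for the other barcode is unchanged.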

The cells_per_tag.json file lists, in JSON format, the cell-associated barcodes that were assigned to each OCM barcode tag. Each OCM barcode (e.g., OB1) is a key, and the cell-associated barcodes assigned to it (e.g., "AAACCCTGTCAGCCGT-1", etc.) are listed below it:


{
"OB1":[
        "AAACCCTGTCAGCCGT-1",
        "AAAGGTTGTGAGGACT-1",
        "AAAGGTTGTGCACGCT-1",
        "AACATCCGTATGGCTT-1",
        "AACCTTAGTGAGATAT-1",
...
   "OB2":[
        "AAACCCGCAATGACGT-1",
        "AAACGTCCAGGGCTTA-1",
        "AAACTCACAGTTGCGA-1",
        "AACCAACCACCTATAT-1",
        "AACCAACCATAGGCGC-1",
...
}
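For downstream work it is often more convenient to look up the tag for a given barcode than the barcodes for a given tag. The sketch below inverts the mapping with the standard json module. The JSON string is an abbreviated stand-in built from a few barcodes in the example above, and the helper name tags_by_barcode is illustrative.

```python
import json


def tags_by_barcode(cells_per_tag):
    """Invert {tag: [barcodes]} into {barcode: [tags]}."""
    lookup = {}
    for tag, barcodes in cells_per_tag.items():
        for barcode in barcodes:
            # Keep a list so a barcode listed under several tags is not lost.
            lookup.setdefault(barcode, []).append(tag)
    return lookup


# Abbreviated stand-in for the contents of cells_per_tag.json.
text = """
{
  "OB1": ["AAACCCTGTCAGCCGT-1", "AAAGGTTGTGAGGACT-1"],
  "OB2": ["AAACCCGCAATGACGT-1"]
}
"""
cells_per_tag = json.loads(text)
lookup = tags_by_barcode(cells_per_tag)
cells_per_tag_counts = {tag: len(bcs) for tag, bcs in cells_per_tag.items()}
```

To run this on real output, replace json.loads(text) with json.load on an open handle to multiplexing_analysis/cells_per_tag.json.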

The tag_calls_summary.csv summarizes multiplexing results by providing statistics about categories including the number of cells assigned no tag, one tag, more than one tag, etc. The category No tag assigned includes both cells that were considered Blanks and cells considered Unassigned.


Category,num_cells,pct_cells,median_umis,stddev_umis
No tag molecules,0,0.0,None,None
No tag assigned,386,2.9,None,None
1 tag assigned,12465,93.6,None,None
More than 1 tag assigned,472,3.5,None,None
CMO301,6437,48.3,3442.0,7988.3
CMO302,6028,45.2,3515.5,5167.2
CMO301|CMO302,472,3.5,12414.0,9696.0

Column descriptions:

num_cells: number of cells in the category.
pct_cells: percent of cells in the category.
median_umis: median UMI counts for the tag(s), amongst cells assigned those tag(s).
stddev_umis: standard deviation of the UMI counts for the tag(s), amongst cells assigned those tag(s).
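As a quick sanity check on how the columns relate, the sketch below recomputes pct_cells from num_cells for the example table above. It assumes the denominator is the sum of the four top-level categories; this is an observation that happens to reproduce the example values, not a documented definition.

```python
import csv
import io

# The example table shown above, copied verbatim.
summary_csv = """Category,num_cells,pct_cells,median_umis,stddev_umis
No tag molecules,0,0.0,None,None
No tag assigned,386,2.9,None,None
1 tag assigned,12465,93.6,None,None
More than 1 tag assigned,472,3.5,None,None
CMO301,6437,48.3,3442.0,7988.3
CMO302,6028,45.2,3515.5,5167.2
CMO301|CMO302,472,3.5,12414.0,9696.0
"""

rows = {r["Category"]: r for r in csv.DictReader(io.StringIO(summary_csv))}

# Assumed denominator: every cell falls into exactly one top-level category.
top_level = ["No tag molecules", "No tag assigned",
             "1 tag assigned", "More than 1 tag assigned"]
total_cells = sum(int(rows[c]["num_cells"]) for c in top_level)

# Recompute each percentage from the counts, rounded to one decimal place.
recomputed = {c: round(100 * int(r["num_cells"]) / total_cells, 1)
              for c, r in rows.items()}
mismatches = [c for c, r in rows.items() if recomputed[c] != float(r["pct_cells"])]
```

In this example the per-tag rows (CMO301, CMO302) add up to the "1 tag assigned" row, and the CMO301|CMO302 row matches "More than 1 tag assigned".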

The tag_calls_per_cell.csv file contains tag calls per cell, one line for each barcode. It contains all singlet and multiplet cells; Unassigned or Blanks are not included.


cell_barcode,num_features,feature_call,num_umis
AAACCCAAGCAACAGC-1,1,CMO301,16778
AAACCCAAGCTCGTGC-1,1,CMO301,1735
AAACCCACATGACTGT-1,1,CMO301,1625
AAACCCAGTCCACAGC-1,1,CMO301,19323
AAACCCAGTCGCGGTT-1,1,CMO301,1678

Column descriptions:

cell_barcode: the cell-associated barcode.
num_features: number of tags assigned to that cell.
feature_call: names of tag(s) assigned, delimited by "|".
num_umis: number of molecules (UMIs) for each tag assigned, delimited by "|".
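Because feature_call and num_umis are pipe-delimited, they need to be split before use. The following sketch pairs each tag with its UMI count using the standard csv module. The first data row comes from the example above; the second is a made-up multiplet row added only to show the delimiter handling, and the function name parse_tag_calls is illustrative.

```python
import csv
import io


def parse_tag_calls(handle):
    """Map each cell barcode to a {tag: umi_count} dictionary."""
    calls = {}
    for row in csv.DictReader(handle):
        tags = row["feature_call"].split("|")
        umis = [int(n) for n in row["num_umis"].split("|")]
        # num_features should equal the number of tags and of UMI counts.
        if len(tags) != int(row["num_features"]) or len(tags) != len(umis):
            raise ValueError(f"inconsistent row for {row['cell_barcode']}")
        calls[row["cell_barcode"]] = dict(zip(tags, umis))
    return calls


example = io.StringIO(
    "cell_barcode,num_features,feature_call,num_umis\n"
    "AAACCCAAGCAACAGC-1,1,CMO301,16778\n"         # row from the example above
    "HYPOTHETICAL-1,2,CMO301|CMO302,5000|7000\n"  # hypothetical multiplet row
)
calls = parse_tag_calls(example)
multiplets = [bc for bc, tags in calls.items() if len(tags) > 1]
```

For real data, pass an open handle to multiplexing_analysis/tag_calls_per_cell.csv instead of the in-memory example.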

The per_sample_outs directory contains the demultiplexed outputs for each sample in a multiplexed experiment. In this example, two of the four samples are shown (OH1 and OH2):


├── OH1
│   ├── analysis
│   ├── metrics_summary.csv
│   ├── sample_cloupe.cloupe
│   ├── sample_filtered_barcodes.csv
│   ├── sample_filtered_feature_bc_matrix
│   ├── sample_filtered_feature_bc_matrix.h5
│   ├── sample_molecule_info.h5
│   ├── sample_raw_feature_bc_matrix
│   ├── sample_raw_feature_bc_matrix.h5
│   └── web_summary.html
├── OH2
│   ├── analysis
│   ├── metrics_summary.csv
│   ├── sample_cloupe.cloupe
│   ├── sample_filtered_barcodes.csv
│   ├── sample_filtered_feature_bc_matrix
│   ├── sample_filtered_feature_bc_matrix.h5
│   ├── sample_molecule_info.h5
│   ├── sample_raw_feature_bc_matrix
│   ├── sample_raw_feature_bc_matrix.h5
│   └── web_summary.html

Each per-sample directory contains the following files and folders:

analysis: secondary analysis results, including dimensionality reduction, clustering, and differential gene expression.
metrics_summary.csv: experimental metrics in CSV format.
sample_cloupe.cloupe: sample-specific Loupe Browser file.
sample_filtered_barcodes.csv: sample-specific filtered barcodes in CSV format.
sample_filtered_feature_bc_matrix: sample-specific filtered feature-barcode matrix in Market Exchange (MEX) Format.
sample_filtered_feature_bc_matrix.h5: sample-specific filtered feature-barcode matrix (containing only barcodes called as cells within this sample) in HDF5 format.
sample_molecule_info.h5: information about all molecules in the sample. This file includes background and cell-associated barcodes.
sample_raw_feature_bc_matrix: sample-specific raw feature-barcode matrix in Market Exchange (MEX) Format. This directory is provided with on-chip multiplexing (OCM) but not cell hashing or CellPlex (CMO) datasets.
sample_raw_feature_bc_matrix.h5: sample-specific raw feature-barcode matrix (containing all barcodes assigned to this sample) in HDF5 format. This file is provided with on-chip multiplexing (OCM) but not cell hashing or CellPlex (CMO) datasets.
web_summary.html: sample-specific web summary HTML, a starting point for quality control.
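When a run contains several samples, it can help to collect the per-sample files programmatically. The sketch below walks per_sample_outs and returns the path of each sample's filtered HDF5 matrix. It relies only on the directory and file names listed above; the function name per_sample_matrices and the "outs" path in the usage note are illustrative.

```python
from pathlib import Path


def per_sample_matrices(outs_dir):
    """Map sample name -> path of its filtered HDF5 matrix, if present."""
    found = {}
    for sample_dir in sorted(Path(outs_dir, "per_sample_outs").iterdir()):
        matrix = sample_dir / "sample_filtered_feature_bc_matrix.h5"
        # Skip stray files and sample directories without the matrix.
        if sample_dir.is_dir() and matrix.is_file():
            found[sample_dir.name] = matrix
    return found
```

For the example in this page, per_sample_matrices("outs") would be expected to return one entry per sample directory (OH1, OH2, and so on), assuming the run completed and the outs/ directory is in the current working directory.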
