
Cell Ranger Multi Config CSV (Manual Page)

The cellranger multi pipeline uses a configuration CSV file to specify input file paths and analysis options. The general layout for all analyses includes the [gene-expression] and [libraries] sections. The other sections may be included depending on the analysis.
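As an illustration of this layout, the following Python sketch splits a config CSV into its bracketed sections and reports whether the two general sections are present. It is not part of Cell Ranger: the helper names `split_sections` and `missing_general_sections` are hypothetical, the paths are placeholders, and the sketch assumes each section begins with a `[name]` header on its own line.

```python
GENERAL_SECTIONS = ("gene-expression", "libraries")


def split_sections(config_text):
    """Map each [section] name to the non-blank lines that follow it."""
    sections = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Tolerate trailing commas that spreadsheet exports may add to headers.
        header = line.rstrip(",")
        if header.startswith("[") and header.endswith("]"):
            current = header[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


def missing_general_sections(config_text):
    """Return the general sections that are absent from the config."""
    found = split_sections(config_text)
    return [name for name in GENERAL_SECTIONS if name not in found]


example = """\
[gene-expression]
reference,/path/to/reference
create-bam,true

[libraries]
fastq_id,fastqs,feature_types
gex1,/path/to/fastqs,Gene Expression
"""

print(missing_general_sections(example))
```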

The information below is organized by config section. Options that apply only to certain assays, or that have specific recommendations for particular analyses, are noted as such. Example config CSV layouts for specific assays are shown on the assay-specific pages.

The [gene-expression] section specifies information about the Gene Expression library.

Field: Description
reference: Required. Absolute path to the folder containing a 10x Genomics-compatible genome reference.
create-bam: Required. Enable or disable BAM file generation. Setting create-bam=false reduces the total computation time and the size of the output directory (BAM file not generated). We recommend setting create-bam=true if unsure. See https://10xgen.com/create-bam for additional guidance.
r1-length: Optional. Limit the length of the input Read 1 sequence of Gene Expression libraries to the first N bases, where N is a user-supplied value. Note that the length includes the 10x Barcode and UMI sequences, so do not set this below 26. This and r2-length are useful options for determining the optimal read length for sequencing. Default: do not trim Read 1.
r2-length: Optional. Limit the length of the input Read 2 sequence of Gene Expression libraries to the first N bases, where N is a user-supplied value. Trimming occurs before sequencing metrics are computed; therefore, limiting the length of Read 2 may affect Q30 scores. Default: do not trim Read 2.
chemistry: Optional. Assay configuration. By default, the assay configuration is detected automatically (recommended). Typically, users will not need to specify a chemistry. However, options are available if needed. Default: auto. Starting from Cell Ranger v8.0, it is possible to specify library-specific assay configurations. For details, refer to the Libraries section.
expect-cells: Optional. Override the pipeline’s auto-estimation of cells. See cell calling algorithm overview for details on how this parameter is used. If used, enter the expected number of recovered cells.
  • Up to 30,000 cells are supported with standard kits for 3' Cell Multiplexing and up to 60,000 cells for HT kits.
  • For FRP, specifying library-level expect-cells in the [gene-expression] section is only valid for the singleplex FRP configuration; note this option name has a dash (-).
force-cells: Optional. Force pipeline to use this number of cells, bypassing cell detection. Default: detect cells using Cell Ranger's cell calling algorithm.
  • For FRP, specifying library-level force-cells in the [gene-expression] section is only valid for the singleplex Fixed RNA Profiling configuration; note this option name has a dash (-).
include-introns: Optional. Set to false to exclude intronic reads in count. Including introns in analysis is recommended to maximize sensitivity. Default: true
  • This option does not apply to Fixed RNA Profiling analysis.
no-secondary: Optional. Disable secondary analysis, e.g. clustering. Default: false
check-library-compatibility: Optional. This option allows users to disable the check that evaluates 10x Barcode overlap between libraries when multiple libraries are specified (e.g., Gene Expression + Antibody Capture). Setting this option to false will disable the check across all library combinations. We recommend running this check (default); however, if the pipeline errors out, users can bypass the check to generate outputs for troubleshooting. Default: true
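The options above can be assembled into a [gene-expression] section with a few lines of Python. This is a minimal sketch, not Cell Ranger code: `render_gene_expression` is a hypothetical helper, the path is a placeholder, and the sketch assumes the section is written as one `option,value` row per line under the section header. It also enforces the r1-length floor of 26 described above.

```python
MIN_R1_LENGTH = 26  # Read 1 length includes the 10x Barcode and UMI sequences


def render_gene_expression(reference, create_bam, r1_length=None, r2_length=None):
    """Return the text of a [gene-expression] section as option,value rows."""
    if r1_length is not None and r1_length < MIN_R1_LENGTH:
        raise ValueError(f"r1-length must be at least {MIN_R1_LENGTH}")
    rows = [
        ("reference", reference),
        ("create-bam", "true" if create_bam else "false"),
    ]
    # Optional trimming options are written only when explicitly requested.
    if r1_length is not None:
        rows.append(("r1-length", str(r1_length)))
    if r2_length is not None:
        rows.append(("r2-length", str(r2_length)))
    lines = ["[gene-expression]"] + [f"{key},{value}" for key, value in rows]
    return "\n".join(lines) + "\n"


print(render_gene_expression("/path/to/reference", create_bam=True, r1_length=28))
```

Leaving r1-length and r2-length out of the section keeps the documented default of not trimming either read.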

These are the chemistry options for 3', Fixed RNA Profiling (FRP), and 5' assays.

• auto: Chemistry autodetection (default)
• threeprime: Single Cell 3'
• SC3Pv1, SC3Pv2, SC3Pv3, SC3Pv4: Single Cell 3' v1, v2, v3, or v4
• SC3Pv3HT: Single Cell 3' v3.1 HT
• SC-FB: Single Cell Antibody-only 3' v2 or 5'
• fiveprime: Single Cell 5'
• SC5P-PE: Paired-end Single Cell 5'
• SC5P-R2: R2-only Single Cell 5'
• SC5P-R2-v3: R2-only Single Cell 5' v3
• SC5PHT: Single Cell 5' v2 HT
• SFRP: Fixed RNA Profiling (Singleplex)
• MFRP: Fixed RNA Profiling (Multiplex, Probe Barcode on R2)
• MFRP-R1: Fixed RNA Profiling (Multiplex, Probe Barcode on R1)
• MFRP-RNA: Fixed RNA Profiling (Multiplex, RNA, Probe Barcode on R2)
• MFRP-Ab: Fixed RNA Profiling (Multiplex, Antibody, Probe Barcode at R2:69)
• MFRP-Ab-R2pos50: Fixed RNA Profiling (Multiplex, Antibody, Probe Barcode at R2:50)
• MFRP-RNA-R1: Fixed RNA Profiling (Multiplex, RNA, Probe Barcode on R1)
• MFRP-Ab-R1: Fixed RNA Profiling (Multiplex, Antibody, Probe Barcode on R1)
• ARC-v1: Gene Expression portion of Multiome data. If Cell Ranger auto-detects ARC-v1 chemistry, an error is triggered.
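A config-generating script can catch a mistyped chemistry name before a run is launched. The sketch below simply checks a value against the names listed above; `check_chemistry` is a hypothetical helper, and exact, case-sensitive matching is an assumption of this sketch rather than documented Cell Ranger behavior.

```python
CHEMISTRY_OPTIONS = frozenset({
    "auto", "threeprime",
    "SC3Pv1", "SC3Pv2", "SC3Pv3", "SC3Pv4", "SC3Pv3HT",
    "SC-FB",
    "fiveprime", "SC5P-PE", "SC5P-R2", "SC5P-R2-v3", "SC5PHT",
    "SFRP", "MFRP", "MFRP-R1", "MFRP-RNA", "MFRP-Ab",
    "MFRP-Ab-R2pos50", "MFRP-RNA-R1", "MFRP-Ab-R1",
    "ARC-v1",
})


def check_chemistry(value="auto"):
    """Return the value if it is one of the listed names, else raise ValueError."""
    if value not in CHEMISTRY_OPTIONS:
        raise ValueError(f"unrecognized chemistry option: {value!r}")
    return value


print(check_chemistry("SC3Pv3"))
```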

These [gene-expression] options only apply to 3' Cell Multiplexing data analysis.

Field: Description
min-assignment-confidence: Optional. The minimum estimated likelihood to call a sample as tagged with a Cell Multiplexing Oligo (CMO) instead of "Unassigned". Users may wish to tolerate a higher rate of mis-assignment in order to obtain more singlets to include in their analysis, or a lower rate of mis-assignment at the cost of obtaining fewer singlets. Default: 0.9. Contact 10x Genomics support for further advice.
cmo-set: Optional. The default CMO reference IDs are built into the Cell Ranger software and do not need to be specified. However, this option can be used to specify the path to a custom CMO set CSV file, declaring CMO constructs and associated barcodes. See CMO Reference section for details.
barcode-sample-assignment: Optional. Absolute path to a barcode-sample assignment CSV file that specifies the barcodes that belong to each sample. See details below to set up this file.

These [gene-expression] options only apply to Fixed RNA Profiling data analysis.

Field: Description
probe-set: Required. Absolute path to the probe set reference CSV file. This file is included with the Cell Ranger package v7.0 and later (i.e., cellranger-x.y.z/probe_sets/) and is available on the Downloads page.
filter-probes: Optional. Controls whether probes predicted to have off-target activity to homologous genes are excluded from analysis; they are excluded by default. Setting filter-probes to false causes UMI counts from all non-deprecated probes listed in the probe set reference CSV file, including those with predicted off-target activity, to be used in the analysis. Probes whose ID is prefixed with DEPRECATED are always excluded from the analysis. Default: true
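The interaction between filter-probes and deprecated probes can be summarized in a short Python sketch. This only illustrates the rule stated above and is not how Cell Ranger is implemented: `select_probes` is a hypothetical helper, the probe IDs are made up, and the boolean `off_target` field is an assumed stand-in for however the probe set reference marks probes with predicted off-target activity.

```python
def select_probes(probes, filter_probes=True):
    """Return the IDs of probes whose UMI counts would be used in the analysis."""
    kept = []
    for probe in probes:
        # Probes prefixed with DEPRECATED are excluded regardless of the setting.
        if probe["probe_id"].startswith("DEPRECATED"):
            continue
        # Predicted off-target probes are dropped only when filtering is enabled.
        if filter_probes and probe["off_target"]:
            continue
        kept.append(probe["probe_id"])
    return kept


probes = [
    {"probe_id": "probe_a", "off_target": False},
    {"probe_id": "probe_b", "off_target": True},
    {"probe_id": "DEPRECATED_probe_c", "off_target": False},
]

print(select_probes(probes))                       # default: filter-probes=true
print(select_probes(probes, filter_probes=False))  # filter-probes=false
```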

The [feature] section specifies information about the Feature Barcode library.

Field: Description
reference: Required only for Antibody Capture, Antigen Capture, or CRISPR Guide Capture libraries. Absolute path to the Feature reference CSV file, declaring Feature Barcode constructs and associated barcodes.
r1-length: Optional. Limit the length of the input Read 1 sequence of Feature Barcode libraries to the first N bases, where N is a user-supplied value. Note that the length includes the 10x Barcode and UMI sequences, so do not set this below 26. This and r2-length are useful options for determining the optimal read length for sequencing. Default: do not trim Read 1.
r2-length: Optional. Limit the length of the input Read 2 sequence of Feature Barcode libraries to the first N bases, where N is a user-supplied value. Trimming occurs before sequencing metrics are computed; therefore, limiting the length of Read 2 may affect Q30 scores. Default: do not trim Read 2.

The [libraries] section specifies all the input library data (see also Specifying Input FASTQ Files).

Field: Description
fastq_id: Required. The Illumina sample name to analyze. This will be as specified in the sample sheet supplied to the demultiplexing software.
fastqs: Required. Absolute path to the folder containing the FASTQ files to be analyzed. Generally, this will be the fastq_path folder generated by the demultiplexing software. If the same library was sequenced on multiple flow cells, the FASTQs folder from each flow cell must be specified on a separate line in the CSV (see the 5' example). Doing this will treat all reads from the library, across flow cells, as one sample.
  • If you have multiple libraries for the sample, you will need to run cellranger multi on them individually, and then combine them with cellranger aggr.
feature_types: Required. The underlying feature type of the library (listed below).
lanes: Optional. The lanes associated with this sample, separated with a pipe (e.g., 1|2). Default: uses all lanes
physical_library_id: Optional. Library type. Note: by default, the library type is detected automatically based on specified feature_types (recommended). Users typically do not need to include the physical_library_id column in the CSV file.
subsample_rate: Optional. The rate at which reads from the provided FASTQ files are sampled. Must be strictly greater than 0 and less than or equal to 1.
chemistry: Optional (only applicable to FRP). Library-specific assay configuration. By default, the assay configuration is detected automatically (recommended). Typically, users will not need to specify a chemistry. However, options are available if needed (see chemistry options). Default: auto

These are the feature_types options for 3', Fixed RNA Profiling (FRP), and 5' libraries.

• Gene Expression
• Antibody Capture
• CRISPR Guide Capture
• Multiplexing Capture (for 3' Cell Multiplexing)
• VDJ
• VDJ-T
• VDJ-T-GD
• VDJ-B
• Antigen Capture
• Custom

Antigen Capture should be used only for BEAM libraries. For other (non-BEAM) antigen libraries (TotalSeq™-C, Immudex's dMHC Dextramer® libraries with dCODE Dextramers), set feature_types to Antibody Capture.

Setting feature_types to VDJ auto-detects the chain type.
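The [libraries] rules above can be sketched in Python as follows. The helpers `format_lanes`, `check_subsample_rate`, and `render_libraries` are hypothetical and not part of Cell Ranger, and the paths are placeholders. The sketch assumes the section consists of a header row of column names followed by one row per FASTQ folder, so a library sequenced on two flow cells appears as two rows with the same fastq_id.

```python
import csv
import io

FEATURE_TYPES = frozenset({
    "Gene Expression", "Antibody Capture", "CRISPR Guide Capture",
    "Multiplexing Capture", "VDJ", "VDJ-T", "VDJ-T-GD", "VDJ-B",
    "Antigen Capture", "Custom",
})


def format_lanes(lanes):
    """Join lane numbers with a pipe, as the lanes column expects."""
    return "|".join(str(lane) for lane in lanes)


def check_subsample_rate(rate):
    """subsample_rate must be strictly greater than 0 and at most 1."""
    if not 0 < rate <= 1:
        raise ValueError(f"subsample_rate out of range: {rate}")
    return rate


def render_libraries(rows):
    """rows: (fastq_id, fastqs, feature_types) tuples, one per FASTQ folder."""
    buffer = io.StringIO()
    buffer.write("[libraries]\n")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["fastq_id", "fastqs", "feature_types"])
    for fastq_id, fastqs, feature_types in rows:
        if feature_types not in FEATURE_TYPES:
            raise ValueError(f"unknown feature_types value: {feature_types!r}")
        writer.writerow([fastq_id, fastqs, feature_types])
    return buffer.getvalue()


# One Gene Expression library sequenced on two flow cells.
print(render_libraries([
    ("gex1", "/path/to/flowcell1/fastqs", "Gene Expression"),
    ("gex1", "/path/to/flowcell2/fastqs", "Gene Expression"),
]))
print(format_lanes([1, 2]))
```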

The [samples] section specifies sample-level options for multiplexed experiments.

Field: Description
sample_id: Required. A name to identify a multiplexed sample. May contain only alphanumeric characters, hyphens, and underscores, and must be fewer than 64 characters.
expect_cells: Optional. Override the pipeline’s auto-estimation of cells. See Gene Expression algorithm overview for details. If used, enter the expected number of recovered cells.
  • For FRP, specifying sample-level expect_cells in the [samples] section is only valid for the multiplex Fixed RNA Profiling configuration; note this column name has an underscore (_).
force_cells: Optional. Force pipeline to use this number of cells, bypassing cell detection. Default: detect cells using EmptyDrops.
  • For FRP, specifying sample-level force_cells in the [samples] section is only valid for the multiplex Fixed RNA Profiling configuration; note this column name has an underscore (_).
description: Optional. A description for the sample.
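The sample_id naming rule lends itself to a quick pre-flight check. The following Python sketch is one reading of that rule, not Cell Ranger's own validation: `is_valid_sample_id` is a hypothetical helper, and it assumes "alphanumeric" means ASCII letters and digits and that an empty name is not allowed.

```python
import re

# Letters, digits, hyphens, and underscores; 1 to 63 characters (fewer than 64).
SAMPLE_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,63}")


def is_valid_sample_id(sample_id):
    """Return True if sample_id satisfies the naming rule sketched above."""
    return SAMPLE_ID_PATTERN.fullmatch(sample_id) is not None


for candidate in ("sample_1", "tumor-A", "sample 1", "a" * 64):
    print(repr(candidate[:12]), is_valid_sample_id(candidate))
```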

This [samples] option only applies to 3' Cell Multiplexing data analysis.

Field: Description
cmo_ids: Required. The Cell Multiplexing oligo IDs used to multiplex this sample. Only input CMOs used in the experiment. If multiple CMOs were used for a sample, separate IDs with a pipe (e.g., CMO301|CMO302).

This [samples] option only applies to Fixed RNA Profiling data analysis.

Field: Description
probe_barcode_ids: Required. The Fixed RNA Probe Barcode IDs used for this sample, and for multiplex GEX + Antibody Capture libraries, the corresponding Antibody Multiplexing Barcode IDs.
  • We recommend specifying both barcodes in the config CSV (e.g., BC001+AB001) when an Antibody Capture library is present. The barcode pair order is BC+AB and they are separated with a "+" (no spaces). Alternatively, you can specify the Probe Barcode ID alone and Cell Ranger’s barcode pairing auto-detection algorithm will automatically match to the corresponding Antibody Multiplexing Barcode.
  • If multiple Probe Barcodes were used for a sample, separate IDs with a pipe (e.g., BC001|BC002).
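To make the separators concrete, the Python sketch below splits a probe_barcode_ids value into (Probe Barcode, Antibody Multiplexing Barcode) pairs. `parse_probe_barcode_ids` is a hypothetical helper, not a Cell Ranger function. That the "+" and "|" separators can be combined in a single value is an assumption of this sketch.

```python
def parse_probe_barcode_ids(field):
    """Split a probe_barcode_ids value into (probe, antibody) pairs.

    The antibody entry is None when only the Probe Barcode ID is given.
    """
    if any(ch.isspace() for ch in field):
        raise ValueError("probe_barcode_ids must not contain spaces")
    pairs = []
    for entry in field.split("|"):          # "|" separates multiple barcodes
        probe, plus, antibody = entry.partition("+")  # "+" joins a BC+AB pair
        if not probe or (plus and not antibody):
            raise ValueError(f"malformed barcode entry: {entry!r}")
        pairs.append((probe, antibody or None))
    return pairs


print(parse_probe_barcode_ids("BC001+AB001"))
print(parse_probe_barcode_ids("BC001|BC002"))
```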

The [vdj] section specifies information about the V(D)J library.

Field: Description
reference: Required for V(D)J Immune Profiling libraries. Absolute path to the folder containing a 10x Genomics-compatible V(D)J reference.
inner-enrichment-primers: Optional. If inner enrichment primers other than those provided in the 10x Genomics kits are used, they need to be specified here as a text file with one primer per line.
r1-length: Optional. Limit the length of the input Read 1 sequence of V(D)J libraries to the first N bases, where N is a user-supplied value. Note that the length includes the Barcode and UMI sequences, so do not set this below 26. This and r2-length are useful options for determining the optimal read length for sequencing. Default: do not trim Read 1.
r2-length: Optional. Limit the length of the input Read 2 sequence of V(D)J libraries to the first N bases, where N is a user-supplied value. Trimming occurs before sequencing metrics are computed; therefore, limiting the length of Read 2 may affect Q30 scores. Default: do not trim Read 2.

The [antigen-specificity] section is recommended if an Antigen Capture (BEAM) library is present. It is needed to calculate the antigen specificity score.

Field: Description
control_id: Required. A user-defined ID for any negative controls used in the T/BCR Antigen Capture assay. Must match the id specified in the Feature Reference CSV. May only include ASCII characters and must not use whitespace, slash, quote, or comma characters. Each ID must be unique and must not collide with a gene identifier from the transcriptome.
mhc_allele: The MHC allele for TCR Antigen Capture libraries. Must match the mhc_allele name specified in the Feature Reference CSV. For BCR Antigen Capture libraries, the analysis runs with or without this column header; if you keep the header, leave its rows blank.
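The character rules for control_id can be checked before the config is written. The Python sketch below is illustrative only: `is_valid_control_id` and `render_antigen_specificity` are hypothetical helpers, the control ID and allele name are made-up examples, and the sketch assumes the section is a header row followed by one row per control. It does not check IDs against the Feature Reference CSV or the transcriptome.

```python
FORBIDDEN_CHARS = "/'\","  # slash, single quote, double quote, comma


def is_valid_control_id(control_id):
    """ASCII only, with no whitespace, slash, quote, or comma characters."""
    if not control_id or not control_id.isascii():
        return False
    return not any(ch.isspace() or ch in FORBIDDEN_CHARS for ch in control_id)


def render_antigen_specificity(controls):
    """controls: (control_id, mhc_allele) tuples; mhc_allele may be ''."""
    ids = [control_id for control_id, _ in controls]
    if len(set(ids)) != len(ids):
        raise ValueError("each control_id must be unique")
    lines = ["[antigen-specificity]", "control_id,mhc_allele"]
    for control_id, mhc_allele in controls:
        if not is_valid_control_id(control_id):
            raise ValueError(f"invalid control_id: {control_id!r}")
        lines.append(f"{control_id},{mhc_allele}")
    return "\n".join(lines) + "\n"


# TCR example with a made-up control ID and allele name.
print(render_antigen_specificity([("negative_control_1", "HLA-A*02:01")]))
# BCR example: header kept, allele left blank.
print(render_antigen_specificity([("negative_control_1", "")]))
```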