Running Cell Ranger multi with 5' Immune Profiling Data

To follow along, you must:

  • Have basic UNIX command line experience
  • Meet the Cell Ranger system requirements
  • Download and install the Cell Ranger software
  • Choose a compute platform
  • Have access to a UNIX command prompt

We will work with the Human B cells dataset from a Healthy Donor (1k cells).

Watch this short video tutorial or follow text instructions to download example FASTQs.

Open up a terminal window. You may log in to a remote server or choose to perform the compute on your local machine. Refer to the System Requirements page for details.

In the working directory, create a new folder called dataset-multi-practice/ and cd into that folder:

mkdir dataset-multi-practice
cd dataset-multi-practice

Download the input FASTQ files:

curl -LO https://cf.10xgenomics.com/samples/cell-vdj/6.0.0/sc5p_v2_hs_B_1k_multi_5gex_b_Multiplex/sc5p_v2_hs_B_1k_multi_5gex_b_Multiplex_fastqs.tar

A file named sc5p_v2_hs_B_1k_multi_5gex_b_Multiplex_fastqs.tar should appear in your directory when you list files with the ls command.
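Before extracting, you can inspect the archive to confirm that the download completed. The sketch below lists the unique top-level entries of a tar file without unpacking it. The function name tar_top_level is hypothetical and not part of Cell Ranger.

```shell
# Hypothetical helper: list the unique top-level entries of a tar archive
# without extracting it. A truncated download usually makes tar report an error.
tar_top_level() {
  tar -tf "$1" | cut -d/ -f1 | sort -u
}
```

For example, run tar_top_level sc5p_v2_hs_B_1k_multi_5gex_b_Multiplex_fastqs.tar from inside dataset-multi-practice/.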

Uncompress the FASTQs:

tar -xf sc5p_v2_hs_B_1k_multi_5gex_b_Multiplex_fastqs.tar

You should now see a folder called sc5p_v2_hs_B_1k_multi_5gex_b_fastqs that contains two subfolders, sc5p_v2_hs_B_1k_5gex_fastqs and sc5p_v2_hs_B_1k_b_fastqs.
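As a quick sanity check after extraction, the sketch below counts the FASTQ files in a folder. It assumes the example FASTQs are gzip-compressed and end in .fastq.gz. The function name count_fastqs is hypothetical.

```shell
# Hypothetical helper: count the *.fastq.gz files under a directory.
# Assumes the FASTQs are gzip-compressed (.fastq.gz extension).
count_fastqs() {
  find "$1" -type f -name '*.fastq.gz' | wc -l | tr -d ' '
}
```

Running count_fastqs on each of the two subfolders should print a non-zero number. A count of 0 suggests that the extraction did not complete or that the command was run from a different directory.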

Navigate back to the working directory:

cd ..

Double check you are in the correct directory by running the ls command; the working directory should have the dataset-multi-practice folder.

Download the pre-built human reference transcriptome to the working directory and uncompress it:

curl -O https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-2020-A.tar.gz
tar -xf refdata-gex-GRCh38-2020-A.tar.gz

Next, download the pre-built V(D)J reference to the working directory and uncompress it:

curl -O https://cf.10xgenomics.com/supp/cell-vdj/refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0.tar.gz
tar -xf refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0.tar.gz

In your working directory, create a new CSV file called multi_config.csv using your text editor of choice:

nano multi_config.csv

Copy and paste this text into the newly created file, and customize file paths:

[gene-expression]
reference,/jane.doe/working-directory/refdata-gex-GRCh38-2020-A
expect-cells,1000

[vdj]
reference,/jane.doe/working-directory/refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0

[libraries]
fastq_id,fastqs,lanes,feature_types,subsample_rate
sc5p_v2_hs_B_1k_5gex,/jane.doe/working-directory/dataset-multi-practice/sc5p_v2_hs_B_1k_multi_5gex_b_fastqs/sc5p_v2_hs_B_1k_5gex_fastqs,1|2,gene expression,
sc5p_v2_hs_B_1k_b,/jane.doe/working-directory/dataset-multi-practice/sc5p_v2_hs_B_1k_multi_5gex_b_fastqs/sc5p_v2_hs_B_1k_b_fastqs,1|2,vdj,
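Because every path in the config must be customized, it can be less error-prone to generate the file from a single working-directory variable. The sketch below is one way to do that, assuming your directory layout matches this tutorial exactly. The function name write_multi_config is hypothetical.

```shell
# Hypothetical helper: write a multi config CSV whose paths are rooted at a
# given working directory. Usage: write_multi_config WORKDIR OUTFILE
# Assumes the references and FASTQs sit where this tutorial places them.
write_multi_config() {
  workdir=$1
  outfile=$2
  fastqs="$workdir/dataset-multi-practice/sc5p_v2_hs_B_1k_multi_5gex_b_fastqs"
  cat > "$outfile" <<EOF
[gene-expression]
reference,$workdir/refdata-gex-GRCh38-2020-A
expect-cells,1000

[vdj]
reference,$workdir/refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0

[libraries]
fastq_id,fastqs,lanes,feature_types,subsample_rate
sc5p_v2_hs_B_1k_5gex,$fastqs/sc5p_v2_hs_B_1k_5gex_fastqs,1|2,gene expression,
sc5p_v2_hs_B_1k_b,$fastqs/sc5p_v2_hs_B_1k_b_fastqs,1|2,vdj,
EOF
}
```

From the working directory, write_multi_config "$PWD" multi_config.csv would produce a config with absolute paths for your machine.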

Use your text editor's save command to save the file. In nano, press Ctrl+X, type y to confirm, and then press Enter.

A customizable multi config CSV template is available for download on the example dataset page, under the Input Files tab.

Once you have all the necessary files, make a new directory called runs/ in your working directory and cd into it:

mkdir runs/
cd runs/

You will run cellranger multi in the runs/ directory.

After downloading the FASTQ files, the reference transcriptome, and a V(D)J reference, you are ready to run cellranger multi.
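A mistyped path in the config is a common reason for a run to fail early, so it can help to check the paths first. The sketch below extracts every path from a config laid out like the one in this tutorial and reports any that do not exist. Both function names are hypothetical, and the parsing covers only reference rows and the second column of [libraries] rows. Relative paths are checked against the current directory.

```shell
# Hypothetical helper: print every path referenced in a multi config CSV.
# Only handles the layout used in this tutorial: "reference,<path>" rows and
# [libraries] rows whose second column is a FASTQ directory.
config_paths() {
  awk -F, '
    /^\[/ { section = $0; next }              # remember the current section
    $1 == "reference" { print $2; next }      # reference rows in any section
    section == "[libraries]" && NF > 1 && $1 != "fastq_id" { print $2 }
  ' "$1"
}

# Hypothetical helper: report referenced paths that do not exist.
# Returns non-zero if at least one path is missing.
check_config_paths() {
  missing=0
  while IFS= read -r p; do
    [ -n "$p" ] || continue                   # skip the empty line of an empty result
    if [ ! -e "$p" ]; then
      echo "missing: $p"
      missing=1
    fi
  done <<EOF
$(config_paths "$1")
EOF
  return "$missing"
}
```

From the runs/ directory, check_config_paths ../multi_config.csv prints nothing and returns 0 when every referenced path exists.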

Print the usage statement to get a list of all the options:

cellranger multi --help

The output should look similar to:

user_prompt$ cellranger multi --help
cellranger-multi
Analyze multiplexed data or combined gene expression/immune profiling/feature barcode data

USAGE:
    cellranger multi [FLAGS] [OPTIONS] --id --csv

FLAGS:
        --dry            Do not execute the pipeline. Generate a pipeline invocation (.mro) file and stop
        --disable-ui     Do not serve the web UI
        --noexit         Keep web UI running after pipestance completes or fails
        --nopreflight    Skip preflight checks
    -h, --help           Prints help information

OPTIONS:
        --id             A unique run id and output folder name [a-zA-Z0-9_-]+
        --description    Sample description to embed in output files [default: ]
        --csv            Path of CSV file enumerating input libraries and analysis parameters
        --jobmode        Job manager to use. Valid options: local (default), sge, lsf, slurm or path to a .template file. Search for help on "Cluster Mode" at support.10xgenomics.com for more details on configuring the pipeline to use a compute cluster [default: local]
        --localcores     Set max cores the pipeline may request at one time. Only applies to local jobs
....

Options used in this tutorial

  • --id: The id argument must be a unique run ID. We will call this run HumanB_Cell_multi based on the sample type in the example dataset.
  • --csv: Path to the multi config CSV file enumerating input libraries and analysis parameters. Your multi_config.csv file is in the working directory. When executing cellranger multi from the runs directory, the relative path should be: ../multi_config.csv
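The usage statement shows that a run ID may contain only characters matching [a-zA-Z0-9_-]+. The sketch below checks a candidate ID against that pattern before you launch a run. The function name valid_run_id is hypothetical.

```shell
# Hypothetical helper: succeed only if the argument is non-empty and contains
# nothing but letters, digits, underscores, and hyphens ([a-zA-Z0-9_-]+).
valid_run_id() {
  case "$1" in
    '' | *[!a-zA-Z0-9_-]*) return 1 ;;   # empty, or contains a disallowed character
    *) return 0 ;;
  esac
}
```

For example, valid_run_id HumanB_Cell_multi succeeds, while an ID containing a space or a slash fails.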

From within the working-directory/runs/ directory, run cellranger multi:

cellranger multi --id=HumanB_Cell_multi --csv=../multi_config.csv

The beginning of the run output looks similar to this:

user_prompt$ cellranger multi --id=HumanB_Cell_multi --csv=/jane.doe/working-directory/multi_config.csv
Martian Runtime - v4.0.6
Serving UI at http://bespin1.fuzzplex.com:43129?auth=tIgY0u8ax70yeWhWKF61SkSgJDKvOIgZ-yjxYNJXXtY
Running preflight checks (please wait)...
2022-01-06 16:36:56 [runtime] (ready) ID.HumanB_Cell_multi.SC_MULTI_CS.PARSE_MULTI_CONFIG
2022-01-06 16:36:56 [runtime] (run:hydra) ID.HumanB_Cell_multi.SC_MULTI_CS.PARSE_MULTI_CONFIG.fork0.chnk0.main
2022-01-06 16:37:26 [runtime] (chunks_complete) ID.HumanB_Cell_multi.SC_MULTI_CS.PARSE_MULTI_CONFIG
2022-01-06 16:37:26 [runtime] (ready) ID.HumanB_Cell_multi.SC_MULTI_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.DETECT_COUNT_CHEMISTRY
2022-01-06 16:37:26 [runtime] (run:hydra) ID.HumanB_Cell_multi.SC_MULTI_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.DETECT_COUNT_CHEMISTRY.fork0.chnk0.main
....

When the output of the cellranger multi command says, “Pipestance completed successfully!”, the job is done:

web_summary: /jane.doe/working-directory/runs/HumanB_Cell_multi/outs/per_sample_outs/HumanB_Cell_multi/web_summary.html
metrics_summary: /jane.doe/working-directory/runs/HumanB_Cell_multi/outs/per_sample_outs/HumanB_Cell_multi/metrics_summary.csv
}

Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!
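If you save the terminal output to a file, you can check for the completion message without scrolling back. The sketch below assumes you captured the output yourself, for example with tee. The log file name multi_run.log and the function name pipestance_succeeded are hypothetical.

```shell
# Assumed capture step (multi_run.log is a hypothetical file name):
#   cellranger multi --id=HumanB_Cell_multi --csv=../multi_config.csv 2>&1 | tee multi_run.log

# Hypothetical helper: succeed if the captured output contains the
# completion message printed at the end of a successful run.
pipestance_succeeded() {
  grep -Fq 'Pipestance completed successfully!' "$1"
}
```

For example, pipestance_succeeded multi_run.log && echo "run finished" prints the message only when the run completed.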

A successful cellranger multi run produces a new directory called HumanB_Cell_multi/ (based on the --id flag specified during the run). The contents of the HumanB_Cell_multi/ directory:

── runs
    └── HumanB_Cell_multi
        ├── _cmdline
        ├── _filelist
        ├── _finalstate
        ├── HumanB_Cell_multi.mri.tgz
        ├── _invocation
        ├── _jobmode
        ├── _log
        ├── _mrosource
        ├── outs/
        ├── _perf
        ├── SC_MULTI_CS/
        ├── _sitecheck
        ├── _tags
        ├── _timestamp
        ├── _uuid
        ├── _vdrkill
        └── _versions

The outs/ directory contains all important output files generated by the cellranger multi pipeline:

── runs
    └── HumanB_Cell_multi
        └── outs
            ├── config.csv
            ├── multi
            │   ├── count
            │   │   ├── raw_cloupe.cloupe
            │   │   ├── raw_feature_bc_matrix
            │   │   ├── raw_feature_bc_matrix.h5
            │   │   ├── raw_molecule_info.h5
            │   │   ├── unassigned_alignments.bam
            │   │   └── unassigned_alignments.bam.bai
            │   └── vdj_b
            │       ├── all_contig_annotations.bed
            │       ├── all_contig_annotations.csv
            │       ├── all_contig_annotations.json
            │       ├── all_contig.bam
            │       ├── all_contig.bam.bai
            │       ├── all_contig.fasta
            │       ├── all_contig.fasta.fai
            │       └── all_contig.fastq
            ├── per_sample_outs
            │   └── HumanB_Cell_multi
            │       ├── count
            │       ├── metrics_summary.csv
            │       ├── vdj_b
            │       └── web_summary.html
            └── vdj_reference
                ├── fasta
                │   ├── donor_regions.fa
                │   └── regions.fa
                └── reference.json
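The per-sample summaries are usually the first files to open after a run. The sketch below prints the paths of web_summary.html and metrics_summary.csv for every sample folder under outs/per_sample_outs, assuming the output layout shown above. The function name list_sample_summaries is hypothetical.

```shell
# Hypothetical helper: print the web summary and metrics summary paths for
# every sample under RUN_DIR/outs/per_sample_outs.
# Usage: list_sample_summaries RUN_DIR
list_sample_summaries() {
  for sample_dir in "$1"/outs/per_sample_outs/*/; do
    [ -d "$sample_dir" ] || continue        # the glob matched nothing
    for f in web_summary.html metrics_summary.csv; do
      if [ -f "$sample_dir$f" ]; then
        echo "$sample_dir$f"
      fi
    done
  done
}
```

From the runs/ directory, list_sample_summaries HumanB_Cell_multi lists the summary files for the single sample in this tutorial.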