Running Cell Ranger reanalyze

This tutorial is written with Cell Ranger v6.1.2. Commands are compatible with later versions of Cell Ranger, unless noted otherwise.

The cellranger reanalyze pipeline is optional. It allows you to rerun the secondary analysis for a completed cellranger count or aggr run with different parameters. It is faster than running the whole cellranger count pipeline again because it starts from the feature-barcode matrix rather than from FASTQ files, so all of the alignment and UMI counting is already done.

We will call our working directory the yard. Start by making a directory.

mkdir ~/yard/run_cellranger_reanalyze
cd ~/yard/run_cellranger_reanalyze

One of the more common reanalysis combinations is to increase the number of principal components (PCs) used in clustering while also increasing the maximum number of clusters used by the k-means algorithm. With one of the publicly available PBMC datasets, we might increase the number of PCs and clusters to see whether we can better separate some of the rarer T-cell populations, such as T-regs. With this as our aim, we will use a 10,000 PBMC dataset rather than the 1,000 PBMC experiment. For this run we only need to download the filtered feature-barcode matrix in H5 format.

wget https://cf.10xgenomics.com/samples/cell-exp/3.0.0/pbmc_10k_v3/pbmc_10k_v3_filtered_feature_bc_matrix.h5
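Before moving on, it can be worth confirming that the download really is an HDF5 file and not, for example, an HTML error page saved under the same name. The following is a minimal sketch, not part of Cell Ranger: the function name is_hdf5 is hypothetical, and it assumes a shell with head -c and od available. It relies on the fact that an HDF5 file without a user block begins with the 8-byte signature 89 48 44 46 0d 0a 1a 0a.

```shell
# Hypothetical helper (not a Cell Ranger command): succeed only if the file
# starts with the 8-byte HDF5 signature "\x89HDF\r\n\x1a\n".
is_hdf5() {
    # Dump the first 8 bytes as hex, without offsets, and strip all whitespace.
    sig=$(head -c 8 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \t\n')
    [ "$sig" = "894844460d0a1a0a" ]
}
```

For example, is_hdf5 pbmc_10k_v3_filtered_feature_bc_matrix.h5 && echo "looks like HDF5" prints the message only when the signature matches. This checks the file header only, so it does not detect a download that was truncated partway through.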

Next run the cellranger reanalyze command with --help to get the usage and a full list of modifiable parameters.

cellranger reanalyze --help

The output looks similar to this:

cellranger-reanalyze
Re-run secondary analysis (dimensionality reduction, clustering, etc)

USAGE:
    cellranger reanalyze [FLAGS] [OPTIONS] --id <ID> --matrix <MATRIX_H5>

FLAGS:
        --dry            Do not execute the pipeline. Generate a pipeline invocation (.mro) file and stop
        --disable-ui     Do not serve the web UI
        --noexit         Keep web UI running after pipestance completes or fails
        --nopreflight    Skip preflight checks
    -h, --help           Prints help information
...

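From this we can see that the command requires a run ID (--id) and a matrix H5 file (--matrix). To change the analysis settings, we also supply a parameters CSV file with --params. All of the modifiable parameters are listed on the Customized Secondary Analysis using cellranger reanalyze page.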

Next make the parameters CSV file. Here we use nano, but you can use any text editor.

nano reanalyze_10k_pbmcs.csv

Paste the following into your text file:

num_principal_comps,14
max_clusters,15

Save the file as reanalyze_10k_pbmcs.csv.
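If you prefer not to use an interactive editor, the same file can be written from the shell with a here-document. The sketch below also includes a quick shape check. Both function names (write_params and check_params) are hypothetical, and the check only confirms that each non-empty line has the form name,value; it does not confirm that Cell Ranger recognizes the parameter names or accepts the values.

```shell
# Write the two parameters used in this tutorial, one "name,value" pair per line.
write_params() {
    cat > "$1" <<'EOF'
num_principal_comps,14
max_clusters,15
EOF
}

# Hypothetical sanity check: every non-empty line must have exactly two
# non-empty comma-separated fields. Offending lines are reported on stderr.
check_params() {
    awk -F, 'BEGIN { bad = 0 }
             NF == 0 { next }
             NF != 2 || $1 == "" || $2 == "" {
                 bad = 1
                 print "bad line " NR ": " $0 > "/dev/stderr"
             }
             END { exit bad }' "$1"
}
```

A typical use would be write_params reanalyze_10k_pbmcs.csv followed by check_params reanalyze_10k_pbmcs.csv, which exits with a non-zero status if any line is malformed.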

Next, build the command:

cellranger reanalyze --id=10k_pbmc_reanalyze_pc_clust \
    --matrix=pbmc_10k_v3_filtered_feature_bc_matrix.h5 \
    --params=reanalyze_10k_pbmcs.csv
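Because the command refers to two input files and writes its results to a directory named after --id, a small pre-run check can catch a mistyped path before the pipeline starts. This is a sketch under stated assumptions: reanalyze_cmd is a hypothetical helper, it uses only the --id, --matrix, and --params options shown above, and it prints the command without running it. Refusing to proceed when a directory with the run ID already exists is a cautious choice made for this sketch, not documented Cell Ranger behavior.

```shell
# Hypothetical helper: validate inputs, then print (not run) the command.
# Usage: reanalyze_cmd <run_id> <matrix_h5> <params_csv>
reanalyze_cmd() {
    run_id=$1
    matrix=$2
    params=$3
    # Both input files must exist and be non-empty.
    for f in "$matrix" "$params"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            return 1
        fi
    done
    # Assumption for this sketch: start only when no entry named after the
    # run ID exists in the current directory.
    if [ -e "$run_id" ]; then
        echo "output directory already exists: $run_id" >&2
        return 1
    fi
    printf 'cellranger reanalyze --id=%s --matrix=%s --params=%s\n' \
        "$run_id" "$matrix" "$params"
}
```

Calling reanalyze_cmd 10k_pbmc_reanalyze_pc_clust pbmc_10k_v3_filtered_feature_bc_matrix.h5 reanalyze_10k_pbmcs.csv prints the single-line form of the command above once the checks pass.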

The output looks similar to this:

Martian Runtime - v4.0.6
...
Running preflight checks (please wait)......
...
Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!

2022-01-04 16:21:10 Shutting down.

Now that the cellranger reanalyze pipeline is finished, look at the output.

ls -1 10k_pbmc_reanalyze_pc_clust/outs/

The output looks similar to this:

├── analysis
├── cloupe.cloupe
├── filtered_feature_bc_matrix
├── filtered_feature_bc_matrix.h5
├── params.csv
└── web_summary.html
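To check a run directory against the listing above without reading it by eye, a short loop is enough. This is a minimal sketch: check_outs is a hypothetical helper, and the expected names are taken directly from the listing in this tutorial, so other Cell Ranger versions or pipelines may produce a different set.

```shell
# Hypothetical helper: report any expected entry missing from an outs/ folder.
# Returns 0 when everything is present and 1 otherwise.
check_outs() {
    missing=0
    for name in analysis cloupe.cloupe filtered_feature_bc_matrix \
                filtered_feature_bc_matrix.h5 params.csv web_summary.html; do
        if [ ! -e "$1/$name" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

For this tutorial the call would be check_outs 10k_pbmc_reanalyze_pc_clust/outs, which prints nothing when all six entries exist.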

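By listing the contents of the clustering folder inside the analysis folder, you can see that the pipeline produced k-means results for every value from 2 up to the requested maximum of 15 clusters, alongside the graph-based clustering (graphclust).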

ls -1 10k_pbmc_reanalyze_pc_clust/outs/analysis/clustering/

graphclust
kmeans_10_clusters
kmeans_11_clusters
kmeans_12_clusters
kmeans_13_clusters
kmeans_14_clusters
kmeans_15_clusters
kmeans_2_clusters
kmeans_3_clusters
kmeans_4_clusters
kmeans_5_clusters
kmeans_6_clusters
kmeans_7_clusters
kmeans_8_clusters
kmeans_9_clusters
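Note that ls sorts names lexically, so kmeans_10_clusters appears before kmeans_2_clusters. The sketch below lists the k-means folders in numeric order of k and reports the largest k present, which should match the max_clusters value in the parameters file. The function names list_kmeans and max_kmeans_k are hypothetical, and the folder naming pattern kmeans_<k>_clusters is taken from the listing above.

```shell
# List k-means result folders in numeric order of k.
# Entries that do not match kmeans_<k>_clusters (such as graphclust) are skipped.
list_kmeans() {
    ls -1 "$1" | grep '^kmeans_[0-9][0-9]*_clusters$' | sort -t_ -k2,2n
}

# Print the largest k for which a k-means result folder exists.
max_kmeans_k() {
    list_kmeans "$1" | tail -n 1 | sed 's/^kmeans_\([0-9][0-9]*\)_clusters$/\1/'
}
```

Running max_kmeans_k 10k_pbmc_reanalyze_pc_clust/outs/analysis/clustering on a directory with the contents listed above prints 15.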

From here, explore the data further using the Loupe Browser or a number of other publicly available tools.