Cell Ranger ARC's pipelines analyze sequencing data produced from Chromium Epi Multiome.
The Cell Ranger ARC pipeline can only analyze Gene Expression and ATAC data together. It must not be used to analyze Gene Expression or ATAC alone.
The analysis involves the following steps:
- Run BCL Convert on the Illumina BCL output folder to generate FASTQ files.
- Run a separate instance of cellranger-arc count for each GEM well that was demultiplexed in the previous step.
For the following example, assume that one sample is processed using Epi Multiome to generate a Multiome ATAC library and a Multiome Gene Expression (GEX) library. The Multiome ATAC library is sequenced on flow cell HNATACSQXX and the Illumina BCL output is located in /sequencing/Sample_ATAC_HNATACSQXX; similarly, the Multiome GEX library is sequenced on flow cell HNGEXSQXXX and the Illumina BCL output is located in /sequencing/Sample_GEX_HNGEXSQXXX.
Reference packages for human (GRCh38) and mouse (mm10) compatible with Cell Ranger ARC are available for download. You can also create a reference package using cellranger-arc mkref starting with a genome assembly FASTA file, a GTF file of gene annotations, and optionally a file of transcription factor motifs in JASPAR format.
Construct a 3-column libraries CSV file that specifies the location of the ATAC and GEX FASTQ files associated with the sample.
| Column Name | Description |
|---|---|
| fastqs | A fully qualified path to the directory containing the demultiplexed FASTQ files for this sample. This field does not accept comma-delimited paths. If you have multiple sets of FASTQ files for this library, add an additional row, and use the same library_type value. |
| sample | Sample name assigned as the Sample_ID in the demultiplexing sample sheet. |
| library_type | This field is case-sensitive and must exactly match Chromatin Accessibility for a Multiome ATAC library and Gene Expression for a Multiome GEX library. |
For our example, the file would look as follows:
fastqs,sample,library_type
/home/jdoe/runs/HNGEXSQXXX/outs/fastq_path,example,Gene Expression
/home/jdoe/runs/HNATACSQXX/outs/fastq_path,example,Chromatin Accessibility
The CSV contains two rows, one per library; in this example the GEX and ATAC libraries were sequenced on different flow cells, so their FASTQ files are in different directories. The library_type must be either Gene Expression or Chromatin Accessibility.
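The rules in the table above can be checked before launching a run. The following is a minimal sketch, not part of Cell Ranger ARC: the function name validate_libraries_csv is hypothetical, and it assumes the header appears in the same column order as the example. It rejects relative fastqs paths, misspelled or wrongly cased library_type values, and files that lack either of the two library types.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Cell Ranger ARC): sanity-check a
# libraries CSV before passing it to cellranger-arc count via --libraries.
# Assumption: the header uses the column order shown in the example above.
validate_libraries_csv() {
    local csv="$1"
    awk -F',' '
        { sub(/\r$/, "") }                      # tolerate Windows line endings
        NR == 1 {
            if ($0 != "fastqs,sample,library_type") {
                print "unexpected header: " $0; bad = 1
            }
            next
        }
        $0 == "" { next }                       # ignore blank lines
        NF != 3 {
            print "line " NR ": expected 3 columns, found " NF; bad = 1; next
        }
        $1 !~ /^\// {                           # fastqs must be fully qualified
            print "line " NR ": fastqs is not an absolute path: " $1; bad = 1
        }
        $3 != "Gene Expression" && $3 != "Chromatin Accessibility" {
            print "line " NR ": invalid library_type: " $3; bad = 1
        }
        { seen[$3] = 1 }                        # record which library types occur
        END {
            if (!seen["Gene Expression"])         { print "no Gene Expression row"; bad = 1 }
            if (!seen["Chromatin Accessibility"]) { print "no Chromatin Accessibility row"; bad = 1 }
            exit (bad ? 1 : 0)
        }
    ' "$csv" >&2
}
```

Calling validate_libraries_csv /home/jdoe/runs/libraries.csv returns a zero exit status when the file passes these checks and prints one message per problem to standard error otherwise.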
To generate single cell feature counts and secondary analyses for a single library, run cellranger-arc count with the following arguments. For a complete listing of the arguments accepted, see the Command Line Argument Reference, or run cellranger-arc count --help.
Cell type annotation is part of the count pipeline. It runs locally (not cloud-based) by default, unless turned off. For details on all available cell type annotation models and analysis implementations for Multiome ATAC + Gene Expression data, refer to the Cell Annotation Data pipeline documentation.
After determining these input arguments, run cellranger-arc:
$ cd /home/jdoe/runs
$ cellranger-arc count --id=sample345 \
--reference=/opt/refdata-cellranger-arc-GRCh38-2020-A-2.0.0 \
--libraries=/home/jdoe/runs/libraries.csv \
--create-bam=false \
--localcores=16 \
--localmem=64
Following a series of checks to validate input arguments, cellranger-arc count pipeline stages will begin to run:
Martian Runtime - v4.0.5
Running preflight checks (please wait)...
Checking FASTQ folder...
Checking reference...
Checking reference_path (/opt/refdata-cellranger-arc-GRCh38-2020-A-2.0.0) on compute-server32...
Checking chemistry...
Checking optional arguments...
...
By default, cellranger-arc will use all the cores available on your system to execute pipeline stages. You can specify a different number of cores to use with the --localcores option; for example, --localcores=16 will limit cellranger-arc to using up to sixteen cores at once. Similarly, --localmem will restrict the amount of memory (in GB) used by cellranger-arc.
The pipeline will create a new folder named with the sample ID you specified (e.g. /home/jdoe/runs/sample345) for its output. If this folder already exists, cellranger-arc will assume it is an existing pipestance and attempt to resume running it.
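Because an existing folder with the same ID is treated as a pipestance to resume, it can be useful to check for one before starting what is meant to be a fresh run. The sketch below is one way to do that; check_fresh_run_dir is a hypothetical helper name, not a Cell Ranger ARC command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fail if the output folder for a run ID already exists,
# since cellranger-arc would treat it as an existing pipestance and resume it.
# Arguments: run ID (the value given to --id) and, optionally, the directory
# the pipeline is launched from (defaults to the current directory).
check_fresh_run_dir() {
    local run_id="$1"
    local parent="${2:-$PWD}"
    if [ -e "$parent/$run_id" ]; then
        echo "Output folder $parent/$run_id already exists; cellranger-arc would try to resume it." >&2
        return 1
    fi
}
```

For the example run, check_fresh_run_dir sample345 /home/jdoe/runs can be chained with && in front of the cellranger-arc count command so that the pipeline only starts when no sample345 folder is present.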
A successful cellranger-arc count run should conclude with a message similar to this:
Outputs:
- Secondary analysis outputs:
clustering:
atac: {
...
}
gex: {
...
}
dimensionality_reduction:
atac: {
...
}
gex: {
...
}
feature_linkage:
...
tf_analysis:
...
- Run summary HTML: /home/jdoe/runs/sample345/outs/web_summary.html
- Run summary metrics CSV: /home/jdoe/runs/sample345/outs/summary.csv
- Per barcode summary metrics: /home/jdoe/runs/sample345/outs/per_barcode_metrics.csv
- Filtered feature barcode matrix MEX: /home/jdoe/runs/sample345/outs/filtered_feature_bc_matrix
- Filtered feature barcode matrix HDF5: /home/jdoe/runs/sample345/outs/filtered_feature_bc_matrix.h5
- Raw feature barcode matrix MEX: /home/jdoe/runs/sample345/outs/raw_feature_bc_matrix
- Raw feature barcode matrix HDF5: /home/jdoe/runs/sample345/outs/raw_feature_bc_matrix.h5
- Loupe browser visualization file: /home/jdoe/runs/sample345/outs/cloupe.cloupe
- GEX Position-sorted alignments BAM: /home/jdoe/runs/sample345/outs/gex_possorted_bam.bam
- GEX Position-sorted alignments BAM index: /home/jdoe/runs/sample345/outs/gex_possorted_bam.bam.bai
- GEX Per molecule information file: /home/jdoe/runs/sample345/outs/gex_molecule_info.h5
- ATAC Position-sorted alignments BAM: /home/jdoe/runs/sample345/outs/atac_possorted_bam.bam
- ATAC Position-sorted alignments BAM index: /home/jdoe/runs/sample345/outs/atac_possorted_bam.bam.bai
- ATAC Per fragment information file: /home/jdoe/runs/sample345/outs/atac_fragments.tsv.gz
- ATAC Per fragment information index: /home/jdoe/runs/sample345/outs/atac_fragments.tsv.gz.tbi
- ATAC peak locations: /home/jdoe/runs/sample345/outs/atac_peaks.bed
- ATAC smoothed transposition site track: /home/jdoe/runs/sample345/outs/atac_cut_sites.bigwig
- ATAC peak annotations based on proximal genes: /home/jdoe/runs/sample345/outs/atac_peak_annotation.tsv
Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!
yyyy-mm-dd hh:mm:ss Shutting down.
Saving pipestance info to "sample345/sample345.mri.tgz"
Once cellranger-arc count has successfully completed, you can browse the resulting summary HTML file in any supported web browser, open the .cloupe file in Loupe Browser, or refer to the Understanding Output section to explore the data manually.
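Before exploring the results, a quick check that the key files from the output listing above are present can catch an incomplete run. This is a minimal sketch: check_arc_outputs is a hypothetical helper name, and the file list is a subset of the example output chosen for illustration. BAM files are left out because the example command sets --create-bam=false.

```shell
#!/usr/bin/env bash
# Hypothetical helper: confirm that selected output files exist under
# <run folder>/outs. The file names are taken from the example output listing.
check_arc_outputs() {
    local outs="$1/outs"
    local missing=0
    local f
    for f in web_summary.html summary.csv per_barcode_metrics.csv \
             filtered_feature_bc_matrix.h5 raw_feature_bc_matrix.h5 \
             atac_fragments.tsv.gz atac_peaks.bed; do
        if [ ! -e "$outs/$f" ]; then
            echo "missing: $outs/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For the example run, check_arc_outputs /home/jdoe/runs/sample345 returns a zero exit status when every listed file is found and reports each missing file on standard error otherwise.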