Specifying Input FASTQ Files for Cell Ranger ATAC

The cellranger-atac mkfastq pipeline is deprecated and will be removed in a future release. Please use Illumina's BCL Convert to generate Cell Ranger ATAC-compatible FASTQ files.

Cell Ranger ATAC requires Illumina FASTQ files as input, which typically come from running demultiplexing software (e.g., Illumina's BCL Convert). However, it is possible to use FASTQ files from other sources, such as a published dataset or the 10x Genomics bamtofastq tool.

To serve as inputs for cellranger-atac, FASTQ files should conform to the following naming conventions:

[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz

or

[Sample Name]_S1_[Read Type]_001.fastq.gz

Where Read Type is one of:

  • I1: Dual index i7 read (optional)
  • R1: Read 1
  • R2: Dual index i5 read
  • R3: Read 2
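The naming conventions above can be checked programmatically before starting a run. The following is a minimal sketch, not part of Cell Ranger ATAC: the names `FASTQ_NAME`, `FastqName`, and `parse_fastq_name` are hypothetical, and the pattern makes two assumptions, namely that the `S1` field may be `S` followed by any digits and that `I2` is also a valid read type (as in the alternative labeling described next).

```python
import re
from typing import NamedTuple, Optional

# Matches [Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz,
# with the lane field optional.
# Assumption: the sample-number field is "S" followed by any digits.
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<sample_number>\d+)"
    r"(?:_L(?P<lane>\d{3}))?"
    r"_(?P<read_type>I1|I2|R1|R2|R3)_001\.fastq\.gz$"
)


class FastqName(NamedTuple):
    sample: str
    lane: Optional[int]  # None when the name has no lane field
    read_type: str


def parse_fastq_name(filename: str) -> Optional[FastqName]:
    """Return the parsed fields, or None if the name does not conform."""
    match = FASTQ_NAME.match(filename)
    if match is None:
        return None
    lane = int(match.group("lane")) if match.group("lane") else None
    return FastqName(match.group("sample"), lane, match.group("read_type"))
```

For example, `parse_fastq_name("MySample_S1_L001_R1_001.fastq.gz")` yields the sample name `MySample`, lane `1`, and read type `R1`, while a name such as `reads.fastq.gz` yields `None`.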

Alternatively, Cell Ranger ATAC will also accept ATAC FASTQs in this format:

  • I1: Dual index i7 read (optional)
  • R1: Read 1
  • I2: Dual index i5 read
  • R2: Read 2

For the Epi ATAC chemistry, the barcode is sequenced as part of the i5 index read. BCL Convert conventionally labels the i5 index read as R2 and read 2 as R3. In the first convention above, read 1, the barcode, read 2, and the sample index therefore correspond to R1, R2, R3, and I1, respectively.
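Because `R2` means different things under the two labelings, it can help to make the mapping explicit. The sketch below is illustrative only: the names `BCL_CONVERT_ROLES`, `ALTERNATIVE_ROLES`, and `roles_for` are hypothetical, and the rule it uses to tell the labelings apart (presence of `R3` versus `I2`) is an assumption, not a description of how Cell Ranger ATAC itself decides.

```python
from typing import Dict, Iterable

# Role of each read label under the two labelings described above.
BCL_CONVERT_ROLES = {
    "I1": "i7 index read (sample index, optional)",
    "R1": "read 1",
    "R2": "i5 index read (barcode)",
    "R3": "read 2",
}
ALTERNATIVE_ROLES = {
    "I1": "i7 index read (sample index, optional)",
    "R1": "read 1",
    "I2": "i5 index read (barcode)",
    "R2": "read 2",
}


def roles_for(read_types: Iterable[str]) -> Dict[str, str]:
    """Map the read labels found for a sample to their roles.

    Assumption: an R3 file implies the BCL Convert labeling and an
    I2 file implies the alternative one; anything else is ambiguous.
    """
    present = set(read_types)
    has_r3, has_i2 = "R3" in present, "I2" in present
    if has_r3 and has_i2:
        raise ValueError("found both R3 and I2: labelings are mixed")
    if not has_r3 and not has_i2:
        raise ValueError("need R3 or I2 to tell the labelings apart")
    table = BCL_CONVERT_ROLES if has_r3 else ALTERNATIVE_ROLES
    unknown = present - table.keys()
    if unknown:
        raise ValueError(f"unexpected read labels: {sorted(unknown)}")
    return {label: table[label] for label in sorted(present)}
```

With the labels `R1`, `R2`, `R3`, the function reports `R2` as the i5 index read; with `I1`, `R1`, `I2`, `R2`, it reports `R2` as read 2.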

The FASTQ files are specified by providing the path to the folder containing them (--fastqs) and then, optionally, restricting the selection to the samples and/or lanes of interest.

Cell Ranger ATAC scans the folder's subdirectories to locate the *.fastq.gz files. Make sure there are no duplicate sequence files in the subdirectories.
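Since subdirectories are scanned as well, a quick pre-flight check for duplicates can be useful. The sketch below is one possible approach under a stated assumption: it treats two files as duplicates when they share the same file name in different folders and does not compare file contents. The function name `find_duplicate_fastqs` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def find_duplicate_fastqs(fastq_dir: str) -> Dict[str, List[Path]]:
    """Group *.fastq.gz files under fastq_dir by file name.

    Returns only the names that occur more than once, mapped to
    every path where they were found. Assumption: "duplicate" means
    the same file name; file contents are not compared.
    """
    by_name: Dict[str, List[Path]] = defaultdict(list)
    # rglob walks the folder and all of its subdirectories.
    for path in sorted(Path(fastq_dir).rglob("*.fastq.gz")):
        if path.is_file():
            by_name[path.name].append(path)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}
```

An empty result means no file name is repeated under the folder; any entry in the result lists the locations to inspect before running the pipeline.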

Here are the arguments available for specifying which FASTQ files cellranger-atac should use:

Argument | Brief Description
--- | ---
--fastqs | Required. The folder containing the FASTQ files to be analyzed. If the files are in multiple folders, for instance because one library was sequenced across multiple flow cells, provide a comma-separated list of paths.
--sample | Optional. Sample name to analyze. Use the [Sample Name] shown above. Multiple names may be provided as a comma-separated list, in which case they will be treated as one sample.
--lanes | Optional. Lanes associated with this sample. Defaults to all lanes.
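When runs are launched from a script, the FASTQ-related arguments can be assembled from lists. The sketch below is a hypothetical helper (`fastq_arguments` is not part of Cell Ranger ATAC) and covers only the three arguments described above. It assumes that multiple lanes are given as a comma-separated list, which the description above does not state, and that no path or sample name itself contains a comma.

```python
from typing import List, Optional, Sequence


def fastq_arguments(
    fastq_dirs: Sequence[str],
    sample_names: Optional[Sequence[str]] = None,
    lanes: Optional[Sequence[int]] = None,
) -> List[str]:
    """Build the --fastqs, --sample and --lanes arguments as a list."""
    if not fastq_dirs:
        raise ValueError("--fastqs is required: give at least one folder")
    # Multiple folders are joined into one comma-separated value.
    args = ["--fastqs=" + ",".join(fastq_dirs)]
    if sample_names:
        # Several names are treated as one sample by the pipeline.
        args.append("--sample=" + ",".join(sample_names))
    if lanes:
        # Assumption: lanes are passed as a comma-separated list.
        args.append("--lanes=" + ",".join(str(lane) for lane in lanes))
    return args
```

For a library sequenced across two flow cells, `fastq_arguments(["/data/fc1", "/data/fc2"], ["SampleA"])` produces `--fastqs=/data/fc1,/data/fc2` and `--sample=SampleA`; omitting `lanes` leaves the default of all lanes in effect.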