Specifying Input FASTQs to Space Ranger

Space Ranger requires FASTQ files as input, which typically come from running demultiplexing software (e.g., Illumina’s BCL Convert). However, it is also possible to use FASTQ files from other sources, such as a published dataset or the 10x Genomics bamtofastq tool. Check the compatible products page for sequencer platforms that are compatible with spatial gene expression assays.

FASTQ file names are expected to follow this naming convention:

[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz

or

[Sample Name]_S1_[Read Type]_001.fastq.gz

Where Read Type is one of:

  • I1: Sample index read (optional)
  • I2: Sample index read (optional)
  • R1: Read 1
  • R2: Read 2

Space Ranger now accepts file names without [Lane Number], e.g., sample1_S1_R1_001.fastq.gz.
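
To check file names against the convention above before running an analysis, a small parser can help. The following is a minimal sketch, not part of Space Ranger: the function name parse_fastq_name and the FastqName type are hypothetical, and the pattern assumes that the number after "_S" may be any integer (the text above only shows S1) and that the lane is written with three digits, as in L001.

```python
import re
from typing import NamedTuple, Optional

# Pattern for the two naming forms shown above.
# Assumptions: the sample number after "_S" may be any integer, and the
# lane, when present, is three digits (for example L001).
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<sample_number>\d+)"
    r"(?:_L(?P<lane>\d{3}))?"
    r"_(?P<read_type>I1|I2|R1|R2)_001\.fastq\.gz$"
)


class FastqName(NamedTuple):
    sample: str
    lane: Optional[int]  # None when the file name has no lane component
    read_type: str


def parse_fastq_name(file_name: str) -> Optional[FastqName]:
    """Return the parsed parts of a FASTQ file name, or None if it does not match."""
    match = FASTQ_NAME.match(file_name)
    if match is None:
        return None
    lane_text = match.group("lane")
    lane = int(lane_text) if lane_text else None
    return FastqName(match.group("sample"), lane, match.group("read_type"))
```

A file name that returns None here does not follow either form shown above and may need to be renamed before it is used as input.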

For experiments with Gene Expression (GEX) data, the arguments available for specifying which FASTQ files spaceranger count should use are listed below. The FASTQ files are specified by providing the path to the folder containing them (--fastqs) and then optionally restricting the selection by specifying the samples and/or lanes of interest. Space Ranger scans the folder’s subdirectories to locate the *.fastq.gz files. Make sure there are no duplicate sequence files in the subdirectories.

| Argument | Brief Description |
| --- | --- |
| --fastqs | Required. The folder containing the FASTQ files to be analyzed. If the files are in multiple folders, for instance because one library was sequenced across multiple flow cells, provide a comma-separated list of paths. |
| --libraries | Required for gene + protein expression analysis. Path to a libraries.csv file declaring input libraries. See this page for details. If the libraries.csv file is provided, do not use --fastqs or --sample. |
| --sample | Optional. Sample name to analyze. Use the [Sample Name] shown above. Multiple names may be provided as a comma-separated list, in which case they will be treated as one sample. |
| --lanes | Optional. Lanes associated with this sample. Defaults to all lanes. |
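
Because Space Ranger scans the subdirectories of each --fastqs folder, it can be useful to look for duplicate sequence files before starting a run. The sketch below is one possible check, not a Space Ranger feature: the function name find_duplicate_fastqs is hypothetical, and it assumes that "duplicate" means the same file name appearing more than once under a single folder.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_fastqs(fastq_dir):
    """Map each repeated *.fastq.gz file name to the paths where it appears.

    Assumption: two files count as duplicates when they share a file name
    anywhere under fastq_dir; file contents are not compared.
    """
    paths_by_name = defaultdict(list)
    # rglob walks fastq_dir and all of its subdirectories.
    for path in sorted(Path(fastq_dir).rglob("*.fastq.gz")):
        if path.is_file():
            paths_by_name[path.name].append(path)
    return {name: paths for name, paths in paths_by_name.items() if len(paths) > 1}
```

Run the check once per folder passed to --fastqs. An empty result means no file name is repeated within that folder.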

For multiomic experiments, separate libraries for the gene and protein expression reads are generated. In this case, you must construct a CSV file indicating the input data folder, sample name, and library type of each input library, then pass this file to spaceranger count using the --libraries flag. Please see the GEX + PEX Analysis page for details on how to construct the libraries.csv file.
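
As an illustration of the three fields described above, the following sketch writes such a CSV file. It is an assumption-laden example rather than the documented format: the function name write_libraries_csv is hypothetical, and the header names fastqs, sample, and library_type are assumed from the description above, so confirm the exact column names and library type values on the GEX + PEX Analysis page.

```python
import csv


def write_libraries_csv(csv_path, libraries):
    """Write one row per library: input data folder, sample name, library type.

    libraries is an iterable of (fastq_dir, sample, library_type) tuples.
    """
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Assumed header names; verify them against the GEX + PEX Analysis page.
        writer.writerow(["fastqs", "sample", "library_type"])
        for fastq_dir, sample, library_type in libraries:
            writer.writerow([fastq_dir, sample, library_type])
```

For example, calling write_libraries_csv("libraries.csv", rows) with one tuple for the gene expression library and one for the protein expression library produces a two-row file that can then be passed with --libraries. The paths, sample names, and library type strings in those tuples must come from your own experiment.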

If you have multiple directories and/or multiple FASTQ files from the same library to process as a single analysis:

├── flowcell_1 │ ├── Sample-GA-A1 │ │ ├── Sample-GA-A1_S1_L001_I1_001.fastq.gz │ │ ├── Sample-GA-A1_S1_L001_R1_001.fastq.gz │ │ ├── Sample-GA-A1_S1_L001_R2_001.fastq.gz │ │ └── Sample-GA-A1_S1_L001_I2_001.fastq.gz ├── flowcell_2 │ ├── Sample-GA-A1 │ │ ├── Sample-GA-A1_S1_L001_I1_001.fastq.gz │ │ ├── Sample-GA-A1_S1_L001_R1_001.fastq.gz │ │ ├── Sample-GA-A1_S1_L001_R2_001.fastq.gz │ │ └── Sample-GA-A1_S1_L001_I2_001.fastq.gz

In this case, pass both flow cell folders to spaceranger count as a comma-separated --fastqs list:

spaceranger count --fastqs=/flowcell_1,/flowcell_2 --sample=Sample-GA-A1
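
When the list of flow cell folders is assembled in a script, the same two arguments can be built programmatically. This is a minimal sketch under stated assumptions: the helper name fastq_args is hypothetical, it only produces the --fastqs and --sample arguments shown above, and any other arguments that spaceranger count needs are left to the caller.

```python
def fastq_args(fastq_dirs, samples=None):
    """Build the --fastqs and --sample arguments as a list of strings.

    fastq_dirs: one or more folders containing FASTQ files.
    samples: optional sample names; multiple names are treated as one sample.
    """
    dirs = [str(d) for d in fastq_dirs]
    if not dirs:
        raise ValueError("at least one FASTQ folder is required")
    # Multiple folders are joined into a single comma-separated list.
    args = ["--fastqs=" + ",".join(dirs)]
    if samples:
        args.append("--sample=" + ",".join(samples))
    return args
```

Calling fastq_args(["/flowcell_1", "/flowcell_2"], ["Sample-GA-A1"]) yields the two arguments used in the command above. Returning a list rather than one string makes it straightforward to pass the result to subprocess.run without shell quoting.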