In this tutorial you will:
To run this tutorial successfully, you should:
- Be comfortable in the Linux environment
- Have familiarity running command line tools
- Choose a compute platform
- Have access to a system that meets the minimum system requirements
Spatial gene expression for formalin-fixed paraffin-embedded (FFPE) tissue is determined using the spaceranger count pipeline, which takes a microscope image of the Visium slide (in either TIFF or JPEG format) and the sample FASTQ files as inputs. The pipeline performs alignment, tissue and fiducial detection, and barcode/UMI counting. The outputs include feature-spot matrices, clustering, and differential gene expression (DGE) results, which can be further analyzed and visualized in Loupe Browser.
In this tutorial, we will run the spaceranger count pipeline on a public mouse brain FFPE section dataset.
Key dataset features are:
- Tissue section of 5 µm thickness
- Section Orientation: Coronal
- H&E image
- Sequencing Depth: 28,826 reads per spot
- Sequencing Coverage: Read 1 - 28 bp (includes 16 bp Spatial Barcode, 12 bp UMI); Read 2 - 120 bp (transcript); i7 sample index - 10 bp; i5 sample index - 10 bp
- Visium Slide: V11J26-127
- Capture Area: B1
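To make the Read 1 layout concrete, here is a minimal sketch that splits a 28 bp Read 1 sequence into the 16 bp spatial barcode and the 12 bp UMI described above. The function name split_read1 and the example sequence are made up for illustration; spaceranger performs this step internally.

```shell
# Split a 28 bp Read 1 sequence into spatial barcode and UMI.
# Layout taken from the dataset description: bases 1-16 = barcode, 17-28 = UMI.
# split_read1 is a hypothetical helper, not part of spaceranger.
split_read1() {
    r1_seq=$1
    barcode=$(printf '%s' "$r1_seq" | cut -c1-16)
    umi=$(printf '%s' "$r1_seq" | cut -c17-28)
    printf 'barcode=%s umi=%s\n' "$barcode" "$umi"
}

# Made-up 28 bp sequence, for illustration only
split_read1 "ACGTACGTACGTACGTTTTTGGGGCCCC"
# prints: barcode=ACGTACGTACGTACGT umi=TTTTGGGGCCCC
```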
All the following commands will be run in the working directory spaceranger_tutorial that was used to set up spaceranger on a compatible compute platform.
Example FFPE dataset
Both the raw sequencing files in FASTQ format and the image in JPEG format are available for batch download on the dataset page. For better organization, we will create a datasets folder before downloading the required files.
Download with the curl command:
# Create datasets folder
mkdir datasets
# Download FASTQ to datasets folder
curl https://cf.10xgenomics.com/samples/spatial-exp/1.3.0/Visium_FFPE_Mouse_Brain/Visium_FFPE_Mouse_Brain_fastqs.tar -o datasets/Visium_FFPE_Mouse_Brain_fastqs.tar
# Download image file to datasets folder
curl https://cf.10xgenomics.com/samples/spatial-exp/1.3.0/Visium_FFPE_Mouse_Brain/Visium_FFPE_Mouse_Brain_image.jpg -o datasets/Visium_FFPE_Mouse_Brain_image.jpg
# Expected output
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
5 4154M 5 218M 0 0 32.7M 0 0:02:06 0:00:06 0:02:00 33.0M
Alternatively, download with the wget command:
# Create datasets folder
mkdir datasets
# Download FASTQ to datasets folder
wget -P datasets/ https://cf.10xgenomics.com/samples/spatial-exp/1.3.0/Visium_FFPE_Mouse_Brain/Visium_FFPE_Mouse_Brain_fastqs.tar
# Download image file to datasets folder
wget -P datasets/ https://cf.10xgenomics.com/samples/spatial-exp/1.3.0/Visium_FFPE_Mouse_Brain/Visium_FFPE_Mouse_Brain_image.jpg
# Expected output
Resolving cf.10xgenomics.com (cf.10xgenomics.com)... 104.18.0.173, 104.18.1.173, 2606:4700::6812:1ad, ...
Connecting to cf.10xgenomics.com (cf.10xgenomics.com)|104.18.0.173|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4356188160 (4.1G) [application/x-tar]
Saving to: ‘datasets/Visium_FFPE_Mouse_Brain_fastqs.tar’
37% [=======================> ] 1,649,129,803 207MB/s eta 13s
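Because the FASTQ archive is several gigabytes, an interrupted transfer is easy to miss. The sketch below compares the size of a downloaded file with an expected byte count; the value 4356188160 is the Length reported in the wget output above. The helper name check_size is an assumption made for this example, and a size match does not replace a checksum.

```shell
# Compare a file's size in bytes with an expected value.
# check_size is a hypothetical helper, not part of curl, wget or spaceranger.
check_size() {
    file=$1
    expected=$2
    actual=$(wc -c < "$file" | tr -d ' ')
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $file is $actual bytes"
    else
        echo "MISMATCH: $file is $actual bytes, expected $expected" >&2
        return 1
    fi
}

# Run only if the archive is present (it is deleted after extraction)
if [ -f datasets/Visium_FFPE_Mouse_Brain_fastqs.tar ]; then
    check_size datasets/Visium_FFPE_Mouse_Brain_fastqs.tar 4356188160
fi
```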
Reference data
Since the example dataset is derived from a mouse tissue section, we can download the latest version of the mouse transcriptome reference available from the Downloads page. Here the curl download option is shown.
# Download mouse reference
curl -O https://cf.10xgenomics.com/supp/spatial-exp/refdata-gex-mm10-2020-A.tar.gz
# Expected output
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
1 9835M 1 158M 0 0 34.1M 0 0:04:48 0:00:04 0:04:44 34.1M
Probe set reference
Since this is an FFPE tissue sample, the assay uses pairs of oligonucleotide probes targeting protein-coding genes. In addition to the reference transcriptome, spaceranger also requires the species-specific probe set reference file in CSV format to enable analysis of FFPE samples. You can either download the probe set reference from the 10x support website or use the probe set references pre-bundled in Space Ranger.
# Method 1
# Download mouse probe set reference from support website
curl -O https://cf.10xgenomics.com/supp/spatial-exp/probeset/Visium_Mouse_Transcriptome_Probe_Set_v1.0_mm10-2020-A.csv
# Method 2
# Space Ranger 2.0 comes bundled with probe set files
## Source mouse probe set reference
~/spaceranger_tutorial/spaceranger-2.0.0/probe_sets/Visium_Mouse_Transcriptome_Probe_Set_v1.0_mm10-2020-A.csv
# Expected output
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2257k 100 2257k 0 0 5579k 0 --:--:-- --:--:-- --:--:-- 5587k
In this tutorial, we will use the path associated with the first option.
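Before passing the probe set file to the pipeline, it can help to take a quick look at it. The sketch below assumes the CSV starts with metadata lines prefixed by "#", followed by a single column header line and then one row per probe; that layout is an assumption about the file and should be checked against your copy. The helper names are hypothetical.

```shell
# Print the '#'-prefixed metadata lines of a probe set CSV
# (assumed layout: '#' metadata lines, one header line, then one row per probe).
show_probe_metadata() {
    grep '^#' "$1"
}

# Count probe rows: drop metadata lines, skip the header line, count the rest.
count_probes() {
    grep -v '^#' "$1" | tail -n +2 | wc -l | tr -d ' '
}

probe_csv=Visium_Mouse_Transcriptome_Probe_Set_v1.0_mm10-2020-A.csv
if [ -f "$probe_csv" ]; then
    show_probe_metadata "$probe_csv"
    echo "probe rows: $(count_probes "$probe_csv")"
fi
```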
Extract files
After all the required files have been downloaded successfully, the contents of the tar files need to be extracted before moving on to the next steps.
# Extract sample FASTQ files
tar -xvf datasets/Visium_FFPE_Mouse_Brain_fastqs.tar -C datasets/ && rm datasets/Visium_FFPE_Mouse_Brain_fastqs.tar
# Extract mouse reference transcriptome
tar -xzvf refdata-gex-mm10-2020-A.tar.gz && rm refdata-gex-mm10-2020-A.tar.gz
# Expected output
# Sample FASTQ files
Visium_FFPE_Mouse_Brain_fastqs/
Visium_FFPE_Mouse_Brain_fastqs/Visium_FFPE_Mouse_Brain_S3_L002_R1_001.fastq.gz
Visium_FFPE_Mouse_Brain_fastqs/Visium_FFPE_Mouse_Brain_S3_L001_I2_001.fastq.gz
Visium_FFPE_Mouse_Brain_fastqs/Visium_FFPE_Mouse_Brain_S3_L002_R2_001.fastq.gz
...
# Reference mouse transcriptome
refdata-gex-mm10-2020-A/
refdata-gex-mm10-2020-A/fasta/
refdata-gex-mm10-2020-A/fasta/genome.fa
...
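The extracted FASTQ file names follow the pattern seen above (sample name, S number, lane, read type, 001). As a quick sanity check after extraction, the following sketch confirms that every R1 file has a matching R2 file in the same folder. The function name check_fastq_pairs is made up for this example.

```shell
# Report whether every *_R1_001.fastq.gz file has a matching *_R2_001.fastq.gz.
# check_fastq_pairs is a hypothetical helper, not a spaceranger command.
check_fastq_pairs() {
    fastq_dir=$1
    pairs=0
    missing=0
    for r1 in "$fastq_dir"/*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        r2="${r1%_R1_001.fastq.gz}_R2_001.fastq.gz"
        if [ -e "$r2" ]; then
            pairs=$((pairs + 1))
        else
            echo "Missing R2 for $r1" >&2
            missing=$((missing + 1))
        fi
    done
    echo "paired=$pairs missing=$missing"
    # Succeed only if at least one pair was found and none is incomplete
    [ "$pairs" -gt 0 ] && [ "$missing" -eq 0 ]
}

if [ -d datasets/Visium_FFPE_Mouse_Brain_fastqs ]; then
    check_fastq_pairs datasets/Visium_FFPE_Mouse_Brain_fastqs
fi
```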
Successful extraction will create two additional folders, datasets/Visium_FFPE_Mouse_Brain_fastqs and refdata-gex-mm10-2020-A, within the working directory.
spaceranger_tutorial
├── datasets
│   ├── Visium_FFPE_Mouse_Brain_fastqs
│   └── Visium_FFPE_Mouse_Brain_image.jpg
├── refdata-gex-mm10-2020-A
├── Visium_Mouse_Transcriptome_Probe_Set_v1.0_mm10-2020-A.csv
└── spaceranger-2.0.0
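A missing or misnamed input is a common cause of early pipeline failures, so it can be worth checking the working directory before launching a run. This is a minimal sketch that looks for the four input paths used by the spaceranger count commands in this tutorial; the function name check_inputs is an assumption made for illustration.

```shell
# Check that the tutorial's input paths exist under a base directory
# (default: the current directory). check_inputs is a hypothetical helper.
check_inputs() {
    base=${1:-.}
    status=0
    for path in \
        datasets/Visium_FFPE_Mouse_Brain_fastqs \
        datasets/Visium_FFPE_Mouse_Brain_image.jpg \
        refdata-gex-mm10-2020-A \
        Visium_Mouse_Transcriptome_Probe_Set_v1.0_mm10-2020-A.csv
    do
        if [ -e "$base/$path" ]; then
            echo "found    $path"
        else
            echo "MISSING  $path"
            status=1
        fi
    done
    return $status
}
```

Running check_inputs from inside spaceranger_tutorial prints one line per path and returns a non-zero status if anything is missing.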
We now have all the inputs required to run the spaceranger count pipeline. For more information about specifying inputs, print the pipeline-specific usage statement.
# Print count usage statement
spaceranger count --help
# Expected output
spaceranger-count
Count gene expression and feature barcoding reads from a single capture area
USAGE:
spaceranger count [FLAGS] [OPTIONS] --id <ID> --transcriptome <PATH> --fastqs <PATH>... <--image <IMG>|--darkimage <IMG>...|--colorizedimage <IMG>>
FLAGS:
--no-bam Do not generate a bam file
--nosecondary Disable secondary analysis, e.g. clustering. Optional
--disable-ui Do not serve the web UI
--noexit Keep web UI running after pipestance completes or fails
--nopreflight Skip preflight checks
-h, --help Prints help information
...
OPTIONS:
--id <ID> A unique run id and output folder name [a-zA-Z0-9_-]+
--description <TEXT> Sample description to embed in output files
--image <IMG> Single H&E brightfield image in either TIFF or JPG format
--slide <TEXT> Visium slide serial number, for example 'V10J25-015'
--area <TEXT> Visium area identifier, for example 'A1'
--transcriptome <PATH> Path of folder containing 10x-compatible reference
...
Input image reorientation
By default, spaceranger expects the hourglass-shaped corner fiducial to be in the top left corner.
The input image (top right) is, however, rotated 90°. To ensure good fiducial alignment and tissue spot detection, it is important to correct for this shift in orientation. There are three ways to achieve this:
- Run spaceranger count without any additional parameters, since automatic fiducial rotation and image mirroring are enabled by default from Space Ranger 2.0 onwards.
- Run spaceranger count with the --reorient-images=true flag, which explicitly enables the same rotation and mirroring functionality as the first option.
- Rotate the image using any image editing tool (e.g. ImageJ/Fiji, ImageMagick) and provide the rotated image to spaceranger count.
In this tutorial, we will choose the second option and use the --reorient-images=true flag, for consistency with code from previous versions of spaceranger.
We can now build the spaceranger count command for the example FFPE dataset. We will be running the pipeline in our working directory spaceranger_tutorial, assuming the same directory structure as shown in the Extract files section above. The input folder paths below reflect this choice.
If you have a different setup, amend the paths accordingly before running the pipeline to avoid errors. The easiest way to customize the command is to copy the code below into a text editor of your choice (e.g. Notepad++), edit it, and paste it back into the terminal.
- With internet access: For compute platforms connected to the internet, spaceranger uses the value of the --slide argument to automatically download the slide layout file in gpr format.
spaceranger count --id="Visium_FFPE_Mouse_Brain" \
--description="Adult Mouse Brain (FFPE) using Mouse WTA Probe Set" \
--transcriptome=refdata-gex-mm10-2020-A \
--probe-set=Visium_Mouse_Transcriptome_Probe_Set_v1.0_mm10-2020-A.csv \
--fastqs=datasets/Visium_FFPE_Mouse_Brain_fastqs \
--image=datasets/Visium_FFPE_Mouse_Brain_image.jpg \
--slide=V11J26-127 \
--area=B1 \
--reorient-images=true \
--localcores=16 \
--localmem=128
- Without internet access: If the compute platform has no internet connectivity, you can download this specific slide layout file in gpr format and provide it to spaceranger using the --slidefile argument.
spaceranger count --id="Visium_FFPE_Mouse_Brain" \
--description="Adult Mouse Brain (FFPE) using Mouse WTA Probe Set" \
--transcriptome=refdata-gex-mm10-2020-A \
--probe-set=Visium_Mouse_Transcriptome_Probe_Set_v1.0_mm10-2020-A.csv \
--fastqs=datasets/Visium_FFPE_Mouse_Brain_fastqs \
--image=datasets/Visium_FFPE_Mouse_Brain_image.jpg \
--slide=V11J26-127 \
--slidefile=V11J26-127.gpr \
--area=B1 \
--reorient-images=true \
--localcores=16 \
--localmem=128
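The two commands above differ only in the --slidefile argument. One way to avoid maintaining two copies is to assemble the argument list once and append --slidefile only when a local layout file is present. The sketch below does this in POSIX shell; the function name count_args is hypothetical, and the --description argument is left out so that no argument contains spaces.

```shell
# Print the spaceranger count arguments used in this tutorial, one per line.
# If $1 names an existing slide layout file, --slidefile is appended.
# count_args is a hypothetical helper; --description is omitted on purpose.
count_args() {
    slidefile=${1:-}
    set -- \
        --id=Visium_FFPE_Mouse_Brain \
        --transcriptome=refdata-gex-mm10-2020-A \
        --probe-set=Visium_Mouse_Transcriptome_Probe_Set_v1.0_mm10-2020-A.csv \
        --fastqs=datasets/Visium_FFPE_Mouse_Brain_fastqs \
        --image=datasets/Visium_FFPE_Mouse_Brain_image.jpg \
        --slide=V11J26-127 \
        --area=B1 \
        --reorient-images=true \
        --localcores=16 \
        --localmem=128
    if [ -n "$slidefile" ] && [ -f "$slidefile" ]; then
        set -- "$@" "--slidefile=$slidefile"
    fi
    printf '%s\n' "$@"
}

# Review the arguments before running anything
count_args V11J26-127.gpr
```

Since none of the printed arguments contains whitespace, something like `spaceranger count $(count_args V11J26-127.gpr)` should pass them through unchanged, but review the printed list first.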
Below are brief descriptions of the above command line options:
Option | Description |
---|---|
--id | The id must be a unique string and will be used to name the resulting folder containing all of the pipeline outputs. We choose to keep the original dataset name, Visium_FFPE_Mouse_Brain |
--description | This is the sample description included in the output files (e.g. web_summary.html ). We describe the sample as "Adult Mouse Brain (FFPE) using Mouse WTA Probe Set" |
--transcriptome | The path to the species-specific pre-compiled transcriptome files. Note that you can provide either the relative path, as shown above, or the absolute path to this folder. As the tissue sample was of mouse origin, we provide the path to the mouse reference transcriptome refdata-gex-mm10-2020-A |
--probe-set | The absolute or relative path to the species-specific probe set reference file in CSV format. Since the tissue sample is derived from mouse, we specify the relative path to the mouse probe set reference as Visium_Mouse_Transcriptome_Probe_Set_v1.0_mm10-2020-A.csv |
--fastqs | The path to the folder containing the sample sequencing files in FASTQ format. The path can be relative, as shown above, or absolute. The relative path is datasets/Visium_FFPE_Mouse_Brain_fastqs |
--image | The path to a single brightfield image with H&E staining in either TIFF or JPEG format. The path can be relative or absolute. Here we have a JPEG format image with the relative path datasets/Visium_FFPE_Mouse_Brain_image.jpg |
--slide | The Visium slide serial number of the slide on which the tissue sample was mounted. The value here is V11J26-127 |
--area | The capture area identifier on the Visium slide. It can be one of four values: A1, B1, C1 or D1. Here the tissue sample was mounted on the B1 capture area. |
--slidefile | The slide layout file in gpr format, which is provided when spaceranger does not have internet access. You can download the slide layout file and provide it as V11J26-127.gpr |
--reorient-images | Option to choose whether spaceranger should rotate and mirror the image to find the best fiducial alignment. Acceptable values are true or false with default being true. Useful to set to false when you are absolutely certain the fiducial corners in the input image are in the canonical positions (hourglass in top left corner) similar to included image Visium_FFPE_Mouse_Brain_image.jpg . Setting this to false will reduce pipeline runtime and prevent the pipeline from finding a fiducial alignment where the image is rotated/mirrored. |
--localcores | The number of CPU cores available to run the spaceranger count pipeline. The upper limit for your specific compute system can be determined using the sitecheck subcommand. We will use 16 cores in this tutorial. |
--localmem | The maximum memory in GB available to run the spaceranger count pipeline. The upper limit for your specific compute system can be determined using the sitecheck subcommand. We will use 128 GB in this tutorial. |
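If you want a quick look at what the machine offers before choosing --localcores and --localmem, the sketch below reads the online CPU count with getconf and the total memory from /proc/meminfo. It is a Linux-only illustration under those assumptions, not a replacement for sitecheck, and the function name suggest_resources is made up. The reported memory is the system total, so leave headroom for other processes.

```shell
# Print candidate --localcores/--localmem values from the local system.
# Assumes Linux: /proc/meminfo reports MemTotal in kB.
# suggest_resources is a hypothetical helper, not a spaceranger command.
suggest_resources() {
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    mem_gb=0
    if [ -r /proc/meminfo ]; then
        # Convert kB to whole GB (rounded down)
        mem_gb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 / 1024 }' /proc/meminfo)
        mem_gb=${mem_gb:-0}
    fi
    echo "--localcores=$cores --localmem=$mem_gb"
}

suggest_resources
```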
At the start of the pipeline, you should see the message about the preflight checks printed to the command line.
# Expected output
Martian Runtime - v4.0.5
Running preflight checks (please wait)...
Checking sample info...
Checking FASTQ folder...
Checking reference...
Checking reference_path...
Checking optional arguments...
...
Successful completion of the pipeline is indicated by a summary of the generated output files.
Outputs:
- Run summary HTML: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/web_summary.html
- Outputs of spatial pipeline:
aligned_fiducials: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/spatial/aligned_fiducials.jpg
detected_tissue_image: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/spatial/detected_tissue_image.jpg
scalefactors_json: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/spatial/scalefactors_json.json
tissue_hires_image: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/spatial/tissue_hires_image.png
tissue_lowres_image: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/spatial/tissue_lowres_image.png
cytassist_image: null
aligned_tissue_image: null
tissue_positions: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/spatial/tissue_positions.csv
spatial_enrichment: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/spatial/spatial_enrichment.csv
barcode_fluorescence_intensity: null
- Run summary CSV: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/metrics_summary.csv
- Correlation values between isotypes and Antibody features: null
- BAM: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/possorted_genome_bam.bam
- BAM BAI index: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/possorted_genome_bam.bam.bai
- BAM CSI index: null
- Filtered feature-barcode matrices MEX: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/filtered_feature_bc_matrix
- Filtered feature-barcode matrices HDF5: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/filtered_feature_bc_matrix.h5
- Unfiltered feature-barcode matrices MEX: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/raw_feature_bc_matrix
- Unfiltered feature-barcode matrices HDF5: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/raw_feature_bc_matrix.h5
- Secondary analysis output CSV: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/analysis
- Per-molecule read information: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/molecule_info.h5
- Loupe Browser file: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/cloupe.cloupe
- Feature Reference: null
- Target Panel file: null
- Probe Set file: /spaceranger_tutorial/Visium_FFPE_Mouse_Brain/outs/probe_set.csv
Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!
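When runs are launched unattended, the summary above may scroll past unnoticed. As a rough follow-up check, the sketch below verifies that a handful of the output files listed in the summary exist and are non-empty. The selection of files and the function name check_outs are choices made for this example.

```shell
# Verify that key files from the run summary exist and are non-empty.
# check_outs is a hypothetical helper; the file list is a subset of the
# outputs printed by spaceranger count above.
check_outs() {
    outs=$1
    status=0
    for f in \
        web_summary.html \
        metrics_summary.csv \
        cloupe.cloupe \
        filtered_feature_bc_matrix.h5 \
        raw_feature_bc_matrix.h5 \
        molecule_info.h5 \
        spatial/tissue_positions.csv \
        spatial/scalefactors_json.json
    do
        if [ -s "$outs/$f" ]; then
            echo "ok       $f"
        else
            echo "missing  $f"
            status=1
        fi
    done
    return $status
}

if [ -d Visium_FFPE_Mouse_Brain/outs ]; then
    check_outs Visium_FFPE_Mouse_Brain/outs
fi
```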
After the run is completed, the working directory will have a new folder named Visium_FFPE_Mouse_Brain (the value provided to the --id argument) that contains all the metadata and outputs generated by the spaceranger count pipeline. We will highlight some key components of this folder:
Visium_FFPE_Mouse_Brain
├── _cmdline
├── _filelist
├── _finalstate
├── _invocation
├── _jobmode
├── _log
├── _mrosource
├── outs
├── _perf
├── _sitecheck
├── SPATIAL_RNA_COUNTER_CS
├── _tags
├── _timestamp
├── _uuid
├── Visium_FFPE_Mouse_Brain.mri.tgz
├── _vdrkill
└── _versions
- outs contains all the final pipeline-generated outputs
- Visium_FFPE_Mouse_Brain.mri.tgz contains diagnostic information helpful to 10x Genomics support to resolve any errors
- _sitecheck captures the system configuration, similar to the sitecheck subcommand
- _timestamp contains information on pipeline runtimes. The runtime for the example dataset with the above configuration was 50:58
- _cmdline captures the count command provided to run the pipeline
- _versions contains both the spaceranger and Martian versions used in the run
The outs folder contains all the calculated results.
You can further explore and understand these results by:
- Browsing the summary HTML file in any supported web browser. Note that the displayed color scheme for tissue and fiducials has flipped from the earlier version of spaceranger
- Continuing further analysis by opening the .cloupe file in Loupe Browser
- Referring to the Understanding Output section to explore individual files
- Performing downstream analysis using community-developed tools (e.g. Seurat, DropletUtils)
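Some simple questions can be answered directly from the command line before opening any of these tools. As one example, the sketch below counts the spots under tissue in outs/spatial/tissue_positions.csv. It assumes the file has a header line and that the second column (in_tissue) is 1 for spots covered by tissue; confirm this against your own file. The function name count_tissue_spots is hypothetical.

```shell
# Count spots flagged as under tissue in a tissue_positions.csv file.
# Assumed layout: one header line, then rows whose second column
# (in_tissue) is 1 for spots under tissue and 0 otherwise.
count_tissue_spots() {
    awk -F, 'NR > 1 && $2 == 1 { n++ } END { print n + 0 }' "$1"
}

positions=Visium_FFPE_Mouse_Brain/outs/spatial/tissue_positions.csv
if [ -f "$positions" ]; then
    echo "spots under tissue: $(count_tissue_spots "$positions")"
fi
```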
Q: I ran spaceranger count and got the error Could not retrieve spot layout data. What does this mean and how can I proceed?
When you specify the Visium slide ID using the --slide argument, spaceranger count downloads the corresponding slide layout file in gpr format. This step requires internet connectivity. However, some compute platforms do not have internet access, which results in this error message. If you know the Visium slide ID, you can download the slide layout file and provide it to the pipeline using the --slidefile argument, along with specifying the capture area with --area.