Set up Space Ranger

In this tutorial you will:

  • Download and install spaceranger
  • Review the key components of the spaceranger folder
  • Add spaceranger to $PATH
  • Perform and review sitecheck
  • Perform a testrun

For successful run of this tutorial, you must:

This tutorial is written with Space Ranger v2.0.0. You can copy/paste all the commands listed in the tutorial into your command prompt to follow along.

You can download and install spaceranger in any location. For this tutorial, we will create a working directory spaceranger_tutorial and continue all the remaining steps in it.

# Create working directory
mkdir spaceranger_tutorial
# Change directory
cd spaceranger_tutorial

To install the latest version of spaceranger:

  • Go to the Downloads page
  • Fill out the 10x Genomics End User Software License Agreement information
  • Copy and paste the download command from either one of the command line utilities (curl or wget). You should see a download progress status similar to the output below.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 10 1220M   10  125M    0     0  30.9M      0  0:00:39  0:00:04  0:00:35 30.8M

This downloads the spaceranger tarball spaceranger-2.0.0.tar.gz to your working directory. Next, we extract the contents.

# Extract spaceranger tarball
tar -zxvf spaceranger-2.0.0.tar.gz

# Expected output
spaceranger-2.0.0/
spaceranger-2.0.0/.env.json
spaceranger-2.0.0/.version
spaceranger-2.0.0/LICENSE
spaceranger-2.0.0/builtwith.json
spaceranger-2.0.0/sourceme.bash
spaceranger-2.0.0/sourceme.csh
spaceranger-2.0.0/bin/
spaceranger-2.0.0/bin/_spaceranger_internal
spaceranger-2.0.0/bin/spaceranger
spaceranger-2.0.0/bin/rna/
spaceranger-2.0.0/bin/rna/_includes
...

When the extraction is finished, the command prompt returns and the folder spaceranger-2.0.0 is present in the working directory.

The spaceranger-2.0.0 folder contains the executable and all of the required dependencies. The key folders that you would use are described after the listing.

spaceranger-2.0.0
├── bin
├── external
│   ├── anaconda
│   ├── martian
│   │   └── jobmanagers
│   ├── spaceranger_tiny_inputs
│   └── spaceranger_tiny_ref
├── lib
│   ├── bin
│   │   ├── bamtofastq
│   │   ├── redstone
│   │   └── ...
│   └── python
│       └── cellranger
│           └── barcodes
│               ├── visium-v1_coordinates.txt
│               ├── visium-v2_coordinates.txt
│               ├── visium-v4_coordinates.txt
│               ├── visium-v5_coordinates.txt
│               └── ...
├── mro
├── probe_sets
│   ├── Visium_Human_Transcriptome_Probe_Set_v1.0_GRCh38-2020-A.csv
│   ├── Visium_Human_Transcriptome_Probe_Set_v2.0_GRCh38-2020-A.csv
│   └── Visium_Mouse_Transcriptome_Probe_Set_v1.0_mm10-2020-A.csv
├── target_panels
│   ├── gene_signature_v1.0_GRCh38-2020-A.target_panel.csv
│   ├── immunology_v1.0_GRCh38-2020-A.target_panel.csv
│   ├── neuroscience_v1.0_GRCh38-2020-A.target_panel.csv
│   └── pan_cancer_v1.0_GRCh38-2020-A.target_panel.csv
└── THIRD-PARTY-LICENSES.spaceranger.txt
  • The antibody_refs folder contains the validated antibody panel used in combined GEX + PEX analysis
  • The probe_sets folder contains the probe set reference CSV files used in the analysis of FFPE samples
  • The lib/python/cellranger/barcodes folder contains the Visium barcode whitelists and their coordinates on the slide
  • The lib/bin folder contains tools such as bamtofastq, which converts 10x Genomics BAM files to FASTQ, and redstone, which enables data transfer to 10x Genomics
  • external/spaceranger_tiny_ref and external/spaceranger_tiny_inputs are used by spaceranger testrun
  • The external/martian/jobmanagers folder contains sample templates for commonly used job schedulers
  • The THIRD-PARTY-LICENSES.spaceranger.txt file contains the licenses for all dependencies used in Space Ranger

spaceranger is now installed. There are two ways to specify spaceranger in the commands.

  • Use the full path to the spaceranger-2.0.0 folder
# Method 1
## Change directory to spaceranger-2.0.0
cd spaceranger-2.0.0
## Get the full path
pwd
## Change working directory back to spaceranger_tutorial
cd ..

# Method 2
## Get the full path
readlink -f spaceranger-2.0.0

# Expected output
## The path will change depending on the compute setup you are using.
/PATH/TO/WORKING_DIRECTORY/spaceranger_tutorial/spaceranger-2.0.0
  • Add spaceranger-2.0.0 to your $PATH variable
# Method 1
## Get the full path
readlink -f spaceranger-2.0.0
## Export PATH by providing the full path
export PATH=/PATH/TO/WORKING_DIRECTORY/spaceranger_tutorial/spaceranger-2.0.0:$PATH
## Confirm installation
which spaceranger

# Method 2
## Change directory to spaceranger-2.0.0
cd spaceranger-2.0.0
## Export PATH by specifying a shell variable
export PATH=$PWD:$PATH
## Confirm installation
which spaceranger
## Change working directory back to spaceranger_tutorial
cd ..

# Expected output
~/spaceranger_tutorial/spaceranger-2.0.0

The tilde (~) stands for your home directory, which in this example is the same as the /PATH/TO/WORKING_DIRECTORY used earlier.

A variable set with export lasts only for the current login session. To make the change permanent, add the export command, using the full path to spaceranger-2.0.0, to your shell configuration file (e.g. .bashrc or .zshrc), which is read at every login.
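As a minimal sketch of making the PATH change permanent, the function below appends the export line to a configuration file only if that exact line is not already there. The function name persist_path is hypothetical, and the demonstration writes to a scratch file rather than to your real .bashrc or .zshrc.

```shell
# Hypothetical helper: append "export PATH=<dir>:$PATH" to a config file once.
# Arguments: $1 = spaceranger install directory, $2 = shell configuration file.
persist_path() {
    local install_dir=$1
    local rc_file=$2
    # \$PATH is escaped so the literal text $PATH is written to the file.
    local line="export PATH=$install_dir:\$PATH"
    # -F fixed string, -x whole-line match, -q quiet; skip if already present.
    if ! grep -Fxq "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}

# Demonstration on a scratch file so your real configuration stays untouched.
demo_rc=$(mktemp)
persist_path /PATH/TO/WORKING_DIRECTORY/spaceranger_tutorial/spaceranger-2.0.0 "$demo_rc"
persist_path /PATH/TO/WORKING_DIRECTORY/spaceranger_tutorial/spaceranger-2.0.0 "$demo_rc"
cat "$demo_rc"
```

Calling the function twice still leaves a single line in the file. To apply it for real, pass your actual install path and a file such as ~/.bashrc, then start a new login session.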

You can now invoke spaceranger at the command prompt to see the usage statement.

# Input
## When using full path to the spaceranger folder
/PATH/TO/WORKING_DIRECTORY/spaceranger_tutorial/spaceranger-2.0.0/spaceranger
## When adding spaceranger folder to the $PATH variable
spaceranger

# Expected output
USAGE:
    spaceranger <SUBCOMMAND>

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    count        Count gene expression and feature barcoding reads from a single capture area
    aggr         Aggregate data from multiple 'spaceranger count' runs
    ...
    testrun      Execute the 'count' pipeline on a small test dataset
    upload       Upload analysis logs to 10x Genomics support
    sitecheck    Collect linux system configuration information
    help         Prints this message or the help of the given subcommand(s)

For the rest of the tutorial, we will invoke spaceranger assuming that the spaceranger-2.0.0 folder has been added to the $PATH variable.

spaceranger sitecheck enables you to check your system configuration to ensure it meets the minimum recommended requirements. Run the command and use > to re-direct the output to a text file.

spaceranger sitecheck > sitecheck.txt

If running spaceranger in cluster mode, run the sitecheck on both the head node and the worker node.

Open the file with less and use / (e.g. /CPU Cores) to search for specific sections within the file. Press 'q' to quit.

less sitecheck.txt
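As an alternative to searching interactively, a section can be pulled out of the file with awk. This is only a sketch: the function name sitecheck_section is hypothetical, and it assumes each section is laid out as a title line, the command that was run, a line of dashes, the output, and a closing line of equals signs, as in the excerpts in this tutorial. The demonstration uses a small scratch file with two such sections.

```shell
# Hypothetical helper: print the output lines of one sitecheck section.
# Arguments: $1 = section title (e.g. "CPU Cores"), $2 = sitecheck output file.
sitecheck_section() {
    awk -v title="$1" '
        $0 == title          { in_section = 1; next }  # section title found
        in_section && /^=+$/ { exit }                  # ===== closes the section
        in_section && /^-+$/ { in_body = 1; next }     # output follows the ----- line
        in_section && in_body { print }
    ' "$2"
}

# Scratch file mimicking the assumed layout.
demo_sitecheck=$(mktemp)
cat > "$demo_sitecheck" <<'EOF'
CPU Cores
grep -c processor /proc/cpuinfo
---------------------------------------------------------------------
96
=====================================================================
Memory Total
grep MemTotal /proc/meminfo | cut -d ':' -f 2 | sed 's/^[ \t]*//'
---------------------------------------------------------------------
289287896 kB
=====================================================================
EOF

sitecheck_section "CPU Cores" "$demo_sitecheck"
sitecheck_section "Memory Total" "$demo_sitecheck"
```

With a real run you would pass sitecheck.txt as the second argument. If your file uses a different layout, the patterns need adjusting.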

We will examine some key configuration metrics and compare against the recommended system requirements.

  • CPU Cores
CPU Cores
grep -c processor /proc/cpuinfo
---------------------------------------------------------------------
96
=====================================================================
...

This system has 96 CPUs and is capable of running spaceranger which requires at least 8 CPUs, preferably 32.
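The comparison above can be scripted. The sketch below uses a hypothetical function, check_cores, with the thresholds quoted in this tutorial (at least 8 cores, preferably 32), and runs it on the example value of 96.

```shell
# Hypothetical helper: compare a core count with the tutorial's thresholds.
# Argument: $1 = number of CPU cores.
check_cores() {
    local cores=$1
    if [ "$cores" -ge 32 ]; then
        echo "OK: $cores cores (meets the preferred 32)"
    elif [ "$cores" -ge 8 ]; then
        echo "OK: $cores cores (meets the minimum of 8, below the preferred 32)"
    else
        echo "LOW: $cores cores (minimum is 8)"
    fi
}

# Example value from the sitecheck output in this tutorial.
check_cores 96

# On a Linux host you could instead pass the live value:
# check_cores "$(grep -c processor /proc/cpuinfo)"
```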

  • Memory Total
Memory Total
grep MemTotal /proc/meminfo | cut -d ':' -f 2 | sed 's/^[ \t]*//'
---------------------------------------------------------------------
289287896 kB
=====================================================================
...

For direct comparison, let's convert kB to GB:

$$\text{RAM in GB} = \frac{289287896}{1\mathrm{e}{+6}} \approx 289$$

which satisfies the system requirement of at least 64 GB RAM, preferably 128 GB.
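The same kB to GB conversion can be done with shell integer arithmetic. In this sketch the function names kb_to_gb and check_memory are hypothetical. The divisor of 1e6 and the 64 GB and 128 GB thresholds are the ones used in this tutorial, and the division truncates to whole GB.

```shell
# Hypothetical helper: convert kB (as reported in /proc/meminfo) to whole GB,
# using the same 1e6 divisor as the worked example above.
kb_to_gb() {
    echo $(( $1 / 1000000 ))
}

# Hypothetical helper: compare total memory in kB with the 64/128 GB thresholds.
check_memory() {
    local gb
    gb=$(kb_to_gb "$1")
    if [ "$gb" -ge 128 ]; then
        echo "OK: $gb GB (meets the preferred 128 GB)"
    elif [ "$gb" -ge 64 ]; then
        echo "OK: $gb GB (meets the minimum of 64 GB, below the preferred 128 GB)"
    else
        echo "LOW: $gb GB (minimum is 64 GB)"
    fi
}

# Example value from the sitecheck output in this tutorial.
check_memory 289287896

# On a Linux host you could instead pass the live value:
# check_memory "$(awk '/MemTotal/ {print $2}' /proc/meminfo)"
```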

  • User Limits
User Limits
bash -c 'ulimit -a'
---------------------------------------------------------------------
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1520514
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 10240
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 131072
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
=====================================================================

The two metrics to consider are max user processes (-u) and open files (-n).

a. For max user processes, the recommended limit is 64 per core. Assuming we use all 96 cores, $96*64 = 6,144 < 131,072$, so the system limit is sufficient.

b. For max open files, the system limit is below the recommendation: $10,240 < 16,000$. While the pipelines may run at a lower open file limit, caution is urged. The value needed depends on the system, the sample type and the number of samples being run. If the pipeline errors, it is advisable to increase the open files limit with ulimit and try again.
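Both user-limit comparisons can be checked together. The sketch below uses a hypothetical function, check_user_limits, with the recommendations quoted above (64 processes per core, 16,000 open files), and runs it on the example values from this tutorial. It assumes numeric inputs, so a limit reported as "unlimited" would need to be handled separately.

```shell
# Hypothetical helper: compare user limits with the tutorial's recommendations.
# Arguments: $1 = cores in use, $2 = max user processes, $3 = open files limit.
check_user_limits() {
    local cores=$1 max_procs=$2 open_files=$3
    local needed_procs=$(( cores * 64 ))   # 64 processes per core
    if [ "$max_procs" -ge "$needed_procs" ]; then
        echo "max user processes: OK ($max_procs >= $needed_procs)"
    else
        echo "max user processes: LOW ($max_procs < $needed_procs)"
    fi
    if [ "$open_files" -ge 16000 ]; then
        echo "open files: OK ($open_files >= 16000)"
    else
        echo "open files: LOW ($open_files < 16000)"
    fi
}

# Example values from the sitecheck output in this tutorial.
check_user_limits 96 131072 10240

# For the current shell (numeric limits only):
# check_user_limits "$(grep -c processor /proc/cpuinfo)" "$(ulimit -u)" "$(ulimit -n)"
```

For the example system this reports the process limit as sufficient and the open files limit as low, matching the discussion above.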

  • Global File Limit
Global File Limit
cat /proc/sys/fs/file-{max,nr}
---------------------------------------------------------------------
2921445
68736 0 262144
=====================================================================

The value satisfies the minimum requirement of 10k per GB of RAM: $10,000*289 = 2,890,000 < 2,921,445$, where 289 GB is the total memory of the system.

The software support team can review your sitecheck results. There are two ways to send them:

  • If the compute platform has access to the internet, use the upload pipeline, replacing the email address with your own:
spaceranger upload [email protected] sitecheck.txt
  • If the compute platform is not connected to the internet, you can send the sitecheck.txt as an attachment to [email protected].

We can verify the installation using spaceranger testrun. This pipeline can be run in two configurations depending on the internet connectivity of the compute platform.

# With internet access
spaceranger testrun --id=verify_install

# Without internet access
spaceranger testrun --no-internet --id=verify_install

# Expected output
Martian Runtime - v4.0.5
Running preflight checks (please wait)...
Checking sample info...
Checking FASTQ folder...
Checking reference...
Checking reference_path
Checking optional arguments...
...
Pipestance completed successfully!

Successful completion of the testrun implies that spaceranger is installed correctly.
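If you save the testrun output to a file, the success message can be checked automatically. This is a minimal sketch: the function name testrun_succeeded and the log file name testrun.log are hypothetical, and the only thing it looks for is the "Pipestance completed successfully!" line shown in the expected output. The demonstration uses a scratch log rather than a real run.

```shell
# Hypothetical helper: succeed if a saved testrun log contains the success line.
# Argument: $1 = file holding the captured testrun output.
testrun_succeeded() {
    grep -Fq 'Pipestance completed successfully!' "$1"
}

# Scratch log standing in for real testrun output.
demo_log=$(mktemp)
printf '%s\n' \
    'Running preflight checks (please wait)...' \
    'Pipestance completed successfully!' > "$demo_log"

if testrun_succeeded "$demo_log"; then
    echo "testrun passed"
else
    echo "testrun failed"
fi

# With a real installation you could capture the output first:
# spaceranger testrun --id=verify_install 2>&1 | tee testrun.log
# testrun_succeeded testrun.log
```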

Q: How can I use multiple versions of spaceranger?
Sometimes it is useful to have access to older as well as newer versions of spaceranger. There are two suggested ways to achieve this:

  • Update $PATH to point to the latest version

Since the spaceranger tarball comes annotated with the version number, you can download and install the latest version alongside older ones and subsequently update the $PATH variable to point to the version you wish to use.

  • Use virtual environments

You can install and set up conda, which functions as both a package manager and an environment manager. Using virtual environments to run software provides benefits such as reproducibility, compatibility and versioning, and lets you install software without admin permissions on shared compute environments such as High-Performance Computing clusters (HPCs).
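For the first approach (updating $PATH), a small shell function can make switching between versions less error prone. This is only a sketch under stated assumptions: the function name use_spaceranger is hypothetical, and it assumes every version is extracted side by side under one parent directory as spaceranger-<version>. The demonstration creates stand-in executables in a scratch directory instead of real installations, and the version numbers are just labels.

```shell
# Hypothetical helper: put one installed version first on PATH.
# Arguments: $1 = version (e.g. 2.0.0), $2 = parent directory of the installs.
use_spaceranger() {
    local version=$1
    local root=${2:-$HOME/apps}          # assumed default install location
    local dir="$root/spaceranger-$version"
    if [ ! -x "$dir/spaceranger" ]; then
        echo "spaceranger $version not found in $root" >&2
        return 1
    fi
    # The first matching PATH entry wins, so prepending selects this version.
    export PATH="$dir:$PATH"
}

# Demonstration with stand-in executables in a scratch directory.
demo_root=$(mktemp -d)
for v in 1.3.1 2.0.0; do
    mkdir -p "$demo_root/spaceranger-$v"
    printf '#!/bin/sh\necho "fake spaceranger %s"\n' "$v" \
        > "$demo_root/spaceranger-$v/spaceranger"
    chmod +x "$demo_root/spaceranger-$v/spaceranger"
done

use_spaceranger 2.0.0 "$demo_root" && spaceranger
use_spaceranger 1.3.1 "$demo_root" && spaceranger
```

The change lasts only for the current session, so it does not affect the default version set in your shell configuration file.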