Run spaceranger annotate

The cloud-based cell annotation models were co-developed by 10x Genomics and the Cellarium AI Lab at the Data Sciences Platform of the Broad Institute. The models are in beta. A preprint describing the method is available on bioRxiv: Accelerating scRNA-seq Analysis: Automated cell type annotation using representation learning and vector search. The Pan-Human Azimuth model was developed by the Satija lab as part of the Human BioMolecular Atlas Program (HuBMAP).

When you enable cloud-based cell type annotation, your data is securely transmitted to 10x Genomics Cloud Analysis. Since your data is leaving your local environment and entering the 10x Genomics domain, it becomes subject to the terms outlined in the 10x Genomics End User License Agreement (EULA). Please review the EULA carefully to understand how your data will be handled and the associated usage terms. Additionally, please only use this feature if there are no restrictions that preclude your data being sent outside your local environment. The availability of automated cell annotation is subject to restrictions based on U.S. or local laws and regulations. See regional restrictions for the list of impacted regions.

Cell type annotation refers to the process of categorizing and assigning cell types to individual cells based on their gene expression profiles. These annotations are needed for understanding the cellular composition and diversity within a sample.

Space Ranger v4.1 introduced support for automated cell type annotation as part of the spaceranger count command and as a standalone command, spaceranger annotate.

To generate automated cell type annotations, ensure your analysis includes a Visium HD or Visium HD 3' library.

For cloud-based models:

Your sample must be from a human or mouse.
The total number of cells in your analysis should range from 100 to 1.2 million.
You must have a 10x Genomics Cloud Analysis account.

For the Pan-Human Azimuth model:

Only human samples are supported.
A 10x Genomics Cloud Analysis account is not required.
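The prerequisites above can be read as a small decision rule. The sketch below is illustrative only: eligible_models is a hypothetical helper, not a Space Ranger command, and it assumes the 100 to 1.2 million cell range is inclusive at both ends.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Space Ranger): reports which annotation
# models the prerequisites allow for a given sample.
# Usage: eligible_models SPECIES N_CELLS HAS_CLOUD_ACCOUNT(yes|no)
eligible_models() {
    local species="$1" n_cells="$2" has_account="$3"
    local models=""

    # Cloud-based models: human or mouse, 100 to 1.2 million cells
    # (assumed inclusive), and a 10x Genomics Cloud Analysis account.
    if [ "$species" = "human" ] || [ "$species" = "mouse" ]; then
        if [ "$n_cells" -ge 100 ] && [ "$n_cells" -le 1200000 ] \
            && [ "$has_account" = "yes" ]; then
            models="cloud"
        fi
    fi

    # Pan-Human Azimuth model: human only, no account required.
    if [ "$species" = "human" ]; then
        models="${models:+$models }azimuth"
    fi

    echo "${models:-none}"
}

eligible_models human 50000 yes   # prints "cloud azimuth"
eligible_models mouse 50000 no    # prints "none"
```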

To run automated 10x Genomics cell type annotations with the spaceranger annotate command, you will need to access the 10x Genomics Cloud CLI Access Token.

If you only want to run the Pan-Human Azimuth model (human only), which runs locally within the Space Ranger pipeline rather than on 10x Genomics Cloud, you do not need a token and can skip the token setup described below.

There are two ways of accessing the token:

Run spaceranger cloud auth setup (recommended):

Space Ranger v4.1 introduced a new command, spaceranger cloud auth setup, to simplify the process of authenticating with 10x Genomics Cloud Analysis. This command provides an interactive walkthrough that guides you step-by-step through the setup.

When you run the command:

You will be prompted to visit the 10x Genomics Cloud Analysis site, where you can generate an access token.
After copying the token, paste it back into the command prompt, allowing Space Ranger to save the token locally.

Once saved, the token is automatically reused for future requests, making it easier to access Cloud Analysis services without needing to repeatedly enter credentials.

Manually create a token file

The token is located on the security page of your Cloud Analysis account: https://cloud.10xgenomics.com/account/security.

You have the option to either generate a new token or copy an existing one.

To create a new token, click "Generate New Access Token."

Once the token is generated, use the copy button to copy the entire token and save it as a plain text file in a secure location that others cannot access. This token controls access to data stored in your 10x Cloud Account.

This token file is required as an input with a local installation (i.e., not running in 10x Cloud) to run spaceranger annotate and spaceranger count. When using these pipelines, you will provide a path to the token file via the --tenx-cloud-token-path argument.
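One way to save a manually copied token so that other users cannot read it is sketched below. The save_token function and the example file path are hypothetical and not part of Space Ranger. The token is read from standard input so that it does not end up in your shell history.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Space Ranger): writes the token read from
# standard input to a file that only the owner can read or write.
# Usage: save_token TOKEN_FILE < token_source
save_token() {
    local token_file="$1"
    # The subshell keeps the restrictive umask from affecting the caller.
    (
        umask 077
        cat > "$token_file"
    )
    # Tighten permissions even if the file already existed with looser ones.
    chmod 600 "$token_file"
}

# Example (the path is an assumption): run the line below, paste the token,
# press Enter, then press Ctrl-D to finish.
#   save_token "$HOME/10x_cloud_token.txt"
```

The resulting file path is what you would pass to --tenx-cloud-token-path.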

The inputs for spaceranger annotate are files generated by the spaceranger count pipeline for Visium HD data.

Specifically, spaceranger annotate requires the following files, located in the outs/ directory of a typical spaceranger count run:

Filtered feature-barcode matrix in H5 format. It is highly recommended to use the cell-segmented version, not the binned version. All annotation models were trained on single cell data and will not perform as intended when bins are used.
Loupe Browser file (.cloupe) (Optional): If you want spaceranger annotate to generate an annotated .cloupe file as part of the output, include this file as an input. The .cloupe file provides a visual representation of the gene expression data, which can be used in the Loupe Browser to explore the results interactively. If you do not provide a .cloupe file, spaceranger annotate will still run, but it cannot produce an annotated .cloupe output.

Visit the command line arguments page or run spaceranger annotate --help for a full list of accepted arguments.

An example command looks like this:


spaceranger annotate --id=sample123 \
    --matrix=filtered_feature_bc_matrix.h5 \
    --cell-annotation-model=auto \
    --tenx-cloud-token-path=/path/to/10xcloud_token.json

In this example:

--matrix specifies the path to the filtered feature-barcode matrix in H5 format.
--cell-annotation-model determines the 10x Genomics cloud-based model used for cell type annotation. When set to auto, the pipeline automatically selects the appropriate model(s). The currently available models are human_pca_v1_beta (10x human model) and mouse_pca_v1_beta (10x mouse model).
--tenx-cloud-token-path is the path to the 10x Genomics Cloud Access Token, which is required to communicate with the cloud-based models. If this argument is not supplied, the pipeline defaults to the location stored by spaceranger cloud auth setup. If the token file does not exist, the pipeline reports an error.
If you do not provide --cell-annotation-model and your sample is derived from a human, the Pan-Human Azimuth model will still be run locally.
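Before launching a run, it can help to confirm that the input files exist. The sketch below assumes nothing beyond the arguments shown in the example above. print_annotate_cmd is a hypothetical wrapper, not part of Space Ranger, and it only prints the command so that you can review it before running it yourself.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of Space Ranger): validates the inputs, then
# prints the spaceranger annotate command instead of running it.
# Usage: print_annotate_cmd RUN_ID MATRIX_H5 [TOKEN_FILE]
print_annotate_cmd() {
    local run_id="$1" matrix="$2" token_file="${3:-}"

    # The matrix must exist and be in H5 format.
    if [ ! -f "$matrix" ]; then
        echo "error: matrix not found: $matrix" >&2
        return 1
    fi
    case "$matrix" in
        *.h5) ;;
        *) echo "error: matrix must be an H5 file: $matrix" >&2; return 1 ;;
    esac

    # Only arguments that appear in the documented example are used.
    set -- spaceranger annotate "--id=$run_id" "--matrix=$matrix" \
        "--cell-annotation-model=auto"

    # The token path is optional: when omitted, Space Ranger falls back to the
    # location stored by spaceranger cloud auth setup.
    if [ -n "$token_file" ]; then
        if [ ! -f "$token_file" ]; then
            echo "error: token file not found: $token_file" >&2
            return 1
        fi
        set -- "$@" "--tenx-cloud-token-path=$token_file"
    fi

    echo "$*"
}
```

Printing rather than executing keeps the sketch safe to try on a machine without Space Ranger installed. The printed line can be copied into the terminal once it looks right.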

Cell type annotation generates the same output files, whether run as a standalone annotate command or integrated into the count pipeline. All outputs are saved in the outs/ directory. For more details, see the Cell Type Annotation Outputs page.
