Cell Type Annotation

The cloud-based cell annotation model was co-developed by 10x Genomics and the Cellarium AI Lab at the Data Sciences Platform of the Broad Institute. The model is in beta. A preprint describing the method is available on bioRxiv: Accelerating scRNA-seq Analysis: Automated cell type annotation using representation learning and vector search. The Pan-Human Azimuth model was developed by the Satija lab as part of the Human BioMolecular Atlas Program (HuBMAP).

Cell type annotation refers to the process of categorizing and assigning cell types to individual cells based on their gene expression profiles. These annotations are needed for understanding the cellular composition and diversity within a sample.

Cell Ranger ARC v2.1 and later supports automated cell type annotation as part of the cellranger-arc count command. Learn more about the annotation algorithms here.

To generate automated cell type annotations, ensure your analysis includes a Gene Expression library.
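
A quick pre-flight check can confirm this requirement before a run is launched. The sketch below assumes the libraries CSV has a header row with a library_type column (alongside fastqs and sample) and that no field contains an embedded comma. The name has_gex_library is a hypothetical helper, not part of Cell Ranger ARC.

```shell
# Hypothetical helper: succeed if the libraries CSV lists at least one
# "Gene Expression" library. Assumes a header row that contains a
# library_type column and no commas inside fields.
has_gex_library() {
    awk -F',' '
        function trim(s) { gsub(/^[ \t\r]+|[ \t\r]+$/, "", s); return s }
        # Header row: locate the library_type column.
        NR == 1 {
            for (i = 1; i <= NF; i++)
                if (trim($i) == "library_type") col = i
            next
        }
        # Data rows: remember whether any row is a Gene Expression library.
        col && trim($col) == "Gene Expression" { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

For example, `has_gex_library /home/jdoe/runs/libraries.csv || echo "no Gene Expression library listed"` prints a warning when the check fails.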

For cloud-based models:

  1. Your sample must be from a human or mouse.
  2. The total number of cells in your analysis should range from 100 to 1.2 million.
  3. You must have a 10x Genomics Cloud Analysis account.
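
The first two requirements can be checked in a script before a run is submitted. This is a minimal sketch: check_cloud_prereqs is a hypothetical helper, the species and cell count are supplied by the caller, and the account requirement is not checked.

```shell
# Hypothetical helper: check the documented species and cell-count
# requirements for the cloud-based models.
# Arguments: species name (human or mouse), total number of cells.
check_cloud_prereqs() {
    prereq_species=$1
    prereq_cells=$2

    case "$prereq_species" in
        human|mouse) ;;
        *) echo "unsupported species: $prereq_species" >&2; return 1 ;;
    esac

    # Reject anything that is not a plain non-negative integer.
    case "$prereq_cells" in
        ''|*[!0-9]*) echo "cell count must be an integer" >&2; return 1 ;;
    esac

    if [ "$prereq_cells" -lt 100 ] || [ "$prereq_cells" -gt 1200000 ]; then
        echo "cell count $prereq_cells is outside the range 100 to 1200000" >&2
        return 1
    fi
    return 0
}
```

A call such as `check_cloud_prereqs human 5000` returns success, while a species other than human or mouse, or a count outside the supported range, returns a non-zero status with a message on stderr.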

For the Pan-Human Azimuth model:

  1. Only human samples are supported.
  2. A 10x Genomics Cloud Analysis account is not required.

Cell Ranger ARC v2.2 introduces the Pan-Human Azimuth cell type annotation model to the count pipeline (not currently available for the aggr pipeline).

The Pan-Human Azimuth model runs locally by default, not in 10x Genomics Cloud Analysis. No extra options are needed; set up the pipeline like a normal count run:

cellranger-arc count --id=sample123 \
    --reference=/opt/refdata-cellranger-arc-GRCh38-2024-A \
    --libraries=/home/jdoe/runs/libraries.csv \
    --create-bam=true \
    --matrix=filtered_feature_bc_matrix.h5

If you wish to disable the Pan-Human Azimuth cell annotation pipeline, add the --disable-cell-annotation option to the count command. Visit the command line arguments page for a full list of accepted arguments.

When you enable cell type annotation, your data is securely transmitted to 10x Genomics Cloud Analysis. Since your data is leaving your local environment and entering the 10x Genomics domain, it becomes subject to the terms outlined in the 10x Genomics End User License Agreement (EULA). Please review the EULA carefully to understand how your data will be handled and the associated usage terms. Additionally, please only use this feature if there are no restrictions that preclude your data being sent outside your local environment. The availability of automated cell annotation is subject to restrictions based on U.S. or local laws and regulations. See regional restrictions for the list of impacted regions.

To run automated cell type annotations with the cellranger-arc count command, you will need to access the 10x Genomics Cloud CLI Access Token.

There are two ways of accessing the token:

  1. Run cellranger-arc cloud auth setup (recommended):

Cell Ranger ARC v2.1 introduces a new command, cellranger-arc cloud auth setup, to simplify the process of authenticating with 10x Genomics Cloud Analysis. This command provides an interactive walkthrough that guides you step-by-step through the setup.

When you run the command:

  • You will be prompted to visit the 10x Genomics Cloud Analysis site, where you can generate an access token.
  • After copying the token, paste it back into the command prompt, allowing Cell Ranger ARC to save the token locally.

Once saved, the token is automatically reused for future requests, making it easier to access Cloud Analysis services without needing to repeatedly enter credentials.

  2. Manually create a token file

The token is located on the security page of your Cloud Analysis account: https://cloud.10xgenomics.com/account/security.

You have the option to either generate a new token or copy an existing one.

To create a new token, click "Generate New Access Token."

Once the token is generated, use the copy button to copy the entire token and save it as a plain text file in a secure location that others cannot access. This token controls access to data stored in your 10x Cloud Account.
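
One way to keep the token file private on a shared Linux or macOS system is to create it with owner-only permissions. The sketch below is an illustration under assumptions: save_cloud_token is a hypothetical helper, and the destination path is chosen by the user. It reads the token from standard input so that the token does not appear in the shell history.

```shell
# Hypothetical helper: write an access token read from standard input to
# a file that only the current user can read or write.
# Argument: destination path for the token file.
save_cloud_token() (
    # Runs in a subshell, so the umask change does not affect the caller.
    umask 077                          # new files get no group/other access
    cat > "$1" && chmod 600 "$1"       # also tightens a pre-existing file
)
```

To use it, run `save_cloud_token /path/to/10xcloud_token.json`, paste the token, press Enter, and then press Ctrl-D to end the input.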

This token file is required as an input with a local installation (i.e., not running in 10x Cloud) to run cellranger-arc count with cell annotation. You will provide a path to the token file via the --tenx-cloud-token-path argument.

An example to run cellranger-arc count with cell annotation looks like this:

cellranger-arc count --id=sample123 \
    --reference=/opt/refdata-cellranger-arc-GRCh38-2024-A \
    --libraries=/home/jdoe/runs/libraries.csv \
    --create-bam=true \
    --matrix=filtered_feature_bc_matrix.h5 \
    --cell-annotation-model=auto \
    --tenx-cloud-token-path=/path/to/10xcloud_token.json

In this example:

  • --cell-annotation-model determines the 10x Genomics cloud-based model used for cell type annotation. When set to auto, the pipeline automatically selects the appropriate model(s). Currently available models are human_pca_v1_beta (10x human model) and mouse_pca_v1_beta (10x mouse model).
  • --tenx-cloud-token-path is the path to the 10x Genomics Cloud Access Token, which is necessary for communication with the cloud-based cell annotation model. If not supplied, the pipeline uses the token location saved by cellranger-arc cloud auth setup. If the token file does not exist, the pipeline reports an error.
  • If you do not provide --cell-annotation-model and your sample is derived from a human, the Pan-Human Azimuth model will still be run locally.
  • If you wish to disable all annotations, use the --disable-cell-annotation option.
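
The three choices described above (cloud-based model, local Pan-Human Azimuth only, or no annotation) can be captured in a small wrapper that prints the matching options. This is a sketch, not part of Cell Ranger ARC: annotation_args and its mode names cloud, local, and off are hypothetical, while the options it prints are the ones described in this section. It also checks that an explicitly supplied token file exists before the pipeline is started.

```shell
# Hypothetical helper: print the annotation-related options for
# cellranger-arc count, one per line.
# Arguments: mode (cloud, local, or off) and, optionally for cloud mode,
# the path to the token file.
annotation_args() {
    case "$1" in
        cloud)
            if [ -n "${2:-}" ]; then
                # Fail early instead of letting the pipeline hit the error.
                if [ ! -f "$2" ]; then
                    echo "token file not found: $2" >&2
                    return 1
                fi
                printf '%s\n' "--cell-annotation-model=auto" \
                              "--tenx-cloud-token-path=$2"
            else
                # No path given: rely on the token saved by
                # cellranger-arc cloud auth setup.
                printf '%s\n' "--cell-annotation-model=auto"
            fi
            ;;
        local)
            # No options needed: Pan-Human Azimuth runs by default for
            # human samples.
            ;;
        off)
            printf '%s\n' "--disable-cell-annotation"
            ;;
        *)
            echo "unknown mode: $1" >&2
            return 1
            ;;
    esac
}
```

The output can be appended to a count command with `$(annotation_args cloud /path/to/10xcloud_token.json)`. Because unquoted command substitution splits on whitespace, this only works for token paths that contain no spaces.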

Visit the command line arguments page for a full list of accepted arguments.

Cell type annotation generates the same output files, whether run as a standalone command or integrated into the count pipeline. All outputs are saved in the outs/ directory.

outs/
├── cell_types
│   ├── cell_annotation_differential_expression.csv
│   ├── cell_annotation_results.json.gz
│   └── cell_types.csv
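
After a run finishes, a short script can confirm that these files were written and give a first summary of the annotations. The sketch below makes several assumptions: check_annotation_outputs and count_by_column are hypothetical helpers, cell_types.csv is assumed to have a header row and no commas inside fields, and the column name is supplied by the caller because this page does not list the columns.

```shell
# Hypothetical helper: confirm the three documented annotation outputs
# exist. Argument: path to the run's outs/ directory.
check_annotation_outputs() {
    missing=0
    for f in cell_annotation_differential_expression.csv \
             cell_annotation_results.json.gz \
             cell_types.csv; do
        if [ ! -f "$1/cell_types/$f" ]; then
            echo "missing: $1/cell_types/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Hypothetical helper: count rows per value of a named CSV column and
# print "value<TAB>count" lines in sorted order.
# Arguments: CSV path, column name (taken from the file's header row).
count_by_column() {
    awk -F',' -v name="$2" '
        { sub(/\r$/, "") }             # tolerate Windows line endings
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            next
        }
        col { n[$col]++ }
        END { for (k in n) printf "%s\t%d\n", k, n[k] }
    ' "$1" | sort
}
```

To find a column to summarize, inspect the header with `head -n 1 sample123/outs/cell_types/cell_types.csv`, then pass one of the listed names as the second argument to count_by_column.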

For more details about these output files, see the Cell Type Annotation Outputs page.