Xenium In Situ Gene and Protein Assay Outputs

Along with additions to some existing XOA gene expression outputs, XOA v4.0 outputs files that are specific to data generated with the Xenium In Situ Gene and Protein Expression with Cell Segmentation Staining assay (CG000819) (referred to as Xenium Protein assay hereafter).

The following files contain both gene and protein data:

  • Analysis summary HTML
  • Cell-feature matrix files (MEX, H5, Zarr)
  • Secondary analysis files (analysis/, Zarr)

The following files are only provided in protein datasets:

  • 2D morphology focus OME-TIFF protein images
  • 2D morphology focus OME-TIFF background images
  • Auxiliary output directory with 2nd cycle autofluorescence images

The Summary, Analysis, and Image QC tabs contain additional information for protein datasets; these are described below. The Decoding and Cell Segmentation tabs are the same as described for all Xenium datasets here.

The Protein Panel lists all the protein subpanel(s) selected for the experiment on the Xenium Analyzer instrument. If selected during run setup, multimodal cell segmentation markers can be imaged individually (labeled "Cellular Localization" markers).

Protein datasets include two subtabs on the Analysis tab for secondary analysis results calculated from the transcript data and from the protein data. The Top Features tables for both subtabs are provided to help annotate clusters.

The Gene Expression Analysis subtab displays the same plots and tables as described on the Analysis Summary Overview page, with the addition of a table for relative protein expression for the RNA-only clusters:

  • The Top Protein Expression Features by Cluster (Z-score) table shows top proteins expressed based on RNA-only clusters.
  • The Top Features by Cluster (Log2 fold-change, p-value) table shows top genes expressed based on RNA-only clusters.

The Protein Analysis subtab displays Mean Protein Intensity Per Cell plots, Clustering plots, Top Feature tables for protein and gene expression, and a Protein Correlations plot. Refer to the help text in the HTML file for detailed plot and table descriptions.


The top feature tables on this view are presented for the protein-only cluster results:

  • The Top Protein Expression Features by Cluster (Z-score) table shows top proteins expressed based on protein-only clusters.
  • The Top Features by Cluster (Log2 fold-change, p-value) table shows top genes expressed based on protein-only clusters.

The Protein Correlations plot shows the Pearson correlation matrix of mean fluorescence intensity per cell for all proteins in the sample. Proteins are hierarchically clustered using the correlation structure and then ordered according to this clustering. Proteins that are markers for the same or similar cell types, such as memory T cells, should be clustered together and have high correlation.
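
As a rough illustration of the correlation step described above, the following Python sketch computes a Pearson correlation matrix from per-cell mean intensities. The intensity values are made up for illustration, the function names `pearson` and `correlation_matrix` are hypothetical, and the hierarchical clustering used to order the plot is not reproduced here. This is not the XOA implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists of per-cell intensities."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a protein with constant intensity
        return float("nan")
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(intensities):
    """Return {protein: {protein: r}} for a {protein: [values per cell]} mapping."""
    names = list(intensities)
    return {a: {b: pearson(intensities[a], intensities[b]) for b in names}
            for a in names}

# Made-up mean intensities for four cells (illustrative only, not real data)
toy = {
    "CD3E": [1.0, 2.0, 3.0, 4.0],
    "CD4": [2.0, 4.0, 6.0, 8.0],
    "CD20": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

In this toy input, CD3E and CD4 rise together across cells and so correlate strongly, while CD20 follows the opposite trend.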


Here is an example workflow for investigating the results on the Protein Analysis subtab:

  1. Select a protein or RNA cluster result to investigate and view the cell clusters in the Clustering spatial or UMAP plots.
  2. Next, look up the proteins that are more expressed in that cluster using the Top Protein Expression Features by Cluster (Z-score) table (median Z-score normalized mean intensity calculation described here).
  3. Finally, confirm the spatial expression result by viewing cells colored by mean intensity per cell in the spatial or UMAP plot (mean intensity calculation described here).

For the Xenium Protein assay outputs, this tab includes the RNA Decoding Images, Autofluorescence Images (1st cycle), and Cell Segmentation Images described on the Analysis Summary Overview page. In addition, this tab also displays Protein Images and Autofluorescence Images (2nd cycle) to help with QC and troubleshooting of protein data.

Image gallery units: The x- and y-axes are measured in micrometers. The intensity values in all autofluorescence, cell segmentation, and protein images are measured in arbitrary units (a.u.) because the image processing steps transform the original image units (photoelectrons, pe). Interpret the data as before by examining images for brightness patterns; however, note that brightness is specific to each protein marker and is not directly comparable across markers.


  • Autofluorescence Images (1st cycle): For Xenium Protein assay instrument runs, background images are acquired for the blue, green, yellow, and red channels.
  • Cell Segmentation Images: Cell segmentation staining images are acquired in the blue, green, red, and near UV channels. Note that the blank yellow channel is displayed to maintain the same order of channels presented in all sections of the Image QC gallery as you review the images vertically.

Protein Images: Thumbnail plots of the autofocused, downsampled, and processed morphology images. The protein stain images in this gallery have undergone deconvolution, background subtraction, masking, crosstalk correction, and stitching (described in the protein imaging algorithms page). Image features will appear sharper and fluorescence intensity values lower as a result of image deconvolution. These images can be used to check for large-scale issues like debris or detect patterns across imaging cycles. Any staining or image quality concerns should be verified by looking at the full resolution morphology_focus images in Xenium Explorer.


Use the toggle to overlay saturation QC masks over their corresponding protein images. Areas highlighted in magenta have saturated pixels in the exposure-adjusted background used for background subtraction or in at least one of the channels of a cycle, thus affecting crosstalk correction (described in the protein image algorithms page). Masked pixels are set to zero in the morphology_focus image and excluded from quantification. The full resolution masks can be found in aux_outputs/morphology_focus_qc_masks and viewed in Xenium Explorer by importing as an additional stack of images.

Autofluorescence Images (2nd cycle): Thumbnail plots of the autofluorescence images (maximum intensity projection (MIP) images, not deconvolved). These images are used in background subtraction to produce the autofocused images in the Protein Images gallery. Use these images to review any debris or autofluorescence artifacts that were automatically subtracted during morphology image processing. Downsampled images are available as aux_outputs/background_qc_images/background_02_[channel].tiff files.


2D multi-focus fused, deconvolved, background-subtracted, masked, and crosstalk-corrected protein images are provided in the multi-file OME-TIFF directory (morphology_focus/) that contains nucleus and cell stain segmentation images. In addition, the first cycle of autofluorescence background images, which are fused and deconvolved, is also included when protein data is present. Learn about protein image processing algorithms here.

The files will be ordered and labeled by channel number: DAPI (n=1), cell segmentation images (n=3), autofluorescence images by wavelength (n=4), and protein images in order of selection on instrument (n=up to 27). Protein subpanels are sorted alphabetically, and each protein within a subpanel is sorted by wavelength. If Xenium Multi-Tissue Stain Mix markers are imaged separately as one of the protein cycles, they are listed last and called "Cellular Localization" markers. Learn about protein subpanel markers here.

Example morphology_focus folder with DAPI, cell segmentation images, autofluorescence images, and all protein images:

morphology_focus
├── ch0000_dapi.ome.tif                      # DAPI image (cycle 17)
├── ch0001_atp1a1_cd45_e-cadherin.ome.tif    # Cell segmentation images
├── ch0002_18s.ome.tif
├── ch0003_alphasma_vimentin.ome.tif
├── ch0004_blue_background.ome.tif           # 1st cycle autofluorescence images
├── ch0005_green_background.ome.tif
├── ch0006_yellow_background.ome.tif
├── ch0007_red_background.ome.tif
├── ch0008_cd4.ome.tif                       # Xenium Protein Immune Cell Subpanel A images
├── ch0009_cd20.ome.tif
├── ch0010_cd8a.ome.tif
├── ch0011_cd3e.ome.tif
├── ch0012_cd138.ome.tif                     # Xenium Protein Immune Cell Subpanel B images
├── ch0013_hla-dr.ome.tif
├── ch0014_cd11c.ome.tif
├── ch0015_cd68.ome.tif
├── ch0016_cd16.ome.tif                      # Xenium Protein Immune Cell Subpanel C images
├── ch0017_granzymeb.ome.tif
├── ch0018_cd163.ome.tif
├── ch0019_pd-1.ome.tif                      # Xenium Protein Immune Checkpoint Subpanel images
├── ch0020_vista.ome.tif
├── ch0021_pd-l1.ome.tif
├── ch0022_lag-3.ome.tif
├── ch0023_cd45ra.ome.tif                    # Xenium Protein Proliferation & Differentiation Subpanel images
├── ch0024_pcna.ome.tif
├── ch0025_cd45ro.ome.tif
├── ch0026_ki-67.ome.tif
├── ch0027_beta-catenin.ome.tif              # Xenium Protein Tumor Subpanel images
├── ch0028_cd31.ome.tif
├── ch0029_pten.ome.tif
├── ch0030_panck.ome.tif
├── ch0031_vimentin.ome.tif                  # Cellular Localization marker images
├── ch0032_alphasma.ome.tif
├── ch0033_cd45.ome.tif
└── ch0034_e-cadherin.ome.tif
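
The file naming and channel ordering described above can be sketched in Python as follows. This is a minimal illustration that assumes the ch[4-digit index]_[label].ome.tif pattern seen in the example listing and the channel counts stated above for a protein dataset (1 DAPI, 3 cell segmentation, 4 autofluorescence, then protein images). The helper names `parse_channel_file`, `channel_group`, and `list_channels` are hypothetical and are not part of any 10x Genomics tool.

```python
import re
from pathlib import Path

# Pattern assumed from the example listing: ch<4-digit index>_<label>.ome.tif
CHANNEL_RE = re.compile(r"^ch(\d{4})_(.+)\.ome\.tif$")

def parse_channel_file(name):
    """Return (channel_index, label) for a morphology_focus file name, or None."""
    match = CHANNEL_RE.match(Path(name).name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)

def channel_group(index):
    """Classify a channel index using the ordering stated for protein datasets."""
    if index == 0:
        return "DAPI"
    if 1 <= index <= 3:
        return "cell segmentation"
    if 4 <= index <= 7:
        return "autofluorescence"
    # Includes Cellular Localization markers when they are imaged separately
    return "protein"

def list_channels(names):
    """Sort parsed (index, label) pairs by channel index, skipping other files."""
    parsed = [parse_channel_file(n) for n in names]
    return sorted(p for p in parsed if p is not None)

example = [
    "ch0008_cd4.ome.tif",
    "ch0000_dapi.ome.tif",
    "ch0004_blue_background.ome.tif",
    "ch0001_atp1a1_cd45_e-cadherin.ome.tif",
]
channels = list_channels(example)
for index, label in channels:
    print(index, label, channel_group(index))
```

The same parsing can be applied to the names returned by listing the morphology_focus/ directory, and each matching path can then be opened with an OME-TIFF reader of your choice.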

Programmatic examples for viewing these files are provided below.

View multi-file OME-TIFF images with Python

The output bundle contains the same cell_feature_matrix files (MEX, HDF5, and Zarr formats) as for gene expression datasets, and adds a Protein Expression feature type.

The cell_feature_matrix includes scaled mean protein intensity values (see calculation), 10x Genomics Xenium Protein marker IDs (TXP), and protein marker abbreviations.

Example of protein features in the cell_feature_matrix/features.tsv file:

TXP000019	PD-1	Protein Expression
TXP000023	VISTA	Protein Expression
TXP000029	PD-L1	Protein Expression
TXP000016	LAG-3	Protein Expression
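
To illustrate how the protein rows can be separated from gene rows, here is a minimal Python sketch that filters feature rows by the Protein Expression feature type. It assumes three tab-separated columns (feature ID, feature name, feature type) as in the example above and reads from an in-memory copy of those rows. The function name `protein_features` is hypothetical.

```python
import csv
import io

# Sample rows copied from the example above; columns are assumed to be
# feature ID, feature name, and feature type, separated by tabs.
SAMPLE = (
    "TXP000019\tPD-1\tProtein Expression\n"
    "TXP000023\tVISTA\tProtein Expression\n"
    "TXP000029\tPD-L1\tProtein Expression\n"
    "TXP000016\tLAG-3\tProtein Expression\n"
)

def protein_features(handle):
    """Return {feature_id: name} for rows whose type is 'Protein Expression'."""
    reader = csv.reader(handle, delimiter="\t")
    return {row[0]: row[1] for row in reader
            if len(row) >= 3 and row[2] == "Protein Expression"}

proteins = protein_features(io.StringIO(SAMPLE))
print(proteins)
```

For a file on disk, pass an open text handle instead of the `io.StringIO` object. If the file is compressed, `gzip.open(path, "rt")` from the standard library provides such a handle.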

Example hierarchy of the cell_feature_matrix.h5 file:

h5dump -n ./cell_feature_matrix.h5
HDF5 "./cell_feature_matrix.h5" {
FILE_CONTENTS {
 group      /
 group      /matrix
 dataset    /matrix/barcodes
 dataset    /matrix/data
 group      /matrix/features
 dataset    /matrix/features/_all_tag_keys
 dataset    /matrix/features/feature_type
 dataset    /matrix/features/genome
 dataset    /matrix/features/id
 dataset    /matrix/features/name
 dataset    /matrix/indices
 dataset    /matrix/indptr
 dataset    /matrix/shape
 }
}

# The scaling factor value is stored in the H5 file:
h5dump --attribute=protein_scaling_factor cell_feature_matrix.h5
HDF5 "cell_feature_matrix.h5" {
ATTRIBUTE "protein_scaling_factor" {
   DATATYPE  H5T_STD_I64LE
   DATASPACE  SCALAR
   DATA {
   (0): 10
   }
}
}

The 10x Genomics Xenium IDs per protein in each subpanel are:

Subpanel                                                 Protein Marker ID  Protein Marker Name
Xenium Protein Immune Cell Subpanel A                    TXP000007          CD4
                                                         TXP000004          CD20
                                                         TXP000011          CD8A
                                                         TXP000005          CD3E
Xenium Protein Immune Cell Subpanel B                    TXP000003          CD138
                                                         TXP000015          HLA-DR
                                                         TXP000002          CD11c
                                                         TXP000010          CD68
Xenium Protein Immune Cell Subpanel C                    TXP000044          CD16
                                                         TXP000013          GranzymeB
                                                         TXP000034          CD163
Xenium Protein Immune Checkpoint Subpanel                TXP000019          PD-1
                                                         TXP000023          VISTA
                                                         TXP000029          PD-L1
                                                         TXP000016          LAG-3
Xenium Protein Proliferation & Differentiation Subpanel  TXP000008          CD45RA
                                                         TXP000018          PCNA
                                                         TXP000009          CD45RO
                                                         TXP000030          Ki-67
Xenium Protein Tumor Subpanel                            TXP000035          Beta-catenin
                                                         TXP000006          CD31
                                                         TXP000021          PTEN
                                                         TXP000017          PanCK
Xenium Cellular Localization markers*                    TXP000022          Vimentin
                                                         TXP000001          alphaSMA
                                                         TXP000024          CD45
                                                         TXP000025          E-Cadherin

* If Xenium Multi-Tissue Stain Mix markers are imaged individually on-instrument.

The output bundle contains secondary analysis files (CSV and Zarr formats) for gene data (prefix gene_expression_) and protein data (prefix protein_expression_). These results can be used for data QC and provide a starting point for downstream analyses.

Directory structure example:

└─ analysis
   ├── clustering
   ├── diffexp
   ├── pca
   ├── protexp
   └── umap

For differential expression (analysis/diffexp) results, a table is produced for each protein-only and RNA-only clustering result that shows differentially expressed RNA features in each cluster relative to all other clusters. This allows the user to annotate cell types of each cluster using gene expression data. These are the same outputs shown in the analysis summary Analysis tab's Top Gene Expression Features by Cluster (Log2 fold-change, p-value) tables:

  • Gene Expression Analysis: analysis/diffexp/gene_expression_[clustering method]/differential_expression.csv
  • Protein Analysis: analysis/diffexp/protein_expression_[clustering method]/differential_expression.csv

A new analysis output is added for Xenium Protein assay data (analysis/protexp) to enable users to annotate cell types of each cluster using protein data. For each protein-only and RNA-only clustering result, a table is produced that shows the z-score standardized protein quantities in each cluster relative to all other clusters. These are the same outputs shown in the analysis summary Analysis tab’s Top Protein Expression Features by Cluster (Z-score):

  • Gene Expression Analysis: analysis/protexp/gene_expression_[clustering method]/protein_expression.csv
  • Protein Analysis: analysis/protexp/protein_expression_[clustering method]/protein_expression.csv
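
The path patterns listed above for the diffexp and protexp tables can be assembled programmatically. The Python sketch below is one way to do that, assuming the directory layout shown in this section. The helper name `analysis_table_path` is hypothetical, and the clustering method string (here `kmeans_2_clusters`) must match a directory that actually exists in your output bundle.

```python
from pathlib import PurePosixPath

# File name per result type, taken from the path patterns listed above
TABLE_FILES = {
    "diffexp": "differential_expression.csv",
    "protexp": "protein_expression.csv",
}
CLUSTERING_SOURCES = ("gene_expression", "protein_expression")

def analysis_table_path(result_type, clustering_source, clustering_method):
    """Build the relative path of a per-cluster table inside the output bundle."""
    if result_type not in TABLE_FILES:
        raise ValueError(f"unknown result type: {result_type}")
    if clustering_source not in CLUSTERING_SOURCES:
        raise ValueError(f"unknown clustering source: {clustering_source}")
    return PurePosixPath(
        "analysis",
        result_type,
        f"{clustering_source}_{clustering_method}",
        TABLE_FILES[result_type],
    )

path = analysis_table_path("protexp", "gene_expression", "kmeans_2_clusters")
print(path)
```

Here `gene_expression` selects the RNA-only clustering results (Gene Expression Analysis) and `protein_expression` selects the protein-only clustering results (Protein Analysis).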

Example:

head -5 gene_expression_kmeans_2_clusters/protein_expression.csv
Feature ID,Feature Name,1 Z Score,2 Z Score
TXP000019,PD-1,0.0,0.0
TXP000023,VISTA,0.0015787856530881013,-0.00016779365755827146
TXP000029,PD-L1,0.0,0.0
TXP000016,LAG-3,0.0,0.0

head -n 5 protein_expression_kmeans_2_clusters/protein_expression.csv
Feature ID,Feature Name,1 Z Score,2 Z Score
TXP000019,PD-1,0.0,0.0
TXP000023,VISTA,-0.0021058823637482237,0.0009362350496005511
TXP000029,PD-L1,0.0,0.0
TXP000016,LAG-3,0.0,0.0
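
A table like the one above can be ranked per cluster to look up the proteins with the highest Z-scores. The Python sketch below assumes the column naming shown in the example ("[cluster] Z Score") and reuses the example rows as in-memory input. The function name `top_proteins` is hypothetical.

```python
import csv
import io

# Rows copied from the gene_expression_kmeans_2_clusters example above
SAMPLE = """Feature ID,Feature Name,1 Z Score,2 Z Score
TXP000019,PD-1,0.0,0.0
TXP000023,VISTA,0.0015787856530881013,-0.00016779365755827146
TXP000029,PD-L1,0.0,0.0
TXP000016,LAG-3,0.0,0.0
"""

def top_proteins(handle, cluster, n=3):
    """Return the n (name, z_score) pairs with the highest Z-score in a cluster."""
    column = f"{cluster} Z Score"  # column naming assumed from the example
    rows = list(csv.DictReader(handle))
    ranked = sorted(rows, key=lambda r: float(r[column]), reverse=True)
    return [(r["Feature Name"], float(r[column])) for r in ranked[:n]]

top_cluster_1 = top_proteins(io.StringIO(SAMPLE), cluster=1, n=1)
print(top_cluster_1)
```

Ties keep their file order because Python's sort is stable, so proteins with identical Z-scores are not reordered.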

In addition to the files described here, the aux_outputs/ directory includes additional background_qc_images/ image outputs for the Xenium Protein assay.

The background_qc_images/ directory also contains the 2nd cycle autofluorescence images (downsampled, TIFF format) acquired after protein image cycles, which are subtracted from the raw morphology stain images to produce the autofocused images (morphology_focus/) if the Xenium Protein protocol is used. These files have _02 in the file name to differentiate from the background images (_01) acquired after the RNA cycles.

Example:

├── background_qc_images
│   ├── background_01_blu.tiff
│   ├── background_01_grn.tiff
│   ├── background_01_nuv.tiff
│   ├── background_01_red.tiff
│   ├── background_01_yel.tiff
│   ├── background_02_blu.tiff
│   ├── background_02_grn.tiff
│   ├── background_02_nuv.tiff
│   ├── background_02_red.tiff
│   └── background_02_yel.tiff
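
The _01 and _02 naming convention can be used to separate the two autofluorescence cycles programmatically. The Python sketch below assumes the background_[cycle]_[channel].tiff pattern from the example listing. The function name `group_by_cycle` is hypothetical.

```python
import re
from collections import defaultdict

# Pattern assumed from the example listing: background_<2-digit cycle>_<channel>.tiff
BACKGROUND_RE = re.compile(r"^background_(\d{2})_([a-z]+)\.tiff$")

def group_by_cycle(names):
    """Return {cycle_number: sorted channel codes} for background image names."""
    groups = defaultdict(list)
    for name in names:
        match = BACKGROUND_RE.match(name)
        if match:  # ignore files that do not follow the pattern
            groups[int(match.group(1))].append(match.group(2))
    return {cycle: sorted(channels) for cycle, channels in groups.items()}

example = [
    "background_01_blu.tiff",
    "background_01_red.tiff",
    "background_02_blu.tiff",
    "background_02_grn.tiff",
    "readme.txt",
]
cycles = group_by_cycle(example)
print(cycles)
```

In the result, cycle 1 holds the background images acquired after the RNA cycles and cycle 2 holds those acquired after the protein image cycles.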

Saturation QC masks are provided for all datasets starting in XOA v4.0, but are most useful for protein image data QC.

The morphology focus QC masks, also known as saturation QC masks, are binary images used to label saturated pixels that are masked in output protein images (located in morphology_focus/). The masking step ensures that invalid pixels are excluded from downstream analysis and visualization. Read the protein algorithms page to learn how the masking algorithm works. The QC masks can be imported into Xenium Explorer for data QC.

The protein_panel.json file specifies the protein targets that correspond to each imaging cycle-channel. It is generated when you select protein targets on the Xenium Analyzer UI during run setup, and is included in the output bundle.