What Xenium output should I keep for archival storage for reanalysis and grant funding requirements?
The Xenium platform aims to support and embrace principles of data findability, accessibility, interoperability, and reusability (FAIR) so that it is easy to share newly generated Xenium data for collaborative analysis and reproduce findings from published Xenium data.
To that end, we recommend archiving Xenium raw data outputs, which consist of decoded transcripts and morphology images.
Decoded transcripts are provided in .csv format. Morphology images are provided in ome.tiff format. These data should be archived to fulfill grant funding requirements and for reanalysis, and may be submitted to repositories such as GEO. All other Xenium outputs are derived from these raw data by Xenium Onboard Analysis, can be rederived after a Xenium instrument run, and are not strictly necessary for long-term archival and reproducibility.
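When archiving the raw files, it can help to record a checksum for each one so that later copies can be verified. The following is a minimal sketch, not part of any Xenium tooling: the function names and the glob patterns `transcripts*` and `morphology*` are assumptions for illustration, and should be adjusted to the actual file names in your output bundle.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(bundle_dir, patterns=("transcripts*", "morphology*")):
    """Map file name -> SHA-256 digest for raw files in a bundle directory.

    The default patterns are hypothetical; they are meant to match the
    decoded transcript and morphology image files described above.
    """
    bundle = Path(bundle_dir)
    manifest = {}
    for pattern in patterns:
        for path in sorted(bundle.glob(pattern)):
            if path.is_file():
                manifest[path.name] = sha256_of(path)
    return manifest
```

Storing the resulting manifest alongside the archived files gives a simple way to confirm, after a transfer or restore, that the raw data is unchanged.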
Additional detail on Xenium raw data output:
- A Xenium Q-Score indicates the probability that the detected object exists and was correctly identified by the decoding algorithm. All decoded transcript Q-Scores are output in the transcripts files. The cells and cell-feature matrix output files in the Xenium output bundle are filtered to Q-Score ≥ 20. For more details, see our Overview of Xenium Algorithms support page.
- Xenium morphology images will always be provided at the same resolution that our onboard segmentation algorithm uses as input. This ensures that you can benefit from improvements to our segmentation model as we add to its training over time, or run your own segmentation if you choose. Our off-instrument reanalysis package, Xenium Ranger, enables you to easily rerun segmentation or import your own segmentation results to generate derived outputs (e.g., cell-feature matrix) and view them in Xenium Explorer.
- We will stand by these FAIR principles as we add future capabilities (e.g., a multimodal boundary stain solution for full segmentation). High-resolution morphology images will continue to be included in the Xenium output bundle for our onboard multimodal segmentation method.
- Other outputs from Xenium Onboard Analysis (XOA) are derived data from these raw outputs, and the community can recapitulate them from Xenium raw data.
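As an illustration of how derived outputs relate to the raw transcripts, the sketch below applies the Q-Score ≥ 20 filter mentioned above and counts the remaining transcripts per cell and gene. It is a simplified assumption-laden example, not the Xenium Onboard Analysis or Xenium Ranger implementation: the column names `cell_id`, `feature_name`, and `qv` are assumed and should be checked against the header of your transcripts file, and the sample rows are made up.

```python
import csv
import io
from collections import Counter

QV_THRESHOLD = 20.0  # threshold stated above for the cells and cell-feature matrix outputs


def count_transcripts(rows, qv_threshold=QV_THRESHOLD):
    """Count transcripts per (cell, gene), keeping rows with Q-Score >= threshold.

    `rows` is any iterable of dicts, e.g. a csv.DictReader.
    Column names `cell_id`, `feature_name`, and `qv` are assumptions.
    """
    counts = Counter()
    for row in rows:
        if float(row["qv"]) >= qv_threshold:
            counts[(row["cell_id"], row["feature_name"])] += 1
    return counts


# Tiny in-memory stand-in for a transcripts CSV (made-up values).
example_csv = io.StringIO(
    "cell_id,feature_name,qv\n"
    "cell_1,GENE_A,35.2\n"
    "cell_1,GENE_A,12.0\n"
    "cell_1,GENE_B,20.0\n"
    "cell_2,GENE_A,40.0\n"
)
counts = count_transcripts(csv.DictReader(example_csv))
```

Because all Q-Scores are kept in the transcripts file, the threshold can be changed at reanalysis time. To regenerate the supported derived outputs, use Xenium Ranger rather than a hand-rolled count like this one.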
Xenium raw data is a reduced form of the low-level internal sensor data, as described in Overview of Xenium Algorithms. It preserves the details needed to assess decoded transcript quality while abstracting away low-level details of the instrumentation and assay, which require calibration and specialized methods that will change over time as the platform improves and gains new capabilities.
On-instrument processing of Xenium internal sensor data — i.e., the 3D per-pixel values that Xenium Analyzer’s internal image sensor captures across multiple FOVs, multiple fluorescence channels, and multiple cycles of chemistry and imaging processing — is closely tied to Xenium optics. Consequently, Xenium internal sensor data cannot be reanalyzed after processing with Xenium Onboard Analysis.
Internal sensor data is not practical to store or reanalyze (on the order of tens of terabytes per sample). In the spirit of scientific reproducibility, it is more useful to store the Xenium decoded transcripts, with their assigned Phred-scaled Q-Scores, and the morphology images for reanalysis (see typical output directory sizes).
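For readers less familiar with Phred scaling, the sketch below shows the standard conversion between a Phred-scaled score and an error probability, Q = -10 · log10(P_error). It assumes Xenium Q-Scores follow this standard convention; the function names are illustrative only.

```python
import math


def q_to_error_probability(q):
    """Convert a Phred-scaled score to an error probability: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_q(p):
    """Convert an error probability (0 < p <= 1) to a Phred-scaled score."""
    return -10 * math.log10(p)
```

Under this convention, a Q-Score of 20 corresponds to an estimated error probability of 0.01, and a Q-Score of 30 to 0.001.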
To add further transparency and to supplement existing methods for QC of Xenium data, downsampled RNA diagnostic images are available in the Xenium auxiliary output directory in Xenium Onboard Analysis v1.6 and later. In XOA v1.7, these images are also available in the Analysis Summary. These images are not needed for raw data archival, but they can help build confidence in the robustness of Xenium's decoding algorithm.