Compatible Segmentation Input Files

Space Ranger v4.0 introduced nucleus and cell segmentation for Visium HD Spatial Gene Expression and Visium HD 3' Spatial Gene Expression. Users can alternatively provide their own segmentation masks using the --custom-segmentation-file option in the spaceranger count pipeline. The following input file formats are compatible for use with the spaceranger count pipeline:

  • Labeled mask in TIFF or NumPy NPY format
  • Polygons in GeoJSON format (FeatureCollection type)
  • A CSV file mapping squares to cell ID

Here are some additional considerations:

  • The TIFF, NPY, and GeoJSON coordinates must match the user's microscope image.
  • When invoking the --custom-segmentation-file option, you must also provide a value for --nucleus-expansion-distance-micron. If using a nucleus mask, this is the maximum distance (in microns) of a barcode center from a segmented nucleus at which Space Ranger will assign a barcode to a nucleus. Set this to zero if providing a cell mask.
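
The pairing of the two options can be sketched as a small helper. This is only an illustration, not part of Space Ranger: the function name segmentation_args and its mask_kind parameter are hypothetical, only the two option names come from the text above, and every other spaceranger count argument is left out.

```python
def segmentation_args(mask_path, mask_kind, expansion_um=None):
    """Build the two segmentation-related options as a list of strings.

    mask_kind is a hypothetical label used only in this sketch:
    "cell" for a cell mask, "nucleus" for a nucleus mask.
    """
    if mask_kind == "cell":
        # The text above says to set the distance to zero for a cell mask.
        expansion_um = 0
    elif mask_kind == "nucleus":
        if expansion_um is None or expansion_um < 0:
            raise ValueError("a nucleus mask needs a non-negative expansion distance")
    else:
        raise ValueError(f"unknown mask_kind: {mask_kind!r}")
    return [
        f"--custom-segmentation-file={mask_path}",
        f"--nucleus-expansion-distance-micron={expansion_um}",
    ]


cell_args = segmentation_args("cells.npy", "cell")
nucleus_args = segmentation_args("nuclei.tif", "nucleus", expansion_um=4)
```

The returned list would be appended to the rest of a spaceranger count command line, which is not shown here.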

The remaining sections provide more detail on the accepted file formats.

Cell segmentation results can be represented as labeled masks, where each pixel's value corresponds to the cell it belongs to. Pixels not belonging to any cell are labeled with the value 0. Each distinct cell within the image is assigned a unique positive integer identifier. All pixels that constitute a single segmented cell will share the same integer value.

These masks are typically stored in either TIFF or NumPy (NPY) format, offering compatibility with various image analysis workflows.

  • TIFF (Tagged Image File Format): TIFF is a widely supported image format capable of storing multi-layered and high-bit-depth images. In this context, a TIFF file would store the labeled mask as a single-channel image with 16- or 32-bit integer pixel values. The dimensions (height and width) of this TIFF image must precisely match the dimensions of the corresponding microscopy image (e.g., H&E or IF).
  • NumPy (NPY): NumPy is a fundamental Python library for numerical computing. A segmentation mask in NPY format should be saved as a 2D NumPy array (matrix) where the number of rows and columns corresponds exactly to the height and width of the original microscopy image. Each element within this array would be a 32-bit integer representing the cell label of the corresponding pixel. NumPy's widespread adoption in the Python scientific ecosystem makes this a convenient format for many analysis pipelines.
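
As a minimal sketch of the requirements above, the following checks a label array against the image dimensions and saves it as a 32-bit integer NPY file. The helper name prepare_label_mask, the toy array, and the output file name are assumptions made for illustration; in practice the labels would come from a segmentation tool and image_shape from the microscopy image.

```python
import os
import tempfile

import numpy as np


def prepare_label_mask(labels, image_shape):
    """Return a 2D int32 label mask whose shape equals image_shape (height, width)."""
    mask = np.asarray(labels)
    if mask.ndim != 2:
        raise ValueError(f"expected a 2D mask, got {mask.ndim} dimensions")
    if mask.shape != tuple(image_shape):
        raise ValueError(f"mask shape {mask.shape} != image shape {tuple(image_shape)}")
    if not np.issubdtype(mask.dtype, np.integer):
        raise ValueError("labels must be integers")
    if mask.min() < 0:
        raise ValueError("labels must be 0 (background) or positive integers")
    return mask.astype(np.int32)


# Toy 4x5 "image" with two cells (labels 1 and 2); 0 marks background pixels.
toy = np.zeros((4, 5), dtype=np.int64)
toy[0:2, 0:2] = 1
toy[2:4, 3:5] = 2

mask = prepare_label_mask(toy, image_shape=(4, 5))

# Hypothetical output location; any path ending in .npy would do.
out_path = os.path.join(tempfile.mkdtemp(), "custom_mask.npy")
np.save(out_path, mask)
```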

Many cell segmentation software tools can generate labeled masks in these formats. For example, QuPath, a popular open-source bioimage analysis software, provides options to save segmentation results as labeled TIFF images. Similarly, Python-based segmentation tools often directly output labeled masks as NumPy arrays, which can then be saved in the .npy format.

Once loaded into analysis software as matrices with the same dimensions as the image, these labeled masks enable downstream analyses such as quantifying cellular features and investigating the spatial relationships and interactions between cells. Space Ranger's compatibility with the TIFF and NPY formats is meant to ensure interoperability between image analysis tools and to facilitate reproducible research.

The polygon GeoJSON should be exported as a FeatureCollection. Cells and nuclei can be saved in the same GeoJSON file (as in QuPath's output format, shown further below) or in separate GeoJSON files. The format should look similar to this example:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "id": "12059192-3b27-4438-96dc-97b41ca84717",
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [3418.94, 2], [3414.62, 3.85], [3414.12, 22.65], [3415.06, 27.26],
            [3420.1, 35.15], [3428.03, 40.22], [3437.01, 42.94], [3440.69, 45.87],
            [3445.32, 46.71], [3468.85, 46.71], [3477.88, 44.35], [3534.35, 44.31],
            [3546.79, 37.83], [3552.28, 30.11], [3552.94, 25.45], [3552.93, 6.63],
            [3550.68, 2.49], [3546, 2], [3418.94, 2]
          ]
        ]
      }
    }
  ]
}
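
For polygons produced outside QuPath, a FeatureCollection with the same shape as the example can be assembled with Python's standard json module. This is a sketch under assumptions: the function name polygons_to_feature_collection is hypothetical, the input is assumed to be a list of vertex lists in image pixel coordinates, and the random UUID ids simply mirror the style of the example.

```python
import json
import uuid


def polygons_to_feature_collection(polygons):
    """Wrap vertex lists [(x, y), ...] as GeoJSON Polygon features."""
    features = []
    for vertices in polygons:
        ring = [list(point) for point in vertices]
        # GeoJSON polygon rings are closed: the last point repeats the first.
        if ring[0] != ring[-1]:
            ring.append(list(ring[0]))
        features.append({
            "type": "Feature",
            "id": str(uuid.uuid4()),
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        })
    return {"type": "FeatureCollection", "features": features}


# One toy triangle; real input would be one vertex list per segmented cell.
collection = polygons_to_feature_collection([[(0, 0), (4, 0), (4, 3)]])
geojson_text = json.dumps(collection, indent=2)
```

Writing geojson_text to a file gives a document with the FeatureCollection structure shown in the example.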

Segmentation results from QuPath are compatible with Space Ranger and have these specifications:

  • The feature objectType should be cell; features with any other objectType (e.g., annotations) are ignored.
  • The --nuclei argument will use the nucleusGeometry polygon if it exists in the GeoJSON, otherwise it will use the geometry polygon. The --cells argument will use the geometry polygon.
  • QuPath exports nucleus and cell segmentation results in one file, so the same GeoJSON file should be specified for both --cells and --nuclei. The format should look similar to this example:
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "id": "fd0c3d4e-6146-427d-9696-97fbe7adb63d",
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [4348.52, 0], [4344.02, 1.37], [4341.18, 10.23], [4341.8, 24.31],
            [4346.73, 32.28], [4353.27, 38.99], [4361.18, 44], [4379.94, 44.48],
            [4388.64, 40.97], [4396.11, 35.26], [4404.71, 21.36], [4404.27, 2.55],
            [4400.29, 0.06], [4348.52, 0]
          ]
        ]
      },
      "nucleusGeometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [4373.91, 4.81], [4366.61, 9.07], [4364.28, 12.61], [4364.3, 16.85],
            [4370.37, 22.75], [4378.35, 20.24], [4382.75, 13.06], [4380.87, 9.26],
            [4373.91, 4.81]
          ]
        ]
      },
      "properties": {
        "objectType": "cell"
      }
    }
  ]
}
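
The selection rules listed above can be illustrated with a short sketch that is useful for previewing which polygons a QuPath export would contribute. It is not Space Ranger's implementation: the function name select_polygons and its kind parameter are hypothetical, and treating a feature with no objectType as a cell is an assumption made here because the generic example earlier carries no properties.

```python
def select_polygons(feature_collection, kind):
    """Return {feature id: outer ring} for kind "cells" or "nuclei"."""
    if kind not in ("cells", "nuclei"):
        raise ValueError(f"kind must be 'cells' or 'nuclei', got {kind!r}")
    selected = {}
    for feature in feature_collection.get("features", []):
        props = feature.get("properties") or {}
        # Assumption: only an explicit non-cell objectType is skipped.
        if props.get("objectType", "cell") != "cell":
            continue
        if kind == "nuclei" and "nucleusGeometry" in feature:
            geometry = feature["nucleusGeometry"]
        else:
            # Cells always use geometry; nuclei fall back to it.
            geometry = feature["geometry"]
        selected[feature["id"]] = geometry["coordinates"][0]
    return selected


# Toy export: one annotation (ignored) and one cell with a nucleus polygon.
export = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "a1",
         "geometry": {"type": "Polygon",
                      "coordinates": [[[0, 0], [9, 0], [9, 9], [0, 0]]]},
         "properties": {"objectType": "annotation"}},
        {"type": "Feature", "id": "c1",
         "geometry": {"type": "Polygon",
                      "coordinates": [[[1, 1], [5, 1], [5, 5], [1, 1]]]},
         "nucleusGeometry": {"type": "Polygon",
                             "coordinates": [[[2, 2], [3, 2], [3, 3], [2, 2]]]},
         "properties": {"objectType": "cell"}},
    ],
}

cells = select_polygons(export, "cells")
nuclei = select_polygons(export, "nuclei")
```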

The CSV must have two columns, square_002um and cell_id. The cell IDs can be cell barcodes, as shown below, or unique numbers. Barcodes not assigned to cells should be omitted.

Here is an example of the required CSV format:

square_002um,cell_id
s_002um_00379_01214-1,cellid_000000616-1
s_002um_00380_01212-1,cellid_000000616-1
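
A file in this layout can be written with Python's standard csv module. The sketch below assumes the assignments are already available as (square, cell ID) pairs; the function name write_square_to_cell_csv is hypothetical, and representing an unassigned barcode as None or an empty string is an assumption of this example.

```python
import csv
import io


def write_square_to_cell_csv(assignments, handle):
    """Write (square_002um, cell_id) pairs, skipping unassigned squares."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["square_002um", "cell_id"])
    for square, cell_id in assignments:
        if cell_id is None or cell_id == "":
            continue  # barcodes not assigned to a cell are omitted
        writer.writerow([square, cell_id])


assignments = [
    ("s_002um_00379_01214-1", "cellid_000000616-1"),
    ("s_002um_00380_01212-1", "cellid_000000616-1"),
    ("s_002um_00381_01212-1", None),  # unassigned, so it is left out
]

buffer = io.StringIO()
write_square_to_cell_csv(assignments, buffer)
csv_text = buffer.getvalue()
```

Passing an open file handle instead of the in-memory buffer writes the same content to disk.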