Data Loaders

class spatialdata.dataloader.ImageTilesDataset(sdata, regions_to_images, regions_to_coordinate_systems, tile_scale=1.0, tile_dim_in_units=None, rasterize=False, return_annotations=None, table_name=None, transform=None, rasterize_kwargs=mappingproxy({}))

Bases: Dataset

torch.utils.data.Dataset for loading tiles from a spatialdata.SpatialData object.

By default, the dataset returns a spatialdata.SpatialData object, but when return_annotations is not None, the dataset returns a tuple containing:

  • the tile image, centered on the centroid of the region in the target coordinate system.

  • a vector or scalar value from the table.

Parameters:
  • sdata (SpatialData) – The spatialdata.SpatialData object.

  • regions_to_images (dict[str, str]) – A mapping between regions (labels, shapes) and images. The regions’ centroids will be the tile centers, while the images will be used to get the pixel values.

  • regions_to_coordinate_systems (dict[str, str]) – A mapping between regions and coordinate systems. The coordinate systems are used to transform both the centroid coordinates of the regions and the images.

  • tile_scale (float (default: 1.0)) –

    This parameter determines the size (width and height) of the tiles. Each tile has a side length, in units, equal to tile_scale times the diameter of the circle that has the same area as the region defining the tile.

    For example, if the regions are multiscale labels, the tiles are created as follows:

    1. Each label region is approximated by a circle with the same area as the label region.

    2. The tile is then created with width and height equal to the diameter of that circle, multiplied by tile_scale.

    If tile_dim_in_units is passed, tile_scale is ignored.

  • tile_dim_in_units (Optional[float] (default: None)) – The dimension of the requested tile in the units of the target coordinate system. This specifies the extent of the tile; it is not related to the size in pixels of each returned tile. If tile_dim_in_units is passed, tile_scale is ignored.

  • rasterize (bool (default: False)) – If True, the images are rasterized using spatialdata.rasterize() into the target coordinate system; this applies the coordinate transformations to the data. If False, the images are queried using spatialdata.bounding_box_query() from the pixel coordinate system; this back-transforms the target tile into the pixel coordinates. If the back-transformed tile is not aligned with the pixel grid, the returned tile will correspond to the bounding box of the back-transformed tile (so that the returned tile is axis-aligned to the pixel grid).

  • return_annotations (Union[list[str], str, None] (default: None)) – If not None, one or more values from the table are returned together with the image tile in a tuple. Only columns in anndata.AnnData.obs and anndata.AnnData.X can be returned. If None, it will return a SpatialData object with the table consisting of the row that annotates the region from which the tile was extracted.

  • table_name (Optional[str] (default: None)) – The name of the table in the SpatialData object to be used for the annotations. Currently only a single table is supported. If you have multiple tables, you can concatenate them into a single table that annotates multiple regions.

  • transform (Optional[Callable[[Any], Any]] (default: None)) – A data transformation (for instance, a normalization operation; not to be confused with a coordinate transformation) to be applied to the image and the table value. It is a Callable, with Any as return type, that takes as input the (image, table_value) tuple when return_annotations is not None, or the SpatialData object when return_annotations is None.

  • rasterize_kwargs (Mapping[str, Any] (default: mappingproxy({}))) – Keyword arguments passed to spatialdata.rasterize() if rasterize is True. This argument can be used in particular to choose the pixel dimension of the produced image tiles; please refer to the spatialdata.rasterize() documentation for this use case.
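The geometric rules in the tile_scale, tile_dim_in_units and rasterize descriptions above can be illustrated with a small standalone sketch. It does not use spatialdata; the function names are hypothetical, and the pure rotation in the second function is only an assumed stand-in for whatever coordinate transformation maps the target coordinate system back to the pixel grid.

```python
import math


def tile_side_length(region_area, tile_scale=1.0, tile_dim_in_units=None):
    """Side length of a tile, in units of the target coordinate system.

    Hypothetical helper mirroring the documented rule: if tile_dim_in_units
    is given, tile_scale is ignored; otherwise the side is tile_scale times
    the diameter of the circle with the same area as the region.
    """
    if tile_dim_in_units is not None:
        return float(tile_dim_in_units)
    diameter = 2.0 * math.sqrt(region_area / math.pi)
    return tile_scale * diameter


def bounding_box_of_rotated_tile(cx, cy, side, angle_rad):
    """Axis-aligned bounding box of a square tile rotated around its center.

    Assumption: the back-transformation is a pure rotation. The result is
    (minx, miny, maxx, maxy) of the rotated corners, which is the kind of
    axis-aligned box the rasterize=False description refers to.
    """
    half = side / 2.0
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    xs, ys = [], []
    for dx, dy in ((-half, -half), (half, -half), (half, half), (-half, half)):
        xs.append(cx + dx * cos_a - dy * sin_a)
        ys.append(cy + dx * sin_a + dy * cos_a)
    return min(xs), min(ys), max(xs), max(ys)


# A region of area pi has an equal-area circle of diameter 2.
side = tile_side_length(math.pi, tile_scale=1.5)
# A tile rotated by 45 degrees needs a larger axis-aligned box.
box = bounding_box_of_rotated_tile(0.0, 0.0, 2.0, math.pi / 4)
```

The second function shows why a tile queried with rasterize=False can cover more area than the requested tile: the axis-aligned box around a rotated square is larger than the square itself.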

Returns:

torch.utils.data.Dataset for loading tiles from a spatialdata.SpatialData object.
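As a rough usage sketch, the constructor might be called as below. The element names ("cells", "he_image", "global", "table", "cell_type") are hypothetical placeholders for whatever the SpatialData object actually contains, and the transform assumes the image supports division by a scalar (for example an array of 8-bit intensities). Only the constructor arguments listed above are used.

```python
def scale_tile(sample):
    """Hypothetical data transformation for return_annotations != None.

    It receives the (image, table_value) tuple and returns a tuple with the
    image rescaled, assuming 8-bit intensities, and the value unchanged.
    """
    image, table_value = sample
    return image / 255.0, table_value


def build_tile_dataset(sdata):
    """Build an ImageTilesDataset for one region/image pair (names assumed)."""
    # Imported here so the sketch can be read without spatialdata installed.
    from spatialdata.dataloader import ImageTilesDataset

    return ImageTilesDataset(
        sdata=sdata,
        regions_to_images={"cells": "he_image"},  # region -> image
        regions_to_coordinate_systems={"cells": "global"},  # region -> cs
        tile_dim_in_units=32.0,  # tile_scale is ignored when this is set
        return_annotations="cell_type",  # hypothetical column in obs
        table_name="table",
        transform=scale_tile,
    )
```

Because the result is a torch.utils.data.Dataset, it can in principle be wrapped in a torch.utils.data.DataLoader; whether the default collation suits the returned items depends on what the transform produces.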

property coordinate_systems: list[str]

List of coordinate systems in the dataset.

property dataset_index: DataFrame

DataFrame with the metadata of the tiles.

It contains the following columns:

  • instance: the name of the instance in the region.

  • cs: the coordinate system of the region-image pair.

  • region: the name of the region.

  • image: the name of the image.

property dataset_table: AnnData

AnnData table filtered to the regions and coordinate systems (cs) present in the dataset.

property dims: list[str]

Dimensions of the dataset.

property regions: list[str]

List of regions in the dataset.

property sdata: SpatialData

The original SpatialData object.

property tiles_coords: DataFrame

DataFrame with the index of tiles.

It contains the axis coordinates of the centroids and the extent of the tiles. For example, for a 2D image, it contains the following columns:

  • x: the x coordinate of the centroid.

  • y: the y coordinate of the centroid.

  • extent: the extent of the tile.

  • minx: the minimum x coordinate of the tile.

  • miny: the minimum y coordinate of the tile.

  • maxx: the maximum x coordinate of the tile.

  • maxy: the maximum y coordinate of the tile.
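The relationship between these columns can be sketched as follows, assuming that extent is the full side length of a square tile centered on the centroid. That reading is an assumption based on the descriptions above, not a statement about the library's internals, and tile_bounds is a hypothetical helper.

```python
def tile_bounds(x, y, extent):
    """Return one row of tile coordinates for a 2D tile as a dict.

    Assumption: the tile is a square of side `extent` centered on the
    centroid (x, y), so each bound lies half an extent from the center.
    """
    half = extent / 2.0
    return {
        "x": x,
        "y": y,
        "extent": extent,
        "minx": x - half,
        "miny": y - half,
        "maxx": x + half,
        "maxy": y + half,
    }


# A tile of extent 4 centered at (10, 20).
row = tile_bounds(10.0, 20.0, 4.0)
```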