Data Loaders

class spatialdata.dataloader.ImageTilesDataset(sdata, regions_to_images, regions_to_coordinate_systems, tile_scale=1.0, tile_dim_in_units=None, rasterize=False, return_annotations=None, table_name=None, transform=None, rasterize_kwargs=mappingproxy({}))

Bases: Dataset

torch.utils.data.Dataset for loading tiles from a spatialdata.SpatialData object.

By default, the dataset returns a spatialdata.SpatialData object, but when return_annotations is not None, the dataset returns a tuple containing:

the tile image, centered in the target coordinate system of the region.

a vector or scalar value from the table.

Parameters:

sdata (SpatialData) – The spatialdata.SpatialData object.
regions_to_images (dict[str, str]) – A mapping between regions (labels, shapes) and images. The regions’ centroids will be the tile centers, while the images will be used to get the pixel values.
regions_to_coordinate_systems (dict[str, str]) – A mapping between regions and coordinate systems. The coordinate systems are used to transform both the centroid coordinates of the regions and the images.
tile_scale (float (default: 1.0)) –
Determines the size (width and height) of the tiles. Each tile has a size in units equal to tile_scale times the diameter of the circle that has the same area as the region defining the tile.

For example, if the regions are multiscale labels, the tiles are created as follows:
1. Each label region is approximated by a circle with the same area as the label region.
2. The tile is created with width and height equal to the diameter of that circle, multiplied by tile_scale.
If tile_dim_in_units is passed, tile_scale is ignored.
tile_dim_in_units (float | None (default: None)) – The dimension of the requested tile in the units of the target coordinate system. This specifies the extent of the tile and is not related to the size in pixels of each returned tile. If tile_dim_in_units is passed, tile_scale is ignored.
rasterize (bool (default: False)) – If True, the images are rasterized using spatialdata.rasterize() into the target coordinate system; this applies the coordinate transformations to the data. If False, the images are queried using spatialdata.bounding_box_query() from the pixel coordinate system; this back-transforms the target tile into the pixel coordinates. If the back-transformed tile is not aligned with the pixel grid, the returned tile will correspond to the bounding box of the back-transformed tile (so that the returned tile is axis-aligned to the pixel grid).
return_annotations (str | list[str] | None (default: None)) – If not None, one or more values from the table are returned together with the image tile in a tuple. Only columns in anndata.AnnData.obs and anndata.AnnData.X can be returned. If None, it will return a SpatialData object with the table consisting of the row that annotates the region from which the tile was extracted.
table_name (str | None (default: None)) – The name of the table in the SpatialData object to be used for the annotations. Currently only a single table is supported. If you have multiple tables, you can concatenate them into a single table that annotates multiple regions.
transform (Callable[[Any], Any] | None (default: None)) – A data transformation (for instance, a normalization operation; not to be confused with a coordinate transformation) to be applied to the image and the table value. It is a Callable, with Any as return type, that takes as input the (image, table_value) tuple when return_annotations is not None, or the SpatialData object when return_annotations is None.
rasterize_kwargs (Mapping[str, Any] (default: mappingproxy({}))) – Keyword arguments passed to spatialdata.rasterize() if rasterize is True. This argument can be used in particular to choose the pixel dimension of the produced image tiles; please refer to the spatialdata.rasterize() documentation for this use case.
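
The tile sizing rule described for tile_scale and tile_dim_in_units can be made concrete with a little arithmetic. The following is a minimal sketch, not the library's implementation: the helper name tile_side_length is hypothetical, and it assumes the area of the region is already known in the units of the target coordinate system.

```python
import math


def tile_side_length(region_area, tile_scale=1.0, tile_dim_in_units=None):
    """Side length of a square tile for one region (hypothetical helper)."""
    # An explicit tile dimension takes precedence; tile_scale is then ignored.
    if tile_dim_in_units is not None:
        return float(tile_dim_in_units)
    # Diameter of the circle whose area equals the area of the region.
    equivalent_diameter = 2.0 * math.sqrt(region_area / math.pi)
    return tile_scale * equivalent_diameter


# A region with the area of a circle of radius 5 has an equivalent diameter of 10.
area = math.pi * 5.0**2
print(tile_side_length(area))
print(tile_side_length(area, tile_scale=1.5))
print(tile_side_length(area, tile_scale=1.5, tile_dim_in_units=32.0))
```

Note that the result is an extent in coordinate-system units; the number of pixels in a returned tile is a separate matter (see rasterize and rasterize_kwargs).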

Returns:

torch.utils.data.Dataset for loading tiles from a spatialdata.SpatialData.
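
As a hedged illustration of the transform parameter when return_annotations is not None, the sketch below assumes, following the parameter description, that the callable receives the (image, table_value) tuple as a single argument. The names on_image and scale_to_unit are hypothetical, and a plain list of numbers stands in for the image tile.

```python
from typing import Any, Callable


def on_image(image_fn: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Wrap an image-only function into a transform over (image, table_value)."""

    def transform(sample: Any) -> Any:
        # The sample arrives as one tuple, not as two separate arguments.
        image, table_value = sample
        return image_fn(image), table_value

    return transform


def scale_to_unit(image):
    """Divide 8-bit intensities by 255 (a stand-in for a real normalization)."""
    return [value / 255 for value in image]


transform = on_image(scale_to_unit)
normalized, label = transform(([0, 255], "tumor"))
print(normalized, label)
```

A callable built this way could be passed as the transform argument. When return_annotations is None, the callable would instead receive a SpatialData object, so the tuple unpacking shown here would not apply.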

property coordinate_systems: list[str]

List of coordinate systems in the dataset.

property dataset_index: DataFrame

DataFrame with the metadata of the tiles.

It contains the following columns:

instance: the name of the instance in the region.

cs: the coordinate system of the region-image pair.

region: the name of the region.

image: the name of the image.

property dataset_table: AnnData

AnnData table filtered by the region and cs present in the dataset.

property dims: list[str]

Dimensions of the dataset.

property regions: list[str]

List of regions in the dataset.

property sdata: SpatialData

The original SpatialData object.

property tiles_coords: DataFrame

DataFrame with the index of tiles.

It contains the axis coordinates of the centroids and the extent of the tiles. For example, for a 2D image, it contains the following columns:

x: the x coordinate of the centroid.

y: the y coordinate of the centroid.

extent: the extent of the tile.

minx: the minimum x coordinate of the tile.

miny: the minimum y coordinate of the tile.

maxx: the maximum x coordinate of the tile.

maxy: the maximum y coordinate of the tile.
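
The relationship between the columns listed above can be sketched as follows. This is an illustration under a stated assumption, not the library's code: it assumes that extent is the full side length of a square tile centered on the centroid, and the helper name tile_bounds is hypothetical.

```python
def tile_bounds(x, y, extent):
    """Return (minx, miny, maxx, maxy) for a square tile centered at (x, y).

    Assumption: extent is the full side length of the tile.
    """
    half = extent / 2.0
    return (x - half, y - half, x + half, y + half)


minx, miny, maxx, maxy = tile_bounds(x=10.0, y=20.0, extent=4.0)
print(minx, miny, maxx, maxy)
```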
