Models

The elements (building-blocks) that constitute SpatialData.

class spatialdata.models.Image2DModel(*args, **kwargs)

Bases: RasterSchema

classmethod parse(data, dims=None, c_coords=None, transformations=None, scale_factors=None, method=None, chunks=None, **kwargs)

Validate (or parse) raster data.

Parameters:
  • data (ndarray[Any, dtype[floating[Any]]] | DataArray | Array) – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.

  • dims (Optional[Sequence[str]] (default: None)) – Dimensions of the data (e.g. [‘c’, ‘y’, ‘x’] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match that order.

  • c_coords (str | list[str] | None) – Channel names of image data. The number of names must equal the length of dimension ‘c’. Only supported for Image models.

  • transformations (Optional[dict[str, BaseTransformation]] (default: None)) – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.

  • scale_factors (Optional[Sequence[dict[str, int] | int]] (default: None)) – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.

  • method (Optional[Methods] (default: None)) – Method to use for multiscale downsampling. Please refer to multiscale_spatial_image.to_multiscale.

  • chunks (Union[int, tuple[int, ...], tuple[tuple[int, ...], ...], Mapping[Any, None | int | tuple[int, ...]], None] (default: None)) – Chunks to use for dask array.

  • kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords keyword argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data, as labels do not have channels.

Return type:

DataArray | DataTree

Returns:

xarray.DataArray or datatree.DataTree
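
Because each entry of scale_factors is relative to the previous scale, the total downsampling of every level is a running product. The sketch below checks that arithmetic in plain Python. cumulative_downsampling is a hypothetical helper written for illustration, not part of spatialdata, and it assumes integer scale factors only.

```python
from itertools import accumulate
from operator import mul


def cumulative_downsampling(scale_factors):
    """Return the total downsampling factor of each scale, original image first."""
    # initial=1 stands for the original, non-downsampled image.
    return list(accumulate(scale_factors, mul, initial=1))


levels = cumulative_downsampling([2, 2, 2])
print(levels)  # [1, 2, 4, 8]
```

Three scale factors therefore yield four scales, matching the [2, 2, 2] example in the parameter description.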

Notes

RGB images

If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGBA image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].

You can also pass the rgb argument in kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, images with 3 or 4 channels will be interpreted automatically as RGB(A) images.

validate(data)

Validate data.

Parameters:

data (Any) – Data to validate.

Raises:

ValueError – If data is not valid.

Return type:

None

class spatialdata.models.Image3DModel(*args, **kwargs)

Bases: RasterSchema

classmethod parse(data, dims=None, c_coords=None, transformations=None, scale_factors=None, method=None, chunks=None, **kwargs)

Validate (or parse) raster data.

Parameters:
  • data (ndarray[Any, dtype[floating[Any]]] | DataArray | Array) – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.

  • dims (Optional[Sequence[str]] (default: None)) – Dimensions of the data (e.g. [‘c’, ‘y’, ‘x’] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match that order.

  • c_coords (str | list[str] | None) – Channel names of image data. The number of names must equal the length of dimension ‘c’. Only supported for Image models.

  • transformations (Optional[dict[str, BaseTransformation]] (default: None)) – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.

  • scale_factors (Optional[Sequence[dict[str, int] | int]] (default: None)) – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.

  • method (Optional[Methods] (default: None)) – Method to use for multiscale downsampling. Please refer to multiscale_spatial_image.to_multiscale.

  • chunks (Union[int, tuple[int, ...], tuple[tuple[int, ...], ...], Mapping[Any, None | int | tuple[int, ...]], None] (default: None)) – Chunks to use for dask array.

  • kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords keyword argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data, as labels do not have channels.

Return type:

DataArray | DataTree

Returns:

xarray.DataArray or datatree.DataTree

Notes

RGB images

If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGBA image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].

You can also pass the rgb argument in kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, images with 3 or 4 channels will be interpreted automatically as RGB(A) images.
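
The reordering to (c)(z)yx described for dims amounts to an axis permutation. The sketch below shows only that bookkeeping in plain Python. canonical_order is a hypothetical helper written for illustration and is not the library's implementation. The permutation it returns has the form accepted by numpy.transpose().

```python
CANONICAL = ("c", "z", "y", "x")


def canonical_order(dims):
    """Return the dims sorted as (c)(z)yx and the axis permutation that achieves it."""
    unknown = [d for d in dims if d not in CANONICAL]
    if unknown:
        raise ValueError(f"Unsupported dimension names: {unknown}")
    target = tuple(d for d in CANONICAL if d in dims)
    # Output axis i is taken from input axis permutation[i].
    permutation = tuple(dims.index(d) for d in target)
    return target, permutation


target, permutation = canonical_order(("z", "y", "x", "c"))
print(target, permutation)  # ('c', 'z', 'y', 'x') (3, 0, 1, 2)
```

Data that is already in canonical order gets the identity permutation, so no transposition is needed in that case.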

validate(data)

Validate data.

Parameters:

data (Any) – Data to validate.

Raises:

ValueError – If data is not valid.

Return type:

None

class spatialdata.models.Labels2DModel(*args, **kwargs)

Bases: RasterSchema

classmethod parse(*args, **kwargs)

Validate (or parse) raster data.

Parameters:
  • data – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.

  • dims – Dimensions of the data (e.g. [‘c’, ‘y’, ‘x’] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match that order.

  • c_coords (str | list[str] | None) – Channel names of image data. The number of names must equal the length of dimension ‘c’. Only supported for Image models.

  • transformations – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.

  • scale_factors – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.

  • method – Method to use for multiscale downsampling. Please refer to multiscale_spatial_image.to_multiscale.

  • chunks – Chunks to use for dask array.

  • kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords keyword argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data, as labels do not have channels.

Return type:

DataArray | DataTree

Returns:

xarray.DataArray or datatree.DataTree

Notes

RGB images

If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGBA image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].

You can also pass the rgb argument in kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, images with 3 or 4 channels will be interpreted automatically as RGB(A) images.

validate(data)

Validate data.

Parameters:

data (Any) – Data to validate.

Raises:

ValueError – If data is not valid.

Return type:

None

class spatialdata.models.Labels3DModel(*args, **kwargs)

Bases: RasterSchema

classmethod parse(*args, **kwargs)

Validate (or parse) raster data.

Parameters:
  • data – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.

  • dims – Dimensions of the data (e.g. [‘c’, ‘y’, ‘x’] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match that order.

  • c_coords (str | list[str] | None) – Channel names of image data. The number of names must equal the length of dimension ‘c’. Only supported for Image models.

  • transformations – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.

  • scale_factors – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.

  • method – Method to use for multiscale downsampling. Please refer to multiscale_spatial_image.to_multiscale.

  • chunks – Chunks to use for dask array.

  • kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords keyword argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data, as labels do not have channels.

Return type:

DataArray | DataTree

Returns:

xarray.DataArray or datatree.DataTree

Notes

RGB images

If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGBA image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].

You can also pass the rgb argument in kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, images with 3 or 4 channels will be interpreted automatically as RGB(A) images.

validate(data)

Validate data.

Parameters:

data (Any) – Data to validate.

Raises:

ValueError – If data is not valid.

Return type:

None

class spatialdata.models.ShapesModel

Bases: object

classmethod parse(data, **kwargs)

Parse shapes data.

Parameters:
  • data (Any) –

    Data to parse:

  • geometry – Geometry type of the shapes. The following geometries are supported:

    • 0: Circles

    • 3: Polygon

    • 6: MultiPolygon

  • offsets – In the case of shapely.Polygon or shapely.MultiPolygon shapes, the offsets of the polygons must be provided in order to initialize the shapes from their ragged array representation. Alternatively, call the parser as ShapesModel.parse(data), where data is a GeoDataFrame object, and omit the offsets parameter (recommended).

  • radius – Size of the Circles. It must be provided if the shapes are Circles.

  • index – Index of the shapes, must be of type str. If None, it’s generated automatically.

  • transformations – Transformations of shapes.

  • kwargs (Any) – Additional arguments for GeoJSON reader.

Return type:

GeoDataFrame

Returns:

geopandas.GeoDataFrame

classmethod validate(data)

Validate data.

Parameters:

data (GeoDataFrame) – geopandas.GeoDataFrame to validate.

Return type:

None

Returns:

None

classmethod validate_shapes_not_mixed_types(gdf)

Check that the Shapes element is composed either of Point geometries or of Polygon/MultiPolygon geometries, and not a mix of the two.

Parameters:

gdf (GeoDataFrame) – The Shapes element.

Raises:

ValueError – When the geometry column composing the object does not satisfy the type requirements.

Return type:

None

Notes

This function is not called by ShapesModel.validate() because computing the unique types by default could be expensive.

class spatialdata.models.PointsModel

Bases: object

classmethod parse(data, **kwargs)

Validate (or parse) points data.

Parameters:
  • data (Any) –

    Data to parse:

    • If numpy.ndarray, an annotation pandas.DataFrame can be provided, as well as a feature_key column in the annotation dataframe. The array is assumed to have shape (n_points, axes), with axes being “x”, “y” and optionally “z”.

    • If pandas.DataFrame, a coordinates mapping can be provided, with valid axes (‘x’, ‘y’, ‘z’) as keys and column names in the dataframe as values. If the dataframe already has columns named ‘x’, ‘y’ and ‘z’, the mapping can be omitted.

  • annotation – Annotation dataframe. Only used if data is a numpy.ndarray, in which case the index of the annotations will be used as the index of the parsed points.

  • coordinates – Mapping of axes names (keys) to column names (values) in data. Only used if data is a pandas.DataFrame. Example: {‘x’: ‘my_x_column’, ‘y’: ‘my_y_column’}. If not provided and data is a pandas.DataFrame whose column names include x, y and optionally z, those columns will be used as coordinates.

  • feature_key – Optional, feature key in annotation or data. Example use case: gene id categorical column describing the gene identity of each point.

  • instance_key – Optional, instance key in annotation or data. Example use case: cell id column, describing which cell a point belongs to. This argument is likely going to be deprecated: scverse/spatialdata#503.

  • transformations – Transformations of points.

  • kwargs (Any) – Additional arguments for dask.dataframe.from_array().

Return type:

DataFrame

Returns:

dask.dataframe.core.DataFrame

Notes

The order of the columns of the dataframe returned by the parser is not guaranteed to be the same as the order of the columns in the dataframe passed as an argument.

classmethod validate(data)

Validate data.

Parameters:

data (DataFrame) – dask.dataframe.core.DataFrame to validate.

Return type:

None

Returns:

None

class spatialdata.models.TableModel

Bases: object

classmethod parse(adata, region=None, region_key=None, instance_key=None)

Parse the anndata.AnnData to be compatible with the model.

Parameters:
  • adata (AnnData) – The AnnData object.

  • region (Union[list[str], str, None] (default: None)) – Region(s) to be used.

  • region_key (Optional[str] (default: None)) – Key in adata.obs that specifies the region.

  • instance_key (Optional[str] (default: None)) – Key in adata.obs that specifies the instance.

Return type:

AnnData

Returns:

The parsed data.

validate(data)

Validate the data.

Parameters:

data (AnnData) – The data to validate.

Return type:

AnnData

Returns:

The validated data.