Models
The elements (building-blocks) that constitute SpatialData.
- class spatialdata.models.Image2DModel
Bases: RasterSchema
- classmethod parse(data, dims=None, c_coords=None, transformations=None, scale_factors=None, method=None, chunks=None, **kwargs)
Validate (or parse) raster data.
- Parameters:
data (ndarray[tuple[Any, ...], dtype[floating[Any]]] | DataArray | Array) – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.
dims (Sequence[str] | None (default: None)) – Dimensions of the data (e.g. ['c', 'y', 'x'] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match that order.
c_coords (str | list[str] | None (default: None)) – Channel names of image data. The number of names must equal the length of dimension 'c'. Only supported for Image models.
transformations (dict[str, BaseTransformation] | None (default: None)) – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.
scale_factors (Sequence[dict[str, int] | int] | None (default: None)) – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.
method (Methods | None (default: None)) – Method to use for multiscale downsampling. The default (None) differs between images and labels:
  - Images (Image2DModel, Image3DModel): uses multiscale_spatial_image.to_multiscale() with method=Methods.XARRAY_COARSEN. This is the same default as in spatialdata <= 0.7.2 and is fast.
  - Labels (Labels2DModel, Labels3DModel): uses a lazy implementation based on ome-zarr-py's resize() (order=0, nearest-neighbour). This has lower peak memory usage than the multiscale_spatial_image implementation. Note: for images this ome-zarr-py path shows a significant performance regression (both time and memory); see GitHub issue #1079.
  To override the default, pass any Methods value, which will force the multiscale_spatial_image.to_multiscale() code path for all element types. For example:
  - method=Methods.XARRAY_COARSEN: coarsening via xarray (fast, default for images).
  - method=Methods.DASK_IMAGE_NEAREST: nearest-neighbour via dask-image (not lazy as of multiscale-spatial-image==2.0.3, so it leads to higher memory usage).
chunks (int | tuple[int, ...] | tuple[tuple[int, ...], ...] | Mapping[Any, None | int | tuple[int, ...]] | None (default: None)) – Chunks to use for dask array.
kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords kwargs argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data as labels do not have channels.
- Return type: DataArray | DataTree
- Returns: xarray.DataArray or datatree.DataTree
Notes
RGB images
If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGBA image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].
You can also pass the rgb argument to kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, 3-4 channel images will be interpreted automatically as RGB(A) images.
Setting axes / dims
If the data is a numpy or dask array, there are no named axes yet. In this case, we first try to use the dimensions specified by the user in the dims argument of .parse. These dimensions are used to potentially transpose the data to match the order (c)(z)yx. See the description of the dims argument above. If dims is not specified, the dims are set to (c)(z)yx, depending on the number of dimensions of the data.
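The axis handling described in these notes can be illustrated with plain NumPy. This is only a sketch of the described behaviour, not spatialdata's implementation: the helper to_canonical_order and the constant CANONICAL_ORDER are hypothetical names introduced for the example.

```python
import numpy as np

# Canonical axis order described above: (c)(z)yx.
CANONICAL_ORDER = ("c", "z", "y", "x")


def to_canonical_order(data, dims):
    """Transpose `data` so that its axes follow the (c)(z)yx order.

    Hypothetical helper for illustration only; it is not part of spatialdata.
    """
    dims = tuple(dims)
    if len(dims) != data.ndim:
        raise ValueError("dims must name every axis of data")
    target = tuple(d for d in CANONICAL_ORDER if d in dims)
    if len(target) != len(dims):
        raise ValueError(f"dims must be unique names taken from {CANONICAL_ORDER}")
    # For every target axis, look up where it currently sits in `dims`.
    return np.transpose(data, [dims.index(d) for d in target]), target


# A 2D image with shape yx has no channel axis; add one in front.
gray_yx = np.zeros((32, 48))
gray_cyx = np.expand_dims(gray_yx, axis=0)

# An image stored as yxc is transposed to cyx.
rgb_yxc = np.zeros((32, 48, 3))
rgb_cyx, new_dims = to_canonical_order(rgb_yxc, ("y", "x", "c"))
```

After this preparation the arrays have shapes (1, 32, 48) and (3, 32, 48), which match the cyx layout expected for 2D images.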
- classmethod validate(data)
Validate data.
- Parameters:
data (Any) – Data to validate.
- Raises:
ValueError – If data is not valid.
- Return type: None
- class spatialdata.models.Image3DModel
Bases: RasterSchema
- classmethod parse(data, dims=None, c_coords=None, transformations=None, scale_factors=None, method=None, chunks=None, **kwargs)
Validate (or parse) raster data.
- Parameters:
data (ndarray[tuple[Any, ...], dtype[floating[Any]]] | DataArray | Array) – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.
dims (Sequence[str] | None (default: None)) – Dimensions of the data (e.g. ['c', 'y', 'x'] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match that order.
c_coords (str | list[str] | None (default: None)) – Channel names of image data. The number of names must equal the length of dimension 'c'. Only supported for Image models.
transformations (dict[str, BaseTransformation] | None (default: None)) – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.
scale_factors (Sequence[dict[str, int] | int] | None (default: None)) – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.
method (Methods | None (default: None)) – Method to use for multiscale downsampling. The default (None) differs between images and labels:
  - Images (Image2DModel, Image3DModel): uses multiscale_spatial_image.to_multiscale() with method=Methods.XARRAY_COARSEN. This is the same default as in spatialdata <= 0.7.2 and is fast.
  - Labels (Labels2DModel, Labels3DModel): uses a lazy implementation based on ome-zarr-py's resize() (order=0, nearest-neighbour). This has lower peak memory usage than the multiscale_spatial_image implementation. Note: for images this ome-zarr-py path shows a significant performance regression (both time and memory); see GitHub issue #1079.
  To override the default, pass any Methods value, which will force the multiscale_spatial_image.to_multiscale() code path for all element types. For example:
  - method=Methods.XARRAY_COARSEN: coarsening via xarray (fast, default for images).
  - method=Methods.DASK_IMAGE_NEAREST: nearest-neighbour via dask-image (not lazy as of multiscale-spatial-image==2.0.3, so it leads to higher memory usage).
chunks (int | tuple[int, ...] | tuple[tuple[int, ...], ...] | Mapping[Any, None | int | tuple[int, ...]] | None (default: None)) – Chunks to use for dask array.
kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords kwargs argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data as labels do not have channels.
- Return type: DataArray | DataTree
- Returns: xarray.DataArray or datatree.DataTree
Notes
RGB images
If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGBA image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].
You can also pass the rgb argument to kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, 3-4 channel images will be interpreted automatically as RGB(A) images.
Setting axes / dims
If the data is a numpy or dask array, there are no named axes yet. In this case, we first try to use the dimensions specified by the user in the dims argument of .parse. These dimensions are used to potentially transpose the data to match the order (c)(z)yx. See the description of the dims argument above. If dims is not specified, the dims are set to (c)(z)yx, depending on the number of dimensions of the data.
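Because each entry of scale_factors is relative to the previous scale, the total downsampling of a given scale is the running product of the factors. The arithmetic can be checked with a short sketch; cumulative_factors is a hypothetical helper, only integer factors are handled (not the per-axis dict form), and a 512-pixel axis is assumed so that every division is exact. How sizes that do not divide evenly are rounded depends on the downsampling method and is not covered here.

```python
from itertools import accumulate
from operator import mul


def cumulative_factors(scale_factors):
    """Return the total downsampling factor of every scale, original included.

    Hypothetical helper for illustration only; it is not part of spatialdata.
    """
    return list(accumulate(scale_factors, mul, initial=1))


# Three relative factors give four scales: 1x (original), 2x, 4x and 8x.
factors = cumulative_factors([2, 2, 2])

# Length of a 512-pixel axis at every scale.
sizes = [512 // factor for factor in factors]
```

With these inputs, factors is [1, 2, 4, 8] and sizes is [512, 256, 128, 64], matching the four scales described for scale_factors=[2, 2, 2].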
- classmethod validate(data)
Validate data.
- Parameters:
data (Any) – Data to validate.
- Raises:
ValueError – If data is not valid.
- Return type: None
- class spatialdata.models.Labels2DModel
Bases: RasterSchema
- classmethod parse(*args, **kwargs)
Validate (or parse) raster data.
- Parameters:
data – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.
dims – Dimensions of the data (e.g. ['c', 'y', 'x'] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match that order.
c_coords (str | list[str] | None) – Channel names of image data. The number of names must equal the length of dimension 'c'. Only supported for Image models.
transformations – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.
scale_factors – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.
method – Method to use for multiscale downsampling. The default (None) differs between images and labels:
  - Images (Image2DModel, Image3DModel): uses multiscale_spatial_image.to_multiscale() with method=Methods.XARRAY_COARSEN. This is the same default as in spatialdata <= 0.7.2 and is fast.
  - Labels (Labels2DModel, Labels3DModel): uses a lazy implementation based on ome-zarr-py's resize() (order=0, nearest-neighbour). This has lower peak memory usage than the multiscale_spatial_image implementation. Note: for images this ome-zarr-py path shows a significant performance regression (both time and memory); see GitHub issue #1079.
  To override the default, pass any Methods value, which will force the multiscale_spatial_image.to_multiscale() code path for all element types. For example:
  - method=Methods.XARRAY_COARSEN: coarsening via xarray (fast, default for images).
  - method=Methods.DASK_IMAGE_NEAREST: nearest-neighbour via dask-image (not lazy as of multiscale-spatial-image==2.0.3, so it leads to higher memory usage).
chunks – Chunks to use for dask array.
kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords kwargs argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data as labels do not have channels.
- Return type: DataArray | DataTree
- Returns: xarray.DataArray or datatree.DataTree
Notes
RGB images
If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGBA image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].
You can also pass the rgb argument to kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, 3-4 channel images will be interpreted automatically as RGB(A) images.
Setting axes / dims
If the data is a numpy or dask array, there are no named axes yet. In this case, we first try to use the dimensions specified by the user in the dims argument of .parse. These dimensions are used to potentially transpose the data to match the order (c)(z)yx. See the description of the dims argument above. If dims is not specified, the dims are set to (c)(z)yx, depending on the number of dimensions of the data.
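For labels, the default downsampling described under method is nearest-neighbour (order=0): every output pixel copies one input pixel, so no new label values can appear. The NumPy sketch below illustrates that property with strided slicing as a simplified stand-in for a nearest-neighbour resize, and contrasts it with a block mean, used here purely as an example of a value-mixing reduction. Both helper functions are hypothetical and do not reproduce the ome-zarr-py or multiscale_spatial_image code paths.

```python
import numpy as np


def downsample_nearest(array, factor):
    """Keep one pixel per block (simplified stand-in for an order=0 resize)."""
    return array[::factor, ::factor]


def downsample_mean(array, factor):
    """Average every factor x factor block (a value-mixing reduction)."""
    height, width = array.shape
    blocks = array.reshape(height // factor, factor, width // factor, factor)
    return blocks.mean(axis=(1, 3))


# A small labels image with three segments: 1, 3 and 7.
labels = np.array(
    [
        [1, 1, 1, 7],
        [1, 1, 1, 7],
        [3, 3, 3, 7],
        [3, 3, 3, 7],
    ]
)

nearest = downsample_nearest(labels, 2)
averaged = downsample_mean(labels, 2)

# Values that the block mean produced but that are not labels in the input.
invented = set(averaged.ravel().tolist()) - set(labels.ravel().tolist())
```

Here nearest only contains values already present in labels, while averaged contains 4.0 and 5.0, which correspond to no segment. The example also shows a limitation of nearest-neighbour sampling: the thin segment 7 is skipped entirely at this factor.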
- classmethod validate(data)
Validate data.
- Parameters:
data (Any) – Data to validate.
- Raises:
ValueError – If data is not valid.
- Return type: None
- class spatialdata.models.Labels3DModel
Bases: RasterSchema
- classmethod parse(*args, **kwargs)
Validate (or parse) raster data.
- Parameters:
data – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.
dims – Dimensions of the data (e.g. ['c', 'y', 'x'] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match that order.
c_coords (str | list[str] | None) – Channel names of image data. The number of names must equal the length of dimension 'c'. Only supported for Image models.
transformations – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.
scale_factors – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.
method – Method to use for multiscale downsampling. The default (None) differs between images and labels:
  - Images (Image2DModel, Image3DModel): uses multiscale_spatial_image.to_multiscale() with method=Methods.XARRAY_COARSEN. This is the same default as in spatialdata <= 0.7.2 and is fast.
  - Labels (Labels2DModel, Labels3DModel): uses a lazy implementation based on ome-zarr-py's resize() (order=0, nearest-neighbour). This has lower peak memory usage than the multiscale_spatial_image implementation. Note: for images this ome-zarr-py path shows a significant performance regression (both time and memory); see GitHub issue #1079.
  To override the default, pass any Methods value, which will force the multiscale_spatial_image.to_multiscale() code path for all element types. For example:
  - method=Methods.XARRAY_COARSEN: coarsening via xarray (fast, default for images).
  - method=Methods.DASK_IMAGE_NEAREST: nearest-neighbour via dask-image (not lazy as of multiscale-spatial-image==2.0.3, so it leads to higher memory usage).
chunks – Chunks to use for dask array.
kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords kwargs argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data as labels do not have channels.
- Return type: DataArray | DataTree
- Returns: xarray.DataArray or datatree.DataTree
Notes
RGB images
If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGBA image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].
You can also pass the rgb argument to kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, 3-4 channel images will be interpreted automatically as RGB(A) images.
Setting axes / dims
If the data is a numpy or dask array, there are no named axes yet. In this case, we first try to use the dimensions specified by the user in the dims argument of .parse. These dimensions are used to potentially transpose the data to match the order (c)(z)yx. See the description of the dims argument above. If dims is not specified, the dims are set to (c)(z)yx, depending on the number of dimensions of the data.
- classmethod validate(data)
Validate data.
- Parameters:
data (Any) – Data to validate.
- Raises:
ValueError – If data is not valid.
- Return type: None
- class spatialdata.models.ShapesModel
Bases: object
- classmethod parse(data, **kwargs)
- classmethod parse(cls, data, geometry, offsets=None, radius=None, index=None, transformations=None)
- classmethod parse(cls, data, radius=None, index=None, transformations=None, **kwargs)
- classmethod parse(cls, data, radius=None, index=None, transformations=None, **kwargs)
- classmethod parse(cls, data, transformations=None)
Parse shapes data.
- Parameters:
data (Any) – Data to parse:
  - If numpy.ndarray, the shapes are assumed to be given as ragged arrays, in the case of shapely.Polygon or shapely.MultiPolygon. The additional arguments offsets and geometry must therefore be provided.
  - If Path or str, it is read as a GeoJSON file.
  - If geopandas.GeoDataFrame, it is validated. The object needs to have a column called geometry which is a geopandas.GeoSeries of shapely objects. Valid options are combinations of shapely.Polygon or shapely.MultiPolygon or shapely.Point. If the geometries are Point, there must be another column called radius.
geometry – Geometry type of the shapes. The following geometries are supported:
  - 0: Circles
  - 3: Polygon
  - 6: MultiPolygon
offsets – In the case of shapely.Polygon or shapely.MultiPolygon shapes, the offsets of the polygons must be provided in order to initialize the shapes from their ragged array representation. Alternatively, you can call the parser as ShapesModel.parse(data), where data is a GeoDataFrame object, and ignore the offsets parameter (recommended).
radius – Size of the Circles. It must be provided if the shapes are Circles.
index – Index of the shapes, must be of type str. If None, it is generated automatically.
transformations – Transformations of shapes.
kwargs (Any) – Additional arguments for GeoJSON reader.
- Return type:
- Returns:
- classmethod validate(data)
Validate data.
- Parameters:
data (GeoDataFrame) – geopandas.GeoDataFrame to validate.
- Return type: None
- Returns: None
- classmethod validate_shapes_not_mixed_types(gdf)
Check that the Shapes element is either composed of Point or Polygon/MultiPolygon.
- Parameters:
gdf (GeoDataFrame) – The Shapes element.
- Raises:
ValueError – When the geometry column composing the object does not satisfy the type requirements.
- Return type: None
Notes
This function is not called by ShapesModel.validate() because computing the unique types by default could be expensive.
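The rule checked by validate_shapes_not_mixed_types can be sketched in pure Python on a list of geometry type names. This is an illustration of the stated rule, not spatialdata's implementation: check_not_mixed and the two constants are hypothetical names. With a real GeoDataFrame, such names could be obtained from its geom_type attribute.

```python
# Geometry type names, grouped the way the rule above groups them.
POINT_TYPES = {"Point"}
POLYGON_TYPES = {"Polygon", "MultiPolygon"}


def check_not_mixed(geometry_types):
    """Raise ValueError unless all types are Point, or all are (Multi)Polygon.

    Hypothetical helper for illustration only; it is not part of spatialdata.
    """
    # Collecting the unique types requires a pass over every geometry,
    # which is the potentially expensive step mentioned in the note above.
    unique_types = set(geometry_types)
    if unique_types <= POINT_TYPES or unique_types <= POLYGON_TYPES:
        return
    raise ValueError(f"Mixed or unsupported geometry types: {sorted(unique_types)}")


# Polygons and multipolygons may be combined.
check_not_mixed(["Polygon", "MultiPolygon", "Polygon"])

# Points combined with polygons are rejected.
try:
    check_not_mixed(["Point", "Polygon"])
    mixed_rejected = False
except ValueError:
    mixed_rejected = True
```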
- class spatialdata.models.PointsModel
Bases: object
- classmethod parse(data, **kwargs)
- classmethod parse(cls, data, annotation=None, feature_key=None, instance_key=None, transformations=None, **kwargs)
- classmethod parse(cls, data, coordinates=None, feature_key=None, instance_key=None, transformations=None, **kwargs)
- classmethod parse(cls, data, coordinates=None, feature_key=None, instance_key=None, transformations=None, **kwargs)
Validate (or parse) points data.
- Parameters:
data (Any) – Data to parse:
  - If numpy.ndarray, an annotation pandas.DataFrame can be provided, as well as a feature_key column in the annotation dataframe. Furthermore, the numpy.ndarray is assumed to have shape (n_points, axes), with axes being "x", "y" and optionally "z".
  - If pandas.DataFrame, a coordinates mapping can be provided, with valid axes ('x', 'y', 'z') as keys and column names in the dataframe as values. If the dataframe already has columns named 'x', 'y' and 'z', the mapping can be omitted.
annotation – Annotation dataframe. Only used if data is a numpy.ndarray. If data is an array, the index of the annotations will be used as the index of the parsed points.
coordinates – Mapping of axis names (keys) to column names (values) in data. Only used if data is a pandas.DataFrame. Example: {'x': 'my_x_column', 'y': 'my_y_column'}. If not provided and data is a pandas.DataFrame, and x, y and optionally z are column names, then they will be used as coordinates.
feature_key – Optional, feature key in annotation or data. Example use case: gene id categorical column describing the gene identity of each point.
instance_key – Optional, instance key in annotation or data. Example use case: cell id column, describing which cell a point belongs to. This argument is likely going to be deprecated: scverse/spatialdata#503.
transformations – Transformations of points.
kwargs (Any) – Additional arguments for dask.dataframe.from_array().
- Return type: DataFrame
- Returns: dask.dataframe.core.DataFrame
Notes
The order of the columns of the dataframe returned by the parser is not guaranteed to be the same as the order of the columns in the dataframe passed as an argument.
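The way the coordinates mapping is described for pandas.DataFrame input (explicit mapping, or a fallback to columns named x, y and optionally z) can be sketched on a plain list of column names. The function resolve_coordinates is a hypothetical helper written for this example and is not part of spatialdata; the error conditions are assumptions.

```python
VALID_AXES = ("x", "y", "z")


def resolve_coordinates(columns, coordinates=None):
    """Return a mapping {axis name: column name} for the given column names.

    Hypothetical helper for illustration only; it is not part of spatialdata.
    """
    if coordinates is None:
        # Fallback: use columns literally named x, y and (optionally) z.
        if not {"x", "y"} <= set(columns):
            raise ValueError("no coordinates mapping given and 'x'/'y' columns are missing")
        return {axis: axis for axis in VALID_AXES if axis in columns}
    invalid_axes = set(coordinates) - set(VALID_AXES)
    if invalid_axes:
        raise ValueError(f"invalid axes: {sorted(invalid_axes)}")
    missing = [name for name in coordinates.values() if name not in columns]
    if missing:
        raise ValueError(f"columns not found: {missing}")
    return dict(coordinates)


# Explicit mapping, as in the example given for the coordinates parameter.
explicit = resolve_coordinates(
    ["my_x_column", "my_y_column", "gene"],
    {"x": "my_x_column", "y": "my_y_column"},
)

# No mapping: columns named x, y and z are picked up automatically.
inferred = resolve_coordinates(["x", "y", "z", "gene"])
```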
- class spatialdata.models.TableModel
Bases: object
- classmethod parse(adata, region=None, region_key=None, instance_key=None, overwrite_metadata=False)
Parse the anndata.AnnData to be compatible with the model.
- Parameters:
adata (AnnData) – The AnnData object.
region (str | list[str] | None (default: None)) – Region(s) to be used.
region_key (str | None (default: None)) – Key in adata.obs that specifies the region.
instance_key (str | None (default: None)) – Key in adata.obs that specifies the instance.
overwrite_metadata (bool (default: False)) – If True, the region, region_key and instance_key metadata will be overwritten.
- Return type:
- Returns: The parsed data.