Models
The elements (building blocks) that constitute SpatialData.
- class spatialdata.models.Image2DModel(*args, **kwargs)
Bases: RasterSchema
- classmethod parse(data, dims=None, c_coords=None, transformations=None, scale_factors=None, method=None, chunks=None, **kwargs)
Validate (or parse) raster data.
- Parameters:
data (ndarray[Any, dtype[floating[Any]]] | DataArray | Array) – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.
dims (Optional[Sequence[str]] (default: None)) – Dimensions of the data (e.g. [‘c’, ‘y’, ‘x’] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match the order.
c_coords (str | list[str] | None) – Channel names of image data. Must be equal to the length of dimension ‘c’. Only supported for Image models.
transformations (Optional[dict[str, BaseTransformation]] (default: None)) – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.
scale_factors (Optional[Sequence[dict[str, int] | int]] (default: None)) – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.
method (Optional[Methods] (default: None)) – Method to use for multiscale downsampling. Please refer to multiscale_spatial_image.to_multiscale.
chunks (Union[int, tuple[int, ...], tuple[tuple[int, ...], ...], Mapping[Any, None | int | tuple[int, ...]], None] (default: None)) – Chunks to use for dask array.
kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords kwargs argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data, as labels do not have channels.
- Return type:
- Returns:
xarray.DataArray or datatree.DataTree
Notes
RGB images
If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGB(A) image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].
You can also pass the rgb argument to kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, 3-4 channel images will be interpreted automatically as RGB(A) images.
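Because each entry of scale_factors is relative to the previous scale, the total downsampling of each level is a running product. The following is a minimal sketch of that arithmetic only; cumulative_downsampling is a hypothetical helper, not part of spatialdata, and it handles plain integer factors rather than per-axis dictionaries.

```python
from itertools import accumulate
from operator import mul


def cumulative_downsampling(scale_factors):
    """Return the total downsampling of every scale, full resolution first.

    Hypothetical helper for illustration; it is not part of spatialdata.
    Only integer scale factors are handled (not per-axis dicts).
    """
    # Each factor is relative to the previous scale, so the totals are the
    # running product, starting from 1 for the original image.
    return list(accumulate(scale_factors, mul, initial=1))


print(cumulative_downsampling([2, 2, 2]))  # [1, 2, 4, 8]
```

The list has one more entry than scale_factors, which matches the 4 scales described above for [2, 2, 2].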
- validate(data)
Validate data.
- Parameters:
data (Any) – Data to validate.
- Raises:
ValueError – If data is not valid.
- Return type:
None
- class spatialdata.models.Image3DModel(*args, **kwargs)
Bases: RasterSchema
- classmethod parse(data, dims=None, c_coords=None, transformations=None, scale_factors=None, method=None, chunks=None, **kwargs)
Validate (or parse) raster data.
- Parameters:
data (ndarray[Any, dtype[floating[Any]]] | DataArray | Array) – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.
dims (Optional[Sequence[str]] (default: None)) – Dimensions of the data (e.g. [‘c’, ‘y’, ‘x’] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match the order.
c_coords (str | list[str] | None) – Channel names of image data. Must be equal to the length of dimension ‘c’. Only supported for Image models.
transformations (Optional[dict[str, BaseTransformation]] (default: None)) – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.
scale_factors (Optional[Sequence[dict[str, int] | int]] (default: None)) – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.
method (Optional[Methods] (default: None)) – Method to use for multiscale downsampling. Please refer to multiscale_spatial_image.to_multiscale.
chunks (Union[int, tuple[int, ...], tuple[tuple[int, ...], ...], Mapping[Any, None | int | tuple[int, ...]], None] (default: None)) – Chunks to use for dask array.
kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords kwargs argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data, as labels do not have channels.
- Return type:
- Returns:
xarray.DataArray or datatree.DataTree
Notes
RGB images
If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGB(A) image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].
You can also pass the rgb argument to kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, 3-4 channel images will be interpreted automatically as RGB(A) images.
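The dims parameter states that data not already in (c)(z)yx order is transposed to match it. As a rough illustration of what that reordering involves, the sketch below computes the axis permutation for a given dims sequence. transpose_order and CANONICAL_ORDER are hypothetical names introduced here; how spatialdata performs the transposition internally is not shown in this reference.

```python
CANONICAL_ORDER = ("c", "z", "y", "x")


def transpose_order(dims):
    """Return the axis permutation that sorts `dims` into (c)(z)yx order.

    Hypothetical helper for illustration; it is not part of spatialdata.
    """
    unknown = set(dims) - set(CANONICAL_ORDER)
    if unknown:
        raise ValueError(f"Unexpected dimensions: {sorted(unknown)}")
    # Keep only the canonical axes that are actually present, in order.
    target = [d for d in CANONICAL_ORDER if d in dims]
    # For each output axis, look up where it currently sits in `dims`.
    return tuple(dims.index(d) for d in target)


print(transpose_order(["z", "y", "x", "c"]))  # (3, 0, 1, 2)
```

A permutation in this form could be passed as the axes argument of numpy.transpose() to bring a zyxc array into czyx order.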
- validate(data)
Validate data.
- Parameters:
data (Any) – Data to validate.
- Raises:
ValueError – If data is not valid.
- Return type:
None
- class spatialdata.models.Labels2DModel(*args, **kwargs)
Bases: RasterSchema
- classmethod parse(*args, **kwargs)
Validate (or parse) raster data.
- Parameters:
data – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.
dims – Dimensions of the data (e.g. [‘c’, ‘y’, ‘x’] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match the order.
c_coords (str | list[str] | None) – Channel names of image data. Must be equal to the length of dimension ‘c’. Only supported for Image models.
transformations – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.
scale_factors – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.
method – Method to use for multiscale downsampling. Please refer to multiscale_spatial_image.to_multiscale.
chunks – Chunks to use for dask array.
kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords kwargs argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data, as labels do not have channels.
- Return type:
- Returns:
xarray.DataArray or datatree.DataTree
Notes
RGB images
If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGB(A) image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].
You can also pass the rgb argument to kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, 3-4 channel images will be interpreted automatically as RGB(A) images.
- validate(data)
Validate data.
- Parameters:
data (Any) – Data to validate.
- Raises:
ValueError – If data is not valid.
- Return type:
None
- class spatialdata.models.Labels3DModel(*args, **kwargs)
Bases: RasterSchema
- classmethod parse(*args, **kwargs)
Validate (or parse) raster data.
- Parameters:
data – Data to validate (or parse). The shape of the data should be c(z)yx for 2D (3D) images and (z)yx for 2D (3D) labels. If you have a 2D image with shape yx, you can use numpy.expand_dims() (or an equivalent function) to add a channel dimension.
dims – Dimensions of the data (e.g. [‘c’, ‘y’, ‘x’] for 2D image data). If the data is an xarray.DataArray, the dimensions can also be inferred from the data. If the dimensions are not in the order (c)(z)yx, the data will be transposed to match the order.
c_coords (str | list[str] | None) – Channel names of image data. Must be equal to the length of dimension ‘c’. Only supported for Image models.
transformations – Dictionary of transformations to apply to the data. The key is the name of the target coordinate system, the value is the transformation to apply. By default, a single Identity transformation mapping to the "global" coordinate system is applied.
scale_factors – Scale factors to apply to construct a multiscale image (datatree.DataTree). If None, an xarray.DataArray is returned instead. Importantly, each scale factor is relative to the previous scale factor. For example, if the scale factors are [2, 2, 2], the returned multiscale image will have 4 scales: the original image and then the 2x, 4x and 8x downsampled images.
method – Method to use for multiscale downsampling. Please refer to multiscale_spatial_image.to_multiscale.
chunks – Chunks to use for dask array.
kwargs (Any) – Additional arguments for to_spatial_image(). In particular, the c_coords kwargs argument (an iterable) can be used to set the channel coordinates for image data. c_coords is not available for labels data, as labels do not have channels.
- Return type:
- Returns:
xarray.DataArray or datatree.DataTree
Notes
RGB images
If you have an image with 3 or 4 channels and you want to interpret it as an RGB or RGB(A) image, you can use the c_coords argument to specify the channel coordinates as ["r", "g", "b"] or ["r", "g", "b", "a"].
You can also pass the rgb argument to kwargs to automatically set the c_coords to ["r", "g", "b"]. Please refer to to_spatial_image() for more information. Note: if you set rgb=None in kwargs, 3-4 channel images will be interpreted automatically as RGB(A) images.
- validate(data)
Validate data.
- Parameters:
data (Any) – Data to validate.
- Raises:
ValueError – If data is not valid.
- Return type:
None
- class spatialdata.models.ShapesModel
Bases: object
- classmethod parse(data, **kwargs)
Parse shapes data.
- Parameters:
data (Any) – Data to parse:
- If numpy.ndarray, the shapes are assumed to be stored as ragged arrays, as is the case for shapely.Polygon or shapely.MultiPolygon. The additional arguments offsets and geometry must therefore be provided.
- If Path or str, it’s read as a GeoJSON file.
- If geopandas.GeoDataFrame, it’s validated. The object needs to have a column called geometry which is a geopandas.GeoSeries or shapely objects. Valid options are combinations of shapely.Polygon or shapely.MultiPolygon or shapely.Point. If the geometries are Point, there must be another column called radius.
geometry – Geometry type of the shapes. The following geometries are supported:
- 0: Circles
- 3: Polygon
- 6: MultiPolygon
offsets – In the case of shapely.Polygon or shapely.MultiPolygon shapes, in order to initialize the shapes from their ragged array representation, the offsets of the polygons must be provided. Alternatively, you can call the parser as ShapesModel.parse(data), where data is a GeoDataFrame object, and ignore the offsets parameter (recommended).
radius – Size of the Circles. It must be provided if the shapes are Circles.
index – Index of the shapes, must be of type str. If None, it’s generated automatically.
transformations – Transformations of shapes.
kwargs (Any) – Additional arguments for GeoJSON reader.
- Return type:
- Returns:
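To give an intuition for the ragged array representation mentioned for the offsets parameter, the sketch below stores two polygons in one flat coordinate list and uses an offsets list to mark where each one starts and ends. This is a deliberately simplified, single-level illustration: split_by_offsets is a hypothetical helper, and the actual representation expected by the parser may use several offset arrays (for example one for rings and one for polygons).

```python
def split_by_offsets(coords, offsets):
    """Split a flat list of (x, y) pairs into one vertex list per polygon.

    Hypothetical helper for illustration; it is not part of spatialdata.
    `offsets` has one more entry than there are polygons: polygon i spans
    coords[offsets[i]:offsets[i + 1]].
    """
    return [coords[start:stop] for start, stop in zip(offsets, offsets[1:])]


# Two polygons stored back to back: a triangle and a square
# (each ring repeats its first vertex to close the shape).
coords = [
    (0, 0), (1, 0), (0, 1), (0, 0),
    (2, 2), (3, 2), (3, 3), (2, 3), (2, 2),
]
offsets = [0, 4, 9]

polygons = split_by_offsets(coords, offsets)
print(len(polygons))     # 2
print(len(polygons[0]))  # 4
```

Passing a GeoDataFrame to the parser avoids building these arrays by hand, which is why the parameter description recommends that route.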
- classmethod validate(data)
Validate data.
- Parameters:
data (GeoDataFrame) – geopandas.GeoDataFrame to validate.
- Return type:
None
- Returns:
None
- classmethod validate_shapes_not_mixed_types(gdf)
Check that the Shapes element is either composed of Point or Polygon/MultiPolygon.
- Parameters:
gdf (GeoDataFrame) – The Shapes element.
- Raises:
ValueError – When the geometry column composing the object does not satisfy the type requirements.
- Return type:
None
Notes
This function is not called by ShapesModel.validate() because computing the unique types by default could be expensive.
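The rule being checked can be sketched in a few lines of plain Python. The function below is a hypothetical stand-in that works on a list of geometry type names rather than on a GeoDataFrame; it only illustrates the "all Point, or all Polygon/MultiPolygon" condition and why it requires collecting the unique types first.

```python
def check_not_mixed_types(geom_types):
    """Raise ValueError unless all types are Point, or all are (Multi)Polygon.

    Hypothetical stand-in for illustration; it takes geometry type names
    rather than a GeoDataFrame and is not part of spatialdata.
    """
    # Collecting the unique types is the step that can be costly on large data.
    unique = set(geom_types)
    if unique <= {"Point"} or unique <= {"Polygon", "MultiPolygon"}:
        return
    raise ValueError(f"Mixed geometry types: {sorted(unique)}")


check_not_mixed_types(["Polygon", "MultiPolygon"])  # passes silently
try:
    check_not_mixed_types(["Point", "Polygon"])
except ValueError as err:
    print(err)  # Mixed geometry types: ['Point', 'Polygon']
```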
- class spatialdata.models.PointsModel
Bases: object
- classmethod parse(data, **kwargs)
Validate (or parse) points data.
- Parameters:
data (Any) – Data to parse:
- If numpy.ndarray, an annotation pandas.DataFrame can be provided, as well as a feature_key column in the annotation dataframe. Furthermore, numpy.ndarray is assumed to have shape (n_points, axes), with axes being “x”, “y” and optionally “z”.
- If pandas.DataFrame, a coordinates mapping can be provided, with valid axes (‘x’, ‘y’, ‘z’) as keys and column names in the dataframe as values. If the dataframe already has columns named ‘x’, ‘y’ and ‘z’, the mapping can be omitted.
annotation – Annotation dataframe. Only if data is numpy.ndarray. If data is an array, the index of the annotations will be used as the index of the parsed points.
coordinates – Mapping of axes names (keys) to column names (values) in data. Only if data is pandas.DataFrame. Example: {‘x’: ‘my_x_column’, ‘y’: ‘my_y_column’}. If not provided and data is pandas.DataFrame, and x, y and optionally z are column names, then they will be used as coordinates.
feature_key – Optional, feature key in annotation or data. Example use case: gene id categorical column describing the gene identity of each point.
instance_key – Optional, instance key in annotation or data. Example use case: cell id column, describing which cell a point belongs to. This argument is likely going to be deprecated: scverse/spatialdata#503.
transformations – Transformations of points.
kwargs (Any) – Additional arguments for dask.dataframe.from_array().
- Return type:
DataFrame
- Returns:
dask.dataframe.core.DataFrame
Notes
The order of the columns of the dataframe returned by the parser is not guaranteed to be the same as the order of the columns in the dataframe passed as an argument.
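The fallback described for the coordinates parameter (use an explicit mapping if given, otherwise pick up columns named x, y and optionally z) can be sketched as follows. resolve_coordinates is a hypothetical helper operating on a plain list of column names; the requirement that both x and y be present is an assumption made for this sketch, not something this reference states.

```python
def resolve_coordinates(columns, coordinates=None):
    """Return a mapping from axis name to column name.

    Hypothetical helper for illustration; it is not part of spatialdata.
    """
    if coordinates is None:
        # Fall back to columns literally named x, y and (optionally) z.
        coordinates = {axis: axis for axis in ("x", "y", "z") if axis in columns}
    # Assumption for this sketch: x and y are always required.
    if "x" not in coordinates or "y" not in coordinates:
        raise ValueError("Both 'x' and 'y' coordinates are required.")
    missing = [col for col in coordinates.values() if col not in columns]
    if missing:
        raise ValueError(f"Columns not found: {missing}")
    return coordinates


# Columns already named x and y: no mapping needed.
print(resolve_coordinates(["x", "y", "gene"]))  # {'x': 'x', 'y': 'y'}

# Custom column names: pass the mapping explicitly.
mapping = {"x": "my_x_column", "y": "my_y_column"}
print(resolve_coordinates(["my_x_column", "my_y_column", "gene"], mapping))
```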
- classmethod validate(data)
Validate data.
- Parameters:
data (DataFrame) – dask.dataframe.core.DataFrame to validate.
- Return type:
None
- Returns:
None
- class spatialdata.models.TableModel
Bases: object
- classmethod parse(adata, region=None, region_key=None, instance_key=None)
Parse the anndata.AnnData to be compatible with the model.
- Parameters:
adata (AnnData) – The AnnData object.
region (Union[list[str], str, None] (default: None)) – Region(s) to be used.
region_key (Optional[str] (default: None)) – Key in adata.obs that specifies the region.
instance_key (Optional[str] (default: None)) – Key in adata.obs that specifies the instance.
- Return type:
- Returns:
The parsed data.