Squidpy integration#

In this notebook, we will describe some usage principles for using SpatialData with squidpy.

Let’s first import some useful libraries and read in a spatialdata dataset.

You might have to install squidpy with !pip install squidpy.

import scanpy as sc
import spatialdata as sd
import squidpy as sq
import spatialdata_plot  # noqa: F401

We chosed a Xenium dataset formatted in the spatialdata format. Please download the xenium_rep1_io data from https://spatialdata.scverse.org/en/stable/tutorials/notebooks/datasets/README.html. Please rename the file to xenium.zarr and place it in the same folder as this notebook (or use symlinks to make the data accessible).

Information regarding data licensing and attribution for the dataset listed above is available at: https://github.com/scverse/spatialdata-notebooks/tree/main/datasets.

sdata = sd.read_zarr("xenium.zarr")
sdata
SpatialData object, with associated Zarr store: /Users/macbook/embl/projects/basel/spatialdata-integration-testing/dependencies/spatialdata-sandbox/xenium_rep1_io/data.zarr
├── Images
│     ├── 'morphology_focus': DataTree[cyx] (1, 25778, 35416), (1, 12889, 17708), (1, 6444, 8854), (1, 3222, 4427), (1, 1611, 2213)
│     └── 'morphology_mip': DataTree[cyx] (1, 25778, 35416), (1, 12889, 17708), (1, 6444, 8854), (1, 3222, 4427), (1, 1611, 2213)
├── Points
│     └── 'transcripts': DataFrame with shape: (<Delayed>, 8) (3D points)
├── Shapes
│     ├── 'cell_boundaries': GeoDataFrame shape: (167780, 1) (2D shapes)
│     ├── 'cell_circles': GeoDataFrame shape: (167780, 2) (2D shapes)
│     └── 'xenium_landmarks': GeoDataFrame shape: (3, 2) (2D shapes)
└── Tables
      └── 'table': AnnData (167780, 313)
with coordinate systems:
    ▸ 'aligned', with elements:
        morphology_focus (Images)
    ▸ 'global', with elements:
        morphology_focus (Images), morphology_mip (Images), transcripts (Points), cell_boundaries (Shapes), cell_circles (Shapes), xenium_landmarks (Shapes)

SpatialData has a more complex structure than the (legacy) spatial AnnData format introduced by squidpy. Nevertheless, because it fundamentally uses AnnData as table for annotating regions, with some minor adjustments we can readily use any tool from the scverse ecosystem (squidpy included) to perform downstream analysis.

More precisely, the scverse ecosystem will gradually transition to use SpatialData internally; until that moment the function to_legacy_anndata() from spatialdata_io.experimental can be used to enrich the AnnData in sdata.tables objects to be compatible with the legacy AnnData format.

Generally (i.e. when a single coordinate system is used), the only adjustment required is to populate .obsm['spatial'] with the centroids of the geometries being considered, and this is performed automatically by the spatialdata-io library. For more complex cases, the converter functions to_legacy_anndata() and from_legacy_anndata() can be used (see example at the end of the notebook).

Xenium example#

As an example of integration with Squidpy, let’s compute a nearest neighbor graph of the spatial coordiantes of the xenium dataset.

sq.gr.spatial_neighbors(sdata["table"])
INFO     Creating graph using `generic` coordinates and `None` transform and `1` libraries.

After that, we can cluster the cells based on gene expression profiles and compute clustering.

%%time
sc.pp.pca(sdata["table"])
sc.pp.neighbors(sdata["table"])
sc.tl.leiden(sdata["table"])
CPU times: user 1min 10s, sys: 2.77 s, total: 1min 13s
Wall time: 1min 14s

And run the neighbor enrichment analysis in squidpy.

sq.gr.nhood_enrichment(sdata["table"], cluster_key="leiden")
sq.pl.nhood_enrichment(sdata["table"], cluster_key="leiden", figsize=(5, 5))
../../../../_images/b1b01af2f1df53718238b294bd72d86903761fab98dc39e5e9e6235b1d68756e.png

We can finally visualize the results in spatial coordinates both with squidpy as well as with the novel plotting function in spatialdata.

%%time
sq.pl.spatial_scatter(sdata["table"], shape=None, color="leiden")
WARNING: Please specify a valid `library_id` or set it permanently in `adata.uns['spatial']`
CPU times: user 84.5 ms, sys: 5.9 ms, total: 90.4 ms
Wall time: 90.7 ms
../../../../_images/a4f2266f55c8508845ca24ed3eda1233b3f79ca15e709f6ee97451dfe8ff9692.png
%%time
# currently we need either to use `method='matplotlib`` (which is significantly slower than `method='datashader'`) or we need the one line workaround below because of this bug:
# https://github.com/scverse/spatialdata-plot/issues/291.
sdata["cell_circles"] = sd.transform(sdata["cell_circles"], to_coordinate_system="global")  # workaround
sdata.pl.render_shapes("cell_circles", color="leiden", method="datashader").pl.show(coordinate_systems="global")
INFO     Successfully extracted 47 colors from 'leiden_colors' in table 'table'.                                   
INFO     Using 'datashader' backend with 'sum' as reduction method to speed up plotting. Depending on the reduction
         method, the value range of the plot might change. Set method to 'matplotlib' to disable this behaviour.
CPU times: user 7.31 s, sys: 682 ms, total: 7.99 s
Wall time: 7.76 s
../../../../_images/10cf048f75a6c3445dae7c9922348110e495ca4d82639de7923510aba9f1090b.png

Using the converters from/to the legacy spatial AnnData format#

We will now show how to convert from/to the legacy spatialdata AnnData format. The convertes deal with the complexity of having multiple libraries in the legacy format, and with the generality of coordinate systems in SpatialData. For more information please refer to the API documentation of spatialdata-io.

from typing import Literal

import matplotlib.pyplot as plt
import scanpy as sc
import squidpy as sq
from spatialdata.datasets import blobs_annotating_element
from spatialdata.transformations import Affine, set_transformation
from spatialdata_io.experimental import from_legacy_anndata, to_legacy_anndata

BlobsTypes = Literal["blobs_labels", "blobs_circles", "blobs_polygons", "blobs_multipolygons"]

Example of conversion from the legacy AnnData to SpatialData#

# some datasets you can play around with
# adata = sq.datasets.visium_fluo_adata()
# adata = sq.datasets.merfish()
adata = sq.datasets.visium_hne_adata_crop()

sdata = from_legacy_anndata(adata)

my_gene = "Itpka"
# broken (multiple table not yet supported in spatialdata-plot), this is being fix in https://github.com/scverse/spatialdata-plot/pull/220
sdata.pl.render_images("V1_Adult_Mouse_Brain_hires").pl.render_shapes(color=my_gene).pl.show()
plt.show()
../../../../_images/37fef6291bf81717457ebb0b60f81e7fd9fb7ac953e5992edb229254432be71e.png

Example of conversion from SpatialData to the legacy AnnData, showing also that affine transformations are handled#

element = "blobs_circles"
sdata = blobs_annotating_element(element)
image = "blobs_multiscale_image"

affine = Affine([[1, 2, 3], [-2, 1, 6], [0, 0, 1]], input_axes=("x", "y"), output_axes=("x", "y"))
set_transformation(sdata[element], affine, "transformed")
set_transformation(sdata[image], affine, "transformed")

adata = to_legacy_anndata(sdata, include_images=True, coordinate_system="transformed")

axes = plt.subplots(1, 2, figsize=(10, 5))[1]
sdata.pl.render_images(image).pl.render_shapes(element).pl.show(coordinate_systems=["transformed"], ax=axes[0])
sc.pl.spatial(
    adata,
    spot_size=300,
    library_id=image,
    img_key="hires",
    na_color="white",
    show=True,
    crop_coord=(0, 2000, 0, 2000),
    ax=axes[1],
)
../../../../_images/dd099bf60e4647916d53be9a0b88bb8ebedfaee4cf0dc43d826d2be76b8baa37.png