Spatial omics datasets

Here you can find all datasets needed to run the example notebooks, already converted to the Zarr file format.

If you want to convert additional datasets, check out the scripts available in the spatialdata sandbox.

| Technology | Sample | File size | Filename (spatialdata-sandbox) | Download data | Work with data remotely (see note below) | License |
|---|---|---|---|---|---|---|
| Visium HD | Mouse intestine [2] | 1 GB | visium_hd_3.0.0_id | .zarr.zip | S3 | CCA |
| Visium | Breast cancer [3] | 1.5 GB | visium_associated_xenium_io | .zarr.zip | S3 | CCA |
| Xenium | Breast cancer [3] | 2.8 GB | xenium_rep1_io | .zarr.zip | S3 | CCA |
| Xenium | Breast cancer [3] | 3.7 GB | xenium_rep2_io | .zarr.zip | S3 | CCA |
| CyCIF (MCMICRO output) | Small lung adenocarcinoma [4] | 250 MB | mcmicro_io | .zarr.zip | S3 | CC BY-NC 4.0 DEED |
| MERFISH | Mouse brain [5] | 50 MB | merfish | .zarr.zip | S3 | CC0 1.0 DEED |
| MIBI-TOF | Colorectal carcinoma [6] | 25 MB | mibitof | .zarr.zip | S3 | CC BY 4.0 DEED |
| Imaging Mass Cytometry (Steinbock output) | 4 different cancers (SCCHN, BCC, NSCLC, CRC) [7][8][9] | 820 MB | steinbock_io | .zarr.zip | S3 | CC BY 4.0 DEED |
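
Each download in the table above is a zipped Zarr store. Unpacking one with the Python standard library might look like the minimal sketch below. The helper `extract_zarr_zip` and the file names in the usage note are hypothetical, and the code assumes the archive contains a single top-level directory ending in `.zarr`, which should be checked against the file you actually download.

```python
import zipfile
from pathlib import Path


def extract_zarr_zip(archive: Path, dest: Path) -> list[Path]:
    """Unpack a .zarr.zip archive into dest and return the Zarr stores found.

    Hypothetical helper for illustration; it is not part of spatialdata.
    Assumes the archive holds one or more top-level "*.zarr" directories.
    """
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    # A Zarr store on disk is a directory, so keep only "*.zarr" directories.
    return sorted(p for p in dest.glob("*.zarr") if p.is_dir())
```

For example, `stores = extract_zarr_zip(Path("merfish.zarr.zip"), Path("data"))` would unpack a downloaded archive into `data/`. Assuming `spatialdata` is installed, the resulting directory can then be opened with `spatialdata.read_zarr(stores[0])`.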

For the Visium and Xenium breast cancer datasets, we also provide versions that are aligned in a common coordinate system and that include the cell-type information used to annotate the Xenium cells, as described in our paper.

| Technology | Sample | File size | Filename (spatialdata-sandbox) | Download data | Work with data remotely (see note below) | License |
|---|---|---|---|---|---|---|
| Visium | Breast cancer [3] | 1.5 GB | visium_associated_xenium_io | .zarr.zip | S3 | CCA |
| Xenium | Breast cancer [3] | 2.8 GB | xenium_rep1_io | .zarr.zip | S3 | CCA |
| Xenium | Breast cancer [3] | 3.7 GB | xenium_rep2_io | .zarr.zip | S3 | CCA |

Note on S3 storage: opening the S3 URLs in a web browser will not work; you need to treat the URLs as Zarr stores. For example, if you append .zgroup to any of the URLs above, you will be able to see that file.
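
The check described in the note can be sketched in Python as follows. The helper names are hypothetical, and the URL in the usage comment is a placeholder, not one of the real links. The code also assumes the store uses Zarr v2 metadata, where the root group is marked by a `.zgroup` JSON file containing a `zarr_format` field.

```python
import json
from urllib.request import urlopen


def zgroup_url(store_url: str) -> str:
    """Return the URL of the .zgroup file at the root of a Zarr store."""
    return store_url.rstrip("/") + "/.zgroup"


def parse_zgroup(raw: bytes) -> int:
    """Parse the bytes of a .zgroup file and return the Zarr format version."""
    return int(json.loads(raw.decode("utf-8"))["zarr_format"])


def fetch_zarr_format(store_url: str, timeout: float = 10.0) -> int:
    """Download .zgroup from a remote store and return its format version."""
    with urlopen(zgroup_url(store_url), timeout=timeout) as response:
        return parse_zgroup(response.read())


# Usage (placeholder URL; substitute one of the S3 links from the tables):
# fetch_zarr_format("https://example.org/path/to/dataset.zarr/")
```

If the request succeeds and the JSON parses, the URL points at the root of a Zarr group, even though the same URL shows nothing useful in a browser.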

License abbreviations

  • CCA: Creative Commons Attribution

  • CC0 1.0 DEED: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication

  • CC BY 4.0 DEED: Creative Commons Attribution 4.0 International

  • CC BY-NC 4.0 DEED: Creative Commons Attribution-NonCommercial 4.0 International

The data retains the license of the original published data.

Artificial datasets

Here you can also find additional datasets and resources for methods developers.

References

If you use the datasets, please cite the original sources and double-check their licenses.