Spatial omics datasets

Here you can find all datasets needed to run the example notebooks, already converted to the SpatialData Zarr file format.

Scripts to convert data from several other technologies into SpatialData Zarr are available in the spatialdata sandbox; in particular:

  • CyCIF (MCMICRO output)[4]

  • Imaging Mass Cytometry, IMC (Steinbock output)[7][8][9]

  • seqFISH

| Technology | Sample | File Size | Filename (spatialdata-sandbox) | License |
|---|---|---|---|---|
| Visium HD | Mouse intestine [1] | ~2.4 GB | visium_hd_3.0.0_io | CC BY 4.0 |
| Visium HD | Mouse brain [13] | <200 MB | visium_hd_4.0.1_io | CC BY 4.0 |
| Visium | Breast cancer [2] | ~1.5 GB | visium_associated_xenium_io | CC BY 4.0 |
| Visium | Mouse brain [14] | <100 MB | visium | CC BY 4.0 |
| Xenium | Breast cancer [2] | ~2.8 GB | xenium_rep1_io | CC BY 4.0 |
| Xenium | Lung cancer [3] | ~5.4 GB | xenium_2.0.0_io | CC BY 4.0 |
| MERFISH | Mouse brain [5] | ~50 MB | merfish | CC0 1.0 |
| MIBI-TOF | Colorectal carcinoma [6] | ~25 MB | mibitof | CC BY 4.0 |
| Molecular Cartography (SPArrOW output) | Mouse Liver [10][11] | ~70 MB | mouse_liver | CC BY 4.0 |
| SpaceM | Hepa and NIH3T3 cells [12] | ~60 MB | spacem_helanih3t3 | CC BY 4.0 |

Please select the dataset and version below to download the data. Available versions are fetched from the S3 bucket.
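
Once a dataset has been downloaded, it can be unpacked and opened from Python. The following is a minimal sketch, not an official recipe. It assumes the download is a `.zip` archive containing a single `.zarr` directory (the packaging is an assumption, not stated on this page) and that the `spatialdata` package is installed. The helper names `extract_dataset`, `load_dataset` and `download_and_load` are hypothetical, and the URL must be supplied by you from the download selector.

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve


def extract_dataset(zip_path, dest_dir):
    """Unpack an archive and return the .zarr directories found inside.

    Assumption: the dataset ships as a zip file wrapping a Zarr store.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # A Zarr store is a directory, so keep only directories named *.zarr.
    return sorted(p for p in dest.rglob("*.zarr") if p.is_dir())


def load_dataset(zarr_path):
    """Open a SpatialData Zarr store as a SpatialData object."""
    # Imported lazily so that extraction works without spatialdata installed.
    import spatialdata

    return spatialdata.read_zarr(zarr_path)


def download_and_load(url, workdir="data"):
    """Download an archive from `url` (user-supplied), unpack it, and load it."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    archive_path, _ = urlretrieve(url, workdir / "dataset.zip")
    stores = extract_dataset(archive_path, workdir)
    if not stores:
        raise FileNotFoundError("no .zarr directory found in the archive")
    return load_dataset(stores[0])
```

Calling `download_and_load(url)` with the link of a small dataset such as `merfish` (~50 MB) is a quick way to check the setup before fetching the multi-gigabyte Visium HD or Xenium archives.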


License abbreviations

  • CC0 1.0: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication

  • CC BY 4.0: Creative Commons Attribution 4.0 International

  • CC BY-NC 4.0: Creative Commons Attribution-NonCommercial 4.0 International

The data retains the license of the original published data.

Artificial datasets

Also, here you can find additional datasets and resources for methods developers.

References

If you use the datasets, please cite the original sources and double-check their licenses.

Opening an issue

If you notice any issues, such as a changed dataset, a removed dataset, or missing dataset information, please open a GitHub issue so we can address it. Thank you!