Contributing guide#
Scanpy provides extensive developer documentation, most of which applies to this repo, too. This document will not reproduce the entire content from there. Instead, it aims at summarizing the most important information to get you started on contributing.
We assume that you are already familiar with git and with making pull requests on GitHub. If not, please refer to the scanpy developer guide.
Installing dev dependencies#
In addition to the packages needed to use this package, you need additional Python packages to run tests and build
the documentation. You can install them with pip:
cd spatialdata-io
pip install -e ".[dev,test,doc]"
Code-style#
This project uses pre-commit to enforce a consistent code style. On every commit, the pre-commit checks will either automatically fix issues in the code or raise an error message.
To enable pre-commit locally, simply run
pre-commit install
in the root of the repository. Pre-commit will automatically download all dependencies when it is run for the first time.
Alternatively, you can rely on the pre-commit.ci service enabled on GitHub. If you didn’t run pre-commit before pushing changes to GitHub, it will automatically commit fixes to your pull request or show an error message.
If pre-commit.ci added a commit to a branch you are still working on locally, simply use
git pull --rebase
to integrate the changes into yours. While pre-commit.ci is useful, we strongly encourage installing and running pre-commit locally first to understand how it works.
Finally, most editors have an autoformat on save feature. Consider enabling this option for black and prettier.
Writing tests#
Note
Remember to first install the package with pip install -e ".[dev,test]"
This package uses pytest for automated testing. Please write tests for every function added to the package.
Most IDEs integrate with pytest and provide a GUI to run tests. Alternatively, you can run all tests from the command line by executing
pytest
in the root of the repository. Continuous integration will automatically run the tests on all pull requests.
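As a minimal sketch of what such a test can look like: the function `normalize_sample_name` below is a hypothetical stand-in for a function added to the package, not part of its actual API. pytest collects any function whose name starts with `test_` and treats a failing plain `assert` as a test failure, so no special imports are needed for simple cases.

```python
def normalize_sample_name(name: str) -> str:
    """Hypothetical helper standing in for a function added to the package."""
    return name.strip().lower().replace(" ", "_")


def test_strips_and_lowercases():
    # Surrounding whitespace is dropped and inner spaces become underscores.
    assert normalize_sample_name("  Mouse Brain ") == "mouse_brain"


def test_is_idempotent():
    # Normalizing an already normalized name must not change it.
    once = normalize_sample_name("Mouse Brain")
    assert normalize_sample_name(once) == once
```

Saved in a file whose name starts with `test_` (for example under a `tests` directory, which is an assumption about the layout), these functions would be picked up by running pytest from the repository root.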
Continuous integration#
Continuous integration will automatically run the tests on all pull requests and test against the minimum and maximum supported Python version.
Additionally, there’s a CI job that tests against pre-releases of all dependencies (if there are any). The purpose of this check is to detect incompatibilities with new package versions early on, giving you time to fix the issue or reach out to the developers of the dependency before the package is released to a wider audience.
Publishing a release#
Updating the version number#
Before making a release, you need to update the version number. Please adhere to Semantic Versioning. In brief:
Given a version number MAJOR.MINOR.PATCH, increment the:
MAJOR version when you make incompatible API changes,
MINOR version when you add functionality in a backwards compatible manner, and
PATCH version when you make backwards compatible bug fixes.
Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
You can find the labels for pre-releases on this page.
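The increment rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, under the assumption that the version is a plain MAJOR.MINOR.PATCH string without pre-release labels; the function name `bump` is made up for this example and is not how bump2version is implemented.

```python
def bump(version: str, part: str) -> str:
    """Return the next version after incrementing the given part."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Incompatible API change: reset MINOR and PATCH.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backwards compatible feature: reset PATCH.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backwards compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump("1.2.3", "patch"))  # 1.2.4
print(bump("1.2.3", "minor"))  # 1.3.0
print(bump("1.2.3", "major"))  # 2.0.0
```

Note that incrementing a part resets every part to its right to zero.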
You can either use bump2version to automatically create a git tag with the updated version number, or manually create the tag yourself (locally or from the GitHub interface when making a release).
If you use bump2version, you can run one of the following commands in the root of the repository
bump2version patch
bump2version minor
bump2version major
# if you want to create a pre-release
bump2version --new-version 1.2.0rc1
Once you are done, run
git push --tags
to publish the created tag on GitHub.
It’s important that the tag for a pre-release follows this naming convention as it will determine if the package is displayed as pre-release or release in PyPI.
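To check a tag name before pushing it, a small helper along these lines may be useful. It is a simplified sketch: it assumes tags of the form MAJOR.MINOR.PATCH with an optional `v` prefix and an optional `a`, `b` or `rc` suffix followed by a number (as in `1.2.0rc1`), and it does not cover the other version forms that PEP 440 allows. The name `is_prerelease_tag` is hypothetical.

```python
import re

# Optional "v" prefix, MAJOR.MINOR.PATCH, then an optional pre-release
# suffix: "a", "b" or "rc" followed by a number.
_TAG_PATTERN = re.compile(r"^v?\d+\.\d+\.\d+(?P<pre>(a|b|rc)\d+)?$")


def is_prerelease_tag(tag: str) -> bool:
    """Return True if the tag carries a pre-release suffix such as rc1."""
    match = _TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"tag does not follow the expected convention: {tag!r}")
    return match.group("pre") is not None


print(is_prerelease_tag("v1.2.0rc1"))  # True
print(is_prerelease_tag("v0.2.3"))     # False
```

A tag that the pattern rejects, such as `v1.2.0-rc`, raises a `ValueError` instead of being silently treated as a regular release.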
Making a release on GitHub and publishing to PyPI#
If you already tagged and pushed a commit as explained above and you want to create a release from that tag, you can go to the Tags page on GitHub, select the (latest) tag and press the “Create release from tag” button. Please name the release with the same string used for the tag (including the v prefix).
Alternatively, you can go to the Releases page on GitHub and press the “Draft a new release” button. Now press “Choose a tag” and create a new tag.
Both approaches lead to the same page and view. From this, you need to specify if the release is a pre-release or if it should be set as the latest release (please use the checkboxes accordingly).
The last step is to fill in the release notes (explained in the next section). After this, you can press the “Publish release” button and the release will be available on GitHub. A GitHub action will automatically build the package and upload it to PyPI. The action may fail, so please check the status badge of the action in the Readme.
Writing release notes#
We recommend using the button “Generate release notes” to automatically collect all the information of the pull requests that are part of the release. The release notes serve as a changelog for the user of the package so it’s important to have them curated and well-organized. This is explained in depth below.
Here is an example of automatically generated release notes for a previous release (v0.2.3):
## What's Changed
* Add clip parameter to polygon_query; tests missing by @LucaMarconato in https://github.com/scverse/spatialdata/pull/670
* Add sort parameter to points model by @LucaMarconato in https://github.com/scverse/spatialdata/pull/672
* [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/scverse/spatialdata/pull/673
* Docs for datasets (blobs, raccoon) by @LucaMarconato in https://github.com/scverse/spatialdata/pull/674
* Update issue templates by @LucaMarconato in https://github.com/scverse/spatialdata/pull/675
* Minor fixes: `id()` -> `is`, inplace category subset `AnnData` relational query by @LucaMarconato in https://github.com/scverse/spatialdata/pull/681
* Added ColorLike to _types.py by @timtreis in https://github.com/scverse/spatialdata/pull/689
* [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/scverse/spatialdata/pull/685
* [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/scverse/spatialdata/pull/690
* [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/scverse/spatialdata/pull/698
* Fix labels multiscales method by @aeisenbarth in https://github.com/scverse/spatialdata/pull/697
**Full Changelog**: https://github.com/scverse/spatialdata/compare/v0.2.2...v0.2.3
The release notes above can be hard to read, but this is addressed by our configuration file. It organizes release notes by change type, inferred from GitHub labels, and ignores PRs from bots. We recommend opening the PRs included in the release and adding the appropriate labels. The automatic generation will then group PRs by release labels and list each PR on a separate line. Here is an example output:
<!-- Release notes generated using configuration in .github/release.yml at main -->
## What's Changed
### Major
* Adding `attrs` at the `SpatialData` object level by @quentinblampey in https://github.com/scverse/spatialdata/pull/711
### Minor
* Add asv benchmark code by @berombau in https://github.com/scverse/spatialdata/pull/784
* relabel block by @ArneDefauw in https://github.com/scverse/spatialdata/pull/664
* validate tables while parsing by @melonora in https://github.com/scverse/spatialdata/pull/808
### Fixed
* relaxed fsspec version by @LucaMarconato in https://github.com/scverse/spatialdata/pull/798
* fix for to_polygons when using processes instead of threads in dask by @ArneDefauw in https://github.com/scverse/spatialdata/pull/756
* Fix `transform_to_data_extent` converting labels to images by @aeisenbarth in https://github.com/scverse/spatialdata/pull/791
* fix join non matching table by @melonora in https://github.com/scverse/spatialdata/pull/813
**Full Changelog**: https://github.com/scverse/spatialdata/compare/v0.2.6...v0.2.7
Use informative titles for PRs, as these will serve as section titles in the release notes (rename the PRs if necessary). You can also manually edit the release notes before publishing them to improve readability.
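The grouping described above can be pictured with the following sketch. It only mimics the behaviour for illustration; the actual grouping is done by GitHub based on the repository's configuration file. The label names and the `BOT_AUTHORS` set are assumptions made for this example, and the labels attached to the two PRs are invented, while the headings and PR titles are taken from the example output above.

```python
# Hypothetical mapping from GitHub labels to release-note headings.
HEADINGS = {
    "release-major": "Major",
    "release-minor": "Minor",
    "release-fixed": "Fixed",
}
# Hypothetical list of bot accounts whose PRs are left out.
BOT_AUTHORS = {"pre-commit-ci"}


def group_pull_requests(pull_requests):
    """Group PR titles under headings by label, skipping PRs from bots."""
    grouped = {heading: [] for heading in HEADINGS.values()}
    for pr in pull_requests:
        if pr["author"] in BOT_AUTHORS:
            continue
        for label in pr["labels"]:
            if label in HEADINGS:
                grouped[HEADINGS[label]].append(pr["title"])
                break  # list each PR only once
        # PRs without a known label are simply skipped in this sketch.
    return grouped


pull_requests = [
    {"title": "relabel block", "author": "ArneDefauw", "labels": ["release-minor"]},
    {"title": "fix join non matching table", "author": "melonora", "labels": ["release-fixed"]},
    {"title": "[pre-commit.ci] pre-commit autoupdate", "author": "pre-commit-ci", "labels": []},
]
for heading, titles in group_pull_requests(pull_requests).items():
    print(heading, titles)
```

The sketch makes the practical point visible: a PR only lands under the right heading if it carries the right label, and its title is what readers of the release notes will see.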
Some additional considerations
Important! If a PR is large and its title isn’t informative or requires multiple lines, do not add a release tag. Instead, at the end of the first message of the PR discussion, please include a markdown section with title
# Release notes
with a brief description of the intended release notes. This will allow the person making a release to manually add the PR content to the release notes during the release process.
Please avoid redundancy and do not add the same release notes to consecutive pre-releases/releases/post-releases.
When automatically generating the release notes, you can use the button “Previous tag: …” to choose which PRs will be included in the release notes.
Finally, you can see an example of a release in action in this short video tutorial from Luca.
Publishing to conda-forge#
Shortly after you make a release on PyPI, a new PR will be automatically opened in the conda-forge “feedstock repository” for the package (this has been previously set up). The PR will contain a checklist of the tasks that must be completed before the PR can be merged. Once the PR is merged, the package will be available in the conda-forge channel.
In practice, the work usually consists of comparing the package requirements in pyproject.toml from your repository with the packages and versions in the meta.yaml file in the conda-forge feedstock repository. If there are any differences, you should update the meta.yaml file accordingly. After that, the CI will run and, if it is green, the PR can be merged.
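The comparison can also be done with a short script once both requirement lists have been copied out by hand. The sketch below is a rough aid under several assumptions: it compares package names only (not version constraints), it assumes names are the same on PyPI and conda-forge, which is not always the case, and the function names and example requirements are made up for illustration.

```python
import re


def requirement_names(requirements):
    """Extract normalized package names from requirement strings."""
    names = set()
    for requirement in requirements:
        # The name is the leading run of letters, digits, ".", "_" and "-".
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
        if match:
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def compare_requirements(pyproject_deps, feedstock_deps):
    """Report names that appear in only one of the two lists."""
    pyproject = requirement_names(pyproject_deps)
    feedstock = requirement_names(feedstock_deps)
    return {
        "missing_in_feedstock": sorted(pyproject - feedstock),
        "extra_in_feedstock": sorted(feedstock - pyproject),
    }


# Made-up example values, not the real requirements of any package.
pyproject_deps = ["anndata", "numpy>=1.20", "some_new_dependency>=0.1"]
feedstock_deps = ["python >=3.10", "anndata", "numpy >=1.20"]
print(compare_requirements(pyproject_deps, feedstock_deps))
```

Entries such as `python` will typically show up as extra on the feedstock side and can be ignored; anything listed as missing is a candidate for updating meta.yaml, after which the version constraints still need a manual check.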
Writing documentation#
Please write documentation for new or changed features and use-cases. This project uses sphinx with the following features:
The myst extension, which allows writing documentation in markdown/Markedly Structured Text
Numpy-style docstrings (through the napoleon extension).
Jupyter notebooks as tutorials through myst-nb (See Tutorials with myst-nb)
Sphinx autodoc typehints, to automatically reference annotated input and output types
See the scanpy developer docs for more information on how to write documentation.
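For orientation, a Numpy-style docstring might look like the sketch below. The function `scale_coordinates` is hypothetical and exists only to show the section layout. Types are given in the signature rather than repeated in the docstring, on the assumption that the autodoc typehints extension adds them to the rendered documentation.

```python
def scale_coordinates(coordinates: list[float], factor: float = 1.0) -> list[float]:
    """Scale a list of coordinates by a constant factor.

    Parameters
    ----------
    coordinates
        Coordinate values to scale.
    factor
        Multiplicative factor applied to every coordinate.

    Returns
    -------
    The scaled coordinates, in the same order as the input.

    Examples
    --------
    >>> scale_coordinates([1.0, 2.0], factor=2.0)
    [2.0, 4.0]
    """
    return [value * factor for value in coordinates]
```

Each section title is underlined with dashes of the same length, and each parameter name sits on its own line with the description indented beneath it.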
Tutorials with myst-nb and jupyter notebooks#
The documentation is set up to render jupyter notebooks stored in the docs/notebooks directory using myst-nb.
Currently, only notebooks in .ipynb format are supported; they are included with both their input and output cells.
It is your responsibility to update and re-run the notebook whenever necessary.
If you are interested in automatically running notebooks as part of the continuous integration, please check out this feature request in the cookiecutter-scverse repository.
Hints#
If you refer to objects from other packages, please add an entry to intersphinx_mapping in docs/conf.py. Only if you do so can sphinx automatically create a link to the external documentation.
If building the documentation fails because of a missing link that is outside your control, you can add an entry to the nitpick_ignore list in docs/conf.py.
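As an illustration of the two hints above, entries in docs/conf.py could look roughly like this. The specific entries are examples chosen for this sketch and may differ from what the repository's conf.py actually contains; the `nitpick_ignore` target in particular is a hypothetical name.

```python
# Excerpt in the style of docs/conf.py; the entries are illustrative.

intersphinx_mapping = {
    # name: (base URL of the external documentation, inventory location or None)
    "python": ("https://docs.python.org/3", None),
    "numpy": ("https://numpy.org/doc/stable/", None),
}

nitpick_ignore = [
    # (reference type, target) pairs that sphinx should not warn about
    ("py:class", "some_package.SomeUnresolvableType"),  # hypothetical target
]
```

With `None` as the second tuple element, sphinx looks for the inventory file at its default location under the given base URL.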
Building the docs locally#
cd docs
make html
open _build/html/index.html
Debugging and profiling#
There are various tools available to help you understand the existing code base and your new code contributions. For debugging code there are multiple resources available: Scientific Python, VSCode and PyCharm.
To find out the time or memory performance of your code, profilers can help. Again, various resources from Scientific Python, napari, PyCharm and Dask can be helpful.
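As a starting point for time profiling, the standard library's `cProfile` and `pstats` modules are often enough. The sketch below profiles a deliberately simple, hypothetical function (`slow_sum_of_squares`) and prints the most expensive calls; it is one possible approach among the tools listed above, not a recommendation specific to this package.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n: int) -> int:
    """Hypothetical workload standing in for the code you want to profile."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum_of_squares(100_000)
profiler.disable()

# Collect the report in a string and show the five entries with the
# highest cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Sorting by cumulative time highlights the functions in which most of the run time is spent, including the time spent in the functions they call.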