Publishing#

Publishing your results can mean several things:

  • Writing a manuscript and submitting the results to a journal for (double-blind) peer review.

  • Creating a data publication, ideally submitted with your text manuscript, for data transparency & reproducibility.

  • Publishing your Git repository, so that others can find and read your notebooks.

We provide a short step-by-step guide on how to publish Jupyter notebooks together with the generated visuals and output as a data publication.

Make a release version#

When you are finished with your work, e.g. before submitting your manuscript for the first round of review, create a git release for your notebook repository and give it a version number:

git tag -a v1.0.0 -m "Major release before submitting to Journal"
git push --tags

Hint

This adds a marker to your Git repository that can be easily found and referenced at any later stage. If you submit a minor or major revision at a later date, add another version tag to describe your progress.
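To pick the next tag consistently when a revision is due, the version number can be incremented programmatically. The following is a minimal sketch, assuming tags follow the vMAJOR.MINOR.PATCH pattern used above; the helper name next_version is hypothetical, and mapping a "minor revision" to the middle number is our assumption, not a Git convention.

```python
import re


def next_version(tag: str, part: str = "minor") -> str:
    """Return the following version tag, e.g. v1.0.0 -> v1.1.0 (hypothetical helper)."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"Not a vMAJOR.MINOR.PATCH tag: {tag!r}")
    major, minor, patch = (int(number) for number in match.groups())
    if part == "major":
        return f"v{major + 1}.0.0"
    if part == "minor":
        return f"v{major}.{minor + 1}.0"
    if part == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part: {part!r}")


print(next_version("v1.0.0", "minor"))  # v1.1.0
print(next_version("v1.0.0", "major"))  # v2.0.0
```

The returned string can then be passed to git tag -a in the same way as the first release.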

After pushing your tag to GitHub or GitLab, you can (and should!) create a Release from it, to which you can attach data and other output. Releases can be cited via (e.g.) Zenodo or ioerDATA.

../_images/release.png

Fig. 7 A release from our GitLab repository, based on the v0.6.5 tag of the training materials.#

Create HTML versions of all your notebooks#

This is an optional step, but recommended because reviewers may not have JupyterLab available to open your *.ipynb notebooks. By converting notebooks to HTML, you can archive the code together with the generated visuals. You can convert notebooks directly in Jupyter with the command below.

!jupyter nbconvert --to html \
    --output-dir=../out/ ./205_publish.ipynb \
    --template=../nbconvert.tpl \
    --ExtractOutputPreprocessor.enabled=False
[NbConvertApp] Converting notebook ./205_publish.ipynb to html
[NbConvertApp] Writing 348507 bytes to ../out/205_publish.html

Hint

If you are using Carto-Lab Docker, replace html with html_toc in the above cell. This will automatically add a Table of Contents to the sidebar of your HTML, based on the (Markdown) headers in your Jupyter notebook.

Add conversion command at the end of every notebook

It is a good idea to add this command as the last cell of every notebook, so that the HTML version is regenerated each time the notebook is run.
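If you prefer to convert several notebooks in one go, the same command can be assembled in Python. This is only a sketch: the function names nbconvert_command and convert_all are hypothetical, and the flags are copied from the cell above rather than from a complete reading of the nbconvert options.

```python
import subprocess
from pathlib import Path


def nbconvert_command(notebook, out_dir, template=None):
    """Build the jupyter nbconvert call shown above as an argument list."""
    command = [
        "jupyter", "nbconvert", "--to", "html",
        f"--output-dir={out_dir}",
        str(notebook),
        "--ExtractOutputPreprocessor.enabled=False",
    ]
    if template is not None:
        # Optional custom template, e.g. ../nbconvert.tpl
        command.append(f"--template={template}")
    return command


def convert_all(notebook_dir, out_dir, template=None):
    """Convert every *.ipynb file in notebook_dir; stops at the first failure."""
    for notebook in sorted(Path(notebook_dir).glob("*.ipynb")):
        subprocess.run(nbconvert_command(notebook, out_dir, template), check=True)
```

For example, convert_all(".", "../out", "../nbconvert.tpl") would convert all notebooks in the current folder, assuming jupyter is on the PATH.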

Attach the static HTML files as Supplementary Material for your submitted paper

These HTML versions of the notebooks are ideal for attaching directly to your publications when submitting a manuscript as Supplementary Materials (SM). They are like a portable archive version that contains your documentation, code and output graphics at the time of publication. This is the most important information and should be attached directly to your paper. This also helps reviewers to have a look at your workflow if they do not want to run the notebooks themselves.

In addition to these HTML files, the original notebook files (*.ipynb) and accompanying data should be made available in a proper data publication, which we show below.

Create a ZIP file with all your output data#

Once you have exported all notebook HTMLs and figures, create a ZIP archive that includes all your data, notebooks, HTML files, and figures. You can create this ZIP file directly in Jupyter, based on the latest Git version we created earlier.

Remove any previous releases

!rm ../out/*.zip

This Bash command cleans up any previous ZIP files in the out/ directory. The ! indicates it’s a shell command, not Python.

See an error?

The error rm: cannot remove ‘../out/*.zip’: No such file or directory indicates that there are no zip files in the directory that can be deleted. This is expected if you run this notebook for the first time.
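If you would rather avoid the error message altogether, the clean-up can also be done in Python, where an empty match is simply a no-op. A minimal sketch follows; the function name remove_old_archives is hypothetical.

```python
from pathlib import Path


def remove_old_archives(out_dir):
    """Delete all *.zip files directly inside out_dir and return their names."""
    removed = []
    for archive in sorted(Path(out_dir).glob("*.zip")):
        archive.unlink()
        removed.append(archive.name)
    return removed
```

Calling remove_old_archives("../out") returns an empty list when there is nothing to delete, instead of raising an error.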

Prepare a release

Make sure that 7z is available. Carto-Lab Docker comes with 7z. In a rootful container, you can install it with !apt install p7zip-full. Otherwise (e.g. in the Jupyter4NFDI Hub), the cell below retrieves the binary.

%%bash
# Check if '7z' is already available globally or in ~/bin
if ! command -v 7z >/dev/null 2>&1 && [ ! -x "$HOME/bin/7z" ]; then
  echo "7z not found. Installing local copy..."
  mkdir -p ~/bin && cd ~/bin
  wget -q https://www.7-zip.org/a/7z2301-linux-x64.tar.xz
  tar -xf 7z2301-linux-x64.tar.xz
  ln -sf ~/bin/7zz ~/bin/7z
else
  echo "7z is already available."
fi
7z is already available.
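The same availability check can be expressed in Python with shutil.which. This sketch assumes that a locally downloaded binary lives in ~/bin, as in the cell above; the function name find_binary is hypothetical, and unlike the bash cell it also accepts the 7zz name that the downloaded archive ships.

```python
import os
import shutil
from pathlib import Path


def find_binary(names=("7z", "7zz"), extra_dirs=None):
    """Return the path of the first matching executable, or None."""
    if extra_dirs is None:
        extra_dirs = [Path.home() / "bin"]
    # Search the regular PATH first, then the extra directories.
    search_path = os.pathsep.join(
        [os.environ.get("PATH", os.defpath), *(str(d) for d in extra_dirs)]
    )
    for name in names:
        found = shutil.which(name, path=search_path)
        if found:
            return found
    return None


print(find_binary() or "7z not found")
```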

Create a new release *.zip file

We want to create a ZIP file with the current release version in the name. We can get this with the following command:

!git describe --tags --abbrev=0
v1.8.0

Create the release file:

%%bash
export PATH="$HOME/bin:$PATH" \
    && cd .. && git config --local --add safe.directory '*' \
    && RELEASE_VERSION=$(git describe --tags --abbrev=0) \
    && 7z a -tzip -mx=9 out/release_$RELEASE_VERSION.zip \
    py/* out/* resources/* tmp/* *.bib notebooks/*.ipynb 00_data/* \
    *.md *.yml *.ipynb nbconvert.tpl conf.json pyproject.toml \
    -x!py/__pycache__ -x!py/modules/__pycache__ -x!py/modules/.ipynb_checkpoints \
    -y > /dev/null

About the release command

This may take a while. If you want to see output from the 7z process, remove -y > /dev/null (and the preceding backslash). Depending on your environment and which parts of the training materials you have completed, you may also see errors such as missing directories (e.g. 00_data). This is expected, too. The list of files to include in a release must be maintained and updated as your work progresses.

  • export PATH="$HOME/bin:$PATH" This ensures that any binary (e.g. 7z) previously retrieved is accessible (see cell above)

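  • git config --local --add safe.directory '*' Marks all directories as safe, so that Git does not refuse to run (with a "dubious ownership" error) when the repository is owned by a different user than the one running Jupyter. Note that the Git documentation describes safe.directory as being honoured only in system or global configuration; if the error persists, try --global instead of --local.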

  • RELEASE_VERSION is the bash variable that holds the most recent tag name, as returned by git describe --tags --abbrev=0 (e.g. v1.8.0).

  • 7z a -tzip -mx=9 out/release_$RELEASE_VERSION.zip Uses the 7z archiving tool (7z a) to create a ZIP archive (-tzip) named out/release_ followed by the retrieved version number ($RELEASE_VERSION). -mx=9 sets the compression level to maximum.

  • With py/* out/* resources/* notebooks/*.ipynb (etc.) we explicitly select the folders and files that we want to include in the release. Note that we explicitly include the 00_data/ directory, which is not committed to the git repository itself (due to the .gitignore file).

  • At the end, we exclude cache and checkpoint folders that we do not need to archive (-x!py/__pycache__ -x!py/modules/__pycache__ etc.), answer any 7z prompts with yes (-y) and suppress output by redirecting it to /dev/null.
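Where 7z cannot be installed, a comparable archive can be built with Python's standard zipfile module. The sketch below is not the method used in this guide: the function name create_release_zip and its parameters are hypothetical, and the include patterns are glob patterns relative to the repository root, mirroring the file list above.

```python
import zipfile
from pathlib import Path


def create_release_zip(base_dir, out_file, include,
                       exclude_dirs=("__pycache__", ".ipynb_checkpoints")):
    """Write files matching the include patterns to out_file; return their names."""
    base_dir, out_file = Path(base_dir), Path(out_file)
    out_file.parent.mkdir(parents=True, exist_ok=True)
    added = []
    with zipfile.ZipFile(out_file, "w", zipfile.ZIP_DEFLATED, compresslevel=9) as archive:
        for pattern in include:
            # Patterns without a match (e.g. a missing 00_data/) are skipped silently.
            for path in sorted(base_dir.glob(pattern)):
                if path.is_file():
                    files = [path]
                else:
                    files = sorted(p for p in path.rglob("*") if p.is_file())
                for file in files:
                    relative = file.relative_to(base_dir)
                    if any(part in exclude_dirs for part in relative.parts):
                        continue  # skip cache and checkpoint folders
                    if file.resolve() == out_file.resolve():
                        continue  # never add the archive to itself
                    name = relative.as_posix()
                    if name not in added:
                        archive.write(file, name)
                        added.append(name)
    return added
```

A call such as create_release_zip("..", "../out/release_v1.8.0.zip", ["py/*", "out/*", "notebooks/*.ipynb", "*.md"]) would roughly correspond to the 7z cell, with the version string inserted by hand or taken from git describe.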

%%bash

Above, we enable the IPython %%bash cell magic to allow bash commands to be written directly. See Built-in magic commands. There are other magics such as %%time which are quite useful. Unfortunately, magics cannot be easily combined, so we decided to limit ourselves to only %%bash here.

Next, we check the generated file:

!RELEASE_VERSION=$(git describe --tags --abbrev=0) \
    && ls -alh ../out/release_$RELEASE_VERSION.zip
-rw-r--r-- 1 root root 119M Jun  5 08:20 ../out/release_v1.8.0.zip
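Beyond the file size, it can be worth confirming that the archive opens cleanly and contains what you expect. A small sketch using the standard zipfile module follows; the function name check_archive is hypothetical.

```python
import zipfile


def check_archive(zip_path):
    """Return file count, total uncompressed size and the first corrupt member (or None)."""
    with zipfile.ZipFile(zip_path) as archive:
        first_corrupt = archive.testzip()  # None when all CRC checks pass
        infos = archive.infolist()
    return {
        "files": len(infos),
        "uncompressed_bytes": sum(info.file_size for info in infos),
        "first_corrupt_member": first_corrupt,
    }
```

Running check_archive("../out/release_v1.8.0.zip") on the file created above should report first_corrupt_member as None if the archive is intact.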
../_images/download.png

Fig. 8 In the Explorer on the left, right-click the ZIP file and select Download. Archive this replication package with your data repository of choice.#

List the directory file tree#

Before uploading data to a repository, it is useful to print a file tree of your current working directory. This helps others to understand how your files were organised at the time of execution. For example, you may have forgotten to add a data file to the repository because it sits in a folder that is excluded by the .gitignore file. Unless you are transparent about where such files were located and how they were named at build time, others will find it hard to reproduce your work.

There are several ways to do this. For example, you could create a file tree using a Jupyter cell and the bash command !tree --prune -I "_build|tmp". This would output a tree of files, but exclude the _build and tmp directories, which only contain temporary files.

As an alternative, we wrote a Python function that does something similar, but with more formatting options.

import sys
from pathlib import Path

module_path = str(Path.cwd().parents[0] / "py")
if module_path not in sys.path:
    sys.path.append(module_path)
from modules import tools
ignore_files_folders = ["_build", "et-book"]
ignore_match = ["*.gdbtabl*", "*a0000000*"]
tools.tree(
    Path.cwd().parents[0],
    ignore_files_folders=ignore_files_folders, ignore_match=ignore_match)
Directory file tree
├── .pandoc
│ ├── favicon-16x16.png
│ ├── favicon-32x32.png
│ ├── puppeteer-config.json
│ ├── readme.css
│ └── readme.html
├── .templates
│ └── CHANGELOG.md.j2
├── 00_data
│ ├── Biotopwert.lyr
│ ├── Biotopwert_Biodiversität.zip
│ ├── Biotopwerte Dresden 2018 Readme .txt
│ ├── Biotopwerte_Dresden_2018.cpg
│ ├── Biotopwerte_Dresden_2018.dbf
│ ├── Biotopwerte_Dresden_2018.gdb
│ │ ├── gdb
│ │ └── timestamps
│ ├── Biotopwerte_Dresden_2018.gdb.zip
│ ├── Biotopwerte_Dresden_2018.geojson
│ ├── Biotopwerte_Dresden_2018.prj
│ ├── Biotopwerte_Dresden_2018.sbn
│ ├── Biotopwerte_Dresden_2018.sbx
│ ├── Biotopwerte_Dresden_2018.shp
│ ├── Biotopwerte_Dresden_2018.shp.xml
│ ├── Biotopwerte_Dresden_2018.shx
│ ├── clc_legend.csv
│ ├── MANIFEST.TXT
│ └── occurrences_query.csv
├── _ext
│ └── custom_bibtex_styles.py
├── _static
│ ├── custom.css
│ ├── images
│ │ ├── FDZ-Logo_EN_RGB-clr_bg-sol_mgn-full_h200px_web.svg
│ │ ├── FDZ-Logo_EN_RGB-wht_bg-tra_mgn-full_h200px_web.svg
│ │ ├── header.svg
│ │ ├── jupyter.svg
│ │ ├── NFDI_4_Biodiversity___Logo_Negativ_Kopie.png
│ │ └── NFDI_4_Biodiversity___Logo_Positiv_Kopie.png
│ ├── inter
│ │ ├── Inter-Black.woff2
│ │ ├── Inter-BlackItalic.woff2
│ │ ├── Inter-Bold.woff2
│ │ ├── Inter-BoldItalic.woff2
│ │ ├── Inter-ExtraBold.woff2
│ │ ├── Inter-ExtraBoldItalic.woff2
│ │ ├── Inter-ExtraLight.woff2
│ │ ├── Inter-ExtraLightItalic.woff2
│ │ ├── Inter-Italic.woff2
│ │ ├── Inter-Light.woff2
│ │ ├── Inter-LightItalic.woff2
│ │ ├── Inter-Medium.woff2
│ │ ├── Inter-MediumItalic.woff2
│ │ ├── Inter-Regular.woff2
│ │ ├── Inter-SemiBold.woff2
│ │ ├── Inter-SemiBoldItalic.woff2
│ │ ├── Inter-Thin.woff2
│ │ ├── Inter-ThinItalic.woff2
│ │ ├── inter.css
│ │ ├── InterDisplay-Black.woff2
│ │ ├── InterDisplay-BlackItalic.woff2
│ │ ├── InterDisplay-Bold.woff2
│ │ ├── InterDisplay-BoldItalic.woff2
│ │ ├── InterDisplay-ExtraBold.woff2
│ │ ├── InterDisplay-ExtraBoldItalic.woff2
│ │ ├── InterDisplay-ExtraLight.woff2
│ │ ├── InterDisplay-ExtraLightItalic.woff2
│ │ ├── InterDisplay-Italic.woff2
│ │ ├── InterDisplay-Light.woff2
│ │ ├── InterDisplay-LightItalic.woff2
│ │ ├── InterDisplay-Medium.woff2
│ │ ├── InterDisplay-MediumItalic.woff2
│ │ ├── InterDisplay-Regular.woff2
│ │ ├── InterDisplay-SemiBold.woff2
│ │ ├── InterDisplay-SemiBoldItalic.woff2
│ │ ├── InterDisplay-Thin.woff2
│ │ ├── InterDisplay-ThinItalic.woff2
│ │ ├── InterVariable-Italic.woff2
│ │ └── InterVariable.woff2
│ └── videos
│   ├── jupyter4nfdi.webm
│   ├── Video.webm
│   └── Video3.webm
├── notebooks
│ ├── .gitkeep
│ ├── 00_toc.ipynb
│ ├── 00_toc.md
│ ├── 101_theory_chapters.ipynb
│ ├── 102_jupyter_notebooks.ipynb
│ ├── 201_example_introduction.ipynb
│ ├── 202_data_retrieval_gbif.ipynb
│ ├── 203_data_retrieval_monitor.ipynb
│ ├── 204_analysis.ipynb
│ ├── 205_publish.ipynb
│ ├── 301_accessing_data.ipynb
│ ├── 302_file_formats.ipynb
│ ├── 303_projections.ipynb
│ ├── 304_selecting_and_filtering.ipynb
│ ├── 305_mapping.ipynb
│ ├── 306_spatial_clipping.ipynb
│ ├── 307_merging_data.ipynb
│ ├── 308_spatial_overlays.ipynb
│ ├── 309_buffering.ipynb
│ ├── 310_statistics.ipynb
│ ├── 401_endmatter-thanks.ipynb
│ ├── 501_milvus_maps.ipynb
│ └── 502_geosocialmedia.ipynb
├── out
│ ├── 205_publish.html
│ ├── biodiversity_dresden.svg
│ ├── clipped.cpg
│ ├── clipped.dbf
│ ├── clipped.gpkg
│ ├── clipped.prj
│ ├── clipped.shp
│ ├── clipped.shx
│ ├── clipped_dataset.csv
│ ├── clipped_layer.csv
│ ├── geoviews_map.html
│ ├── graph.png
│ ├── graph.svg
│ ├── occurrences_query.csv
│ ├── release_v1.8.0.zip
│ ├── S12RG_2023_200m_DE.tif
│ ├── S12RG_2023_200m_DE.tiff
│ ├── S12RG_2023_200m_Saxony.tiff
│ ├── saxony.gpkg
│ └── saxony_S12RG_2023_200m.tif
├── py
│ └── modules
│   ├── pkginstall.sh
│   └── tools.py
├── resources
│ ├── 01_edit_files.gif
│ ├── 02_git_extension.gif
│ ├── 03_stage_changes.gif
│ ├── 04_commit_message.gif
│ ├── 05_pull_changes.gif
│ ├── 06_push_changes.gif
│ ├── 07_ci_pipeline.webp
│ ├── 08_observe_changes.gif
│ ├── 094_Verdichtung.jpg
│ ├── 1.png
│ ├── 10.png
│ ├── 11.png
│ ├── 13.png
│ ├── 14.png
│ ├── 14_.png
│ ├── 15.png
│ ├── 15_.png
│ ├── 16.png
│ ├── 18.png
│ ├── 19.png
│ ├── 2.png
│ ├── 21.png
│ ├── 22-2.png
│ ├── 22.png
│ ├── 23.png
│ ├── 24.png
│ ├── 25.png
│ ├── 26.png
│ ├── 3.png
│ ├── 4.png
│ ├── 5.png
│ ├── 6.png
│ ├── 7.png
│ ├── 8.png
│ ├── 9.png
│ ├── admonition.webp
│ ├── binder.png
│ ├── cover_image.jpg
│ ├── download.png
│ ├── gbif_api_reference.png
│ ├── geosocial_patterns_de.png
│ ├── hide-tag.webp
│ ├── html
│ │ └── geoviews_map.html
│ ├── linguee.webp
│ ├── monitor.webp
│ ├── release.png
│ └── terminal.jpg
├── scripts
│ └── patch_binder_links.sh
├── tests
│ └── link-check.sh
├── tmp
│ └── shapes
│   └── vg2500_12-31.utm32s.shape
│     ├── aktualitaet.txt
│     ├── dokumentation
│     │ ├── aktualitaet.txt
│     │ ├── anlagen_vg.pdf
│     │ ├── annex_vg.pdf
│     │ ├── Datenquellen_vg_nuts.pdf
│     │ ├── verwaltungsgliederung_vg.pdf
│     │ ├── vg2500.pdf
│     │ └── vg2500_eng.pdf
│     └── vg2500
│       ├── VG2500_KRS.cpg
│       ├── VG2500_KRS.dbf
│       ├── VG2500_KRS.prj
│       ├── VG2500_KRS.shp
│       ├── VG2500_KRS.shx
│       ├── VG2500_LAN.cpg
│       ├── VG2500_LAN.dbf
│       ├── VG2500_LAN.prj
│       ├── VG2500_LAN.shp
│       ├── VG2500_LAN.shx
│       ├── VG2500_LI.cpg
│       ├── VG2500_LI.dbf
│       ├── VG2500_LI.prj
│       ├── VG2500_LI.shp
│       ├── VG2500_LI.shx
│       ├── VG2500_RBZ.cpg
│       ├── VG2500_RBZ.dbf
│       ├── VG2500_RBZ.prj
│       ├── VG2500_RBZ.shp
│       ├── VG2500_RBZ.shx
│       ├── VG2500_STA.cpg
│       ├── VG2500_STA.dbf
│       ├── VG2500_STA.prj
│       ├── VG2500_STA.shp
│       ├── VG2500_STA.shx
│       ├── VG_DATEN.cpg
│       ├── VG_DATEN.dbf
│       ├── VG_IBZ.cpg
│       ├── VG_IBZ.dbf
│       ├── VG_WERTE.cpg
│       ├── VG_WERTE.dbf
│       ├── VGTB_ATT.cpg
│       ├── VGTB_ATT.dbf
│       ├── VGTB_RGS.cpg
│       └── VGTB_RGS.dbf
├── .gitignore
├── .gitlab-ci.yml
├── .version
├── _config.yml
├── _toc.yml
├── BIBLIOGRAPHY.md
├── CHANGELOG.md
├── conf.json
├── CONTRIBUTING.md
├── favicon.ico
├── intro.ipynb
├── LICENSE.md
├── logo.svg
├── nbconvert.tpl
├── pyproject.toml
├── README.md
└── references.bib
22 directories, 228 files
.
  • Path.cwd().parents[0] specifies the origin directory for the tree, which is the base path of our repository

  • ignore_files_folders is a list of full folder or file names that should not be listed

  • ignore_match is a list of wildcard patterns that can be used to exclude a wider range of files, such as most of the proprietary ESRI files in *.gdb folders.
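To illustrate how the two ignore lists interact, here is a minimal stand-alone sketch of such a tree function. It is not the implementation of tools.tree from this repository; the name tree_lines and its parameters are hypothetical, and wildcard matching is assumed to follow fnmatch rules.

```python
import fnmatch
from pathlib import Path


def tree_lines(root, ignore_names=(), ignore_match=(), prefix=""):
    """Yield one formatted line per entry below root, directories first."""
    entries = [
        path for path in Path(root).iterdir()
        # Exact names (e.g. "_build") are dropped first, then wildcard patterns.
        if path.name not in ignore_names
        and not any(fnmatch.fnmatch(path.name, pattern) for pattern in ignore_match)
    ]
    entries.sort(key=lambda path: (path.is_file(), path.name.lower()))
    for index, entry in enumerate(entries):
        is_last = index == len(entries) - 1
        yield f"{prefix}{'└── ' if is_last else '├── '}{entry.name}"
        if entry.is_dir():
            # Children of the last entry need no vertical connector.
            child_prefix = prefix + ("    " if is_last else "│   ")
            yield from tree_lines(entry, ignore_names, ignore_match, child_prefix)
```

The lines can be printed with print("\n".join(tree_lines(Path.cwd().parents[0], ["_build", "et-book"], ["*.gdbtabl*", "*a0000000*"]))).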

ioerDATA#

With this file you are ready to upload your data to a data repository and create a DOI so that it can be properly archived, cited and referenced.

ioerDATA is one such repository. It is available to all IOER collaborators at https://data.fdz.ioer.de.

See the ioerDATA documentation

If you are an IOER colleague, have a look at the (internal) ioerDATA documentation at https://docs.fdz.ioer.info/documentation/ioerdata/.

Other data repositories include Zenodo.

See the data publication for this work by Dworczyk et al. (2025).

Publishing code#

In addition to a data repository, you can (and should!) make your Git repository available through (for example) GitLab or GitHub. If you are in the middle of peer review, you may want to temporarily remove or redact any names.

Using GitHub Pages

You can configure GitHub to publish your HTML-converted notebooks to GitHub Pages at github.io. See the Quickstart for GitHub Pages.

✨ Then, spread the love! 💖 Share your notebook links with others on social media 📢, in communities 🤝, and beyond! 🚀

References#

[1] Claudia Dworczyk, Alexander Dunkel, Fatemeh Rafiei, and Ralf-Uwe Syrbe. Replication Data for: Exploring Spatial and Biodiversity Data with Python and JupyterLab. 2025. Version 1. doi:10.71830/6ILS40. URL: https://doi.org/10.71830/6ILS40.