Sustainability & Reproducibility
Stephan Reichl edited this page May 27, 2025 · 10 revisions
To ensure sustainable development, implicit documentation, and reproducibility, each {module} has to fulfill the following requirements/specifications:
- Traceability/Provenance & End-to-End Workflows
  - If the workflow uses external resources (i.e., files) that are defined in `configs`, they have to be defined as `input` to the respective `rule` (and not `params`) to enable traceability/provenance and end-to-end execution in multi-module workflows.
  - Never use any `configs` directly in scripts (or shell). Instead, always provide them via `params`. Defining `params` for `configs` seems redundant, but it makes rules more readable/transparent and enables traceability for Snakemake. Otherwise, Snakemake cannot notice a change in parameters and trigger a rerun upon changes in the `config`.
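The reasoning behind routing config values through `params` can be illustrated with a toy model. The sketch below is not Snakemake's implementation; it assumes, for illustration only, that a workflow engine records a fingerprint of each rule's declared `params` and reruns the rule when that fingerprint changes. The names `fingerprint` and `needs_rerun`, and the config key `alpha`, are hypothetical.

```python
import hashlib
import json


def fingerprint(declared_params: dict) -> str:
    """Hash the values a rule declares explicitly (toy stand-in for provenance tracking)."""
    payload = json.dumps(declared_params, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_rerun(recorded: str, declared_params: dict) -> bool:
    """A rule is rerun when the fingerprint of its declared params has changed."""
    return recorded != fingerprint(declared_params)


# Hypothetical config before and after a user edits one value.
config_old = {"alpha": 0.05}
config_new = {"alpha": 0.01}

# Case 1: the rule passes the config value through params, so the engine sees it.
recorded_with_params = fingerprint({"alpha": config_old["alpha"]})
rerun_with_params = needs_rerun(recorded_with_params, {"alpha": config_new["alpha"]})

# Case 2: the script reads the config directly, so nothing is declared to the engine.
recorded_without_params = fingerprint({})
rerun_without_params = needs_rerun(recorded_without_params, {})

print(rerun_with_params)     # the config change is detected
print(rerun_without_params)  # the config change goes unnoticed
```

In this model, only values that are declared to the engine can trigger a rerun, which is why a config value read directly inside a script stays invisible to it.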
- GitHub repository for development and version control.
  - Descriptive name (i.e., what it does and its purpose, e.g., `dea_limma`) in snake case, i.e., words separated by underscores (`_`).
  - `README` according to the provided template.
  - Repository structure according to Snakemake's best practice.
  - Releases (i.e., versions) according to the semantic versioning scheme.
  - Workflow rulegraph in `workflow/dags/rulegraph.svg`, generated with `snakemake --rulegraph --forceall | dot -Tsvg > workflow/dags/rulegraph.svg`.
  - GitHub page displaying the `README`.
  - `LICENSE` file (recommendation: MIT).
  - `CITATION.cff` file.
  - (Optional, but recommended) Add example data and configurations for users as a starting point.
  - (Optional, but recommended) Provide resources and/or external data sources (e.g., reference data) as links, via Zenodo, or via Git Large File Storage.
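The repository naming convention is easy to check mechanically. The following is a minimal sketch, assuming that a valid module name consists of lowercase letters and digits separated by single underscores; the function name `is_snake_case` and the exact pattern are assumptions, not part of the requirements.

```python
import re

# Assumed pattern: lowercase alphanumeric words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def is_snake_case(name: str) -> bool:
    """Return True if the repository name follows the assumed snake_case pattern."""
    return SNAKE_CASE.fullmatch(name) is not None


# dea_limma is the example name from the requirements; the other two are counterexamples.
for candidate in ["dea_limma", "DEA-limma", "dea__limma"]:
    print(candidate, is_snake_case(candidate))
```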
- Zenodo repository to ensure compatibility, citability, and long-term archiving.
  - Via automated GitHub hook.
  - Every GitHub release will trigger the creation of a new release in the Zenodo repository, and thereby a new version-specific DOI.
  - The Zenodo repository will be annotated using the provided information in the `CITATION.cff` file in your GitHub repository.
  - There is one permanent DOI that can be used to reference/cite all releases/versions of a given repository. We recommend using this DOI and the release version for referencing, e.g., in publications.
  - Add the version-specific DOI badge to the top of the GitHub repository.
  - Add the permanent project DOI to the `README` in the introduction, the methods, at the bottom (Zenodo link), and to the `CITATION.cff`.
- Snakemake Workflow Catalog entry to increase visibility and findability.
  - By fulfilling the requirements for Standardized Usage, the workflow will be automatically indexed.
  - Every GitHub release will trigger the catalog entry to be updated.
- Snakemake Report for implicit documentation and presentation of results.
  - Result directory
- Software Management with conda for reproducibility and portability.
  - Specify the exact version of every entry in your conda environment specification files (`workflow/envs/*.yaml`).
  - (CAVEAT: Currently not recommended due to an unresolved Snakemake issue) For maximal compatibility, define your global workflow dependencies in `workflow/envs/global.yaml`, containing all required software for the execution of the `Snakefile`.
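Exact pinning can be verified before a release. The sketch below is a simplified, standard-library-only check that assumes each dependency is written on its own line as `- name=version` (conda style) or `- name==version` (pip style). It does not parse YAML in general, and the helper name `unpinned_dependencies` and the example environment are hypothetical.

```python
import re

# A dependency counts as pinned if it looks like name=1.2.3 or name==1.2.3
# (optionally followed by a conda build string), with no wildcards or ranges.
PINNED = re.compile(r"^[A-Za-z0-9_.\-:]+={1,2}[A-Za-z0-9_.!+]+(=[A-Za-z0-9_.]+)?$")


def unpinned_dependencies(env_yaml: str) -> list[str]:
    """Return dependency entries that lack an exact version (line-based heuristic)."""
    unpinned = []
    in_dependencies = False
    for raw in env_yaml.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments
        if not line.strip():
            continue
        if not line.startswith((" ", "-")):
            # A top-level key either opens or closes the dependencies section.
            in_dependencies = line.strip() == "dependencies:"
            continue
        entry = line.strip()
        if in_dependencies and entry.startswith("- "):
            spec = entry[2:].strip()
            if spec.endswith(":"):  # nested section such as "- pip:"
                continue
            if not PINNED.match(spec):
                unpinned.append(spec)
    return unpinned


# Hypothetical environment file content, for illustration only.
example_env = """\
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.11.9
  - pandas>=2.0
  - r-base
  - pip:
    - requests==2.32.3
"""

print(unpinned_dependencies(example_env))
```

An empty result means every entry carries an exact version; anything listed uses a range, a wildcard, or no version at all.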
- Workflow-specific profile
  - Provide a workflow-specific profile in `workflow/profiles/default/config.yaml` for workflow-specific parameters or resources.
- Use the `min_version` directive in your `Snakefile`:

      ##### set minimum snakemake version #####
      min_version("8.20.1")

- (COMING SOON) Containerization with Docker/Singularity for OS-level virtualization.
  - This final virtualization frontier will be explored and implemented across all MrBiomics modules in the future.
  - Automated containerization has been supported since Snakemake 6.0.0 (released 2021-02-26).
- Add the {module} to the summary table with all modules in this repository's README under Modules.
- GitHub repository with README, LICENSE, CITATION.cff, Snakemake Workflow Catalog entry, and conda `YAML` specifications with exact versions.
- Zenodo repository via GitHub webhook.
- GitHub release to trigger Zenodo DOI generation.
- Add general and version-specific DOI to the GitHub README, CITATION.cff, and MrBiomics Modules.
- Final GitHub release with minor version bump including generated DOI.
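
The minor version bump in the final step follows the semantic versioning scheme (`MAJOR.MINOR.PATCH`). As a minimal sketch, assuming plain three-part version strings with an optional leading `v` and no pre-release or build suffixes, the bump can be computed as follows; `bump_minor` is a hypothetical helper, not part of any MrBiomics tooling.

```python
def bump_minor(version: str) -> str:
    """Increment MINOR and reset PATCH, keeping an optional leading 'v'."""
    prefix = "v" if version.startswith("v") else ""
    major, minor, _patch = (int(part) for part in version[len(prefix):].split("."))
    return f"{prefix}{major}.{minor + 1}.0"


# Hypothetical version tags, for illustration only.
print(bump_minor("v1.2.3"))
print(bump_minor("0.9.12"))
```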