feat(models,training): multi dataset integration #594

mchantry · 2025-10-08T08:29:11Z

Description

Supports multiple time-aligned datasets as inputs and outputs for training, e.g.:

era_t     |       era_{t+1}
          |  ->
cerra_t   |       cerra_{t+1}

era_t     |       era_{t+1}
          |  ->
          |       cerra_{t+1}

era_t     |
          |  ->
cerra_t   |       cerra_{t+1}

where inputs/outputs each use their own encoder/decoder.
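To make the routing concrete, here is a minimal pure-Python sketch of "one encoder per input dataset, one decoder per output dataset". It is not the implementation in this branch: `MultiDatasetModel`, `scale`, the element-wise sum used to merge latents, and the assumption that all latents have equal length are illustrative choices only.

```python
from typing import Callable, Dict, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def scale(factor: float) -> Layer:
    """Stand-in for a learned encoder or decoder: multiplies each value."""
    return lambda values: [factor * v for v in values]


class MultiDatasetModel:
    """Toy model with one encoder per input dataset and one decoder per output dataset."""

    def __init__(self, encoders: Dict[str, Layer], decoders: Dict[str, Layer]) -> None:
        self.encoders = encoders
        self.decoders = decoders

    def forward(self, batch: Dict[str, Vector]) -> Dict[str, Vector]:
        missing = sorted(set(self.encoders) - set(batch))
        if missing:
            raise KeyError(f"batch is missing input datasets: {missing}")
        # Encode every input dataset with its own encoder.
        latents = [encode(batch[name]) for name, encode in self.encoders.items()]
        # Merge the per-dataset latents (here: element-wise sum) into one shared state.
        shared = [sum(column) for column in zip(*latents)]
        # Decode the shared state once per output dataset.
        return {name: decode(shared) for name, decode in self.decoders.items()}


# Third case from the description: era_t and cerra_t in, only cerra_{t+1} out.
model = MultiDatasetModel(
    encoders={"era": scale(1.0), "cerra": scale(2.0)},
    decoders={"cerra": scale(0.5)},
)
prediction = model.forward({"era": [1.0, 2.0], "cerra": [3.0, 4.0]})
```

The three cases in the description differ only in which keys appear in `encoders` and `decoders`.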

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.

radiradev · 2025-10-30T16:18:23Z

Hi @mchantry, I'm planning on doing some training with this branch. Could you let me know what I should expect to work and not work?

MeraX · 2025-11-11T10:59:43Z

training/src/anemoi/training/train/train.py

+        # ALWAYS override dataset from dataloader config (ignore dummy in graph config)
+        if hasattr(graph_config.nodes.data.node_builder, "dataset"):
+            graph_config.nodes.data.node_builder.dataset = dataset_path


Would a similar mechanism be required for the ICON node builders? See also #627.
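For discussion, a builder-agnostic variant of this override might look like the sketch below. It assumes the graph config behaves like nested attribute namespaces (modelled here with `types.SimpleNamespace`); `override_node_datasets` and the `dataset_paths` mapping are hypothetical names, not part of this PR. A builder that points at its source through a differently named attribute would not be touched by this approach.

```python
from types import SimpleNamespace


def override_node_datasets(graph_config, dataset_paths):
    """Set `dataset` on every node builder that declares one; return the updated node names."""
    updated = []
    for name, node in vars(graph_config.nodes).items():
        builder = node.node_builder
        # Builders without a `dataset` attribute are left alone.
        if hasattr(builder, "dataset") and name in dataset_paths:
            builder.dataset = dataset_paths[name]
            updated.append(name)
    return updated


# Hypothetical config: one node set with a dummy dataset, one without any dataset.
graph_config = SimpleNamespace(
    nodes=SimpleNamespace(
        data=SimpleNamespace(node_builder=SimpleNamespace(dataset="dummy.zarr")),
        hidden=SimpleNamespace(node_builder=SimpleNamespace(resolution=5)),
    )
)
updated = override_node_datasets(graph_config, {"data": "real.zarr", "hidden": "unused.zarr"})
```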

radiradev · 2025-11-11T15:17:08Z

training/src/anemoi/training/config/debug_multi_eracerra.yaml

+    dataset: aifs-ea-an-oper-0001-mars-${data.resolution}-1979-2023-6h-v8.zarr
+
+    # Secondary dataset for ERA5_copy (using same file for debugging)
+    dataset_b: cerra-rr-an-oper-0001-mars-5p5km-1984-2022-6h-v3-hmsi.zarr


Is this currently working? I've been trying to run with two datasets that have non-identical datetimes, and I'm getting errors.

By this I mean that the datetimes of one dataset are a subset of the other's, as is the case in the example above.

What do you mean by non-identical datetimes, radi?

The first implementation of multiple datasets aims to use datasets that are fully time-aligned and feature the same datetimes.
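A quick way to check whether two datasets meet the "fully time-aligned" requirement is to compare their datetime lists before training. The sketch below uses only the standard library and made-up date ranges; `datetimes` and `alignment_report` are hypothetical helpers, and how the dates would be read from the actual datasets is left out.

```python
from datetime import datetime, timedelta


def datetimes(start, end, step_hours):
    """Return all datetimes from start to end (inclusive) at a fixed step."""
    step = timedelta(hours=step_hours)
    count = int((end - start) / step) + 1
    return [start + i * step for i in range(count)]


def alignment_report(dates_a, dates_b):
    """Compare two datetime lists and return the shared dates plus the extras on each side."""
    set_a, set_b = set(dates_a), set(dates_b)
    return {
        "common": sorted(set_a & set_b),
        "only_a": sorted(set_a - set_b),
        "only_b": sorted(set_b - set_a),
    }


# Made-up ranges: the second dataset covers a subset of the first one's datetimes.
long_record = datetimes(datetime(2020, 1, 1, 0), datetime(2020, 1, 3, 18), step_hours=6)
short_record = datetimes(datetime(2020, 1, 2, 0), datetime(2020, 1, 2, 18), step_hours=6)
report = alignment_report(long_record, short_record)
fully_aligned = not report["only_a"] and not report["only_b"]
```

In the subset case described above, `only_a` is non-empty, so the pair is not fully aligned; restricting both datasets to `report["common"]` would be one possible workaround, though the thread does not confirm that the branch supports it.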

## Description

Adaptation of plotting callbacks to multiple datasets. We add a parameter `datasets` to the callback configuration that allows specifying which datasets to plot. All plots will be based on the same configuration, including the choice of parameters to plot. **In order to plot different datasets with different parameters, the user can configure multiple callbacks of the same type with different parameters.**

Callbacks included:
- PlotLoss
- PlotSample
- PlotSpectrum
- PlotHistogram
- GraphTrainableFeaturesPlot

Callbacks not included in this PR:
- ensemble plots
- LongRolloutPlots

Here are MLFlow runs for a [single dataset](https://mlflow.ecmwf.int/#/experiments/420/runs/8dca1935a3194edea16a21d4ed4d6a13/artifacts) and a [two dataset](https://mlflow.ecmwf.int/#/experiments/420/runs/ea59f113b0d34061b487038552d9467c/artifacts) use case.

## The config interface for plotting

Most datasets will have somewhat different parameters, and hence different parameters to plot. The config interface implemented in this PR means that users will either need to plot parameters shared by all datasets, or configure multiple callbacks of the same type for different datasets.

This interface was chosen as a first draft not because it is optimal for multiple datasets, but because:
- it keeps plots and plotting configs for the single dataset case backwards compatible. Avoiding regression on plotting for existing use cases was the main requirement.
- it requires fewer changes to the callbacks than making them more configurable.
- it provides _some functionality_ to plot multiple datasets.
- we don't know yet how multiple datasets will mostly be used, and it might make sense to delay the design of the interface a bit. This will give us more time to rethink plotting callbacks more generally, rather than settling on an interface now.

## Other decisions

- We will adapt pydantic schemas for plotting (and set a default for the new parameter `datasets = ["data"]`) as part of fixing schemas on the multiple datasets branch.
- I'd suggest updating the remaining plotting callbacks separately, but I'm open to thoughts.
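As an illustration of "multiple callbacks of the same type with different parameters", the sketch below models the callback configuration as plain Python dictionaries. Only the `datasets` parameter and its planned default `["data"]` come from the text above; the dataset names `era5` and `cerra`, the variable names, the other dictionary keys, and the helper `plots_per_dataset` are assumptions made for the example and do not reflect the real config schema.

```python
DEFAULT_DATASETS = ["data"]  # default named in the PR text for the new `datasets` parameter

# Two callbacks of the same type, each plotting a different dataset with its own parameters.
# "era5" / "cerra", the variable names and every key except `datasets` are illustrative.
plot_callbacks = [
    {"callback": "PlotSample", "datasets": ["era5"], "parameters": ["2t", "z_500"]},
    {"callback": "PlotSample", "datasets": ["cerra"], "parameters": ["2t", "10u"]},
    {"callback": "PlotLoss"},  # no `datasets` key: falls back to the default
]


def plots_per_dataset(callbacks):
    """Group (callback, parameters) pairs by the dataset they will be drawn for."""
    grouped = {}
    for cfg in callbacks:
        for dataset in cfg.get("datasets", DEFAULT_DATASETS):
            grouped.setdefault(dataset, []).append((cfg["callback"], cfg.get("parameters", [])))
    return grouped


grouped = plots_per_dataset(plot_callbacks)
```

A callback that omits `datasets` keeps behaving as in the single dataset case, which is the backwards-compatibility property the interface was designed around.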

ssmmnn11 and others added 15 commits September 22, 2025 10:48

multi dataset support

e8574c1

fix

b1919b6

fix

866250a

integrate into graphforecaster

0ddce52

fix

73763ff

remove defunct

7f3a813

clean-up

33518d8

fixes

6d76370

added debug config

715e7ed

disable mlflow and wandb in debug config

fa1fa03

Merge remote-tracking branch 'origin/main'

9e6c247

Multi dataset update 1 (#575)

049a877

Various changes, see git history. --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Merge remote-tracking branch 'origin/feature/multi_ds_dict' into feature/multi_ds_dict

e709b41

comment failing code

ff973bd

merged main

091f7bf

github-project-automation bot added this to Anemoi-dev Oct 8, 2025

github-project-automation bot moved this to To be triaged in Anemoi-dev Oct 8, 2025

github-actions bot added training models labels Oct 8, 2025

mchantry changed the title ~~Feature/multi dataset integration~~ feat(models,training): multi dataset integration Oct 8, 2025

mchantry added the ATS Approved label Oct 9, 2025

This was referenced Oct 9, 2025

feat(models,training): multiple datasets (WIP) #441

Closed

feat(training): autoencoder 🗜️ #252

Open

JPXKQX and others added 3 commits October 16, 2025 15:50

fix: pass the full dataset config in case thinning or area args are specified

b5ef95c

fix tests for anemoi-models

d978e9d

fix one test for anemoi-trainign

f067e00

JPXKQX reviewed Oct 21, 2025

training/src/anemoi/training/config/datamodule/multi.yaml (outdated)

dnerini moved this from To be triaged to Now In Progress in Anemoi-dev Oct 21, 2025

JPXKQX reviewed Oct 21, 2025

training/src/anemoi/training/config/data/multi_debug.yaml

update profilers

51f4fc6

JPXKQX and others added 8 commits October 23, 2025 16:25

update config

6887e05

stash

7c89772

debug_single & debug_multi_eracerra working

b9f313a

Merge branch 'main' into feature/multi-dataset-integration

40fe3bb

[pre-commit.ci] auto fixes from pre-commit.com hooks

57c64d3

for more information, see https://pre-commit.ci

delete _run_mapper

ad8f83e

[pre-commit.ci] auto fixes from pre-commit.com hooks

e6518be

for more information, see https://pre-commit.ci

Merge branch 'main' into feature/multi-dataset-integration

10d974f

JPXKQX and others added 4 commits November 4, 2025 16:37

upgrade version to 2.0

6380da7

[pre-commit.ci] auto fixes from pre-commit.com hooks

99581df

for more information, see https://pre-commit.ci

clean

17b3e3a

Merge branch 'main' into feature/multi-dataset-integration

ab879db

MeraX reviewed Nov 11, 2025

radiradev reviewed Nov 11, 2025

radiradev reviewed Nov 12, 2025

training/src/anemoi/training/data/multidataset.py (outdated)

radiradev reviewed Nov 12, 2025

training/src/anemoi/training/data/dataset/multidataset.py (outdated)

JPXKQX and others added 6 commits November 15, 2025 21:15

Merge branch 'main' into feature/multi-dataset-integration

56c5565

fix: multiple NativeGrid

dd1543f

add rich

51db837

format

ed38dcd

typing

87c5940

[pre-commit.ci] auto fixes from pre-commit.com hooks

1ad50fc

for more information, see https://pre-commit.ci

HCookie assigned floriankrb and JPXKQX Nov 17, 2025

JPXKQX and others added 5 commits November 18, 2025 14:21

fix: paths

be302de

update tests

d5cf7e2

Merge branch 'main' into feature/multi-dataset-integration

02d408d

[pre-commit.ci] auto fixes from pre-commit.com hooks

fa5f1df

for more information, see https://pre-commit.ci

8 participants
