Conversation

@ssmmnn11 (Member) commented Sep 30, 2025

Description

As described in #552, refactor the mappers to enable flexible graph behaviour (static, dynamic, etc.).

  • simplify the mapper structure (Mixins); every mapper has its own pre_process and post_process implementation (see the sketch below)
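
A minimal sketch of the intended shape (the class name and the Linear core are hypothetical; only pre_process/post_process come from this PR):

```python
import torch
from torch import Tensor


class ExampleMapper(torch.nn.Module):
    """Sketch: the mapper owns its own pre_process/post_process
    instead of pulling them in via shared Mixins."""

    def __init__(self, in_dim: int, out_dim: int) -> None:
        super().__init__()
        self.proj = torch.nn.Linear(in_dim, out_dim)

    def pre_process(self, x: Tensor) -> Tensor:
        # placeholder for mapper-specific reshaping / normalisation
        return x

    def post_process(self, x: Tensor) -> Tensor:
        # placeholder for mapper-specific output handling
        return x

    def forward(self, x: Tensor) -> Tensor:
        return self.post_process(self.proj(self.pre_process(x)))
```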

@ssmmnn11 ssmmnn11 requested a review from JPXKQX September 30, 2025 19:11
@ssmmnn11 ssmmnn11 self-assigned this Sep 30, 2025
@ssmmnn11 ssmmnn11 added the ATS Approval Not Needed No approval needed by ATS label Sep 30, 2025
@mchantry mchantry changed the title Feat/mapper refactor Feat: mapper refactor Oct 1, 2025
@github-project-automation github-project-automation bot moved this to To be triaged in Anemoi-dev Oct 1, 2025
@mchantry mchantry added ATS Approved Approved by ATS and removed ATS Approval Not Needed No approval needed by ATS labels Oct 1, 2025
@mchantry mchantry changed the title Feat: mapper refactor feat(models): mapper refactor Oct 1, 2025

@JPXKQX (Member) left a comment

I've just left some minor comments/questions:

  • Some of these comments relate to what should be included in the _init_graph_mode() function, and whether it could do more than just set the graph mode.
  • Another point is whether it makes sense to pass the edge_index and edge_attr in the init of the mappers (instead of sub_graph and sub_graph_edge_attributes), and whether the selection of the edge attributes should be moved from the mapper to the model (see the sketch after this list).
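
To make the second bullet concrete, a hedged sketch of the two init signatures under discussion (argument names are taken from the comment; the class bodies are hypothetical):

```python
import torch
from torch import Tensor


class MapperWithTensors(torch.nn.Module):
    """Init receives the already-selected edge tensors."""

    def __init__(self, edge_index: Tensor, edge_attr: Tensor) -> None:
        super().__init__()
        self.register_buffer("edge_index", edge_index)
        self.register_buffer("edge_attr", edge_attr)


class MapperWithSubGraph(torch.nn.Module):
    """Init receives the sub-graph and selects the edge attributes itself."""

    def __init__(self, sub_graph, sub_graph_edge_attributes: list[str]) -> None:
        super().__init__()
        self.register_buffer("edge_index", sub_graph.edge_index)
        # The attribute selection lives inside the mapper here; the review
        # asks whether it belongs in the model instead.
        edge_attr = torch.cat(
            [sub_graph[name] for name in sub_graph_edge_attributes], dim=-1
        )
        self.register_buffer("edge_attr", edge_attr)
```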

@github-project-automation github-project-automation bot moved this from To be triaged to Under Review in Anemoi-dev Oct 2, 2025
@ssmmnn11 ssmmnn11 requested a review from a team as a code owner October 7, 2025 08:55
@ssmmnn11 ssmmnn11 force-pushed the feat/mapper-refactor branch from 1591075 to 5c4cdbc Compare October 7, 2025 09:01

@matschreiner (Contributor) left a comment

Hi Simon!
Thanks for implementing this, it is really cool!

There still seem to be two parallel control flows: one where graphs are provided externally (dynamic case) and one where they’re provided by the GraphProvider (static case). The purpose of this design is to abstract graph handling into a single place, but it’s still split between two places.

I think either the DynamicGraphProvider should be hooked into the “graph stream” somehow, or the static graph should be provided at the same point in the pipeline as the dynamic graphs are now, so both can be passed consistently into the mappers’ forward methods.
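
A hedged sketch of the second option, where the static graph enters the pipeline at the same point as the dynamic one (GraphProvider/DynamicGraphProvider are named in this thread; everything else is hypothetical):

```python
import torch
from torch import Tensor


class StaticGraphProvider(torch.nn.Module):
    """Yields the same precomputed graph on every call."""

    def __init__(self, edge_index: Tensor, edge_attr: Tensor) -> None:
        super().__init__()
        self.register_buffer("edge_index", edge_index)
        self.register_buffer("edge_attr", edge_attr)

    def forward(self, x: Tensor) -> tuple[Tensor, Tensor]:
        return self.edge_index, self.edge_attr


class DynamicGraphProvider(torch.nn.Module):
    """Rebuilds the graph from the inputs on every call."""

    def forward(self, x: Tensor) -> tuple[Tensor, Tensor]:
        raise NotImplementedError  # e.g. kNN edges built from coordinates in x


def run_mapper(mapper, provider: torch.nn.Module, x: Tensor) -> Tensor:
    # One code path for both cases: the provider yields the graph and the
    # mapper's forward consumes it, with no provider-specific branching.
    edge_index, edge_attr = provider(x)
    return mapper(x, edge_index, edge_attr)
```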

@matschreiner (Contributor) left a comment

LGTM :)

@JPXKQX (Member) commented Nov 19, 2025

UPDATE: One integration test is failing: test_restart_from_existing_checkpoint.
Checkpoint migration has been implemented for state_dicts, but the test fails due to incompatible optimiser states. As a result, previous checkpoints can be used for transfer learning, provided that load_weights_only: True is set. Migrating the optimiser states should be possible, but without any contextual information it is not straightforward and could lead to silent misbehaviour.
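
For reference, a minimal sketch of what the load_weights_only: True path amounts to in plain PyTorch (a Lightning-style "state_dict" checkpoint key and the file name are assumptions):

```python
import torch

model = torch.nn.Linear(4, 4)  # stand-in for the real model

ckpt = torch.load("previous_run.ckpt", map_location="cpu")
# Only the weights are restored; the incompatible optimiser states stored
# alongside them are never loaded, so training resumes with a fresh optimiser.
model.load_state_dict(ckpt["state_dict"])
```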


```diff
 class SparseProjector(torch.nn.Module):
-    """Constructs and applies a sparse projection matrix for mapping features between grids."""
+    """Applies sparse projection matrix to input tensors."""
```

Collaborator left a comment

If the logic gets moved to the ProjectionGraphProvider, this class could be removed as it essentially just performs a matmul.
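
A sketch of the reduction being described, assuming the projection matrix is held as a torch sparse COO tensor (all names and sizes here are toy values):

```python
import torch

# Toy sparse projection matrix mapping 3 source nodes onto 3 target nodes.
indices = torch.tensor([[0, 1, 2], [0, 0, 1]])  # (target, source) pairs
values = torch.tensor([0.5, 0.5, 1.0])
proj = torch.sparse_coo_tensor(indices, values, size=(3, 3))

x = torch.randn(3, 8)          # source-grid features
y = torch.sparse.mm(proj, x)   # the module's forward reduces to this matmul
```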
