Add changes for pushforward trick by SavvasMel · Pull Request #1997 · ecmwf/WeatherGenerator

SavvasMel · 2026-03-06T13:06:48Z

Description

This PR adds the necessary changes for the pushforward trick.

Issue Number

Closes #1740

Is this PR a draft? Mark it as draft.

Checklist before asking for review

I have performed a self-review of my code
My changes comply with basic sanity checks:
- I have fixed formatting issues with ./scripts/actions.sh lint
- I have run unit tests with ./scripts/actions.sh unit-test
- I have documented my code and I have updated the docstrings.
- I have added unit tests, if relevant
I have tried my changes with data and code:
- I have run the integration tests with ./scripts/actions.sh integration-test
- (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
- (bigger changes and experiments) I have shared a hedgedoc in the GitHub issue with all the configurations and runs for these experiments
I have informed and aligned with people impacted by my change:
- for config changes: the MatterMost channels and/or a design doc
- for changes of dependencies: the MatterMost software development channel

clessig · 2026-03-15T18:18:57Z

src/weathergen/model/model_interface.py

+                # reshard_after_forward=False keeps FE parameters unsharded
+                # during the multi-step rollout loop.
+                # Needed for pushforward trick.
+                fully_shard(module, reshard_after_forward=False, **fsdp_kwargs)


@sophie-xhonneux : is this maybe related to the problem we are seeing with the EMATeacher where we need to reshard?

clessig · 2026-03-15T18:26:05Z

src/weathergen/model/model.py

+                        tokens = self.forecast_engine(tokens, step, model_params.rope_coords)
+
+                    # Add empty predictions for all streams (vectorized / batched if possible)
+                    for stream_name in self.stream_names:


We can avoid this if we create output with the correct length up front. I thought we did this anyway, hence the step argument.

clessig · 2026-03-15T18:27:20Z

src/weathergen/model/model.py

+                )
+
+                if needs_full_prediction:
+                    tokens = self.forecast_engine(tokens, step, coords=model_params.rope_coords)


Remove coords= for consistency

clessig · 2026-03-15T18:30:00Z

src/weathergen/model/model.py

+                    or step == max(batch.get_output_idxs())
+                )
+
+                if needs_full_prediction:


We already call the forecast engine in l702. Don't we then call it twice in one iteration?

Also, if we set self.forecast_engine to Identity when the number of blocks is 0, we avoid the condition above.

Sorry, the one in l702 must have been introduced by a bad merge; I will delete it.

Many thanks for the heads-up, I will correct this too.

@clessig just to make sure I understand: if the number of blocks is not > 0, the forecast engine is set to None. Should I introduce an identity function instead in this case?

Yes exactly
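
The identity suggestion above can be sketched as follows. This is a minimal illustration in plain Python, not the repository's code: `IdentityEngine` and `build_forecast_engine` are hypothetical names, and the `(tokens, step, coords)` call signature is taken from the diff hunks quoted in this review. In the real model the class would presumably subclass `torch.nn.Module`; note that `torch.nn.Identity.forward` accepts a single input, so a thin wrapper like this one is likely needed to match the three-argument call.

```python
class IdentityEngine:
    """Hypothetical stand-in for a forecast engine with zero blocks.

    It accepts the same arguments as the real engine is called with in the
    diff (tokens, step, coords) and returns the tokens unchanged.
    """

    def __call__(self, tokens, step, coords=None):
        return tokens


def build_forecast_engine(num_blocks, make_engine):
    """Return a real engine when blocks exist, otherwise the identity.

    `make_engine` is a placeholder factory for the actual forecast engine.
    """
    if num_blocks > 0:
        return make_engine(num_blocks)
    return IdentityEngine()


# With zero blocks the caller can invoke the engine unconditionally,
# so no `if self.forecast_engine is not None` check is needed.
engine = build_forecast_engine(0, make_engine=lambda n: None)
tokens = [1.0, 2.0, 3.0]
assert engine(tokens, 0, None) is tokens
```

With this in place, the rollout loop can call the engine on every step without branching on whether any blocks were configured.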

clessig · 2026-03-15T18:31:22Z

src/weathergen/model/model.py

-            output = self.predict_decoders(model_params, step, tokens, batch, output)
-            # latent predictions (raw and with SSL heads)
-            output = self.predict_latent(model_params, step, tokens, batch, output)
+                needs_full_prediction = (


If you choose slightly more compact variable names, we can fit this on one line and it's more readable.

E.g. pushforward instead of pushforward_trick

clessig · 2026-03-15T18:31:47Z

src/weathergen/model/model.py

+                needs_full_prediction = (
+                    not pushforward_trick
+                    or not self.training
+                    or step == max(batch.get_output_idxs())


You use this to determine the number of forecast steps?

I use this to determine the last step for which we will take gradients.
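
To make the condition concrete, here is a small stand-alone sketch of the predicate from the diff. It assumes `batch.get_output_idxs()` returns the forecast steps that have targets; the function wrapper and the example values are illustrative and not part of the PR.

```python
def needs_full_prediction(pushforward, training, step, output_idxs):
    """Mirror of the condition in the diff, written as a pure function.

    Decoders and latent heads run on every step unless the pushforward
    trick is active during training, in which case they run only on the
    last output step (the one for which gradients are taken).
    """
    return (not pushforward) or (not training) or step == max(output_idxs)


output_idxs = [1, 2, 3]  # illustrative forecast steps with targets

# Training with pushforward: only the final step gets a full prediction.
train_steps = [s for s in output_idxs if needs_full_prediction(True, True, s, output_idxs)]
print(train_steps)  # [3]

# Evaluation, or training without pushforward: every step is predicted.
eval_steps = [s for s in output_idxs if needs_full_prediction(True, False, s, output_idxs)]
print(eval_steps)  # [1, 2, 3]
```

Read this way, `max(batch.get_output_idxs())` does not set the number of forecast steps; it only marks which step of the rollout contributes predictions during pushforward training.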

Add changes for pushforward trick

a32cc51

github-project-automation bot added this to WeatherGen-dev Mar 6, 2026

SavvasMel added 2 commits March 6, 2026 14:07

Remove comments

e509ee1

Linting

de20faf

github-actions bot added the labels model (Related to model training or definition, not generic infra) and science (Scientific questions) Mar 9, 2026

SavvasMel added 2 commits March 9, 2026 18:56

Merge branch 'develop' into SavvasMel/develop/pushf_trick

1d5f2fa

Merge branch 'develop' into SavvasMel/develop/pushf_trick

30a51ee

SavvasMel requested a review from clessig March 12, 2026 11:57

clessig reviewed Mar 15, 2026

Merge branch 'develop' into SavvasMel/develop/pushf_trick

056d4be

SavvasMel wants to merge 6 commits into ecmwf:develop from SavvasMel:SavvasMel/develop/pushf_trick
