
Sgrasse/develop/1526 update io#1908

Open
grassesi wants to merge 40 commits into ecmwf:develop from
grassesi:sgrasse/develop/1526-update-io

Conversation

Contributor

@grassesi grassesi commented Feb 23, 2026

Description

Issue Number

Closes #1526

Is this PR a draft? Mark it as draft.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a HedgeDoc in the GitHub issue with all the configurations and runs for these experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

@grassesi grassesi force-pushed the sgrasse/develop/1526-update-io branch from 8ed6d54 to b7c57c1 on March 3, 2026 14:55
@grassesi grassesi marked this pull request as ready for review March 3, 2026 14:59
@grassesi grassesi requested a review from clessig March 5, 2026 06:43
@github-actions github-actions bot added the infra (Issues related to infrastructure) and model (Related to model training or definition, not generic infra) labels Mar 9, 2026
@github-project-automation github-project-automation bot moved this to In Progress in WeatherGen-dev Mar 11, 2026
Contributor

@florianscheidl florianscheidl left a comment

Many nice changes! To assess the functional improvements, I'd need more time to understand the changes in depth.

Some higher-level comments:

  1. For typing, it might be nice to add some static type checking like pyright or ty (Astral)
  2. Would be helpful to have more docstrings/explainers for the new classes and functions introduced in Writer
  3. What do we use for logging? I've seen a couple of raises with specific errors, but no logging.

I'll test it on our HPC once we've implemented the fixes.
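As an illustration of point 3, a module-level logger is one common pattern. This sketch is hypothetical (`load_streams` and the `"streams"` key are made-up names); the project may already have its own logging setup:

```python
import logging

logger = logging.getLogger(__name__)


def load_streams(cfg: dict) -> list:
    """Return the configured output streams, warning instead of raising."""
    streams = cfg.get("streams")
    if streams is None:
        # log the recoverable condition and fall back to a no-op
        logger.warning("no output streams configured; writer is a no-op")
        return []
    return streams
```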

Collaborator

@clessig clessig left a comment

See comments. I will also rebase.

self._append_dataset(self.source, "source")

if self.key.with_target(forecast_offset):
# TODO requiring target data currently prevents predictions with unknowable target
Collaborator

Does the code work when target is None? Or should we have a tensor of size (0,num_channels)?

Contributor Author

No, currently neither option works. But it should work with target=None. Writing an entire extra dataset to zarr that is totally empty is a bad band-aid solution. I think we should rename ItemKey.with_target to ItemKey.with_prediction (since it is the prediction we are always interested in; if the target is present it just means some ground truth is available, and if not that is also fine) and require only that the prediction is not None. Will implement.
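A minimal sketch of the behaviour proposed above: require a prediction, treat the target as optional ground truth. The class and method names here are illustrative stand-ins, not the actual Writer API.

```python
class WriterSketch:
    """Illustrative writer: a prediction is mandatory, a target is optional."""

    def __init__(self):
        self.datasets: dict[str, list] = {}

    def _append_dataset(self, data, name: str) -> None:
        self.datasets.setdefault(name, []).append(data)

    def append_item(self, prediction, target=None) -> None:
        if prediction is None:
            # the prediction is what we are always interested in
            raise ValueError("a prediction is always required")
        self._append_dataset(prediction, "prediction")
        if target is not None:
            # ground truth is available, so write it alongside the prediction
            self._append_dataset(target, "target")
```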


# device of the tensors in the batch
device: str | torch.device
The data an instance contains is associated with one particular initial sampling window.
Collaborator

This should replace the whole paragraph:

Class representing one batch processed by the model.

This is not the place to explain how some of the general processing works.

Contributor Author

At least I was missing this context when processing the data, where should I put the information instead?

Collaborator

MultiStreamDataSampler is responsible for assembling batches

targets = [
targets
for loss_name, targets in targets_and_auxs.items()
if loss_name == PHYSICAL_LOSS_KEY
Collaborator

What about latent outputs?

Contributor Author

If possible I would like to keep this out of scope for now and wait for #1860 to be finalized.

try: # TODO: do this in config
_output_streams = val_cfg.output.streams
if _output_streams is None:
raise ConfigAttributeError("")
Collaborator

This should generate a warning and then terminate regularly

Contributor Author

Currently not every training/validation/test config implements an "output" section; this guarantees a fallback/default. If the output section is missing, should the default instead be no output / a no-op writer? Failing instead would mean that configs where no output is needed would be required to specify an "output" section.

Collaborator

If there's no output section, or no streams are specified, or the specified streams are not present, all of these cases should generate a warning and terminate regularly.
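The fallback behaviour suggested here could look like the following sketch, assuming plain-dict-style config access; the function name and config keys are hypothetical:

```python
import logging

logger = logging.getLogger(__name__)


def select_output_streams(val_cfg: dict, streams: dict) -> dict:
    """Warn and fall back to no output instead of raising."""
    requested = val_cfg.get("output", {}).get("streams")
    if not requested:
        # missing output section or empty stream list: terminate regularly
        logger.warning("no output streams configured; nothing will be written")
        return {}
    missing = [name for name in requested if name not in streams]
    if missing:
        logger.warning("requested output streams not present: %s", missing)
    return {name: cfg for name, cfg in streams.items() if name in requested}
```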

name: config for name, config in streams.items() if name in _output_streams
}
try: # TODO: do this in config
self._forecast_offset = val_cfg.forecast.offset
Collaborator

Why not use self._forecast_offset = val_cfg.forecast.get("offset", 0)? That's much more readable than the try/except.

Contributor Author

The try/except also handles the case where a training/val/test config does not contain a "forecast" section. If I can always assume it is there, I am happy to change it.

Collaborator

val_cfg.get("forecast", {}).get("offset", 0) is better
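The suggested accessor, shown here on a plain dict; with an OmegaConf DictConfig the same .get chain should also work, since DictConfig.get accepts a default like dict.get. This sketch is illustrative only:

```python
def forecast_offset(val_cfg: dict) -> int:
    """Read forecast.offset, defaulting to 0 when either level is missing."""
    return val_cfg.get("forecast", {}).get("offset", 0)
```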

data = self._predictions.get_physical_prediction_normalized(key, self._normalizer)
except Exception as e:
# TODO: if preds are empty, create a copy of target and add an ensemble dimension
# preds = [targets[0].clone().unsqueeze(0)]
Collaborator

We need this. What's the problem?
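The commented-out fallback would, in torch terms, clone the first target and prepend an ensemble dimension of size one; a numpy analogue of that shape change, for illustration only:

```python
import numpy as np


def fallback_prediction(targets: list[np.ndarray]) -> np.ndarray:
    """Numpy analogue of `targets[0].clone().unsqueeze(0)`: copy the first
    target and prepend an ensemble axis of size 1."""
    return np.expand_dims(targets[0].copy(), axis=0)
```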

return self

@property
def sample_idxs(self) -> list[int]:
Collaborator

Why is this needed?

Contributor Author

@grassesi grassesi Mar 16, 2026

Just quality of life: use batch.target_samples.sample_idxs instead of sample_idxs = [sample.sample_idx for sample in batch.target_samples.samples]
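A sketch of the convenience property under discussion; Sample and SampleCollection are stand-ins for the actual batch classes:

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    sample_idx: int


@dataclass
class SampleCollection:
    samples: list[Sample] = field(default_factory=list)

    @property
    def sample_idxs(self) -> list[int]:
        # convenience accessor replacing the explicit list comprehension
        return [sample.sample_idx for sample in self.samples]
```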

Collaborator

Why do we need sample.sample_idx? This is not on current develop.


Labels

infra (Issues related to infrastructure), model (Related to model training or definition, not generic infra)

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

validation_io can be simplified

3 participants