RFC-8: Collections #343

Draft
normanrz wants to merge 32 commits into ome:main from normanrz:rfc-8

@normanrz
Contributor

This is the work-in-progress draft for RFC-8.

cc @jluethi @lorenzocerrone @tischi @perlman @matthewh-ebi

@github-actions
Contributor

github-actions Bot commented Sep 29, 2025

Automated Review URLs

@normanrz normanrz mentioned this pull request Sep 29, 2025
Comment thread rfc/8/index.md Outdated
#### `Collection` keys

* `"type"` (required). Value must be `"collection"`.
* `"nodes"` (required). Value must be an array of `CollectionNode` or `Collection` objects.
Contributor

since every node has a unique name, why is this an array and not an object?

Contributor Author

Yeah that could also work.

Contributor Author

I wonder if representing an order may be desired, though. For example, https://ngff.openmicroscopy.org/latest/index.html#bf2raw states "Parsers like Bio-Formats define a strict, stable ordering of the images in a single container ...".
If it were an object the ordering would likely get lost in some JSON implementations. It could be represented through sortable node names, but that also seems less convenient.

Contributor

@d-v-b d-v-b Oct 2, 2025

order might also be useful for collections of layers in the context of an image visualization tool. Although you can always add an "order" field to the elements that's an integer (sort of the reverse of adding a "name" field that must be unique in the container).
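
To make the array-versus-object trade-off concrete, here is a minimal Python sketch. It assumes nodes are plain dicts with a unique `"name"` key, as discussed above; the helper name `index_nodes` and the node values are made up for illustration and are not part of the RFC. Keeping `"nodes"` as an array preserves the order, and a reader can still build a name lookup (and reject duplicates) in one pass.

```python
from collections import Counter


def index_nodes(nodes):
    """Return a name -> node mapping that keeps the order of the array.

    Hypothetical helper: raises ValueError if two nodes share a name.
    """
    counts = Counter(node["name"] for node in nodes)
    duplicates = sorted(name for name, count in counts.items() if count > 1)
    if duplicates:
        raise ValueError(f"duplicate node names: {duplicates}")
    # Python dicts preserve insertion order, so iteration follows the array order.
    return {node["name"]: node for node in nodes}


# Illustrative nodes; the order is deliberately not alphabetical.
nodes = [
    {"name": "series_1", "type": "multiscale"},
    {"name": "series_0", "type": "multiscale"},
]
by_name = index_nodes(nodes)
print(list(by_name))
```

The reverse direction (an object plus an integer "order" field per element) would need an extra sort step in every reader.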

Comment thread rfc/8/index.md Outdated

### Metadata

This RFC defines two main objects for OME-Zarr: `Collection`, `CollectionNode`.
Contributor

A CollectionNode can be a Collection, so it's a bit confusing to say that these are two objects unless you explain that "object" here means something like "interface" or "protocol".

Contributor Author

What would be the best term here? Is it a class?

Contributor

As I understand it, there are currently 3 entities that need to be defined:

  • collection
  • multiscales
  • root

collection and multiscales can be discriminated based on their type field, and collection has attributes that multiscales does not, so regular inheritance from a base class doesn't express their relationship very well.

Maybe defining these as protocols would work? E.g., there's a core Node protocol, which has the fields {type, name, attributes}, and objects that implement Node can also implement Collection OR Multiscales (but not both, because of the requirement on the type key). Finally, there's a Root protocol which can only be implemented by a Collection.

Member

Presumably bioformats2raw.layout and plate collections will still be around (not removed with this proposal). So a Node could be Collection or Multiscales or bioformats2raw or plate?

Contributor

@d-v-b d-v-b Oct 2, 2025

actually I was wrong, regular inheritance isn't problematic for Collection and Multiscales -- there's a base Node, Collection and Multiscales (and anything else) inherit from Node (totally fine for them to add new attributes as children).

As for the requirement that there be only 1 root node, I don't think that can be expressed in a type system easily as long as the root is structurally compatible with a Collection, but it can be added as a regular requirement.

Contributor

If the requirement is just that the root node have version (weaker than requiring that only the root node have version), then this is a bit simpler.
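
The inheritance arrangement described in this thread could be sketched with Python dataclasses as below. This is only an illustration of the idea, not the RFC's data model: the class layout, the `"multiscales"` type string, and the `check_root` helper are assumptions. It uses the weaker root rule mentioned above (the root must be a Collection and must carry a version), which is checked at runtime because it is hard to express in the types themselves.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Node:
    """Base entity with the shared fields {type, name, attributes}."""
    name: str
    type: str
    attributes: dict[str, Any] = field(default_factory=dict)
    version: Optional[str] = None


@dataclass
class Multiscales(Node):
    type: str = "multiscales"  # discriminator value is an assumption


@dataclass
class Collection(Node):
    type: str = "collection"
    nodes: list[Node] = field(default_factory=list)


def check_root(root: Node) -> None:
    """Runtime check for the root rule; not expressible as a plain type."""
    if not isinstance(root, Collection):
        raise TypeError("root must be a Collection")
    if root.version is None:
        raise ValueError("root must carry a version")


root = Collection(name="example", version="0.x", nodes=[Multiscales(name="raw")])
check_root(root)
```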

Contributor Author

> Presumably bioformats2raw.layout and plate collections will still be around (not removed with this proposal). So a Node could be Collection or Multiscales or bioformats2raw or plate?

The idea is to remove bioformats2raw.layout and plate as separate entities with this proposal and express the functionality through attributes in the collection nodes. We need to work more on these.

Contributor

Could this work similarly to how I proposed it for the coordinate transforms? In essence, the paths specified in the plate metadata could be allowed to contain a Collection, which would contain the reference to the path.

@d-v-b
Contributor

d-v-b commented Oct 2, 2025

this is looking really cool!

@dstansby
Contributor

Looks nice! As a quick initial comment, it would be super helpful to have a minimal example that demonstrates the new metadata structure being proposed. The webknossos examples are nice, but I'm struggling to distinguish what's required and what's optional in those files because there are lots of extra (I think?) attributes.

Comment thread rfc/8/index.md Outdated
}, {
"name": "..",
"type": "collection",
"path": "./nested_collection.json"
Member

The collection should be a directory that contains a zarr.json, right?
e.g. "path": "./nested_collection.zarr"

Member

Ah, now I see that this standalone json file is proposed as part of this RFC. But that isn't covered until much later, under Examples ("Where is this collection metadata stored?"). Maybe that should be moved up above this point?

If an implementation is using e.g. zarr-python or another zarr library to retrieve zarr metadata, then it may be kinda painful to also support fetching of vanilla file.json files using a different mechanism? Don't know about other libs.
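
To illustrate the two cases being discussed, a reader could decide where to look for a node's metadata from the path alone. The sketch below is hypothetical: the function name `metadata_document` and the suffix-based rule are assumptions, not something the RFC text in this thread specifies, and it only handles relative POSIX-style paths (not URLs).

```python
import posixpath


def metadata_document(node_path: str) -> str:
    """Guess which JSON document holds the metadata for a relative node path.

    Assumed rule: a path ending in ".json" is a standalone metadata file;
    anything else is treated as a Zarr v3 hierarchy node with a zarr.json.
    """
    normalized = posixpath.normpath(node_path)
    if normalized.endswith(".json"):
        return normalized
    return posixpath.join(normalized, "zarr.json")


print(metadata_document("./nested_collection.json"))
print(metadata_document("./nested_collection.zarr"))
```

The first case is the one that a Zarr library would not fetch on its own, which is the concern raised above.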

@will-moore will-moore mentioned this pull request Oct 30, 2025
4 tasks
@jo-mueller jo-mueller mentioned this pull request Oct 30, 2025
@will-moore
Member

I started a basic implementation of Collections spec for the validator at ome/ome-ngff-validator#62.
This should allow you to browse example Collections. Also there's a couple of linked test collections there to try out.

@tepals

tepals commented Mar 13, 2026

As part of the CCP-volumeEM OME-NGFF Hackathon at EMBL-EBI Hinxton, we wrote down the following user story:

User Story: Large-Scale Multi-Beam vEM Tiling

Definitions

  • sub-image: 900x900 px image acquired by a single beam
  • tile: 8x8 sub-images stitched
  • slice: multiple tiles that make up a z-plane in the volume (each slice is made up of 100 - 1600 tiles) => 6400 - 100k sub-images in a slice
  • volume: series of slices (up to 100)

User Story
A microscopist using a Delmic FAST-EM system acquires up to a 100-slice volume where the imaged area varies per slice (e.g., following an irregular tissue boundary). Each slice consists of a variable number of tiles, each tile consists of 64 900 x 900 pixel sub-images with a 4 nm pixel size. While a nominal 100-pixel overlap is targeted, the actual spatial distribution is non-uniform, resulting in varying slice dimensions (& a need for individual transformations).

Currently this is solved by stitching the sub-images into a single tile; each tile is saved as an individual pyramidal TIFF containing the first 3 zoom levels. For optimized viewing, the fourth zoom level is created by stitching 16 (?) tiles and saving that as an individual TIFF.

This raises 2 challenges:

  1. How many OME-Zarrs with individual transformations can reasonably be stored in a single collection? What are the trade-offs between saving the data viewer-optimized in a single OME-Zarr vs. having many OME-Zarrs with transformations that allow full flexibility in raw data access and can still display a fused image at some performance cost? Additional complexity arises for parallel writing of a collection: avoiding race conditions when writing the collection files, and overly large collection JSON metadata.
  2. When handling very many OME-Zarrs and building pyramid layers (around 8) from many individual OME-Zarrs, lower-resolution pyramid layers would need to be combined across multiple OME-Zarrs. Otherwise, the lowest-resolution version of a 900 px sub-image is a ~3x3 px array (while at the resolution of the full field, it would still be a 400x400 image).

To handle problem 1, the OME-NGFF metadata must scale to coordinateTransformation for each of the N (100 - 1600 if tiles are OME-Zarr, 6400 - 100k if sub-images are OME-Zarrs) sub-images to position them within a shared 2D physical space. This would allow a viewer to render a "stitched" multiscale global view that only exists where data was actually acquired, while still providing direct access to the underlying raw overlapping FOVs regardless of their local grid density.
In order to handle viewing at lower resolutions, the collection spec would need to support combining many single scales at full resolution with fewer single-scale at lower resolution to build a pyramid.
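
The numbers quoted in this user story can be checked with a few lines of Python. The only assumption added here is a downsampling factor of 2 per pyramid level, which the text does not state explicitly; the function names are illustrative.

```python
SUB_IMAGES_PER_TILE = 8 * 8   # a tile is 8x8 stitched sub-images
SUB_IMAGE_SIZE_PX = 900       # edge length of one sub-image


def sub_images_per_slice(tiles: int) -> int:
    """Number of sub-images in a slice made of `tiles` tiles."""
    return tiles * SUB_IMAGES_PER_TILE


def size_after_downsampling(size_px: int, levels: int, factor: int = 2) -> float:
    """Edge length after `levels` downsampling steps (factor 2 is assumed)."""
    return size_px / factor ** levels


# 100 to 1600 tiles per slice
print(sub_images_per_slice(100), sub_images_per_slice(1600))  # 6400 102400
# a single 900 px sub-image after 8 halvings
print(size_after_downsampling(SUB_IMAGE_SIZE_PX, 8))  # 3.515625
```

This reproduces the "6400 - 100k sub-images" range and the "~3x3px" lowest level mentioned above.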

Rough idea of the Example file structure, where each slice in a volume is saved as an OME-Zarr collection with shared low-resolution pyramid layers:

slice.zarr/
├── zarr.json                 <-- Collection metadata             
│
├── low_res_slice/            <-- Low-res, shared single scales (either grouped as a multiscales or individual single scales)
│   ├── zarr.json             <-- Multiscales (Zoom 4+)
│   ├── 4/                    <-- Resolution level 4 across the tiles
│   ├── 5/                    <-- Resolution level 5 across the tiles
│   └── 6/                    <-- Resolution level 6 across the tiles
│
├── tile_001/                 <-- One of the tiles in the large 2D slice
│   ├── zarr.json             <-- transformation: [z: 0, y: 0, x: 0]
│   ├── 0/                    <-- full resolution data
│   ├── 1/                    <-- down-sampled data
│   ├── 2/                    <-- down-sampled data
│   └── 3/                    <-- down-sampled data
├── tile_002/                 <-- Second tile in the large 2D slice
│   ├── zarr.json             <-- transformation: [z: 0, y: 0, x: 812] => tile shifted in X
│   ├── 0/                    <-- full resolution data
│   ├── 1/                    <-- down-sampled data
│   ├── 2/                    <-- down-sampled data
│   └── 3/                    <-- down-sampled data
└── ...
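
As a rough illustration of what the collection metadata in `slice.zarr/zarr.json` might contain for this layout, the Python sketch below generates one node per tile with a translation. Everything about the shape of the output is an assumption made for this example: the key names (`"name"`, `"type"`, `"path"`, `"attributes"`, `"nodes"`), the `"multiscale"` type string, the `"0.x"` version placeholder, and the helper names are hypothetical and may not match the RFC.

```python
import json


def tile_node(name: str, offset_zyx: list) -> dict:
    """Hypothetical collection node for one tile, positioned by a translation."""
    return {
        "name": name,
        "type": "multiscale",
        "path": f"./{name}",
        "attributes": {
            "coordinateTransformations": [
                {"type": "translation", "translation": offset_zyx}
            ]
        },
    }


def slice_collection(offsets: dict) -> dict:
    """Build the assumed collection document for one slice."""
    nodes = [tile_node(name, offset) for name, offset in offsets.items()]
    # Shared low-resolution levels live in their own group.
    nodes.append({"name": "low_res_slice", "type": "multiscale", "path": "./low_res_slice"})
    return {"ome": {"version": "0.x", "type": "collection", "name": "slice", "nodes": nodes}}


collection = slice_collection({"tile_001": [0, 0, 0], "tile_002": [0, 0, 812]})
print(json.dumps(collection, indent=2))
```

With one such entry per tile or sub-image, the document grows linearly with N, which is the metadata-size concern raised in challenge 1.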

@toloudis

User story 1 : combining independent segmentation zarrs with raw image zarrs.

We produce multiscale zarrs of our raw microscope images, using filtered downsampling.
We later produce multiscale zarrs of segmentations of the above images, using unfiltered (nearest neighbor) downsampling.

In viewers we want to give users an easy way of combining the two. In particular, our users are interested in seeing the data as if it were actually separate channels of the same volume. This may or may not be a viewer implementation detail, but it could be interesting if the spec supported this, pointing to two separate zarrs and treating them as consecutive channels. For our viewer, this only works if the spatial dimensions are the same, and can be transformed to the same origin (always trivially true for the data I describe).
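
The compatibility condition described here (same spatial dimensions, so the two arrays can be shown as consecutive channels) can be sketched as a shape check. This is a viewer-side illustration only; the function names and the default `("c", "z", "y", "x")` axis order are assumptions, and it ignores the transform-to-same-origin part of the condition.

```python
SPATIAL_AXES = ("z", "y", "x")


def spatial_shape(shape, axes):
    """Keep only the sizes of the spatial axes, in order."""
    return tuple(size for size, axis in zip(shape, axes) if axis in SPATIAL_AXES)


def stacked_shape(raw_shape, seg_shape, axes=("c", "z", "y", "x")):
    """Shape of raw and segmentation treated as consecutive channels.

    Raises ValueError if the spatial dimensions differ.
    """
    if spatial_shape(raw_shape, axes) != spatial_shape(seg_shape, axes):
        raise ValueError("spatial dimensions differ; cannot combine as channels")
    c = axes.index("c")
    combined = list(raw_shape)
    combined[c] = raw_shape[c] + seg_shape[c]
    return tuple(combined)


# 3 raw channels plus 1 segmentation channel of the same volume
print(stacked_shape((3, 64, 512, 512), (1, 64, 512, 512)))  # (4, 64, 512, 512)
```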

User story 2: dataset releases

Is it practical to have one single very large collection, as in 1000s of zarrs or more? We would likely produce collections of matched raw+segmentation zarrs as described in my user story 1.

@jo-mueller
Contributor

TODO:

This post is a bit stream-of-consciousness-y. I hope I manage to express the bump I am stumbling over with the current state of transforms in here. In the version of this RFC in which coordinateTransformations were added, the input and output fields of transforms were supercharged to be any of:

  • the name of a coordinate system
  • the path to a multiscales group or single zarr array
  • and now, additionally, the name of a node instance.

In ome/ngff-spec#117, this was made more explicit, so that these input/output fields explicitly declare these. In an RFC8 world, this would look like this:

"input": {
  "path": "./scale0",
  "node": "node_name",
  "name": "coordinate_system_name"
}

And I think porting over this formalism is important, because instances of nodes (of whatever kind) may own multiple coordinate systems.
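
One way to picture this formalism in code is a small reference type with the three keys shown above. The sketch is hypothetical: the class name `TransformRef`, the rule that unknown keys are rejected, and the rule that at least one key must be present are assumptions for illustration, not requirements taken from ome/ngff-spec#117 or RFC-8.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_KEYS = {"path", "node", "name"}


@dataclass(frozen=True)
class TransformRef:
    """Explicit input/output reference of a transform (illustrative)."""
    path: Optional[str] = None   # path to a multiscales group or array
    node: Optional[str] = None   # name of a node instance
    name: Optional[str] = None   # name of a coordinate system


def parse_ref(raw: dict) -> TransformRef:
    """Parse a reference dict, rejecting unknown or missing keys (assumed rules)."""
    unknown = set(raw) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    if not raw:
        raise ValueError("a reference needs at least one of path, node, name")
    return TransformRef(**raw)


ref = parse_ref({"path": "./scale0", "node": "node_name", "name": "coordinate_system_name"})
print(ref)
```

Because a node may own several coordinate systems, the pair (`node`, `name`) is what identifies a coordinate system unambiguously in this sketch.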

This has implications. In RFC8, the transforms for Singlescale nodes live in the singlescale metadata, which may not be the same metadata document as the multiscales metadata (unless the singlescale metadata is inlined). So currently, the transforms metadata for an inlined singlescale would look like this:

{
    "ome": {
        "version": "0.x",
        "type": "collection",
        "name": "example",
        "attributes": {
            "coordinateSystems": [
                {
                  "id": "world",
                  "name": "world",
                  "axes": [...]
                }
            ]
        },
        "nodes": [{
            "name": "raw",
            "type": "multiscale",
            "nodes": [{
                "id": "raw_0",
                "type": "singlescale",
                "path": {
                  "type": "zarr",
                  "path": "./raw/0"
                },
                "attributes": {
                  "coordinateTransformations": [
                    {
                      "type": "scale",
                      "scale": [1, 1, 1],
                      "input": {
                        "path": "raw_0"
                      },
                      "output": {
                        "name": "world",
                        "node": "raw_0"
                      }
                    }
                  ]
                }
            }, ...]
        }, ... ]
    }
}

The question I'm stuck with now: if the Singlescale is not inlined, where does the coordinateSystem metadata for the coordinate system called "world" live? Should it live in the attributes of the Multiscales node? In that case, the output of the coordinateTransformations metadata that is located in the Singlescale nodes would have to point up to the Multiscales metadata, which is tricky for obvious reasons. The other way would be to require an instance of coordinateSystems and coordinateTransformations metadata in the attributes of the Multiscales node. The transforms could either:

  • Link to the coordinateSystems in the singlescale nodes via identity transforms
  • duplicate the coordinateTransformations in the Singlescale nodes

I don't have a good idea about which to prefer, though.

Contributor

@lubianat lubianat left a comment

(sorry, the approval was a misclick on GH mobile when hastily ok'ing ome/ngff-spec#128)

Member

@will-moore will-moore left a comment

Just adding comments, but seems I have to create a review...

@will-moore
Member

Seems that adding comments to the changes page isn't working for me at the moment. So I'll add some here

> Non-root Node objects SHOULD NOT have a version field and MUST NOT have a different version value than the root Node.

Many of the collections I would like to represent with this spec contain images of different OME-Zarr versions. E.g. the figure at https://ome.github.io/omero-figure/?file=https://gist.githubusercontent.com/will-moore/75a7f0de5be0f7b4202d5f0229cadcc9/raw/ngff_images_figure.json or the list of samples at https://idr.github.io/ome-ngff-samples/ so this would be a blocker for many use-cases.
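
For reference, the quoted rule could be checked by a validator roughly as follows. This is a sketch under assumptions: nodes are plain dicts with optional `"version"`, `"name"`, and `"nodes"` keys, SHOULD NOT violations are reported as warnings, and MUST NOT violations as errors. The function name `check_versions` is made up.

```python
def check_versions(root: dict):
    """Walk a node tree and apply the quoted version rule.

    Returns (errors, warnings): a non-root version field is a warning
    (SHOULD NOT); a version that differs from the root is an error (MUST NOT).
    """
    errors, warnings = [], []
    root_version = root.get("version")

    def visit(node: dict, trail: str) -> None:
        for child in node.get("nodes", []):
            child_trail = f"{trail}/{child.get('name', '?')}"
            if "version" in child:
                warnings.append(f"{child_trail}: non-root node has a version field")
                if child["version"] != root_version:
                    errors.append(f"{child_trail}: version differs from the root node")
            visit(child, child_trail)

    visit(root, root.get("name", "root"))
    return errors, warnings


example = {
    "name": "example",
    "version": "1.0",
    "nodes": [
        {"name": "a", "version": "1.0"},
        {"name": "b", "version": "0.5"},
        {"name": "c", "nodes": [{"name": "d"}]},
    ],
}
errors, warnings = check_versions(example)
print(errors)
print(warnings)
```

Under this rule, a collection that mixes images of different OME-Zarr versions (node "b" above) is rejected, which is the blocker described in this comment.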

> Singlescale: This new interface replaces the dataset metadata defined in the previous versions of the OME-Zarr specification.

I'm not sure what the motivation is for Singlescale nodes, and what it means to "replaces the dataset metadata". Presumably Multiscale images/nodes still use "datasets"?
What is the difference between a Multiscale image with a single dataset and a Singlescale? Do we need both?

@normanrz
Contributor Author

normanrz commented May 6, 2026

Thanks @will-moore!

> Non-root Node objects SHOULD NOT have a version field and MUST NOT have a different version value than the root Node.
>
> Many of the collections I would like to represent with this spec contain images of different OME-Zarr versions. E.g. the figure at https://ome.github.io/omero-figure/?file=https://gist.githubusercontent.com/will-moore/75a7f0de5be0f7b4202d5f0229cadcc9/raw/ngff_images_figure.json or the list of samples at https://idr.github.io/ome-ngff-samples/ so this would be a blocker for many use-cases.

Collections will likely be a feature of OME-Zarr 1.0. I don't think it is reasonable to referentially include all previous versions of the spec in the 1.0 release because of the burden that would put on implementations.

> Singlescale: This new interface replaces the dataset metadata defined in the previous versions of the OME-Zarr specification.
>
> I'm not sure what the motivation is for Singlescale nodes

The motivation for Singlescale nodes is to a) have a name for resolution levels within a multiscale (other name suggestions welcome!), b) allow images to be Singlescales without a multiscale (i.e. the metadata is in the zarr.json of an array instead of a group with a single item), c) make multiscales behave like collections (i.e. they have nodes).

> , and what it means to "replaces the dataset metadata". Presumably Multiscale images/nodes still use "datasets"?

Multiscales are now collections of Singlescales. The field datasets (which imo is very overloaded) is replaced by nodes.

> What is the difference between a Multiscale image with a single dataset and a Singlescale? Do we need both?

Multiscales with a single Singlescale are not disallowed, but not required anymore. Users can just create Singlescales as Zarr arrays without the need for enclosing Zarr groups.
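
As a rough picture of the `datasets` to `nodes` change, the sketch below converts one 0.x-style multiscales entry into a node list. The input follows the 0.4/0.5 `datasets` layout (`path` plus `coordinateTransformations`); the output shape (key names, the `"singlescale"` and `"multiscale"` type strings, the `<name>_<path>` naming scheme) is an assumption for illustration and may differ from what RFC-8 ends up specifying. Axes and other multiscales fields are deliberately left out.

```python
def datasets_to_nodes(multiscale: dict, name: str, group_path: str = ".") -> dict:
    """Turn a 0.x multiscales entry into a hypothetical node-based form."""
    nodes = []
    for dataset in multiscale["datasets"]:
        nodes.append({
            "name": f"{name}_{dataset['path']}",          # naming scheme is assumed
            "type": "singlescale",
            "path": f"{group_path}/{dataset['path']}",
            "attributes": {
                "coordinateTransformations": dataset.get("coordinateTransformations", [])
            },
        })
    return {"name": name, "type": "multiscale", "nodes": nodes}


legacy = {
    "datasets": [
        {"path": "0", "coordinateTransformations": [{"type": "scale", "scale": [1, 1, 1]}]},
        {"path": "1", "coordinateTransformations": [{"type": "scale", "scale": [2, 2, 2]}]},
    ]
}
converted = datasets_to_nodes(legacy, name="raw", group_path="./raw")
print([node["path"] for node in converted["nodes"]])  # ['./raw/0', './raw/1']
```

Only JSON is rewritten here; the array chunks stay untouched.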

@normanrz normanrz closed this May 6, 2026
@normanrz normanrz reopened this May 6, 2026
@d-v-b
Contributor

d-v-b commented May 6, 2026

> The motivation for Singlescale nodes is to a) have a name for resolution levels within a multiscale (other name suggestions welcome!)

could you define the term "image" to mean "a Zarr array", and "multiscale image" to mean "a collection of images at different levels of detail". Starting with the more basic thing (a single array) and defining the collection in terms of that seems better than starting with the collection (multiscales) and defining the more basic thing in terms of it.

@will-moore
Member

It feels like we have been working on RFC-5 for a long time and have finally reached a consensus on transforms and scenes etc. But even before v0.6 is released we are proposing to re-work all that again (and other core concepts like Multiscales.datasets that have been around since v0.1).
I don't know what the timeline for this looks like but it feels like too much churn.

Are we saying that OME.zarr data v0.6 and earlier are not expected to be supported by tools that read v1.0 because they are too different? That would discourage adoption of OME.zarr v0.6 because it's sunsetted even before it's released.

My first impression of RFC-8 was that it's a way of grouping existing Multiscales images into Collections. But this proposal looks like starting from scratch and ditching previous work and support for existing data?

I'm not even sure I fully understand @jo-mueller's question above, except that it shows all the hard RFC-5 discussions are going to need to be revisited again?

@jo-mueller
Contributor

@will-moore thanks for the feedback. About my comment above, I think discussing intents and structure last week in Düsseldorf helped to structure my ideas for RFC8. I opened normanrz#4 with some suggestions that address some of my concerns.

@normanrz
Contributor Author

normanrz commented May 8, 2026

> It feels like we have been working on RFC-5 for a long time and have finally reached a consensus on transforms and scenes etc. But even before v0.6 is released we are proposing to re-work all that again (and other core concepts like Multiscales.datasets that have been around since v0.1). I don't know what the timeline for this looks like but it feels like too much churn.

I appreciate the design work that has gone into RFC-5 and I think RFC-8 is building on top of that. I'll review with @jo-mueller next week whether to bring back the scene metadata.

> My first impression of RFC-8 was that it's a way of grouping existing Multiscales images into Collections. But this proposal looks like starting from scratch and ditching previous work and support for existing data?

I think it is important to look at RFC-8 as part of the long-term vision of the 1.0 release. This probably warrants its own RFC, but in my view 1.0 is supposed to be a long-term release that carries us through the next decade without breaking changes. Up until now every release of OME-Zarr has been breaking and I think that needs to stop to foster serious adoption. That also means this is the last opportunity in a while to break things in order to make the OME-Zarr spec more consistent and extensible. Basically, take all the learnings from the 0.x releases and make a great long-term 1.0 release.

> Are we saying that OME.zarr data v0.6 and earlier are not expected to be supported by tools that read v1.0 because they are too different? That would discourage adoption of OME.zarr v0.6 because it's sunsetted even before it's released.

I definitely think that tools should be considered compliant with the 1.0-spec if they only support v1.0 and no previous versions. This is already the case with 0.x versions. Only very few tools understand 0.1-0.3 and some tools only understand 0.5 and not 0.4 anymore. I think that is totally fine, because they are 0.x releases.

That being said, I think the extension mechanism could be used to include 0.x OME-Zarrs in 1.0 Collections. Just define an extension node type that references 0.x multiscales. Tools could voluntarily support that, if they find it useful.

I want to add that 0.5 -> 0.6 -> 1.0 are metadata-only changes. I don't think it is unreasonable for users to consider migrating the metadata. This will be less of a lift than the 2024 NGFF challenge, where we actually converted the data.

@d-v-b
Contributor

d-v-b commented May 8, 2026

seconding norman's POV. And a broader point about churn: churn during development is valuable if it buys a better released product. This churn affects devs for months, but users will interact with 1.0 for years. It would be unfortunate if they had to tolerate a deficient product because devs settled too early. Now is the time to fix stuff. It only gets harder later.

@jo-mueller
Contributor

I think this is a super-useful discussion here. If anything, it will help RFC8 authors to get a feeling from which direction to expect feedback or sharpen RFC8 towards. I think there are two separate things to take from this discussion:

Minimally, I think the relationship between coordinate systems and nodes needs to be clarified. To a degree, this already happened in 0.6.dev3 -> 0.6.dev4. The important thing to note here is that coordinate systems and transformations define their own graph-like structure that can be independent of the collection/node layout. Since a node in RFC8 can cleanly be resolved to a path, the only crucial thing to update is how coordinate systems are referenced, which is indeed a result of a quite lengthy discussion, but it's not hard to fix. Transform inputs and outputs can refer to coordinate systems and the node inside which they live and that'll be fine.

The other thing is the following:

> 0.5 -> 0.6 -> 1.0 are metadata-only changes.

I'm not so sure about that. In 0.x, the smallest interpretable, indivisible aggregation of data and metadata is the multiscales specification. Any implementation of ome-zarr minimally needs to understand the concept of a multiscales; without that, plates, wells or scenes don't make much sense. At the same time, the multiscales brings the functionality that allows viewers and other consumers to give users the smooth access to large data that we often advertise. Without the multiscales concept, this is simply not possible.

The introduction of the Singlescale acknowledges individual resolution levels (aka zarr arrays) as part of the ome-zarr metadata, which changes what elements of a multiscales are understood to be "load-bearing" parts. Moving the singlescale metadata into the zarr.json of the individual multiscale arrays seems to me like those arrays are intended to be interpretable by themselves, irrespective of the context of a multiscales into which they are embedded. That, in turn, requires consumers to be able to handle singlescale arrays, which I think many (i.e., viewers) cannot do easily.

Don't get me wrong, I'm not opposed to renaming datasets into nodes where every entry in the array is an instance of a Singlescale. As @normanrz points out, that's indeed just a change of metadata. I'm skeptical about distributing this metadata into the zarr.json of several arrays, all of which should be interpretable without the context of a multiscale node that's defined elsewhere in the hierarchy.

What I propose in normanrz#4 is simply a stratification and clarification of where metadata sits and what collections are expected to collect:

  • Collections declare collections of images and potentially other stuff
  • An image is a special kind of collection, consisting of one or more resolution levels.
  • An image (aka multiscale group) forms the most elementary, portable, self-describing unit of a collection, that can be copied, referenced and validated independently.

This is currently not necessarily the case with the Singlescale node. RFC8 already suggests a possible way out: Inlining the singlescale metadata into the multiscale node sorts this out. I would just require multiscales to always inline singlescales.

Imho, making this restriction doesn't take away from the expressiveness and elegance of RFC8, but adds to the integrity and reliability of images (aka multiscales) as an essential concept in the spec.

@normanrz normanrz mentioned this pull request May 18, 2026
Labels

rfc Status: request for comments rfc-8