Version handling for field sets #123

@ian-ross

Either add explicit versioning to field sets or provide comprehensive compatibility checking when reading trajectory stores. (I'm leaning towards the latter option right now.)

Some notes:

  • What do we actually want versioning in this sense to do?

    • We can reconstruct the field set used to save a store from the NetCDF
      group information, so we can see both the "saved" and "declared"
      version of the field set.
    • Need some sort of field set comparison operation to determine
      incompatibilities.
    • Do we need /version numbers/ or /version identifiers/ on field sets at all?
      Or can we get away just with comparing the "saved" and "declared" field
      sets? For built-in field sets (so far, just "base" and "emissions") the
      version of the field set is the software version, so we don't need a
      separate version ID, but maybe we need to add an optional version ID
      for "external" field sets. [NEEDS THINKING]
    • If we have a version number, do we somehow need to provide access to
      the whole history of version changes for a field set? I don't think so:
      we just need to be able to say whether the versions are different or
      not, and whether the field sets are compatible or not.
  • Need to think about:

    • Reading an older store version with a newer field set.
    • Reading a newer store version with an older field set?
    • Just reading a store with a different version of a field set than it
      was created with.
    • The older/newer distinction only makes sense for adding/removing
      fields. For changing the details of any field, only "different" is
      relevant.
  • Things that might change in a field set:

    • Add a new field: read new store with older field set ⇒ just ignore the
      added field; read old store with newer field set ⇒ OK if there's a
      default value, otherwise incompatible.
    • Remove a field: read old store with newer field set ⇒ just ignore the
      deleted field; read new store with older field set ⇒ OK if there's a
      default value, otherwise incompatible.
    • Change the type of a field: if the types are compatible/convertible,
      that's no problem, and if they're not, the files with different
      versions are totally incompatible ⇒ allow option to exclude bad fields
      when reading store?
    • Change the dimensions of a field: old/new fields will be incompatible ⇒
      option to exclude.
    • Change the description of a field: benign? ⇒ "strict" option to catch
      even these small changes?
    • Change the units of a field: if the units are convertible, OK, convert
      them, otherwise incompatible ⇒ introduce Pint for conversions?
    • Change the required status of a field: required=false → required=true ⇒
      incompatible if input is missing; option to exclude whole trajectories
      when reading a store, or maybe some sort of "bad trajectory handler"
      option?
    • Change the default value of a field: OK?
  • Basic cases for re-reading a trajectory store. Sₒ is the output field
    set, Sᵣ is the read field set, f is a field and f(S) is the metadata for
    field f in field set S (field identities are governed purely by field
    /names/):

    Case  f ∈ Sₒ  f ∈ Sᵣ  f(Sₒ) = f(Sᵣ)  Result
    1     T       T       T              All OK.
    2     T       T       F              Multiple possibilities; see below.
    3     T       F       -              Ignore field f on read.
    4     F       T       -              OK if f(Sᵣ) has a default; otherwise an error.
  • Cases 1, 3 and 4 are simple. Case 2 depends on the exact differences in
    the metadata. (Units could be handled more leniently if using a units
    package like Pint but, for the moment, any changes in units have to be
    treated as incompatible.)

    Field       Condition
    type        Conversion f(Sₒ).type → f(Sᵣ).type not possible ⇒ incompatible.
    dimensions  Difference ⇒ incompatible.
    units       Difference ⇒ incompatible.
    required    T→F ⇒ compatible; F→T ⇒ incompatible if missing.
    default     Always compatible.
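
The rules in the two tables above could be sketched as a field set comparison operation along the following lines. This is only an illustration, not a proposal for the real API: the `Field` class, the `Verdict` names, the `CONVERTIBLE` table, the example field names, and the use of `None` to mean "no default" are all assumptions made for the sketch. Field sets are modelled as plain dicts keyed by field name, since field identity is governed purely by name.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class Field:
    """Hypothetical field metadata, mirroring the attributes in the table."""
    type: str
    dimensions: tuple
    units: str
    required: bool = False
    default: Any = None      # assumption: None means "no default"
    description: str = ""


class Verdict(Enum):
    OK = "ok"
    IGNORE = "ignore"              # case 3: field only in the saved set
    USE_DEFAULT = "use-default"    # case 4: only in the declared set, default exists
    CHECK_DATA = "check-data"      # required F→T: incompatible only if values are missing
    INCOMPATIBLE = "incompatible"


# Assumed (saved type, declared type) pairs that count as convertible.
CONVERTIBLE = {("int", "float"), ("float", "double")}


def compare_field(saved: Optional[Field], declared: Optional[Field],
                  strict: bool = False) -> Verdict:
    """Compare one field between the saved (Sₒ) and declared (Sᵣ) sets."""
    if saved is None and declared is None:
        raise ValueError("field is in neither field set")
    if declared is None:                       # case 3
        return Verdict.IGNORE
    if saved is None:                          # case 4
        if declared.default is not None:
            return Verdict.USE_DEFAULT
        return Verdict.INCOMPATIBLE
    # Cases 1 and 2: the field is in both sets, so compare metadata.
    if saved.type != declared.type and (saved.type, declared.type) not in CONVERTIBLE:
        return Verdict.INCOMPATIBLE
    if saved.dimensions != declared.dimensions:
        return Verdict.INCOMPATIBLE
    if saved.units != declared.units:          # no unit conversion for now
        return Verdict.INCOMPATIBLE
    if strict and saved.description != declared.description:
        return Verdict.INCOMPATIBLE
    if declared.required and not saved.required:
        return Verdict.CHECK_DATA
    return Verdict.OK                          # default changes are always fine


def compare_field_sets(saved: Dict[str, Field], declared: Dict[str, Field],
                       strict: bool = False) -> Dict[str, Verdict]:
    """Return a per-field verdict over the union of field names."""
    names = sorted(saved.keys() | declared.keys())
    return {n: compare_field(saved.get(n), declared.get(n), strict) for n in names}


# Made-up field sets that exercise cases 1 to 4.
saved = {
    "time": Field("double", ("time",), "s", required=True),
    "co2": Field("float", ("time",), "kg"),
    "old_flag": Field("int", (), ""),
}
declared = {
    "time": Field("double", ("time",), "s", required=True),
    "co2": Field("float", ("time",), "g"),               # units changed
    "nox": Field("float", ("time",), "kg", default=0.0),  # new field with default
}
report = compare_field_sets(saved, declared)
for name, verdict in report.items():
    print(name, verdict.value)
```

Returning a per-field verdict rather than a single yes/no leaves room for the "exclude bad fields" and "bad trajectory handler" options discussed above: the reader can decide what to do with each `INCOMPATIBLE` or `CHECK_DATA` entry.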

Labels

post-v1: Issues for triage when we start thinking about work after V1 release.
