Version handling for field sets #123
Description
Either add explicit versioning to field sets or provide comprehensive compatibility checking when reading trajectory stores. (I'm leaning towards the latter option right now.)
Some notes:
- What do we want versioning in this sense actually to do?
  - We can reconstruct the field set used to save a store from the NetCDF
    group information, so we can see both the "saved" and "declared"
    version of the field set.
  - Need some sort of field set comparison operation to determine
    incompatibilities.
  - Do we need /version numbers/ or /version identifiers/ on field sets at all?
    Or can we get away just with comparing the "saved" and "declared" field
    sets? For built-in field sets (so far, just "base" and "emissions") the
    version of the field set is the software version, so we don't need a
    separate version ID, but maybe we need to add an optional version ID
    for "external" field sets. [NEEDS THINKING]
  - If we have a version number, do we somehow need to provide access to
    the whole history of version changes for a field set? I don't think so:
    we just need to be able to say whether the versions are different or
    not, and whether the field sets are compatible or not.
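As a starting point for the comparison operation mentioned above, here is a minimal sketch that diffs a "saved" and a "declared" field set by field name. The `Field` class, its attribute names, and the example fields (`fuel_flow`, `altitude`, `nox`) are all hypothetical stand-ins, not the project's actual types.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Field:
    """Hypothetical field metadata; the attribute names are assumptions."""
    type: str
    dimensions: tuple = ()
    units: Optional[str] = None
    required: bool = False
    default: Any = None
    description: str = ""


def diff_field_sets(saved: dict, declared: dict) -> dict:
    """Compare two field sets, each a dict mapping field name to Field.

    Field identity is decided purely by name.
    """
    saved_names, declared_names = set(saved), set(declared)
    common = saved_names & declared_names
    return {
        "only_saved": sorted(saved_names - declared_names),
        "only_declared": sorted(declared_names - saved_names),
        "changed": sorted(n for n in common if saved[n] != declared[n]),
    }


# Made-up example field sets.
saved = {
    "fuel_flow": Field("float32", ("time",), "kg/s", required=True),
    "altitude": Field("float32", ("time",), "m", required=True),
}
declared = {
    "fuel_flow": Field("float64", ("time",), "kg/s", required=True),
    "nox": Field("float32", ("time",), "kg", default=0.0),
}

result = diff_field_sets(saved, declared)
versions_differ = any(result.values())
```

With a diff like this, "are the versions different?" reduces to checking whether any of the three lists is non-empty, which may be enough to avoid explicit version numbers for built-in field sets.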
- Need to think about:
  - Reading an older store version with a newer field set.
  - Reading a newer store version with an older field set?
  - Just reading a store with a different version of a field set than it
    was created with.
  - The older/newer distinction only makes sense for adding/removing
    fields. For changing the details of any field, only "different" is
    relevant.
- Things that might change in a field set:
  - Add a new field: read new store with older field set ⇒ just ignore the
    added field; read old store with newer field set ⇒ OK if there's a
    default value, otherwise incompatible.
  - Remove a field: read old store with newer field set ⇒ just ignore the
    deleted field; read new store with older field set ⇒ OK if there's a
    default value, otherwise incompatible.
  - Change the type of a field: if the types are compatible/convertible,
    that's no problem, and if they're not, the files with different
    versions are totally incompatible ⇒ allow option to exclude bad fields
    when reading store?
  - Change the dimensions of a field: old/new fields will be incompatible ⇒
    option to exclude.
  - Change the description of a field: benign? ⇒ "strict" option to catch
    even these small changes?
  - Change the units of a field: if the units are convertible, OK, convert
    them, otherwise incompatible ⇒ introduce Pint for conversions?
  - Change the required status of a field: required=false → required=true ⇒
    incompatible if input is missing; option to exclude whole trajectories
    when reading a store, or maybe some sort of "bad trajectory handler"
    option?
  - Change the default value of a field: OK?
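The "exclude bad fields" and "strict" options floated above could be combined roughly as follows. This is only a sketch of one possible policy: `ReadOptions`, `select_fields`, `IncompatibleFieldError` and the shape of the `changes` argument are all hypothetical, and it assumes that convertible differences have already been filtered out, that a changed default never blocks a read, and that a changed description blocks only in strict mode.

```python
from dataclasses import dataclass


class IncompatibleFieldError(Exception):
    """Raised when a field cannot be read and excluding it is not allowed."""


@dataclass(frozen=True)
class ReadOptions:
    strict: bool = False              # also reject description changes
    exclude_bad_fields: bool = False  # drop bad fields instead of failing


def select_fields(changes, options=ReadOptions()):
    """Return the names of the fields that can be read.

    `changes` maps each field name to the set of metadata attributes that
    differ between the saved and declared field sets.
    """
    ignorable = {"default"} if options.strict else {"default", "description"}
    keep = []
    for name, changed in changes.items():
        blocking = set(changed) - ignorable
        if not blocking:
            keep.append(name)
        elif not options.exclude_bad_fields:
            raise IncompatibleFieldError(
                f"{name}: incompatible change in {sorted(blocking)}"
            )
    return keep


# Made-up example: one unchanged field, one benign change, one bad change.
changes = {
    "altitude": set(),
    "fuel_flow": {"description"},
    "nox": {"dimensions"},
}

lenient = select_fields(changes, ReadOptions(exclude_bad_fields=True))
strict = select_fields(changes, ReadOptions(strict=True, exclude_bad_fields=True))
```

Excluding whole trajectories, or passing them to a "bad trajectory handler", would need a similar decision at the level of individual trajectories and is not covered here.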
- Basic cases for re-reading a trajectory store. Sₒ is the output field
  set, Sᵣ is the read field set, f is a field and f(S) is the metadata for
  field f in field set S (field identities are governed purely by field
  /names/):

  | Case | f ∈ Sₒ | f ∈ Sᵣ | f(Sₒ) = f(Sᵣ) | Result |
  |------|--------|--------|---------------|--------|
  | 1 | T | T | T | All OK. |
  | 2 | T | T | F | Multiple possibilities; see below. |
  | 3 | T | F | n/a | Ignore field f on read. |
  | 4 | F | T | n/a | OK if f(Sᵣ) has a default; otherwise an error. |
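The four cases can be expressed as a small classification function. The sketch below assumes each field set is a plain dict mapping field name to a metadata dict; `ReadCase`, `classify` and `case4_ok` are hypothetical names, and the example fields are made up.

```python
from enum import Enum


class ReadCase(Enum):
    IDENTICAL = 1         # f in both sets, metadata equal
    METADATA_DIFFERS = 2  # f in both sets, metadata differs
    ONLY_IN_STORE = 3     # f only in the output set: ignore on read
    ONLY_IN_READER = 4    # f only in the read set: needs a default


def classify(name, saved, declared):
    """Classify field `name` given the saved (output) and declared (read) sets."""
    in_saved, in_declared = name in saved, name in declared
    if in_saved and in_declared:
        if saved[name] == declared[name]:
            return ReadCase.IDENTICAL
        return ReadCase.METADATA_DIFFERS
    if in_saved:
        return ReadCase.ONLY_IN_STORE
    if in_declared:
        return ReadCase.ONLY_IN_READER
    raise KeyError(name)


def case4_ok(metadata):
    """Case 4 is readable only if the declared field carries a default."""
    return "default" in metadata


saved = {
    "altitude": {"units": "m"},
    "fuel_flow": {"units": "kg/s"},
    "mach": {"units": None},
}
declared = {
    "altitude": {"units": "m"},
    "fuel_flow": {"units": "g/s"},
    "nox": {"units": "kg", "default": 0.0},
}

cases = {
    name: classify(name, saved, declared)
    for name in sorted(set(saved) | set(declared))
}
```

Here `altitude` is case 1, `fuel_flow` is case 2, `mach` is case 3 and `nox` is case 4 (readable, because it declares a default).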
- Cases 1, 3 and 4 are simple. Case 2 depends on the exact differences in
  the metadata. (Units could be handled more leniently if using a units
  package like Pint but, for the moment, any changes in units have to be
  treated as incompatible.)

  | Field | Condition |
  |-------|-----------|
  | type | Conversion f(Sₒ).type → f(Sᵣ).type not OK ⇒ incompatible. |
  | dimensions | Difference ⇒ incompatible. |
  | units | Difference ⇒ incompatible. |
  | required | T→F ⇒ compatible; F→T ⇒ incompatible if missing. |
  | default | Always compatible. |
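The case 2 rules translate fairly directly into a per-field check. The following is a sketch under assumptions: field metadata is a plain dict, the `WIDENING` table of allowed type conversions is an invented placeholder for whatever "compatible/convertible" ends up meaning, and `has_missing_values` stands for "the store lacks a value for this field in at least one trajectory".

```python
# Hypothetical set of (saved type, declared type) pairs considered convertible.
WIDENING = {
    ("float32", "float64"),
    ("int32", "int64"),
    ("int32", "float64"),
}


def convertible(src, dst):
    """True if values saved as `src` can be read as `dst` (assumed rule)."""
    return src == dst or (src, dst) in WIDENING


def metadata_conflicts(saved, declared, has_missing_values=False):
    """Return the metadata attributes that make a case 2 field incompatible.

    An empty list means the field can be read.
    """
    problems = []
    if not convertible(saved["type"], declared["type"]):
        problems.append("type")
    if tuple(saved["dimensions"]) != tuple(declared["dimensions"]):
        problems.append("dimensions")
    # No unit conversion for now: any difference is incompatible.
    if saved.get("units") != declared.get("units"):
        problems.append("units")
    # required T -> F is fine; F -> T fails only if values are missing.
    if not saved["required"] and declared["required"] and has_missing_values:
        problems.append("required")
    # A changed default is always compatible, so it is not checked.
    return problems


saved = {"type": "float32", "dimensions": ("time",), "units": "m",
         "required": False, "default": 0.0}
declared = {"type": "float64", "dimensions": ("time",), "units": "ft",
            "required": True, "default": 1.0}

complete_store = metadata_conflicts(saved, declared)
sparse_store = metadata_conflicts(saved, declared, has_missing_values=True)
```

If Pint were adopted later, the units comparison is the only line that would need to change, to a convertibility check rather than strict equality.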