pass through attributes when creating metrics, dimensions, identifiers#18
pass through attributes when creating metrics, dimensions, identifiers#18
Conversation
| @transforms.setter | ||
| def transforms(self, transforms): | ||
| # TODO: Check structure of transforms dict | ||
| if not transforms: |
There was a problem hiding this comment.
nonsubstantive, but I do like oneliners
There was a problem hiding this comment.
Aye... I like this better too. I think I left it uncollapsed because I was expecting to do more logic than would be elegant in a one-liner. For now, this works well.
mensor/measures/types.py
Outdated
| else: | ||
| raise KeyError("No such attribute {}.".format(attr)) | ||
| raise AttributeError( | ||
| "Cannot initialize {}<{}> with attribute {}.".format(self.__class__.__name__, self.name, attr) |
There was a problem hiding this comment.
Can you quote the attribute name in single quotes?
|
I don't know how I feel about passing through all extra keyword arguments like this. There are some attributes which have been intentionally masked by the subclass constructors. Doing this re-opens direct access to these attributes, but does on the other hand simplify setting them. Most of these attributes are definitely not supposed to be touched. The transforms were intentionally not made something that measure providers could set, because that can lead to unintuitive and confusing behaviour. Can you give me some examples of where it is necessary? |
ef3ba16 to
8009d35
Compare
| @classmethod | ||
| def _on_registered(cls, key): | ||
| for agg in ['sum', 'mean', 'sos', 'count']: | ||
| for agg in ['sum', 'mean', 'sos', 'count', '1']: |
There was a problem hiding this comment.
perhaps rename to "any"?
| def __init__(self, name, expr=None, default=None, desc=None, shared=False, partition=False, requires_constraint=False, provider=None): | ||
| _ProvidedFeature.__init__(self, name, expr=expr, default=default, desc=desc, shared=shared, provider=provider) | ||
| def __init__(self, name, expr=None, default=None, desc=None, shared=False, partition=False, requires_constraint=False, provider=None, **attrs): | ||
| _ProvidedFeature.__init__(self, name, expr=expr, default=default, desc=desc, shared=shared, provider=provider, **attrs) |
There was a problem hiding this comment.
as discussed, perhaps we should be more explicit with which attrs get sent through
- add '1' SQL transform, aka "any"
8009d35 to
5a33132
Compare
| } | ||
|
|
||
| if isinstance(self.transforms, dict): | ||
| transforms.update(self.transforms.get('_default', {})) |
There was a problem hiding this comment.
default transforms for all unit types?
There was a problem hiding this comment.
I remain unconvinced, I'm afraid, that this is a good idea, as it leads to confusing (and usually wrong) behaviour :/. Suppose we chose count as the default aggregation for some measure. That might make sense the first time it is aggregated, but on subsequent aggregations (which may occur due to multiple rebase operations, for example), you would be taking the count of counts at each aggregations, which would make the resulting measure likely meaningless.
|
@danfrankj I still need to think about this a bit more deeply. It seems important to me that One alternative approach to that pursued here is to encourage things like this to be implemented as metrics, rather than measures. Is there a reason why this would not work well in this case? |
|
@danfrankj We've spoken about this a little through other channels, but just to keep it all in one place: I have given some more thought to the aggregation configuration issue. Are there any cases that you can think of where it makes sense to change the aggregation past the first aggregation of a given measure? I'm thinking of adding support at the provider level for specifying what the next aggregation will be; perhaps via something like If we allow further configuration, it seems to me that there would be an explosion of complexity, not just in building the providers, but in understanding the behaviour of new providers. Measures can be aggregated over different unit types, potentially multiple times, and if a measure provider can affect downstream aggregations beyond the first aggregation, then a lot of weird (and probably incorrect) operations will occur. |
Summary
When configuring measures, I need to be able to configure additional properties, especially transforms. This change passes additional configuration up the chain.
e,g,
enables answering questions like ...
Testing Done
Manual