Conversation
kbattocchi
left a comment
There was a problem hiding this comment.
Looks like a good start but there are a few things that should be cleaned up.
econml/data/dgps.py
Outdated
| return T, X | ||
|
|
||
|
|
||
| class StandardDGP(): |
There was a problem hiding this comment.
(very minor)
| class StandardDGP(): | |
| class StandardDGP: |
econml/data/dgps.py
Outdated
| return T, X | ||
|
|
||
|
|
||
| class StandardDGP(): |
There was a problem hiding this comment.
Add docstring, especially since some arguments can be a function or a dict.
econml/data/dgps.py
Outdated
| else: # else must be dict | ||
| if nuisance_Y is None: | ||
| nuisance_Y = {'support': self.d_x, 'degree': 1} | ||
| nuisance_Y['k'] = self.d_x |
There was a problem hiding this comment.
Consider asserting the expected type here, something like:
| nuisance_Y['k'] = self.d_x | |
| assert(isinstance(nuisance_Y, dict), f"nuisance_Y must be a callable or dict, but got {type(nuisance_Y)}") | |
| nuisance_Y['k'] = self.d_x |
(and likewise for the other similar arguments)
|
|
||
| def gen_Y(self): | ||
| self.y_noise = np.random.normal(size=(self.n, self.d_y), scale=self.y_eps) | ||
| self.Y = self.theta(self.X) * self.y_of_t(self.T) + self.nuisance_Y(self.X) + self.y_noise |
There was a problem hiding this comment.
It looks like y_of_t is really a treatment featurizer, is that right? If so, change the name accordingly.
econml/data/dgps.py
Outdated
| else: # else must be dict | ||
| if nuisance_Y is None: | ||
| nuisance_Y = {'support': self.d_x, 'degree': 1} | ||
| nuisance_Y['k'] = self.d_x |
There was a problem hiding this comment.
Setting nuisance_Y['k'] is not strictly necessary since the default behavior of gen_nuisance will do this for you, but perhaps it's worth being explicit.
More importantly, making this assignment will mutate the dictionary that was passed in, which is possibly undesirable; probably better to do something like:
| nuisance_Y['k'] = self.d_x | |
| nuisance_Y = { **nuisance_Y, 'k':self.d_x } |
which will create a new dictionary instead of altering the old one.
Furthermore, what if the dictionary already had a 'k' entry - do you really want to ignore and override it?
econml/data/dgps.py
Outdated
| self.t_eps = t_eps | ||
|
|
||
| def gen_Y(self): | ||
| self.y_noise = np.random.normal(size=(self.n, self.d_y), scale=self.y_eps) |
There was a problem hiding this comment.
Consider allowing the user to specify a seed so that results are reproducible.
econml/data/dgps.py
Outdated
| return T, X | ||
|
|
||
|
|
||
| class StandardDGP(): |
There was a problem hiding this comment.
This should probably support generating and using W as well.
econml/data/dgps.py
Outdated
| return T, X | ||
|
|
||
|
|
||
| class StandardDGP(): |
There was a problem hiding this comment.
This should probably support using a user-specified featurizer on X.
econml/data/dgps.py
Outdated
| return T, X | ||
|
|
||
|
|
||
| class StandardDGP(): |
There was a problem hiding this comment.
Add tests to at least verify that the API can actually be used and that you get arrays of the right dimensions out, etc., even if there aren't good checks for the actual data (but perhaps there are useful checks there, too)
econml/data/dgps.py
Outdated
| return T, X | ||
|
|
||
|
|
||
| class StandardDGP(): |
There was a problem hiding this comment.
Do we expect users to use this class to test things? If so, perhaps also add a DGP section to the documentation (and at least add it to the module docs). But if this is mainly for internal use, add a comment to the docstring indicating that instead.
5961cbf to
47ad1ca
Compare
No description provided.