-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clean-up and implement the conversion methods #11
Conversation
@jthielen, @TomNicholas: here's an overview of the current implementation of usage examples of DataArray.pint.toIn [1]: import xarray as xr
...: import numpy as np
...: import pintxarray
...: import pint
...: ureg = pint.UnitRegistry()
In [2]: da = xr.DataArray(
...: dims="x",
...: data=np.linspace(0, 1, 4) * ureg.m,
...: coords={"u": ("x", [2, 4, 6, 8] * ureg.s)},
...: )
In [3]: da.pint.to("mm")
Out[3]:
<xarray.DataArray (x: 4)>
<Quantity([ 0. 333.33333333 666.66666667 1000. ], 'millimeter')>
Coordinates:
u (x) int64 <Quantity([2 4 6 8], 'second')>
Dimensions without coordinates: x
In [4]: da.pint.to(ureg.mm)
Out[4]:
<xarray.DataArray (x: 4)>
<Quantity([ 0. 333.33333333 666.66666667 1000. ], 'millimeter')>
Coordinates:
u (x) int64 <Quantity([2 4 6 8], 'second')>
Dimensions without coordinates: x
In [5]: da.pint.to({da.name: "mm"})
Out[5]:
<xarray.DataArray (x: 4)>
<Quantity([ 0. 333.33333333 666.66666667 1000. ], 'millimeter')>
Coordinates:
u (x) int64 <Quantity([2 4 6 8], 'second')>
Dimensions without coordinates: x
In [6]: da.pint.to("mm", u="ms")
Out[6]:
<xarray.DataArray (x: 4)>
<Quantity([ 0. 333.33333333 666.66666667 1000. ], 'millimeter')>
Coordinates:
u (x) float64 <Quantity([2000. 4000. 6000. 8000.], 'millisecond')>
Dimensions without coordinates: x
In [7]: da.pint.to({da.name: "mm", "u": "ms"})
Out[7]:
<xarray.DataArray (x: 4)>
<Quantity([ 0. 333.33333333 666.66666667 1000. ], 'millimeter')>
Coordinates:
u (x) float64 <Quantity([2000. 4000. 6000. 8000.], 'millisecond')>
Dimensions without coordinates: x
In [8]: da.pint.to({da.name: "mm"}, u="ms")
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-9-e092d5c447d0> in <module>
----> 1 da.pint.to({da.name: "mm"}, u="ms")
.../pintxarray/accessors.py in to(self, units, registry, **unit_kwargs)
265 )
266
--> 267 units = either_dict_or_kwargs(units, unit_kwargs, "to")
268
269 return conversion.convert_units(self.da, units)
.../pintxarray/accessors.py in either_dict_or_kwargs(positional, keywords, method_name)
43 )
44 if keywords:
---> 45 raise ValueError(
46 "cannot specify both keyword and positional "
47 f"arguments to .{method_name}"
ValueError: cannot specify both keyword and positional arguments to .to usage examples of Dataset.pint.toIn [1]: import xarray as xr
...: import numpy as np
...: import pintxarray
...: import pint
...: ureg = pint.UnitRegistry()
In [2]: ds = xr.Dataset(
...: data_vars={"a": ("x", np.linspace(0, 1, 4) * ureg.m)},
...: coords={"u": ("x", [2, 4, 6, 8] * ureg.ms)},
...: )
In [3]: ds.pint.to(a="mm", u="ms")
Out[3]:
<xarray.Dataset>
Dimensions: (x: 4)
Coordinates:
u (x) int64 <Quantity([2 4 6 8], 'millisecond')>
Dimensions without coordinates: x
Data variables:
a (x) float64 <Quantity([ 0. 333.33333333 666.66666667...
In [4]: ds.pint.to({"a": "mm", "u": "ms"})
Out[4]:
<xarray.Dataset>
Dimensions: (x: 4)
Coordinates:
u (x) int64 <Quantity([2 4 6 8], 'millisecond')>
Dimensions without coordinates: x
Data variables:
a (x) float64 <Quantity([ 0. 333.33333333 666.66666667...
In [5]: ds.pint.to({"a": "mm"}, u="ms")
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-49ddaee6c1c1> in <module>
----> 1 ds.pint.to({"a": "mm"}, u="ms")
.../pintxarray/accessors.py in to(self, units, **unit_kwargs)
376
377 def to(self, units=None, **unit_kwargs):
--> 378 units = either_dict_or_kwargs(units, unit_kwargs, "to")
379
380 return conversion.convert_units(self.ds, units)
.../pintxarray/accessors.py in either_dict_or_kwargs(positional, keywords, method_name)
43 )
44 if keywords:
---> 45 raise ValueError(
46 "cannot specify both keyword and positional "
47 f"arguments to .{method_name}"
ValueError: cannot specify both keyword and positional arguments to .to In a future PR we could then add support for contexts via a Also, we could add pseudo-support for converting units in indexes: take the string unit from the |
Also, @TomNicholas: there's still only a single check visible. I can't seem to be able to set up an account so I can't verify this, but reading the documentation it seems that it needs read / write access to checks. |
@TomNicholas, this should be ready for review and merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me!
One comment: there is now this separation between high-level functions (e.g. attach_units
) and internal ones (e.g. array_attach_units
). Should they be separated with preceding underscores? Should some of the top-level ones be public API? In particular the testing assert functions might be useful for users or downstream developers.
pintxarray/tests/test_conversion.py
Outdated
pytest.param(None, id="without registry"), | ||
pytest.param(unit_registry, id="with registry"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not know you could assign tests ids like this - cool!
hmm... for now I'd say let's consider everything imported by I agree that adding a |
@keewis The implementation of |
Also, @jthielen, if you already implemented the pseudo-index support once it might be faster for you to do so again. Go ahead if you want to and have the time. |
ah, with the relocation the CI setup is gone. @TomNicholas, do you think you could set this up again? |
I added the docstrings (and found two bugs in the process), so this is ready for another round of reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me pending CI getting fixed and passing. Just a couple questions (that don't need to hold up merging) on things I noticed when looking at this closer:
|
||
units = either_dict_or_kwargs(units, unit_kwargs, "to") | ||
|
||
return conversion.convert_units(self.da, units) | ||
|
||
def to_base_units(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should to_base_units
also be adjusted analogously to use a util from conversion
or is it fine as-is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_base_units
uses Quantity.to_base_units
, so I'm not quite sure how we'd be able to use a conversion
function here. Did you want to add a function to conversion
that calls .to_base_units
? If so, then yes, we could use that to reduce duplicated code.
We should also allow passing in a dict
/ kwargs to control which of the DataArray
's variables should be converted (defaulting to "convert all variables").
Both should be a new PR, though.
units = either_dict_or_kwargs(units, unit_kwargs, "to") | ||
|
||
return conversion.convert_units(self.ds, units) | ||
|
||
def to_base_units(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here: not sure how to use conversion
, but we should definitely add a possibility to only convert certain variables.
Thanks for flagging that the CI has gone. I think I've fixed it - @keewis can you just push an empty commit to this branch to see if that triggers the CI on this PR? |
the CI seems to work again |
if the code citing style I used (stating the import path) is acceptable, this should be ready to be merged. I'm not quite sure what the best citing style would be, so if we agree on a certain style I'm happy to use that instead. |
@keewis Not to get too nit-picky, but I believe the terms of the Apache License require us to include "all copyright, patent, trademark, and attribution notices from the Source form of the Work" along with "a copy of this License", and to also state changes. So, since we're already including the Apache License (since that is |
You're not, I just have zero experience with code citing. I looked at the xarray code base for reference, and there the import style reference is mostly used, but then there's also the
while the license is included in their So now, I'm thinking of using this:
and including |
That sounds great except for one thing: it doesn't include the copyright notice (since Apache, unlike BSD and MIT, doesn't include the copyright notice as part of the standard license text). So, I think you'll just have to add that somewhere. At the top of |
sure, although |
Hmm...I'm pretty sure then pandas messed up on that one, since the license is clear about requiring "all copyright, patent, trademark, and attribution notices". I'll submit a PR to pandas and see what they say. |
thanks |
done. Once you got a response from the pandas devs, we can change this again. Maybe it's worth pinging @shoyer to resolve this? |
so if the code citation good enough for now, let's merge? |
As described in #6, this aims to implement working
Dataset.pint.to
andDataArray.pint.to
methods.