Adopt NEP-37 and NEP-47 (array module and Python Array API) #1592

Open
jthielen opened this issue Sep 29, 2022 · 7 comments
Labels
enhancement numpy Numpy related bug/enhancement

Comments

@jthielen
Contributor

While Pint was a reasonably early adopter of NEP 18 (__array_function__) for array type compatibility, there have been several recent compatibility efforts in the NumPy ecosystem (and among Python array libraries more generally) that Pint has not yet taken advantage of, particularly the open NEPs 37 and 47 and, with them, the Python Array API standard. It would be wonderful to see these implemented in Pint, and I believe it should not be too onerous a task (at least compared to the initial NEP 18 implementation). Perhaps all that would be needed is exposing the registries of functions in https://github.com/hgrecco/pint/blob/master/pint/facets/numpy/numpy_func.py as a module, which could then be returned as appropriate from the __array_module__ and __array_namespace__ protocols? That said, given that __array_namespace__ denotes compliance with the Python Array API, perhaps rigorous testing against the API standard should be done before implementing it?

xref pydata/xarray#7067, pydata/duck-array-discussion#3
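
A minimal sketch of the idea in the comment above, using a toy wrapper rather than Pint's actual classes: a registry of unit-aware functions is exposed as a module-like namespace, and the wrapper returns it from `__array_namespace__`. `ToyQuantity`, `_REGISTRY`, `implements`, and the two registered functions are hypothetical names for illustration; only the `__array_namespace__` method name comes from the Array API standard, and plain lists stand in for a real array library.

```python
import math
import types


class ToyQuantity:
    """Hypothetical stand-in for a unit-aware array wrapper (not pint.Quantity)."""

    def __init__(self, magnitude, units):
        self.magnitude = list(magnitude)
        self.units = units

    def __array_namespace__(self, api_version=None):
        # The Array API standard asks array objects to return the namespace
        # that implements the standard's functions for them.
        return toy_namespace


# Hypothetical registry of unit-aware implementations, keyed by function name.
_REGISTRY = {}


def implements(name):
    """Register a function under the given Array API function name."""
    def decorator(func):
        _REGISTRY[name] = func
        return func
    return decorator


@implements("sqrt")
def sqrt(x):
    # Toy unit handling: record the square root symbolically in the unit string.
    return ToyQuantity([math.sqrt(v) for v in x.magnitude], f"({x.units})**0.5")


@implements("add")
def add(x, y):
    if x.units != y.units:
        raise ValueError("unit mismatch")
    return ToyQuantity([a + b for a, b in zip(x.magnitude, y.magnitude)], x.units)


# Expose the registry as a module-like object.
toy_namespace = types.SimpleNamespace(**_REGISTRY)

q = ToyQuantity([4.0, 9.0], "m**2")
xp = q.__array_namespace__()  # consumers obtain the namespace from the array itself
root = xp.sqrt(q)
```

A consumer such as xarray would only ever call `xp.<function>` and never import the wrapper's functions directly, which is why a registry keyed by function name maps naturally onto this protocol.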

@hgrecco
Owner

hgrecco commented Sep 30, 2022

I fully agree that we can move forward. One nice thing about the current facets organization is that we can create a facets/new_numpy/ facet and test it side by side.

Is there a good testsuite for compliance with the API that we can use?

@tomwhite

> Is there a good testsuite for compliance with the API that we can use?

The test suite at https://github.com/data-apis/array-api-tests is very thorough, and easy to set up on CI too.

@MichaelTiemannOSC
Collaborator

The work I've been doing on uncertainties might be good grist for this mill.

xref #1611, #1615
xref hgrecco/pint-pandas#139 hgrecco/pint-pandas#140

@TomNicholas
Contributor

I would really like to see pint conform to the array API standard. That effort is really being broadly adopted, and helps out wrapper libraries like xarray immensely. It would particularly help fix this frustrating issue with pint-xarray having to set force_ndarray_like = True, see xarray-contrib/pint-xarray#216.

@hgrecco
Owner

hgrecco commented Dec 13, 2023

I would like to move this forward. Should we schedule it for 0.24? We need a PR, tests, and a good deprecation policy if some behavior needs to go away. See also #1895

@keewis
Contributor

keewis commented Dec 26, 2023

As mentioned above, we've been discussing this a bit in the pint-xarray issue. In short, xarray uses a few numpy protocols and the array properties dtype, shape, and ndim to detect and cast scalars to numpy arrays. However, pint defines all of them but will raise / return None / return NotImplemented if it actually wraps a scalar. This means that hasattr checks don't work (though I only recently learned that hasattr is basically

def hasattr(obj, attr):
    try:
        getattr(obj, attr)
    except AttributeError:  # Python 3 only swallows AttributeError; other exceptions propagate
        return False
    else:
        return True

which means that properties are always executed, and the checks could become getattr(obj, attr, NotImplemented) is not NotImplemented or something similar without significant additional cost).
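
To make the point above concrete, here is a small sketch of the difference between a plain `hasattr` check and the sentinel check mentioned in the comment. `ScalarWrapper` and `has_usable_attr` are hypothetical names, and the wrapper only imitates the described behavior (an attribute that is defined but returns `NotImplemented` for scalars); it is not how pint is implemented. Note that `getattr` with a default only suppresses `AttributeError`, so a property that raises some other exception would still propagate.

```python
class ScalarWrapper:
    """Hypothetical wrapper that defines array attributes even for scalars."""

    def __init__(self, magnitude):
        self.magnitude = magnitude
        self.calls = 0

    @property
    def ndim(self):
        self.calls += 1  # count how often the property body runs
        if isinstance(self.magnitude, (int, float)):
            return NotImplemented  # attribute exists but is not meaningful
        return 1


def has_usable_attr(obj, attr):
    # Sentinel check: a missing attribute and one that evaluates to
    # NotImplemented are treated the same way.
    return getattr(obj, attr, NotImplemented) is not NotImplemented


scalar = ScalarWrapper(1.5)
array = ScalarWrapper([1.0, 2.0])

plain_check = hasattr(scalar, "ndim")             # runs the property; attribute "exists"
sentinel_check = has_usable_attr(scalar, "ndim")  # runs it again; sees NotImplemented
array_check = has_usable_attr(array, "ndim")
```

Both checks execute the property exactly once per call, so the sentinel version costs the same as `hasattr` while also rejecting attributes that are present but unusable.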

One option to resolve this would be to split into QuantityScalar and QuantityArray, but I still think this is not a good idea, since predicting whether a return value is a scalar or an array will not be easy.

A different option, which I think would be much easier to deal with, is to have separate modes of the registry: a scalar mode, where any function takes a scalar and returns a scalar, and an array mode that follows the array API and takes arrays and returns arrays (and scalars should be passed through pint.Quantity.__array_namespace__().asarray(), as the array API does not allow interaction with scalars).

Unlike force_ndarray_like=True – which is a registry-global option – this would allow using the same registry in scalar workflows and array workflows without interfering with each other.
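
A rough sketch of what two per-call modes on a single registry could look like, as opposed to one registry-wide flag. Everything here is hypothetical (`ToyRegistry`, its `scalar` and `array` methods, and `ToyQuantity` are not part of pint), and a plain list stands in for an array; a real implementation would produce a 0-d array from a scalar rather than a one-element list.

```python
class ToyQuantity:
    """Hypothetical unit-carrying value (not pint.Quantity)."""

    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units


class ToyRegistry:
    """Hypothetical registry exposing two modes instead of one global flag."""

    def scalar(self, value, units):
        # Scalar mode: scalars in, scalars out.
        if isinstance(value, (list, tuple)):
            raise TypeError("scalar mode only accepts scalars")
        return ToyQuantity(value, units)

    def array(self, value, units):
        # Array mode: every input is promoted to an array-like magnitude,
        # so downstream code never has to special-case scalars.
        if not isinstance(value, (list, tuple)):
            value = [value]
        return ToyQuantity(list(value), units)


ureg = ToyRegistry()
height = ureg.scalar(1.8, "m")   # stays a scalar
heights = ureg.array(1.8, "m")   # promoted to an array-like magnitude
```

Because the mode is chosen at the call site, the same `ureg` instance can serve both workflows, which is the contrast with a registry-global option drawn above.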

@burnpanck

> One option to resolve this would be to split into QuantityScalar and QuantityArray, but I still think this is not a good idea, since predicting whether a return value is a scalar or an array will not be easy.

I don't see why that would be a problem. If you are a QuantityScalar, you are low in the array hierarchy, and you can safely assume all operations return scalars too; otherwise, a higher object would have been responsible. On the other hand, if you are a QuantityArray, you will be delegating most operations to an array library handling the magnitude, and carrying the units information through a side channel. Then, you simply inspect the return value you got from that array library to see whether it is a scalar or not. Performance-wise, that is O(1), just like the unit operations are.

On the other hand, there are other reasons why a separation would be helpful, see #1128 (comment) (i.e., scalars should not conform to the array API).
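
The delegation argument above could be sketched roughly as follows. The class names `QuantityScalar` and `QuantityArray` come from the discussion, but the bodies are purely illustrative assumptions: plain Python lists stand in for the array library handling the magnitude, and `_wrap` is a hypothetical helper that does the O(1) inspection of the returned value.

```python
from numbers import Number


class QuantityScalar:
    """Illustrative scalar quantity; operations on it stay scalar."""

    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units

    def __mul__(self, factor):
        # A scalar only ever combines with scalars here, so the result is a scalar.
        return QuantityScalar(self.magnitude * factor, self.units)


class QuantityArray:
    """Illustrative array quantity that delegates to the magnitude's library."""

    def __init__(self, magnitude, units):
        self.magnitude = list(magnitude)
        self.units = units

    def _wrap(self, result):
        # O(1) inspection of what the "array library" returned.
        if isinstance(result, Number):
            return QuantityScalar(result, self.units)
        return QuantityArray(result, self.units)

    def sum(self):
        return self._wrap(sum(self.magnitude))

    def __mul__(self, factor):
        return self._wrap([v * factor for v in self.magnitude])


lengths = QuantityArray([1.0, 2.0, 3.0], "m")
total = lengths.sum()   # reduction: the wrapped result is a scalar quantity
doubled = lengths * 2   # elementwise: the wrapped result is still an array quantity
```

The wrapper never has to predict the result kind in advance; it only classifies whatever the underlying operation returned.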
