Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP, BUG: Implement custom role for numpydoc parameters. #484

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

rossbar
Copy link
Contributor

@rossbar rossbar commented Aug 6, 2023

The numpydoc standard states:

Enclose variables in single backticks. The colon must be preceded by a space, or omitted if the type is absent.

In sphinx itself, single-backticks are used to indicate the default_role, which is py:obj by default. Long story short, this tends to result in a lot of nitpicky link warnings from sphinx due to parameter names. See #419 (comment) for details.

This PR aims to resolve this situation by adding a numpydoc_parameter_role configuration. The general idea is as follows:

When parsing the docstring, numpydoc will find all text enclosed in single backticks and compare it against the function parameters. Any matches will be explicitly converted to use the numpydoc_param_role; all others will be left as-is to use the sphinx default_role.

TODOs

  • Add tests
  • Make the numpydoc_parameter_role a sphinx config value (currently hard-coded)
  • Test docbuilds of scientific Python libraries (numpy, scipy, scikits, etc.) to evaluate impact.

Extensions

For now I aim to use the :emphasis: role as it's built in to sphinx and will fix the nitpicky warnings. Ultimately I'd like to expand this so that numpydoc parameters become anchor links in the docstring like the sphinx-paramlinks extension.

start_idx = m.start()
# If the opening backtick is preceded by a ":" or "`", it's already
# a role or literal markup, respectively, and should not be changed
if start_idx != 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit less fragile?

Suggested change
if start_idx != 0:
if start_idx > 0:

@larsoner
Copy link
Collaborator

larsoner commented Aug 7, 2023

This PR aims to resolve this situation by adding a numpydoc_parameter_role configuration... Ultimately I'd like to expand this so that numpydoc parameters become anchor links in the docstring like the sphinx-paramlinks extension.

This would be a great and something I've wanted for a long time!

One idea would be to have a weak dependency on paramlinks for this feature and "just" auto-convert single-backticks when numpydoc_parameter_role = 'paramlinks' or so. From a very quick look at their docs they already support cross-refs like

:paramref:`.EnvironmentContext.configure.transactional_ddl`

which I've also wanted and would be nice to have as part of this PR or a follow-up.

But as an alternative that I think makes more sense, it looks like the MIT-linensed code for sphinx-params is only ~400 lines, so it wouldn't be too much work to roll our own similar version while fixing/customize some stuff based on our use case:

  • I think ideally end-users should link to refs with :py:param: rather than :py:paramref: -- we don't link to classes with :classref: or functions with :funcref: after all, and I don't think we need to make public the role for the param itself
  • I don't think we would need to expose the equivalent of the sphinx-params :param: role to actually say "this is a parameter and document it as such" because we get that part for free during parsing of the Parameters section

@mattip
Copy link
Member

mattip commented Sep 18, 2023

Any progress here? We are getting close to turning on nit-picky mode in NumPy and need this to clean up parameter formatting

@rossbar
Copy link
Contributor Author

rossbar commented Sep 18, 2023

Any progress here? We are getting close to turning on nit-picky mode in NumPy and need this to clean up parameter formatting

Thanks for the ping @mattip . A quick summary on the current status:

I think this approach is generally tenable and will resolve the aforementioned issue generally for many libraries. The big caveat that falls into the "needs decision" category is that adding this feature would break instances of rst tables that incorporate a single-backticked parameter.

This is a baked-in limitation due to the fact that numpydoc does an rst->rst->ast conversion rather than an rst->ast conversion directly. This already affects other components in numpydoc, such as references (see also #130), so I don't think this should be a hard blocker, but YMMV.

On the TODO-side of things, there are two main things left:

  1. Tests + documentation updates, and
  2. Trying this out with downstream libraries (numpy, scipy, matplotlib, the scikits, etc.) to ensure it's not too disruptive and generally improves things.

I've done 2) to some extent (testing with numpy and networkx), but I haven't gone through the full exercise including summarizing the results.

I'll aim to tackle these things ASAP, but this is very much a in-free-time activity 🙃 .

@timhoffm
Copy link
Contributor

timhoffm commented May 14, 2024

I'm somewhat concerned on this approach.

  • The rewriting of docstrings is higher magic. Single backticks have a clear meaning in Sphinx (prefix it with default_role). Overloading that semantics in docstrings may be surprising, e.g. `axis` resolves differently whether or not there is a parameter named "axis" in the docstring. In particular, the parameter shadows the same name in the default role, which can lead to unintended changes in existing docs.

  • Have you done performance measurements? Regexing through all docstrings is a bit of an effort.

At least this needs very clear documentation and should be configurable.

Context: In matplotlib, we're using default_role="py:obj" and denote parameters as *param* because of the current issue with backticks. So, we will not benefit from this but possibly experience link issues and longer build times. Therefore, for now, I would just deactivate such behavior for the matplotlib docs.


An alternative and more standard approach would be to try and make `param` be resolvable via a new regular sphinx role - similar to paramref from sphinx-params. But I see two possible blockers: (1) In sphinx-params one has to add the (full) function qualification before the paramter. That's not an acceptable solution for the current numpydoc situation. I'm unclear whether the context could be extracted in pure sphinx. (2) Such a role would need to be configured as default_role, but we cannot block that config. One would need some sort of composition default_role="py:param,py:obj".

@mdhaber
Copy link
Contributor

mdhaber commented Jun 13, 2024

@rossbar will we be able to link to the description of the return value, too? We didn't specifically mention the return value in the style guide.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants