Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support something like value_parser_os and use for PathBuf, OsString parameters #13

Open
cbeck88 opened this issue Oct 14, 2024 · 0 comments

Comments

@cbeck88
Copy link
Owner

cbeck88 commented Oct 14, 2024

Right now, we take all the args and env as OsString, but in the internals, we eventually convert everything to String and signal UTF-8 errors if they aren't UTF-8.

This is typically fine on linux systems, and when deploying in docker containers or similar.
However, especially on windows systems, it turns out that it's not reasonable to expect paths to be utf-8.

The most correct thing is to:

  • Provide an option to set a value_parser that takes an OsStr instead of a str. We could call this value_parser_os.
  • Do this by default when the field is OsString or PathBuf. This represents 99% of the situations where this would actually be necessary.

In clap v3, they supported the option for OsStr, and there were attributes like #[clap(parse(from_os_str))]. But the default for PathBuf and OsString was to use FromStr and hardly anyone remembered to enable the from_os_str, so most programs didn't work right on windows. (https://users.rust-lang.org/t/path-osstr-and-supporting-non-utf-8-paths-inputs/64826). See also this issue.

In clap-v4, they decided to change to a different approach.

  • It was proposed to use autoref-specialization to automatically detect when types have a conversion from OsStr, and to prefer that, and fallback to a FromStr if present, etc.
  • Eventually this became the role of the clap::value_parser! macro. This can then be used in both the builder API and the derive API, and also plays a role in automatically creating value hints which are then leveraged in shell completion features.

However, IMO this caused some regressions in the usability of the system. Some simple usages of value_parser don't work in clap v4. For example

#[arg(value_parser = serde_json::from_str)]
pub json: MySchema,

doesn't work in clap v4.5.8, due to a lifetime inference issue. It works fine in conf however.

I believe that this error happens in clap because autoref-specialization is a complex technique that can interfere with type inference.

Moreover, this autoref-specialization stuff is unnecessary, so I don't think we should follow suit.

The proc macro is capable of looking at the type and deciding if it is OsString or PathBuf, which will work correctly for 99% of the time that the type actually is OsString or PathBuf, and then setting a good value_parser_os for those types. In all other cases, type-inference will keep working as it does now, which is basically as well as it can. We don't need the solution to also work for a builder API, because we intentionally don't have one.

I think avoiding the complexity of the clap v4 ValueParser system is a win for the users, based on reports like this one.

@cbeck88 cbeck88 changed the title Support something like #[clap(parse(from_os_str))] (which was part of clap v3) Support something like value_parser_os and use for PathBuf, OsString parameters Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant