Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add feature "disable_idna" #836

Closed
wants to merge 1 commit into from

Conversation

parasyte
Copy link

@parasyte parasyte commented May 2, 2023

This has a fairly colorful history.

The idna dependency was first made optional in #728, and then reverted in #790. This PR introduces a feature flag to opt-in to ASCII-only domain support, as suggested in #790 (comment). It partially reverts #790.

Unfortunately, this opt-in method is less robust than using a default-enabled feature and making idna optional (#728) because idna is always built. It is trivial for future changes to accidentally link idna unconditionally, making the disable_idna feature a no-op.

In my testing, using this feature flag saves around 235 KiB in the produced binary in release mode on Windows x86_64 builds. (Side note: Windows builds with VS are "always stripped", and debug symbols are provided in a separate pdb file. For everything else, the binary Cargo.toml can include strip = "symbols" in the release profile.) The feature was originally requested for WASM, and binary size deltas should be in the same ballpark with that arch.

@parasyte parasyte mentioned this pull request May 2, 2023
3 tasks
@valenting
Copy link
Collaborator

So, I'm less and less convinced that this is a direction we want to pursue.
If a crate using URL wants to opt out of the IDNA, they can replace the idna dependency in Cargo.toml with a dummy crate that implements domain_to_unicode, domain_to_ascii, domain_to_ascii_strict, and contains struct Errors- see instructions here

@bors-servo
Copy link
Contributor

☔ The latest upstream changes (presumably b33514a) made this pull request unmergeable. Please resolve the merge conflicts.

@micolous micolous mentioned this pull request Nov 2, 2023
@valenting
Copy link
Collaborator

Rather than adding all this to the url crate, instead I suggest patching out idna as suggested earlier.
I have an example here
Pretty much the only requirement is adding the following to your Cargo.toml

[patch.crates-io]
idna = {git = 'https://github.com/valenting/no-idna.git'}

This avoids adding any spec incompatible flag to this crate.
If this works for you I propose we close this issue.

@parasyte
Copy link
Author

I don't think patch works with published crates. If I can't get rid of the heavy IDNA dependency in my published project, that doesn't seems like a useful workaround.

@micolous
Copy link
Contributor

micolous commented Nov 29, 2023

I've written an alternative to this for wasm32-unknown-unknown in #887.

For Windows, there are platform IDN bindings which idna could use instead, which is significantly smaller than the UTS46 mapping table and unicode_normalization, even for projects which don't already link against windows-rs!

What would need to happen for that is for Config and Errors to move up into the parent crate, and then implement a work-alike for uts46::Idna which maps onto windows::Win32::Globalization::{IdnToAscii, IdnToUnicode}. A quick reading of MSDN suggests there may be some functionality that doesn't work quite the same.

Edit, after further experimenting: The big gotcha with the Windows APIs is that you can't control whether IDNA2003 or IDNA2008 is used – it's tied to the OS version:

The IdnToNameprepUnicode, IdnToAscii, and IdnToUnicode algorithms are not applicable to Windows NT, Windows 2000, Windows XP, or Windows Server 2003. These algorithms follow the IDNA2003 standards for Windows Vista, Windows Server 2008, Windows 7, and Windows Server 2008 R2 operating system. Otherwise, the algorithms follow the IDNA2008+UTS46 standards.

Even on Windows 11 23H2 (which should apply IDNA2008+UTS46 rules), to_ascii("☕.com") converts successfully to xn--53h.com.

There are a lot of other failures, and canonical examples in the IDNA spec (used in idna crate tests) aren't working.

I have a branch of rust-url which uses the Windows APIs, but I'm not planning to work on that any further due to the compatibility issues and everything being pretty broken.

@hsivonen
Copy link
Collaborator

hsivonen commented Nov 4, 2024

With url 2.5.3 and idna 1.0.3, it's now possible to disable idna data and support for non-ASCII domain input. See https://docs.rs/crate/idna_adapter/latest for instructions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants