NullCharEscapes

(legacy summary: cannot match protocol of an absolute URL via String.startsWith.) (legacy labels: Attack-Vector)

Null characters in a URL can disguise protocols such as `javascript:`.

Unsanitized code can be embedded in comments, and conditional compilation might disable runtime assertions.

RFC 3986 allows the following characters in a URI scheme:

scheme      = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
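The grammar above can be transcribed into a regular expression. The following is a minimal sketch, not a full URI parser; the names `SCHEME_RE` and `is_valid_scheme` are made up for illustration.

```python
import re

# Direct transcription of the RFC 3986 production:
#   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(scheme: str) -> bool:
    """Return True if the string is a syntactically valid URI scheme."""
    return SCHEME_RE.fullmatch(scheme) is not None
```

Under this grammar neither a null character nor code point 65533 can appear inside a scheme, so `is_valid_scheme("java\ufffdscript")` is false while `is_valid_scheme("javascript")` is true.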

IE apparently allows, and silently removes, certain characters from URLs.

Collin Jackson reports that Unicode code point 65533 (U+FFFD, the replacement character) is one of those.
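To illustrate why this matters, here is a rough sketch of a prefix check being bypassed. It assumes the browser drops only the characters named in this page (the null character and code point 65533); the real set of dropped characters is not known here, and the names `SILENTLY_DROPPED`, `naive_is_script_url`, and `as_lenient_browser_might_see` are hypothetical.

```python
# Assumption: only these two characters are dropped. The page names nothing else.
SILENTLY_DROPPED = {"\x00", "\ufffd"}


def naive_is_script_url(url: str) -> bool:
    """A prefix check of the kind a simple filter might use."""
    return url.lower().startswith("javascript:")


def as_lenient_browser_might_see(url: str) -> str:
    """Simulate a browser that silently removes the characters above."""
    return "".join(ch for ch in url if ch not in SILENTLY_DROPPED)


disguised = "java\ufffdscript:alert(42)"
print(naive_is_script_url(disguised))
print(naive_is_script_url(as_lenient_browser_might_see(disguised)))
```

The first check does not recognise the disguised URL as a script URL, but the second does once the extra character has been removed, which is the gap the attack relies on.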

The last 4 code points in 32b unicode are:

http://www.mozillazine.org/talkback.html?article=4078 discusses wider exploits that rely on null bytes (%00) in URLs.

URL-valued HTML attributes are not stripped of null characters, OR URLs are not restricted to absolute URLs with a whitelisted protocol, OR URLs are not normalized.
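One way to close all three gaps at once is sketched below, under stated assumptions: the whitelist contents, the set of characters stripped, and the names `ALLOWED_SCHEMES`, `ABSOLUTE_URL_RE`, and `sanitize_url` are illustrative and do not describe any particular sanitizer. Percent-encoded null bytes (%00) are not handled here.

```python
import re
from typing import Optional

# Hypothetical whitelist; a real deployment would choose its own.
ALLOWED_SCHEMES = {"http", "https", "mailto"}

# An absolute URL: an RFC 3986 scheme, a colon, then the rest.
ABSOLUTE_URL_RE = re.compile(r"([A-Za-z][A-Za-z0-9+.-]*):(.*)", re.DOTALL)


def _is_droppable(ch: str) -> bool:
    # Assumption: strip C0 controls (including NUL), DEL, and U+FFFD.
    return ch == "\ufffd" or ord(ch) < 0x20 or ord(ch) == 0x7F


def sanitize_url(url: str) -> Optional[str]:
    """Return a normalized absolute URL, or None if it is not acceptable."""
    # 1. Strip characters a browser might silently remove.
    cleaned = "".join(ch for ch in url if not _is_droppable(ch)).strip()
    # 2. Require an absolute URL with a syntactically valid scheme.
    match = ABSOLUTE_URL_RE.fullmatch(cleaned)
    if match is None:
        return None
    # 3. Normalize the scheme's case and check it against the whitelist.
    scheme = match.group(1).lower()
    if scheme not in ALLOWED_SCHEMES:
        return None
    return scheme + ":" + match.group(2)
```

Stripping happens before the whitelist check, so the check sees the same string a lenient browser would act on. Rejecting any URL that contains such characters would be a stricter alternative.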

IE

<iframe src="java&#65533;script:alert(42)"></iframe>
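As a small illustration of what a filter sees in the attribute value above, the sketch below decodes the character reference with Python's standard `html.unescape` and lists the code points before the colon. The helper name `code_points_before_colon` is made up for this example.

```python
import html

attr_value = "java&#65533;script:alert(42)"

# Decode HTML character references the way an HTML parser would.
decoded = html.unescape(attr_value)


def code_points_before_colon(value: str) -> list:
    """Return the code points of everything before the first colon."""
    scheme_part = value.split(":", 1)[0]
    return [ord(ch) for ch in scheme_part]


print(code_points_before_colon(decoded))
```

The fifth code point is 65533, so a check that compares the decoded value against the literal prefix `javascript:` does not match, even though the value contains that prefix once the extra character is removed.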