`util.parse_url` allows invalid characters in reg-names
`util.parse_url` allows invalid characters in reg-names.
RFC 3986 defines reg-name as follows:
reg-name = *( unreserved / pct-encoded / sub-delims )
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
pct-encoded = "%" HEXDIG HEXDIG
Currently, `urllib3.util.parse_url` accepts reg-names containing characters outside those enumerated above.
The incorrectly accepted ASCII characters are `[\x00-\x20]`, `"`, `<`, `>`, `^`, `` ` ``, `{`, `|`, `}`, and `\x7f`.
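For reference, the reg-name grammar quoted above can be sketched as a regular expression. This is only an illustration of the RFC 3986 production, not urllib3's implementation; the names `REG_NAME_RE` and `is_valid_reg_name` are hypothetical, and the check covers reg-name only (not IP-literal hosts).

```python
import re

# reg-name = *( unreserved / pct-encoded / sub-delims )   (RFC 3986, section 3.2.2)
UNRESERVED = r"A-Za-z0-9\-._~"      # ALPHA / DIGIT / "-" / "." / "_" / "~"
SUB_DELIMS = r"!$&'()*+,;="         # the eleven sub-delims characters
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"    # "%" HEXDIG HEXDIG

# Hypothetical helper pattern: zero or more allowed characters or percent-escapes.
REG_NAME_RE = re.compile(rf"(?:[{UNRESERVED}{SUB_DELIMS}]|{PCT_ENCODED})*")


def is_valid_reg_name(host: str) -> bool:
    """Return True if `host` matches the RFC 3986 reg-name production."""
    # fullmatch requires the whole string to match, so a single stray
    # character such as "|" or a space makes the host invalid.
    return REG_NAME_RE.fullmatch(host) is not None
```

Under this sketch, `is_valid_reg_name("example.com")` is true, while `is_valid_reg_name("|")` is false, as is any host containing one of the characters listed above. The empty string is accepted because the grammar allows zero repetitions.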
Environment
>>> import platform, urllib3
>>> print("OS", platform.platform())
OS Linux-6.0.7-arch1-1-x86_64-with-glibc2.36
>>> print("Python", platform.python_version())
Python 3.10.8
>>> print("urllib3", urllib3.__version__)
urllib3 2.0.0.dev0
Steps to Reproduce
>>> import urllib3
>>> urllib3.util.parse_url("http://|") # Whatever happens, "|" should not end up in a reg-name.
Url(scheme='http', auth=None, host='|', port=None, path=None, query=None, fragment=None)
>>> # But it does!
Issue Analytics
- State:
- Created 10 months ago
- Comments:5 (2 by maintainers)
Top GitHub Comments
I closed the PR because I made it from the main branch of my fork and it had an extraneous fix built in. Working on making a series of PRs, each from separate branches, that fix a few different things.
Currently traveling for Thanksgiving, but will submit in the next few days.
@kenballus Yeah that’s a safe assumption, we implement RFC 3986 instead of WHATWG. Why have you closed your PR btw?