Should we support looser header name validation?

Closely related to #97.

Prompted by https://github.com/encode/httpx/issues/1363#issuecomment-709706049

So, h11 currently has stricter-than-urllib3 rules on header name validation…

>>> import httpx
>>> httpx.get("https://club.huawei.com/")
Traceback (most recent call last):
...
httpx.RemoteProtocolError: malformed data

Which is occurring because the response looks like this…

HTTP/1.1 200 OK
Connection: keep-alive
Content-Encoding: gzip
Content-Security-Policy: base-uri
Content-Type: text/html; charset=utf-8
Date: Thu, 15 Oct 2020 13:19:33 GMT
Server: CloudWAF
Set-Cookie: HWWAFSESID=a74181602debc465809; path=/
Set-Cookie: HWWAFSESTIME=1602767969615; path=/
Set-Cookie: a3ps_2132_saltkey=yCXrVqdR06Nk5u2PrmLgs9eqlGIpQd9FogV2GL6bxGP3HH2XweRXIeCVny%2BrVDpoOYNLphTU9uVN1HP1%2Fav1bvV2Yrafq%2BXdJR%2BVAVPHizU92ISGAest0dKt7%2FIbdulNYXV0aGtleQ%3D%3D; path=/; secure; httponly
Set-Cookie: a3ps_2132_errorinfo=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; secure; httponly
Set-Cookie: a3ps_2132_errorcode=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; secure; httponly
Set-Cookie: a3ps_2132_auth=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; secure; httponly
Set-Cookie: a3ps_2132_lastvisit=1602764373; expires=Sat, 14-Nov-2020 13:19:33 GMT; Max-Age=2592000; path=/; secure; httponly
Set-Cookie: a3ps_2132_lastact=1602767973%09portal.php%09; expires=Fri, 16-Oct-2020 13:19:33 GMT; Max-Age=86400; path=/; secure; httponly
Set-Cookie: a3ps_2132_currentHwLoginUrl=http%3A%2F%2Fcn.club.vmall.com%2F; expires=Thu, 15-Oct-2020 15:19:33 GMT; Max-Age=7200; path=/; secure; httponly
Transfer-Encoding: chunked
X-XSS-Protection: 1; mode=block
banlist-ip: 0
banlist-uri: 0
get-ban-to-cache-result/portal.php: userdata not support
get-ban-to-cache-result62.31.28.214: userdata not support
result-ip: 0
result-uri: 0

That’s not all that unexpected: h11 is a wonderfully, thoroughly engineered package that does a great job of following the relevant specs. However, we might(?) want to be as lax as possible, as long as the input is still parsable, on anything that comes in from the wire, while keeping the constraint that our outputs are always spec-compliant. (?)
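
To make the strictness concrete, here is a minimal sketch of the RFC 7230 `token` rule that header names are expected to follow. It is not h11's actual implementation; `TOKEN_RE` and `is_valid_header_name` are names made up for this illustration. Under that rule, the likely offender in the response above is the header name containing a `/`, which is not a permitted token character.

```python
import re

# RFC 7230 section 3.2.6: token = 1*tchar
# tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
#         "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
# Hypothetical helper names, not part of h11's API.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_valid_header_name(name: str) -> bool:
    """Return True if the whole name consists only of RFC 7230 tchar characters."""
    return TOKEN_RE.fullmatch(name) is not None


# A few header names taken from the response shown above.
header_names = [
    "Content-Type",
    "banlist-ip",
    "get-ban-to-cache-result/portal.php",
    "get-ban-to-cache-result62.31.28.214",
]

for name in header_names:
    status = "ok" if is_valid_header_name(name) else "not a valid token"
    print(f"{name}: {status}")
```

Only the name with the slash fails this check; the one ending in an IP address is unusual but still made up entirely of token characters.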

In practice, if httpx is failing to parse responses like this, then at least some subset of users are going to see behaviour that from their perspective is a regression vs. other HTTP tooling.

What are our thoughts here?

Issue Analytics

  • State: open
  • Created 3 years ago
  • Reactions: 4
  • Comments:9 (6 by maintainers)

Top GitHub Comments

3 reactions
sigmavirus24 commented, Oct 16, 2020

So in general, the maxim is “Be liberal in what you accept and conservative in what you send” (or something close to that).

I definitely think there’s value in not raising an exception here. That said, these headers should probably be “quarantined”, for lack of a better turn of phrase. I think urllib3 might drop these on the floor (or has a few cases where that happens by virtue of using the standard library’s http client), and that’s also surprising. A way to signal to users “Hey, these are … weird, maybe be careful with them” would probably be valuable.
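
One way the “quarantine” idea could look is sketched below. This is purely illustrative and assumes the header block has already been split into raw lines; `split_headers` and its return shape are hypothetical and do not exist in h11, httpx, or urllib3.

```python
import re
from typing import List, Tuple

Header = Tuple[bytes, bytes]

# RFC 7230 token characters, as bytes (hypothetical helper, not h11's regex).
TOKEN_RE = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def split_headers(raw_lines: List[bytes]) -> Tuple[List[Header], List[Header]]:
    """Separate spec-compliant headers from parsable-but-invalid ones.

    Returns (accepted, quarantined). Lines without a colon cannot be
    parsed even leniently, so they still raise an error.
    """
    accepted: List[Header] = []
    quarantined: List[Header] = []
    for line in raw_lines:
        name, sep, value = line.partition(b":")
        if not sep:
            raise ValueError(f"unparsable header line: {line!r}")
        pair = (name, value.strip(b" \t"))
        if TOKEN_RE.fullmatch(name):
            accepted.append(pair)
        else:
            # Keep the header, but flag it so callers can treat it with care.
            quarantined.append(pair)
    return accepted, quarantined


accepted, quarantined = split_headers(
    [
        b"result-ip: 0",
        b"get-ban-to-cache-result/portal.php: userdata not support",
    ]
)
print("accepted:", accepted)
print("quarantined:", quarantined)
```

Keeping the odd headers in a separate list means nothing is silently dropped, while code that only reads the accepted list never sees names that violate the spec.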

1 reaction
memst commented, Sep 14, 2021

I would urge making a decision and going down the path of urllib3 and other libraries that pass parsable headers through even if they don’t precisely follow RFC 7230. Often users can’t control what response headers the server is sending, but they would still like to process the data. The choice to hard error is currently made on the basis of safety, but people are now using a workaround and directly overwriting h11._readers.header_field_re, which exposes them to more threats because that regex won’t be maintained.

I think that discarding invalid headers is worse than passing them through. It still creates the same problem for users who have to access that part of the request. Even an opt-in option is likely to be inaccessible to end users who are using h11 through other libraries, which might not implement the option.

I think the decision should be made soon. It’s a bad idea to start bypassing security features by modifying the library’s internal variables, but the current state leaves users with no other choice.
