HTTP/2.0 non US-ASCII header names should be rejected

Expected behavior

According to the HTTP/2 RFC:

Just as in HTTP/1.x, header field names are strings of ASCII characters that are compared in a case-insensitive fashion. However, header field names MUST be converted to lowercase prior to their encoding in HTTP/2. A request or response containing uppercase header field names MUST be treated as malformed (Section 8.1.2.6).

Since it refers to the HTTP/1.x RFC, here is the relevant part:

A recipient MUST parse an HTTP message as a sequence of octets in an encoding that is a superset of US-ASCII [USASCII]. Parsing an HTTP message as a stream of Unicode characters, without regard for the specific encoding, creates security vulnerabilities due to the varying ways that string processing libraries handle invalid multibyte character sequences that contain the octet LF (%x0A).

US-ASCII uses only 7 bits. So, while parsing a header name, any byte of the form 1xxx xxxx should be treated as an error and the request should be rejected (i.e. byteValue < 0).
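The sign-bit check can be sketched in isolation. Java's `byte` is signed, so any octet with the high bit set reads as a negative value. The class and method names below are hypothetical and are not part of Netty; this only illustrates the 7-bit test, not a full header name validator.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Netty: illustrates the 7-bit check only.
final class AsciiCheck {

    // Returns true when every octet fits in 7 bits (high bit clear).
    static boolean isUsAscii(byte[] name) {
        for (byte b : name) {
            // Java bytes are signed, so an octet of the form 1xxx xxxx is negative.
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] ascii = "content-type".getBytes(StandardCharsets.US_ASCII);
        // UTF-8 encoding of U+1F631, the same bytes used in the reproducer.
        byte[] emoji = {(byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0xB1};
        System.out.println(isUsAscii(ascii)); // true
        System.out.println(isUsAscii(emoji)); // false
    }
}
```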

As a reminder, here is the header name validator used in Netty for HTTP/1.1:

private static void validateHeaderNameElement(byte value) {
    switch (value) {
    case 0x1c:
    case 0x1d:
    case 0x1e:
    case 0x1f:
    case 0x00:
    case '\t':
    case '\n':
    case 0x0b:
    case '\f':
    case '\r':
    case ' ':
    case ',':
    case ':':
    case ';':
    case '=':
        throw new IllegalArgumentException(
                "a header name cannot contain the following prohibited characters: =,;: \\t\\r\\n\\v\\f: " +
                value);
    default:
        // Check to see if the character is not an ASCII character, or invalid
        if (value < 0) {
            throw new IllegalArgumentException("a header name cannot contain non-ASCII character: " + value);
        }
    }
}

Actual behavior

The only validation performed on header names is:

private static final ByteProcessor HTTP2_NAME_VALIDATOR_PROCESSOR = new ByteProcessor() {
    @Override
    public boolean process(byte value) {
        return !isUpperCase(value); // isUpperCase: return value >= 'A' && value <= 'Z';
    }
};

Thus non-ASCII characters are accepted and then corrupted when the name is converted into a String (i.e. each byte is widened into a 2-byte UTF-16BE char).

For example, the non-ASCII byte 0xf0 is converted into 0x00 0xf0.
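This widening can be reproduced with the JDK alone. The sketch below assumes the octets are decoded as ISO-8859-1 (Latin-1), which maps each octet to the `char` with the same numeric value. It does not call Netty, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Netty.
final class Latin1Widening {

    // Decodes each octet as one Latin-1 character, i.e. one UTF-16 code unit per byte.
    static String decodeLatin1(byte[] octets) {
        return new String(octets, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // UTF-8 encoding of U+1F631.
        byte[] utf8Emoji = {(byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0xB1};

        String asLatin1 = decodeLatin1(utf8Emoji);
        String asUtf8 = new String(utf8Emoji, StandardCharsets.UTF_8);

        // Four octets become four chars; the first one is U+00F0.
        System.out.println(asLatin1.length());                          // 4
        System.out.println(Integer.toHexString(asLatin1.charAt(0)));    // f0
        // Decoded as UTF-8, the same octets form a single code point.
        System.out.println(asUtf8.codePointCount(0, asUtf8.length())); // 1
    }
}
```

The two decodings disagree on what the name is, which is why a recipient that accepts these bytes has to pick an encoding rather than reject the name outright.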

Minimal yet complete reproducer code (or URL to code)

import org.assertj.core.api.Assertions;
import org.junit.jupiter.api.Test;

import io.netty.handler.codec.http2.DefaultHttp2Headers;
import io.netty.handler.codec.http2.Http2Error;
import io.netty.handler.codec.http2.Http2Exception;
import io.netty.util.AsciiString;

public class NonAsciiHeaderNameTest {
    @Test
    void shouldRejectNonUsAsciiValues() {
        // U+1F631 : 😱 : F0 9F 98 B1
        byte[] buf = new byte[] {(byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0xB1};
        DefaultHttp2Headers headers = new DefaultHttp2Headers();
        Assertions.assertThatThrownBy(() -> headers.add(new AsciiString(buf), "test"))
                .isInstanceOf(Http2Exception.class)
                .hasMessageStartingWith("invalid header name")
                .hasFieldOrPropertyWithValue("error", Http2Error.PROTOCOL_ERROR);
    }
}

Netty version

4.1.72

JVM version (e.g. java -version)

$ java --version
openjdk 11.0.11 2021-04-20
OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed mode)

OS version (e.g. uname -a)

$ uname -a
Darwin *** 20.6.0 Darwin Kernel Version 20.6.0***

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
idelpivnitskiy commented, Jan 11, 2022

Looks reasonable. The HTTP/1.1 RFC defines the header name as:

header-field   = field-name ":" OWS field-value OWS
field-name     = token
token          = 1*tchar
tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
    "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA

HTTP/2.0 says it’s the same, with the additional requirement that names be lowercase.
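As a rough illustration, the token grammar combined with the HTTP/2 lowercase rule could be checked as in the sketch below. It is not Netty's implementation: the class and method names are hypothetical, and pseudo-header names such as :method, which begin with a colon, are deliberately left out of scope.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Netty's validator.
final class Http2HeaderNameSketch {

    // The non-alphanumeric tchar symbols from the HTTP/1.1 grammar.
    private static final String TCHAR_SYMBOLS = "!#$%&'*+-.^_`|~";

    // tchar restricted to lowercase letters, as HTTP/2 requires.
    static boolean isLowercaseTchar(byte b) {
        if (b < 0) {
            return false; // high bit set: not US-ASCII
        }
        return (b >= 'a' && b <= 'z')
                || (b >= '0' && b <= '9')
                || TCHAR_SYMBOLS.indexOf(b) >= 0;
    }

    // token = 1*tchar, so an empty name is invalid too.
    static boolean isValidName(byte[] name) {
        if (name.length == 0) {
            return false;
        }
        for (byte b : name) {
            if (!isLowercaseTchar(b)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidName("content-type".getBytes(StandardCharsets.US_ASCII))); // true
        System.out.println(isValidName("Content-Type".getBytes(StandardCharsets.US_ASCII))); // false
        System.out.println(isValidName(new byte[] {(byte) 0xF0, (byte) 0x9F}));              // false
    }
}
```

Checking against an allow-list of tchar values rejects uppercase letters, control characters, separators and non-ASCII bytes in one pass, instead of enumerating prohibited characters.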

0 reactions
NilsRenaud commented, Jan 12, 2022

This quote of HTTP/1 doesn’t argue your point. It says superset, which means things like Latin-1 and UTF-8 are fair game.

@ejona86 you’re right indeed and the RFC extract from @idelpivnitskiy is the only proof I needed to refer to.

I don’t understand the corruption argument. When is that done? If you are converting an AsciiString containing non-ASCII to a String, you have to know the encoding. Basically, there’s nothing that says 0x00 0xf0 is not the valid conversion of 0xf0 in your example.

@ejona86 Indeed, instead of corruption I should have said that the header name bytes are read as Latin-1 encoded text and then encoded as UTF-16 in Java’s char to form a String.
