HTTP/2.0 non US-ASCII header names should be rejected
Expected behavior
According to the HTTP/2 RFC (RFC 7540, Section 8.1.2):
Just as in HTTP/1.x, header field names are strings of ASCII characters that are compared in a case-insensitive fashion. However, header field names MUST be converted to lowercase prior to their encoding in HTTP/2. A request or response containing uppercase header field names MUST be treated as malformed (Section 8.1.2.6).
Since it refers to the HTTP/1.x RFC, here is the relevant part (RFC 7230, Section 3):
A recipient MUST parse an HTTP message as a sequence of octets in an encoding that is a superset of US-ASCII [USASCII]. Parsing an HTTP message as a stream of Unicode characters, without regard for the specific encoding, creates security vulnerabilities due to the varying ways that string processing libraries handle invalid multibyte character sequences that contain the octet LF (%x0A).
US-ASCII uses only 7 bits. So during header name parsing, any byte of the form 1xxx xxxx (i.e. byteValue < 0, since Java bytes are signed) should be treated as an error and the request should be rejected.
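The high-bit check described above can be sketched in plain Java. This is a minimal illustration, not Netty code: the class and method names (`AsciiCheck`, `isUsAscii`, `firstNonAscii`) are hypothetical. It relies only on the fact that Java's `byte` is signed, so any octet of the form 1xxx xxxx reads as a negative value.

```java
final class AsciiCheck {
    private AsciiCheck() {}

    /** Returns true if the octet fits in 7 bits (0x00..0x7F). */
    static boolean isUsAscii(byte value) {
        // Java bytes are signed: octets 0x80..0xFF read as -128..-1.
        return value >= 0;
    }

    /** Returns the index of the first non-US-ASCII octet, or -1 if there is none. */
    static int firstNonAscii(byte[] name) {
        for (int i = 0; i < name.length; i++) {
            if (!isUsAscii(name[i])) {
                return i;
            }
        }
        return -1;
    }
}
```

A caller could reject the request as soon as `firstNonAscii` returns a non-negative index, without ever decoding the name into a String.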
As a reminder, here is the header name validator used in Netty for HTTP/1.1:
private static void validateHeaderNameElement(byte value) {
    switch (value) {
    case 0x1c:
    case 0x1d:
    case 0x1e:
    case 0x1f:
    case 0x00:
    case '\t':
    case '\n':
    case 0x0b:
    case '\f':
    case '\r':
    case ' ':
    case ',':
    case ':':
    case ';':
    case '=':
        throw new IllegalArgumentException(
                "a header name cannot contain the following prohibited characters: =,;: \\t\\r\\n\\v\\f: " +
                value);
    default:
        // Check to see if the character is not an ASCII character, or invalid
        if (value < 0) {
            throw new IllegalArgumentException("a header name cannot contain non-ASCII character: " + value);
        }
    }
}
Actual behavior
The only validation made on header names is :
private static final ByteProcessor HTTP2_NAME_VALIDATOR_PROCESSOR = new ByteProcessor() {
    @Override
    public boolean process(byte value) {
        return !isUpperCase(value); // isUpperCase : return value >= 'A' && value <= 'Z';
    }
};
Thus non-ASCII bytes are accepted and corrupted when converted to a String (i.e. each byte is widened to a 2-byte UTF-16BE char). For example, the non-ASCII byte 0xf0 is converted into 0x00 0xf0.
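To illustrate the conversion described above, the following sketch uses only the JDK (no Netty classes). It assumes each byte is widened one-to-one into a `char`, which is what decoding as ISO-8859-1 does. The class name `Latin1Widening` and its methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class Latin1Widening {
    private Latin1Widening() {}

    /** Widens each octet to exactly one char, as an ISO-8859-1 decode does. */
    static String widen(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    /** Re-encodes the widened string as UTF-16BE to expose the two-byte form of each char. */
    static byte[] utf16be(byte[] bytes) {
        return widen(bytes).getBytes(StandardCharsets.UTF_16BE);
    }
}
```

With the single byte 0xF0 as input, `utf16be` returns the two bytes 0x00 0xF0. With the four UTF-8 bytes of U+1F631 (F0 9F 98 B1), `widen` returns four separate chars (U+00F0, U+009F, U+0098, U+00B1) rather than one emoji.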
Minimal yet complete reproducer code (or URL to code)
import org.assertj.core.api.Assertions;
import org.junit.jupiter.api.Test;

import io.netty.handler.codec.http2.DefaultHttp2Headers;
import io.netty.handler.codec.http2.Http2Error;
import io.netty.handler.codec.http2.Http2Exception;
import io.netty.util.AsciiString;

// Not named "Test": that would clash with the imported org.junit.jupiter.api.Test annotation.
public class Http2HeaderNameValidationTest {
    @Test
    void shouldRejectNonUsAsciiValues() {
        // U+1F631 : 😱 : F0 9F 98 B1
        byte[] buf = new byte[] {(byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0xB1};
        DefaultHttp2Headers headers = new DefaultHttp2Headers();
        Assertions.assertThatThrownBy(() -> headers.add(new AsciiString(buf), "test"))
                .isInstanceOf(Http2Exception.class)
                .hasMessageStartingWith("invalid header name")
                .hasFieldOrPropertyWithValue("error", Http2Error.PROTOCOL_ERROR);
    }
}
Netty version
4.1.72
JVM version (e.g. java -version)
$ java --version
openjdk 11.0.11 2021-04-20
OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed mode)
OS version (e.g. uname -a)
$ uname -a
Darwin *** 20.6.0 Darwin Kernel Version 20.6.0***
Top GitHub Comments
Looks reasonable, HTTP/1.1 RFC defines the header name as:
HTTP/2.0 says it’s the same with a new requirement to be in lowercase.
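As a rough sketch of that combined rule, assuming the HTTP/1.1 definition meant here is RFC 7230's `field-name = token` (one or more `tchar` octets), a validator could accept only lowercase letters, digits, and the `tchar` symbols. The class `Http2NameSketch` and its methods are hypothetical and are not Netty's implementation. HTTP/2 pseudo-header names, which start with ':', are deliberately out of scope.

```java
final class Http2NameSketch {
    private Http2NameSketch() {}

    // The non-alphanumeric tchar octets from RFC 7230, Section 3.2.6.
    private static final String TCHAR_SYMBOLS = "!#$%&'*+-.^_`|~";

    /** Returns true for a tchar octet, with ALPHA restricted to lowercase for HTTP/2. */
    static boolean isLowercaseTchar(byte value) {
        if (value < 0) {
            return false; // high bit set: not US-ASCII
        }
        if (value >= 'a' && value <= 'z') {
            return true;
        }
        if (value >= '0' && value <= '9') {
            return true;
        }
        return TCHAR_SYMBOLS.indexOf(value) >= 0;
    }

    /** Returns true if the name is non-empty and every octet is a lowercase tchar. */
    static boolean isValidName(byte[] name) {
        if (name.length == 0) {
            return false;
        }
        for (byte b : name) {
            if (!isLowercaseTchar(b)) {
                return false;
            }
        }
        return true;
    }
}
```

Under this sketch, uppercase letters, separators such as space or ':', control characters, and any octet above 0x7F are all rejected by the same check.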
@ejona86 you’re right indeed and the RFC extract from @idelpivnitskiy is the only proof I needed to refer to.
@ejona86 Indeed, instead of "corruption" I should have said that the header name bytes are read as Latin-1 encoded text and then encoded as UTF-16 in Java's char to form a String.