epoll_wait produces an EINVAL error since 4.1.30
Expected behavior
epoll_wait should work in 4.1.30 like it did in 4.1.29
Actual behavior
Since switching to 4.1.30, EpollEventLoop's handleLoopException is triggered with io.netty.channel.ChannelException: timerfd_settime() failed: Invalid argument, which points to timerfd_settime. This causes an epoll thread to be “blocked” sleeping.
The issue is visible in a Spring Boot test dealing with bad SSL certificates, which uses reactor/reactor-netty.
While investigating this remotely with limited resources (partial access to the logs and reproduction case, no local Linux machine to test on), I found that 4.1.30 included a change related to epoll_wait, which looked suspicious.
Looking at the PR I think I might have found the regression:
https://github.com/netty/netty/pull/7816/files#diff-db3e069239a403b954e3ebc024ba9507R251
Integer.MAX_VALUE should be MAX_SCHEDULED_TIMERFD_NS (999,999,999) like it was before the PR; otherwise timerfd_settime might return EINVAL when the nanoseconds value is too large.
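To illustrate why the cap matters: timerfd_settime rejects a nanoseconds field above 999,999,999 with EINVAL, so a delay has to be split into whole seconds plus a sub-second remainder. The following is only a minimal sketch of that arithmetic, not Netty's actual code; the class TimerfdDelay and the method split are hypothetical names, and only the constant name MAX_SCHEDULED_TIMERFD_NS comes from the report above.

```java
// Hypothetical helper, not part of Netty: shows how a delay can be split so
// that the nanoseconds part never exceeds what timerfd_settime() accepts.
final class TimerfdDelay {
    // Largest value allowed in the nanoseconds field (tv_nsec).
    static final long MAX_SCHEDULED_TIMERFD_NS = 999_999_999L;
    static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Returns {seconds, nanos} for a non-negative delay given in nanoseconds.
    static long[] split(long totalDelayNanos) {
        if (totalDelayNanos < 0) {
            throw new IllegalArgumentException("delay must be >= 0");
        }
        long seconds = totalDelayNanos / NANOS_PER_SECOND;
        long remainder = totalDelayNanos - seconds * NANOS_PER_SECOND;
        // Capping with Integer.MAX_VALUE (2,147,483,647) here would allow
        // values above the valid range; the cap must be 999,999,999.
        long nanos = Math.min(remainder, MAX_SCHEDULED_TIMERFD_NS);
        return new long[] { seconds, nanos };
    }

    public static void main(String[] args) {
        long[] parts = split(2_500_000_000L);
        System.out.println(parts[0] + "s " + parts[1] + "ns"); // prints "2s 500000000ns"
    }
}
```

With correct seconds arithmetic the remainder is already below one second, so the cap acts as a safety net; a cap of Integer.MAX_VALUE provides no such protection.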
Steps to reproduce
The issue is triggered during tests of Spring Boot, but this is a smaller reproduction snippet that is using Spring Framework 5:
@Test
public void strippedDown() {
    assertThatExceptionOfType(RuntimeException.class)
            .isThrownBy(() -> WebClient.create().get()
                    .uri("https://" + "self-signed.badssl.com/").exchange()
                    .block(Duration.ofSeconds(10)))
            .withCauseInstanceOf(SSLException.class);
}
I can try to spin up a repository with a Maven project that reproduces the issue and can be run without any setup, if you need one.
Netty version
4.1.30
JVM version (e.g. java -version)
??
OS version (e.g. uname -a)
??
Top GitHub Comments
I can fill in some of the blanks about the environment where we’ve seen the issue:
JVM version (e.g. java -version)
OS version (e.g. uname -a)
@wilkinsona I would open a new one as the error itself is gone (I was also not able to reproduce yet 😦 ).