Garbage generated in EpollDatagramChannel
I am developing a high-performance UDP client on Linux using Netty 4.1 with the native transport, and DatagramSocketAddress objects are the main source of allocations in the JVM. The native code behind EpollDatagramChannel creates a DatagramSocketAddress object for each received UDP datagram: https://github.com/netty/netty/blob/fa8f967852c04fd99ae15ee2bd1047595a97417e/transport-native-unix-common/src/main/c/netty_unix_socket.c#L363 Those allocations are quite heavy because char[], String, Inet4Address, InetAddressHolder, InetSocketAddressHolder, InetAddress[], and byte[] objects are all created internally when the object is constructed.
On the client side, if the socket is connected, the sender address cannot change and does not have to be re-instantiated every time. What do you think about caching the sender address in EpollDatagramChannel and allocating it only if needed? That is what DatagramChannelImpl in the JDK does, which considerably reduces GC: https://github.com/frohoff/jdk8u-jdk/blob/master/src/windows/native/sun/nio/ch/DatagramChannelImpl.c#L179
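To make the caching proposal concrete, here is a minimal sketch in plain Java of the idea: keep the last sender address and allocate a new one only when the raw address bytes or the port change. The class SenderAddressCache and its get method are hypothetical names used for illustration; they are not part of Netty or the JDK, and the real fix may look quite different.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

// Hypothetical helper, not part of Netty: reuses the last sender address
// for as long as the raw address bytes and the port stay the same.
final class SenderAddressCache {
    private byte[] lastAddress;        // raw IP bytes of the cached sender
    private int lastPort = -1;         // port of the cached sender
    private InetSocketAddress cached;  // instance handed out on a cache hit

    // Not thread-safe: assumed to be called from a single thread only.
    InetSocketAddress get(byte[] address, int port) {
        if (cached != null && port == lastPort && Arrays.equals(address, lastAddress)) {
            return cached;             // hit: nothing is allocated
        }
        try {
            // getByAddress performs no DNS lookup; it only validates the length.
            InetAddress ip = InetAddress.getByAddress(address);
            lastAddress = address.clone(); // copy so the caller may reuse its buffer
            lastPort = port;
            cached = new InetSocketAddress(ip, port);
            return cached;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("invalid IP address length", e);
        }
    }

    public static void main(String[] args) {
        SenderAddressCache cache = new SenderAddressCache();
        byte[] peer = {10, 0, 0, 1};
        InetSocketAddress first = cache.get(peer, 9000);
        InetSocketAddress second = cache.get(peer, 9000);
        System.out.println(first == second); // the same instance is reused
    }
}
```

With a connected socket every datagram comes from the same peer, so after the first datagram each lookup would be a cache hit.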
This can easily be reproduced with any UDP echo client/server example configured to use the native transport.
Expected behavior
EpollDatagramChannel caches the DatagramSocketAddress object when receiving a datagram
Actual behavior
EpollDatagramChannel allocates one DatagramSocketAddress object per incoming datagram
Steps to reproduce
Simple UDP client configured with native transport
Netty version
4.1
JVM version (e.g. java -version)
openjdk version “1.8.0_191” OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12) OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
OS version (e.g. uname -a)
Ubuntu 16.04 x86_64 kernel 4.15.0-43-generic
Top GitHub Comments
@estaban PTAL https://github.com/netty/netty/pull/8806. I think this actually fixes it in an easier way. Also, I was thinking of maybe introducing a ChannelOption so that, when the channel is used in connected mode, we would just fire the ByteBuf through the ChannelPipeline and not wrap it in a DatagramPacket at all. This would reduce GC even more. WDYT?

@normanmaurer Thanks for the fix, it is much simpler than the one I proposed. When I looked into that option, I was concerned about race conditions between connect calls (potentially several of them?) and reads (using the cached address), but I missed that they are all done on the event loop thread, so that should not be a problem.
Firing a ByteBuf when reading UDP datagrams would be a great idea!
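
The point about the event loop thread can be illustrated with a small sketch in plain Java. A single-threaded executor stands in for the event loop here, and ConfinedChannel, connect, and cachedSender are hypothetical names, not Netty API. Because every access to the cached address runs on that one thread, a plain field is enough and no locking is needed.

```java
import java.net.InetSocketAddress;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a channel whose state is confined to one thread.
final class ConfinedChannel {
    // Assumption: a single-threaded executor plays the role of the event loop.
    private final ExecutorService loop = Executors.newSingleThreadExecutor();
    private InetSocketAddress remote; // plain field: only touched on the loop thread

    // Queues the update on the loop thread; tasks run in submission order.
    void connect(InetSocketAddress address) {
        loop.execute(() -> remote = address);
    }

    // Reads the cached address on the loop thread and waits for the result.
    InetSocketAddress cachedSender() {
        try {
            return loop.submit(() -> remote).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    void close() {
        loop.shutdown();
    }

    public static void main(String[] args) {
        ConfinedChannel channel = new ConfinedChannel();
        channel.connect(InetSocketAddress.createUnresolved("peer-a", 9000));
        System.out.println(channel.cachedSender());
        channel.connect(InetSocketAddress.createUnresolved("peer-b", 9001));
        System.out.println(channel.cachedSender()); // sees the second connect
        channel.close();
    }
}
```

A read queued after a connect always observes that connect, since both tasks run sequentially on the same thread.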