"Lost connection to socket" every 5 seconds
See original GitHub issueWhat version of gRPC and what language are you using?
C# 2.53.0
What operating system (Linux, Windows,…) and version?
Issue observed both Windows 11 and Linux
What runtime / compiler are you using (e.g. .NET Core SDK version dotnet --info
)
.NET 6.0.408
What did you do/see/expect?
I have a .NET client application talking to a Java gRPC server. My custom resolver was being unexpectedly called every 5 seconds so after enabling trace level logs on grpc-dotnet and adding logging for HTTP I see the following message occurring every 5 seconds is there is no traffic between the socket connecting and the first time SocketConnectivitySubchannelTransport.OnCheckSocketConnection
is called:
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.459206138-07:00 trce: Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport[4]
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.459255541-07:00 Subchannel id '1' checking socket 10.240.7.121:5001.
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.459264942-07:00 ############## [EventSource(Private.InternalDiagnostics.System.Net.Sockets, ae391de7-a2cb-557c-dd34-fe00d0b98c7f)] [Info] thisOrContextObject: Socket#43624646, memberName: Poll, message: Poll returns socketCount:0
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.459271942-07:00 ############## [EventSource(Private.InternalDiagnostics.System.Net.Sockets, ae391de7-a2cb-557c-dd34-fe00d0b98c7f)] [Info] thisOrContextObject: Socket#43624646, memberName: Dispose, message: timeout = -1
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.459278243-07:00 ############## [EventSource(Private.InternalDiagnostics.System.Net.Sockets, ae391de7-a2cb-557c-dd34-fe00d0b98c7f)] [Info] thisOrContextObject: Socket#43624646, memberName: Dispose, message: disposing:True Disposed:False
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.459289243-07:00 ############## [EventSource(Private.InternalDiagnostics.System.Net.Sockets, ae391de7-a2cb-557c-dd34-fe00d0b98c7f)] [Info] thisOrContextObject: Socket#43624646, memberName: Dispose, message: Calling _handle.CloseAsIs()
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.459330846-07:00 ############## [EventSource(Private.InternalDiagnostics.System.Net.Sockets, ae391de7-a2cb-557c-dd34-fe00d0b98c7f)] [Info] thisOrContextObject: SafeSocketHandle#46732916, memberName: CloseAsIs, message: shouldClose=True
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.459337746-07:00 ############## [EventSource(Private.InternalDiagnostics.System.Net.Sockets, ae391de7-a2cb-557c-dd34-fe00d0b98c7f)] [Info] thisOrContextObject: SafeSocketHandle#46732916, memberName: ReleaseHandle, message: shouldClose=False
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.459343647-07:00 ############## [EventSource(Private.InternalDiagnostics.System.Net.Sockets, ae391de7-a2cb-557c-dd34-fe00d0b98c7f)] [Info] thisOrContextObject: SafeSocketHandle#46732916, memberName: CloseHandle, message: handle:0x16B
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.459349747-07:00 ############## [EventSource(Private.InternalDiagnostics.System.Net.Sockets, ae391de7-a2cb-557c-dd34-fe00d0b98c7f)] [Info] thisOrContextObject: SafeSocketHandle#46732916, memberName: DoCloseHandle, message: handle:0x16B Following 'non-abortive' branch.
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.459390249-07:00 ############## [EventSource(Private.InternalDiagnostics.System.Net.Sockets, ae391de7-a2cb-557c-dd34-fe00d0b98c7f)] [Info] thisOrContextObject: SafeSocketHandle#46732916, memberName: CloseHandle, message: handle:363, close():SUCCESS
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.460124293-07:00 ############## [EventSource(System.Net.NameResolution, 4b326142-bfb5-5ed3-8585-7714181d14b0)] [ResolutionStart] hostNameOrAddress: fe-e2e-d42b40067b4f4c779181362-az22db33c3-search-svc.gbarberi3
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.460150295-07:00 dbug: Grpc.Net.Client.Balancer.Subchannel[11]
frontend-58846576d8-8z8jd frontend 2023-05-31T12:21:23.460158196-07:00 Subchannel id '1' state changed to Idle. Detail: 'Lost connection to socket.'.
My channel is configured like this:
var channel = GrpcChannel.ForAddress(
address,
channelOptions: new GrpcChannelOptions
{
ThrowOperationCanceledOnCancellation = true,
LoggerFactory = services.GetRequiredService<ILoggerFactory>(),
Credentials = ChannelCredentials.Insecure,
MaxReconnectBackoff = serviceOptions.MaxReconnectBackoff,
MaxReceiveMessageSize = serviceOptions.MaxReceiveMessageSize,
MaxSendMessageSize = serviceOptions.MaxSendMessageSize,
ServiceConfig = new ServiceConfig { LoadBalancingConfigs = { new RoundRobinConfig() } },
HttpHandler = new SocketsHttpHandler
{
ConnectTimeout = serviceOptions.ConnectTimeout, // 30 sec by default
EnableMultipleHttp2Connections = true
},
// Required to dispose of manually created SocketsHttpHandler above
DisposeHttpClient = true,
// Create a service provider to inject the activation manager into the resolver factory
ServiceProvider = new ServiceCollection()
.AddSingleton<ResolverFactory, IndexServiceReaderResolverFactory>()
.AddSingleton<ResolverFactory, IndexServiceWriterResolverFactory>()
.BuildServiceProvider()
}
);
I added a breakpoint to SocketConnectivitySubchannelTransport
and see that IsSocketInBadState
returns true
when the transport is polling for status every 5 seconds which causes the transport to then disconnect.
I also added a breakpoint right after the connection is established in SocketConnectivitySubchannelTransport.TryConnectAsync
here:
SocketConnectivitySubchannelTransportLog.ConnectingSocket(_logger, _subchannel.Id, currentAddress);
await _socketConnect(socket, currentAddress.EndPoint, context.CancellationToken).ConfigureAwait(false);
SocketConnectivitySubchannelTransportLog.ConnectedSocket(_logger, _subchannel.Id, currentAddress);
Calling IsSocketInBadState(socket)
in the breakpoint above is returning true
. The socket is healthy but there is data pending (socket.Available==40):
{System.Net.Sockets.Socket}
AddressFamily: InterNetworkV6
Available: 40
Blocking: true
Connected: true
DontFragment: Exception of type 'System.NotSupportedException' was thrown
DualMode: true
EnableBroadcast: Exception of type 'System.Net.Sockets.SocketException' was thrown
ExclusiveAddressUse: false
Handle: 0xd38
IsBound: true
LingerState: {System.Net.Sockets.LingerOption}
LocalEndPoint: {[::ffff:127.0.0.1]:64546}
MulticastLoopback: Exception of type 'System.Net.Sockets.SocketException' was thrown
NoDelay: true
ProtocolType: Tcp
ReceiveBufferSize: 65536
ReceiveTimeout: 0
RemoteEndPoint: null
SafeHandle: {System.Net.Sockets.SafeSocketHandle}
SendBufferSize: 65536
SendTimeout: 0
SocketType: Stream
Ttl: 128
UseOnlyOverlappedIO: false
The IsSocketInBadState
function will return true
if there is data pending according to the comments and socket.Poll
docs:
private static bool IsSocketInBadState(Socket socket)
{
try
{
// Will return true if closed or there is pending data for some reason.
return socket.Poll(0, SelectMode.SelectRead);
}
catch (Exception e) when (e is SocketException || e is ObjectDisposedException)
{
return false;
}
}
The 40 bytes in the socket are 00001204000000000000037FFFFFFF000400100000000600002000000004080000000000000F0001
which I don’t know what they mean but searching for those bytes I see references to java gRPC service.
Issue Analytics
- State:
- Created 4 months ago
- Reactions:3
- Comments:8 (4 by maintainers)
Top GitHub Comments
Connecting to Java grpc immediately returns 40 bytes [0,0,18,4,0,0,0,0,0,0,3,127,255,255,255,0,4,0,16,0,0,0,6,0,0,32,0,0,0,4,8,0,0,0,0,0,0,15,0,1] connect to C-core-based gRPC C# Server return 46 bytes Server code
I tested the 46 byte result from Grpc.Core is, and it’s an HTTP/2 SETTINGS frame.
I think the PR at https://github.com/grpc/grpc-dotnet/pull/2179 is on the right track, but it can be improved to cache the initial frame and replay it to the client once it needs a connection.