Connection reset by peer when using AWS Lambda
Steps to reproduce
- Deploy a simple .NET Core 3.x app that uses aurora-postgresql (version >= 11) to AWS Lambda. Example of the configuration:
services.AddDbContextPool<DriveLogsDbContext, DriveLogsDbPostgresContext>(options =>
options.UseNpgsql(_config.GetConnectionString("PostgresDriveLogsDb"),
opts => opts.SetPostgresVersion(12, 4))
.UseSnakeCaseNamingConvention());
Connection string (pool size does not matter)
"server=some-aurora-postgres-rds-endpoint;userid=root;pwd=pwd;port=5432;database=dbname;Minimum Pool Size=5;"
- curl some endpoint which accesses the db (read only is fine)
- Wait for 10 minutes (time could vary)
- curl the same endpoint again
The issue
During the second request, the connection from the Lambda to the db is interrupted in the middle of execution:
[Error] Microsoft.EntityFrameworkCore.Database.Command: Failed executing DbCommand (22ms)
[Parameters=[@__filter_Name_0='?'], CommandType='Text', CommandTimeout='30']
SELECT COUNT(*)::INT FROM some_table AS l WHERE some_table.name = @__filter_Name_0
[Error] Microsoft.EntityFrameworkCore.Database.Command: Failed executing DbCommand (22ms)
[Parameters=[@__filter_Name_0='?'], CommandType='Text', CommandTimeout='30']
SELECT COUNT(*)::INT FROM some_table AS l WHERE some_table.name = @__filter_Name_0
[Error] Microsoft.EntityFrameworkCore.Query: An exception occurred while iterating over the results of a query for context type 'DriveLogs.Data.DriveLogsDbPostgresContext'.
System.InvalidOperationException: An exception has been raised that is likely due to a transient failure.
---> Npgsql.NpgsqlException (0x80004005): Exception while reading from stream
---> System.IO.IOException: Unable to read data from the transport connection: Connection reset by peer.
---> System.Net.Sockets.SocketException (104): Connection reset by peer
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
--- End of inner exception stack trace ---
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at Npgsql.NpgsqlReadBuffer.<>c__DisplayClass30_0.<<Ensure>g__EnsureLong|0>d.MoveNext()
at Npgsql.NpgsqlReadBuffer.<>c__DisplayClass30_0.<<Ensure>g__EnsureLong|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at Npgsql.NpgsqlConnector.<>c__DisplayClass160_0.<<DoReadMessage>g__ReadMessageLong|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at Npgsql.NpgsqlConnector.<>c__DisplayClass160_0.<<DoReadMessage>g__ReadMessageLong|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at Npgsql.NpgsqlDataReader.NextResult(Boolean async, Boolean isConsuming)
at Npgsql.NpgsqlDataReader.NextResult()
at Npgsql.NpgsqlCommand.ExecuteReaderAsync(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior)
at Npgsql.NpgsqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
at System.Data.Common.DbCommand.ExecuteReader()
at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReader(RelationalCommandParameterObject parameterObject)
at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.Enumerator.InitializeReader(DbContext _, Boolean result)
at Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.NpgsqlExecutionStrategy.Execute[TState,TResult](TState state, Func`3 operation, Func`3 verifySucceeded)
--- End of inner exception stack trace ---
at Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.NpgsqlExecutionStrategy.Execute[TState,TResult](TState state, Func`3 operation, Func`3 verifySucceeded)
at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.Enumerator.MoveNext()
Postgres log:
2021-02-22 05:49:00 UTC:172.20.66.0(51071):root@drivelogs:[31559]:LOG:
could not receive data from client: Connection reset by peer
Further technical details
Npgsql version: 4.1.8.0
PostgreSQL version: 12.4
Operating system: AWS Lambda
Other details about my project setup:
- The same app with MySQL works fine, so this is not an issue with our AWS setup or Lambda. But it could be an AWS Postgres issue.
- I have two completely different projects; the second one uses Postgres 11.x (I don't remember the minor version) and experiences the same issue.
- The issue does not happen during a Lambda cold start, and there must be a considerable delay between requests to reproduce it.
- I have seen similar tickets here before, but all of them were closed for some reason without a solution. Now I have the time, a test stand, and a strong wish to fix the issue. What I'm asking for is some guidance on how to debug it.
- If it would help, I can create a simple example project and push it to a public repo.
Issue Analytics
- State:
- Created 3 years ago
- Comments:14 (4 by maintainers)
Our team encountered the exact same problem, and I delved a bit deeper. Hopefully someone will find the notes below useful.
AWS Lambda purges idle connections over time (AWS docs). Allegedly the threshold is 350 seconds (Couchbase forum).
If a Lambda function’s two consecutive invocations are separated by about 10 minutes:
Adding Keepalive=30; to the connection string seems to work in practice, but in an unexpected way, and probably with a race condition.
An alternative solution without a race condition: set ConnectionLifetime to a positive value below the Lambda maximum connection idle time, e.g. 180 seconds.
We upgraded to Npgsql 5 and set ConnectionLifetime=300 in our functions yesterday. We have not seen any connection errors since then.
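For readers who prefer to set this in code rather than by editing the connection string by hand, here is a minimal configuration sketch. The class and method names (LambdaConnectionStrings, WithBoundedLifetime) are hypothetical and not part of the original report. The 180-second value is the example figure from above, chosen only because it sits below the reported 350-second idle threshold. It assumes the Npgsql package is referenced.

```csharp
using Npgsql;

// Hypothetical helper, not from the original issue: caps the lifetime of
// pooled connections so they are retired before Lambda can drop them.
public static class LambdaConnectionStrings
{
    // Assumed value: must stay below the idle threshold
    // (reportedly about 350 seconds).
    private const int ConnectionLifetimeSeconds = 180;

    public static string WithBoundedLifetime(string baseConnectionString)
    {
        // NpgsqlConnectionStringBuilder parses the existing string and
        // exposes each setting as a typed property.
        var builder = new NpgsqlConnectionStringBuilder(baseConnectionString)
        {
            // Equivalent to adding the connection lifetime setting
            // to the connection string by hand.
            ConnectionLifetime = ConnectionLifetimeSeconds
        };

        return builder.ConnectionString;
    }
}
```

With the setup from the issue, the result could be passed to UseNpgsql in place of the raw _config.GetConnectionString("PostgresDriveLogsDb") value. Whether this fully removes the resets on a given Npgsql version should be confirmed by testing, since the reports above are from Npgsql 5.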