Error in `node': corrupted double-linked list (TLS / SSL crashes)
Problem description
The app crashes with a backtrace, and we’re unsure of the cause. I’ve attached the trace log.
Excerpt from it:
*** Error in `node': corrupted double-linked list: 0x000000002b02cca0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x70bfb)[0x7f894cfb4bfb]
/lib/x86_64-linux-gnu/libc.so.6(+0x76fc6)[0x7f894cfbafc6]
/lib/x86_64-linux-gnu/libc.so.6(+0x7733d)[0x7f894cfbb33d]
/lib/x86_64-linux-gnu/libc.so.6(+0x78dfa)[0x7f894cfbcdfa]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7f894cfbef64]
node(CRYPTO_zalloc+0x61)[0x14460b1]
node(SSL_new+0x39)[0x13519f9]
/home/node/node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-glibc/grpc_node.node(+0x6c845)[0x7f894c495845]
/home/node/node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-glibc/grpc_node.node(+0x51076)[0x7f894c47a076]
/home/node/node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-glibc/grpc_node.node(+0x2b357)[0x7f894c454357]
/home/node/node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-glibc/grpc_node.node(+0x6d7d5)[0x7f894c4967d5]
/home/node/node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-glibc/grpc_node.node(+0x7c2f5)[0x7f894c4a52f5]
node[0xa55559]
node[0xa5a5c8]
node(uv_run+0x14b)[0xa4a21b]
node(_ZN4node5StartEPN2v87IsolateEPNS_11IsolateDataERKSt6vectorISsSaISsEES9_+0x565)[0x8e6f45]
node(_ZN4node5StartEiPPc+0x469)[0x8e5239]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f894cf642e1]
node[0x89ed85]
...
- This may have started when a client implemented managed channels on its side, but we’re unsure how that would lead to crashes on the server.
- The client may hold a channel open for 5-30s before closing it entirely (which, as I understand it, contributes to memory consumption while the socket stays open)
- Our grpc servers get millions of grpc connections a day, and we generally don’t control how the clients connect to the server (there are multiple client implementations and devices)
- We use a cluster of docker containers with 3GB of memory on multiple host boxes; horizontal or vertical scaling only delays the problem
- Our grpc connection requires TLS
- We’re unable to reproduce the problem locally at the moment; it only seems to happen in our high-traffic production environments. When I do memory profiling locally, I see gc happening properly, but I’m only launching 5-10 of our own test clients at once to test our various flows.
Questions:
- (might not be one for you to answer) If node runs out of memory, we wouldn’t see a corrupted double-linked list type error, but an error specific to running out of memory, correct?
- Any potential ideas of where / how to troubleshoot based on looking at the trace log?
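One way to check whether memory pressure lines up with the crashes is to log the process's memory figures at intervals and compare them with the container limit around the time of a crash. The following is a minimal sketch that uses only Node's built-in `process.memoryUsage()`. The helper names, the MiB rounding, and the one-minute default interval are assumptions made for illustration, not part of grpc or of the app described here.

```javascript
// Hypothetical helpers: names and output format are illustrative only.

// Return the current memory figures of this process, rounded to MiB.
function memorySnapshot() {
  const { rss, heapTotal, heapUsed, external } = process.memoryUsage();
  const toMiB = (bytes) => Math.round(bytes / (1024 * 1024));
  return {
    rssMiB: toMiB(rss),             // resident set size, includes native allocations
    heapTotalMiB: toMiB(heapTotal), // V8 heap reserved
    heapUsedMiB: toMiB(heapUsed),   // V8 heap in use
    externalMiB: toMiB(external),   // C++ objects bound to JavaScript objects
  };
}

// Log a snapshot every intervalMs milliseconds.
// unref() keeps this timer from holding the process open on shutdown.
function startMemoryLogging(intervalMs = 60000) {
  const timer = setInterval(() => {
    console.log('memory', JSON.stringify(memorySnapshot()));
  }, intervalMs);
  timer.unref();
  return timer;
}
```

If `rssMiB` keeps growing while `heapUsedMiB` stays flat, the growth is probably in native allocations (for example inside an addon) rather than in JavaScript objects, which narrows where to look.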
Environment
grpc-node versions we’ve tried: 1.20.3, 1.16.1
Node Version: 10.15.2 (we’ve also upgraded node along the way in the 10.x series)
OS: Debian-based (using docker node:10.5.2 image), we do not do any package updates (eg apt update, install latest devtools, etc)
Additional info
We’ve noticed that we get a lot of these entries in our logs - it could be clients that are misconfigured, unsure if it eventually leads to crashing:
E0702 22:22:08.625087997 7 ssl_transport_security.cc:1229] Handshake failed with fatal error SSL_ERROR_SSL: error:1417A0C1:SSL routines:tls_post_process_client_hello:no shared cipher.
E0702 22:23:23.517829514 7 tcp_server_custom.cc:220] getpeername error: {"created":"@1562106203.517802318","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0702 22:23:54.952124948 7 tcp_server_custom.cc:220] getpeername error: {"created":"@1562106234.952105465","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0702 22:27:13.495415849 7 ssl_transport_security.cc:1229] Handshake failed with fatal error SSL_ERROR_SSL: error:14094416:SSL routines:ssl3_read_bytes:sslv3 alert certificate unknown.
E0702 22:27:20.835176028 7 tcp_server_custom.cc:220] getpeername error: {"created":"@1562106440.835163957","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0702 22:27:26.764072732 7 tcp_server_custom.cc:220] getpeername error: {"created":"@1562106446.764040019","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0702 22:27:40.745594638 7 ssl_transport_security.cc:1229] Handshake failed with fatal error SSL_ERROR_SSL: error:14094416:SSL routines:ssl3_read_bytes:sslv3 alert certificate unknown.
E0702 22:28:19.820591172 7 tcp_server_custom.cc:220] getpeername error: {"created":"@1562106499.820576136","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
We also get these in our logs:
E0703 00:14:17.961441781 7 tcp_server_custom.cc:220] getpeername error: {"created":"@1562112857.961430955","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0703 00:14:51.348432608 7 tcp_server_custom.cc:220] getpeername error: {"created":"@1562112891.348418529","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0703 00:15:32.933723158 7 tcp_server_custom.cc:220] getpeername error: {"created":"@1562112932.933711032","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0703 00:16:14.605935643 7 tcp_server_custom.cc:220] getpeername error: {"created":"@1562112974.605925035","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0703 00:16:27.814974144 7 tcp_server_custom.cc:220] getpeername error: {"created":"@1562112987.814961950","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
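To see how often each kind of entry occurs, and whether the rate changes before a crash, the log lines can be tallied by source file and reason. This is a minimal sketch that assumes every entry follows the layout of the lines quoted above (severity letter and date, time, thread id, `file:line]`, message). The regular expression and the function names are illustrative and do not come from grpc.

```javascript
// Matches lines such as:
// E0702 22:22:08.625087997 7 ssl_transport_security.cc:1229] Handshake failed ...
const LINE_RE = /^([DIE])(\d{4}) (\S+)\s+(\d+) ([^:\s]+):(\d+)\] (.*)$/;

// Reduce one log line to a "file: reason" key, or null if it does not match.
function classify(line) {
  const m = LINE_RE.exec(line.trim());
  if (!m) return null;
  const file = m[5];
  const message = m[7];
  let reason;
  const jsonStart = message.indexOf('{');
  if (jsonStart !== -1) {
    // Entries like the getpeername errors carry a JSON payload.
    try {
      const payload = JSON.parse(message.slice(jsonStart));
      reason = payload.os_error || payload.description;
    } catch (err) {
      reason = undefined; // fall through to the plain-text rule below
    }
  }
  if (reason === undefined) {
    // Plain-text entries: keep the part after the last colon, without the final period.
    reason = message.split(':').pop().trim().replace(/\.$/, '');
  }
  return `${file}: ${reason}`;
}

// Count how many lines fall under each key; non-matching lines are skipped.
function tally(lines) {
  const counts = new Map();
  for (const line of lines) {
    const key = classify(line);
    if (key === null) continue;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}
```

The input could come from something like `fs.readFileSync(path, 'utf8').split('\n')`, where `path` points at the captured server log.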
Issue Analytics
- State:
- Created 4 years ago
- Comments:14 (9 by maintainers)
The notice is there right at the top of the library’s README, and it is marked as deprecated in npm. If you’re talking about this repository as a whole, both packages live here so that would not be the appropriate place for that notice.
If that is the problem, it’s fixed on the latest version of the library. The grpc package is on version 1.24 of the core, in which that bug was fixed in grpc/grpc#26750. That update was picked up in #1861 and published in version 1.24.11 of the library.
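To check whether a deployment already includes that release, the installed version can be compared against 1.24.11. The following is a minimal sketch with hypothetical helper names, not part of the grpc package. It compares the three numeric components and ignores any pre-release suffix.

```javascript
// Hypothetical helpers, not part of the grpc package.

// Split "1.24.11" into [1, 24, 11]; throws on strings that do not start with x.y.z.
function parseVersion(version) {
  const m = /^(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!m) throw new Error(`Unrecognized version: ${version}`);
  return m.slice(1).map(Number);
}

// True when version >= minimum, comparing components numerically
// (so 1.24.11 is newer than 1.24.2, unlike a plain string comparison).
function isAtLeast(version, minimum) {
  const a = parseVersion(version);
  const b = parseVersion(minimum);
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] > b[i];
  }
  return true;
}

// Release named in the comment above as containing the fix.
const FIXED_IN = '1.24.11';
```

Assuming the package is installed under the name `grpc`, a call such as `isAtLeast(require('grpc/package.json').version, FIXED_IN)` would report whether the installed copy is at or past that release. Both versions listed in the report (1.20.3 and 1.16.1) come out below it.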