Error in `node': corrupted double-linked list (TLS / SSL crashes)

Problem description

The app crashes with a core trace. We’re unsure of the cause. I’ve attached the trace log.

dump.txt

Excerpt from it:

*** Error in `node': corrupted double-linked list: 0x000000002b02cca0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x70bfb)[0x7f894cfb4bfb]
/lib/x86_64-linux-gnu/libc.so.6(+0x76fc6)[0x7f894cfbafc6]
/lib/x86_64-linux-gnu/libc.so.6(+0x7733d)[0x7f894cfbb33d]
/lib/x86_64-linux-gnu/libc.so.6(+0x78dfa)[0x7f894cfbcdfa]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7f894cfbef64]
node(CRYPTO_zalloc+0x61)[0x14460b1]
node(SSL_new+0x39)[0x13519f9]
/home/node/node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-glibc/grpc_node.node(+0x6c845)[0x7f894c495845]
/home/node/node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-glibc/grpc_node.node(+0x51076)[0x7f894c47a076]
/home/node/node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-glibc/grpc_node.node(+0x2b357)[0x7f894c454357]
/home/node/node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-glibc/grpc_node.node(+0x6d7d5)[0x7f894c4967d5]
/home/node/node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-glibc/grpc_node.node(+0x7c2f5)[0x7f894c4a52f5]
node[0xa55559]
node[0xa5a5c8]
node(uv_run+0x14b)[0xa4a21b]
node(_ZN4node5StartEPN2v87IsolateEPNS_11IsolateDataERKSt6vectorISsSaISsEES9_+0x565)[0x8e6f45]
node(_ZN4node5StartEiPPc+0x469)[0x8e5239]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f894cf642e1]
node[0x89ed85]

...
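One way to read an excerpt like this is to reduce each frame to its module and symbol. The sketch below is only an illustration: `FRAME_RE`, `parse_frames`, and `SAMPLE` are hypothetical names, and the pattern assumes the glibc frame layout seen in the excerpt (`module(symbol+offset)[address]`). On the frames quoted above it shows the abort happening inside libc's `malloc`, reached through `CRYPTO_zalloc` and `SSL_new`, which were called from `grpc_node.node`.

```python
import re
from collections import Counter

# Assumed frame layout, taken from the excerpt above:
#   /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7f894cfbef64]
#   node[0xa55559]
FRAME_RE = re.compile(
    r"^(?P<module>[^(\[]+)"                           # binary path or name
    r"(?:\((?P<symbol>[^+)]*)\+0x[0-9a-fA-F]+\))?"    # optional (symbol+offset)
    r"\[0x[0-9a-fA-F]+\]$"                            # return address
)


def parse_frames(text):
    """Return (module basename, symbol or None) for every frame line."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            module = match.group("module").rsplit("/", 1)[-1]
            frames.append((module, match.group("symbol") or None))
    return frames


# A few frames copied from the excerpt, innermost first.
SAMPLE = """\
/lib/x86_64-linux-gnu/libc.so.6(+0x78dfa)[0x7f894cfbcdfa]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7f894cfbef64]
node(CRYPTO_zalloc+0x61)[0x14460b1]
node(SSL_new+0x39)[0x13519f9]
/home/node/node_modules/grpc/src/node/extension_binary/node-v64-linux-x64-glibc/grpc_node.node(+0x6c845)[0x7f894c495845]
node[0xa55559]
node(uv_run+0x14b)[0xa4a21b]
"""

frames = parse_frames(SAMPLE)
named_symbols = [symbol for _, symbol in frames if symbol]
frames_per_module = Counter(module for module, _ in frames)

for module, symbol in frames:
    print(f"{module:16} {symbol or '(no symbol)'}")
```

Frames without a symbol name (most of the `grpc_node.node` and `libc.so.6` entries) would need the matching debug symbols to be resolved further.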
  • This may have started when a client implemented managed channels on the client side, but we’re unsure how that would lead to crashes on the server.
    • The client may hold a channel open for 5-30s before closing it entirely (which I understand would add to memory consumption, since the socket stays open).
  • Our grpc servers get millions of grpc connections a day, and we generally don’t control how the client implementations (there are multiple clients/devices) connect to the server.
  • We use a cluster of docker containers with 3GB of memory on multiple host boxes. Horizontal / vertical scaling only delays the problem.
  • Our grpc connection requires TLS.
  • We’re unable to reproduce the problem locally at the moment. When I do memory profiling, I see gc happening properly, but I’m only launching 5-10 of our own test clients at once to test our various flows. The crash only seems to happen in our high-traffic production environments.

Questions:

  • (might not be one for you to answer) If node runs out of memory, we would see an error specific to running out of memory rather than a corrupted double-linked list error, correct?
  • Any ideas on where / how to troubleshoot, based on the trace log?

Environment

grpc-node versions we’ve tried:

  • 1.20.3
  • 1.16.1

Node Version: 10.15.2 (we’ve also upgraded node along the way in the 10.x series)

OS: Debian-based (using the docker node:10.5.2 image). We do not do any package updates (e.g. apt update, installing the latest devtools, etc.)

Additional info

We’ve noticed a lot of these entries in our logs. They could come from misconfigured clients, and we’re unsure whether they eventually lead to the crash:

E0702 22:22:08.625087997       7 ssl_transport_security.cc:1229] Handshake failed with fatal error SSL_ERROR_SSL: error:1417A0C1:SSL routines:tls_post_process_client_hello:no shared cipher.
E0702 22:23:23.517829514       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562106203.517802318","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0702 22:23:54.952124948       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562106234.952105465","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0702 22:27:13.495415849       7 ssl_transport_security.cc:1229] Handshake failed with fatal error SSL_ERROR_SSL: error:14094416:SSL routines:ssl3_read_bytes:sslv3 alert certificate unknown.
E0702 22:27:20.835176028       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562106440.835163957","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0702 22:27:26.764072732       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562106446.764040019","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0702 22:27:40.745594638       7 ssl_transport_security.cc:1229] Handshake failed with fatal error SSL_ERROR_SSL: error:14094416:SSL routines:ssl3_read_bytes:sslv3 alert certificate unknown.
E0702 22:28:19.820591172       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562106499.820576136","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}

We also get these in our logs:

E0703 00:14:17.961441781       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562112857.961430955","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0703 00:14:51.348432608       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562112891.348418529","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0703 00:15:32.933723158       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562112932.933711032","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0703 00:16:14.605935643       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562112974.605925035","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
E0703 00:16:27.814974144       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562112987.814961950","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}
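To judge whether these entries track the crashes, it can help to count them by kind over time. The following is a minimal sketch, assuming the log line layout shown above (severity letter and date, time, thread id, `file:line]`, message). `LOG_RE`, `classify`, and `tally` are hypothetical helper names, not part of grpc.

```python
import json
import re
from collections import Counter

# Assumed layout of a log line, based on the entries above:
#   E0702 22:23:23.517829514       7 tcp_server_custom.cc:220]   message...
LOG_RE = re.compile(
    r"^(?P<severity>[DIE])(?P<date>\d{4}) (?P<time>[\d:.]+)\s+(?P<thread>\d+) "
    r"(?P<source>[\w.]+):(?P<line>\d+)\]\s+(?P<message>.*)$"
)


def classify(message):
    """Map a log message to a short label for counting."""
    if message.startswith("Handshake failed"):
        # The reason is the last colon-separated field of the OpenSSL error.
        reason = message.rstrip(".").rsplit(":", 1)[-1].strip()
        return "handshake: " + reason
    if message.startswith("getpeername error:"):
        payload = message.split(":", 1)[1].strip()
        try:
            return "getpeername: " + json.loads(payload).get("os_error", "unknown")
        except json.JSONDecodeError:
            return "getpeername: unparsed"
    return "other"


def tally(lines):
    """Count recognised log lines by label; unrecognised lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match:
            counts[classify(match.group("message"))] += 1
    return counts


# Four entries copied from the logs above.
SAMPLE_LINES = [
    'E0702 22:22:08.625087997       7 ssl_transport_security.cc:1229] Handshake failed with fatal error SSL_ERROR_SSL: error:1417A0C1:SSL routines:tls_post_process_client_hello:no shared cipher.',
    'E0702 22:23:23.517829514       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562106203.517802318","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}',
    'E0702 22:27:13.495415849       7 ssl_transport_security.cc:1229] Handshake failed with fatal error SSL_ERROR_SSL: error:14094416:SSL routines:ssl3_read_bytes:sslv3 alert certificate unknown.',
    'E0702 22:27:20.835176028       7 tcp_server_custom.cc:220]   getpeername error: {"created":"@1562106440.835163957","description":"getpeername failed","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":73,"grpc_status":14,"os_error":"socket is not connected"}',
]

counts = tally(SAMPLE_LINES)
for label, count in counts.most_common():
    print(count, label)
```

Bucketing the same counts by the `time` field and comparing them with crash timestamps would show whether a rise in failed handshakes or disconnected sockets precedes each crash. That correlation is a suggestion for investigation, not something the logs above establish.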

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Comments: 14 (9 by maintainers)

Top GitHub Comments

murgatroid99 commented, Jan 13, 2022

The notice is there right at the top of the library’s README, and it is marked as deprecated in npm. If you’re talking about this repository as a whole, both packages live here so that would not be the appropriate place for that notice.

murgatroid99 commented, Jan 13, 2022

If that is the problem, it’s fixed on the latest version of the library. The grpc package is on version 1.24 of the core, in which that bug was fixed in grpc/grpc#26750. That update was picked up in #1861 and published in version 1.24.11 of the library.
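As a quick check against the comment above, an installed grpc version can be compared numerically with 1.24.11. This is a rough sketch: `parse_version` and `includes_fix` are hypothetical helpers, and they only handle plain dotted numeric versions (no pre-release tags). Whether the referenced bug is the actual cause of this crash remains unconfirmed, as the comment itself notes.

```python
FIXED_IN = "1.24.11"  # library version named in the maintainer comment above


def parse_version(version):
    """Turn '1.24.11' into (1, 24, 11) so components compare as numbers."""
    return tuple(int(part) for part in version.split("."))


def includes_fix(installed, fixed_in=FIXED_IN):
    """True if the installed version is at or past the release with the fix."""
    return parse_version(installed) >= parse_version(fixed_in)


# The two versions listed under Environment, plus the release with the fix.
for candidate in ("1.16.1", "1.20.3", "1.24.11"):
    status = "includes fix" if includes_fix(candidate) else "predates fix"
    print(candidate, status)
```

Comparing tuples of integers avoids the string-comparison trap where "1.24.9" would sort after "1.24.11".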
