Synapse keeps being killed due to healthcheck failures

See original GitHub issue

Description

I am running a docker stack with the official Synapse image in it.

At seemingly random times, Synapse just shuts down. See the log: it receives a SIGTERM and shuts itself down. Everything before the point where the copied log starts is just normal Synapse logging.

It’s driving me absolutely nuts, why would it possibly do this? Is this a new ‘feature’ in 1.65?

Steps to reproduce

  • Start Synapse service
  • Wait a few minutes, hours, or days

Homeserver

My own homeserver

Synapse Version

v1.65.0

Installation Method

Docker (matrixdotorg/synapse)

Platform

Docker on debian

Relevant log output

matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:30,627 - synapse.storage.databases.main.event_push_actions - 969 - INFO - rotate_notifs-62 - Rotating notifications
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:30,628 - synapse.storage.databases.main.event_push_actions - 1130 - INFO - rotate_notifs-62 - Rotating notifications up to: 1326650
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:30,631 - synapse.storage.databases.main.event_push_actions - 1218 - INFO - rotate_notifs-62 - Rotating notifications, handling 0 rows
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:30,661 - synapse.storage.databases.main.event_push_actions - 1293 - INFO - rotate_notifs-62 - Rotating notifications, deleted 0 push actions
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:30,951 - synapse.util.caches.lrucache - 212 - INFO - LruCache._expire_old_entries-62 - Dropped 0 items from caches
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:39,819 - twisted - 274 - INFO - sentinel - Received SIGTERM, shutting down.
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:39,820 - synapse.storage.databases.main.lock - 92 - INFO - LockStore._on_shutdown-0 - Dropping held locks due to shutdown
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:39,821 - synapse.storage.databases.main.lock - 101 - INFO - LockStore._on_shutdown-0 - Dropped locks due to shutdown
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:39,821 - synapse.handlers.presence - 766 - INFO - presence.on_shutdown-0 - Performing _on_shutdown. Persisting 7 unpersisted changes
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:39,822 - synapse.app._base - 492 - INFO - sentinel - Shutting down...
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:39,912 - synapse.handlers.presence - 779 - INFO - presence.on_shutdown-0 - Finished _on_shutdown
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:39,913 - synapse.http.site - 362 - INFO - GET-553 - Connection from client lost before response was sent
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:39,914 - synapse.http.site - 362 - INFO - GET-556 - Connection from client lost before response was sent
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:39,914 - twisted - 274 - INFO - sentinel - (TCP Port 8008 Closed)
matrix_synapse.1.mybcsb5py8xx@beast    | 2022-08-25 05:31:39,919 - twisted - 274 - INFO - sentinel - Main loop terminated.

Anything else that would be useful to know?

No response

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 17 (7 by maintainers)

Top GitHub Comments

IeuanK commented, Aug 25, 2022 (1 reaction)

I sadly don’t seem to be getting any events for the task that was shut down, only the current one (cwo8bra2rt2u)

cwo8bra2rt2u   matrix_synapse.1        matrixdotorg/synapse:v1.65.0           beast     Running         Running 2 hours ago
9k6elyuqfm12    \_ matrix_synapse.1    matrixdotorg/synapse:v1.65.0           beast     Shutdown        Complete 2 hours ago
css4r2i9s8he    \_ matrix_synapse.1    matrixdotorg/synapse:v1.65.0           beast     Shutdown        Complete 2 hours ago
p4owu5h29kh2    \_ matrix_synapse.1    matrixdotorg/synapse:v1.65.0           beast     Shutdown        Complete 5 hours ago
y26oljfjpyin    \_ matrix_synapse.1    matrixdotorg/synapse:v1.65.0           beast     Shutdown        Complete 5 hours ago

These are the current and shut-down tasks. I’ve tried various timestamps and checked for the four IDs that were shut down, but no dice. I’ll start logging the events to a file manually, then check that file the next time it crashes.
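For anyone wanting to do the same, here is a rough sketch of how such an event log could be filtered. It assumes the events come from `docker events --format '{{json .}}'` (one JSON object per line) and that each object has `Type`, `Action`, `Actor.Attributes.name` and `time` fields. The function name, the watched actions and the `matrix_synapse` name fragment are illustrative choices, not something taken from this thread.

```python
import json
import sys
from typing import Dict, Iterable, Iterator

# Actions worth keeping when hunting for an unexplained SIGTERM.
# Health events are assumed to look like "health_status: unhealthy".
WATCHED_PREFIXES = ("health_status", "kill", "die", "stop")


def container_events(lines: Iterable[str], name_fragment: str) -> Iterator[Dict]:
    """Yield watched container events whose container name contains name_fragment."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON event line
        if not isinstance(event, dict) or event.get("Type") != "container":
            continue
        action = event.get("Action", "")
        name = event.get("Actor", {}).get("Attributes", {}).get("name", "")
        if name_fragment in name and action.startswith(WATCHED_PREFIXES):
            yield {"time": event.get("time"), "name": name, "action": action}


def main() -> None:
    """Read event lines from stdin and print the interesting ones as JSON."""
    for event in container_events(sys.stdin, "matrix_synapse"):
        print(json.dumps(event), flush=True)
```

To use it as a script, call `main()` at the bottom of the file and pipe the Docker event stream into it, redirecting the output to a file. A `health_status: unhealthy` entry shortly before a `kill` would point at the health check rather than at Synapse itself.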

IeuanK commented, Sep 14, 2022 (0 reactions)

I think so, all tasks in my Matrix stack have been running since I rebooted the docker daemon for something 5 days ago.

So I guess this issue can be closed, with increasing the health check timeout as the solution. Thanks @DMRobertson and @richvdh for helping me figure this out!
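Before picking a new timeout, it may help to measure how long the health endpoint actually takes to answer. The sketch below is only an illustration: it assumes Synapse answers on `http://localhost:8008/health` (port 8008 appears in the log above; the `/health` path is an assumption here), and `probe_health` and `would_fail` are hypothetical helper names.

```python
import http.client
import time
import urllib.request
from typing import List, Tuple


def probe_health(url: str, timeout_s: float) -> Tuple[bool, float]:
    """Send one GET to url and return (healthy, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            healthy = 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts and HTTP error statuses.
        healthy = False
    return healthy, time.monotonic() - start


def would_fail(elapsed_samples: List[float], timeout_s: float) -> List[float]:
    """Return the samples that would have exceeded a given health check timeout."""
    return [t for t in elapsed_samples if t > timeout_s]
```

Calling `probe_health("http://localhost:8008/health", 30.0)` periodically from inside the container and passing the collected timings to `would_fail` with the currently configured timeout gives a rough idea of how often the check would have tripped, and how much headroom a larger timeout needs.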
