Patronictl List not displaying a Node

Hi Team,

Below are the versions:

Patroni : 1.6.5
PostgreSQL : 11.8
OS : Ubuntu 18.04

I have come across a strange issue and need some help with it. One of my nodes is not showing in the patronictl list output, even though the node appears to be active and data is being replicated to it properly. The node in question, “pg3”, also doesn’t appear in the members list when I check the etcd server. The issue is resolved only when I restart the service.

postgres@q-sw-pgdb-r05:/var/log/postgresql$ patronictl -c /etc/patroni/deepthought1.yml list
+ Cluster: deepthought (6828106004191544221) ----+-----------+
| Member |      Host     |  Role  |  State  | TL | Lag in MB |
+--------+---------------+--------+---------+----+-----------+
|  pg1   | 10.47.226.202 | Leader | running |  4 |           |
|  pg2   | 10.47.226.203 |        | running |  4 |         0 |
+--------+---------------+--------+---------+----+-----------+

postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 115230
usesysid         | 16384
usename          | replicator
application_name | pg2
client_addr      | 10.47.226.203
client_hostname  |
client_port      | 50754
backend_start    | 2020-05-25 23:21:03.287489-07
backend_xmin     |
state            | streaming
sent_lsn         | 7/DA04ABF8
write_lsn        | 7/DA04ABF8
flush_lsn        | 7/DA04ABF8
replay_lsn       | 7/DA04ABF8
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async
-[ RECORD 2 ]----+------------------------------
pid              | 27783
usesysid         | 16384
usename          | replicator
application_name | pg3
client_addr      | 10.47.226.80
client_hostname  |
client_port      | 41776
backend_start    | 2020-05-28 03:17:51.67092-07
backend_xmin     |
state            | streaming
sent_lsn         | 7/DA04ABF8
write_lsn        | 7/DA04ABF8
flush_lsn        | 7/DA04ABF8
replay_lsn       | 7/DA04ABF8
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async

However, the same issue crops up again after a few hours, and I am struggling to track down the cause from the PostgreSQL and Patroni logs.
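
The mismatch described in this post (pg3 streams from the leader but is missing from the patronictl list output) can be checked mechanically. The following is only a minimal sketch, not part of Patroni: the function names `patronictl_members` and `unregistered_replicas` are hypothetical, and it assumes that each replica's application_name in pg_stat_replication equals its Patroni member name, as it does in the output shown in this post.

```python
def patronictl_members(table_text):
    """Extract member names from the text of a `patronictl list` table."""
    members = set()
    for line in table_text.splitlines():
        line = line.strip()
        # Data rows start with '|'; border and title rows start with '+'.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header row and any row with an empty first column.
        if cells and cells[0] and cells[0] != "Member":
            members.add(cells[0])
    return members


def unregistered_replicas(table_text, application_names):
    """Return replicas that stream from the leader but are not listed as members."""
    return sorted(set(application_names) - patronictl_members(table_text))


if __name__ == "__main__":
    # Table copied from the post; application names taken from pg_stat_replication.
    sample = """\
+ Cluster: deepthought (6828106004191544221) ----+-----------+
| Member |      Host     |  Role  |  State  | TL | Lag in MB |
+--------+---------------+--------+---------+----+-----------+
|  pg1   | 10.47.226.202 | Leader | running |  4 |           |
|  pg2   | 10.47.226.203 |        | running |  4 |         0 |
+--------+---------------+--------+---------+----+-----------+
"""
    print(unregistered_replicas(sample, ["pg2", "pg3"]))
```

Run periodically against fresh output, a check like this could record when the node drops out of the member list, which may help correlate the event with the Patroni log on pg3.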

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 19 (1 by maintainers)

Top GitHub Comments

1 reaction
huchangqiqi commented, Nov 24, 2020

Looks like Patroni on pg3 is not running, you have to figure out why. And pg1 should periodically write into logs: Failed to drop replication slot 'pg3'

Which situation causes it to periodically write "Failed to drop replication slot" into the logs? I noticed that when _schedule_load_slots = False, if the slot is not active, it does not call drop_replication_slots; if it has called drop_replication_slots but the call failed because the slot is still active, it tries again and again.

In my environment, I have a 3-node Patroni cluster, and I also have some other PostgreSQL servers (not part of the Patroni cluster) that use logical replication with the Patroni leader node. I find that the Patroni leader node tries to drop the logical replication slots and reports "Failed to drop replication slot". The Patroni version is v1.6.0 and the PostgreSQL version is 10.
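
To make the retry behaviour described above concrete, here is a simplified model of it. This is a sketch under assumptions and is not Patroni's actual code: `Slot` and `sync_slots` are hypothetical names, and the only fact taken from PostgreSQL is that a replication slot cannot be dropped while a connection is still using it. In the model, any slot whose name does not match a known cluster member is a drop candidate on every cycle, so an active slot that is unknown to the cluster produces the same failure message again and again.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    active: bool  # True while a replica or subscriber is connected to the slot


def sync_slots(slots, member_names, own_name):
    """Simulate one cycle: keep slots for known members, try to drop the rest.

    Returns (kept_slots, failure_messages).
    """
    wanted = set(member_names) - {own_name}
    kept, failures = [], []
    for slot in slots:
        if slot.name in wanted:
            kept.append(slot)
        elif slot.active:
            # An active slot cannot be dropped, so it survives this cycle
            # and will be a drop candidate again on the next one.
            kept.append(slot)
            failures.append(f"Failed to drop replication slot '{slot.name}'")
        # Inactive slots that are not wanted are dropped (not kept).
    return kept, failures


if __name__ == "__main__":
    # pg3 is streaming (active slot) but missing from the member list.
    slots = [Slot("pg2", True), Slot("pg3", True), Slot("stale", False)]
    for cycle in range(3):
        slots, failures = sync_slots(slots, ["pg1", "pg2"], own_name="pg1")
        print(cycle, failures)
```

Under this model, both cases in the thread behave the same way: a replica that is connected but not registered as a member, and a logical replication slot used by a server outside the cluster, are each retried on every cycle for as long as the slot stays active.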

0 reactions
cobolbaby commented, Dec 10, 2020

Same question.
