Patroni Slave start failed
We have a two-node Patroni cluster and one of the slaves cannot be restarted.
Mar 31 20:51:40 compassvm1 patroni[23074]: Mock authentication nonce: ff3d3bdb50c75d680f386e3c26025ee47b038f17390a4fce37c539bd823f9049
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40,164 INFO: Lock owner: postgresql1; I am postgresql2
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40,166 INFO: starting as a secondary
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40,179 INFO: postmaster pid=23461
Mar 31 20:51:40 compassvm1 patroni[23074]: 10.225.100.141:5432 - no response
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.191 NZDT [23461] LOG: listening on IPv4 address "10.225.100.141", port 5432
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.199 NZDT [23461] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.360 NZDT [23463] LOG: database system was shut down in recovery at 2020-03-31 12:22:52 NZDT
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.360 NZDT [23463] LOG: entering standby mode
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.360 NZDT [23463] FATAL: requested timeline 49 does not contain minimum recovery point 2E54/9C316E18 on timeline 48
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.362 NZDT [23461] LOG: startup process (PID 23463) exited with exit code 1
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.362 NZDT [23461] LOG: aborting startup due to startup process failure
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.390 NZDT [23461] LOG: database system is shut down
Mar 31 20:51:41 compassvm1 patroni[23074]: 2020-03-31 20:51:41,190 ERROR: postmaster is not running
Mar 31 20:51:41 compassvm1 patroni[23074]: 2020-03-31 20:51:41,192 INFO: Lock owner: postgresql1; I am postgresql2
Mar 31 20:51:41 compassvm1 patroni[23074]: 2020-03-31 20:51:41,194 INFO: failed to start postgres
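The line that matters in the log above is the FATAL one: the replica's data already extends to 2E54/9C316E18 on timeline 48, and the timeline it was asked to follow (49) does not include that point. As a rough sketch, the three values can be pulled out of the log with sed. The function name parse_timeline_error is made up for illustration, and the journalctl unit name in the usage comment is an assumption about how Patroni is run on this host.

```shell
# Hypothetical helper: extract the requested timeline, the minimum recovery
# point (LSN) and the timeline that LSN belongs to from PostgreSQL's
# "requested timeline ... does not contain minimum recovery point" message.
# Reads log lines on stdin and prints one summary line per match.
parse_timeline_error() {
    sed -n 's/.*requested timeline \([0-9][0-9]*\) does not contain minimum recovery point \([0-9A-F/]*\) on timeline \([0-9][0-9]*\).*/requested=\1 min_recovery_point=\2 on_timeline=\3/p'
}

# Usage (assumes a systemd unit called "patroni"):
#   journalctl -u patroni | parse_timeline_error
```

If this prints a line, the replica has diverged from the history the leader is on, and a plain restart will keep failing with the same message.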
The Patroni config for this node is:
scope: postgres
namespace: /db/
name: postgresql2
restapi:
  listen: 10.225.100.141:8008
  connect_address: 10.225.100.141:8008
etcd:
  host: 10.225.100.102:2379
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        wal_keep_segments: 8
        max_wal_senders: 5
        max_replication_slots: 5
        checkpoint_timeout: 30
        max_worker_processes: 19
  initdb:
  - encoding: UTF8
  - data-checksums
  pg_hba:
  - host replication replicator 127.0.0.1/32 md5
  - host replication replicator 10.225.100.140/0 md5
  - host replication replicator 10.225.100.141/0 md5
  - host all all 0.0.0.0/0 md5
  users:
    dba:
      password: secret
      options:
      - createrole
      - createdb
    repl:
      password: secret
      options:
      - replication
postgresql:
  listen: 10.225.100.141:5432
  connect_address: 10.225.100.141:5432
  data_dir: /data/database
  config_dir: /data/database
  bin_dir: /usr/lib/postgresql/10/bin
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: ***
    superuser:
      username: postgres
      password: ***
  parameters:
    max_locks_per_transaction: 256
    max_worker_processes: 19
    shared_preload_libraries: 'timescaledb,pg_stat_statements'
    unix_socket_directories: '/var/run/postgresql/'
    shared_buffers: 8042MB
    effective_cache_size: 24126MB
    maintenance_work_mem: 2047MB
    work_mem: 10293kB
    timescaledb.max_background_workers: 8
    max_parallel_workers_per_gather: 4
    max_parallel_workers: 8
    wal_buffers: 16MB
    min_wal_size: 4GB
    max_wal_size: 8GB
    default_statistics_target: 500
    random_page_cost: 1.1
    checkpoint_completion_target: 0.9
    autovacuum_max_workers: 10
    autovacuum_naptime: 10
    effective_io_concurrency: 200
    timescaledb.last_tuned: '2019-07-23T18:56:13+12:00'
    timescaledb.last_tuned_version: '0.7.0'
    pg_stat_statements.max: 10000
    pg_stat_statements.track: all
tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
The current timeline (TL) on the master is 50. Could anyone help me get the slave up and running? This is a production cluster 😕
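To compare that with what the stopped replica has on disk, pg_controldata (shipped with PostgreSQL) prints the timeline fields from the control file. A minimal sketch, assuming the bin_dir and data_dir from the config above and that it is run as a user who can read the data directory; show_timelines is a hypothetical helper name.

```shell
# Hypothetical helper: print only the timeline-related lines of
# pg_controldata output for a (stopped) PostgreSQL data directory.
#   $1 = PostgreSQL bin_dir, $2 = data_dir
show_timelines() {
    "$1/pg_controldata" "$2" | grep -E "TimeLineID|Min(imum)? recovery ending loc"
}

# Usage (paths taken from the Patroni config in this issue):
#   show_timelines /usr/lib/postgresql/10/bin /data/database
```

On the replica this should show the checkpoint timeline and the minimum recovery ending location with its timeline, which can then be checked against the 48/49/50 numbers mentioned here.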
Issue Analytics
- State:
- Created 3 years ago
- Comments:6
Rebuild the replica:
patronictl -c /etc/patroni/patroni.yml reinit [CLUSTER_NAME] [MEMBER_NAME]
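For the cluster in this issue, the placeholders would presumably be filled from the posted config: scope (postgres) as the cluster name and name (postgresql2) as the member. A small sketch that wraps the two calls; the wrapper name reinit_replica is made up, and the config path is the one used in the command above, which may differ on your host.

```shell
# Hypothetical wrapper: show the cluster state, then rebuild one replica.
#   $1 = Patroni config file, $2 = cluster name (scope), $3 = member name
reinit_replica() {
    config="$1"; cluster="$2"; member="$3"
    # Show the current members and their state first
    patronictl -c "$config" list "$cluster"
    # Re-copy the member from the current leader; patronictl asks for
    # confirmation because the member's existing data directory is discarded
    patronictl -c "$config" reinit "$cluster" "$member"
}

# Usage with the names from the config in this issue:
#   reinit_replica /etc/patroni/patroni.yml postgres postgresql2
```

Note that reinit throws away the replica's existing data and takes a fresh copy from the leader, so make sure it is pointed at the broken member and not at the leader.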
I ran into this problem as well. In my case the cause was wrong permissions on data_dir: it needs to be 700 instead of the default 755. Checking Patroni's log should be enough to spot this; on my server it was in /var/log/messages.
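A quick way to check for that is to look at the octal mode of the data directory. A minimal sketch, assuming GNU stat (as found on most Linux systems); check_data_dir_mode is a hypothetical helper name and the path in the usage comment is the data_dir from the config posted in this issue.

```shell
# Hypothetical helper: report whether a directory has mode 700.
# Returns 0 if the mode is 700, 1 if it is something else,
# and 2 if stat itself fails (e.g. the path does not exist).
check_data_dir_mode() {
    dir="$1"
    mode=$(stat -c '%a' "$dir") || return 2   # GNU stat: octal mode, e.g. 755
    if [ "$mode" = "700" ]; then
        echo "ok: $dir has mode 700"
    else
        echo "bad: $dir has mode $mode, expected 700"
        return 1
    fi
}

# Usage (run as the owner of the directory or as root):
#   check_data_dir_mode /data/database || chmod 700 /data/database
```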