Patroni Slave start failed

We have a two-node Patroni cluster, and the slave cannot be restarted. Patroni logs the following:

Mar 31 20:51:40 compassvm1 patroni[23074]:   Mock authentication nonce: ff3d3bdb50c75d680f386e3c26025ee47b038f17390a4fce37c539bd823f9049
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40,164 INFO: Lock owner: postgresql1; I am postgresql2
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40,166 INFO: starting as a secondary
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40,179 INFO: postmaster pid=23461
Mar 31 20:51:40 compassvm1 patroni[23074]: 10.225.100.141:5432 - no response
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.191 NZDT [23461] LOG:  listening on IPv4 address "10.225.100.141", port 5432
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.199 NZDT [23461] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.360 NZDT [23463] LOG:  database system was shut down in recovery at 2020-03-31 12:22:52 NZDT
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.360 NZDT [23463] LOG:  entering standby mode
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.360 NZDT [23463] FATAL:  requested timeline 49 does not contain minimum recovery point 2E54/9C316E18 on timeline 48
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.362 NZDT [23461] LOG:  startup process (PID 23463) exited with exit code 1
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.362 NZDT [23461] LOG:  aborting startup due to startup process failure
Mar 31 20:51:40 compassvm1 patroni[23074]: 2020-03-31 20:51:40.390 NZDT [23461] LOG:  database system is shut down
Mar 31 20:51:41 compassvm1 patroni[23074]: 2020-03-31 20:51:41,190 ERROR: postmaster is not running
Mar 31 20:51:41 compassvm1 patroni[23074]: 2020-03-31 20:51:41,192 INFO: Lock owner: postgresql1; I am postgresql2
Mar 31 20:51:41 compassvm1 patroni[23074]: 2020-03-31 20:51:41,194 INFO: failed to start postgres

The Patroni config is:

scope: postgres
namespace: /db/
name: postgresql2

restapi:
    listen: 10.225.100.141:8008
    connect_address: 10.225.100.141:8008

etcd:
    host: 10.225.100.102:2379

bootstrap:
    dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        maximum_lag_on_failover: 1048576
        postgresql:
            use_pg_rewind: true
            use_slots: true
            parameters:
                wal_level: replica
                hot_standby: "on"
                wal_keep_segments: 8
                max_wal_senders: 5
                max_replication_slots: 5
                checkpoint_timeout: 30
                max_worker_processes: 19

    initdb:
    - encoding: UTF8
    - data-checksums

    pg_hba:
    - host replication replicator 127.0.0.1/32 md5
    - host replication replicator 10.225.100.140/0 md5
    - host replication replicator 10.225.100.141/0 md5
    - host all all 0.0.0.0/0 md5

    users:
    dba:
      password: secret
      options:
        - createrole
        - createdb
    repl:
      password: secret
      options:
        - replication

postgresql:
    listen: 10.225.100.141:5432
    connect_address: 10.225.100.141:5432
    data_dir: /data/database
    config_dir: /data/database
    bin_dir: /usr/lib/postgresql/10/bin
    pgpass: /tmp/pgpass
    authentication:
        replication:
            username: replicator
            password: ***
        superuser:
            username: postgres
            password: ***
    parameters:
        max_locks_per_transaction: 256
        max_worker_processes: 19
        shared_preload_libraries: 'timescaledb,pg_stat_statements'
        unix_socket_directories: '/var/run/postgresql/'
        shared_buffers: 8042MB
        effective_cache_size: 24126MB
        maintenance_work_mem: 2047MB
        work_mem: 10293kB
        timescaledb.max_background_workers: 8
        max_parallel_workers_per_gather: 4
        max_parallel_workers: 8
        wal_buffers: 16MB
        min_wal_size: 4GB
        max_wal_size: 8GB
        default_statistics_target: 500
        random_page_cost: 1.1
        checkpoint_completion_target: 0.9
        autovacuum_max_workers: 10
        autovacuum_naptime: 10
        effective_io_concurrency: 200
        timescaledb.last_tuned: '2019-07-23T18:56:13+12:00'
        timescaledb.last_tuned_version: '0.7.0'
        pg_stat_statements.max: 10000
        pg_stat_statements.track: all
tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

The current timeline (TL) on the master is 50. Could anyone help me get the slave up and running? This is a production cluster 😕
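
The FATAL line in the log above is the key symptom: the replica's recorded minimum recovery point lies on timeline 48, but that position is not part of the history of timeline 49, which the replica is asked to follow. A likely reading (not confirmed in the thread) is that the replica replayed WAL on timeline 48 past the point where the newer timeline branched off. The following is a minimal Python sketch for extracting those values from such a log line; the function names are hypothetical and chosen only for illustration.

```python
import re

# Pattern for the PostgreSQL FATAL message quoted in the log above.
FATAL_RE = re.compile(
    r"requested timeline (?P<requested>\d+) does not contain "
    r"minimum recovery point (?P<lsn>[0-9A-F]+/[0-9A-F]+) "
    r"on timeline (?P<recovery_tli>\d+)"
)


def lsn_to_int(lsn):
    """Convert an LSN such as '2E54/9C316E18' to a 64-bit WAL byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def parse_timeline_error(line):
    """Return the fields of the timeline FATAL message, or None if absent."""
    match = FATAL_RE.search(line)
    if match is None:
        return None
    return {
        "requested_timeline": int(match["requested"]),
        "min_recovery_lsn": match["lsn"],
        "min_recovery_lsn_int": lsn_to_int(match["lsn"]),
        "min_recovery_timeline": int(match["recovery_tli"]),
    }


if __name__ == "__main__":
    sample = (
        "2020-03-31 20:51:40.360 NZDT [23463] FATAL:  requested timeline 49 "
        "does not contain minimum recovery point 2E54/9C316E18 on timeline 48"
    )
    print(parse_timeline_error(sample))
```

Comparing the parsed timelines with the master's current timeline (50 here) shows how far behind the replica's history is; it does not by itself repair the replica.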

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6

Top GitHub Comments

vitabaks commented, Mar 31, 2020 (3 reactions)

Rebuild the replica:

patronictl -c /etc/patroni/patroni.yml reinit [CLUSTER_NAME] [MEMBER_NAME]
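
For the cluster in the question, the placeholders would presumably be filled from the posted config: the cluster name is the `scope` value (postgres) and the member is the `name` value (postgresql2). The sketch below only assembles that command line in Python; the helper names are hypothetical, and the config path is the one from the comment above, which may differ on the poster's hosts. Note that reinit discards the member's data directory and copies a fresh one from the leader, so it should be pointed at the broken replica only.

```python
import subprocess


def build_reinit_command(cluster, member, config_path="/etc/patroni/patroni.yml"):
    """Return the argv list for `patronictl reinit` on one cluster member."""
    return ["patronictl", "-c", config_path, "reinit", cluster, member]


def reinit_replica(cluster, member, config_path="/etc/patroni/patroni.yml"):
    """Run the reinit command and return its exit code.

    patronictl asks for confirmation on the terminal, so stdin/stdout are
    left attached rather than captured.
    """
    command = build_reinit_command(cluster, member, config_path)
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    # Values taken from the config in the question (scope and name).
    print(" ".join(build_reinit_command("postgres", "postgresql2")))
```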

vietvudanh commented, Sep 2, 2020 (0 reactions)

I encountered this problem too. In my case the cause was wrong permissions on data_dir: it needs to be 700 instead of the default 755. Checking Patroni's log should be enough to spot this; on my server it was in /var/log/messages.
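
A quick way to check for this cause is to compare the directory's permission bits against what PostgreSQL accepts. The sketch below assumes PostgreSQL 10 (as implied by the bin_dir in the question), which accepts only 0700; PostgreSQL 11 and later also accept 0750. The function name is hypothetical, and the demo uses a temporary directory rather than a real data directory.

```python
import os
import stat
import tempfile


def data_dir_mode_ok(path, allowed=(0o700,)):
    """Return True if the permission bits of `path` are in `allowed`."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode in allowed


if __name__ == "__main__":
    # Demonstration on a throwaway directory, not on a real data_dir.
    demo_dir = tempfile.mkdtemp()
    try:
        os.chmod(demo_dir, 0o755)
        print("0755 accepted:", data_dir_mode_ok(demo_dir))
        os.chmod(demo_dir, 0o700)
        print("0700 accepted:", data_dir_mode_ok(demo_dir))
    finally:
        os.rmdir(demo_dir)
```

For the config in the question, the path to check would be /data/database.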
