Thread hang causes "waiting for leader to bootstrap"
Hi,
We hit an issue during "helm install". After some analysis, I think there is a chance that a Python thread hangs and causes "waiting for leader to bootstrap". I would like to report this issue; I don't know whether you could do something to improve this part.
Patroni v1.6.3 Python 2.7
— LOG —
2020-03-31T19:07:05.066891092Z Skip service level restore action.
2020-03-31T19:07:05.093194223Z /entrypoint.sh: dir data changed for postgresql
2020-03-31T19:07:05.096706161Z /entrypoint.sh: dir /var/lib/postgresql/data/pgdata changed owner for postgresql
2020-03-31T19:07:05.117268451Z ls: cannot access '/var/lib/postgresql/data/pgdata/pg_replslot/': No such file or directory
2020-03-31T19:07:05.119851734Z /entrypoint.sh: create dir done, uid=26(postgres) gid=26(postgres) groups=26(postgres),0(root)
2020-03-31T19:07:05.685056852Z 2020-03-31 19:07:05,684 INFO: postgres connection_string is postgres://192.168.21.199:5432/postgres
2020-03-31T19:07:05.68508475Z 2020-03-31 19:07:05,684 INFO: No PostgreSQL configuration items changed, nothing to reload.
2020-03-31T19:07:05.686310355Z 2020-03-31 19:07:05,686 INFO: Selected address family is 2
2020-03-31T19:07:05.68791133Z 2020-03-31 19:07:05,687 INFO: Postgres stop: success: True, signaled: False, block_callbacks: False
2020-03-31T19:07:05.688267346Z 2020-03-31 19:07:05,687 INFO: Lock owner: None; I am testapp-db-pg-0
2020-03-31T19:07:05.688286198Z 2020-03-31 19:07:05,688 INFO: waiting for leader to bootstrap
2020-03-31T19:07:15.688363227Z 2020-03-31 19:07:15,687 INFO: Postgres stop: success: True, signaled: False, block_callbacks: False
2020-03-31T19:07:15.688460201Z 2020-03-31 19:07:15,688 INFO: Lock owner: None; I am testapp-db-pg-0
Looking at the source code, I found this in ha.py:
        else:
            ret = self._async_executor.try_run_async('bootstrap', self.state_handler.bootstrap.bootstrap,
                                                     args=(self.patroni.config['bootstrap'],))
            return ret or 'trying to bootstrap a new cluster'
async_executor.py
    def run_async(self, func, args=()):
        Thread(target=self.run, args=(func, args)).start()

    def try_run_async(self, action, func, args=()):
        prev = self.schedule(action)
        if prev is None:
            return self.run_async(func, args)
        return 'Failed to run {0}, {1} is already in progress'.format(action, prev)
Since we didn't see the "trying to bootstrap a new cluster" message in the log, I think the Python thread hit some kind of run-time problem.
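One way to check this reasoning is a minimal sketch of the pattern in the two excerpts above. The class below is a simplified stand-in written for illustration, not Patroni's actual executor: the bodies of `schedule` and `run`, and the names `AsyncExecutorSketch` and `start_bootstrap`, are assumptions. It is written for Python 3, while the report concerns Python 2.7. The sketch suggests that, because `run_async` has no return statement, the caller gets the "trying to bootstrap a new cluster" string as soon as the thread is started, whether or not the thread finishes later.

```python
import threading


class AsyncExecutorSketch:
    """Simplified stand-in for the quoted executor (not Patroni's class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._scheduled_action = None

    def schedule(self, action):
        # Assumed behaviour: record the action if nothing is scheduled,
        # otherwise return the name of the action already in progress.
        with self._lock:
            if self._scheduled_action is not None:
                return self._scheduled_action
            self._scheduled_action = action
            return None

    def run(self, func, args=()):
        # Assumed behaviour: run the job, then clear the scheduled action.
        try:
            func(*args)
        finally:
            with self._lock:
                self._scheduled_action = None

    def run_async(self, func, args=()):
        # No return statement, so the caller receives None.
        threading.Thread(target=self.run, args=(func, args), daemon=True).start()

    def try_run_async(self, action, func, args=()):
        prev = self.schedule(action)
        if prev is None:
            return self.run_async(func, args)
        return 'Failed to run {0}, {1} is already in progress'.format(action, prev)


def start_bootstrap(executor, bootstrap_func):
    # Mirrors the shape of the ha.py excerpt: None from try_run_async
    # falls through to the status string.
    ret = executor.try_run_async('bootstrap', bootstrap_func)
    return ret or 'trying to bootstrap a new cluster'
```

Calling `start_bootstrap(AsyncExecutorSketch(), job)` with a `job` that blocks returns the status string immediately, and a second call on the same executor while the job is still blocked returns the "already in progress" message instead. Under these assumptions, a hung thread alone would not suppress the first message, so its absence points at the code path not being reached at all.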
Do you have any suggestions?
BRs, Fan Liu
The only place that keeps the information about the initialized cluster is the configmap or endpoint on K8s, and the /config key for any other DCS. If you still get the "waiting for leader to bootstrap" message, that means the <cluster-name>-config configmap or endpoint is still there. Nothing else is possible.

Thanks for the info, @CyberDem0n. You are right, especially on K8s. Restarts just happen for many reasons.
BRs, Fan Liu
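
For anyone who wants to verify the point about the leftover `<cluster-name>-config` object, here is a small sketch that shells out to `kubectl get`. The helper names, the `default` namespace, and the example cluster name are hypothetical; only the `<cluster-name>-config` naming comes from the reply above.

```python
import subprocess


def config_object_name(cluster_name):
    # Naming taken from the reply above: <cluster-name>-config
    return '{0}-config'.format(cluster_name)


def kubectl_get_command(kind, cluster_name, namespace='default'):
    # kind is expected to be 'configmap' or 'endpoints'
    return ['kubectl', 'get', kind, config_object_name(cluster_name),
            '--namespace', namespace]


def config_object_exists(kind, cluster_name, namespace='default'):
    # kubectl exits non-zero when the object is not found. It also exits
    # non-zero on other errors (for example, no access to the cluster),
    # so False means "not confirmed" rather than strictly "absent".
    result = subprocess.run(kubectl_get_command(kind, cluster_name, namespace),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

For example, `config_object_exists('configmap', 'my-cluster')` runs `kubectl get configmap my-cluster-config --namespace default`. If it returns True after an uninstall, a leftover object from the previous install would be consistent with the "waiting for leader to bootstrap" message described above.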