patroni entered a FATAL state
The following is from a fresh install of patroni using the helm chart provided by incubator/patroni. All of the etcd and spilo nodes come up, but all of the spilo nodes show the following error. The environment is Canonical Distribution of Kubernetes 1.7 running on top of OpenStack.
If there is a better place to report this issue, please let me know and I will report elsewhere.
2017-08-29 00:17:19,541 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Local?)
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/util/connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/usr/lib/python3.4/socket.py", line 533, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connectionpool.py", line 356, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.4/http/client.py", line 1125, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python3.4/http/client.py", line 1163, in _send_request
self.endheaders(body)
File "/usr/lib/python3.4/http/client.py", line 1121, in endheaders
self._send_output(message_body)
File "/usr/lib/python3.4/http/client.py", line 951, in _send_output
self.send(msg)
File "/usr/lib/python3.4/http/client.py", line 886, in send
self.connect()
File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connection.py", line 166, in connect
conn = self._new_conn()
File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connection.py", line 150, in _new_conn
self, "Failed to establish a new connection: %s" % e)
requests.packages.urllib3.exceptions.NewConnectionError: <requests.packages.urllib3.connection.HTTPConnection object at 0x7fd6b0237160>: Failed to establish a new connection: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/requests/adapters.py", line 423, in send
timeout=timeout
File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connectionpool.py", line 649, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/util/retry.py", line 376, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
requests.packages.urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='instance-data', port=80): Max retries exceeded with url: /latest/meta-data/placement/availability-zone (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fd6b0237160>: Failed to establish a new connection: [Errno -2] Name or service not known',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/configure_spilo.py", line 526, in <module>
main()
File "/configure_spilo.py", line 475, in main
placeholders = get_placeholders(provider)
File "/configure_spilo.py", line 309, in get_placeholders
placeholders['instance_data'] = get_instance_metadata(provider)
File "/configure_spilo.py", line 256, in get_instance_metadata
metadata[k] = requests.get('{}/{}'.format(url, v or k), timeout=2, headers=headers).text
File "/usr/local/lib/python3.4/dist-packages/requests/api.py", line 70, in get
return request('get', url, params=params, **kwargs)
File "/usr/local/lib/python3.4/dist-packages/requests/api.py", line 56, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python3.4/dist-packages/requests/sessions.py", line 488, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.4/dist-packages/requests/sessions.py", line 609, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.4/dist-packages/requests/adapters.py", line 487, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='instance-data', port=80): Max retries exceeded with url: /latest/meta-data/placement/availability-zone (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fd6b0237160>: Failed to establish a new connection: [Errno -2] Name or service not known',))
2017-08-29 00:17:20,387 CRIT Supervisor running as root (no user in config file)
2017-08-29 00:17:20,387 WARN Included extra file "/etc/supervisor/conf.d/patroni.conf" during parsing
2017-08-29 00:17:20,387 WARN Included extra file "/etc/supervisor/conf.d/cron.conf" during parsing
2017-08-29 00:17:20,388 WARN Included extra file "/etc/supervisor/conf.d/pgq.conf" during parsing
2017-08-29 00:17:20,416 INFO RPC interface 'supervisor' initialized
2017-08-29 00:17:20,416 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2017-08-29 00:17:20,417 INFO supervisord started with pid 1
2017-08-29 00:17:21,419 INFO spawned: 'cron' with pid 25
2017-08-29 00:17:21,421 INFO spawned: 'patroni' with pid 26
2017-08-29 00:17:21,423 INFO spawned: 'pgq' with pid 27
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:17:21,773 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:17:22,774 INFO success: cron entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-08-29 00:17:22,775 INFO spawned: 'patroni' with pid 33
2017-08-29 00:17:22,776 INFO success: pgq entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:17:23,130 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:17:25,134 INFO spawned: 'patroni' with pid 35
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:17:25,468 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:17:28,473 INFO spawned: 'patroni' with pid 37
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:17:28,804 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:17:29,805 INFO gave up: patroni entered FATAL state, too many start retries too quickly
2017-08-29 00:21:32,284 INFO spawned: 'patroni' with pid 75
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:21:32,634 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:21:33,637 INFO spawned: 'patroni' with pid 77
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:21:34,010 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:21:36,014 INFO spawned: 'patroni' with pid 79
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:21:36,350 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:21:39,355 INFO spawned: 'patroni' with pid 81
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:21:39,698 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:21:40,699 INFO gave up: patroni entered FATAL state, too many start retries too quickly
2017-08-29 00:23:13,550 INFO spawned: 'patroni' with pid 92
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:23:13,883 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:23:14,886 INFO spawned: 'patroni' with pid 94
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:23:15,250 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:23:17,254 INFO spawned: 'patroni' with pid 96
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:23:17,607 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:23:20,613 INFO spawned: 'patroni' with pid 98
Usage: /usr/local/bin/patroni config.yml
Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:33:44,999 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:33:46,000 INFO gave up: patroni entered FATAL state, too many start retries too quickly
Issue Analytics
- State:
- Created 6 years ago
- Comments:7 (4 by maintainers)
Top GitHub Comments
Did some more digging. It turns out the comment
# should be only accessible on AWS
at https://github.com/zalando/spilo/blob/master/postgres-appliance/configure_spilo.py#L218 is not true.
The hostname instance-data from https://github.com/zalando/spilo/blob/master/postgres-appliance/configure_spilo.py#L244 is not resolvable, so the request at https://github.com/zalando/spilo/blob/master/postgres-appliance/configure_spilo.py#L253 fails. Replace instance-data with 169.254.169.254 and everything works just fine.

Looks good @roll4life!
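
For anyone hitting the same traceback, the workaround from the comment (use 169.254.169.254 when instance-data does not resolve) could be sketched roughly as below. This is only an illustration, not the code in configure_spilo.py: the function names pick_metadata_host and metadata_url, the constants, and the injectable resolve parameter are all hypothetical. The http scheme and the /latest/meta-data/ path are taken from the traceback above.

```python
import socket

# Hypothetical names for illustration; not taken from configure_spilo.py.
METADATA_HOSTNAME = "instance-data"
METADATA_LINK_LOCAL_IP = "169.254.169.254"


def pick_metadata_host(resolve=socket.gethostbyname):
    """Return the metadata hostname if it resolves, else the link-local IP."""
    try:
        resolve(METADATA_HOSTNAME)
        return METADATA_HOSTNAME
    except OSError:
        # socket.gaierror ("Name or service not known") is a subclass of OSError.
        return METADATA_LINK_LOCAL_IP


def metadata_url(key, resolve=socket.gethostbyname):
    """Build the metadata URL for a key such as 'placement/availability-zone'."""
    host = pick_metadata_host(resolve)
    return "http://{}/latest/meta-data/{}".format(host, key)
```

On a cluster where instance-data is not resolvable, metadata_url('placement/availability-zone') would point at the link-local address instead of raising during name resolution. Whether the metadata service actually answers at that address still depends on the environment.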