patroni entered a FATAL state

The following is from a fresh install of Patroni using the Helm chart provided by incubator/patroni. All of the etcd and Spilo nodes come up, but every Spilo node shows the following error. The environment is the Canonical Distribution of Kubernetes 1.7 running on top of OpenStack.

If there is a better place to report this issue, please let me know and I will report elsewhere.

2017-08-29 00:17:19,541 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Local?)
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connection.py", line 141, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/util/connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib/python3.4/socket.py", line 533, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connectionpool.py", line 356, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.4/http/client.py", line 1125, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.4/http/client.py", line 1163, in _send_request
    self.endheaders(body)
  File "/usr/lib/python3.4/http/client.py", line 1121, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python3.4/http/client.py", line 951, in _send_output
    self.send(msg)
  File "/usr/lib/python3.4/http/client.py", line 886, in send
    self.connect()
  File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connection.py", line 166, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connection.py", line 150, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
requests.packages.urllib3.exceptions.NewConnectionError: <requests.packages.urllib3.connection.HTTPConnection object at 0x7fd6b0237160>: Failed to establish a new connection: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/requests/adapters.py", line 423, in send
    timeout=timeout
  File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/connectionpool.py", line 649, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.4/dist-packages/requests/packages/urllib3/util/retry.py", line 376, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
requests.packages.urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='instance-data', port=80): Max retries exceeded with url: /latest/meta-data/placement/availability-zone (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fd6b0237160>: Failed to establish a new connection: [Errno -2] Name or service not known',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/configure_spilo.py", line 526, in <module>
    main()
  File "/configure_spilo.py", line 475, in main
    placeholders = get_placeholders(provider)
  File "/configure_spilo.py", line 309, in get_placeholders
    placeholders['instance_data'] = get_instance_metadata(provider)
  File "/configure_spilo.py", line 256, in get_instance_metadata
    metadata[k] = requests.get('{}/{}'.format(url, v or k), timeout=2, headers=headers).text
  File "/usr/local/lib/python3.4/dist-packages/requests/api.py", line 70, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.4/dist-packages/requests/api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.4/dist-packages/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.4/dist-packages/requests/sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.4/dist-packages/requests/adapters.py", line 487, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='instance-data', port=80): Max retries exceeded with url: /latest/meta-data/placement/availability-zone (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fd6b0237160>: Failed to establish a new connection: [Errno -2] Name or service not known',))
2017-08-29 00:17:20,387 CRIT Supervisor running as root (no user in config file)
2017-08-29 00:17:20,387 WARN Included extra file "/etc/supervisor/conf.d/patroni.conf" during parsing
2017-08-29 00:17:20,387 WARN Included extra file "/etc/supervisor/conf.d/cron.conf" during parsing
2017-08-29 00:17:20,388 WARN Included extra file "/etc/supervisor/conf.d/pgq.conf" during parsing
2017-08-29 00:17:20,416 INFO RPC interface 'supervisor' initialized
2017-08-29 00:17:20,416 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2017-08-29 00:17:20,417 INFO supervisord started with pid 1
2017-08-29 00:17:21,419 INFO spawned: 'cron' with pid 25
2017-08-29 00:17:21,421 INFO spawned: 'patroni' with pid 26
2017-08-29 00:17:21,423 INFO spawned: 'pgq' with pid 27
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:17:21,773 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:17:22,774 INFO success: cron entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-08-29 00:17:22,775 INFO spawned: 'patroni' with pid 33
2017-08-29 00:17:22,776 INFO success: pgq entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:17:23,130 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:17:25,134 INFO spawned: 'patroni' with pid 35
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:17:25,468 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:17:28,473 INFO spawned: 'patroni' with pid 37
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:17:28,804 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:17:29,805 INFO gave up: patroni entered FATAL state, too many start retries too quickly
2017-08-29 00:21:32,284 INFO spawned: 'patroni' with pid 75
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:21:32,634 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:21:33,637 INFO spawned: 'patroni' with pid 77
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:21:34,010 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:21:36,014 INFO spawned: 'patroni' with pid 79
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:21:36,350 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:21:39,355 INFO spawned: 'patroni' with pid 81
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:21:39,698 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:21:40,699 INFO gave up: patroni entered FATAL state, too many start retries too quickly
2017-08-29 00:23:13,550 INFO spawned: 'patroni' with pid 92
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:23:13,883 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:23:14,886 INFO spawned: 'patroni' with pid 94
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:23:15,250 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:23:17,254 INFO spawned: 'patroni' with pid 96
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:23:17,607 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:23:20,613 INFO spawned: 'patroni' with pid 98
Usage: /usr/local/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
2017-08-29 00:33:44,999 INFO exited: patroni (exit status 1; not expected)
2017-08-29 00:33:46,000 INFO gave up: patroni entered FATAL state, too many start retries too quickly

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
roll4life commented, Aug 29, 2017

Did some more digging. It turns out that the comment "# should be only accessible on AWS" at https://github.com/zalando/spilo/blob/master/postgres-appliance/configure_spilo.py#L218 is not true. The hostname instance-data from https://github.com/zalando/spilo/blob/master/postgres-appliance/configure_spilo.py#L244 is not resolvable here, so the request at https://github.com/zalando/spilo/blob/master/postgres-appliance/configure_spilo.py#L253 fails. Replace instance-data with 169.254.169.254 and everything works just fine.
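
For anyone hitting the same traceback, here is a minimal sketch of that workaround. It is not the actual configure_spilo.py code: the names METADATA_HOSTS, pick_metadata_host and metadata_url are hypothetical, and the candidate order is an assumption. It tries the instance-data hostname first and falls back to the link-local address when the name does not resolve.

```python
import socket

# Hypothetical candidate list: the hostname from the traceback first, then the
# link-local address reported as working in this thread.
METADATA_HOSTS = ("instance-data", "169.254.169.254")


def pick_metadata_host(candidates=METADATA_HOSTS, resolve=socket.gethostbyname):
    """Return the first candidate that resolves, or None if none of them do."""
    for host in candidates:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
        return host
    return None


def metadata_url(host, key="placement/availability-zone"):
    """Build a URL of the shape seen in the traceback for a metadata key."""
    return "http://{}/latest/meta-data/{}".format(host, key)
```

A caller would check the result for None before making any request, so that an unresolvable metadata host does not abort the whole configuration step the way it does in the log above.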

0 reactions
gdmello commented, Sep 7, 2017

Looks good @roll4life!
