failed to start postgres (reaped unknown pid : postmaster is not running )

Hey guys, during server upgrades, when Patroni instances are restarted, this happens on secondary nodes:

2017-12-06 00:40:29,638 INFO: Lock owner: postgres-patroni-2; I am postgres-patroni-0
2017-12-06 00:40:29,665 INFO: starting as a secondary
2017-12-06 00:40:29,705 INFO: postmaster pid=428
2017-12-06 00:40:29 UTC [428]: [1-1] 5a273c7d.1ac 0     LOG:  redirecting log output to logging collector process
2017-12-06 00:40:29 UTC [428]: [2-1] 5a273c7d.1ac 0     HINT:  Future log output will appear in directory "../pg_log".
2017-12-06 00:40:29,924 INFO reaped unknown pid 428
2017-12-06 00:40:29,924 INFO reaped unknown pid 429
2017-12-06 00:40:30,706 ERROR: postmaster is not running
2017-12-06 00:40:30,711 INFO: Lock owner: postgres-patroni-2; I am postgres-patroni-0
2017-12-06 00:40:30,725 INFO: failed to start postgres

It starts fine if I delete all postgres data on these nodes. However, I want replication to kick in and resume from where it left off before the server upgrade.

Any ideas why this happens and what kills the postmaster process? Supervisord? Why?

Issue Analytics

Created 6 years ago
Comments: 14

Top GitHub Comments

CyberDem0n commented, Sep 27, 2019

Hi @bappr,

That’s quite an old image; it was built more than a year ago.

You can try to figure out why postgres is failing by exec-ing into the pod and looking at the logs, which are located in /home/postgres/pgdata/pgroot/pg_log. Since it is Friday now, the current log file is postgres-5.csv.
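A minimal sketch for locating the current log file from inside the pod. It assumes the log file name is built from the ISO weekday number (Monday = 1), which is what the Friday / postgres-5.csv example above suggests, and that GNU date is available. The helper name pg_log_for_date is hypothetical.

```shell
# Assumption: one log file per ISO weekday (1 = Monday ... 7 = Sunday),
# consistent with "Friday -> postgres-5.csv" in the comment above.
PG_LOG_DIR=/home/postgres/pgdata/pgroot/pg_log

# Hypothetical helper: print the log file path for a given date (default: today).
# Relies on the -d option of GNU date.
pg_log_for_date() {
    day=$(date -d "${1:-today}" +%u) || return 1
    printf '%s/postgres-%s.csv\n' "$PG_LOG_DIR" "$day"
}

# Show the last lines of today's log if the file is readable.
log_file=$(pg_log_for_date)
if [ -r "$log_file" ]; then
    tail -n 50 "$log_file"
fi
```

Passing a date, for example pg_log_for_date 2017-12-06, gives the file that would hold the entries for that weekday, provided it has not been overwritten by a later rotation.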

Since your master is alive you can rebuild replicas with the help of patronictl:

$ kubectl exec -it cluster-name-X -- bash
root@cluster-name-X:/home/postgres# su postgres
postgres@cluster-name-X:~$ patronictl list # check cluster status
postgres@cluster-name-X:~$ patronictl reinit cluster-name cluster-name-X

The last command will wipe PGDATA and take a fresh pg_basebackup from the master.

CyberDem0n commented, Dec 9, 2019

> Any reason why the pg_control got fucked?

You’ll have to figure it out from the logs.
