Expose synchronous replication status through API

Hi,

I started a cluster for testing, made of 3 nodes and I think that the current setup does not allow me to scale reads and backups.

My desired scenario:

Master - writes and reads
Synchronous slave - reads only for offloading of master
Potential synchronous slave - backups only

The current haproxy config and Patroni API allow for this:

backend be_db_rw
    option httpchk GET /master
    http-check expect status 200
    ...

backend be_db_ro
    option httpchk GET /
    http-check expect string replica
    ...

But this results in one master and 2 read-only servers.

I was thinking of adding another key-value pair to the API response containing the result of:

postgres=# select client_addr, sync_state from pg_stat_replication;
 client_addr | sync_state
-------------+------------
 10.0.0.1    | sync
 10.0.0.2    | potential

This is visible only from the current master, so I can't simply add it to each node's API response. So I was thinking of storing the key client_addr and value sync_state in etcd and then exposing them through the API. The IP would have to be matched so that the value is displayed on the correct server. This way the haproxy setup for sync and potential slaves would be straightforward.
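The matching step described above can be sketched roughly as follows. This is only an illustration of the proposal, not how Patroni works: the function names, the shape of the response dictionary, and the "unknown" fallback are all assumptions, and a plain dict stands in for the values that would be stored in etcd.

```python
from typing import Dict, List, Tuple


def sync_states_by_addr(rows: List[Tuple[str, str]]) -> Dict[str, str]:
    """Turn (client_addr, sync_state) rows from pg_stat_replication
    into a lookup table keyed by client address."""
    return {client_addr: sync_state for client_addr, sync_state in rows}


def api_response_for(member_addr: str, states: Dict[str, str]) -> Dict[str, str]:
    """Build a hypothetical per-node API response.

    A node whose address is missing from the table is reported as
    "unknown" (an assumption made for this sketch)."""
    return {
        "role": "replica",
        "sync_state": states.get(member_addr, "unknown"),
    }


# Rows as the master would see them in pg_stat_replication.
rows = [("10.0.0.1", "sync"), ("10.0.0.2", "potential")]
states = sync_states_by_addr(rows)
first_response = api_response_for("10.0.0.1", states)
second_response = api_response_for("10.0.0.2", states)
```

With something like this in place, a haproxy check could match on the sync_state value in the same way the existing check matches on the string replica.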

Maybe I am missing a feature of patroni that allows one to find out which is the sync and potential. Or maybe there is a better way of doing this. Ideas?

Issue Analytics

State:
Created 7 years ago
Comments:10 (5 by maintainers)

Top GitHub Comments

alexeyklyukin commented, Aug 1, 2016

Yes, I think the idea about filtering out the GET points based on whether the replica is synchronous or not should be implemented.

CyberDem0n commented, May 5, 2016

Hi. You are right: currently there is no way to tell which replica is synchronous and which is potential. I even know a better way to identify replicas (without relying on client_addr): we just need to set application_name in the primary_conninfo parameter. But will it really help you? First of all, you are getting information about replica status asynchronously. By the time you receive it, the in-sync replica may already have changed. Postgres switches in-sync replicas easily and quickly. For example, I was running a test cluster with 2 replicas, postgresql1 and postgresql2. postgresql1 was the in-sync replica. I stopped it and a few moments later started it up.

2016-05-05 21:44:36,613 INFO: no action.  i am the leader with the lock
LOG:  standby "postgresql2" is now the synchronous standby with priority 1
LOG:  standby "postgresql1" is now the synchronous standby with priority 1
2016-05-05 21:44:46,605 INFO: Lock owner: postgresql0; I am postgresql0

As expected, the master almost immediately switched the in-sync replica to postgresql2. But after postgresql1 became available again, it switched back to it.

Well, let's assume that the in-sync replica does not change very often and you are reading from the “right” one. But what does the Postgres documentation say about synchronous replication? http://www.postgresql.org/docs/9.5/static/warm-standby.html#SYNCHRONOUS-REPLICATION
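To make the flip visible, here is a minimal sketch that extracts the sequence of synchronous standbys from log lines like the ones above. The regular expression is based only on the two LOG lines quoted in this comment, and the function name is made up for illustration.

```python
import re
from typing import List

# Pattern taken from the LOG lines quoted above.
SYNC_STANDBY_RE = re.compile(
    r'standby "([^"]+)" is now the synchronous standby with priority (\d+)'
)


def sync_standby_history(log_lines: List[str]) -> List[str]:
    """Return the standby names in the order they became synchronous."""
    history = []
    for line in log_lines:
        match = SYNC_STANDBY_RE.search(line)
        if match:
            history.append(match.group(1))
    return history


log_lines = [
    "2016-05-05 21:44:36,613 INFO: no action.  i am the leader with the lock",
    'LOG:  standby "postgresql2" is now the synchronous standby with priority 1',
    'LOG:  standby "postgresql1" is now the synchronous standby with priority 1',
    "2016-05-05 21:44:46,605 INFO: Lock owner: postgresql0; I am postgresql0",
]
history = sync_standby_history(log_lines)
```

Both changes fall between two consecutive Patroni log entries that are about 10 seconds apart, so anything that caches the sync state for that long could miss the switch entirely.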

When requesting synchronous replication, each commit of a write transaction will wait
until confirmation is received that the commit has been written to the transaction log on
disk of both the primary and standby server.

Synchronous replication serves a different purpose: it increases protection against data loss, but gives no guarantee that you will read exactly the same data as from the master.

Unfortunately, there is no easy way to scale a read workload. Your application should be aware of the fact that it could get some old data from a replica (even from the in-sync replica).

Anyway, I like your idea of exposing such information. It could be very useful, for example, for monitoring.

P.S., there is a better way to identify replicas:

backend be_db_ro
-    option httpchk GET /
-    http-check expect string replica
+    option httpchk GET /replica
+    http-check expect status 200
    ...

We could even think about excluding some replicas from load balancing if they are more than N bytes behind the master (N could be passed as a request parameter). But one should keep in mind that the master updates its position in etcd only periodically (once per 10 seconds by default).
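A rough sketch of such a lag check is shown below. It assumes positions are given in the usual Postgres LSN text form (two hexadecimal numbers separated by a slash, high 32 bits first); the function names and the max_lag_bytes parameter are hypothetical and only illustrate the idea of "N bytes behind".

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN such as "0/3000060" into an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def within_lag(master_lsn: str, replica_lsn: str, max_lag_bytes: int) -> bool:
    """Return True if the replica is at most max_lag_bytes behind the master.

    The master position is assumed to come from etcd, so it may itself be
    up to one update interval old."""
    lag = lsn_to_bytes(master_lsn) - lsn_to_bytes(replica_lsn)
    return lag <= max_lag_bytes


# The replica is 0x60 (96) bytes behind the master in this example.
keep_in_pool = within_lag("0/3000060", "0/3000000", max_lag_bytes=100)
drop_from_pool = not within_lag("0/3000060", "0/3000000", max_lag_bytes=50)
```

Because the stored master position can be stale, the computed lag is only a lower bound on the real one, so a check like this is best treated as a coarse filter.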
