
Add support for PgBackrest


Hi @CyberDem0n and team,

In August on the Slack channel, we talked about support for PgBackrest in Spilo. Now my colleague @fdalfons and I are working on it. Let me first specify our requirements. We need to run PgBackrest on the Spilo image in order to collect WAL files and send them to our COS on IBM Cloud (an S3-based object storage) every 10 minutes. Then we have a standalone PgBackrest container that runs on a separate worker (different from the Patroni nodes, because we don't want to run the backup process on the node where the Patroni leader lives, to avoid overloading it); it connects to the Patroni nodes, creates a full backup every 6 hours and an incremental backup every 30 minutes, and sends them to COS. We also need the ability to bootstrap a Patroni cluster from a backup using PgBackrest on the Spilo nodes.

For now we want to discuss here only the installation of PgBackrest on the Spilo image and the writing of its /etc/pgbackrest.conf file, so that it can collect the WAL files and send them to COS.
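For context on the WAL collection part: PostgreSQL ships WAL through archive_command, and with pgBackRest that means calling pgbackrest archive-push. Below is a minimal sketch of how the command could be derived from the Spilo placeholders; it assumes the stanza is simply named after SCOPE, so it is an illustration, not the final wiring:

def build_archive_command(placeholders):
    # pgBackRest's archive-push command ships a single WAL segment; PostgreSQL
    # replaces %p with the path of the segment to archive. The stanza name is
    # assumed here to be the Spilo SCOPE.
    return 'pgbackrest --stanza={0} archive-push %p'.format(placeholders['SCOPE'])

# e.g. build_archive_command({'SCOPE': 'mycluster'})
# -> 'pgbackrest --stanza=mycluster archive-push %p'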

The installation of PgBackrest on Spilo is easy:

RUN export DEBIAN_FRONTEND=noninteractive \
    && apt-get update \
    && apt-get install -y pgbackrest

just below this line of code: https://github.com/zalando/spilo/blob/3612e7f54284c8394bb7a4fd1097a0ef63e7c154/postgres-appliance/Dockerfile#L412

In order to generate /etc/pgbackrest/pgbackrest.conf, we need a set of environment variables like these:

ENV PGBACKREST_USE=false
ENV PGBACKREST_S3_ENDPOINT=
ENV PGBACKREST_S3_BUCKET=
ENV PGBACKREST_S3_KEY=
ENV PGBACKREST_S3_KEY_SECRET=
ENV PGBACKREST_S3_PATH=
ENV PGBACKREST_S3_REGION=
ENV PGBACKREST_CIPHER_TYPE=
ENV PGBACKREST_CIPHER_PASSWORD=
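If configure_spilo.py ends up consuming these variables as well, one option is to register defaults for them as placeholders; a sketch only, assuming the usual placeholders dict (the helper name and the setdefault pattern are mine):

def set_pgbackrest_defaults(placeholders):
    # Sketch: make sure every proposed pgBackRest variable has a default, so
    # the rest of the configuration code can rely on it being present.
    placeholders.setdefault('PGBACKREST_USE', False)
    placeholders.setdefault('PGBACKREST_S3_ENDPOINT', '')
    placeholders.setdefault('PGBACKREST_S3_BUCKET', '')
    placeholders.setdefault('PGBACKREST_S3_KEY', '')
    placeholders.setdefault('PGBACKREST_S3_KEY_SECRET', '')
    placeholders.setdefault('PGBACKREST_S3_PATH', '')
    placeholders.setdefault('PGBACKREST_S3_REGION', '')
    placeholders.setdefault('PGBACKREST_CIPHER_TYPE', '')
    placeholders.setdefault('PGBACKREST_CIPHER_PASSWORD', '')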

The final file should be something like this:

[$SCOPE]
pg1-path=$PGDATA
pg1-socket-path=/tmp
pg1-port=$PGPORT
pg2-path=$PGDATA
pg2-socket-path=/tmp
pg2-port=$PGPORT
pg3-path=$PGDATA
pg3-socket-path=/tmp
pg3-port=$PGPORT

[global]
log-level-file=detail
process-max=4
repo1-cipher-pass=$PGBACKREST_CIPHER_PASSWORD
repo1-cipher-type=$PGBACKREST_CIPHER_TYPE
repo1-retention-diff=2
repo1-retention-full=2
repo1-path=$PGBACKREST_S3_PATH
repo1-s3-bucket=$PGBACKREST_S3_BUCKET
repo1-s3-endpoint=$PGBACKREST_S3_ENDPOINT
repo1-s3-key=$PGBACKREST_S3_KEY
repo1-s3-key-secret=$PGBACKREST_S3_KEY_SECRET
repo1-s3-region=$PGBACKREST_S3_REGION
repo1-type=s3

[global:archive-push]
compress-level=3

At the moment this file comes from our code and probably needs some rework, but it is just a baseline for our discussion. As you can see, the file needs some information that is already available in Spilo, for example:

  • PGDATA
  • PGPORT
  • SCOPE
  • PGUSER_ADMIN

Now it's clear to me that, in order to write this file, I need to modify configure_spilo.py; I used the WAL-E handling for inspiration. I think I need to add something like this:

elif section == 'pgbackrest':
    if placeholders['PGBACKREST_USE']:
        write_pgbackrest_environment(placeholders, '', args['force'])

Now, write_pgbackrest_environment should be something like write_wale_environment, but it's not clear to me exactly what that function does, nor whether I need to support all the clouds (AWS, Google, etc.) that are out of scope for my activity.
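To make the discussion concrete, here is a minimal sketch of what I imagine write_pgbackrest_environment could do, i.e. render /etc/pgbackrest/pgbackrest.conf from the placeholders above. The prefix argument is kept only for parity with write_wale_environment; the option subset and the file handling are assumptions, not a final design:

import os

def write_pgbackrest_environment(placeholders, prefix, overwrite,
                                 config_path='/etc/pgbackrest/pgbackrest.conf'):
    # Sketch only: write the pgBackRest configuration from Spilo placeholders.
    # 'prefix' is unused here and kept only to mirror write_wale_environment.
    if not overwrite and os.path.exists(config_path):
        return

    config = (
        '[{SCOPE}]\n'
        'pg1-path={PGDATA}\n'
        'pg1-socket-path=/tmp\n'
        'pg1-port={PGPORT}\n'
        '\n'
        '[global]\n'
        'repo1-type=s3\n'
        'repo1-path={PGBACKREST_S3_PATH}\n'
        'repo1-s3-bucket={PGBACKREST_S3_BUCKET}\n'
        'repo1-s3-endpoint={PGBACKREST_S3_ENDPOINT}\n'
        'repo1-s3-key={PGBACKREST_S3_KEY}\n'
        'repo1-s3-key-secret={PGBACKREST_S3_KEY_SECRET}\n'
        'repo1-s3-region={PGBACKREST_S3_REGION}\n'
        'repo1-cipher-type={PGBACKREST_CIPHER_TYPE}\n'
        'repo1-cipher-pass={PGBACKREST_CIPHER_PASSWORD}\n'
    ).format(**placeholders)

    os.makedirs(os.path.dirname(config_path), exist_ok=True)
    with open(config_path, 'w') as f:
        f.write(config)
    os.chmod(config_path, 0o640)  # the cipher passphrase should not be world-readable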

I imagine I also need something like this for bootstrapping from a backup with PgBackrest: https://github.com/zalando/spilo/blob/3612e7f54284c8394bb7a4fd1097a0ef63e7c154/postgres-appliance/scripts/configure_spilo.py#L1084-L1086
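On that point, my current understanding is that Patroni's custom replica creation methods could be used, so the generated configuration would gain a block along these lines (shown as a Python dict; the key names follow Patroni's documented convention for replica creation methods, while the stanza name and the --delta flag are assumptions on my side):

# Sketch only: a pgBackRest-based replica creation method, as it could appear
# in the configuration generated by configure_spilo.py.
example_scope = 'mycluster'   # would really come from placeholders['SCOPE']

pgbackrest_bootstrap = {
    'postgresql': {
        'create_replica_methods': ['pgbackrest', 'basebackup'],
        'pgbackrest': {
            'command': 'pgbackrest --stanza={0} --delta restore'.format(example_scope),
            'keep_data': True,
            'no_params': True,
        },
    },
}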

In general, I need some direction on how to change the code to achieve my requirements, on how to make the code generic enough to be included in Spilo (provided this doesn't take too much time), and on whether there are other missing parts I didn't consider (I imagine some changes in TEMPLATE as well).

Thank you in advance for any help.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 13 (13 by maintainers)

Top GitHub Comments

1 reaction
sasadangelo commented, Oct 14, 2021

> In certain cases creating the replica from S3 could be faster (when the cluster size is huge and it doesn't generate too much WAL). But, for example, we have clusters producing about 3TB of WAL files a day. Replaying so much WAL will take at least half a day, while building the replica with pg_basebackup would take two to three hours. Hence, wale_restore.sh tries to implement some sort of logic that analyzes how much WAL has been generated since the last backup, and if it was too much (configurable) it exits with a non-zero code, allowing basebackup to kick in.

This is interesting. Thank you.
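Just to check my understanding, that decision boils down to something like the sketch below (the names and the segment-based threshold are mine; the real logic lives in wale_restore.sh):

import sys

def restore_or_fall_back(wal_segments_since_backup, threshold_segments):
    # Sketch only: if too much WAL accumulated since the last backup, exit with
    # a non-zero code so that Patroni falls back to pg_basebackup instead.
    if wal_segments_since_backup > threshold_segments:
        sys.exit(1)
    # ... otherwise fetch the base backup and let PostgreSQL replay the WAL ...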

> There is a cron daemon running in the container, which starts the backup job according to the schedule (BACKUP_SCHEDULE="0 1 * * *" by default). We just always take the backup from the primary.

Ok, thank you. My question is: isn't it dangerous to have the primary server do the backup? Usually we try to leave the backup to a sync node (where possible), or to an async one, because such a node is not expected to serve read traffic and, in general, has less workload.

0 reactions
sasadangelo commented, Nov 17, 2021

Hi all,

I finished implementing PgBackrest in Spilo. It works quite well for our purpose. If you want to take a look, you can check out this commit: https://github.com/sasadangelo/spilo/commit/666005248cd1eb9a69c2d4bb0e789c1b408bea7e

However, for the moment I will not open a PR, because I would like to explore the possibility of having PgBackrest as a separate container running in the same Spilo Pod. In this way there is no need to change the Spilo code. At the moment I already have a Docker container for PgBackrest, but it runs in a separate Pod only to schedule and orchestrate backups on a NOT MASTER node. The goal is to be able to use this container in two ways:

  • as PgBackrest-repo, the orchestrator mentioned above;
  • as PgBackrest, the container running in the Spilo Pod.

Obviously, to do that the PgBackrest container needs access to the PostgreSQL data directory via a mount point, and the archive_command then has to be injected via the Kubernetes YAML.
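As a rough illustration of that direction (shown as a Python dict only to stay consistent with the other snippets; the container name, image, and mount path are placeholders, not a tested spec):

# Sketch only: the extra container entry that would be added to the Spilo Pod,
# sharing the PostgreSQL data volume so pgBackRest can reach PGDATA directly.
pgbackrest_sidecar = {
    'name': 'pgbackrest',
    'image': 'example/pgbackrest:latest',                          # placeholder image
    'env': [{'name': 'PGBACKREST_USE', 'value': 'true'}],
    'volumeMounts': [
        {'name': 'pgdata', 'mountPath': '/home/postgres/pgdata'},  # same volume as Spilo
    ],
}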

A similar approach is used by CrunchyData to keep the backup code in a container separate from the Patroni/PostgreSQL one. I will inform you as soon as it is ready.
