Is there a better way to add a failed node back to a Postgres cluster?

Question :

  • There are 2 nodes in the cluster: node1 (primary) and node2 (replica).
  • I will shut down node1 and promote node2 to primary.
  • I’ll then add node1 back to the cluster (as a replica this time), using the following procedure on the failed node1:

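systemctl stop postgresql-12
# Discard the contents of the old data directory
rm -rf /var/lib/pgsql/12/data/*
# Take a fresh base backup from node2: -R writes the standby configuration, -C -S creates and uses the slot pgstandby1
pg_basebackup -h node2 -D /var/lib/pgsql/12/data/ -U replicator -P -v -R -X stream -C -S pgstandby1
systemctl start postgresql-12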

Is there a better way? I don’t like that I’m deleting everything from /var/lib/pgsql/12/data/, and the restore takes a long time, especially when there’s a lot of data.
What would you recommend?

Answer :

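That’s exactly what pg_rewind was written for. It rewinds node1’s data directory to the point where the two timelines diverged, discarding any changes made on node1 that node2 never received, and copies only the changed blocks from node2. In this case you can think of it as a fast version of pg_basebackup.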

There is no guarantee that pg_rewind will succeed: it needs all the WAL generated since the last checkpoint that node1 and node2 have in common. If some of that WAL is missing, you have to fall back to pg_basebackup.
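
As a rough sketch (not a tested procedure), the pg_rewind route for the PostgreSQL 12 setup in the question might look like the functions below. The binary path, the superuser connection string for node2, and the DRY_RUN switch are assumptions for illustration and do not come from the original post. pg_rewind also requires that node1 was initialized with data checksums or was running with wal_log_hints = on before it failed, and that its server was shut down cleanly.

```shell
# Sketch for PostgreSQL 12: resync the old primary (node1) from node2 with pg_rewind.
# Run as the postgres OS user on node1 while the local server is stopped.
# Assumed values (not from the original post): the binary path and the
# superuser connection string used to reach node2.
PGDATA="${PGDATA:-/var/lib/pgsql/12/data}"
PGBIN="${PGBIN:-/usr/pgsql-12/bin}"
SOURCE_CONNINFO="${SOURCE_CONNINFO:-host=node2 user=postgres dbname=postgres}"
PRIMARY_CONNINFO="${PRIMARY_CONNINFO:-host=node2 user=replicator}"

# Print a command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# pg_rewind in version 12 does not set up the standby configuration itself,
# so create standby.signal and append primary_conninfo by hand.
write_standby_config() {
    touch "$PGDATA/standby.signal" &&
    printf "primary_conninfo = '%s'\n" "$PRIMARY_CONNINFO" >> "$PGDATA/postgresql.auto.conf"
}

rejoin_with_rewind() {
    # Copy only the blocks that changed after the timelines diverged.
    run "$PGBIN/pg_rewind" \
        --target-pgdata="$PGDATA" \
        --source-server="$SOURCE_CONNINFO" \
        --progress || return 1
    run write_standby_config
}
```

A cautious way to use it would be to stop the service (systemctl stop postgresql-12), run DRY_RUN=1 rejoin_with_rewind to review the commands, run rejoin_with_rewind for real, and then start the service again. If pg_rewind reports that required WAL is missing, the pg_basebackup procedure from the question remains the fallback. A replication slot like pgstandby1 would have to be created on node2 separately and referenced through primary_slot_name.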

You can make sure that old WAL is kept around for a while by setting wal_keep_size (wal_keep_segments in PostgreSQL 12 and older) appropriately.
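
For PostgreSQL 12, where the setting is counted in segments rather than megabytes, a small helper can translate a retention target into wal_keep_segments. This is only a sketch: it assumes the default 16 MB WAL segment size, the function names are made up for illustration, and the 1024 MB figure in the usage note is arbitrary rather than a recommendation.

```shell
# Sketch: convert a WAL retention target in MB into wal_keep_segments
# (PostgreSQL 12 and older), assuming the default 16 MB segment size.
WAL_SEGMENT_MB=16

wal_keep_segments_for() {
    # Round up so the retained WAL is never below the requested size.
    echo $(( ($1 + WAL_SEGMENT_MB - 1) / WAL_SEGMENT_MB ))
}

# Emit the SQL statement that applies the setting.
keep_sql() {
    printf 'ALTER SYSTEM SET wal_keep_segments = %s;\n' "$(wal_keep_segments_for "$1")"
}
```

For example, keep_sql 1024 prints ALTER SYSTEM SET wal_keep_segments = 64; which could be piped to psql as a superuser and followed by SELECT pg_reload_conf(); to take effect without a restart.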
