Cluster operations

Show replication status

On the primary node run to show some replication statistics

SELECT pid, client_addr, state, sync_state, sent_lsn, write_lsn, flush_lsn, replay_lsn, write_lag, flush_lag, replay_lag FROM pg_stat_replication;

The most interesting metric here is the replay_lag showing the time elapsed between flushing recent WAL locally and receiving notification that the standby node has written, flushed and applied it.

Cluster node failover

The following steps will guide you through the process of promoting a standby node to primary

Verify primary and standby nodes

If the failover is planned and the primary is still alive, start by verifying the status of all nodes affected by the failover to avoid any mistakes

SELECT * FROM pg_is_in_recovery();

returns t on a standby node, because it is “recovering” and following a primary node by streaming its WAL. The primary node should return f here.

Shutdown primary node

Again if the old primary is sill alive, stop the PostgreSQL instance with

systemctl stop <environment_name>-<instance_name>

and wait until the shutdown finishes.

Remove recovery configuration

Standby nodes are bootstrapped from a pgBackrest backup, and then attached to the primary node by providing a primary_conninfo configuration pointing to the primary node. This configuration is then written to /storage/data/<environment_name>/<instance_name>/postgresql.auto.conf which gets autoloaded by PostgreSQL on startup. As long as that config is present, the node will assume its a standby node, and try to connect to the primary.

To transition a standby node to primary, remove the recovery config on the standby node and restart the database instance

rm /storage/data/<environment_name>/<instance_name>/postgresql.auto.conf
systemctl restart <environment_name>-<instance_name>

Promote standby to primary

To finish the transition now on the standby promote the database instance to primary

SELECT pg_promote();

and verify that the promotion was successful and the node now is primary with

SELECT * FROM pg_is_in_recovery();

which should return f now.

Trigger a backup

When the failover process is done, PostgreSQL internally creates a new timeline to be able to track changes from that point onward. This ensures that previous Write-Ahead Log (WAL) data isn’t overwritten. Because the latest backup was created on the old node with the previous timeline, we need to trigger a new backup, so the backup repository knows the latest timeline. See also the role documentation for an in depth explanation

On the new primary trigger a backup

<environment_name>-<instance_name>-backup-incr.sh

If you are in doubt which node has the latest timeline, you can query the current using

SELECT timeline_id FROM pg_control_checkpoint();

Restore new standby node

To create a new standby node that follows the new primary execute the ansible role again, with the new primary provided as primary_node.

If you plan to re-use the old primary node, make sure to move the old data aside first to trigger the restore

mv /storage/data/<environment_name>/<instance_name> /storage/data/<environment_name>/<instance_name>.old