File 0001-galera-Fix-automatic-recovery-when-a-cluster-was-not.patch of Package resource-agents.21369
From 73551ac029c7480cf710c392ad2d500ef2a7b07a Mon Sep 17 00:00:00 2001
From: Petr Pavlu <petr.pavlu@suse.com>
Date: Wed, 26 Aug 2020 13:59:16 +0200
Subject: [PATCH 1/1] galera: Fix automatic recovery when a cluster was not
gracefully stopped
When selecting a bootstrap node, the Galera resource agent primarily
depends on the safe_to_bootstrap flag in grastate.dat. If none of the
nodes have this flag set to 1, the functions detect_last_commit() and
detect_first_master() provide recovery logic that selects the bootstrap
node based on each node's last commit, as obtained from grastate.dat or
'mysqld_safe --wsrep-recover'.
Fix 65f35e9172407e64ded90f29ea8fc0dfca9643e3 introduced a problem for
this recovery logic. If a whole cluster is not gracefully stopped,
grastate.dat on every node contains "safe_to_bootstrap: 0" and
"seqno: -1". Function detect_safe_to_bootstrap() then considers each
node with this seqno as not suitable for bootstrapping and clears the
safe_to_bootstrap attribute. Nonetheless, functions detect_last_commit()
+ detect_first_master() successfully find a bootstrap node, relying on
the recovery logic. However, when the promote operation is invoked,
function galera_promote() runs the check
'"$(get_safe_to_bootstrap)" = "0"', which fails because the attribute
was cleared, and this prevents the code from writing
"safe_to_bootstrap: 1" into grastate.dat of the selected node to mark
it as a bootstrap node.
The end result is that Galera refuses to be started on this node and
therefore the whole cluster remains down.
The patch fixes the problem by adjusting detect_safe_to_bootstrap() to
accept the combination of "safe_to_bootstrap: 0" and "seqno: -1" and
allow a node with this state to potentially become a bootstrap node.
---
heartbeat/galera | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
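
Note for reviewers: the changed decision can be sketched as a standalone
script. This is an illustration only, not code from heartbeat/galera. The
helper names read_grastate_field and decide_safe_to_bootstrap are
hypothetical, the sample UUID is made up, and the final "else" branch is
an assumption because the hunk does not show what follows the last "if".
The sketch prints the action instead of calling set_safe_to_bootstrap or
clear_safe_to_bootstrap.

```shell
#!/bin/sh
# Illustrative sketch; function names are hypothetical and not part of
# heartbeat/galera.

# Extract one "key: value" field from a grastate.dat-style file.
# Uses [[:space:]] rather than GNU sed's \s so it also works with non-GNU sed.
read_grastate_field() {
    sed -n "s/^$1:[[:space:]]*\(.*\)\$/\1/p" < "$2"
}

# Print the action the patched check would take: "clear" or "set:<flag>".
decide_safe_to_bootstrap() {
    d_uuid=$1
    d_seqno=$2
    d_flag=$3

    # A missing or all-zero UUID always clears the attribute.
    if [ -z "$d_uuid" ] || \
       [ "$d_uuid" = "00000000-0000-0000-0000-000000000000" ]; then
        echo "clear"
        return
    fi

    # A missing or -1 seqno is only disqualifying when the flag claims 1.
    if [ "$d_flag" = "1" ]; then
        if [ -z "$d_seqno" ] || [ "$d_seqno" = "-1" ]; then
            echo "clear"
            return
        fi
    fi

    if [ "$d_flag" = "1" ] || [ "$d_flag" = "0" ]; then
        echo "set:$d_flag"
    else
        # Assumption: this branch is not visible in the hunk.
        echo "clear"
    fi
}

# Sample state after an unclean shutdown of the whole cluster
# (made-up UUID, "seqno: -1", "safe_to_bootstrap: 0").
state_file=$(mktemp)
cat > "$state_file" <<'EOF'
# GALERA saved state
version: 2.1
uuid:    6b1f2c5e-0000-4000-8000-123456789abc
seqno:   -1
safe_to_bootstrap: 0
EOF

uuid=$(read_grastate_field uuid "$state_file")
seqno=$(read_grastate_field seqno "$state_file")
flag=$(read_grastate_field safe_to_bootstrap "$state_file")
rm -f "$state_file"

# With the patched logic this prints "set:0"; the pre-patch condition
# would have cleared the attribute because of "seqno: -1".
decide_safe_to_bootstrap "$uuid" "$seqno" "$flag"
```

Keeping the attribute at 0 rather than clearing it is what lets the
'"$(get_safe_to_bootstrap)" = "0"' check in galera_promote() succeed for
the node chosen by the recovery logic.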
diff --git a/heartbeat/galera b/heartbeat/galera
index 4a313e24..74f11d8c 100755
--- a/heartbeat/galera
+++ b/heartbeat/galera
@@ -604,12 +604,17 @@ detect_safe_to_bootstrap()
seqno=$(sed -n 's/^seqno:\s*\(.*\)$/\1/p' < ${OCF_RESKEY_datadir}/grastate.dat)
fi
- if [ -z "$uuid" ] || [ -z "$seqno" ] || \
- [ "$uuid" = "00000000-0000-0000-0000-000000000000" ] || \
- [ "$seqno" = "-1" ]; then
+ if [ -z "$uuid" ] || \
+ [ "$uuid" = "00000000-0000-0000-0000-000000000000" ]; then
clear_safe_to_bootstrap
return
fi
+ if [ "$safe_to_bootstrap" = "1" ]; then
+ if [ -z "$seqno" ] || [ "$seqno" = "-1" ]; then
+ clear_safe_to_bootstrap
+ return
+ fi
+ fi
if [ "$safe_to_bootstrap" = "1" ] || [ "$safe_to_bootstrap" = "0" ]; then
set_safe_to_bootstrap $safe_to_bootstrap
--
2.26.2