File 0001-Fix-controller-Delay-join-finalization-if-a-transiti.patch of Package pacemaker.29834
From 39a497edcc01d0ab67c6da308cc7dd7d4bc96011 Mon Sep 17 00:00:00 2001
From: Reid Wahl <nrwahl@protonmail.com>
Date: Wed, 22 Mar 2023 02:28:19 -0700
Subject: [PATCH] Fix: controller: Delay join finalization if a transition is
 in progress

While a transition is in progress, CIB updates may be generated and
received rapidly as resource actions complete. This can cause problems
if it happens during a controller join sequence.

The last two major steps of the join sequence are:
1. The client sends XML containing its resource history, obtained from
   its local executor, to the DC in do_cl_join_finalize_respond().
2. The DC receives this client resource history in do_dc_join_ack(),
   deletes the client's node state in the CIB, and writes the received
   client resource history to the CIB as the client's new node state.

However, suppose a resource action completes after the client generates
its resource history XML. Further suppose that action is recorded in the
CIB and is received by the DC's CIB manager before the DC updates the
client's node state. In this case, the newer history item is deleted
from the DC's CIB, because the DC overwrites the client's node state
with the history XML that the client fetched earlier. The DC then no
longer knows that the action completed on the client.
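The interleaving described above can be replayed in a small model. This
is only a sketch: struct history, its helpers, and the operation names
(rsc1_start_0, rsc1_migrate_to_0) are hypothetical stand-ins invented
for illustration, not Pacemaker types or real CIB contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_OPS 8

/* Hypothetical stand-in for one node's resource history */
struct history {
    const char *ops[MAX_OPS];
    int n_ops;
};

static void history_add(struct history *h, const char *op)
{
    assert(h->n_ops < MAX_OPS);
    h->ops[h->n_ops++] = op;
}

static bool history_has(const struct history *h, const char *op)
{
    for (int i = 0; i < h->n_ops; i++) {
        if (strcmp(h->ops[i], op) == 0) {
            return true;
        }
    }
    return false;
}

/* Replays the race; returns whether the DC still knows about the
 * migrate_to result once the join has been acknowledged. */
static bool dc_knows_migrate_to_after_join(void)
{
    struct history executor = { { 0 }, 0 }; /* client's local executor */
    struct history dc_cib = { { 0 }, 0 };   /* client's node state on the DC */

    history_add(&executor, "rsc1_start_0");
    history_add(&dc_cib, "rsc1_start_0");

    /* Step 1: the client snapshots its history for the join */
    struct history snapshot = executor;

    /* The action completes and reaches the DC's CIB before the
     * DC has processed the snapshot */
    history_add(&executor, "rsc1_migrate_to_0");
    history_add(&dc_cib, "rsc1_migrate_to_0");

    /* Step 2: the DC deletes the node state and writes the snapshot */
    dc_cib = snapshot;

    return history_has(&dc_cib, "rsc1_migrate_to_0");
}
```

In this model, dc_knows_migrate_to_after_join() returns false: the
stale snapshot replaces the newer entry, which is the lost update
described above.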

This can result in an action improperly being scheduled a second time.
Specifically, a user reported an issue in which a migrate_to operation
was run a second time after completing. The second time, the migrate_to
operation failed because the resource was no longer physically present
on the source node.

The do_dc_join_finalize() function in controld_join_dc.c contains a
block that delays join finalization while a transition is in progress.
If the R_IN_TRANSITION bit is set in the input register, the controller
stalls.

The problem is that nothing sets this bit. It was added by commit
a1c1b340 in 2005, and the line of code that set the bit was mistakenly
removed by commit feef7987 in 2008. We can tell that removing the
bit-setting line of code was a mistake, because the code that clears the
bit was kept (and moved elsewhere), while the code that checks the bit
was unmodified.
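A minimal sketch of why the check was dead code, assuming a simplified
register: input_register and the three helper functions below are
hypothetical names for illustration, and only the R_IN_TRANSITION value
is taken from the removed #define.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define R_IN_TRANSITION 0x10000000ULL /* value from the removed #define */

/* Hypothetical stand-in for the controller's input register */
static uint64_t input_register = 0;

/* The check was left unmodified */
static bool join_finalize_should_stall(void)
{
    return (input_register & R_IN_TRANSITION) != 0;
}

/* The clearing code was kept (and moved elsewhere) */
static void transition_done(void)
{
    input_register &= ~R_IN_TRANSITION;
    assert((input_register & R_IN_TRANSITION) == 0);
}

/* The setting code was removed, so starting a transition no longer
 * touches the register at all */
static void transition_start(void)
{
    /* input_register |= R_IN_TRANSITION;  (the kind of line that was lost) */
}

/* Returns whether a join finalization arriving mid-transition stalls */
static bool stalls_during_transition(void)
{
    transition_start();
    bool stalled = join_finalize_should_stall();
    transition_done();
    return stalled;
}
```

With the setter gone, stalls_during_transition() returns false, so
finalization is never delayed no matter when it arrives.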

We do want to delay finalization if a transition is in progress.
However, the R_IN_TRANSITION bit itself is no longer necessary:
controld_globals.transition_graph->complete fulfills the same role, so
we can use that and remove R_IN_TRANSITION. The complete flag is
initialized to false (via calloc()) when a new graph is created during
do_te_invoke(). It's set to true by the time we reach notify_crmd()
(usually by te_graph_trigger()), which is where we previously cleared
the R_IN_TRANSITION bit.
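The lifecycle of the complete flag can be sketched as follows. The
struct and function names here are hypothetical and heavily simplified
(the real graph structure has many more fields); only the condition in
join_finalize_should_stall() mirrors the patched check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, stripped-down transition graph */
struct graph {
    bool complete;
};

/* calloc() zero-fills the allocation, so complete starts out false */
static struct graph *graph_new(void)
{
    struct graph *g = calloc(1, sizeof(struct graph));

    assert(g != NULL);
    return g;
}

static void graph_mark_complete(struct graph *g)
{
    g->complete = true;
}

/* Mirrors the patched condition: stall while the graph is incomplete */
static bool join_finalize_should_stall(const struct graph *g)
{
    return !g->complete;
}

/* Returns a two-bit value: bit 1 is the stall decision while the
 * transition is in progress, bit 0 is the decision after completion */
static int stall_before_and_after(void)
{
    struct graph *g = graph_new();
    int before = join_finalize_should_stall(g);

    graph_mark_complete(g);
    int after = join_finalize_should_stall(g);

    free(g);
    return (before << 1) | after;
}
```

Here stall_before_and_after() returns 2: finalization stalls while the
graph is in progress and proceeds once it is marked complete, without
any separate register bit to keep in sync.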

This simple fix appears to resolve the known race conditions with client
history fetching versus CIB updates during a join sequence.

Closes T375

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
---
 daemons/controld/controld_fsa.h        | 2 --
 daemons/controld/controld_join_dc.c    | 2 +-
 daemons/controld/controld_te_actions.c | 1 -
 3 files changed, 1 insertion(+), 4 deletions(-)
Index: pacemaker-2.0.1+20190417.13d370ca9/daemons/controld/controld_fsa.h
===================================================================
--- pacemaker-2.0.1+20190417.13d370ca9.orig/daemons/controld/controld_fsa.h
+++ pacemaker-2.0.1+20190417.13d370ca9/daemons/controld/controld_fsa.h
@@ -418,8 +418,6 @@ enum crmd_fsa_input {
                                            response? if so perhaps we shouldn't
                                            stop yet */
 
-#  define R_IN_TRANSITION   0x10000000ULL
-                                        /*  */
 #  define R_SENT_RSC_STOP   0x20000000ULL /* Have we sent a stop action to all
                                          * resources in preparation for
                                          * shutting down */
Index: pacemaker-2.0.1+20190417.13d370ca9/daemons/controld/controld_join_dc.c
===================================================================
--- pacemaker-2.0.1+20190417.13d370ca9.orig/daemons/controld/controld_join_dc.c
+++ pacemaker-2.0.1+20190417.13d370ca9/daemons/controld/controld_join_dc.c
@@ -429,7 +429,7 @@ do_dc_join_finalize(long long action,
         set_bit(fsa_input_register, R_HAVE_CIB);
     }
 
-    if (is_set(fsa_input_register, R_IN_TRANSITION)) {
+    if (!transition_graph->complete) {
         crm_warn("Delaying response to cluster join offer while transition in progress "
                  CRM_XS " join-%d", current_join_id);
         crmd_fsa_stall(FALSE);
Index: pacemaker-2.0.1+20190417.13d370ca9/daemons/controld/controld_te_actions.c
===================================================================
--- pacemaker-2.0.1+20190417.13d370ca9.orig/daemons/controld/controld_te_actions.c
+++ pacemaker-2.0.1+20190417.13d370ca9/daemons/controld/controld_te_actions.c
@@ -743,7 +743,6 @@ notify_crmd(crm_graph_t * graph)
 
     graph->abort_reason = NULL;
     graph->completion_action = tg_done;
-    clear_bit(fsa_input_register, R_IN_TRANSITION);
 
     if (event != I_NULL) {
         register_fsa_input(C_FSA_INTERNAL, event, NULL);