View File fenced_do_not_ignore_victim_done_messages_for_reduced_victims.patch of Package cluster (Project home:sschapiro:openstack:upstream)

commit 4eb20ad9f8e62d0739c7897d79bc1bccb94e5250
Author: David Teigland <teigland@redhat.com>
Date:   Tue Feb 22 15:47:24 2011 -0600

    fenced: don't ignore victim_done messages for reduced victims
    
    When a victim is "reduced" (i.e. fenced skips fencing it because it
    rejoins the cluster cleanly before fenced fences it), it is immediately
    removed from the list of victims, before the "victim_done" message is
    sent for it.  The victim_done message updates the time of the last
    successful fencing operation for a failed node.
    
    The code that processes received victim_done messages was ignoring the
    message for the reduced victim because the node couldn't be found in
    the victims list.  As a result, the latest fencing information was not
    recorded for the node, and dlm_controld waited indefinitely for
    fencing to complete for the reduced victim.
    
    The fix is to simply record the information from a victim_done message
    even if the node is not in the victims list.
    
    bz 678704
    
    Acked-by: Ryan O'Hara <rohara@redhat.com>
    Signed-off-by: David Teigland <teigland@redhat.com>
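
The behaviour described in the commit message can be illustrated with a small
standalone sketch. This is not the fenced code: the list type, the
last_fence_time table, MAX_NODES and the helpers add_victim(), take_victim()
and reduce_victim() are hypothetical simplifications, assumed only for
illustration. The point it shows is that the fencing time is recorded whether
or not the node is still on the victims list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for fenced's data structures. */
#define MAX_NODES 8

struct victim {
	int nodeid;
	struct victim *next;
};

static struct victim *victims;                        /* pending victims */
static unsigned long long last_fence_time[MAX_NODES]; /* per-node record */

/* Put a failed node on the victims list. */
static void add_victim(int nodeid)
{
	struct victim *v = malloc(sizeof(*v));

	assert(v != NULL);
	v->nodeid = nodeid;
	v->next = victims;
	victims = v;
}

/* Unlink and return the victim for nodeid, or NULL if it is not listed. */
static struct victim *take_victim(int nodeid)
{
	struct victim **p;

	for (p = &victims; *p; p = &(*p)->next) {
		if ((*p)->nodeid == nodeid) {
			struct victim *v = *p;
			*p = v->next;
			return v;
		}
	}
	return NULL;
}

/* A "reduced" victim leaves the list before victim_done is received. */
static void reduce_victim(int nodeid)
{
	free(take_victim(nodeid)); /* free(NULL) is a no-op */
}

/* Record the fencing time even when the node is no longer a victim. */
static void receive_victim_done(int nodeid, unsigned long long fence_time)
{
	struct victim *v = take_victim(nodeid);

	assert(nodeid >= 0 && nodeid < MAX_NODES);

	/* Before the fix, the equivalent of "if (!v) return;" sat here,
	   so the record below was never updated for a reduced victim. */
	last_fence_time[nodeid] = fence_time;

	free(v); /* v may be NULL for a reduced victim */
}
```

In this sketch, calling add_victim(3), then reduce_victim(3), then
receive_victim_done(3, t) still leaves last_fence_time[3] set to t, which is
the information a waiter such as dlm_controld depends on.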

diff --git a/fence/fenced/cpg.c b/fence/fenced/cpg.c
index a8629b9..99e16a0 100644
--- a/fence/fenced/cpg.c
+++ b/fence/fenced/cpg.c
@@ -652,9 +652,9 @@ static void receive_victim_done(struct fd *fd, struct fd_header *hd, int len)
 
 	node = get_node_victim(fd, id->nodeid);
 	if (!node) {
+		/* see comment below about no node */
 		log_debug("receive_victim_done %d:%u no victim nodeid %d",
 			  hd->nodeid, seq, id->nodeid);
-		return;
 	}
 
 	log_debug("receive_victim_done %d:%u remove victim %d time %llu how %d",
@@ -670,9 +670,11 @@ static void receive_victim_done(struct fd *fd, struct fd_header *hd, int len)
 	if (hd->nodeid == our_nodeid) {
 		/* sanity check, I don't think this should happen;
 		   see comment in fence_victims() */
-		if (!node->local_victim_done)
-			log_error("expect local_victim_done");
-		node->local_victim_done = 0;
+		if (node) {
+			if (!node->local_victim_done)
+				log_error("expect local_victim_done");
+			node->local_victim_done = 0;
+		}
 	} else {
 		/* save details of fencing operation from master, which
 		   master saves at the time it completes it */
@@ -680,8 +682,12 @@ static void receive_victim_done(struct fd *fd, struct fd_header *hd, int len)
 				   id->fence_how, id->fence_time);
 	}
 
-	list_del(&node->list);
-	free(node);
+	/* we can have no node when reduce_victims() removes it, bz 678704 */
+
+	if (node) {
+		list_del(&node->list);
+		free(node);
+	}
 }
 
 /* we know that the quorum value here is consistent with the cpg events