resource-agents/SOURCES/bz1792196-rabbitmq-cluster-...

From 47d75f8de9dc912da035805f141c674885ce432f Mon Sep 17 00:00:00 2001
From: John Eckersberg <jeckersb@redhat.com>
Date: Thu, 16 Jan 2020 10:20:59 -0500
Subject: [PATCH] rabbitmq-cluster: ensure we delete nodename if stop action
 fails

If the stop action fails, we want to remove the nodename from the crm
attribute. Currently the stop action can report failure even though
the rabbitmq server has actually stopped, which leaves the attribute
in place. If the entire rabbitmq cluster is then stopped, it cannot
be started again: the first node to start sees the stale attribute,
assumes at least one other node is running, and tries to join an
existing cluster instead of rebootstrapping the cluster from a
single node.
---
 heartbeat/rabbitmq-cluster | 2 ++
 1 file changed, 2 insertions(+)

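Note (not part of the commit): the failure mode described above can be
reproduced in isolation with a small sketch. Everything here is a
hypothetical stand-in: the attribute store is a scratch directory rather
than the crm attribute, and set_nodename, delete_nodename, start_mode and
stop_node are illustrative names, not functions of the resource agent.

```shell
#!/bin/sh
# Hypothetical stand-in for the per-node crm attribute: one file per node
# in a scratch directory. The real agent keeps this in the cluster itself.
ATTR_DIR=$(mktemp -d)

set_nodename()    { echo "rabbit@$1" > "$ATTR_DIR/$1"; }
delete_nodename() { rm -f "$ATTR_DIR/$1"; }

# Decide how a starting node behaves: join if any other node still
# advertises a nodename, otherwise bootstrap a fresh cluster.
start_mode() {
	me=$1
	for f in "$ATTR_DIR"/*; do
		[ -e "$f" ] || continue                  # unmatched glob: directory is empty
		[ "$(basename "$f")" = "$me" ] && continue
		echo join
		return 0
	done
	echo bootstrap
}

# Simulated stop whose command reports failure although the server is down.
# The second argument selects whether the failure path cleans up the attribute.
stop_node() {
	node=$1
	cleanup=$2
	rc=1   # assumption for the demo: the stop command always fails
	if [ "$rc" -ne 0 ]; then
		if [ "$cleanup" = yes ]; then
			delete_nodename "$node"
		fi
		return "$rc"
	fi
	delete_nodename "$node"
}

# node1 stopped cleanly and has no attribute; node2's stop "fails".
set_nodename node2
stop_node node2 no || true
before=$(start_mode node1)

stop_node node2 yes || true
after=$(start_mode node1)

echo "without cleanup: $before"   # prints "without cleanup: join"
echo "with cleanup:    $after"    # prints "with cleanup:    bootstrap"

rm -rf "$ATTR_DIR"
```

Without the cleanup, node1 sees node2's stale entry and attempts to join;
with it, node1 finds no peers and bootstraps.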
diff --git a/heartbeat/rabbitmq-cluster b/heartbeat/rabbitmq-cluster
index 7837e9e3c..a9ebd37ad 100755
--- a/heartbeat/rabbitmq-cluster
+++ b/heartbeat/rabbitmq-cluster
@@ -552,6 +552,7 @@ rmq_stop() {
 
 	if [ $rc -ne 0 ]; then
 		ocf_log err "rabbitmq-server stop command failed: $RMQ_CTL stop, $rc"
+		rmq_delete_nodename
 		return $rc
 	fi
 
@@ -565,6 +566,7 @@ rmq_stop() {
 			break
 		elif [ "$rc" -ne $OCF_SUCCESS ]; then
 			ocf_log info "rabbitmq-server stop failed: $rc"
+			rmq_delete_nodename
 			exit $OCF_ERR_GENERIC
 		fi
 		sleep 1