device-mapper-multipath/0112-libmultipath-change-flush_on_last_del-to-fix-a-multi.patch
Benjamin Marzinski 9ae3e98286 device-mapper-multipath-0.8.7-29
Add 0110-libmultipath-keep-track-of-queueing-state-in-feature.patch
Add 0111-libmultipath-export-partmap_in_use.patch
Add 0112-libmultipath-change-flush_on_last_del-to-fix-a-multi.patch
Add 0113-libmultipath-pad-dev_loss_tmo-to-avoid-race-with-no_.patch
  * Fixes RHEL-30272
Add 0114-libmultipath-remove-pathgroup-wildcard-options.patch
Add 0115-libmultipath-print-all-values-in-snprint_failback.patch
Add 0116-multipathd-Stop-double-counting-map-failures-for-no_.patch
Add 0117-multipath-tools-man-pages-add-missing-multipathd-com.patch
Add 0118-libmultipath-change-the-vend-prod-rev-printing.patch
Add 0119-multipath-tools-man-pages-Add-format-wildcard-descri.patch
  * Fixes RHEL-8304
Resolves: RHEL-8304
Resolves: RHEL-30272
2024-05-22 00:09:52 -04:00

347 lines
13 KiB
Diff

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Benjamin Marzinski <bmarzins@redhat.com>
Date: Thu, 25 Apr 2024 19:35:14 -0400
Subject: [PATCH] libmultipath: change flush_on_last_del to fix a multipathd
hang
Commit 9bd3482e ("multipathd: make flush_map() delete maps like the
multipath command") fixed an issue where deleting a queueing multipath
device through multipathd could hang because the multipath device had
outstanding IO, even though the only openers of it at the time of
deletion were the kpartx partition devices. However it is still possible
to hang multipathd, because autoremoving the device when all paths have
been deleted doesn't disable queueing. To reproduce this hang:
1. create a multipath device with a kpartx partition on top of it and
no_path_retry set to either "queue" or something long enough to run all
the commands in the reproducer before it disables queueing.
2. disable all the paths to the device with something like:
# echo offline > /sys/block/<path_dev>/device/state
3. Write directly to the multipath device with something like:
# dd if=/dev/zero of=/dev/mapper/<mpath_dev> bs=4K count=1
4. delete all the paths to the device with something like:
# echo 1 > /sys/block/<path_dev>/device/delete
Multipathd will hang trying to delete the kpartx device because, as the
last opener, it must flush the multipath device before closing it.
Because it hangs holding the vecs_lock, multipathd will never disable
queueing on the device, so it will hang forever, even if no_path_retry
is set to a number.
This hang can occur, even if deferred_remove is set. Since nothing has
the kpartx device opened, device-mapper does an immediate remove, which
will still hang. This means that even if deferred_remove is set,
multipathd still cannot delete a map while queueing is enabled. It must
either disable queueing or skip the autoremove.
Mulitpath can currently be configured to avoid this hang by setting
flush_on_last_del yes
However there are good reasons why users wouldn't want to set that. They
may need to be able to survive having all paths getting temporarily
deleted. I should note that this is a pretty rare corner case, since
multipath automatically sets dev_loss_tmo so that it should not trigger
before queueing is disabled.
This commit avoids the hang by changing the possible values for
flush_on_last_del to "never", "unused", and "always", and sets the
default to "unused". "always" works like "yes" did, "never" will not
disable queueing, and "unused" will only disable queueing if nothing has
the kpartx devices or the multipath device open. In order to be safe, if
the device has queue_if_no_paths set (and in case of "unused", the
device is in-use) the autoremove will be skipped. Also, instead of just
trusting the lack of "queue_if_no_paths" in the current mpp->features,
multipathd will tell the kernel to disable queueing, just to be sure it
actually is.
I chose "unused" as the default because this should generally only cause
multipathd to work differently from the users perspective when nothing
has the multipath device open but it is queueing and there is
outstanding IO. Without this change, it would have hung instead of
failing the outstanding IO. However, I do understand that an argument
could be made that "never" makes more sense as default, even though it
will cause multipathd to skip autoremoves in cases where it wouldn't
before. The change to the behavior of deffered_remove will be
noticeable, but skipping an autoremove is much better than hanging.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/defaults.h | 2 +-
libmultipath/dict.c | 72 +++++++++++++++++++++++++++++++++-----
libmultipath/dict.h | 1 +
libmultipath/hwtable.c | 6 ++--
libmultipath/propsel.c | 4 ++-
libmultipath/structs.h | 7 ++--
multipath/multipath.conf.5 | 20 ++++++++---
multipathd/main.c | 39 ++++++++++++++++-----
8 files changed, 122 insertions(+), 29 deletions(-)
diff --git a/libmultipath/defaults.h b/libmultipath/defaults.h
index c3788bbc..5e77387e 100644
--- a/libmultipath/defaults.h
+++ b/libmultipath/defaults.h
@@ -41,7 +41,7 @@
#define DEFAULT_PRIO PRIO_CONST
#define DEFAULT_PRIO_ARGS ""
#define DEFAULT_CHECKER TUR
-#define DEFAULT_FLUSH FLUSH_DISABLED
+#define DEFAULT_FLUSH FLUSH_UNUSED
#define DEFAULT_USER_FRIENDLY_NAMES USER_FRIENDLY_NAMES_OFF
#define DEFAULT_FORCE_SYNC 0
#define UNSET_PARTITION_DELIM "/UNSET/"
diff --git a/libmultipath/dict.c b/libmultipath/dict.c
index ce1b6c99..3c011ece 100644
--- a/libmultipath/dict.c
+++ b/libmultipath/dict.c
@@ -825,14 +825,70 @@ declare_def_snprint(checker_timeout, print_nonzero)
declare_def_handler(allow_usb_devices, set_yes_no)
declare_def_snprint(allow_usb_devices, print_yes_no)
-declare_def_handler(flush_on_last_del, set_yes_no_undef)
-declare_def_snprint_defint(flush_on_last_del, print_yes_no_undef, DEFAULT_FLUSH)
-declare_ovr_handler(flush_on_last_del, set_yes_no_undef)
-declare_ovr_snprint(flush_on_last_del, print_yes_no_undef)
-declare_hw_handler(flush_on_last_del, set_yes_no_undef)
-declare_hw_snprint(flush_on_last_del, print_yes_no_undef)
-declare_mp_handler(flush_on_last_del, set_yes_no_undef)
-declare_mp_snprint(flush_on_last_del, print_yes_no_undef)
+
+static const char * const flush_on_last_del_optvals[] = {
+ [FLUSH_NEVER] = "never",
+ [FLUSH_ALWAYS] = "always",
+ [FLUSH_UNUSED] = "unused",
+};
+
+static int
+set_flush_on_last_del(vector strvec, void *ptr, const char *file, int line_nr)
+{
+ int i;
+ int *flush_val_ptr = (int *)ptr;
+ char *buff;
+
+ buff = set_value(strvec);
+ if (!buff)
+ return 1;
+
+ for (i = FLUSH_NEVER; i <= FLUSH_UNUSED; i++) {
+ if (flush_on_last_del_optvals[i] != NULL &&
+ !strcmp(buff, flush_on_last_del_optvals[i])) {
+ *flush_val_ptr = i;
+ break;
+ }
+ }
+
+ if (i > FLUSH_UNUSED) {
+ bool deprecated = true;
+ if (strcmp(buff, "no") == 0 || strcmp(buff, "0") == 0)
+ *flush_val_ptr = FLUSH_UNUSED;
+ else if (strcmp(buff, "yes") == 0 || strcmp(buff, "1") == 0)
+ *flush_val_ptr = FLUSH_ALWAYS;
+ else {
+ deprecated = false;
+ condlog(1, "%s line %d, invalid value for flush_on_last_del: \"%s\"",
+ file, line_nr, buff);
+ }
+ if (deprecated)
+ condlog(3, "%s line %d, \"%s\" is a deprecated value for flush_on_last_del and is treated as \"%s\"",
+ file, line_nr, buff,
+ flush_on_last_del_optvals[*flush_val_ptr]);
+ }
+
+ free(buff);
+ return 0;
+}
+
+int
+print_flush_on_last_del(struct strbuf *buff, long v)
+{
+ if (v == FLUSH_UNDEF)
+ return 0;
+ return append_strbuf_quoted(buff, flush_on_last_del_optvals[(int)v]);
+}
+
+declare_def_handler(flush_on_last_del, set_flush_on_last_del)
+declare_def_snprint_defint(flush_on_last_del, print_flush_on_last_del,
+ DEFAULT_FLUSH)
+declare_ovr_handler(flush_on_last_del, set_flush_on_last_del)
+declare_ovr_snprint(flush_on_last_del, print_flush_on_last_del)
+declare_hw_handler(flush_on_last_del, set_flush_on_last_del)
+declare_hw_snprint(flush_on_last_del, print_flush_on_last_del)
+declare_mp_handler(flush_on_last_del, set_flush_on_last_del)
+declare_mp_snprint(flush_on_last_del, print_flush_on_last_del)
declare_def_handler(user_friendly_names, set_yes_no_undef)
declare_def_snprint_defint(user_friendly_names, print_yes_no_undef,
diff --git a/libmultipath/dict.h b/libmultipath/dict.h
index d963b4ad..6b1aae5c 100644
--- a/libmultipath/dict.h
+++ b/libmultipath/dict.h
@@ -20,4 +20,5 @@ int print_reservation_key(struct strbuf *buff,
struct be64 key, uint8_t flags, int source);
int print_off_int_undef(struct strbuf *buff, long v);
int print_auto_resize(struct strbuf *buff, long v);
+int print_flush_on_last_del(struct strbuf *buff, long v);
#endif /* _DICT_H */
diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index 78ac7988..94012d50 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -60,7 +60,7 @@
.no_path_retry = NO_PATH_RETRY_UNDEF,
.minio = 1000,
.minio_rq = 1,
- .flush_on_last_del = FLUSH_DISABLED,
+ .flush_on_last_del = FLUSH_UNUSED,
.user_friendly_names = USER_FRIENDLY_NAMES_OFF,
.fast_io_fail = 5,
.dev_loss = 600,
@@ -800,7 +800,7 @@ static struct hwentry default_hw[] = {
.no_path_retry = NO_PATH_RETRY_QUEUE,
.pgpolicy = GROUP_BY_PRIO,
.pgfailback = -FAILBACK_IMMEDIATE,
- .flush_on_last_del = FLUSH_ENABLED,
+ .flush_on_last_del = FLUSH_ALWAYS,
.dev_loss = MAX_DEV_LOSS_TMO,
.prio_name = PRIO_ONTAP,
.user_friendly_names = USER_FRIENDLY_NAMES_OFF,
@@ -1122,7 +1122,7 @@ static struct hwentry default_hw[] = {
.no_path_retry = NO_PATH_RETRY_FAIL,
.minio = 1,
.minio_rq = 1,
- .flush_on_last_del = FLUSH_ENABLED,
+ .flush_on_last_del = FLUSH_ALWAYS,
.fast_io_fail = 15,
.dev_loss = 15,
},
diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index 9dea6f92..be781ff7 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -902,6 +902,7 @@ out:
int select_flush_on_last_del(struct config *conf, struct multipath *mp)
{
const char *origin;
+ STRBUF_ON_STACK(buff);
mp_set_mpe(flush_on_last_del);
mp_set_ovr(flush_on_last_del);
@@ -909,8 +910,9 @@ int select_flush_on_last_del(struct config *conf, struct multipath *mp)
mp_set_conf(flush_on_last_del);
mp_set_default(flush_on_last_del, DEFAULT_FLUSH);
out:
+ print_flush_on_last_del(&buff, mp->flush_on_last_del);
condlog(3, "%s: flush_on_last_del = %s %s", mp->alias,
- (mp->flush_on_last_del == FLUSH_ENABLED)? "yes" : "no", origin);
+ get_strbuf_str(&buff), origin);
return 0;
}
diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index d2ad4867..4bf8c93a 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -109,9 +109,10 @@ enum marginal_pathgroups_mode {
};
enum flush_states {
- FLUSH_UNDEF = YNU_UNDEF,
- FLUSH_DISABLED = YNU_NO,
- FLUSH_ENABLED = YNU_YES,
+ FLUSH_UNDEF,
+ FLUSH_NEVER,
+ FLUSH_ALWAYS,
+ FLUSH_UNUSED,
};
enum log_checker_err_states {
diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
index 38eb5c90..10eddc0c 100644
--- a/multipath/multipath.conf.5
+++ b/multipath/multipath.conf.5
@@ -672,12 +672,22 @@ The default is: \fBno\fR
.TP
.B flush_on_last_del
If set to
-.I yes
+.I always
, multipathd will disable queueing when the last path to a device has been
-deleted.
-.RS
-.TP
-The default is: \fBno\fR
+deleted. If set to
+.I never
+, multipathd will not disable queueing when the last path to a device has
+been deleted. Since multipath cannot safely remove a device while queueing
+is enabled, setting this to \fInever\fR means that multipathd will not
+automatically remove an unused multipath device whose paths are all deleted if
+it is currently set to queue_if_no_path. If set to
+.I unused
+, multipathd will only disable queueing when the last path is removed if
+nothing currently has the multipath device or any of the kpartx partition
+devices on top of it open.
+.RS
+.TP
+The default is: \fBunused\fR
.RE
.
.
diff --git a/multipathd/main.c b/multipathd/main.c
index 74f8114c..8ec58f5d 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -491,19 +491,42 @@ int update_multipath (struct vectors *vecs, char *mapname, int reset)
static bool
flush_map_nopaths(struct multipath *mpp, struct vectors *vecs) {
int r;
+ bool is_queueing = true;
+ if (mpp->features)
+ is_queueing = strstr(mpp->features, "queue_if_no_path");
+
+ /* It's not safe to do a remove of a map that has "queue_if_no_path"
+ * set, since there could be outstanding IO which will cause
+ * multipathd to hang while attempting the remove */
+ if (mpp->flush_on_last_del == FLUSH_NEVER && is_queueing) {
+ condlog(2, "%s: map is queueing, can't remove", mpp->alias);
+ return false;
+ }
+ if (mpp->flush_on_last_del == FLUSH_UNUSED &&
+ partmap_in_use(mpp->alias, NULL) && is_queueing) {
+ condlog(2, "%s: map in use and queueing, can't remove",
+ mpp->alias);
+ return false;
+ }
/*
- * flush_map will fail if the device is open
+ * This will flush FLUSH_NEVER devices and FLUSH_UNUSED devices
+ * that are in use, but only if they are already marked as not
+ * queueing. That is just to make absolutely certain that they
+ * really are not queueing, like they claim.
*/
- if (mpp->flush_on_last_del == FLUSH_ENABLED) {
- condlog(2, "%s Last path deleted, disabling queueing",
+ condlog(is_queueing ? 2 : 3, "%s Last path deleted, disabling queueing",
+ mpp->alias);
+ mpp->retry_tick = 0;
+ mpp->no_path_retry = NO_PATH_RETRY_FAIL;
+ mpp->disable_queueing = 1;
+ mpp->stat_map_failures++;
+ if (dm_queue_if_no_path(mpp, 0) != 0) {
+ condlog(0, "%s: failed to disable queueing. Not removing",
mpp->alias);
- mpp->retry_tick = 0;
- mpp->no_path_retry = NO_PATH_RETRY_FAIL;
- mpp->disable_queueing = 1;
- mpp->stat_map_failures++;
- dm_queue_if_no_path(mpp, 0);
+ return false;
}
+
r = dm_flush_map_nopaths(mpp->alias, mpp->deferred_remove);
if (r) {
if (r == 1)