device-mapper-multipath/0161-limpathpersist-redesign-failed-release-workaround.patch
Benjamin Marzinski 5a8898b817 device-mapper-multipath-0.8.7-40
Add 0150-libmpathpersist-retry-commands-on-other-paths-in-mpa.patch
Add 0151-libmpathpersist-check-released-key-against-the-reser.patch
Add 0152-multipathd-remove-thread-from-mpath_pr_event_handle.patch
Add 0153-libmpathpersist-remove-uneeded-wrapper-function.patch
Add 0154-libmpathpersist-reduce-log-level-for-persistent-rese.patch
Add 0155-libmpathpersist-remove-pointless-update_map_pr-ret-v.patch
Add 0156-multipathd-use-update_map_pr-in-mpath_pr_event_handl.patch
Add 0157-libmpathpersist-limit-changing-prflag-in-update_map_.patch
Add 0158-multipathd-Don-t-call-update_map_pr-unnecessarily.patch
Add 0159-libmpathpersist-remove-useless-function-send_prout_a.patch
Add 0160-libmpathpersist-fix-memory-leak-in-mpath_prout_rel.patch
Add 0161-limpathpersist-redesign-failed-release-workaround.patch
Add 0162-libmpathpersist-fail-the-release-if-all-threads-fail.patch
Add 0163-limpathpersist-Handle-changing-key-corner-case.patch
Add 0164-libmpathpersist-fix-command-keyword-ordering.patch
Add 0165-libmpathpersist-use-conf-timeout-for-updating-persis.patch
Add 0166-libmapthpersist-Handle-REGISTER-AND-IGNORE-changing-.patch
Add 0167-libmultipath-rename-prflag_value-enums.patch
Add 0168-libmpathpersist-use-a-switch-statement-for-prout-com.patch
Add 0169-libmpathpersist-Add-safety-check-for-preempting-on-k.patch
Add 0170-limpathpersist-remove-update_map_pr-code-for-NULL-pp.patch
Add 0171-libmpathpersist-move-update_map_pr-to-multipathd.patch
Add 0172-multipathd-clean-up-update_map_pr-and-mpath_pr_event.patch
Add 0173-libmpathpersist-clean-up-duplicate-function-declarat.patch
Add 0174-multipathd-wrap-setting-and-unsetting-prflag.patch
Add 0175-multipathd-unregister-PR-key-when-path-is-restored-i.patch
Add 0176-libmpathpersist-Fix-up-reservation_key-checking.patch
Add 0177-libmpathpersist-change-how-reservation-conflicts-are.patch
Add 0178-libmpathpersist-Clear-prkey-in-multipathd-before-unr.patch
Add 0179-libmpathpersist-only-clear-the-key-if-we-are-using-t.patch
Add 0180-libmpathpersist-Restore-old-reservation-key-on-failu.patch
Add 0181-libmpatpersist-update-reservation-key-before-checkin.patch
Add 0182-libmpathpersist-retry-on-conflicts-in-mpath_prout_co.patch
Add 0183-libmpathpersist-Don-t-always-fail-registrations-for-.patch
Add 0184-libmpathpersist-Don-t-try-release-workaround-for-inv.patch
Add 0185-libmpathpersist-Don-t-fail-RESERVE-commands-unnecess.patch
Add 0186-libmpathpersist-reregister-keys-when-self-preempting.patch
Add 0187-libmpathpersist-handle-updating-key-race-condition.patch
Add 0188-libmpathpersist-handle-preempting-all-registrants-re.patch
Add 0189-libmpathpersist-Fix-REGISTER-AND-IGNORE-while-holdin.patch
Add 0190-libmpathpersist-Handle-RESERVE-with-reservation-held.patch
Add 0191-libmpathpersist-use-check_holding_reservation-in-mpa.patch
Add 0192-libmpathpersist-Fix-unregistering-while-holding-the-.patch
Add 0193-libmpathpersist-Fix-race-between-restoring-a-path-an.patch
Add 0194-multipathd-Fix-tracking-of-old-PR-key.patch
  * Fixes RHEL-118515 ("There are many bugs in multipath's persistent
    reservation handling")
Resolves: RHEL-118515
2025-10-01 15:39:06 -04:00

308 lines
11 KiB
Diff

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Benjamin Marzinski <bmarzins@redhat.com>
Date: Mon, 23 Jun 2025 20:40:48 -0400
Subject: [PATCH] limpathpersist: redesign failed release workaround
The workaround for releasing a reservation held by a failed path
(clearing all persistent reservation data and then reregistering all the
keys) has several problems. It requires devices that support the READ
FULL STATUS command and are capable of specifying TransportIDs with
REGISTRATION commands (SIP_C), neither of which are mandatory to support
SCSI Persistent Reservations. Non SIP_C devices will be left with no
registered keys. Also, not all cleared keys are registered, just the
ones going to the Target Port receiving the REGISTRATION command. To
reregister all the keys, the code would have to also use the "All Target
Ports" flag to register keys on different Target Ports, but this could
end up registering keys on paths they aren't supposed to be on (for
instance if one of the registered keys was only there because the path
was down and it couldn't be released).
The redesign avoids these issues by only using mandatory Persistent
Reservation commands, without extra optional parameters or flags, and
only effects the keys of the multipath device it is being issued on.
The new workaround is:
1. Suspend the multipath device to prevent I/O
2. Preempt the key to move the reservation to an available path. This
also removes the registered keys from every path except the path
issuing the PREEMPT command. Since the device is suspended, not I/O
can go to these unregisted paths and fail.
3. Release the reservation on the path that now holds it.
4. Resume the device (since it no longer matters that most of the paths
no longer have a registered key)
5. Reregister the keys on all the paths.
If steps 3 or 4 fail, the code will attempt to reregister the keys, and
then attempt (or possibly re-attempt) the resume.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmpathpersist/mpath_persist.c | 171 ++++++++++++++----------------
libmpathpersist/mpathpr.h | 2 -
libmultipath/libmultipath.version | 1 +
3 files changed, 78 insertions(+), 96 deletions(-)
diff --git a/libmpathpersist/mpath_persist.c b/libmpathpersist/mpath_persist.c
index 59159737..7e60aa47 100644
--- a/libmpathpersist/mpath_persist.c
+++ b/libmpathpersist/mpath_persist.c
@@ -97,6 +97,11 @@ int libmpathpersist_exit(void)
return 0;
}
+static int mpath_prout_common(struct multipath *mpp,int rq_servact,
+ int rq_scope, unsigned int rq_type,
+ struct prout_param_descriptor* paramp, int noisy,
+ struct path **pptr);
+
int
mpath_prin_activepath (struct multipath *mpp, int rq_servact,
struct prin_resp * resp, int noisy)
@@ -283,6 +288,7 @@ static int do_mpath_persistent_reserve_out(vector curmp, vector pathvec, int fd,
conf = get_multipath_config();
select_reservation_key(conf, mpp);
select_all_tg_pt(conf, mpp);
+ select_skip_kpartx(conf, mpp);
put_multipath_config(conf);
memcpy(&prkey, paramp->sa_key, 8);
@@ -319,7 +325,8 @@ static int do_mpath_persistent_reserve_out(vector curmp, vector pathvec, int fd,
case MPATH_PROUT_PREE_SA :
case MPATH_PROUT_PREE_AB_SA :
case MPATH_PROUT_CLEAR_SA:
- ret = mpath_prout_common(mpp, rq_servact, rq_scope, rq_type, paramp, noisy);
+ ret = mpath_prout_common(mpp, rq_servact, rq_scope, rq_type,
+ paramp, noisy, NULL);
break;
case MPATH_PROUT_REL_SA:
ret = mpath_prout_rel(mpp, rq_servact, rq_scope, rq_type, paramp, noisy);
@@ -571,8 +578,10 @@ void * mpath_prout_pthread_fn(void *p)
pthread_exit(NULL);
}
-int mpath_prout_common(struct multipath *mpp,int rq_servact, int rq_scope,
- unsigned int rq_type, struct prout_param_descriptor* paramp, int noisy)
+static int mpath_prout_common(struct multipath *mpp,int rq_servact, int rq_scope,
+ unsigned int rq_type,
+ struct prout_param_descriptor* paramp, int noisy,
+ struct path **pptr)
{
int i,j, ret;
struct pathgroup *pgp = NULL;
@@ -591,6 +600,8 @@ int mpath_prout_common(struct multipath *mpp,int rq_servact, int rq_scope,
found = true;
ret = prout_do_scsi_ioctl(pp->dev, rq_servact, rq_scope,
rq_type, paramp, noisy);
+ if (ret == MPATH_PR_SUCCESS && pptr)
+ *pptr = pp;
if (ret != MPATH_PR_RETRYABLE_ERROR)
return ret;
}
@@ -610,14 +621,12 @@ int mpath_prout_rel(struct multipath *mpp,int rq_servact, int rq_scope,
struct path *pp = NULL;
int active_pathcount = 0;
pthread_attr_t attr;
- int rc, found = 0;
+ int rc;
int count = 0;
int status = MPATH_PR_SUCCESS;
struct prin_resp resp;
- struct prout_param_descriptor *pamp = NULL;
- struct prin_resp *pr_buff;
- int length;
- struct transportid *pptr = NULL;
+ uint16_t udev_flags = (mpp->skip_kpartx)? MPATH_UDEV_NO_KPARTX_FLAG : 0;
+ bool did_resume = false;
if (!mpp)
return MPATH_PR_DMMP_ERROR;
@@ -707,104 +716,78 @@ int mpath_prout_rel(struct multipath *mpp,int rq_servact, int rq_scope,
}
condlog (2, "%s: Path holding reservation is not available.", mpp->wwid);
+ /*
+ * Cannot free the reservation because the path that is holding it
+ * is not usable. Workaround this by:
+ * 1. Suspending the device
+ * 2. Preempting the reservation to move it to a usable path
+ * (this removes the registered keys on all paths except the
+ * preempting one. Since the device is suspended, no IO can
+ * go to these unregistered paths and fail).
+ * 3. Releasing the reservation on the path that now holds it.
+ * 4. Resuming the device (since it no longer matters that most of
+ * that paths no longer have a registered key)
+ * 5. Reregistering keys on all the paths
+ */
- pr_buff = mpath_alloc_prin_response(MPATH_PRIN_RFSTAT_SA);
- if (!pr_buff){
- condlog (0, "%s: failed to alloc pr in response buffer.", mpp->wwid);
+ if (!dm_simplecmd_noflush(DM_DEVICE_SUSPEND, mpp->alias, 0)) {
+ condlog(0, "%s: release: failed to suspend dm device.", mpp->wwid);
return MPATH_PR_OTHER;
}
- status = mpath_prin_activepath (mpp, MPATH_PRIN_RFSTAT_SA, pr_buff, noisy);
-
- if (status != MPATH_PR_SUCCESS){
- condlog (0, "%s: pr in read full status command failed.", mpp->wwid);
- goto out;
- }
-
- num = pr_buff->prin_descriptor.prin_readfd.number_of_descriptor;
- if (0 == num){
- goto out;
- }
- length = sizeof (struct prout_param_descriptor) + (sizeof (struct transportid *));
-
- pamp = (struct prout_param_descriptor *)malloc (length);
- if (!pamp){
- condlog (0, "%s: failed to alloc pr out parameter.", mpp->wwid);
- goto out;
- }
-
- memset(pamp, 0, length);
-
- pamp->trnptid_list[0] = (struct transportid *) malloc (sizeof (struct transportid));
- if (!pamp->trnptid_list[0]){
- condlog (0, "%s: failed to alloc pr out transportid.", mpp->wwid);
- goto out1;
+ memset(paramp, 0, sizeof(*paramp));
+ memcpy(paramp->key, &mpp->reservation_key, 8);
+ memcpy(paramp->sa_key, &mpp->reservation_key, 8);
+ status = mpath_prout_common(mpp, MPATH_PROUT_PREE_SA, rq_scope, rq_type,
+ paramp, noisy, &pp);
+ if (status != MPATH_PR_SUCCESS) {
+ condlog(0, "%s: release: pr preempt command failed.", mpp->wwid);
+ goto fail_resume;
}
- pptr = pamp->trnptid_list[0];
- if (get_be64(mpp->reservation_key)){
- memcpy (pamp->key, &mpp->reservation_key, 8);
- condlog (3, "%s: reservation key set.", mpp->wwid);
+ memset(paramp, 0, sizeof(*paramp));
+ memcpy(paramp->key, &mpp->reservation_key, 8);
+ status = prout_do_scsi_ioctl(pp->dev, MPATH_PROUT_REL_SA, rq_scope,
+ rq_type, paramp, noisy);
+ if (status != MPATH_PR_SUCCESS) {
+ condlog(0, "%s: release on alternate path failed.", mpp->wwid);
+ goto out_reregister;
}
- status = mpath_prout_common (mpp, MPATH_PROUT_CLEAR_SA,
- rq_scope, rq_type, pamp, noisy);
-
- if (status) {
- condlog(0, "%s: failed to send CLEAR_SA", mpp->wwid);
- goto out2;
+ if (!dm_simplecmd_noflush(DM_DEVICE_RESUME, mpp->alias, udev_flags)) {
+ condlog(0, "%s release: failed to resume dm device.", mpp->wwid);
+ /*
+ * leave status set to MPATH_PR_SUCCESS, we will have another
+ * chance to resume the device.
+ */
+ goto out_reregister;
}
+ did_resume = true;
- pamp->num_transportid = 1;
-
- for (i = 0; i < num; i++){
- if (get_be64(mpp->reservation_key) &&
- memcmp(pr_buff->prin_descriptor.prin_readfd.descriptors[i]->key,
- &mpp->reservation_key, 8)){
- /*register with tarnsport id*/
- memset(pamp, 0, length);
- pamp->trnptid_list[0] = pptr;
- memset (pamp->trnptid_list[0], 0, sizeof (struct transportid));
- memcpy (pamp->sa_key,
- pr_buff->prin_descriptor.prin_readfd.descriptors[i]->key, 8);
- pamp->sa_flags = MPATH_F_SPEC_I_PT_MASK;
- pamp->num_transportid = 1;
-
- memcpy (pamp->trnptid_list[0],
- &pr_buff->prin_descriptor.prin_readfd.descriptors[i]->trnptid,
- sizeof (struct transportid));
- status = mpath_prout_common (mpp, MPATH_PROUT_REG_SA, 0, rq_type,
- pamp, noisy);
-
- pamp->sa_flags = 0;
- memcpy (pamp->key, pr_buff->prin_descriptor.prin_readfd.descriptors[i]->key, 8);
- memset (pamp->sa_key, 0, 8);
- pamp->num_transportid = 0;
- status = mpath_prout_common (mpp, MPATH_PROUT_REG_SA, 0, rq_type,
- pamp, noisy);
- }
- else
- {
- if (get_be64(mpp->reservation_key))
- found = 1;
- }
-
-
- }
+out_reregister:
+ memset(paramp, 0, sizeof(*paramp));
+ memcpy(paramp->sa_key, &mpp->reservation_key, 8);
+ rc = mpath_prout_reg(mpp, MPATH_PROUT_REG_IGN_SA, rq_scope, rq_type,
+ paramp, noisy);
+ if (rc != MPATH_PR_SUCCESS)
+ condlog(0, "%s: release: failed to reregister paths.", mpp->wwid);
- if (found){
- memset (pamp, 0, length);
- memcpy (pamp->sa_key, &mpp->reservation_key, 8);
- memset (pamp->key, 0, 8);
- status = mpath_prout_reg(mpp, MPATH_PROUT_REG_SA, rq_scope, rq_type, pamp, noisy);
+ /*
+ * If we failed releasing the reservation or resuming earlier
+ * try resuming now. Otherwise, return with the reregistering status
+ * This means we will report failure, even though the resevation
+ * has been released, since the keys were not reregistered.
+ */
+ if (did_resume)
+ return rc;
+ else if (status == MPATH_PR_SUCCESS)
+ status = rc;
+fail_resume:
+ if (!dm_simplecmd_noflush(DM_DEVICE_RESUME, mpp->alias, udev_flags)) {
+ condlog(0, "%s: release: failed to resume dm device.", mpp->wwid);
+ if (status == MPATH_PR_SUCCESS)
+ status = MPATH_PR_OTHER;
}
-
-out2:
- free(pptr);
-out1:
- free (pamp);
-out:
- free (pr_buff);
return (status == MPATH_PR_RETRYABLE_ERROR) ? MPATH_PR_OTHER : status;
}
diff --git a/libmpathpersist/mpathpr.h b/libmpathpersist/mpathpr.h
index 0d1075e5..85eac1c7 100644
--- a/libmpathpersist/mpathpr.h
+++ b/libmpathpersist/mpathpr.h
@@ -37,8 +37,6 @@ void dumpHex(const char* , int len, int no_ascii);
int mpath_prout_reg(struct multipath *mpp,int rq_servact, int rq_scope,
unsigned int rq_type, struct prout_param_descriptor * paramp, int noisy);
-int mpath_prout_common(struct multipath *mpp,int rq_servact, int rq_scope,
- unsigned int rq_type, struct prout_param_descriptor * paramp, int noisy);
int mpath_prout_rel(struct multipath *mpp,int rq_servact, int rq_scope,
unsigned int rq_type, struct prout_param_descriptor * paramp, int noisy);
diff --git a/libmultipath/libmultipath.version b/libmultipath/libmultipath.version
index 46907278..5340957c 100644
--- a/libmultipath/libmultipath.version
+++ b/libmultipath/libmultipath.version
@@ -301,6 +301,7 @@ global:
LIBMULTIPATH_9.1.2 {
global:
cleanup_mutex;
+ select_skip_kpartx;
} LIBMULTIPATH_9.1.1;
LIBMULTIPATH_9.1.3 {