device-mapper-multipath/0002-libmultipath-fix-tur-checker-timeout.patch
Benjamin Marzinski 996407fc5f device-mapper-multipath-0.7.7-7.gitb80318b
Update Source to latest upstream commit
Rename files
  * Previous patches 0001-0020 are now patches 0002-0021
  * Previous patches 0021-0028 are now patches 0026-0033
Add 0001-kpartx-Use-absolute-paths-to-create-mappings.patch
Add 0022-multipathd-check-for-NULL-udevice-in-cli_add_path.patch
Add 0023-libmultipath-remove-max_fds-code-duplication.patch
Add 0024-multipathd-set-return-code-for-multipathd-commands.patch
Add 0025-mpathpersist-fix-registration-rollback-issue.patch
  * The above 5 patches have been submitted upstream
2018-10-10 00:16:58 -05:00

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Benjamin Marzinski <bmarzins@redhat.com>
Date: Thu, 26 Jul 2018 12:29:30 -0500
Subject: [PATCH] libmultipath: fix tur checker timeout
Previously, the code timed out if ct->thread was 0 but ct->running
wasn't. This combination never happens. The idea was to time out if,
for some reason, the path checker tried to cancel the thread but it
didn't die. The correct thing to check for this is ct->holders.
ct->holders will always be at least one when libcheck_check() is called,
since libcheck_free() won't get called until the device is no longer
being checked. So, if ct->holders is 2, that means that the tur thread
has not shut down yet.

Also, instead of timing out, the tur checker will switch to synchronous
mode. The chance of this code path happening is very low. It simply
exists because the old thread must not interfere with a new thread
starting up. But if something does go very wrong, and a thread does get
stuck, this solution will keep the checker from just ignoring the device
forever.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/checkers/tur.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c
index bf8486d..3c5e236 100644
--- a/libmultipath/checkers/tur.c
+++ b/libmultipath/checkers/tur.c
@@ -355,12 +355,13 @@ int libcheck_check(struct checker * c)
}
pthread_mutex_unlock(&ct->lock);
} else {
- if (uatomic_read(&ct->running) != 0) {
- /* pthread cancel failed. continue in sync mode */
+ if (uatomic_read(&ct->holders) > 1) {
+ /* The thread has been cancelled but hasn't
+ * quit. Fail back to synchronous mode */
pthread_mutex_unlock(&ct->lock);
- condlog(3, "%s: tur thread not responding",
+ condlog(3, "%s: tur checker failing back to sync",
tur_devt(devt, sizeof(devt), ct));
- return PATH_TIMEOUT;
+ return tur_check(c->fd, c->timeout, copy_msg_to_checker, c);
}
/* Start new TUR checker */
ct->state = PATH_UNCHECKED;
--
2.7.4
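
The holder-count reasoning in the commit message above can be illustrated with a small standalone sketch. This is not the code from tur.c: it assumes C11 <stdatomic.h> in place of the uatomic_* helpers the patch uses, and every name prefixed with sketch_ is hypothetical. It only shows why a count greater than 1 means a cancelled thread still holds a reference, and why the checker then picks the synchronous path instead of starting a new thread.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical outcomes; these are not the PATH_* values from libmultipath. */
enum sketch_mode { SKETCH_ASYNC, SKETCH_SYNC_FALLBACK };

/* Hypothetical stand-in for the checker context. */
struct sketch_ctx {
	atomic_int holders; /* 1 = checker only, 2 = checker plus worker thread */
};

/* The checker itself holds the first reference until it is freed. */
static void sketch_init(struct sketch_ctx *ct)
{
	atomic_init(&ct->holders, 1);
}

/* A worker thread takes a reference before it starts running. */
static void sketch_thread_get(struct sketch_ctx *ct)
{
	atomic_fetch_add(&ct->holders, 1);
}

/* Drop one reference; returns 1 if the caller released the last one. */
static int sketch_put(struct sketch_ctx *ct)
{
	int prev = atomic_fetch_sub(&ct->holders, 1);

	assert(prev > 0);
	return prev == 1;
}

/*
 * Decide how to run the next check. If an old worker still holds a
 * reference, do not start a new thread on top of it; check synchronously.
 */
static enum sketch_mode sketch_pick_mode(struct sketch_ctx *ct)
{
	if (atomic_load(&ct->holders) > 1)
		return SKETCH_SYNC_FALLBACK;
	return SKETCH_ASYNC;
}
```

In this sketch, a context whose worker has called sketch_thread_get() but not yet sketch_put() makes sketch_pick_mode() return SKETCH_SYNC_FALLBACK; once the worker drops its reference, the next call returns SKETCH_ASYNC again.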