f9dfbb37ac
Modify 0136-RHBZ-1304687-wait-for-map-add.patch
  * switch to missing_uev_wait_timeout to stop waiting for uev
Refresh 0137-RHBZ-1280524-clear-chkr-msg.patch
Refresh 0150-RHBZ-1253913-fix-startup-msg.patch
Refresh 0154-UPBZ-1291406-disable-reinstate.patch
Refresh 0156-UPBZ-1313324-dont-fail-discovery.patch
Refresh 0161-RHBZ-1311659-no-kpartx.patch
Refresh 0167-RHBZ-1335176-fix-show-cmds.patch
Add 0173-RH-update-man-page.patch
Add 0174-RHBZ-1362396-modprobe.patch
  * make starting the multipathd service modprobe dm-multipath in the
    sysvinit scripts
Add 0175-RHBZ-1357382-ordering.patch
  * force multipathd.service to start after systemd-udev-trigger.service
Add 0176-RHBZ-1363830-fix-rename.patch
  * initialize a variable to make dm_rename not fail randomly
Add 0177-libmultipath-correctly-initialize-pp-sg_id.patch
  * This and all the following patches add the rbd path checker
Add 0178-libmultipath-add-rbd-discovery.patch
Add 0179-multipath-tools-add-checker-callout-to-repair-path.patch
Add 0180-multipath-tools-Add-rbd-checker.patch
Add 0181-multipath-tools-Add-rbd-to-the-hwtable.patch
Add 0182-multipath-tools-check-for-initialized-checker-before.patch
Add 0183-multipathd-Don-t-call-repair-on-blacklisted-path.patch
Add 0184-rbd-fix-sync-repair-support.patch
Add 0185-rbd-check-for-nonshared-clients.patch
Add 0186-rbd-check-for-exclusive-lock-enabled.patch
Add 0187-rbd-fixup-log-messages.patch
Add 0188-RHBZ-1368501-dont-exit.patch
  * make multipathd not exit if it encounters recoverable errors on startup
Add 0189-RHBZ-1368211-remove-retries.patch
  * add "remove_retries" multipath.conf parameter to make multiple attempts
    to remove a multipath device if it is busy.
Add 0190-RHBZ-1380602-rbd-lock-on-read.patch
  * pass lock_on_read when remapping image
Add 0191-RHBZ-1169168-disable-changed-paths.patch
  * add "disable_changed_wwids" multipath.conf parameter to disable paths
    whose wwid changes
Add 0192-RHBZ-1362409-infinibox-config.patch
Add 0194-RHBZ-1351964-kpartx-recurse.patch
  * fix recursion on corrupt dos partitions
Add 0195-RHBZ-1359510-no-daemon-msg.patch
  * print a message when multipathd isn't running
Add 0196-RHBZ-1239173-dont-set-flag.patch
  * don't set reload flag on reloads when you gain your first valid path
Add 0197-RHBZ-1394059-max-sectors-kb.patch
  * add "max_sectors_kb" multipath.conf parameter to set max_sectors_kb
    on a multipath device and all its path devices
Add 0198-RHBZ-1372032-detect-path-checker.patch
  * add "detect_checker" multipath.conf parameter to detect ALUA arrays
    and set the path checker to TUR
Add 0199-RHBZ-1279355-3pardata-config.patch
Add 0200-RHBZ-1402092-orphan-status.patch
  * clear status on orphan paths
Add 0201-RHBZ-1403552-silence-warning.patch
Add 0202-RHBZ-1362120-skip-prio.patch
  * don't run prio on failed paths
Add 0203-RHBZ-1363718-add-msgs.patch
Add 0204-RHBZ-1406226-nimble-config.patch
Add 0205-RHBZ-1416569-reset-stats.patch
  * add "reset maps stats" and "reset map <map> stats" multipathd
    interactive commands to reset the stats tracked by multipathd
Add 0206-RHBZ-1239173-pt2-no-paths.patch
  * make multipath correctly disable scanning and rules running when it
    gets a uevent and there are no valid paths.
Add 0207-UP-add-libmpathcmd.patch
  * New shared library, libmpathcmd, that sends and receives messages from
    multipathd. device-mapper-multipath now uses this library internally.
Add 0208-UPBZ-1430097-multipathd-IPC-changes.patch
  * validation that modifying commands are coming from root.
Add 0209-UPBZ-1430097-multipath-C-API.patch
  * New shared library, libdmmp, that presents the information from
    multipathd in a structured manner to make it easier for callers to use
Add 0210-RH-fix-uninstall.patch
  * Minor compilation fixes
Add 0211-RH-strlen-fix.patch
  * checks that variables are not NULL before passing them to strlen
Add 0212-RHBZ-1431562-for-read-only.patch
Make 3 new subpackages
  * device-mapper-multipath-devel, libdmmp, and libdmmp-devel. libmpathcmd
    and libmpathprio are in device-mapper-multipath-libs and
    device-mapper-multipath-devel. libdmmp is in its own subpackages
Move libmpathprio devel files to device-mapper-multipath-devel
Added BuildRequires on librados2-devel
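
The "remove_retries" entry above describes retrying removal of a busy device. A minimal C sketch of that idea follows. It is not the code from 0189-RHBZ-1368211-remove-retries.patch: the names `sketch_remove_fn`, `sketch_remove_with_retries` and `sketch_busy_n_times` are hypothetical, and it assumes that "busy" is reported as `-EBUSY` and that the parameter counts attempts made after the first one.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical callback type: returns 0 on success or a negative errno. */
typedef int (*sketch_remove_fn)(void *arg);

/*
 * Call fn once, then up to `retries` more times while it keeps
 * reporting -EBUSY. Any other result ends the loop immediately.
 * A real implementation would likely pause between attempts.
 */
static int sketch_remove_with_retries(sketch_remove_fn fn, void *arg,
				      int retries)
{
	int ret;

	assert(fn);
	do {
		ret = fn(arg);
		if (ret != -EBUSY)
			break;
	} while (retries-- > 0);
	return ret;
}

/* Illustration helper: busy until the counter behind arg reaches zero. */
static int sketch_busy_n_times(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return -EBUSY;
	}
	return 0;
}
```

With `retries` set to 0 the helper makes exactly one attempt, so under these assumptions a zero value would keep the old single-attempt behaviour.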
747 lines
18 KiB
Diff
From e28c340ed961409700d46a1cb9a820a8b7a4d016 Mon Sep 17 00:00:00 2001
From: Mike Christie <mchristi@redhat.com>
Date: Thu, 11 Aug 2016 02:12:12 -0500
Subject: [PATCH 04/11] multipath-tools: Add rbd checker.

For BZ 1348372 from upstream commit:

commit d1cad5649b6fcf9027d43ca0405c900080133e32
Author: Mike Christie <mchristi@redhat.com>
Date:   Mon Aug 8 07:01:49 2016 -0500

    multipath-tools: Add rbd checker.

    This checker currently only handles the case where a path is failed
    due to it being blacklisted by the ceph cluster. The specific use
    case for me is when LIO exports rbd images through multiple LIO
    instances.

    The problem it handles is when rbd instance1 has the exclusive lock
    but becomes unreachable. Another host in the cluster will then take
    over and blacklist instance1. This prevents it from sending stale IO
    and corrupting data.

    Later, when the host is reachable, we will want to failback to it.
    To do this, the checker will detect we were blacklisted, unmap the old
    image which will make sure old IO is failed, and then remap the image
    and unblacklist the host. multipathd will then handle this like a
    path being removed and re-added.

--------

Porting notes:
Added rbd to multipath.conf.annotated.

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 libmultipath/checkers/Makefile |    7
 libmultipath/checkers/rbd.c    |  639 +++++++++++++++++++++++++++++++++++++++++
 multipath.conf.annotated       |    4
 multipath/multipath.conf.5     |    3
 4 files changed, 651 insertions(+), 2 deletions(-)
 create mode 100644 libmultipath/checkers/rbd.c

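The failback sequence from the commit message can be sketched as a small state machine. This is a simplified model, not the patch's code: every `sketch_*` name is hypothetical, the three callbacks stand in for the real remap, device removal and blacklist removal steps, and the step order follows `rbd_repair()` in the patch below (remap first, then force-remove the old device, then remove the blacklist entry).

```c
#include <assert.h>

/* Hypothetical path states, loosely mirroring PATH_UP and PATH_DOWN. */
enum sketch_state { SKETCH_UP, SKETCH_DOWN };

/* Hypothetical context; the flags mirror ct->blacklisted and ct->remapped. */
struct sketch_ctx {
	int blacklisted;
	int remapped;
};

/* Hypothetical callbacks; each returns 0 on success. */
struct sketch_ops {
	int (*remap)(void);
	int (*remove_old)(void);
	int (*unblacklist)(void);
};

/* Stubs for illustration only. */
static int sketch_ok(void) { return 0; }
static int sketch_fail(void) { return -1; }

/* Once a client is seen on the blacklist the path stays down until repaired. */
static enum sketch_state sketch_check(struct sketch_ctx *ct, int on_blacklist)
{
	if (ct->blacklisted || on_blacklist) {
		ct->blacklisted = 1;
		return SKETCH_DOWN;
	}
	return SKETCH_UP;
}

/* Clear the flags only after every step has succeeded. */
static enum sketch_state sketch_repair(struct sketch_ctx *ct,
				       const struct sketch_ops *ops)
{
	assert(ct && ops);
	if (!ct->blacklisted)
		return SKETCH_UP;
	if (!ct->remapped) {
		if (ops->remap() != 0)
			return SKETCH_DOWN;
		ct->remapped = 1;	/* a later retry skips the remap */
	}
	if (ops->remove_old() != 0)
		return SKETCH_DOWN;
	if (ops->unblacklist() != 0)
		return SKETCH_DOWN;
	ct->remapped = 0;
	ct->blacklisted = 0;
	return SKETCH_UP;
}
```

Keeping the `remapped` flag set after a partial failure means a retried repair does not map the image a second time, which appears to be the purpose of `ct->remapped` in the patch.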
Index: multipath-tools-130222/libmultipath/checkers/Makefile
===================================================================
--- multipath-tools-130222.orig/libmultipath/checkers/Makefile
+++ multipath-tools-130222/libmultipath/checkers/Makefile
@@ -14,10 +14,17 @@ LIBS= \
 	libcheckhp_sw.so \
 	libcheckrdac.so
 
+ifeq ($(shell test -r /usr/include/rados/librados.h && echo 1),1)
+LIBS += libcheckrbd.so
+endif
+
 CFLAGS += -fPIC -I..
 
 all: $(LIBS)
 
+libcheckrbd.so: rbd.o
+	$(CC) $(LDFLAGS) $(SHARED_FLAGS) -o $@ $^ -lrados -ludev
+
 libcheckdirectio.so: libsg.o directio.o
 	$(CC) $(LDFLAGS) $(SHARED_FLAGS) -o $@ $^ -laio
 
Index: multipath-tools-130222/libmultipath/checkers/rbd.c
===================================================================
--- /dev/null
+++ multipath-tools-130222/libmultipath/checkers/rbd.c
@@ -0,0 +1,639 @@
+/*
+ * Copyright (c) 2016 Red Hat
+ * Copyright (c) 2004 Christophe Varoqui
+ *
+ * Code based off of tur.c and ceph's krbd.cc
+ */
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <pthread.h>
+#include <libudev.h>
+#include <ifaddrs.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <sys/wait.h>
+
+#include "rados/librados.h"
+
+#include "structs.h"
+#include "checkers.h"
+
+#include "../libmultipath/debug.h"
+#include "../libmultipath/uevent.h"
+
+struct rbd_checker_context;
+typedef int (thread_fn)(struct rbd_checker_context *ct, char *msg);
+
+#define RBD_MSG(msg, fmt, args...) snprintf(msg, CHECKER_MSG_LEN, fmt, ##args);
+
+struct rbd_checker_context {
+	int rbd_bus_id;
+	char *client_addr;
+	char *config_info;
+	char *snap;
+	char *pool;
+	char *image;
+	char *username;
+	int remapped;
+	int blacklisted;
+
+	rados_t cluster;
+
+	int state;
+	int running;
+	time_t time;
+	thread_fn *fn;
+	pthread_t thread;
+	pthread_mutex_t lock;
+	pthread_cond_t active;
+	pthread_spinlock_t hldr_lock;
+	int holders;
+	char message[CHECKER_MSG_LEN];
+};
+
+int libcheck_init(struct checker * c)
+{
+	struct rbd_checker_context *ct;
+	struct udev_device *block_dev;
+	struct udev_device *bus_dev;
+	struct udev *udev;
+	struct stat sb;
+	const char *block_name, *addr, *config_info;
+	const char *image, *pool, *snap, *username;
+	char sysfs_path[PATH_SIZE];
+	int ret;
+
+	ct = malloc(sizeof(struct rbd_checker_context));
+	if (!ct)
+		return 1;
+	memset(ct, 0, sizeof(struct rbd_checker_context));
+	ct->holders = 1;
+	pthread_cond_init(&ct->active, NULL);
+	pthread_mutex_init(&ct->lock, NULL);
+	pthread_spin_init(&ct->hldr_lock, PTHREAD_PROCESS_PRIVATE);
+	c->context = ct;
+
+	/*
+	 * The rbd block layer sysfs device is not linked to the rbd bus
+	 * device that we interact with, so figure that out now.
+	 */
+	if (fstat(c->fd, &sb) != 0)
+		goto free_ct;
+
+	udev = udev_new();
+	if (!udev)
+		goto free_ct;
+
+	block_dev = udev_device_new_from_devnum(udev, 'b', sb.st_rdev);
+	if (!block_dev)
+		goto free_udev;
+
+	block_name = udev_device_get_sysname(block_dev);
+	ret = sscanf(block_name, "rbd%d", &ct->rbd_bus_id);
+
+	udev_device_unref(block_dev);
+	if (ret != 1)
+		goto free_udev;
+
+	snprintf(sysfs_path, sizeof(sysfs_path), "/sys/bus/rbd/devices/%d",
+		 ct->rbd_bus_id);
+	bus_dev = udev_device_new_from_syspath(udev, sysfs_path);
+	if (!bus_dev)
+		goto free_udev;
+
+	addr = udev_device_get_sysattr_value(bus_dev, "client_addr");
+	if (!addr) {
+		condlog(0, "Could not find client_addr in rbd sysfs. Try "
+			"updating kernel");
+		goto free_dev;
+	}
+
+	ct->client_addr = strdup(addr);
+	if (!ct->client_addr)
+		goto free_dev;
+
+	config_info = udev_device_get_sysattr_value(bus_dev, "config_info");
+	if (!config_info)
+		goto free_addr;
+
+	ct->config_info = strdup(config_info);
+	if (!ct->config_info)
+		goto free_addr;
+
+	username = strstr(config_info, "name=");
+	if (username) {
+		char *end;
+		int len;
+
+		username += 5;
+		end = strchr(username, ',');
+		if (!end)
+			goto free_info;
+		len = end - username;
+
+		ct->username = malloc(len + 1);
+		if (!ct->username)
+			goto free_info;
+		strncpy(ct->username, username, len);
+		ct->username[len] = '\0';
+	}
+
+	image = udev_device_get_sysattr_value(bus_dev, "name");
+	if (!image)
+		goto free_username;
+
+	ct->image = strdup(image);
+	if (!ct->image)
+		goto free_info;
+
+	pool = udev_device_get_sysattr_value(bus_dev, "pool");
+	if (!pool)
+		goto free_image;
+
+	ct->pool = strdup(pool);
+	if (!ct->pool)
+		goto free_image;
+
+	snap = udev_device_get_sysattr_value(bus_dev, "current_snap");
+	if (!snap)
+		goto free_pool;
+
+	if (strcmp("-", snap)) {
+		ct->snap = strdup(snap);
+		if (!ct->snap)
+			goto free_pool;
+	}
+
+	if (rados_create(&ct->cluster, NULL) < 0) {
+		condlog(0, "Could not create rados cluster");
+		goto free_snap;
+	}
+
+	if (rados_conf_read_file(ct->cluster, NULL) < 0) {
+		condlog(0, "Could not read rados conf");
+		goto shutdown_rados;
+	}
+
+	ret = rados_connect(ct->cluster);
+	if (ret < 0) {
+		condlog(0, "Could not connect to rados cluster");
+		goto shutdown_rados;
+	}
+
+	udev_device_unref(bus_dev);
+	udev_unref(udev);
+
+	condlog(3, "rbd%d checker init %s %s/%s@%s %s", ct->rbd_bus_id,
+		ct->client_addr, ct->pool, ct->image, ct->snap ? ct->snap : "-",
+		ct->username ? ct->username : "none");
+	return 0;
+
+shutdown_rados:
+	rados_shutdown(ct->cluster);
+free_snap:
+	if (ct->snap)
+		free(ct->snap);
+free_pool:
+	free(ct->pool);
+free_image:
+	free(ct->image);
+free_username:
+	if (ct->username)
+		free(ct->username);
+free_info:
+	free(ct->config_info);
+free_addr:
+	free(ct->client_addr);
+free_dev:
+	udev_device_unref(bus_dev);
+free_udev:
+	udev_unref(udev);
+free_ct:
+	free(ct);
+	return 1;
+}
+
+void cleanup_context(struct rbd_checker_context *ct)
+{
+	pthread_mutex_destroy(&ct->lock);
+	pthread_cond_destroy(&ct->active);
+	pthread_spin_destroy(&ct->hldr_lock);
+
+	rados_shutdown(ct->cluster);
+
+	if (ct->username)
+		free(ct->username);
+	if (ct->snap)
+		free(ct->snap);
+	free(ct->pool);
+	free(ct->image);
+	free(ct->config_info);
+	free(ct->client_addr);
+	free(ct);
+}
+
+void libcheck_free(struct checker * c)
+{
+	if (c->context) {
+		struct rbd_checker_context *ct = c->context;
+		int holders;
+		pthread_t thread;
+
+		pthread_spin_lock(&ct->hldr_lock);
+		ct->holders--;
+		holders = ct->holders;
+		thread = ct->thread;
+		pthread_spin_unlock(&ct->hldr_lock);
+		if (holders)
+			pthread_cancel(thread);
+		else
+			cleanup_context(ct);
+		c->context = NULL;
+	}
+}
+
+static int rbd_is_blacklisted(struct rbd_checker_context *ct, char *msg)
+{
+	char *addr_tok, *start, *save;
+	char *cmd[2];
+	char *blklist, *stat;
+	size_t blklist_len, stat_len;
+	int ret;
+	char *end;
+
+	cmd[0] = "{\"prefix\": \"osd blacklist ls\"}";
+	cmd[1] = NULL;
+
+	ret = rados_mon_command(ct->cluster, (const char **)cmd, 1, "", 0,
+				&blklist, &blklist_len, &stat, &stat_len);
+	if (ret < 0) {
+		RBD_MSG(msg, "rbd checker failed: mon command failed %d",
+			ret);
+		return ret;
+	}
+
+	if (!blklist || !blklist_len)
+		goto free_bufs;
+
+	/*
+	 * parse list of addrs with the format
+	 * ipv4:port/nonce date time\n
+	 * or
+	 * [ipv6]:port/nonce date time\n
+	 */
+	ret = 0;
+	for (start = blklist; ; start = NULL) {
+		addr_tok = strtok_r(start, "\n", &save);
+		if (!addr_tok || !strlen(addr_tok))
+			break;
+
+		end = strchr(addr_tok, ' ');
+		if (!end) {
+			RBD_MSG(msg, "rbd%d checker failed: invalid blacklist %s",
+				ct->rbd_bus_id, addr_tok);
+			break;
+		}
+		*end = '\0';
+
+		if (!strcmp(addr_tok, ct->client_addr)) {
+			ct->blacklisted = 1;
+			RBD_MSG(msg, "rbd%d checker: %s is blacklisted",
+				ct->rbd_bus_id, ct->client_addr);
+			ret = 1;
+			break;
+		}
+	}
+
+free_bufs:
+	rados_buffer_free(blklist);
+	rados_buffer_free(stat);
+	return ret;
+}
+
+int rbd_check(struct rbd_checker_context *ct, char *msg)
+{
+	if (ct->blacklisted || rbd_is_blacklisted(ct, msg) == 1)
+		return PATH_DOWN;
+
+	RBD_MSG(msg, "rbd checker reports path is up");
+	/*
+	 * Path may have issues, but the ceph cluster is at least
+	 * accepting IO, so we can attempt to do IO.
+	 *
+	 * TODO: in future versions, we can run other tests to
+	 * verify OSDs and networks.
+	 */
+	return PATH_UP;
+}
+
+int safe_write(int fd, const void *buf, size_t count)
+{
+	while (count > 0) {
+		ssize_t r = write(fd, buf, count);
+		if (r < 0) {
+			if (errno == EINTR)
+				continue;
+			return -errno;
+		}
+		count -= r;
+		buf = (char *)buf + r;
+	}
+	return 0;
+}
+
+static int sysfs_write_rbd_bus(const char *which, const char *buf,
+			       size_t buf_len)
+{
+	char sysfs_path[PATH_SIZE];
+	int fd;
+	int r;
+
+	/* we require newer kernels so single_major should alwayws be there */
+	snprintf(sysfs_path, sizeof(sysfs_path),
+		 "/sys/bus/rbd/%s_single_major", which);
+	fd = open(sysfs_path, O_WRONLY);
+	if (fd < 0)
+		return -errno;
+
+	r = safe_write(fd, buf, buf_len);
+	close(fd);
+	return r;
+}
+
+static int rbd_remap(struct rbd_checker_context *ct)
+{
+	char *argv[11];
+	pid_t pid;
+	int ret = 0, i = 0;
+	int status;
+
+	pid = fork();
+	switch (pid) {
+	case 0:
+		argv[i++] = "rbd";
+		argv[i++] = "map";
+		argv[i++] = "-o noshare";
+		if (ct->username) {
+			argv[i++] = "--id";
+			argv[i++] = ct->username;
+		}
+		argv[i++] = "--pool";
+		argv[i++] = ct->pool;
+		if (ct->snap) {
+			argv[i++] = "--snap";
+			argv[i++] = ct->snap;
+		}
+		argv[i++] = ct->image;
+		argv[i] = NULL;
+
+		ret = execvp(argv[0], argv);
+		condlog(0, "Error executing rbd: %s", strerror(errno));
+		exit(-1);
+	case -1:
+		condlog(0, "fork failed: %s", strerror(errno));
+		return -1;
+	default:
+		ret = -1;
+		wait(&status);
+		if (WIFEXITED(status)) {
+			status = WEXITSTATUS(status);
+			if (status == 0)
+				ret = 0;
+			else
+				condlog(0, "rbd failed with %d", status);
+		}
+	}
+
+	return ret;
+}
+
+static int sysfs_write_rbd_remove(const char *buf, int buf_len)
+{
+	return sysfs_write_rbd_bus("remove", buf, buf_len);
+}
+
+static int rbd_rm_blacklist(struct rbd_checker_context *ct)
+{
+	char *cmd[2];
+	char *stat, *cmd_str;
+	size_t stat_len;
+	int ret;
+
+	ret = asprintf(&cmd_str, "{\"prefix\": \"osd blacklist\", \"blacklistop\": \"rm\", \"addr\": \"%s\"}",
+		       ct->client_addr);
+	if (ret == -1)
+		return -ENOMEM;
+
+	cmd[0] = cmd_str;
+	cmd[1] = NULL;
+
+	ret = rados_mon_command(ct->cluster, (const char **)cmd, 1, "", 0,
+				NULL, 0, &stat, &stat_len);
+	if (ret < 0) {
+		condlog(1, "rbd%d repair failed to remove blacklist for %s %d",
+			ct->rbd_bus_id, ct->client_addr, ret);
+		goto free_cmd;
+	}
+
+	condlog(1, "rbd%d repair rm blacklist for %s",
+		ct->rbd_bus_id, ct->client_addr);
+	free(stat);
+free_cmd:
+	free(cmd_str);
+	return ret;
+}
+
+static int rbd_repair(struct rbd_checker_context *ct, char *msg)
+{
+	char del[17];
+	int ret;
+
+	if (!ct->blacklisted)
+		return PATH_UP;
+
+	if (!ct->remapped) {
+		ret = rbd_remap(ct);
+		if (ret) {
+			RBD_MSG(msg, "rbd%d repair failed to remap. Err %d",
+				ct->rbd_bus_id, ret);
+			return PATH_DOWN;
+		}
+	}
+	ct->remapped = 1;
+
+	snprintf(del, sizeof(del), "%d force", ct->rbd_bus_id);
+	ret = sysfs_write_rbd_remove(del, strlen(del) + 1);
+	if (ret) {
+		RBD_MSG(msg, "rbd%d repair failed to clean up. Err %d",
+			ct->rbd_bus_id, ret);
+		return PATH_DOWN;
+	}
+
+	ret = rbd_rm_blacklist(ct);
+	if (ret) {
+		RBD_MSG(msg, "rbd%d repair could not remove blacklist entry. Err %d",
+			ct->rbd_bus_id, ret);
+		return PATH_DOWN;
+	}
+
+	ct->remapped = 0;
+	ct->blacklisted = 0;
+
+	RBD_MSG(msg, "rbd%d has been repaired", ct->rbd_bus_id);
+	return PATH_UP;
+}
+
+#define rbd_thread_cleanup_push(ct) pthread_cleanup_push(cleanup_func, ct)
+#define rbd_thread_cleanup_pop(ct) pthread_cleanup_pop(1)
+
+void cleanup_func(void *data)
+{
+	int holders;
+	struct rbd_checker_context *ct = data;
+	pthread_spin_lock(&ct->hldr_lock);
+	ct->holders--;
+	holders = ct->holders;
+	ct->thread = 0;
+	pthread_spin_unlock(&ct->hldr_lock);
+	if (!holders)
+		cleanup_context(ct);
+}
+
+void *rbd_thread(void *ctx)
+{
+	struct rbd_checker_context *ct = ctx;
+	int state;
+
+	condlog(3, "rbd%d thread starting up", ct->rbd_bus_id);
+
+	ct->message[0] = '\0';
+	/* This thread can be canceled, so setup clean up */
+	rbd_thread_cleanup_push(ct)
+
+	/* checker start up */
+	pthread_mutex_lock(&ct->lock);
+	ct->state = PATH_PENDING;
+	pthread_mutex_unlock(&ct->lock);
+
+	state = ct->fn(ct, ct->message);
+
+	/* checker done */
+	pthread_mutex_lock(&ct->lock);
+	ct->state = state;
+	pthread_mutex_unlock(&ct->lock);
+	pthread_cond_signal(&ct->active);
+
+	condlog(3, "rbd%d thead finished, state %s", ct->rbd_bus_id,
+		checker_state_name(state));
+	rbd_thread_cleanup_pop(ct);
+	return ((void *)0);
+}
+
+static void rbd_timeout(struct timespec *tsp)
+{
+	struct timeval now;
+
+	gettimeofday(&now, NULL);
+	tsp->tv_sec = now.tv_sec;
+	tsp->tv_nsec = now.tv_usec * 1000;
+	tsp->tv_nsec += 1000000; /* 1 millisecond */
+}
+
+static int rbd_exec_fn(struct checker *c, thread_fn *fn)
+{
+	struct rbd_checker_context *ct = c->context;
+	struct timespec tsp;
+	pthread_attr_t attr;
+	int rbd_status, r;
+
+	if (c->sync)
+		return rbd_check(ct, c->message);
+	/*
+	 * Async mode
+	 */
+	r = pthread_mutex_lock(&ct->lock);
+	if (r != 0) {
+		condlog(2, "rbd%d mutex lock failed with %d", ct->rbd_bus_id,
+			r);
+		MSG(c, "rbd%d thread failed to initialize", ct->rbd_bus_id);
+		return PATH_WILD;
+	}
+
+	if (ct->running) {
+		/* Check if checker is still running */
+		if (ct->thread) {
+			condlog(3, "rbd%d thread not finished", ct->rbd_bus_id);
+			rbd_status = PATH_PENDING;
+		} else {
+			/* checker done */
+			ct->running = 0;
+			rbd_status = ct->state;
+			strncpy(c->message, ct->message, CHECKER_MSG_LEN);
+			c->message[CHECKER_MSG_LEN - 1] = '\0';
+		}
+		pthread_mutex_unlock(&ct->lock);
+	} else {
+		/* Start new checker */
+		ct->state = PATH_UNCHECKED;
+		ct->fn = fn;
+		pthread_spin_lock(&ct->hldr_lock);
+		ct->holders++;
+		pthread_spin_unlock(&ct->hldr_lock);
+		setup_thread_attr(&attr, 32 * 1024, 1);
+		r = pthread_create(&ct->thread, &attr, rbd_thread, ct);
+		if (r) {
+			pthread_mutex_unlock(&ct->lock);
+			ct->thread = 0;
+			ct->holders--;
+			condlog(3, "rbd%d failed to start rbd thread, using sync mode",
+				ct->rbd_bus_id);
+			return fn(ct, c->message);
+		}
+		pthread_attr_destroy(&attr);
+		rbd_timeout(&tsp);
+		r = pthread_cond_timedwait(&ct->active, &ct->lock, &tsp);
+		rbd_status = ct->state;
+		strncpy(c->message, ct->message,CHECKER_MSG_LEN);
+		c->message[CHECKER_MSG_LEN -1] = '\0';
+		pthread_mutex_unlock(&ct->lock);
+
+		if (ct->thread &&
+		    (rbd_status == PATH_PENDING || rbd_status == PATH_UNCHECKED)) {
+			condlog(3, "rbd%d thread still running",
+				ct->rbd_bus_id);
+			ct->running = 1;
+			rbd_status = PATH_PENDING;
+		}
+	}
+
+	return rbd_status;
+}
+
+void libcheck_repair(struct checker * c)
+{
+	struct rbd_checker_context *ct = c->context;
+
+	if (!ct || !ct->blacklisted)
+		return;
+	rbd_exec_fn(c, rbd_repair);
+}
+
+int libcheck_check(struct checker * c)
+{
+	struct rbd_checker_context *ct = c->context;
+
+	if (!ct)
+		return PATH_UNCHECKED;
+
+	if (ct->blacklisted)
+		return PATH_DOWN;
+
+	return rbd_exec_fn(c, rbd_check);
+}
Index: multipath-tools-130222/multipath.conf.annotated
===================================================================
--- multipath-tools-130222.orig/multipath.conf.annotated
+++ multipath-tools-130222/multipath.conf.annotated
@@ -97,7 +97,7 @@
 # # scope : multipath & multipathd
 # # desc : the default method used to determine the paths' state
 # # values : readsector0|tur|emc_clariion|hp_sw|directio|rdac|
-# cciss_tur|hp_tur
+# cciss_tur|hp_tur|rbd
 # # default : directio
 # #
 # path_checker directio
@@ -493,7 +493,7 @@
 # # scope : multipathd & multipathd
 # # desc : path checking algorithm to use to check path state
 # # values : readsector0|tur|emc_clariion|hp_sw|directio|rdac|
-# # cciss_tur|hp_tur
+# # cciss_tur|hp_tur|rbd
 # #
 # path_checker directio
 #
Index: multipath-tools-130222/multipath/multipath.conf.5
===================================================================
--- multipath-tools-130222.orig/multipath/multipath.conf.5
+++ multipath-tools-130222/multipath/multipath.conf.5
@@ -284,6 +284,9 @@ Check the path state for LSI/Engenio/Net
 .B directio
 Read the first sector with direct I/O.
 .TP
+.B rbd
+Check if the path is in the Ceph blacklist.
+.TP
 Default value is \fIdirectio\fR.
 .RE
 .TP
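
The man page entry says the rbd checker looks for the path in the Ceph blacklist. The lookup can be tried out without a cluster using the standalone sketch below, which assumes the `addr date time\n` line format described in the comment in `rbd_is_blacklisted()`. The function name `sketch_addr_in_blacklist` is hypothetical, and returning -1 for a malformed line is a choice made for this sketch; the patch instead records a message and stops parsing.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/*
 * Return 1 if client_addr is the first space-delimited field of any
 * line in list, 0 if it is not found, and -1 if a line contains no
 * space separator. The buffer is modified in place because strtok_r
 * and the field split both write NUL bytes into it.
 */
static int sketch_addr_in_blacklist(char *list, const char *client_addr)
{
	char *line, *end, *save = NULL;

	assert(list && client_addr);
	for (line = strtok_r(list, "\n", &save); line;
	     line = strtok_r(NULL, "\n", &save)) {
		end = strchr(line, ' ');
		if (!end)
			return -1;	/* malformed entry */
		*end = '\0';		/* keep only the address field */
		if (strcmp(line, client_addr) == 0)
			return 1;
	}
	return 0;
}
```

The comparison is an exact string match on the whole `addr:port/nonce` field, so a client that reconnects with a new nonce would not match an old entry.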