autobuild v6.0-3

Resolves: bz#1583585 bz#1671862 bz#1702686 bz#1703434 bz#1703753
Resolves: bz#1703897 bz#1704562 bz#1704769 bz#1704851 bz#1706683
Resolves: bz#1706776 bz#1706893
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
Rinku Kothiya 2019-05-14 05:40:54 -04:00
parent 9e3a39e72d
commit d2b03be249
18 changed files with 2825 additions and 1 deletion

@@ -0,0 +1,75 @@
From 0cd08d9e89f5ee86d5f4f90f0ca5c07bd290636c Mon Sep 17 00:00:00 2001
From: Sanju Rakonde <srakonde@redhat.com>
Date: Fri, 26 Apr 2019 22:28:53 +0530
Subject: [PATCH 125/141] glusterd: define dumpops in the xlator_api of
glusterd
Problem: statedump is not capturing information related to glusterd.
Solution: statedump is not capturing glusterd info because
trav->dumpops is NULL in gf_proc_dump_single_xlator_info(),
where trav is the glusterd xlator object. trav->dumpops is NULL
because dumpops was never defined in the xlator_api of glusterd.
Defining dumpops in the xlator_api of glusterd fixes the issue.
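As a quick manual check of the fix, one can trigger a statedump and grep
it for the glusterd private section. This is a sketch: it assumes the
default statedump directory /var/run/gluster and the usual dump file
naming; the test added below automates the same idea.

    # trigger a statedump from the running glusterd
    kill -USR1 $(pidof glusterd)
    # give glusterd a moment to write the dump file
    sleep 1
    # with dumpops defined, the dump now contains glusterd's private info
    grep "xlator.glusterd.priv" /var/run/gluster/*.$(pidof glusterd).dump.*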
> fixes: bz#1703629
> Change-Id: If85429ecb1ef580aced8d5b88d09fc15258bfc4c
> Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
upstream patch: https://review.gluster.org/#/c/glusterfs/+/22640/
BUG: 1703753
Change-Id: If85429ecb1ef580aced8d5b88d09fc15258bfc4c
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169207
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
---
tests/bugs/glusterd/optimized-basic-testcases.t | 13 +++++++++++++
xlators/mgmt/glusterd/src/glusterd.c | 1 +
2 files changed, 14 insertions(+)
diff --git a/tests/bugs/glusterd/optimized-basic-testcases.t b/tests/bugs/glusterd/optimized-basic-testcases.t
index dd98a65..d700b5e 100644
--- a/tests/bugs/glusterd/optimized-basic-testcases.t
+++ b/tests/bugs/glusterd/optimized-basic-testcases.t
@@ -32,6 +32,16 @@ function get_brick_host_uuid()
echo $host_uuid_list | awk '{print $1}'
}
+function generate_statedump_and_check_for_glusterd_info {
+ pid=`pidof glusterd`
+ #remove old stale statedumps
+ cleanup_statedump $pid
+ kill -USR1 $pid
+ #Wait till the statedump is generated
+ sleep 1
+ fname=$(ls $statedumpdir | grep -E "\.$pid\.dump\.")
+ cat $statedumpdir/$fname | grep "xlator.glusterd.priv" | wc -l
+}
cleanup;
@@ -279,4 +289,7 @@ mkdir -p /xyz/var/lib/glusterd/abc
TEST $CLI volume create "test" $H0:/xyz/var/lib/glusterd/abc
EXPECT 'Created' volinfo_field "test" 'Status';
+EXPECT "1" generate_statedump_and_check_for_glusterd_info
+
+cleanup_statedump `pidof glusterd`
cleanup
diff --git a/xlators/mgmt/glusterd/src/glusterd.c b/xlators/mgmt/glusterd/src/glusterd.c
index d4ab630..c0973cb 100644
--- a/xlators/mgmt/glusterd/src/glusterd.c
+++ b/xlators/mgmt/glusterd/src/glusterd.c
@@ -2231,6 +2231,7 @@ xlator_api_t xlator_api = {
.fini = fini,
.mem_acct_init = mem_acct_init,
.op_version = {1}, /* Present from the initial version */
+ .dumpops = &dumpops,
.fops = &fops,
.cbks = &cbks,
.options = options,
--
1.8.3.1

@@ -0,0 +1,663 @@
From 6565749c95e90f360a994bde1416cffd22cd8ce9 Mon Sep 17 00:00:00 2001
From: N Balachandran <nbalacha@redhat.com>
Date: Mon, 25 Mar 2019 15:56:56 +0530
Subject: [PATCH 126/141] cluster/dht: refactor dht lookup functions
Part 1: refactor the dht_lookup_dir_cbk
and dht_selfheal_directory functions.
Added a simple dht selfheal directory test
upstream: https://review.gluster.org/#/c/glusterfs/+/22407/
> Change-Id: I1410c26359e3c14b396adbe751937a52bd2fcff9
> updates: bz#1590385
Change-Id: Idd0a7df7122d634c371ecf30c0dbb94dc6063416
BUG: 1703897
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169037
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
---
tests/basic/distribute/dir-heal.t | 145 +++++++++++++++++++++++++++
xlators/cluster/dht/src/dht-common.c | 178 +++++++++++++++------------------
xlators/cluster/dht/src/dht-selfheal.c | 65 +++++++-----
3 files changed, 264 insertions(+), 124 deletions(-)
create mode 100644 tests/basic/distribute/dir-heal.t
diff --git a/tests/basic/distribute/dir-heal.t b/tests/basic/distribute/dir-heal.t
new file mode 100644
index 0000000..851f765
--- /dev/null
+++ b/tests/basic/distribute/dir-heal.t
@@ -0,0 +1,145 @@
+#!/bin/bash
+
+. $(dirname $0)/../../include.rc
+. $(dirname $0)/../../volume.rc
+. $(dirname $0)/../../nfs.rc
+. $(dirname $0)/../../common-utils.rc
+
+# Test 1 overview:
+# ----------------
+#
+# 1. Kill one brick of the volume.
+# 2. Create directories and change directory properties.
+# 3. Bring up the brick and access the directory
+# 4. Check the permissions and xattrs on the backend
+
+cleanup
+
+TEST glusterd
+TEST pidof glusterd
+
+TEST $CLI volume create $V0 $H0:$B0/$V0-{1..3}
+TEST $CLI volume start $V0
+
+# We want the lookup to reach DHT
+TEST $CLI volume set $V0 performance.stat-prefetch off
+
+# Mount using FUSE , kill a brick and create directories
+TEST glusterfs --entry-timeout=0 --attribute-timeout=0 -s $H0 --volfile-id $V0 $M0
+
+ls $M0/
+cd $M0
+
+TEST kill_brick $V0 $H0 $B0/$V0-1
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "0" brick_up_status $V0 $H0 $B0/$V0-1
+
+TEST mkdir dir{1..4}
+
+# No change for dir1
+# Change permissions for dir2
+# Set xattr on dir3
+# Change permissions and set xattr on dir4
+
+TEST chmod 777 $M0/dir2
+
+TEST setfattr -n "user.test" -v "test" $M0/dir3
+
+TEST chmod 777 $M0/dir4
+TEST setfattr -n "user.test" -v "test" $M0/dir4
+
+
+# Start all bricks
+
+TEST $CLI volume start $V0 force
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status $V0 $H0 $B0/$V0-1
+
+#$CLI volume status
+
+# It takes a while for the client to reconnect to the brick
+sleep 5
+
+stat $M0/dir* > /dev/null
+
+# Check that directories have been created on the brick that was killed
+
+TEST ls $B0/$V0-1/dir1
+
+TEST ls $B0/$V0-1/dir2
+EXPECT "777" stat -c "%a" $B0/$V0-1/dir2
+
+TEST ls $B0/$V0-1/dir3
+EXPECT "test" getfattr -n "user.test" --absolute-names --only-values $B0/$V0-1/dir3
+
+
+TEST ls $B0/$V0-1/dir4
+EXPECT "777" stat -c "%a" $B0/$V0-1/dir4
+EXPECT "test" getfattr -n "user.test" --absolute-names --only-values $B0/$V0-1/dir4
+
+
+TEST rm -rf $M0/*
+
+cd
+
+EXPECT_WITHIN $UMOUNT_TIMEOUT "Y" force_umount $M0
+
+
+# Test 2 overview:
+# ----------------
+# 1. Create directories with all bricks up.
+# 2. Kill a brick and change directory properties and set user xattr.
+# 2. Bring up the brick and access the directory
+# 3. Check the permissions and xattrs on the backend
+
+
+TEST glusterfs --entry-timeout=0 --attribute-timeout=0 -s $H0 --volfile-id $V0 $M0
+
+ls $M0/
+cd $M0
+TEST mkdir dir{1..4}
+
+TEST kill_brick $V0 $H0 $B0/$V0-1
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "0" brick_up_status $V0 $H0 $B0/$V0-1
+
+# No change for dir1
+# Change permissions for dir2
+# Set xattr on dir3
+# Change permissions and set xattr on dir4
+
+TEST chmod 777 $M0/dir2
+
+TEST setfattr -n "user.test" -v "test" $M0/dir3
+
+TEST chmod 777 $M0/dir4
+TEST setfattr -n "user.test" -v "test" $M0/dir4
+
+
+# Start all bricks
+
+TEST $CLI volume start $V0 force
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status $V0 $H0 $B0/$V0-1
+
+#$CLI volume status
+
+# It takes a while for the client to reconnect to the brick
+sleep 5
+
+stat $M0/dir* > /dev/null
+
+# Check directories on the brick that was killed
+
+TEST ls $B0/$V0-1/dir2
+EXPECT "777" stat -c "%a" $B0/$V0-1/dir2
+
+TEST ls $B0/$V0-1/dir3
+EXPECT "test" getfattr -n "user.test" --absolute-names --only-values $B0/$V0-1/dir3
+
+
+TEST ls $B0/$V0-1/dir4
+EXPECT "777" stat -c "%a" $B0/$V0-1/dir4
+EXPECT "test" getfattr -n "user.test" --absolute-names --only-values $B0/$V0-1/dir4
+cd
+
+
+# Cleanup
+cleanup
+
diff --git a/xlators/cluster/dht/src/dht-common.c b/xlators/cluster/dht/src/dht-common.c
index 2a68193..d3e900c 100644
--- a/xlators/cluster/dht/src/dht-common.c
+++ b/xlators/cluster/dht/src/dht-common.c
@@ -801,9 +801,8 @@ dht_common_mark_mdsxattr(call_frame_t *frame, int *errst,
call_frame_t *xattr_frame = NULL;
gf_boolean_t vol_down = _gf_false;
- this = frame->this;
-
GF_VALIDATE_OR_GOTO("dht", frame, out);
+ this = frame->this;
GF_VALIDATE_OR_GOTO("dht", this, out);
GF_VALIDATE_OR_GOTO(this->name, frame->local, out);
GF_VALIDATE_OR_GOTO(this->name, this->private, out);
@@ -812,6 +811,7 @@ dht_common_mark_mdsxattr(call_frame_t *frame, int *errst,
conf = this->private;
layout = local->selfheal.layout;
local->mds_heal_fresh_lookup = mark_during_fresh_lookup;
+
gf_uuid_unparse(local->gfid, gfid_local);
/* Code to update hashed subvol consider as a mds subvol
@@ -1240,6 +1240,31 @@ out:
}
int
+dht_needs_selfheal(call_frame_t *frame, xlator_t *this)
+{
+ dht_local_t *local = NULL;
+ dht_layout_t *layout = NULL;
+ int needs_selfheal = 0;
+ int ret = 0;
+
+ local = frame->local;
+ layout = local->layout;
+
+ if (local->need_attrheal || local->need_xattr_heal ||
+ local->need_selfheal) {
+ needs_selfheal = 1;
+ }
+
+ ret = dht_layout_normalize(this, &local->loc, layout);
+
+ if (ret != 0) {
+ gf_msg_debug(this->name, 0, "fixing assignment on %s", local->loc.path);
+ needs_selfheal = 1;
+ }
+ return needs_selfheal;
+}
+
+int
dht_lookup_dir_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
int op_ret, int op_errno, inode_t *inode, struct iatt *stbuf,
dict_t *xattr, struct iatt *postparent)
@@ -1256,8 +1281,6 @@ dht_lookup_dir_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
char gfid_local[GF_UUID_BUF_SIZE] = {0};
char gfid_node[GF_UUID_BUF_SIZE] = {0};
int32_t mds_xattr_val[1] = {0};
- call_frame_t *copy = NULL;
- dht_local_t *copy_local = NULL;
GF_VALIDATE_OR_GOTO("dht", frame, out);
GF_VALIDATE_OR_GOTO("dht", this, out);
@@ -1270,7 +1293,11 @@ dht_lookup_dir_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
conf = this->private;
layout = local->layout;
+ gf_msg_debug(this->name, op_errno,
+ "%s: lookup on %s returned with op_ret = %d, op_errno = %d",
+ local->loc.path, prev->name, op_ret, op_errno);
+ /* The first successful lookup*/
if (!op_ret && gf_uuid_is_null(local->gfid)) {
memcpy(local->gfid, stbuf->ia_gfid, 16);
}
@@ -1298,13 +1325,10 @@ dht_lookup_dir_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
if (op_ret == -1) {
local->op_errno = op_errno;
- gf_msg_debug(this->name, op_errno,
- "%s: lookup on %s returned error", local->loc.path,
- prev->name);
/* The GFID is missing on this subvol. Force a heal. */
if (op_errno == ENODATA) {
- local->need_selfheal = 1;
+ local->need_lookup_everywhere = 1;
}
goto unlock;
}
@@ -1312,12 +1336,11 @@ dht_lookup_dir_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
is_dir = check_is_dir(inode, stbuf, xattr);
if (!is_dir) {
gf_msg_debug(this->name, 0,
- "lookup of %s on %s returned non"
- "dir 0%o"
+ "%s: lookup on %s returned non dir 0%o"
"calling lookup_everywhere",
local->loc.path, prev->name, stbuf->ia_type);
- local->need_selfheal = 1;
+ local->need_lookup_everywhere = 1;
goto unlock;
}
@@ -1328,14 +1351,8 @@ dht_lookup_dir_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
dht_aggregate_xattr(local->xattr, xattr);
}
- if (dict_get(xattr, conf->mds_xattr_key)) {
- local->mds_subvol = prev;
- local->mds_stbuf.ia_gid = stbuf->ia_gid;
- local->mds_stbuf.ia_uid = stbuf->ia_uid;
- local->mds_stbuf.ia_prot = stbuf->ia_prot;
- }
-
if (local->stbuf.ia_type != IA_INVAL) {
+ /* This is not the first subvol to respond */
if (!__is_root_gfid(stbuf->ia_gfid) &&
((local->stbuf.ia_gid != stbuf->ia_gid) ||
(local->stbuf.ia_uid != stbuf->ia_uid) ||
@@ -1348,65 +1365,64 @@ dht_lookup_dir_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
if (local->inode == NULL)
local->inode = inode_ref(inode);
+ /* This could be a problem */
dht_iatt_merge(this, &local->stbuf, stbuf);
dht_iatt_merge(this, &local->postparent, postparent);
if (!dict_get(xattr, conf->mds_xattr_key)) {
gf_msg_debug(this->name, 0,
- "Internal xattr %s is not present "
- " on path %s gfid is %s ",
- conf->mds_xattr_key, local->loc.path, gfid_local);
+ "%s: mds xattr %s is not present "
+ "on %s(gfid = %s)",
+ local->loc.path, conf->mds_xattr_key, prev->name,
+ gfid_local);
goto unlock;
- } else {
- /* Save mds subvol on inode ctx */
- ret = dht_inode_ctx_mdsvol_set(local->inode, this, prev);
- if (ret) {
- gf_msg(this->name, GF_LOG_ERROR, 0,
- DHT_MSG_SET_INODE_CTX_FAILED,
- "Failed to set hashed subvol for %s vol is %s",
- local->loc.path, prev->name);
- }
+ }
+
+ local->mds_subvol = prev;
+ local->mds_stbuf = *stbuf;
+
+ /* Save mds subvol on inode ctx */
+
+ ret = dht_inode_ctx_mdsvol_set(local->inode, this, prev);
+ if (ret) {
+ gf_msg(this->name, GF_LOG_ERROR, 0, DHT_MSG_SET_INODE_CTX_FAILED,
+ "%s: Failed to set mds (%s)", local->loc.path, prev->name);
}
check_mds = dht_dict_get_array(xattr, conf->mds_xattr_key,
mds_xattr_val, 1, &errst);
if ((check_mds < 0) && !errst) {
local->mds_xattr = dict_ref(xattr);
gf_msg_debug(this->name, 0,
- "Value of %s is not zero on hashed subvol "
- "so xattr needs to be heal on non hashed"
- " path is %s and vol name is %s "
- " gfid is %s",
- conf->mds_xattr_key, local->loc.path, prev->name,
+ "%s: %s is not zero on %s. Xattrs need to be healed."
+ "(gfid = %s)",
+ local->loc.path, conf->mds_xattr_key, prev->name,
gfid_local);
local->need_xattr_heal = 1;
- local->mds_subvol = prev;
}
}
+
unlock:
UNLOCK(&frame->lock);
this_call_cnt = dht_frame_return(frame);
if (is_last_call(this_call_cnt)) {
+ /* If the mds subvol is not set correctly*/
+ if (!__is_root_gfid(local->gfid) &&
+ (!dict_get(local->xattr, conf->mds_xattr_key))) {
+ local->need_selfheal = 1;
+ }
+
/* No need to call xattr heal code if volume count is 1
*/
- if (conf->subvolume_cnt == 1)
+ if (conf->subvolume_cnt == 1) {
local->need_xattr_heal = 0;
-
- /* Code to update all extended attributed from hashed subvol
- to local->xattr
- */
- if (local->need_xattr_heal && (local->mds_xattr)) {
- dht_dir_set_heal_xattr(this, local, local->xattr, local->mds_xattr,
- NULL, NULL);
- dict_unref(local->mds_xattr);
- local->mds_xattr = NULL;
}
- if (local->need_selfheal) {
- local->need_selfheal = 0;
+ if (local->need_selfheal || local->need_lookup_everywhere) {
/* Set the gfid-req so posix will set the GFID*/
if (!gf_uuid_is_null(local->gfid)) {
+ /* Ok, this should _never_ happen */
ret = dict_set_static_bin(local->xattr_req, "gfid-req",
local->gfid, 16);
} else {
@@ -1414,73 +1430,36 @@ unlock:
ret = dict_set_static_bin(local->xattr_req, "gfid-req",
local->gfid_req, 16);
}
+ }
+
+ if (local->need_lookup_everywhere) {
+ local->need_lookup_everywhere = 0;
dht_lookup_everywhere(frame, this, &local->loc);
return 0;
}
if (local->op_ret == 0) {
- ret = dht_layout_normalize(this, &local->loc, layout);
-
- if (ret != 0) {
- gf_msg_debug(this->name, 0, "fixing assignment on %s",
- local->loc.path);
+ if (dht_needs_selfheal(frame, this)) {
goto selfheal;
}
dht_layout_set(this, local->inode, layout);
- if (!dict_get(local->xattr, conf->mds_xattr_key) ||
- local->need_xattr_heal)
- goto selfheal;
- }
-
- if (local->inode) {
- dht_inode_ctx_time_update(local->inode, this, &local->stbuf, 1);
- }
-
- if (local->loc.parent) {
- dht_inode_ctx_time_update(local->loc.parent, this,
- &local->postparent, 1);
- }
-
- if (local->need_attrheal) {
- local->need_attrheal = 0;
- if (!__is_root_gfid(inode->gfid)) {
- local->stbuf.ia_gid = local->mds_stbuf.ia_gid;
- local->stbuf.ia_uid = local->mds_stbuf.ia_uid;
- local->stbuf.ia_prot = local->mds_stbuf.ia_prot;
+ if (local->inode) {
+ dht_inode_ctx_time_update(local->inode, this, &local->stbuf, 1);
}
- copy = create_frame(this, this->ctx->pool);
- if (copy) {
- copy_local = dht_local_init(copy, &local->loc, NULL, 0);
- if (!copy_local) {
- DHT_STACK_DESTROY(copy);
- goto skip_attr_heal;
- }
- copy_local->stbuf = local->stbuf;
- gf_uuid_copy(copy_local->loc.gfid, local->stbuf.ia_gfid);
- copy_local->mds_stbuf = local->mds_stbuf;
- copy_local->mds_subvol = local->mds_subvol;
- copy->local = copy_local;
- FRAME_SU_DO(copy, dht_local_t);
- ret = synctask_new(this->ctx->env, dht_dir_attr_heal,
- dht_dir_attr_heal_done, copy, copy);
- if (ret) {
- gf_msg(this->name, GF_LOG_ERROR, ENOMEM,
- DHT_MSG_DIR_ATTR_HEAL_FAILED,
- "Synctask creation failed to heal attr "
- "for path %s gfid %s ",
- local->loc.path, local->gfid);
- DHT_STACK_DESTROY(copy);
- }
+
+ if (local->loc.parent) {
+ dht_inode_ctx_time_update(local->loc.parent, this,
+ &local->postparent, 1);
}
}
- skip_attr_heal:
DHT_STRIP_PHASE1_FLAGS(&local->stbuf);
dht_set_fixed_dir_stat(&local->postparent);
/* Delete mds xattr at the time of STACK UNWIND */
if (local->xattr)
GF_REMOVE_INTERNAL_XATTR(conf->mds_xattr_key, local->xattr);
+
DHT_STACK_UNWIND(lookup, frame, local->op_ret, local->op_errno,
local->inode, &local->stbuf, local->xattr,
&local->postparent);
@@ -5444,9 +5423,8 @@ dht_dir_common_set_remove_xattr(call_frame_t *frame, xlator_t *this, loc_t *loc,
} else {
gf_msg(this->name, GF_LOG_ERROR, 0,
DHT_MSG_HASHED_SUBVOL_GET_FAILED,
- "Failed to get mds subvol for path %s"
- "gfid is %s ",
- loc->path, gfid_local);
+ "%s: Failed to get mds subvol. (gfid is %s)", loc->path,
+ gfid_local);
}
(*op_errno) = ENOENT;
goto err;
diff --git a/xlators/cluster/dht/src/dht-selfheal.c b/xlators/cluster/dht/src/dht-selfheal.c
index bd1b7ea..5420fca 100644
--- a/xlators/cluster/dht/src/dht-selfheal.c
+++ b/xlators/cluster/dht/src/dht-selfheal.c
@@ -1033,18 +1033,27 @@ dht_selfheal_dir_setattr(call_frame_t *frame, loc_t *loc, struct iatt *stbuf,
int missing_attr = 0;
int i = 0, ret = -1;
dht_local_t *local = NULL;
+ dht_conf_t *conf = NULL;
xlator_t *this = NULL;
int cnt = 0;
local = frame->local;
this = frame->this;
+ conf = this->private;
+
+ /* We need to heal the attrs if:
+ * 1. Any directories were missing - the newly created dirs will need
+ * to have the correct attrs set
+ * 2. An existing dir does not have the correct permissions -they may
+ * have been changed when a brick was down.
+ */
for (i = 0; i < layout->cnt; i++) {
if (layout->list[i].err == -1)
missing_attr++;
}
- if (missing_attr == 0) {
+ if ((missing_attr == 0) && (local->need_attrheal == 0)) {
if (!local->heal_layout) {
gf_msg_trace(this->name, 0, "Skip heal layout for %s gfid = %s ",
loc->path, uuid_utoa(loc->gfid));
@@ -1062,19 +1071,12 @@ dht_selfheal_dir_setattr(call_frame_t *frame, loc_t *loc, struct iatt *stbuf,
return 0;
}
- local->call_cnt = missing_attr;
- cnt = layout->cnt;
+ cnt = local->call_cnt = conf->subvolume_cnt;
for (i = 0; i < cnt; i++) {
- if (layout->list[i].err == -1) {
- gf_msg_trace(this->name, 0, "%s: setattr on subvol %s, gfid = %s",
- loc->path, layout->list[i].xlator->name,
- uuid_utoa(loc->gfid));
-
- STACK_WIND(
- frame, dht_selfheal_dir_setattr_cbk, layout->list[i].xlator,
- layout->list[i].xlator->fops->setattr, loc, stbuf, valid, NULL);
- }
+ STACK_WIND(frame, dht_selfheal_dir_setattr_cbk, layout->list[i].xlator,
+ layout->list[i].xlator->fops->setattr, loc, stbuf, valid,
+ NULL);
}
return 0;
@@ -1492,6 +1494,9 @@ dht_selfheal_dir_mkdir(call_frame_t *frame, loc_t *loc, dht_layout_t *layout,
}
if (missing_dirs == 0) {
+ /* We don't need to create any directories. Proceed to heal the
+ * attrs and xattrs
+ */
if (!__is_root_gfid(local->stbuf.ia_gfid)) {
if (local->need_xattr_heal) {
local->need_xattr_heal = 0;
@@ -1499,8 +1504,8 @@ dht_selfheal_dir_mkdir(call_frame_t *frame, loc_t *loc, dht_layout_t *layout,
if (ret)
gf_msg(this->name, GF_LOG_ERROR, ret,
DHT_MSG_DIR_XATTR_HEAL_FAILED,
- "xattr heal failed for "
- "directory %s gfid %s ",
+ "%s:xattr heal failed for "
+ "directory (gfid = %s)",
local->loc.path, local->gfid);
} else {
if (!gf_uuid_is_null(local->gfid))
@@ -1512,8 +1517,8 @@ dht_selfheal_dir_mkdir(call_frame_t *frame, loc_t *loc, dht_layout_t *layout,
gf_msg(this->name, GF_LOG_INFO, 0,
DHT_MSG_DIR_XATTR_HEAL_FAILED,
- "Failed to set mds xattr "
- "for directory %s gfid %s ",
+ "%s: Failed to set mds xattr "
+ "for directory (gfid = %s)",
local->loc.path, local->gfid);
}
}
@@ -2085,10 +2090,10 @@ dht_selfheal_directory(call_frame_t *frame, dht_selfheal_dir_cbk_t dir_cbk,
loc_t *loc, dht_layout_t *layout)
{
dht_local_t *local = NULL;
+ xlator_t *this = NULL;
uint32_t down = 0;
uint32_t misc = 0;
int ret = 0;
- xlator_t *this = NULL;
char pgfid[GF_UUID_BUF_SIZE] = {0};
char gfid[GF_UUID_BUF_SIZE] = {0};
inode_t *linked_inode = NULL, *inode = NULL;
@@ -2099,6 +2104,11 @@ dht_selfheal_directory(call_frame_t *frame, dht_selfheal_dir_cbk_t dir_cbk,
local->selfheal.dir_cbk = dir_cbk;
local->selfheal.layout = dht_layout_ref(this, layout);
+ if (local->need_attrheal && !IA_ISINVAL(local->mds_stbuf.ia_type)) {
+ /*Use the one in the mds_stbuf*/
+ local->stbuf = local->mds_stbuf;
+ }
+
if (!__is_root_gfid(local->stbuf.ia_gfid)) {
gf_uuid_unparse(local->stbuf.ia_gfid, gfid);
gf_uuid_unparse(loc->parent->gfid, pgfid);
@@ -2118,6 +2128,13 @@ dht_selfheal_directory(call_frame_t *frame, dht_selfheal_dir_cbk_t dir_cbk,
inode_unref(inode);
}
+ if (local->need_xattr_heal && (local->mds_xattr)) {
+ dht_dir_set_heal_xattr(this, local, local->xattr, local->mds_xattr,
+ NULL, NULL);
+ dict_unref(local->mds_xattr);
+ local->mds_xattr = NULL;
+ }
+
dht_layout_anomalies(this, loc, layout, &local->selfheal.hole_cnt,
&local->selfheal.overlaps_cnt,
&local->selfheal.missing_cnt, &local->selfheal.down,
@@ -2128,18 +2145,18 @@ dht_selfheal_directory(call_frame_t *frame, dht_selfheal_dir_cbk_t dir_cbk,
if (down) {
gf_msg(this->name, GF_LOG_WARNING, 0, DHT_MSG_DIR_SELFHEAL_FAILED,
- "Directory selfheal failed: %d subvolumes down."
- "Not fixing. path = %s, gfid = %s",
- down, loc->path, gfid);
+ "%s: Directory selfheal failed: %d subvolumes down."
+ "Not fixing. gfid = %s",
+ loc->path, down, gfid);
ret = 0;
goto sorry_no_fix;
}
if (misc) {
gf_msg(this->name, GF_LOG_WARNING, 0, DHT_MSG_DIR_SELFHEAL_FAILED,
- "Directory selfheal failed : %d subvolumes "
- "have unrecoverable errors. path = %s, gfid = %s",
- misc, loc->path, gfid);
+ "%s: Directory selfheal failed : %d subvolumes "
+ "have unrecoverable errors. gfid = %s",
+ loc->path, misc, gfid);
ret = 0;
goto sorry_no_fix;
@@ -2369,13 +2386,13 @@ dht_dir_attr_heal(void *data)
frame = data;
local = frame->local;
- mds_subvol = local->mds_subvol;
this = frame->this;
GF_VALIDATE_OR_GOTO("dht", this, out);
GF_VALIDATE_OR_GOTO("dht", local, out);
conf = this->private;
GF_VALIDATE_OR_GOTO("dht", conf, out);
+ mds_subvol = local->mds_subvol;
call_cnt = conf->subvolume_cnt;
if (!__is_root_gfid(local->stbuf.ia_gfid) && (!mds_subvol)) {
--
1.8.3.1

@@ -0,0 +1,200 @@
From 884ba13ee47888b5de9b6d6acaf051e895f55053 Mon Sep 17 00:00:00 2001
From: N Balachandran <nbalacha@redhat.com>
Date: Wed, 10 Apr 2019 14:28:55 +0530
Subject: [PATCH 127/141] cluster/dht: Refactor dht lookup functions
Part 2: Modify dht_revalidate_cbk to call
dht_selfheal_directory instead of making separate
calls to heal attrs and xattrs.
upstream: https://review.gluster.org/#/c/glusterfs/+/22542/
> Change-Id: Id41ac6c4220c2c35484812bbfc6157fc3c86b142
> updates: bz#1590385
Change-Id: Id53962306dd142efc741de838b585fa5c78f9b1f
BUG: 1703897
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169038
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
---
xlators/cluster/dht/src/dht-common.c | 104 ++++++++++-------------------------
1 file changed, 30 insertions(+), 74 deletions(-)
diff --git a/xlators/cluster/dht/src/dht-common.c b/xlators/cluster/dht/src/dht-common.c
index d3e900c..183872f 100644
--- a/xlators/cluster/dht/src/dht-common.c
+++ b/xlators/cluster/dht/src/dht-common.c
@@ -1365,7 +1365,6 @@ dht_lookup_dir_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
if (local->inode == NULL)
local->inode = inode_ref(inode);
- /* This could be a problem */
dht_iatt_merge(this, &local->stbuf, stbuf);
dht_iatt_merge(this, &local->postparent, postparent);
@@ -1509,8 +1508,6 @@ dht_revalidate_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
int is_dir = 0;
int is_linkfile = 0;
int follow_link = 0;
- call_frame_t *copy = NULL;
- dht_local_t *copy_local = NULL;
char gfid[GF_UUID_BUF_SIZE] = {0};
uint32_t vol_commit_hash = 0;
xlator_t *subvol = NULL;
@@ -1538,17 +1535,16 @@ dht_revalidate_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
gf_uuid_unparse(local->loc.gfid, gfid);
+ gf_msg_debug(this->name, op_errno,
+ "%s: revalidate lookup on %s returned op_ret %d",
+ local->loc.path, prev->name, op_ret);
+
LOCK(&frame->lock);
{
if (gf_uuid_is_null(local->gfid)) {
memcpy(local->gfid, local->loc.gfid, 16);
}
- gf_msg_debug(this->name, op_errno,
- "revalidate lookup of %s "
- "returned with op_ret %d",
- local->loc.path, op_ret);
-
if (op_ret == -1) {
local->op_errno = op_errno;
@@ -1580,6 +1576,8 @@ dht_revalidate_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
local->loc.path);
local->need_lookup_everywhere = 1;
+ } else if (IA_ISDIR(local->loc.inode->ia_type)) {
+ local->need_selfheal = 1;
}
}
@@ -1638,15 +1636,16 @@ dht_revalidate_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
(local->stbuf.ia_uid != stbuf->ia_uid) ||
is_permission_different(&local->stbuf.ia_prot,
&stbuf->ia_prot)) {
- local->need_selfheal = 1;
+ local->need_attrheal = 1;
}
}
if (!dict_get(xattr, conf->mds_xattr_key)) {
gf_msg_debug(this->name, 0,
- "internal xattr %s is not present"
- " on path %s gfid is %s ",
- conf->mds_xattr_key, local->loc.path, gfid);
+ "%s: internal xattr %s is not present"
+ " on subvol %s(gfid is %s)",
+ local->loc.path, conf->mds_xattr_key, prev->name,
+ gfid);
} else {
check_mds = dht_dict_get_array(xattr, conf->mds_xattr_key,
mds_xattr_val, 1, &errst);
@@ -1734,71 +1733,28 @@ unlock:
local->need_xattr_heal = 0;
if (IA_ISDIR(local->stbuf.ia_type)) {
- /* Code to update all extended attributed from hashed
- subvol to local->xattr and call heal code to heal
- custom xattr from hashed subvol to non-hashed subvol
- */
- if (local->need_xattr_heal && (local->mds_xattr)) {
- dht_dir_set_heal_xattr(this, local, local->xattr,
- local->mds_xattr, NULL, NULL);
- dict_unref(local->mds_xattr);
- local->mds_xattr = NULL;
- local->need_xattr_heal = 0;
- ret = dht_dir_xattr_heal(this, local);
- if (ret)
- gf_msg(this->name, GF_LOG_ERROR, ret,
- DHT_MSG_DIR_XATTR_HEAL_FAILED,
- "xattr heal failed for directory %s "
- " gfid %s ",
- local->loc.path, gfid);
- } else {
- /* Call function to save hashed subvol on inode
- ctx if internal mds xattr is not present and
- all subvols are up
- */
- if (inode && !__is_root_gfid(inode->gfid) && (!local->op_ret))
- (void)dht_common_mark_mdsxattr(frame, NULL, 1);
- }
- }
- if (local->need_selfheal) {
- local->need_selfheal = 0;
- if (!__is_root_gfid(inode->gfid)) {
- gf_uuid_copy(local->gfid, local->mds_stbuf.ia_gfid);
- local->stbuf.ia_gid = local->mds_stbuf.ia_gid;
- local->stbuf.ia_uid = local->mds_stbuf.ia_uid;
- local->stbuf.ia_prot = local->mds_stbuf.ia_prot;
- } else {
- gf_uuid_copy(local->gfid, local->stbuf.ia_gfid);
- local->stbuf.ia_gid = local->prebuf.ia_gid;
- local->stbuf.ia_uid = local->prebuf.ia_uid;
- local->stbuf.ia_prot = local->prebuf.ia_prot;
- }
+ if (!__is_root_gfid(local->loc.inode->gfid) &&
+ (!dict_get(local->xattr, conf->mds_xattr_key)))
+ local->need_selfheal = 1;
- copy = create_frame(this, this->ctx->pool);
- if (copy) {
- copy_local = dht_local_init(copy, &local->loc, NULL, 0);
- if (!copy_local) {
- DHT_STACK_DESTROY(copy);
- goto cont;
- }
- copy_local->stbuf = local->stbuf;
- copy_local->mds_stbuf = local->mds_stbuf;
- copy_local->mds_subvol = local->mds_subvol;
- copy->local = copy_local;
- FRAME_SU_DO(copy, dht_local_t);
- ret = synctask_new(this->ctx->env, dht_dir_attr_heal,
- dht_dir_attr_heal_done, copy, copy);
- if (ret) {
- gf_msg(this->name, GF_LOG_ERROR, ENOMEM,
- DHT_MSG_DIR_ATTR_HEAL_FAILED,
- "Synctask creation failed to heal attr "
- "for path %s gfid %s ",
- local->loc.path, local->gfid);
- DHT_STACK_DESTROY(copy);
+ if (dht_needs_selfheal(frame, this)) {
+ if (!__is_root_gfid(local->loc.inode->gfid)) {
+ local->stbuf.ia_gid = local->mds_stbuf.ia_gid;
+ local->stbuf.ia_uid = local->mds_stbuf.ia_uid;
+ local->stbuf.ia_prot = local->mds_stbuf.ia_prot;
+ } else {
+ local->stbuf.ia_gid = local->prebuf.ia_gid;
+ local->stbuf.ia_uid = local->prebuf.ia_uid;
+ local->stbuf.ia_prot = local->prebuf.ia_prot;
}
+
+ layout = local->layout;
+ dht_selfheal_directory(frame, dht_lookup_selfheal_cbk,
+ &local->loc, layout);
+ return 0;
}
}
- cont:
+
if (local->layout_mismatch) {
/* Found layout mismatch in the directory, need to
fix this in the inode context */
@@ -1814,7 +1770,7 @@ unlock:
dht_layout_unref(this, local->layout);
local->layout = NULL;
- /* We know that current cached subvol is no more
+ /* We know that current cached subvol is no longer
valid, get the new one */
local->cached_subvol = NULL;
if (local->xattr_req) {
--
1.8.3.1

@@ -0,0 +1,86 @@
From bb39abc1dab3c7b7b725f9eefe119218e94f610b Mon Sep 17 00:00:00 2001
From: Mohit Agrawal <moagrawal@redhat.com>
Date: Mon, 29 Apr 2019 18:48:36 +0530
Subject: [PATCH 128/141] glusterd: Fix bulkvoldict thread logic in brick
multiplexing
Problem: glusterd currently spawns the bulkvoldict thread in a
brick-mux environment even when the number of volumes is less
than the configured glusterd.vol_count_per_thread.
Solution: Correct the logic for spawning bulkvoldict threads:
1) Calculate endindex only when the total thread count is non-zero.
2) Update the end index correctly so each bulkvoldict thread is
passed the right range of volume indices.
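To make the corrected arithmetic concrete, here is a small standalone C
sketch of the intended per-thread index computation. The values are
hypothetical, and the extra endindex == 0 guard (for an evenly divisible
volume count) is added here for completeness rather than taken from the
patch:

    #include <stdio.h>

    int
    main(void)
    {
        int volcnt = 23;              /* hypothetical total volume count */
        int vol_per_thread_limit = 5; /* hypothetical vol_count_per_thread */
        int totthread = volcnt / vol_per_thread_limit;
        int endindex = 0;
        int i;

        if (totthread) {
            endindex = volcnt % vol_per_thread_limit;
            if (endindex)
                totthread++; /* one extra thread for the remainder */
        }

        for (i = 0; i < totthread; i++) {
            int start = i * vol_per_thread_limit + 1;
            int end = ((i + 1) != totthread || endindex == 0)
                          ? (i + 1) * vol_per_thread_limit
                          : i * vol_per_thread_limit + endindex;
            printf("thread %d handles volumes %d..%d\n", i, start, end);
        }
        return 0;
    }

With volcnt = 23 and a limit of 5, this prints ranges 1..5, 6..10,
11..15, 16..20 and 21..23, matching the fixed end-index logic.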
> Fixes: bz#1704252
> Change-Id: I1def847fbdd6a605e7687bfc4e42b706bf0eb70b
> (Cherry picked from commit ac70f66c5805e10b3a1072bd467918730c0aeeb4)
> (Reviewed on upstream link https://review.gluster.org/#/c/glusterfs/+/22647/)
BUG: 1704769
Change-Id: I1def847fbdd6a605e7687bfc4e42b706bf0eb70b
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169091
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: RHGS Build Bot <nigelb@redhat.com>
---
xlators/mgmt/glusterd/src/glusterd-utils.c | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/xlators/mgmt/glusterd/src/glusterd-utils.c b/xlators/mgmt/glusterd/src/glusterd-utils.c
index ff6102b..efa5a86 100644
--- a/xlators/mgmt/glusterd/src/glusterd-utils.c
+++ b/xlators/mgmt/glusterd/src/glusterd-utils.c
@@ -3436,9 +3436,19 @@ glusterd_add_bulk_volumes_create_thread(void *data)
cds_list_for_each_entry(volinfo, &priv->volumes, vol_list)
{
count++;
- if ((count < start) || (count > end))
+
+ /* Skip volumes if index count is less than start
+ index to handle volume for specific thread
+ */
+ if (count < start)
continue;
+ /* No need to process volume if index count is greater
+ than end index
+ */
+ if (count > end)
+ break;
+
ret = glusterd_add_volume_to_dict(volinfo, dict, count, "volume");
if (ret)
goto out;
@@ -3499,9 +3509,11 @@ glusterd_add_volumes_to_export_dict(dict_t **peer_data)
totthread = 0;
} else {
totthread = volcnt / vol_per_thread_limit;
- endindex = volcnt % vol_per_thread_limit;
- if (endindex)
- totthread++;
+ if (totthread) {
+ endindex = volcnt % vol_per_thread_limit;
+ if (endindex)
+ totthread++;
+ }
}
if (totthread == 0) {
@@ -3527,10 +3539,10 @@ glusterd_add_volumes_to_export_dict(dict_t **peer_data)
arg->this = this;
arg->voldict = dict_arr[i];
arg->start = start;
- if (!endindex) {
+ if ((i + 1) != totthread) {
arg->end = ((i + 1) * vol_per_thread_limit);
} else {
- arg->end = (start + endindex);
+ arg->end = ((i * vol_per_thread_limit) + endindex);
}
th_ret = gf_thread_create_detached(
&th_id, glusterd_add_bulk_volumes_create_thread, arg,
--
1.8.3.1

@@ -0,0 +1,401 @@
From f305ee93ec9dbbd679e1eb58c7c0bf8d9b5659d5 Mon Sep 17 00:00:00 2001
From: Xavi Hernandez <xhernandez@redhat.com>
Date: Fri, 12 Apr 2019 13:40:59 +0200
Subject: [PATCH 129/141] core: handle memory accounting correctly
When a translator stops, memory accounting for that translator is not
destroyed (because allocated memory referencing it may still remain),
but the mutexes that coordinate updates of memory accounting were
destroyed. This caused incorrect memory accounting, and even crashes in
debug mode.
This patch also fixes some other things:
* Reduce the number of atomic operations needed to manage memory
accounting.
* Correctly account memory when realloc() is used.
* Merge two critical sections into one.
* Cleaned the code a bit.
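A simplified sketch of the reference scheme described above: one
reference is held per memory type that has live allocations, rather
than per allocation, and the accounting (including its locks in the
real code) is torn down only when the last reference drops. All names
below are illustrative, not the actual GlusterFS structures:

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>

    typedef struct {
        uint64_t num_allocs; /* live allocations of this type */
        uint64_t size;       /* bytes currently allocated for this type */
    } acct_rec_t;

    typedef struct {
        atomic_uint refcnt;  /* one ref per type currently in use */
        acct_rec_t rec[16];
    } acct_t;

    /* Called with the per-type lock held in the real code. */
    static void
    acct_on_alloc(acct_t *acct, uint32_t type, size_t size)
    {
        bool new_ref = (acct->rec[type].num_allocs == 0);

        acct->rec[type].num_allocs++;
        acct->rec[type].size += size;
        if (new_ref) /* first live allocation of this type */
            atomic_fetch_add(&acct->refcnt, 1);
    }

    static void
    acct_on_free(acct_t *acct, uint32_t type, size_t size)
    {
        acct->rec[type].num_allocs--;
        acct->rec[type].size -= size;
        /* drop the type's ref only when its last allocation is freed;
         * tear down the accounting only when no type holds a ref */
        if (acct->rec[type].num_allocs == 0 &&
            atomic_fetch_sub(&acct->refcnt, 1) == 1)
            free(acct); /* locks would be destroyed here first */
    }

This keeps atomic operations off the per-allocation fast path, which is
the reduction in atomic operations mentioned above.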
Upstream patch:
> Change-Id: Id5eaee7338729b9bc52c931815ca3ff1e5a7dcc8
> Upstream patch link : https://review.gluster.org/#/c/glusterfs/+/22554/
> BUG: 1659334
> Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Change-Id: Id5eaee7338729b9bc52c931815ca3ff1e5a7dcc8
Fixes: bz#1702270
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169325
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: RHGS Build Bot <nigelb@redhat.com>
---
libglusterfs/src/glusterfs/xlator.h | 2 +
libglusterfs/src/libglusterfs.sym | 1 +
libglusterfs/src/mem-pool.c | 193 ++++++++++++++++--------------------
libglusterfs/src/xlator.c | 23 +++--
4 files changed, 105 insertions(+), 114 deletions(-)
diff --git a/libglusterfs/src/glusterfs/xlator.h b/libglusterfs/src/glusterfs/xlator.h
index 06152ec..8998976 100644
--- a/libglusterfs/src/glusterfs/xlator.h
+++ b/libglusterfs/src/glusterfs/xlator.h
@@ -1035,6 +1035,8 @@ gf_boolean_t
loc_is_nameless(loc_t *loc);
int
xlator_mem_acct_init(xlator_t *xl, int num_types);
+void
+xlator_mem_acct_unref(struct mem_acct *mem_acct);
int
is_gf_log_command(xlator_t *trans, const char *name, char *value);
int
diff --git a/libglusterfs/src/libglusterfs.sym b/libglusterfs/src/libglusterfs.sym
index fa2025e..cf5757c 100644
--- a/libglusterfs/src/libglusterfs.sym
+++ b/libglusterfs/src/libglusterfs.sym
@@ -1093,6 +1093,7 @@ xlator_foreach
xlator_foreach_depth_first
xlator_init
xlator_mem_acct_init
+xlator_mem_acct_unref
xlator_notify
xlator_option_info_list
xlator_option_init_bool
diff --git a/libglusterfs/src/mem-pool.c b/libglusterfs/src/mem-pool.c
index 34cb87a..3934a78 100644
--- a/libglusterfs/src/mem-pool.c
+++ b/libglusterfs/src/mem-pool.c
@@ -35,61 +35,92 @@ gf_mem_acct_enable_set(void *data)
return;
}
-int
-gf_mem_set_acct_info(xlator_t *xl, char **alloc_ptr, size_t size, uint32_t type,
- const char *typestr)
+static void *
+gf_mem_header_prepare(struct mem_header *header, size_t size)
{
- void *ptr = NULL;
- struct mem_header *header = NULL;
+ void *ptr;
- if (!alloc_ptr)
- return -1;
+ header->size = size;
- ptr = *alloc_ptr;
+ ptr = header + 1;
- GF_ASSERT(xl != NULL);
+ /* data follows in this gap of 'size' bytes */
+ *(uint32_t *)(ptr + size) = GF_MEM_TRAILER_MAGIC;
- GF_ASSERT(xl->mem_acct != NULL);
+ return ptr;
+}
- GF_ASSERT(type <= xl->mem_acct->num_types);
+static void *
+gf_mem_set_acct_info(struct mem_acct *mem_acct, struct mem_header *header,
+ size_t size, uint32_t type, const char *typestr)
+{
+ struct mem_acct_rec *rec = NULL;
+ bool new_ref = false;
- LOCK(&xl->mem_acct->rec[type].lock);
- {
- if (!xl->mem_acct->rec[type].typestr)
- xl->mem_acct->rec[type].typestr = typestr;
- xl->mem_acct->rec[type].size += size;
- xl->mem_acct->rec[type].num_allocs++;
- xl->mem_acct->rec[type].total_allocs++;
- xl->mem_acct->rec[type].max_size = max(xl->mem_acct->rec[type].max_size,
- xl->mem_acct->rec[type].size);
- xl->mem_acct->rec[type].max_num_allocs = max(
- xl->mem_acct->rec[type].max_num_allocs,
- xl->mem_acct->rec[type].num_allocs);
- }
- UNLOCK(&xl->mem_acct->rec[type].lock);
+ if (mem_acct != NULL) {
+ GF_ASSERT(type <= mem_acct->num_types);
- GF_ATOMIC_INC(xl->mem_acct->refcnt);
+ rec = &mem_acct->rec[type];
+ LOCK(&rec->lock);
+ {
+ if (!rec->typestr) {
+ rec->typestr = typestr;
+ }
+ rec->size += size;
+ new_ref = (rec->num_allocs == 0);
+ rec->num_allocs++;
+ rec->total_allocs++;
+ rec->max_size = max(rec->max_size, rec->size);
+ rec->max_num_allocs = max(rec->max_num_allocs, rec->num_allocs);
+
+#ifdef DEBUG
+ list_add(&header->acct_list, &rec->obj_list);
+#endif
+ }
+ UNLOCK(&rec->lock);
+
+ /* We only take a reference for each memory type used, not for each
+ * allocation. This minimizes the use of atomic operations. */
+ if (new_ref) {
+ GF_ATOMIC_INC(mem_acct->refcnt);
+ }
+ }
- header = (struct mem_header *)ptr;
header->type = type;
- header->size = size;
- header->mem_acct = xl->mem_acct;
+ header->mem_acct = mem_acct;
header->magic = GF_MEM_HEADER_MAGIC;
+ return gf_mem_header_prepare(header, size);
+}
+
+static void *
+gf_mem_update_acct_info(struct mem_acct *mem_acct, struct mem_header *header,
+ size_t size)
+{
+ struct mem_acct_rec *rec = NULL;
+
+ if (mem_acct != NULL) {
+ rec = &mem_acct->rec[header->type];
+ LOCK(&rec->lock);
+ {
+ rec->size += size - header->size;
+ rec->total_allocs++;
+ rec->max_size = max(rec->max_size, rec->size);
+
#ifdef DEBUG
- INIT_LIST_HEAD(&header->acct_list);
- LOCK(&xl->mem_acct->rec[type].lock);
- {
- list_add(&header->acct_list, &(xl->mem_acct->rec[type].obj_list));
- }
- UNLOCK(&xl->mem_acct->rec[type].lock);
+ /* The old 'header' already was present in 'obj_list', but
+ * realloc() could have changed its address. We need to remove
+ * the old item from the list and add the new one. This can be
+ * done this way because list_move() doesn't use the pointers
+ * to the old location (which are not valid anymore) already
+ * present in the list, it simply overwrites them. */
+ list_move(&header->acct_list, &rec->obj_list);
#endif
- ptr += sizeof(struct mem_header);
- /* data follows in this gap of 'size' bytes */
- *(uint32_t *)(ptr + size) = GF_MEM_TRAILER_MAGIC;
+ }
+ UNLOCK(&rec->lock);
+ }
- *alloc_ptr = ptr;
- return 0;
+ return gf_mem_header_prepare(header, size);
}
void *
@@ -97,7 +128,7 @@ __gf_calloc(size_t nmemb, size_t size, uint32_t type, const char *typestr)
{
size_t tot_size = 0;
size_t req_size = 0;
- char *ptr = NULL;
+ void *ptr = NULL;
xlator_t *xl = NULL;
if (!THIS->ctx->mem_acct_enable)
@@ -114,16 +145,15 @@ __gf_calloc(size_t nmemb, size_t size, uint32_t type, const char *typestr)
gf_msg_nomem("", GF_LOG_ALERT, tot_size);
return NULL;
}
- gf_mem_set_acct_info(xl, &ptr, req_size, type, typestr);
- return (void *)ptr;
+ return gf_mem_set_acct_info(xl->mem_acct, ptr, req_size, type, typestr);
}
void *
__gf_malloc(size_t size, uint32_t type, const char *typestr)
{
size_t tot_size = 0;
- char *ptr = NULL;
+ void *ptr = NULL;
xlator_t *xl = NULL;
if (!THIS->ctx->mem_acct_enable)
@@ -138,84 +168,32 @@ __gf_malloc(size_t size, uint32_t type, const char *typestr)
gf_msg_nomem("", GF_LOG_ALERT, tot_size);
return NULL;
}
- gf_mem_set_acct_info(xl, &ptr, size, type, typestr);
- return (void *)ptr;
+ return gf_mem_set_acct_info(xl->mem_acct, ptr, size, type, typestr);
}
void *
__gf_realloc(void *ptr, size_t size)
{
size_t tot_size = 0;
- char *new_ptr;
- struct mem_header *old_header = NULL;
- struct mem_header *new_header = NULL;
- struct mem_header tmp_header;
+ struct mem_header *header = NULL;
if (!THIS->ctx->mem_acct_enable)
return REALLOC(ptr, size);
REQUIRE(NULL != ptr);
- old_header = (struct mem_header *)(ptr - GF_MEM_HEADER_SIZE);
- GF_ASSERT(old_header->magic == GF_MEM_HEADER_MAGIC);
- tmp_header = *old_header;
-
-#ifdef DEBUG
- int type = 0;
- size_t copy_size = 0;
-
- /* Making these changes for realloc is not straightforward. So
- * I am simulating realloc using calloc and free
- */
-
- type = tmp_header.type;
- new_ptr = __gf_calloc(1, size, type,
- tmp_header.mem_acct->rec[type].typestr);
- if (new_ptr) {
- copy_size = (size > tmp_header.size) ? tmp_header.size : size;
- memcpy(new_ptr, ptr, copy_size);
- __gf_free(ptr);
- }
-
- /* This is not quite what the man page says should happen */
- return new_ptr;
-#endif
+ header = (struct mem_header *)(ptr - GF_MEM_HEADER_SIZE);
+ GF_ASSERT(header->magic == GF_MEM_HEADER_MAGIC);
tot_size = size + GF_MEM_HEADER_SIZE + GF_MEM_TRAILER_SIZE;
- new_ptr = realloc(old_header, tot_size);
- if (!new_ptr) {
+ header = realloc(header, tot_size);
+ if (!header) {
gf_msg_nomem("", GF_LOG_ALERT, tot_size);
return NULL;
}
- /*
- * We used to pass (char **)&ptr as the second
- * argument after the value of realloc was saved
- * in ptr, but the compiler warnings complained
- * about the casting to and forth from void ** to
- * char **.
- * TBD: it would be nice to adjust the memory accounting info here,
- * but calling gf_mem_set_acct_info here is wrong because it bumps
- * up counts as though this is a new allocation - which it's not.
- * The consequence of doing nothing here is only that the sizes will be
- * wrong, but at least the counts won't be.
- uint32_t type = 0;
- xlator_t *xl = NULL;
- type = header->type;
- xl = (xlator_t *) header->xlator;
- gf_mem_set_acct_info (xl, &new_ptr, size, type, NULL);
- */
-
- new_header = (struct mem_header *)new_ptr;
- *new_header = tmp_header;
- new_header->size = size;
-
- new_ptr += sizeof(struct mem_header);
- /* data follows in this gap of 'size' bytes */
- *(uint32_t *)(new_ptr + size) = GF_MEM_TRAILER_MAGIC;
-
- return (void *)new_ptr;
+ return gf_mem_update_acct_info(header->mem_acct, header, size);
}
int
@@ -321,6 +299,7 @@ __gf_free(void *free_ptr)
void *ptr = NULL;
struct mem_acct *mem_acct;
struct mem_header *header = NULL;
+ bool last_ref = false;
if (!THIS->ctx->mem_acct_enable) {
FREE(free_ptr);
@@ -352,16 +331,18 @@ __gf_free(void *free_ptr)
mem_acct->rec[header->type].num_allocs--;
/* If all the instances are freed up then ensure typestr is set
* to NULL */
- if (!mem_acct->rec[header->type].num_allocs)
+ if (!mem_acct->rec[header->type].num_allocs) {
+ last_ref = true;
mem_acct->rec[header->type].typestr = NULL;
+ }
#ifdef DEBUG
list_del(&header->acct_list);
#endif
}
UNLOCK(&mem_acct->rec[header->type].lock);
- if (GF_ATOMIC_DEC(mem_acct->refcnt) == 0) {
- FREE(mem_acct);
+ if (last_ref) {
+ xlator_mem_acct_unref(mem_acct);
}
free:
diff --git a/libglusterfs/src/xlator.c b/libglusterfs/src/xlator.c
index 5d6f8d2..022c3ed 100644
--- a/libglusterfs/src/xlator.c
+++ b/libglusterfs/src/xlator.c
@@ -736,6 +736,19 @@ xlator_mem_acct_init(xlator_t *xl, int num_types)
}
void
+xlator_mem_acct_unref(struct mem_acct *mem_acct)
+{
+ uint32_t i;
+
+ if (GF_ATOMIC_DEC(mem_acct->refcnt) == 0) {
+ for (i = 0; i < mem_acct->num_types; i++) {
+ LOCK_DESTROY(&(mem_acct->rec[i].lock));
+ }
+ FREE(mem_acct);
+ }
+}
+
+void
xlator_tree_fini(xlator_t *xl)
{
xlator_t *top = NULL;
@@ -766,7 +779,6 @@ xlator_list_destroy(xlator_list_t *list)
int
xlator_memrec_free(xlator_t *xl)
{
- uint32_t i = 0;
struct mem_acct *mem_acct = NULL;
if (!xl) {
@@ -775,13 +787,8 @@ xlator_memrec_free(xlator_t *xl)
mem_acct = xl->mem_acct;
if (mem_acct) {
- for (i = 0; i < mem_acct->num_types; i++) {
- LOCK_DESTROY(&(mem_acct->rec[i].lock));
- }
- if (GF_ATOMIC_DEC(mem_acct->refcnt) == 0) {
- FREE(mem_acct);
- xl->mem_acct = NULL;
- }
+ xlator_mem_acct_unref(mem_acct);
+ xl->mem_acct = NULL;
}
return 0;
--
1.8.3.1

@@ -0,0 +1,117 @@
From 01bb17a0910a638e89a44a6da4b1359123940498 Mon Sep 17 00:00:00 2001
From: Hari Gowtham <hgowtham@redhat.com>
Date: Wed, 17 Apr 2019 12:17:27 +0530
Subject: [PATCH 130/141] tier/test: new-tier-cmds.t fails after a glusterd
restart
Problem: new-tier-cmds.t restarts the gluster processes, and after
the restart the bricks and the tier process take more time than
before to come online. This causes the detach start to fail.
Fix: Give them enough time to come online after the restart.
label: DOWNSTREAM ONLY
Change-Id: I0f50b0bb77fe49ebd3a0292e190d0350d7994cfe
Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/168130
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
---
tests/basic/tier/new-tier-cmds.t | 45 ++++++++++++++++++++++++++--------------
tests/volume.rc | 8 +++++++
2 files changed, 37 insertions(+), 16 deletions(-)
diff --git a/tests/basic/tier/new-tier-cmds.t b/tests/basic/tier/new-tier-cmds.t
index b9c9390..92881ac 100644
--- a/tests/basic/tier/new-tier-cmds.t
+++ b/tests/basic/tier/new-tier-cmds.t
@@ -19,14 +19,6 @@ function create_dist_tier_vol () {
TEST $CLI_1 volume tier $V0 attach replica 2 $H1:$B1/${V0}_h1 $H2:$B2/${V0}_h2 $H3:$B3/${V0}_h3 $H1:$B1/${V0}_h4 $H2:$B2/${V0}_h5 $H3:$B3/${V0}_h6
}
-function tier_daemon_status {
- local _VAR=CLI_$1
- local xpath_sel='//node[hostname="Tier Daemon"][path="localhost"]/status'
- ${!_VAR} --xml volume status $V0 \
- | xmllint --xpath "$xpath_sel" - \
- | sed -n '/.*<status>\([0-9]*\).*/s//\1/p'
-}
-
function detach_xml_status {
$CLI_1 volume tier $V0 detach status --xml | sed -n \
'/.*<opErrstr>Detach tier status successful/p' | wc -l
@@ -70,7 +62,20 @@ TEST $glusterd_2;
EXPECT_WITHIN $PROBE_TIMEOUT 2 check_peers;
#after starting detach tier the detach tier status should display the status
-sleep 2
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 1 $V0 $H1 $B1/${V0}_b1
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 1 $V0 $H1 $B1/${V0}_b4
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 1 $V0 $H1 $B1/${V0}_h1
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 1 $V0 $H1 $B1/${V0}_h4
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 2 $V0 $H2 $B2/${V0}_b2
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 2 $V0 $H2 $B2/${V0}_b5
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 2 $V0 $H2 $B2/${V0}_h2
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 2 $V0 $H2 $B2/${V0}_h5
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 3 $V0 $H3 $B3/${V0}_b3
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 3 $V0 $H3 $B3/${V0}_b6
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 3 $V0 $H3 $B3/${V0}_h3
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 3 $V0 $H3 $B3/${V0}_h6
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "3" get_shd_count
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "3" get_tierd_count
$CLI_1 volume status
TEST $CLI_1 volume tier $V0 detach start
@@ -91,13 +96,21 @@ EXPECT_WITHIN $PROBE_TIMEOUT 2 check_peers;
EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 1 $V0 $H2 $B2/${V0}_b2
EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 1 $V0 $H2 $B2/${V0}_h2
-# Parsing normal output doesn't work because of line-wrap issues on our
-# regression machines, and the version of xmllint there doesn't support --xpath
-# so we can't do it that way either. In short, there's no way for us to detect
-# when we can stop waiting, so we just have to wait the maximum time every time
-# and hope any failures will show up later in the script.
-sleep $PROCESS_UP_TIMEOUT
-#XPECT_WITHIN $PROCESS_UP_TIMEOUT 1 tier_daemon_status 2
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 1 $V0 $H1 $B1/${V0}_b1
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 1 $V0 $H1 $B1/${V0}_b4
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 1 $V0 $H1 $B1/${V0}_h1
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 1 $V0 $H1 $B1/${V0}_h4
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 2 $V0 $H2 $B2/${V0}_b2
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 2 $V0 $H2 $B2/${V0}_b5
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 2 $V0 $H2 $B2/${V0}_h2
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 2 $V0 $H2 $B2/${V0}_h5
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 3 $V0 $H3 $B3/${V0}_b3
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 3 $V0 $H3 $B3/${V0}_b6
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 3 $V0 $H3 $B3/${V0}_h3
+EXPECT_WITHIN $CHILD_UP_TIMEOUT 1 cluster_brick_up_status 3 $V0 $H3 $B3/${V0}_h6
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "3" get_shd_count
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "0" get_tierd_count
+$CLI_1 volume status
EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" tier_detach_status
diff --git a/tests/volume.rc b/tests/volume.rc
index 289b197..b326098 100644
--- a/tests/volume.rc
+++ b/tests/volume.rc
@@ -719,6 +719,14 @@ function get_snapd_count {
ps auxww | grep glusterfs | grep snapd.pid | grep -v grep | wc -l
}
+function get_tierd_count {
+ ps auxww | grep glusterfs | grep tierd.pid | grep -v grep | wc -l
+}
+
+function get_shd_count {
+ ps auxww | grep glusterfs | grep shd.pid | grep -v grep | wc -l
+}
+
function drop_cache() {
case $OSTYPE in
Linux)
--
1.8.3.1

@@ -0,0 +1,113 @@
From a0949929282529e0e866e074721c1bdfe3928c8c Mon Sep 17 00:00:00 2001
From: N Balachandran <nbalacha@redhat.com>
Date: Thu, 11 Apr 2019 12:12:12 +0530
Subject: [PATCH 131/141] tests/dht: Test that lookups are sent post brick up
upstream: https://review.gluster.org/#/c/glusterfs/+/22545/
>Change-Id: I3556793c5e9d58cc6a08644b41dc5740fab2610b
>updates: bz#1628194
BUG: 1704562
Change-Id: Ie45331298902bd5268c56cb29a966d8246abfd6d
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169592
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
---
tests/basic/distribute/brick-down.t | 83 +++++++++++++++++++++++++++++++++++++
1 file changed, 83 insertions(+)
create mode 100644 tests/basic/distribute/brick-down.t
diff --git a/tests/basic/distribute/brick-down.t b/tests/basic/distribute/brick-down.t
new file mode 100644
index 0000000..522ccc0
--- /dev/null
+++ b/tests/basic/distribute/brick-down.t
@@ -0,0 +1,83 @@
+#!/bin/bash
+
+. $(dirname $0)/../../include.rc
+. $(dirname $0)/../../volume.rc
+. $(dirname $0)/../../common-utils.rc
+. $(dirname $0)/../../dht.rc
+
+# Test 1 overview:
+# ----------------
+# Test whether lookups are sent after a brick comes up again
+#
+# 1. Create a 3 brick pure distribute volume
+# 2. Fuse mount the volume so the layout is set on the root
+# 3. Kill one brick and try to create a directory which hashes to that brick.
+# It should fail with EIO.
+# 4. Restart the brick that was killed.
+# 5. Do not remount the volume. Try to create the same directory as in step 3.
+
+cleanup
+
+TEST glusterd
+TEST pidof glusterd
+
+TEST $CLI volume create $V0 $H0:$B0/$V0-{1..3}
+TEST $CLI volume start $V0
+
+# We want the lookup to reach DHT
+TEST $CLI volume set $V0 performance.stat-prefetch off
+
+# Mount using FUSE and lookup the mount so a layout is set on the brick root
+TEST glusterfs --entry-timeout=0 --attribute-timeout=0 -s $H0 --volfile-id $V0 $M0
+
+ls $M0/
+
+TEST mkdir $M0/level1
+
+# Find a dirname that will hash to the brick we are going to kill
+hashed=$V0-client-1
+TEST dht_first_filename_with_hashsubvol "$hashed" $M0 "dir-"
+roottestdir=$fn_return_val
+
+hashed=$V0-client-1
+TEST dht_first_filename_with_hashsubvol "$hashed" $M0/level1 "dir-"
+level1testdir=$fn_return_val
+
+
+TEST kill_brick $V0 $H0 $B0/$V0-2
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "0" brick_up_status $V0 $H0 $B0/$V0-2
+
+TEST $CLI volume status $V0
+
+
+# Unmount and mount the volume again so dht has an incomplete in memory layout
+
+umount -f $M0
+TEST glusterfs --entry-timeout=0 --attribute-timeout=0 -s $H0 --volfile-id $V0 $M0
+
+
+mkdir $M0/$roottestdir
+TEST [ $? -ne 0 ]
+
+mkdir $M0/level1/$level1testdir
+TEST [ $? -ne 0 ]
+
+TEST $CLI volume start $V0 force
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status $V0 $H0 $B0/$V0-2
+
+#$CLI volume status
+
+# It takes a while for the client to reconnect to the brick
+sleep 5
+
+
+mkdir $M0/$roottestdir
+TEST [ $? -eq 0 ]
+
+mkdir $M0/level1/$level1testdir
+TEST [ $? -eq 0 ]
+
+# Cleanup
+cleanup
+
+
--
1.8.3.1

@@ -0,0 +1,41 @@
From 83d5ebd6ca68e319db86e310cf072888d0f0f1d1 Mon Sep 17 00:00:00 2001
From: Jiffin Tony Thottan <jthottan@redhat.com>
Date: Wed, 8 May 2019 10:07:29 +0530
Subject: [PATCH 132/141] glusterd: remove duplicate occurrence of
features.selinux from volume option table
Label : DOWNSTREAM ONLY
Change-Id: I0a49fece7a1fcbb9f3bbfe5806ec470aeb33ad70
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169664
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: RHGS Build Bot <nigelb@redhat.com>
---
xlators/mgmt/glusterd/src/glusterd-volume-set.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/xlators/mgmt/glusterd/src/glusterd-volume-set.c b/xlators/mgmt/glusterd/src/glusterd-volume-set.c
index 10aa2ae..e52de20 100644
--- a/xlators/mgmt/glusterd/src/glusterd-volume-set.c
+++ b/xlators/mgmt/glusterd/src/glusterd-volume-set.c
@@ -3242,16 +3242,6 @@ struct volopt_map_entry glusterd_volopt_map[] = {
"pages."
"The max value is 262144 pages i.e 1 GB and "
"the min value is 1000 pages i.e ~4 MB."},
- {.key = VKEY_FEATURES_SELINUX,
- .voltype = "features/selinux",
- .type = NO_DOC,
- .value = "on",
- .op_version = GD_OP_VERSION_3_11_0,
- .description = "Convert security.selinux xattrs to "
- "trusted.gluster.selinux on the bricks. Recommended "
- "to have enabled when clients and/or bricks support "
- "SELinux."},
-
#endif /* USE_GFDB */
{
.key = "locks.trace",
--
1.8.3.1

@@ -0,0 +1,62 @@
From f1f27e5839dd99389bef65f79ea491e98e6935d2 Mon Sep 17 00:00:00 2001
From: Ravishankar N <ravishankar@redhat.com>
Date: Tue, 23 Apr 2019 18:05:36 +0530
Subject: [PATCH 133/141] glusterd: enable fips-mode-rchecksum for new volumes
...during volume create if the cluster op-version is >=GD_OP_VERSION_7_0.
This option itself was introduced in GD_OP_VERSION_4_0_0 via commit 6daa65356.
We missed enabling it by default for new volume creates in that commit.
If we are to do it now safely, we need to use op version
GD_OP_VERSION_7_0 and target it for release-7.
Patch in upstream master: https://review.gluster.org/#/c/glusterfs/+/22609/
BUG: 1706683
Change-Id: I7c6d4a8abe0816367e7069cb5cad01744f04858f
fixes: bz#1706683
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169443
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: RHGS Build Bot <nigelb@redhat.com>
---
xlators/mgmt/glusterd/src/glusterd-volgen.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/xlators/mgmt/glusterd/src/glusterd-volgen.c b/xlators/mgmt/glusterd/src/glusterd-volgen.c
index da877aa..77aa705 100644
--- a/xlators/mgmt/glusterd/src/glusterd-volgen.c
+++ b/xlators/mgmt/glusterd/src/glusterd-volgen.c
@@ -1614,10 +1614,17 @@ brick_graph_add_posix(volgen_graph_t *graph, glusterd_volinfo_t *volinfo,
gf_boolean_t pgfid_feat = _gf_false;
char *value = NULL;
xlator_t *xl = NULL;
+ xlator_t *this = NULL;
+ glusterd_conf_t *priv = NULL;
if (!graph || !volinfo || !set_dict || !brickinfo)
goto out;
+ this = THIS;
+ GF_VALIDATE_OR_GOTO("glusterd", this, out);
+ priv = this->private;
+ GF_VALIDATE_OR_GOTO("glusterd", priv, out);
+
ret = glusterd_volinfo_get(volinfo, VKEY_FEATURES_QUOTA, &value);
if (value) {
ret = gf_string2boolean(value, &quota_enabled);
@@ -1661,6 +1668,12 @@ brick_graph_add_posix(volgen_graph_t *graph, glusterd_volinfo_t *volinfo,
}
}
+ if (priv->op_version >= GD_OP_VERSION_7_0) {
+ ret = xlator_set_fixed_option(xl, "fips-mode-rchecksum", "on");
+ if (ret) {
+ goto out;
+ }
+ }
snprintf(tmpstr, sizeof(tmpstr), "%d", brickinfo->fs_share_count);
ret = xlator_set_fixed_option(xl, "shared-brick-count", tmpstr);
out:
--
1.8.3.1

@@ -0,0 +1,79 @@
From 76127f4f8f3c2bf415f66a335e7b37670cb9bd84 Mon Sep 17 00:00:00 2001
From: Raghavendra G <rgowdapp@redhat.com>
Date: Fri, 3 May 2019 10:14:48 +0530
Subject: [PATCH 134/141] performance/write-behind: remove request from wip
list in wb_writev_cbk
There is a race in the way O_DIRECT writes are handled. Assume two
overlapping write requests w1 and w2.
* w1 is issued and is in wb_inode->wip queue as the response is still
pending from bricks. Also wb_request_unref in wb_do_winds is not yet
invoked.
list_for_each_entry_safe (req, tmp, tasks, winds) {
list_del_init (&req->winds);
if (req->op_ret == -1) {
call_unwind_error_keep_stub (req->stub, req->op_ret,
req->op_errno);
} else {
call_resume_keep_stub (req->stub);
}
wb_request_unref (req);
}
* w2 is issued and wb_process_queue is invoked. w2 is not picked up
for winding as w1 is still in wb_inode->wip. w1 is added to todo
list and wb_writev for w2 returns.
* response to w1 is received and invokes wb_request_unref. Assume
wb_request_unref in wb_do_winds (see point 1) is not invoked
yet. Since there is one more refcount, wb_request_unref in
wb_writev_cbk of w1 doesn't remove w1 from wip.
* wb_process_queue is invoked as part of wb_writev_cbk of w1. But, it
fails to wind w2 as w1 is still in wip.
* wb_request_unref is invoked on w1 as part of wb_do_winds. w1 is
removed from all queues, including wb_inode->wip.
* After this point there is no invocation of wb_process_queue unless
a new request is issued from the application, causing w2 to hang till
the next request.
This bug is similar to bz 1626780 and bz 1379655.
upstream patch: https://review.gluster.org/#/c/glusterfs/+/22654/
BUG: 1702686
Change-Id: Iaa47437613591699d4c8ad18bc0b32de6affcc31
fixes: bz#1702686
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169552
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Sunil Kumar Heggodu Gopala Acharya <sheggodu@redhat.com>
---
xlators/performance/write-behind/src/write-behind.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/xlators/performance/write-behind/src/write-behind.c b/xlators/performance/write-behind/src/write-behind.c
index cf302bd..70e281a 100644
--- a/xlators/performance/write-behind/src/write-behind.c
+++ b/xlators/performance/write-behind/src/write-behind.c
@@ -1813,6 +1813,12 @@ wb_writev_cbk(call_frame_t *frame, void *cookie, xlator_t *this, int32_t op_ret,
frame->local = NULL;
wb_inode = req->wb_inode;
+ LOCK(&req->wb_inode->lock);
+ {
+ list_del_init(&req->wip);
+ }
+ UNLOCK(&req->wb_inode->lock);
+
wb_request_unref(req);
/* requests could be pending while this was in progress */
--
1.8.3.1
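A self-contained sketch of the fix follows: the completion callback takes the inode lock and unconditionally unlinks the request from the wip list, so an overlapping queued write can be wound even while wb_do_winds still holds a reference. The list helpers and the wb_inode/wb_request structs below are simplified stand-ins for the real write-behind types.

    #include <pthread.h>
    #include <stdio.h>

    struct list_head { struct list_head *next, *prev; };

    static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }
    static void list_add_tail(struct list_head *n, struct list_head *h)
    {
        n->prev = h->prev; n->next = h;
        h->prev->next = n; h->prev = n;
    }
    static void list_del_init(struct list_head *n)
    {
        n->prev->next = n->next; n->next->prev = n->prev;
        INIT_LIST_HEAD(n);
    }
    static int list_empty(struct list_head *h) { return h->next == h; }

    struct wb_inode {
        pthread_mutex_t lock;
        struct list_head wip; /* requests wound, reply still pending */
    };

    struct wb_request {
        struct list_head wip;
        struct wb_inode *inode;
        int refcount; /* extra refs may still be held by wb_do_winds */
    };

    /* analogous to the wb_writev_cbk() hunk in the patch */
    static void writev_cbk(struct wb_request *req)
    {
        pthread_mutex_lock(&req->inode->lock);
        list_del_init(&req->wip); /* leave wip even if refs remain */
        pthread_mutex_unlock(&req->inode->lock);
        /* wb_request_unref(req) would follow here */
    }

    int main(void)
    {
        struct wb_inode inode;
        struct wb_request w1 = {.inode = &inode, .refcount = 2};

        pthread_mutex_init(&inode.lock, NULL);
        INIT_LIST_HEAD(&inode.wip);
        list_add_tail(&w1.wip, &inode.wip);

        writev_cbk(&w1); /* reply for w1 arrives */
        /* an overlapping w2 can now be wound: wip is empty despite refcount 2 */
        printf("wip empty: %s\n", list_empty(&inode.wip) ? "yes" : "no");
        pthread_mutex_destroy(&inode.lock);
        return 0;
    }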

View File

@ -0,0 +1,56 @@
From 677f575d2289285d2e553ddd610944856cb947db Mon Sep 17 00:00:00 2001
From: Sunny Kumar <sunkumar@redhat.com>
Date: Fri, 10 May 2019 11:21:03 +0530
Subject: [PATCH 135/141] geo-rep: fix incorrectly formatted authorized_keys
There are two ways of creating the secret pem pub file during geo-rep
setup:
1. gluster-georep-sshkey generate
2. gluster system:: execute gsec_create
The patch below solves this problem for the `gluster-georep-sshkey
generate` method.
Patch link: https://review.gluster.org/#/c/glusterfs/+/22246/
This patch adds the same fix for the old way of creating the secret pem
pub file, `gluster system:: execute gsec_create`.
Problem: During geo-rep setup, when creating the ssh authorized_keys
entries, an extra space is inserted before the "ssh-rsa" label.
This gets flagged by an enterprise customer's security scan as a
security violation.
Solution: Remove the extra space while creating the secret key.
Upstream Patch: https://review.gluster.org/#/c/glusterfs/+/22673/
>fixes: bz#1679401
>Change-Id: I92ba7e25aaa5123dae9ebe2f3c68d14315aa5f0e
>Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
BUG: 1671862
Change-Id: I11e90c00a14a301a5d95e14b5e8984867e6ff893
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169870
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Sunil Kumar Heggodu Gopala Acharya <sheggodu@redhat.com>
---
geo-replication/src/peer_gsec_create.in | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/geo-replication/src/peer_gsec_create.in b/geo-replication/src/peer_gsec_create.in
index 05c1638..6d4a484 100755
--- a/geo-replication/src/peer_gsec_create.in
+++ b/geo-replication/src/peer_gsec_create.in
@@ -18,7 +18,7 @@ if [ "Xcontainer" = "X$1" ]; then
output1=`cat "$GLUSTERD_WORKDIR"/geo-replication/secret.pem.pub`
output2=`cat "$GLUSTERD_WORKDIR"/geo-replication/tar_ssh.pem.pub`
else
- output1=`echo command=\"${libexecdir}/glusterfs/gsyncd\" " "``cat "$GLUSTERD_WORKDIR"/geo-replication/secret.pem.pub`
- output2=`echo command=\"tar \$\{SSH_ORIGINAL_COMMAND#* \}\" " "``cat "$GLUSTERD_WORKDIR"/geo-replication/tar_ssh.pem.pub`
+ output1=`echo command=\"${libexecdir}/glusterfs/gsyncd\" ""``cat "$GLUSTERD_WORKDIR"/geo-replication/secret.pem.pub`
+ output2=`echo command=\"tar \$\{SSH_ORIGINAL_COMMAND#* \}\" ""``cat "$GLUSTERD_WORKDIR"/geo-replication/tar_ssh.pem.pub`
fi
echo -e "$output1\n$output2"
--
1.8.3.1
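The sketch below reproduces the formatting difference: the old shell code emitted a quoted empty-space argument after the command="..." prefix, so the key line contained two spaces before "ssh-rsa". The prefix and key strings here are dummies standing in for the real gsyncd command restriction and secret.pem.pub contents.

    #include <stdio.h>

    int main(void)
    {
        const char *prefix = "command=\"/usr/libexec/glusterfs/gsyncd\"";
        const char *pubkey = "ssh-rsa AAAAB3NzaC1yc2E... root@node1";

        /* old: the extra quoted field yields "  ssh-rsa" (two spaces) */
        printf("%s %s %s\n", prefix, "", pubkey);

        /* fixed: exactly one space between the options and the key type */
        printf("%s %s\n", prefix, pubkey);
        return 0;
    }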

View File

@ -0,0 +1,51 @@
From c63346dab3e5da0605bf4ddaa314253f42892c9d Mon Sep 17 00:00:00 2001
From: Atin Mukherjee <amukherj@redhat.com>
Date: Wed, 8 May 2019 12:13:59 +0530
Subject: [PATCH 136/141] glusterd: fix inconsistent global option output in
volume get
`volume get all all | grep <key>` and `volume get <volname> all | grep <key>`
dump two different output values for cluster.brick-multiplex and
cluster.server-quorum-ratio.
>upstream patch : https://review.gluster.org/#/c/glusterfs/+/22680/
>Fixes: bz#1707700
>Change-Id: Id131734e0502aa514b84768cf67fce3c22364eae
>Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
BUG: 1706776
Change-Id: Id131734e0502aa514b84768cf67fce3c22364eae
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169948
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Sunil Kumar Heggodu Gopala Acharya <sheggodu@redhat.com>
---
xlators/mgmt/glusterd/src/glusterd-volume-set.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/xlators/mgmt/glusterd/src/glusterd-volume-set.c b/xlators/mgmt/glusterd/src/glusterd-volume-set.c
index e52de20..4b32fb6 100644
--- a/xlators/mgmt/glusterd/src/glusterd-volume-set.c
+++ b/xlators/mgmt/glusterd/src/glusterd-volume-set.c
@@ -2906,7 +2906,7 @@ struct volopt_map_entry glusterd_volopt_map[] = {
.op_version = 1},
{.key = GLUSTERD_QUORUM_RATIO_KEY,
.voltype = "mgmt/glusterd",
- .value = "0",
+ .value = "51",
.op_version = 1},
/* changelog translator - global tunables */
{.key = "changelog.changelog",
@@ -3547,7 +3547,7 @@ struct volopt_map_entry glusterd_volopt_map[] = {
/* Brick multiplexing options */
{.key = GLUSTERD_BRICK_MULTIPLEX_KEY,
.voltype = "mgmt/glusterd",
- .value = "off",
+ .value = "disable",
.op_version = GD_OP_VERSION_3_10_0,
.validate_fn = validate_boolean,
.type = GLOBAL_DOC,
--
1.8.3.1
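One way to see why the .value strings matter: "volume get <volname> all" falls back to the volopt map default when an option was never explicitly set, while "volume get all all" reports the value glusterd actually applies, and the two agree only if the strings match. A small illustrative C sketch follows; the table and lookup are simplifications, not glusterd's actual structures.

    #include <stdio.h>
    #include <string.h>

    struct volopt { const char *key, *value; };

    /* table defaults, as patched: must match the runtime defaults below */
    static struct volopt map[] = {
        {"cluster.server-quorum-ratio", "51"},
        {"cluster.brick-multiplex", "disable"},
    };

    /* what "volume get all all" would print (runtime defaults) */
    static const char *runtime_default(const char *key)
    {
        if (!strcmp(key, "cluster.server-quorum-ratio"))
            return "51";
        return "disable";
    }

    int main(void)
    {
        for (unsigned i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
            const char *rt = runtime_default(map[i].key);
            printf("%s: map=%s runtime=%s %s\n", map[i].key, map[i].value,
                   rt, strcmp(map[i].value, rt) ? "MISMATCH" : "consistent");
        }
        return 0;
    }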

View File

@ -0,0 +1,160 @@
From 646292b4f73bf1b506d034b85787f794963d7196 Mon Sep 17 00:00:00 2001
From: Mohammed Rafi KC <rkavunga@redhat.com>
Date: Mon, 6 May 2019 23:35:08 +0530
Subject: [PATCH 137/141] shd/glusterd: Serialize shd manager to prevent race
condition
At the time of a glusterd restart, while doing a handshake,
there is a possibility that multiple shd managers might get
executed. Because of this, there is a chance that multiple
shd daemons get spawned during a glusterd restart.
> upstream patch : https://review.gluster.org/#/c/glusterfs/+/22667/
>Change-Id: Ie20798441e07d7d7a93b7d38dfb924cea178a920
>fixes: bz#1707081
>Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
BUG: 1704851
Change-Id: Ie20798441e07d7d7a93b7d38dfb924cea178a920
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169947
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Sunil Kumar Heggodu Gopala Acharya <sheggodu@redhat.com>
---
.../serialize-shd-manager-glusterd-restart.t | 54 ++++++++++++++++++++++
xlators/mgmt/glusterd/src/glusterd-shd-svc.c | 14 ++++++
xlators/mgmt/glusterd/src/glusterd.c | 1 +
xlators/mgmt/glusterd/src/glusterd.h | 3 ++
4 files changed, 72 insertions(+)
create mode 100644 tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t
diff --git a/tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t b/tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t
new file mode 100644
index 0000000..3a27c2a
--- /dev/null
+++ b/tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t
@@ -0,0 +1,54 @@
+#! /bin/bash
+
+. $(dirname $0)/../../include.rc
+. $(dirname $0)/../../cluster.rc
+
+function check_peers {
+count=`$CLI_1 peer status | grep 'Peer in Cluster (Connected)' | wc -l`
+echo $count
+}
+
+function check_shd {
+ps aux | grep $1 | grep glustershd | wc -l
+}
+
+cleanup
+
+
+TEST launch_cluster 6
+
+TESTS_EXPECTED_IN_LOOP=25
+for i in $(seq 2 6); do
+ hostname="H$i"
+ TEST $CLI_1 peer probe ${!hostname}
+done
+
+
+EXPECT_WITHIN $PROBE_TIMEOUT 5 check_peers;
+for i in $(seq 1 5); do
+
+ TEST $CLI_1 volume create ${V0}_$i replica 3 $H1:$B1/${V0}_$i $H2:$B2/${V0}_$i $H3:$B3/${V0}_$i $H4:$B4/${V0}_$i $H5:$B5/${V0}_$i $H6:$B6/${V0}_$i
+ TEST $CLI_1 volume start ${V0}_$i force
+
+done
+
+#kill a node
+TEST kill_node 3
+
+TEST $glusterd_3;
+EXPECT_WITHIN $PROBE_TIMEOUT 5 check_peers
+
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT 1 check_shd $H3
+
+for i in $(seq 1 5); do
+
+ TEST $CLI_1 volume stop ${V0}_$i
+ TEST $CLI_1 volume delete ${V0}_$i
+
+done
+
+for i in $(seq 1 6); do
+ hostname="H$i"
+ EXPECT_WITHIN $PROCESS_DOWN_TIMEOUT 0 check_shd ${!hostname}
+done
+cleanup
diff --git a/xlators/mgmt/glusterd/src/glusterd-shd-svc.c b/xlators/mgmt/glusterd/src/glusterd-shd-svc.c
index a9eab42..75f9a07 100644
--- a/xlators/mgmt/glusterd/src/glusterd-shd-svc.c
+++ b/xlators/mgmt/glusterd/src/glusterd-shd-svc.c
@@ -254,14 +254,26 @@ glusterd_shdsvc_manager(glusterd_svc_t *svc, void *data, int flags)
{
int ret = -1;
glusterd_volinfo_t *volinfo = NULL;
+ glusterd_conf_t *conf = NULL;
+ gf_boolean_t shd_restart = _gf_false;
+ conf = THIS->private;
volinfo = data;
+ GF_VALIDATE_OR_GOTO("glusterd", conf, out);
GF_VALIDATE_OR_GOTO("glusterd", svc, out);
GF_VALIDATE_OR_GOTO("glusterd", volinfo, out);
if (volinfo)
glusterd_volinfo_ref(volinfo);
+ while (conf->restart_shd) {
+ synclock_unlock(&conf->big_lock);
+ sleep(2);
+ synclock_lock(&conf->big_lock);
+ }
+ conf->restart_shd = _gf_true;
+ shd_restart = _gf_true;
+
ret = glusterd_shdsvc_create_volfile(volinfo);
if (ret)
goto out;
@@ -310,6 +322,8 @@ glusterd_shdsvc_manager(glusterd_svc_t *svc, void *data, int flags)
}
}
out:
+ if (shd_restart)
+ conf->restart_shd = _gf_false;
if (volinfo)
glusterd_volinfo_unref(volinfo);
if (ret)
diff --git a/xlators/mgmt/glusterd/src/glusterd.c b/xlators/mgmt/glusterd/src/glusterd.c
index c0973cb..6d7dd4a 100644
--- a/xlators/mgmt/glusterd/src/glusterd.c
+++ b/xlators/mgmt/glusterd/src/glusterd.c
@@ -1819,6 +1819,7 @@ init(xlator_t *this)
conf->rpc = rpc;
conf->uds_rpc = uds_rpc;
conf->gfs_mgmt = &gd_brick_prog;
+ conf->restart_shd = _gf_false;
this->private = conf;
/* conf->workdir and conf->rundir are smaller than PATH_MAX; gcc's
* snprintf checking will throw an error here if sprintf is used.
diff --git a/xlators/mgmt/glusterd/src/glusterd.h b/xlators/mgmt/glusterd/src/glusterd.h
index bd9f509..2ea8560 100644
--- a/xlators/mgmt/glusterd/src/glusterd.h
+++ b/xlators/mgmt/glusterd/src/glusterd.h
@@ -222,6 +222,9 @@ typedef struct {
gf_atomic_t blockers;
uint32_t mgmt_v3_lock_timeout;
gf_boolean_t restart_bricks;
+ gf_boolean_t restart_shd; /* This flag prevents running two shd manager
+ simultaneously
+ */
pthread_mutex_t attach_lock; /* Lock can be per process or a common one */
pthread_mutex_t volume_lock; /* We release the big_lock from lot of places
which might lead the modification of volinfo
--
1.8.3.1
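The serialization the patch adds boils down to a flag guarded by glusterd's big lock: a manager that finds the flag set drops the lock, sleeps, and retries, so only one shd (re)start runs at a time. Below is a runnable pthread sketch of that pattern; the names are simplified stand-ins for conf->restart_shd and the big lock, not glusterd code.

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
    static int restart_shd; /* conf->restart_shd in the real code */

    static void shd_manager(int id)
    {
        pthread_mutex_lock(&big_lock);
        while (restart_shd) {                /* someone else is mid-restart: */
            pthread_mutex_unlock(&big_lock); /* drop the lock ...           */
            usleep(100 * 1000);              /* ... and retry, as the patch */
            pthread_mutex_lock(&big_lock);   /* does with sleep(2)          */
        }
        restart_shd = 1;
        pthread_mutex_unlock(&big_lock);

        printf("manager %d: (re)starting shd alone\n", id);

        pthread_mutex_lock(&big_lock);
        restart_shd = 0; /* done: let the next waiter through */
        pthread_mutex_unlock(&big_lock);
    }

    static void *worker(void *arg)
    {
        shd_manager((int)(long)arg);
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, (void *)1L);
        pthread_create(&t2, NULL, worker, (void *)2L);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }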

View File

@ -0,0 +1,64 @@
From d08083d057d6cc7136128cad6ecefba43b886c4c Mon Sep 17 00:00:00 2001
From: Vishal Pandey <vpandey@redhat.com>
Date: Thu, 9 May 2019 14:37:22 +0530
Subject: [PATCH 138/141] glusterd: Add gluster volume stop operation to
glusterd_validate_quorum()
ISSUE: gluster volume stop succeeds even if quorum is not met.
Fix: Add GD_OP_STOP_VOLUME to glusterd_validate_quorum() in
glusterd_mgmt_v3_pre_validate().
Since the volume stop command has been ported from synctask to mgmt_v3,
the quorum check was missed out.
>upstream patch : https://review.gluster.org/#/c/glusterfs/+/22692/
>Change-Id: I7a634ad89ec2e286ea262d7952061efad5360042
>fixes: bz#1690753
>Signed-off-by: Vishal Pandey <vpandey@redhat.com>
BUG: 1706893
Change-Id: I7a634ad89ec2e286ea262d7952061efad5360042
Signed-off-by: Vishal Pandey <vpandey@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169949
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Sunil Kumar Heggodu Gopala Acharya <sheggodu@redhat.com>
---
tests/bugs/glusterd/quorum-validation.t | 4 +++-
xlators/mgmt/glusterd/src/glusterd-mgmt.c | 2 +-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/tests/bugs/glusterd/quorum-validation.t b/tests/bugs/glusterd/quorum-validation.t
index 05aef4e..ff46729 100644
--- a/tests/bugs/glusterd/quorum-validation.t
+++ b/tests/bugs/glusterd/quorum-validation.t
@@ -34,9 +34,11 @@ TEST ! $CLI_1 volume add-brick $V0 $H1:$B1/${V0}2
TEST ! $CLI_1 volume remove-brick $V0 $H1:$B1/${V0}0 start
TEST ! $CLI_1 volume set $V0 barrier enable
-# Now execute a command which goes through op state machine and it should fail
TEST ! $CLI_1 volume profile $V0 start
+#bug-1690753 - Volume stop when quorum not met is successful
+TEST ! $CLI_1 volume stop $V0
+
#Bring back the 2nd glusterd
TEST $glusterd_2
diff --git a/xlators/mgmt/glusterd/src/glusterd-mgmt.c b/xlators/mgmt/glusterd/src/glusterd-mgmt.c
index 61ad66e..ec78913 100644
--- a/xlators/mgmt/glusterd/src/glusterd-mgmt.c
+++ b/xlators/mgmt/glusterd/src/glusterd-mgmt.c
@@ -1059,7 +1059,7 @@ glusterd_mgmt_v3_pre_validate(glusterd_op_t op, dict_t *req_dict,
goto out;
}
- if (op == GD_OP_PROFILE_VOLUME) {
+ if (op == GD_OP_PROFILE_VOLUME || op == GD_OP_STOP_VOLUME) {
ret = glusterd_validate_quorum(this, op, req_dict, op_errstr);
if (ret) {
gf_msg(this->name, GF_LOG_ERROR, 0, GD_MSG_SERVER_QUORUM_NOT_MET,
--
1.8.3.1
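The fix itself is a one-line extension of the op whitelist that triggers quorum validation during pre-validation. A trivial sketch of that gate follows; the enum values are made up for illustration, only the two op names come from the patch.

    #include <stdio.h>

    typedef enum {
        GD_OP_PROFILE_VOLUME,
        GD_OP_STOP_VOLUME,
        GD_OP_START_VOLUME,
    } glusterd_op_t;

    static int needs_quorum_check(glusterd_op_t op)
    {
        return op == GD_OP_PROFILE_VOLUME || op == GD_OP_STOP_VOLUME;
    }

    int main(void)
    {
        printf("stop volume checks quorum: %s\n",
               needs_quorum_check(GD_OP_STOP_VOLUME) ? "yes" : "no");
        return 0;
    }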

View File

@ -0,0 +1,300 @@
From edc238e40060773f5f5fd59fcdad8ae27d65749f Mon Sep 17 00:00:00 2001
From: Mohammed Rafi KC <rkavunga@redhat.com>
Date: Mon, 29 Apr 2019 13:22:32 +0530
Subject: [PATCH 139/141] ec/shd: Cleanup self heal daemon resources during ec
fini
We were not properly cleaning up self-heal daemon resources
during ec fini. With shd multiplexing, it is absolutely
necessary to clean up all the resources during ec fini.
Back port of
upstream patch: https://review.gluster.org/#/c/glusterfs/+/22644/
>Change-Id: Iae4f1bce7d8c2e1da51ac568700a51088f3cc7f2
>fixes: bz#1703948
>Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
BUG: 1703434
Change-Id: I98ae03178d3176772c62e34baa08a5c35b8f7217
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169994
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Sunil Kumar Heggodu Gopala Acharya <sheggodu@redhat.com>
---
libglusterfs/src/syncop-utils.c | 2 +
xlators/cluster/afr/src/afr-self-heald.c | 5 +++
xlators/cluster/ec/src/ec-heald.c | 77 +++++++++++++++++++++++++++-----
xlators/cluster/ec/src/ec-heald.h | 3 ++
xlators/cluster/ec/src/ec-messages.h | 3 +-
xlators/cluster/ec/src/ec.c | 47 +++++++++++++++++++
6 files changed, 124 insertions(+), 13 deletions(-)
diff --git a/libglusterfs/src/syncop-utils.c b/libglusterfs/src/syncop-utils.c
index b842142..4167db4 100644
--- a/libglusterfs/src/syncop-utils.c
+++ b/libglusterfs/src/syncop-utils.c
@@ -354,6 +354,8 @@ syncop_mt_dir_scan(call_frame_t *frame, xlator_t *subvol, loc_t *loc, int pid,
if (frame) {
this = frame->this;
+ } else {
+ this = THIS;
}
/*For this functionality to be implemented in general, we need
diff --git a/xlators/cluster/afr/src/afr-self-heald.c b/xlators/cluster/afr/src/afr-self-heald.c
index 8bc4720..522fe5d 100644
--- a/xlators/cluster/afr/src/afr-self-heald.c
+++ b/xlators/cluster/afr/src/afr-self-heald.c
@@ -524,6 +524,11 @@ afr_shd_full_heal(xlator_t *subvol, gf_dirent_t *entry, loc_t *parent,
afr_private_t *priv = NULL;
priv = this->private;
+
+ if (this->cleanup_starting) {
+ return -ENOTCONN;
+ }
+
if (!priv->shd.enabled)
return -EBUSY;
diff --git a/xlators/cluster/ec/src/ec-heald.c b/xlators/cluster/ec/src/ec-heald.c
index cba111a..edf5e11 100644
--- a/xlators/cluster/ec/src/ec-heald.c
+++ b/xlators/cluster/ec/src/ec-heald.c
@@ -71,6 +71,11 @@ disabled_loop:
break;
}
+ if (ec->shutdown) {
+ healer->running = _gf_false;
+ return -1;
+ }
+
ret = healer->rerun;
healer->rerun = 0;
@@ -241,9 +246,11 @@ ec_shd_index_sweep(struct subvol_healer *healer)
goto out;
}
+ _mask_cancellation();
ret = syncop_mt_dir_scan(NULL, subvol, &loc, GF_CLIENT_PID_SELF_HEALD,
healer, ec_shd_index_heal, xdata,
ec->shd.max_threads, ec->shd.wait_qlength);
+ _unmask_cancellation();
out:
if (xdata)
dict_unref(xdata);
@@ -263,6 +270,11 @@ ec_shd_full_heal(xlator_t *subvol, gf_dirent_t *entry, loc_t *parent,
int ret = 0;
ec = this->private;
+
+ if (this->cleanup_starting) {
+ return -ENOTCONN;
+ }
+
if (ec->xl_up_count <= ec->fragments) {
return -ENOTCONN;
}
@@ -305,11 +317,15 @@ ec_shd_full_sweep(struct subvol_healer *healer, inode_t *inode)
{
ec_t *ec = NULL;
loc_t loc = {0};
+ int ret = -1;
ec = healer->this->private;
loc.inode = inode;
- return syncop_ftw(ec->xl_list[healer->subvol], &loc,
- GF_CLIENT_PID_SELF_HEALD, healer, ec_shd_full_heal);
+ _mask_cancellation();
+ ret = syncop_ftw(ec->xl_list[healer->subvol], &loc,
+ GF_CLIENT_PID_SELF_HEALD, healer, ec_shd_full_heal);
+ _unmask_cancellation();
+ return ret;
}
void *
@@ -317,13 +333,16 @@ ec_shd_index_healer(void *data)
{
struct subvol_healer *healer = NULL;
xlator_t *this = NULL;
+ int run = 0;
healer = data;
THIS = this = healer->this;
ec_t *ec = this->private;
for (;;) {
- ec_shd_healer_wait(healer);
+ run = ec_shd_healer_wait(healer);
+ if (run == -1)
+ break;
if (ec->xl_up_count > ec->fragments) {
gf_msg_debug(this->name, 0, "starting index sweep on subvol %s",
@@ -352,16 +371,12 @@ ec_shd_full_healer(void *data)
rootloc.inode = this->itable->root;
for (;;) {
- pthread_mutex_lock(&healer->mutex);
- {
- run = __ec_shd_healer_wait(healer);
- if (!run)
- healer->running = _gf_false;
- }
- pthread_mutex_unlock(&healer->mutex);
-
- if (!run)
+ run = ec_shd_healer_wait(healer);
+ if (run < 0) {
break;
+ } else if (run == 0) {
+ continue;
+ }
if (ec->xl_up_count > ec->fragments) {
gf_msg(this->name, GF_LOG_INFO, 0, EC_MSG_FULL_SWEEP_START,
@@ -562,3 +577,41 @@ out:
dict_del(output, this->name);
return ret;
}
+
+void
+ec_destroy_healer_object(xlator_t *this, struct subvol_healer *healer)
+{
+ if (!healer)
+ return;
+
+ pthread_cond_destroy(&healer->cond);
+ pthread_mutex_destroy(&healer->mutex);
+}
+
+void
+ec_selfheal_daemon_fini(xlator_t *this)
+{
+ struct subvol_healer *healer = NULL;
+ ec_self_heald_t *shd = NULL;
+ ec_t *priv = NULL;
+ int i = 0;
+
+ priv = this->private;
+ if (!priv)
+ return;
+
+ shd = &priv->shd;
+ if (!shd->iamshd)
+ return;
+
+ for (i = 0; i < priv->nodes; i++) {
+ healer = &shd->index_healers[i];
+ ec_destroy_healer_object(this, healer);
+
+ healer = &shd->full_healers[i];
+ ec_destroy_healer_object(this, healer);
+ }
+
+ GF_FREE(shd->index_healers);
+ GF_FREE(shd->full_healers);
+}
diff --git a/xlators/cluster/ec/src/ec-heald.h b/xlators/cluster/ec/src/ec-heald.h
index 2eda2a7..8184cf4 100644
--- a/xlators/cluster/ec/src/ec-heald.h
+++ b/xlators/cluster/ec/src/ec-heald.h
@@ -24,4 +24,7 @@ ec_selfheal_daemon_init(xlator_t *this);
void
ec_shd_index_healer_wake(ec_t *ec);
+void
+ec_selfheal_daemon_fini(xlator_t *this);
+
#endif /* __EC_HEALD_H__ */
diff --git a/xlators/cluster/ec/src/ec-messages.h b/xlators/cluster/ec/src/ec-messages.h
index 7c28808..ce299bb 100644
--- a/xlators/cluster/ec/src/ec-messages.h
+++ b/xlators/cluster/ec/src/ec-messages.h
@@ -55,6 +55,7 @@ GLFS_MSGID(EC, EC_MSG_INVALID_CONFIG, EC_MSG_HEAL_FAIL,
EC_MSG_CONFIG_XATTR_INVALID, EC_MSG_EXTENSION, EC_MSG_EXTENSION_NONE,
EC_MSG_EXTENSION_UNKNOWN, EC_MSG_EXTENSION_UNSUPPORTED,
EC_MSG_EXTENSION_FAILED, EC_MSG_NO_GF, EC_MSG_MATRIX_FAILED,
- EC_MSG_DYN_CREATE_FAILED, EC_MSG_DYN_CODEGEN_FAILED);
+ EC_MSG_DYN_CREATE_FAILED, EC_MSG_DYN_CODEGEN_FAILED,
+ EC_MSG_THREAD_CLEANUP_FAILED);
#endif /* !_EC_MESSAGES_H_ */
diff --git a/xlators/cluster/ec/src/ec.c b/xlators/cluster/ec/src/ec.c
index 3c8013e..264582a 100644
--- a/xlators/cluster/ec/src/ec.c
+++ b/xlators/cluster/ec/src/ec.c
@@ -429,6 +429,51 @@ ec_disable_delays(ec_t *ec)
}
void
+ec_cleanup_healer_object(ec_t *ec)
+{
+ struct subvol_healer *healer = NULL;
+ ec_self_heald_t *shd = NULL;
+ void *res = NULL;
+ int i = 0;
+ gf_boolean_t is_join = _gf_false;
+
+ shd = &ec->shd;
+ if (!shd->iamshd)
+ return;
+
+ for (i = 0; i < ec->nodes; i++) {
+ healer = &shd->index_healers[i];
+ pthread_mutex_lock(&healer->mutex);
+ {
+ healer->rerun = 1;
+ if (healer->running) {
+ pthread_cond_signal(&healer->cond);
+ is_join = _gf_true;
+ }
+ }
+ pthread_mutex_unlock(&healer->mutex);
+ if (is_join) {
+ pthread_join(healer->thread, &res);
+ is_join = _gf_false;
+ }
+
+ healer = &shd->full_healers[i];
+ pthread_mutex_lock(&healer->mutex);
+ {
+ healer->rerun = 1;
+ if (healer->running) {
+ pthread_cond_signal(&healer->cond);
+ is_join = _gf_true;
+ }
+ }
+ pthread_mutex_unlock(&healer->mutex);
+ if (is_join) {
+ pthread_join(healer->thread, &res);
+ is_join = _gf_false;
+ }
+ }
+}
+void
ec_pending_fops_completed(ec_t *ec)
{
if (ec->shutdown) {
@@ -544,6 +589,7 @@ ec_notify(xlator_t *this, int32_t event, void *data, void *data2)
/* If there aren't pending fops running after we have waken up
* them, we immediately propagate the notification. */
propagate = ec_disable_delays(ec);
+ ec_cleanup_healer_object(ec);
goto unlock;
}
@@ -759,6 +805,7 @@ failed:
void
fini(xlator_t *this)
{
+ ec_selfheal_daemon_fini(this);
__ec_destroy_private(this);
}
--
1.8.3.1
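The teardown the patch implements for each healer — wake the thread under its mutex, join it, then destroy its condition variable and mutex — is a standard pthread shutdown sequence. A runnable, simplified sketch follows; the healer struct is a stand-in for struct subvol_healer, and the flags mimic ec->shutdown and healer->rerun.

    #include <pthread.h>
    #include <stdio.h>

    struct healer {
        pthread_t thread;
        pthread_mutex_t mutex;
        pthread_cond_t cond;
        int rerun;
        int running;
        int shutdown; /* ec->shutdown in the real code */
    };

    static void *healer_loop(void *arg)
    {
        struct healer *h = arg;
        pthread_mutex_lock(&h->mutex);
        for (;;) {
            while (!h->rerun) /* ec_shd_healer_wait() */
                pthread_cond_wait(&h->cond, &h->mutex);
            h->rerun = 0;
            if (h->shutdown) { /* patched wait returns -1 here */
                h->running = 0;
                break;
            }
            /* ... an index/full sweep would run here ... */
        }
        pthread_mutex_unlock(&h->mutex);
        return NULL;
    }

    int main(void)
    {
        struct healer h = {.rerun = 0, .running = 1, .shutdown = 0};
        pthread_mutex_init(&h.mutex, NULL);
        pthread_cond_init(&h.cond, NULL);
        pthread_create(&h.thread, NULL, healer_loop, &h);

        /* fini path: set shutdown, wake the healer, join, destroy */
        pthread_mutex_lock(&h.mutex);
        h.shutdown = 1;
        h.rerun = 1;
        pthread_cond_signal(&h.cond);
        pthread_mutex_unlock(&h.mutex);

        pthread_join(h.thread, NULL);
        pthread_cond_destroy(&h.cond);  /* ec_destroy_healer_object() */
        pthread_mutex_destroy(&h.mutex);
        printf("healer stopped cleanly\n");
        return 0;
    }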

View File

@ -0,0 +1,40 @@
From 40bd6e9c186adb427e136a84eaab631e6a6f5263 Mon Sep 17 00:00:00 2001
From: Pranith Kumar K <pkarampu@redhat.com>
Date: Sun, 5 May 2019 21:17:24 +0530
Subject: [PATCH 140/141] cluster/ec: Reopen shouldn't happen with O_TRUNC
Problem:
Doing a re-open with O_TRUNC will truncate the fragment even when it is
not needed, requiring extra heals.
Fix:
At the time of re-open don't use O_TRUNC.
Upstream-patch: https://review.gluster.org/c/glusterfs/+/22660/
fixes: bz#1706549
Change-Id: Idc6408968efaad897b95a5a52481c66e843d3fb8
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/169982
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Sunil Kumar Heggodu Gopala Acharya <sheggodu@redhat.com>
---
xlators/cluster/ec/src/ec-common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xlators/cluster/ec/src/ec-common.c b/xlators/cluster/ec/src/ec-common.c
index 1454ae2..b1ba5e9 100644
--- a/xlators/cluster/ec/src/ec-common.c
+++ b/xlators/cluster/ec/src/ec-common.c
@@ -128,7 +128,7 @@ ec_fix_open(ec_fop_data_t *fop, uintptr_t mask)
} else {
ec_open(fop->frame, fop->xl, need_open,
EC_MINIMUM_ONE | EC_FOP_NO_PROPAGATE_ERROR, NULL, NULL, &loc,
- fop->fd->flags, fop->fd, NULL);
+ fop->fd->flags & (~O_TRUNC), fop->fd, NULL);
}
out:
--
1.8.3.1
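A tiny standalone demonstration of the flag masking (the file path is illustrative): re-opening with the original flags minus O_TRUNC preserves the existing contents, which is exactly what the one-line change above does for the fragment.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int flags = O_WRONLY | O_CREAT | O_TRUNC; /* flags of the first open */
        int fd = open("/tmp/ec-frag-demo", flags, 0600);
        if (fd >= 0) {
            if (write(fd, "data", 4) != 4)
                perror("write");
            close(fd);
        }

        /* re-open: same flags minus O_TRUNC keeps the contents intact */
        fd = open("/tmp/ec-frag-demo", flags & ~O_TRUNC, 0600);
        if (fd >= 0) {
            printf("size after reopen: %lld\n",
                   (long long)lseek(fd, 0, SEEK_END)); /* 4, not 0 */
            close(fd);
        }
        return 0;
    }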

View File

@ -0,0 +1,295 @@
From e3020e43344ddbc32e62e06bbbf88a4f5d7cdc82 Mon Sep 17 00:00:00 2001
From: Mohit Agrawal <moagrawa@redhat.com>
Date: Fri, 10 May 2019 11:13:45 +0530
Subject: [PATCH 141/141] socket/ssl: fix crl handling
Problem:
Just setting the path to the CRL directory in socket_init() wasn't working.
Solution:
We need to use a special API to retrieve the X509_VERIFY_PARAM and set
the CRL checking flags on it explicitly.
Also, setting the CRL checking flags is a big pain, since the connection
is declared failed if a suitable CRL isn't found in the designated file
or directory. A comment has been added to the code accordingly.
> Change-Id: I8a8ed2ddaf4b5eb974387d2f7b1a85c1ca39fe79
> fixes: bz#1687326
> Signed-off-by: Milind Changire <mchangir@redhat.com>
> (Cherry pick from commit 06fa261207f0f0625c52fa977b96e5875e9a91e0)
> (Reviewed on upstream link https://review.gluster.org/#/c/glusterfs/+/22334/)
Change-Id: I0958e9890035fd376f1e1eafc1452caf3edd184b
BUG: 1583585
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/166458
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Sunil Kumar Heggodu Gopala Acharya <sheggodu@redhat.com>
---
configure.ac | 2 +
rpc/rpc-transport/socket/src/socket.c | 110 ++++++++++++++++++++++++++++------
rpc/rpc-transport/socket/src/socket.h | 2 +
tests/features/ssl-ciphers.t | 13 +++-
4 files changed, 107 insertions(+), 20 deletions(-)
diff --git a/configure.ac b/configure.ac
index 3065077..0e11d4c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -491,6 +491,8 @@ AC_CHECK_HEADERS([openssl/dh.h])
AC_CHECK_HEADERS([openssl/ecdh.h])
+AC_CHECK_LIB([ssl], [SSL_CTX_get0_param], [AC_DEFINE([HAVE_SSL_CTX_GET0_PARAM], [1], [define if found OpenSSL SSL_CTX_get0_param])])
+
dnl Math library
AC_CHECK_LIB([m], [pow], [MATH_LIB='-lm'], [MATH_LIB=''])
AC_SUBST(MATH_LIB)
diff --git a/rpc/rpc-transport/socket/src/socket.c b/rpc/rpc-transport/socket/src/socket.c
index f6de1d3..bf2fa71 100644
--- a/rpc/rpc-transport/socket/src/socket.c
+++ b/rpc/rpc-transport/socket/src/socket.c
@@ -308,8 +308,65 @@ out:
#define ssl_write_one(t, b, l) \
ssl_do((t), (b), (l), (SSL_trinary_func *)SSL_write)
+/* set crl verify flags only for server */
+/* see man X509_VERIFY_PARAM_SET_FLAGS(3)
+ * X509_V_FLAG_CRL_CHECK enables CRL checking for the certificate chain
+ * leaf certificate. An error occurs if a suitable CRL cannot be found.
+ * Since we're never going to revoke a gluster node cert, we better disable
+ * CRL check for server certs to avoid getting error and failed connection
+ * attempts.
+ */
+static void
+ssl_clear_crl_verify_flags(SSL_CTX *ssl_ctx)
+{
+#ifdef X509_V_FLAG_CRL_CHECK_ALL
+#ifdef HAVE_SSL_CTX_GET0_PARAM
+ X509_VERIFY_PARAM *vpm;
+
+ vpm = SSL_CTX_get0_param(ssl_ctx);
+ if (vpm) {
+ X509_VERIFY_PARAM_clear_flags(
+ vpm, (X509_V_FLAG_CRL_CHECK | X509_V_FLAG_CRL_CHECK_ALL));
+ }
+#else
+ /* CRL verify flag need not be cleared for rhel6 kind of clients */
+#endif
+#else
+ gf_log(this->name, GF_LOG_ERROR, "OpenSSL version does not support CRL");
+#endif
+ return;
+}
+
+/* set crl verify flags only for server */
+static void
+ssl_set_crl_verify_flags(SSL_CTX *ssl_ctx)
+{
+#ifdef X509_V_FLAG_CRL_CHECK_ALL
+#ifdef HAVE_SSL_CTX_GET0_PARAM
+ X509_VERIFY_PARAM *vpm;
+
+ vpm = SSL_CTX_get0_param(ssl_ctx);
+ if (vpm) {
+ unsigned long flags;
+
+ flags = X509_VERIFY_PARAM_get_flags(vpm);
+ flags |= (X509_V_FLAG_CRL_CHECK | X509_V_FLAG_CRL_CHECK_ALL);
+ X509_VERIFY_PARAM_set_flags(vpm, flags);
+ }
+#else
+ X509_STORE *x509store;
+
+ x509store = SSL_CTX_get_cert_store(ssl_ctx);
+ X509_STORE_set_flags(x509store,
+ X509_V_FLAG_CRL_CHECK | X509_V_FLAG_CRL_CHECK_ALL);
+#endif
+#else
+ gf_log(this->name, GF_LOG_ERROR, "OpenSSL version does not support CRL");
+#endif
+}
+
int
-ssl_setup_connection_prefix(rpc_transport_t *this)
+ssl_setup_connection_prefix(rpc_transport_t *this, gf_boolean_t server)
{
int ret = -1;
socket_private_t *priv = NULL;
@@ -332,6 +389,9 @@ ssl_setup_connection_prefix(rpc_transport_t *this)
priv->ssl_accepted = _gf_false;
priv->ssl_context_created = _gf_false;
+ if (!server && priv->crl_path)
+ ssl_clear_crl_verify_flags(priv->ssl_ctx);
+
priv->ssl_ssl = SSL_new(priv->ssl_ctx);
if (!priv->ssl_ssl) {
gf_log(this->name, GF_LOG_ERROR, "SSL_new failed");
@@ -2664,7 +2724,7 @@ ssl_handle_server_connection_attempt(rpc_transport_t *this)
fd = priv->sock;
if (!priv->ssl_context_created) {
- ret = ssl_setup_connection_prefix(this);
+ ret = ssl_setup_connection_prefix(this, _gf_true);
if (ret < 0) {
gf_log(this->name, GF_LOG_TRACE,
"> ssl_setup_connection_prefix() failed!");
@@ -2718,7 +2778,7 @@ ssl_handle_client_connection_attempt(rpc_transport_t *this)
ret = -1;
} else {
if (!priv->ssl_context_created) {
- ret = ssl_setup_connection_prefix(this);
+ ret = ssl_setup_connection_prefix(this, _gf_false);
if (ret < 0) {
gf_log(this->name, GF_LOG_TRACE,
"> ssl_setup_connection_prefix() "
@@ -3085,7 +3145,30 @@ socket_server_event_handler(int fd, int idx, int gen, void *data, int poll_in,
gf_log(this->name, GF_LOG_TRACE, "XXX server:%s, client:%s",
new_trans->myinfo.identifier, new_trans->peerinfo.identifier);
+ /* Make options available to local socket_init() to create new
+ * SSL_CTX per transport. A separate SSL_CTX per transport is
+ * required to avoid setting crl checking options for client
+ * connections. The verification options eventually get copied
+ * to the SSL object. Unfortunately, there's no way to identify
+ * whether socket_init() is being called after a client-side
+ * connect() or a server-side accept(). Although, we could pass
+ * a flag from the transport init() to the socket_init() and
+ * from this place, this doesn't identify the case where the
+ * server-side transport loading is done for the first time.
+ * Also, SSL doesn't apply for UNIX sockets.
+ */
+ if (new_sockaddr.ss_family != AF_UNIX)
+ new_trans->options = dict_ref(this->options);
+ new_trans->ctx = this->ctx;
+
ret = socket_init(new_trans);
+
+ /* reset options to NULL to avoid double free */
+ if (new_sockaddr.ss_family != AF_UNIX) {
+ dict_unref(new_trans->options);
+ new_trans->options = NULL;
+ }
+
if (ret != 0) {
gf_log(this->name, GF_LOG_WARNING,
"initialization of new_trans "
@@ -4150,7 +4233,6 @@ ssl_setup_connection_params(rpc_transport_t *this)
char *cipher_list = DEFAULT_CIPHER_LIST;
char *dh_param = DEFAULT_DH_PARAM;
char *ec_curve = DEFAULT_EC_CURVE;
- char *crl_path = NULL;
priv = this->private;
@@ -4192,6 +4274,7 @@ ssl_setup_connection_params(rpc_transport_t *this)
}
priv->ssl_ca_list = gf_strdup(priv->ssl_ca_list);
+ optstr = NULL;
if (dict_get_str(this->options, SSL_CRL_PATH_OPT, &optstr) == 0) {
if (!priv->ssl_enabled) {
gf_log(this->name, GF_LOG_WARNING,
@@ -4199,9 +4282,9 @@ ssl_setup_connection_params(rpc_transport_t *this)
SSL_ENABLED_OPT);
}
if (strcasecmp(optstr, "NULL") == 0)
- crl_path = NULL;
+ priv->crl_path = NULL;
else
- crl_path = optstr;
+ priv->crl_path = gf_strdup(optstr);
}
gf_log(this->name, priv->ssl_enabled ? GF_LOG_INFO : GF_LOG_DEBUG,
@@ -4343,24 +4426,15 @@ ssl_setup_connection_params(rpc_transport_t *this)
}
if (!SSL_CTX_load_verify_locations(priv->ssl_ctx, priv->ssl_ca_list,
- crl_path)) {
+ priv->crl_path)) {
gf_log(this->name, GF_LOG_ERROR, "could not load CA list");
goto err;
}
SSL_CTX_set_verify_depth(priv->ssl_ctx, cert_depth);
- if (crl_path) {
-#ifdef X509_V_FLAG_CRL_CHECK_ALL
- X509_STORE *x509store;
-
- x509store = SSL_CTX_get_cert_store(priv->ssl_ctx);
- X509_STORE_set_flags(
- x509store, X509_V_FLAG_CRL_CHECK | X509_V_FLAG_CRL_CHECK_ALL);
-#else
- gf_log(this->name, GF_LOG_ERROR,
- "OpenSSL version does not support CRL");
-#endif
+ if (priv->crl_path) {
+ ssl_set_crl_verify_flags(priv->ssl_ctx);
}
priv->ssl_session_id = session_id++;
diff --git a/rpc/rpc-transport/socket/src/socket.h b/rpc/rpc-transport/socket/src/socket.h
index e1ccae2..e7c0090 100644
--- a/rpc/rpc-transport/socket/src/socket.h
+++ b/rpc/rpc-transport/socket/src/socket.h
@@ -14,6 +14,7 @@
#include <openssl/ssl.h>
#include <openssl/err.h>
#include <openssl/x509v3.h>
+#include <openssl/x509_vfy.h>
#ifdef HAVE_OPENSSL_DH_H
#include <openssl/dh.h>
#endif
@@ -246,6 +247,7 @@ typedef struct {
char *ssl_own_cert;
char *ssl_private_key;
char *ssl_ca_list;
+ char *crl_path;
int pipe[2];
struct gf_sock_incoming incoming;
/* -1 = not connected. 0 = in progress. 1 = connected */
diff --git a/tests/features/ssl-ciphers.t b/tests/features/ssl-ciphers.t
index 563d37c..7e1e199 100644
--- a/tests/features/ssl-ciphers.t
+++ b/tests/features/ssl-ciphers.t
@@ -175,8 +175,6 @@ BRICK_PORT=`brick_port $V0`
EXPECT "Y" openssl_connect -cipher EECDH -connect $H0:$BRICK_PORT
# test revocation
-# no need to restart the volume since the options are used
-# by the client here.
TEST $CLI volume set $V0 ssl.crl-path $TMPDIR
EXPECT $TMPDIR volume_option $V0 ssl.crl-path
$GFS --volfile-id=$V0 --volfile-server=$H0 $M0
@@ -189,14 +187,25 @@ TEST openssl ca -batch -config $SSL_CFG -revoke $SSL_CERT 2>&1
TEST openssl ca -config $SSL_CFG -gencrl -out $SSL_CRL 2>&1
# Failed once revoked
+# Although client fails to mount without restarting the server after crl-path
+# is set when no actual crl file is found on the client, it would also fail
+# when server is restarted for the same reason. Since the socket initialization
+# code is the same for client and server, the crl verification flags need to
+# be turned off for the client to avoid SSL searching for CRLs in the
+# ssl.crl-path. If no CRL files are found in the ssl.crl-path, SSL fails the
+# connect() attempt on the client.
+TEST $CLI volume stop $V0
+TEST $CLI volume start $V0
$GFS --volfile-id=$V0 --volfile-server=$H0 $M0
EXPECT "N" wait_mount $M0
TEST ! test -f $TEST_FILE
EXPECT_WITHIN $UMOUNT_TIMEOUT "Y" force_umount $M0
# Succeed with CRL disabled
+TEST $CLI volume stop $V0
TEST $CLI volume set $V0 ssl.crl-path NULL
EXPECT NULL volume_option $V0 ssl.crl-path
+TEST $CLI volume start $V0
$GFS --volfile-id=$V0 --volfile-server=$H0 $M0
EXPECT "Y" wait_mount $M0
TEST test -f $TEST_FILE
--
1.8.3.1
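For reference, here is a compilable sketch of the X509_VERIFY_PARAM API the patch moves to (assuming OpenSSL 1.1.x for TLS_method(); link with -lssl -lcrypto). The set path mirrors what the server side does; the clear path mirrors the client-side fix, which avoids failed connects when no CRL files exist in ssl.crl-path.

    #include <openssl/ssl.h>
    #include <openssl/x509_vfy.h>
    #include <stdio.h>

    int main(void)
    {
        SSL_CTX *ctx = SSL_CTX_new(TLS_method());
        if (!ctx)
            return 1;

        X509_VERIFY_PARAM *vpm = SSL_CTX_get0_param(ctx);

        /* server side: enable CRL checks for the whole chain */
        unsigned long flags = X509_VERIFY_PARAM_get_flags(vpm);
        flags |= X509_V_FLAG_CRL_CHECK | X509_V_FLAG_CRL_CHECK_ALL;
        X509_VERIFY_PARAM_set_flags(vpm, flags);

        /* client side (per the patch): clear them again, since a client
         * with no CRL files in ssl.crl-path would fail every connect */
        X509_VERIFY_PARAM_clear_flags(
            vpm, X509_V_FLAG_CRL_CHECK | X509_V_FLAG_CRL_CHECK_ALL);

        printf("CRL flags now: 0x%lx\n", X509_VERIFY_PARAM_get_flags(vpm));
        SSL_CTX_free(ctx);
        return 0;
    }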

View File

@ -231,7 +231,7 @@ Release: 0.1%{?prereltag:.%{prereltag}}%{?dist}
%else
Name: glusterfs
Version: 6.0
-Release: 2%{?dist}
+Release: 3%{?dist}
ExcludeArch: i686
%endif
License: GPLv2 or LGPLv3+
@ -430,6 +430,23 @@ Patch0121: 0121-spec-glusterfs-devel-for-client-build-should-not-dep.patch
Patch0122: 0122-posix-ctime-Fix-stat-time-attributes-inconsistency-d.patch
Patch0123: 0123-ctime-Fix-log-repeated-logging-during-open.patch
Patch0124: 0124-spec-remove-duplicate-references-to-files.patch
+Patch0125: 0125-glusterd-define-dumpops-in-the-xlator_api-of-gluster.patch
+Patch0126: 0126-cluster-dht-refactor-dht-lookup-functions.patch
+Patch0127: 0127-cluster-dht-Refactor-dht-lookup-functions.patch
+Patch0128: 0128-glusterd-Fix-bulkvoldict-thread-logic-in-brick-multi.patch
+Patch0129: 0129-core-handle-memory-accounting-correctly.patch
+Patch0130: 0130-tier-test-new-tier-cmds.t-fails-after-a-glusterd-res.patch
+Patch0131: 0131-tests-dht-Test-that-lookups-are-sent-post-brick-up.patch
+Patch0132: 0132-glusterd-remove-duplicate-occurrence-of-features.sel.patch
+Patch0133: 0133-glusterd-enable-fips-mode-rchecksum-for-new-volumes.patch
+Patch0134: 0134-performance-write-behind-remove-request-from-wip-lis.patch
+Patch0135: 0135-geo-rep-fix-incorrectly-formatted-authorized_keys.patch
+Patch0136: 0136-glusterd-fix-inconsistent-global-option-output-in-vo.patch
+Patch0137: 0137-shd-glusterd-Serialize-shd-manager-to-prevent-race-c.patch
+Patch0138: 0138-glusterd-Add-gluster-volume-stop-operation-to-gluste.patch
+Patch0139: 0139-ec-shd-Cleanup-self-heal-daemon-resources-during-ec-.patch
+Patch0140: 0140-cluster-ec-Reopen-shouldn-t-happen-with-O_TRUNC.patch
+Patch0141: 0141-socket-ssl-fix-crl-handling.patch
%description
GlusterFS is a distributed file-system capable of scaling to several
@ -2132,6 +2149,10 @@ fi
%endif
%changelog
+* Tue May 14 2019 Rinku Kothiya <rkothiya@redhat.com> - 6.0-3
+- fixes bugs bz#1583585 bz#1671862 bz#1702686 bz#1703434 bz#1703753
+bz#1703897 bz#1704562 bz#1704769 bz#1704851 bz#1706683 bz#1706776 bz#1706893
* Thu Apr 25 2019 Milind Changire <mchangir@redhat.com> - 6.0-2
- fixes bugs bz#1471742 bz#1652461 bz#1671862 bz#1676495 bz#1691620
bz#1696334 bz#1696903 bz#1697820 bz#1698436 bz#1698728 bz#1699709 bz#1699835