glusterfs/0555-geo-rep-Improve-handling-of-gfid-mismatches.patch
Gluster Jenkins af562eab56 autobuild v6.0-57
Related: rhbz#2055630
Resolves: bz#1600379 bz#1689375 bz#1782428 bz#1798897 bz#1815462
Resolves: bz#1889966 bz#1891403 bz#1901468 bz#1903911 bz#1908635
Resolves: bz#1917488 bz#1918018 bz#1919132 bz#1925425 bz#1927411
Resolves: bz#1927640 bz#1928676 bz#1942816 bz#1943467 bz#1945143
Resolves: bz#1946171 bz#1957191 bz#1957641
Signed-off-by: Gluster Jenkins <dkhandel+glusterjenkins@redhat.com>
2022-02-28 21:15:34 +02:00

From f2d3866e617d25ea62cda01afddc81ef0db3356e Mon Sep 17 00:00:00 2001
From: Xavi Hernandez <xhernandez@redhat.com>
Date: Tue, 4 May 2021 22:39:03 +0200
Subject: [PATCH 555/584] geo-rep: Improve handling of gfid mismatches

In some circumstances geo-replication can detect mismatching gfids
between primary and secondary. These entries are fixed in an iterative
way, assuming that after a fix, a previously failing entry could
succeed.

Previous code was trying to fix them in a loop that can be executed
up to 10 times. If some entry cannot be fixed after 10 attempts, it's
discarded. These fixes are very slow, so trying to do them many times
causes geo-replication to get out of sync.

To minimize the number of iterations done, this patch checks if the
number of entries and failures remains constant after each iteration.
If they are constant, it means that nothing else can be fixed, so it
makes no sense to do more iterations. This reduces the number of
iterations to 2 or 3 in most of the cases, improving geo-replication
performance.

Backport of:
> Upstream-patch: https://github.com/gluster/glusterfs/pull/2389
> Fixes: #2388
> Change-Id: I6d9a623a60045694e1a832195e1dc1fb9e88ae54
> Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>

BUG: 1957191
Change-Id: I6d9a623a60045694e1a832195e1dc1fb9e88ae54
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/c/rhs-glusterfs/+/244550
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Sunil Kumar Heggodu Gopala Acharya <sheggodu@redhat.com>
---
 geo-replication/syncdaemon/master.py | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

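Note: the early-exit retry strategy described in the commit message can be sketched in isolation as below. This is only an illustration under stated assumptions, not the code in master.py: `fix_until_stalled`, `fix_failures` and `replay_entries` are hypothetical names standing in for the loop, `handle_entry_failures()` and the secondary's `entry_ops()` call, and the cap of 10 is taken from the commit message.

```python
MAX_OE_RETRIES = 10  # retry cap mentioned in the commit message


def fix_until_stalled(entries, failures, fix_failures, replay_entries,
                      max_retries=MAX_OE_RETRIES):
    """Retry entry ops until they succeed, the cap is hit, or no progress.

    fix_failures(failures, entries) and replay_entries(entries) are
    hypothetical callbacks (assumptions for this sketch). The first may
    mutate `entries`; the second returns the list of remaining failures.
    Returns (remaining_failures, iterations_used).
    """
    count = 0
    num_entries = len(entries)
    num_failures = len(failures)
    while failures and count < max_retries:
        count += 1
        fix_failures(failures, entries)
        failures = replay_entries(entries)
        if not failures:
            break  # everything was fixed

        # Same number of entries and failures as the previous round:
        # another attempt would produce the same result, so stop early.
        if num_entries == len(entries) and num_failures == len(failures):
            break

        num_entries = len(entries)
        num_failures = len(failures)
    return failures, count
```

With one fixable and one unfixable failure, this sketch stops after the second round instead of running all ten, which is the behaviour the commit message describes.
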
diff --git a/geo-replication/syncdaemon/master.py b/geo-replication/syncdaemon/master.py
index 98637e7..aef9373 100644
--- a/geo-replication/syncdaemon/master.py
+++ b/geo-replication/syncdaemon/master.py
@@ -1224,9 +1224,11 @@ class GMasterChangelogMixin(GMasterCommon):

             if gconf.get("gfid-conflict-resolution"):
                 count = 0
+                num_entries = len(entries)
+                num_failures = len(failures)
                 if failures:
                     logging.info(lf('Entry ops failed with gfid mismatch',
-                                    count=len(failures)))
+                                    count=num_failures))
                 while failures and count < self.MAX_OE_RETRIES:
                     count += 1
                     self.handle_entry_failures(failures, entries)
@@ -1237,6 +1239,20 @@ class GMasterChangelogMixin(GMasterCommon):
                                      "gfid mismatch")
                         break

+                    # If this iteration has not removed any entry or reduced
+                    # the number of failures compared to the previous one, we
+                    # don't need to keep iterating because we'll get the same
+                    # result in all other attempts.
+                    if ((num_entries == len(entries)) and
+                        (num_failures == len(failures))):
+                        logging.info(lf("No more gfid mismatches can be fixed",
+                                        entries=num_entries,
+                                        failures=num_failures))
+                        break
+
+                    num_entries = len(entries)
+                    num_failures = len(failures)
+
                 self.log_failures(failures, 'gfid', gauxpfx(), 'ENTRY')

             self.status.dec_value("entry", len(entries))
--
1.8.3.1