import Oracle_OSS mdadm-4.4-4.0.1.el9_7
This commit is contained in:
parent
54ec5c3b72
commit
d9e2e5ae99
210
SOURCES/0038-mdadm-enable-sync-file-for-udev-rules.patch
Normal file
@ -0,0 +1,210 @@
|
||||
From 8da27191aa62b08075d8e7ec36c14083f528eb89 Mon Sep 17 00:00:00 2001
|
||||
From: Nigel Croxon <ncroxon@redhat.com>
|
||||
Date: Fri, 4 Apr 2025 08:44:47 -0400
|
||||
Subject: [PATCH 1/1] mdadm: enable sync file for udev rules
|
||||
|
||||
Mounting an md device may fail during boot from mdadm's claim
|
||||
on the device not being released before systemd attempts to mount.
|
||||
|
||||
In this case it was found that essentially there is a race condition
|
||||
occurring in which the mount cannot happen without some kind of delay
|
||||
being added BEFORE the mount itself triggers, or manual intervention
|
||||
after a timeout.
|
||||
|
||||
The findings:
|
||||
the inode was for a tmp block node made by mdadm for md0.
|
||||
|
||||
crash> detailedsearch ff1b0c398ff28380
|
||||
ff1b0c398f079720: ff1b0c398ff28380 slab:filp state:alloc
|
||||
obj:ff1b0c398f079700 size:256
|
||||
ff1b0c398ff284f8: ff1b0c398ff28380 slab:shmem_inode_cache
|
||||
state:alloc obj:ff1b0c398ff28308 size:768
|
||||
|
||||
crash> struct file.f_inode,f_path ff1b0c398f079700
|
||||
f_inode = 0xff1b0c398ff28380,
|
||||
f_path = {
|
||||
mnt = 0xff1b0c594aecc7a0,
|
||||
dentry = 0xff1b0c3a8c614f00
|
||||
},
|
||||
crash> struct dentry.d_name 0xff1b0c3a8c614f00
|
||||
d_name = {
|
||||
{
|
||||
{ hash = 3714992780, len = 16 },
|
||||
hash_len = 72434469516
|
||||
},
|
||||
name = 0xff1b0c3a8c614f38 ".tmp.md.1454:9:0"
|
||||
},
|
||||
|
||||
For the race condition, mdadm and udev have some infrastructure for making
|
||||
the device be ignored while under construction. e.g.
|
||||
|
||||
$ cat lib/udev/rules.d/01-md-raid-creating.rules
|
||||
|
||||
do not edit this file, it will be overwritten on update
|
||||
While mdadm is creating an array, it creates a file
|
||||
/run/mdadm/creating-mdXXX. If that file exists, then
|
||||
the array is not "ready" and we should make sure the
|
||||
content is ignored.
|
||||
KERNEL=="md*", TEST=="/run/mdadm/creating-$kernel", ENV{SYSTEMD_READY}="0"
|
||||
|
||||
However, this feature currently is only used by the mdadm create command.
|
||||
See calls to udev_block/udev_unblock in the mdadm code as to where and when
|
||||
this behavior is used. Any md array being started by incremental or
|
||||
normal assemble commands does not use this udev integration. So assembly
|
||||
of an existing array does not look to have any explicit protection from
|
||||
systemd/udev seeing an array as in a usable state before an mdadm instance
|
||||
with O_EXCL closes its file handle.
|
||||
This is for the sake of showing the use case for such an option and why
|
||||
it would be helpful to delay the mount itself.
|
||||
|
||||
While mdadm is still constructing the array mdadm --incremental
|
||||
that is called from within /usr/lib/udev/rules.d/64-md-raid-assembly.rules,
|
||||
there is an attempt to mount the md device, but there is not a creation
|
||||
of "/run/mdadm/creating-xxx" file when in incremental mode that
|
||||
the rule is looking for. Therefore the device is not marked
|
||||
as SYSTEMD_READY=0 in
|
||||
"/usr/lib/udev/rules.d/01-md-raid-creating.rules" and missing
|
||||
synchronization using the "/run/mdadm/creating-xxx" file.
|
||||
|
||||
As to this change affecting containers or IMSM...
|
||||
(container's array state is inactive all the time)
|
||||
|
||||
Even if the "array_state" reports "inactive" when previous components
are added, the mdadm call for the very last array component that makes
it usable/ready still needs to be synced properly: mdadm needs to drop
the claim first by calling "close", then delete "/run/mdadm/creating-xxx",
and then let udev know it is clear to act (the "udev_unblock" in
mdadm code, which generates a synthetic udev event so the rules are
re-evaluated). It is the processing of this very last array component
that is the issue here (it is not an I/O error; rather, trying to
open the device returns -EBUSY because of the exclusive claim that mdadm
still holds while the md device is already being processed by udev in
parallel, which is exactly what
/run/mdadm/creating-xxx should prevent).
|
||||
|
||||
The patch to Incremental.c is to enable creating the
|
||||
"/run/mdadm/creating-xxx" file during incremental mode.
|
||||
|
||||
For the change to Create.c, the unlink is called right before dropping
the exclusive claim for the device. This should be the other way round
to avoid the race 100%. That is, if there's a "close" call and
"udev_unblock" call, the "close" should go first, followed by
"udev_unblock".
|
||||
|
||||
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
|
||||
---
|
||||
Create.c | 2 +-
|
||||
Incremental.c | 20 +++++++++++++++-----
|
||||
2 files changed, 16 insertions(+), 6 deletions(-)
|
||||
|
||||
diff --git a/Create.c b/Create.c
|
||||
index de90b0b8e781..420b9136c2c2 100644
|
||||
--- a/Create.c
|
||||
+++ b/Create.c
|
||||
@@ -1316,8 +1316,8 @@ int Create(struct supertype *st, struct mddev_ident *ident, int subdevs,
|
||||
} else {
|
||||
pr_err("not starting array - not enough devices.\n");
|
||||
}
|
||||
- udev_unblock();
|
||||
close(mdfd);
|
||||
+ udev_unblock();
|
||||
sysfs_uevent(&info, "change");
|
||||
dev_policy_free(custom_pols);
|
||||
|
||||
diff --git a/Incremental.c b/Incremental.c
|
||||
index 228d2bdd5de2..ba3810e6157f 100644
|
||||
--- a/Incremental.c
|
||||
+++ b/Incremental.c
|
||||
@@ -30,6 +30,7 @@
|
||||
|
||||
#include "mdadm.h"
|
||||
#include "xmalloc.h"
|
||||
+#include "udev.h"
|
||||
|
||||
#include <sys/wait.h>
|
||||
#include <dirent.h>
|
||||
@@ -286,7 +287,7 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
|
||||
|
||||
/* Couldn't find an existing array, maybe make a new one */
|
||||
mdfd = create_mddev(match ? match->devname : NULL, name_to_use, trustworthy,
|
||||
- chosen_name, 0);
|
||||
+ chosen_name, 1);
|
||||
|
||||
if (mdfd < 0)
|
||||
goto out_unlock;
|
||||
@@ -447,7 +448,6 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
|
||||
info.array.working_disks = 0;
|
||||
for (d = sra->devs; d; d=d->next)
|
||||
info.array.working_disks ++;
|
||||
-
|
||||
}
|
||||
if (strncmp(chosen_name, DEV_MD_DIR, DEV_MD_DIR_LEN) == 0)
|
||||
md_devname = chosen_name + DEV_MD_DIR_LEN;
|
||||
@@ -464,7 +464,6 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
|
||||
if (is_container(info.array.level)) {
|
||||
char devnm[32];
|
||||
/* Try to assemble within the container */
|
||||
- sysfs_uevent(sra, "change");
|
||||
if (!c->export && c->verbose >= 0)
|
||||
pr_err("container %s now has %d device%s\n",
|
||||
chosen_name, info.array.working_disks,
|
||||
@@ -476,6 +475,8 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
|
||||
if (st->ss->load_container)
|
||||
rv = st->ss->load_container(st, mdfd, NULL);
|
||||
close(mdfd);
|
||||
+ udev_unblock();
|
||||
+ sysfs_uevent(sra, "change");
|
||||
sysfs_free(sra);
|
||||
if (!rv)
|
||||
rv = Incremental_container(st, chosen_name, c, NULL);
|
||||
@@ -484,6 +485,7 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
|
||||
* so that it can eg. try to rebuild degraded array */
|
||||
if (st->ss->external)
|
||||
ping_monitor(devnm);
|
||||
+ udev_unblock();
|
||||
return rv;
|
||||
}
|
||||
|
||||
@@ -606,7 +608,11 @@ out:
|
||||
close(mdfd);
|
||||
if (policy)
|
||||
dev_policy_free(policy);
|
||||
- sysfs_free(sra);
|
||||
+ udev_unblock();
|
||||
+ if (sra) {
|
||||
+ sysfs_uevent(sra, "change");
|
||||
+ sysfs_free(sra);
|
||||
+ }
|
||||
return rv;
|
||||
out_unlock:
|
||||
map_unlock(&map);
|
||||
@@ -1561,7 +1567,7 @@ static int Incremental_container(struct supertype *st, char *devname,
|
||||
trustworthy = LOCAL;
|
||||
|
||||
mdfd = create_mddev(match ? match->devname : NULL, ra->name, trustworthy,
|
||||
- chosen_name, 0);
|
||||
+ chosen_name, 1);
|
||||
|
||||
if (!is_fd_valid(mdfd)) {
|
||||
pr_err("create_mddev failed with chosen name %s: %s.\n",
|
||||
@@ -1581,6 +1587,8 @@ static int Incremental_container(struct supertype *st, char *devname,
|
||||
map_free(map);
|
||||
map = NULL;
|
||||
close_fd(&mdfd);
|
||||
+ udev_unblock();
|
||||
+ sysfs_uevent(&info, "change");
|
||||
}
|
||||
if (c->export && result) {
|
||||
char sep = '=';
|
||||
@@ -1607,6 +1615,8 @@ static int Incremental_container(struct supertype *st, char *devname,
|
||||
release:
|
||||
map_free(map);
|
||||
sysfs_free(list);
|
||||
+ udev_unblock();
|
||||
+ sysfs_uevent(&info, "change");
|
||||
return rv;
|
||||
}
|
||||
|
||||
--
|
||||
2.50.1
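The ordering argued for in the commit message above (drop the exclusive claim first, then remove the marker file that the udev rule tests for) can be sketched in isolation. This is a minimal illustration and not mdadm's code: `marker_block` and `release_then_unblock` are hypothetical helper names, and the marker directory is a parameter so the sketch can run outside `/run/mdadm`.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the "block" step: create "<dir>/creating-<devnm>",
 * the kind of file a TEST== clause in a udev rule can look for.
 * Returns 0 on success, -1 on failure; the marker path is written to 'path'.
 */
static int marker_block(const char *dir, const char *devnm,
                        char *path, size_t len)
{
    int n, fd;

    assert(dir && devnm && path);
    n = snprintf(path, len, "%s/creating-%s", dir, devnm);
    if (n < 0 || (size_t)n >= len)
        return -1;              /* path would have been truncated */
    fd = open(path, O_CREAT | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/*
 * Hypothetical stand-in for the release step: close the descriptor that
 * holds the claim FIRST, and only then remove the marker, so nothing that
 * waits on the marker can observe the device while it is still claimed.
 */
static int release_then_unblock(int claim_fd, const char *marker)
{
    assert(marker);
    if (close(claim_fd) < 0)
        return -1;
    return unlink(marker);
}
```

In the patch itself the same idea appears as `close(mdfd)` followed by `udev_unblock()` and then `sysfs_uevent(..., "change")`.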
|
||||
|
||||
@ -0,0 +1,30 @@
|
||||
From ea4cdaea1a553685444a3fb39aae6b2cfee387ef Mon Sep 17 00:00:00 2001
|
||||
From: Xiao Ni <xni@redhat.com>
|
||||
Date: Tue, 3 Jun 2025 08:49:29 +0800
|
||||
Subject: [PATCH 1/1] mdadm/assemble: Don't stop array after creating it
|
||||
|
||||
It stops the array which is just created. From the comment it wants to
|
||||
stop the array if it has no content. But it hasn't added member disks,
|
||||
so it's a clean array. It's meaningless to do it.
|
||||
|
||||
Signed-off-by: Xiao Ni <xni@redhat.com>
|
||||
---
|
||||
Assemble.c | 2 --
|
||||
1 file changed, 2 deletions(-)
|
||||
|
||||
diff --git a/Assemble.c b/Assemble.c
|
||||
index f8099cd32aa3..1949bf96c478 100644
|
||||
--- a/Assemble.c
|
||||
+++ b/Assemble.c
|
||||
@@ -1570,8 +1570,6 @@ try_again:
|
||||
goto try_again;
|
||||
goto out;
|
||||
}
|
||||
- /* just incase it was started but has no content */
|
||||
- ioctl(mdfd, STOP_ARRAY, NULL);
|
||||
}
|
||||
|
||||
if (content != &info) {
|
||||
--
|
||||
2.50.1
|
||||
|
||||
@ -0,0 +1,42 @@
|
||||
From cf73540e294d6a5e7dbce560ab163a3ec384a350 Mon Sep 17 00:00:00 2001
|
||||
From: Xiao Ni <xni@redhat.com>
|
||||
Date: Sun, 25 Jan 2026 16:50:58 +0800
|
||||
Subject: [PATCH 1/1] mdadm/incremental: set sysfs name after assembling imsm
|
||||
array
|
||||
|
||||
The sysfs name is not set after assembling imsm array. So sysfs_uevent
can't send the change event. The raid device's state depends on the
genuine events from the kernel. If the kernel genuine event is sent
after udev_unblock, the raid can be ready on time. Then it can be
mounted correctly during boot. If the kernel genuine event is sent
before udev_unblock, the mount will fail during boot.
|
||||
|
||||
Signed-off-by: <xni@redhat.com>
|
||||
---
|
||||
Incremental.c | 3 +++
|
||||
1 file changed, 3 insertions(+)
|
||||
|
||||
diff --git a/Incremental.c b/Incremental.c
|
||||
index f30697fa684f..555a50cda58b 100644
|
||||
--- a/Incremental.c
|
||||
+++ b/Incremental.c
|
||||
@@ -1494,6 +1494,7 @@ static int Incremental_container(struct supertype *st, char *devname,
|
||||
for (ra = list ; ra ; ra = ra->next) {
|
||||
int mdfd = -1;
|
||||
char chosen_name[1024];
|
||||
+ char *sysname;
|
||||
struct map_ent *mp;
|
||||
struct mddev_ident *match = NULL;
|
||||
|
||||
@@ -1586,6 +1587,8 @@ static int Incremental_container(struct supertype *st, char *devname,
|
||||
chosen_name, &result);
|
||||
map_free(map);
|
||||
map = NULL;
|
||||
+ sysname = fd2devnm(mdfd);
|
||||
+ strncpy(info.sys_name, sysname, sizeof(sysname) - 1);
|
||||
close_fd(&mdfd);
|
||||
udev_unblock();
|
||||
sysfs_uevent(&info, "change");
|
||||
--
|
||||
2.50.1 (Apple Git-155)
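One detail in the hunk above may be worth a second look: `sysname` is declared as `char *`, so `sizeof(sysname)` is the size of a pointer (8 on typical 64-bit builds) and the `strncpy` copies at most 7 bytes. Short names such as `md127` fit, but longer ones would be cut. A minimal sketch of a copy bounded by the destination buffer follows; `struct mdinfo_sketch`, its 32-byte field and `set_sys_name` are assumptions made for illustration, not mdadm definitions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: stand-in for the relevant field of mdadm's struct mdinfo;
 * the 32-byte size is illustrative only. */
struct mdinfo_sketch {
    char sys_name[32];
};

/*
 * Copy a device name, bounded by the size of the destination array rather
 * than by sizeof() of the source pointer. snprintf always NUL-terminates,
 * so the result is a valid string even when the name is truncated.
 */
static void set_sys_name(struct mdinfo_sketch *info, const char *devnm)
{
    assert(info && devnm);
    snprintf(info->sys_name, sizeof(info->sys_name), "%s", devnm);
}
```

Whether long device names actually reach this path is not stated in the source, so this is a defensive variant rather than a confirmed fix.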
|
||||
|
||||
@ -0,0 +1,55 @@
|
||||
From 58a8393cddc3859b8818e76f0d5ee81d887b6cd5 Mon Sep 17 00:00:00 2001
|
||||
From: Xiao Ni <xni@redhat.com>
|
||||
Date: Wed, 4 Feb 2026 21:08:31 +0800
|
||||
Subject: [PATCH 1/1] mdadm/imsm: use creation_time for ctime in container info
|
||||
|
||||
When a disk has both DDF and IMSM metadata (e.g., migrated from DDF
|
||||
to IMSM or has remnant DDF metadata), guess_super_type() selects the
|
||||
metadata with the later creation time. Previously, IMSM always
|
||||
returned ctime=0 in getinfo_super_imsm(), causing it to lose to DDF
|
||||
which extracts a real timestamp from its GUID.
|
||||
|
||||
This resulted in the wrong metadata being selected during assembly,
|
||||
leading to boot failures when LVM activated raw PVs instead of MD
|
||||
devices.
|
||||
|
||||
Fix this by extracting the actual creation time from the IMSM
|
||||
metadata structure (mpb->creation_time) instead of hardcoding 0.
|
||||
This ensures that when both metadata types are present, the more
|
||||
recent one is correctly selected.
|
||||
|
||||
Signed-off-by: Xiao Ni <xni@redhat.com>
|
||||
---
|
||||
super-intel.c | 5 +++--
|
||||
1 file changed, 3 insertions(+), 2 deletions(-)
|
||||
|
||||
diff --git a/super-intel.c b/super-intel.c
|
||||
index e9fce12c35c7..2ff9d4862f7f 100644
|
||||
--- a/super-intel.c
|
||||
+++ b/super-intel.c
|
||||
@@ -3826,11 +3826,13 @@ static void getinfo_super_imsm(struct supertype *st, struct mdinfo *info, char *
|
||||
/* Set raid_disks to zero so that Assemble will always pull in valid
|
||||
* spares
|
||||
*/
|
||||
+ mpb = super->anchor;
|
||||
+
|
||||
info->array.raid_disks = 0;
|
||||
info->array.level = LEVEL_CONTAINER;
|
||||
info->array.layout = 0;
|
||||
info->array.md_minor = -1;
|
||||
- info->array.ctime = 0; /* N/A for imsm */
|
||||
+ info->array.ctime = __le64_to_cpu(mpb->creation_time);
|
||||
info->array.utime = 0;
|
||||
info->array.chunk_size = 0;
|
||||
|
||||
@@ -3850,7 +3852,6 @@ static void getinfo_super_imsm(struct supertype *st, struct mdinfo *info, char *
|
||||
info->bb.supported = 1;
|
||||
|
||||
/* do we have the all the insync disks that we expect? */
|
||||
- mpb = super->anchor;
|
||||
info->events = __le32_to_cpu(mpb->generation_num);
|
||||
|
||||
for (i = 0; i < mpb->num_raid_devs; i++) {
|
||||
--
|
||||
2.50.1 (Apple Git-155)
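The selection rule described in this commit message (the metadata with the later creation time wins) can be sketched as follows. This is an illustration under assumptions, not the body of `guess_super_type()`: `struct meta_candidate` and `pick_latest` are hypothetical names, and the timestamps are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one metadata format found on a disk. */
struct meta_candidate {
    const char *name;   /* e.g. "ddf" or "imsm" */
    uint64_t ctime;     /* creation time reported for that metadata */
};

/*
 * Return the candidate with the latest creation time, or NULL when the
 * list is empty. On a tie the earlier entry is kept.
 */
static const struct meta_candidate *
pick_latest(const struct meta_candidate *c, size_t n)
{
    const struct meta_candidate *best = NULL;
    size_t i;

    assert(c || n == 0);
    for (i = 0; i < n; i++)
        if (!best || c[i].ctime > best->ctime)
            best = &c[i];
    return best;
}
```

With a rule of this shape, a format that always reports `ctime == 0` can never beat one that reports a real timestamp, which is the failure the patch describes for IMSM against DDF.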
|
||||
|
||||
120
SOURCES/0042-mdadm-Create-array-with-sync-del-gendisk-mode.patch
Normal file
@ -0,0 +1,120 @@
|
||||
From d354d314db86379f18a4ccd35af9f6e56635b61d Mon Sep 17 00:00:00 2001
|
||||
From: Xiao Ni <xni@redhat.com>
|
||||
Date: Fri, 24 Oct 2025 15:17:29 +0800
|
||||
Subject: [PATCH 1/1] mdadm: Create array with sync del gendisk mode
|
||||
|
||||
Kernel patch 9e59d609763f ('md: call del_gendisk in control path') calls
del_gendisk synchronously. With that patch, the device node
(e.g. /dev/md0) disappears after the mdadm --stop command. This resolves
the problem that a stopped array could be created again simply by opening
the device node, which interrupted regression tests.

But it causes an error when assembling an array, which has been fixed by
pr182. So people can't assemble an array if they use a new kernel and an
old mdadm. In kernel space, 25db5f284fb8 ('md: add legacy_async_del_gendisk
mod') is used to fix this problem. The default is async mode.

Async del mode will be removed in the future. We'll start using sync del
mode in the new mdadm version, so people will not see failures when
upgrading to the new mdadm version with sync del mode.
|
||||
|
||||
Signed-off-by: Xiao Ni <xni@redhat.com>
|
||||
---
|
||||
mdadm.h | 3 +++
|
||||
mdopen.c | 5 +++++
|
||||
util.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
|
||||
3 files changed, 53 insertions(+)
|
||||
|
||||
diff --git a/mdadm.h b/mdadm.h
|
||||
index 84bd2c915fc2..7dcb20ed1f34 100644
|
||||
--- a/mdadm.h
|
||||
+++ b/mdadm.h
|
||||
@@ -141,6 +141,8 @@ struct dlm_lksb {
|
||||
#define MDMON_DIR "/run/mdadm"
|
||||
#endif /* MDMON_DIR */
|
||||
|
||||
+#define MD_MOD_ASYNC_DEL_GENDISK "legacy_async_del_gendisk"
|
||||
+
|
||||
/* FAILED_SLOTS is where to save files storing recent removal of array
|
||||
* member in order to allow future reuse of disk inserted in the same
|
||||
* slot for array recovery
|
||||
@@ -855,6 +857,7 @@ extern int restore_stripes(int *dest, unsigned long long *offsets,
|
||||
unsigned long long start, unsigned long long length,
|
||||
char *src_buf);
|
||||
extern bool sysfs_is_libata_allow_tpm_enabled(const int verbose);
|
||||
+extern bool init_md_mod_param(void);
|
||||
|
||||
#ifndef Sendmail
|
||||
#define Sendmail "/usr/lib/sendmail -t"
|
||||
diff --git a/mdopen.c b/mdopen.c
|
||||
index 57252b646137..b685603d6de5 100644
|
||||
--- a/mdopen.c
|
||||
+++ b/mdopen.c
|
||||
@@ -148,6 +148,11 @@ int create_mddev(char *dev, char *name, int trustworthy,
|
||||
char devnm[32];
|
||||
char cbuf[400];
|
||||
|
||||
+ if (!init_md_mod_param()) {
|
||||
+ pr_err("init md module parameters fail\n");
|
||||
+ return -1;
|
||||
+ }
|
||||
+
|
||||
if (!udev_is_available())
|
||||
block_udev = 0;
|
||||
|
||||
diff --git a/util.c b/util.c
|
||||
index 5d6fe800d666..146f38fddd82 100644
|
||||
--- a/util.c
|
||||
+++ b/util.c
|
||||
@@ -2559,3 +2559,48 @@ bool is_file(const char *path)
|
||||
|
||||
return true;
|
||||
}
|
||||
+
|
||||
+bool set_md_mod_parameter(const char *name, const char *value)
|
||||
+{
|
||||
+ char path[256];
|
||||
+ int fd;
|
||||
+ bool ret = true;
|
||||
+
|
||||
+ snprintf(path, sizeof(path), "/sys/module/md_mod/parameters/%s", name);
|
||||
+
|
||||
+ fd = open(path, O_WRONLY);
|
||||
+ if (fd < 0) {
|
||||
+ pr_err("Can't open %s\n", path);
|
||||
+ return false;
|
||||
+ }
|
||||
+
|
||||
+ if (write(fd, value, strlen(value)) != (ssize_t)strlen(value)) {
|
||||
+ pr_err("Failed to write to %s\n", path);
|
||||
+ ret = false;
|
||||
+ }
|
||||
+
|
||||
+ close(fd);
|
||||
+ return ret;
|
||||
+}
|
||||
+
|
||||
+/* Init kernel md_mod parameters here if needed */
|
||||
+bool init_md_mod_param(void)
|
||||
+{
|
||||
+ bool ret = true;
|
||||
+
|
||||
+ /*
|
||||
+ * In kernel 9e59d609763f calls del_gendisk in sync way. So device
|
||||
+ * node can be removed after stop command. But it can introduce a
|
||||
+ * regression which can be fixed by github pr182. New mdadm version
|
||||
+ * with pr182 can work well with new kernel. But users who don't
|
||||
+ * update mdadm and update to new kernel, they can't assemble array
|
||||
+ * anymore. So kernel adds a kernel parameter legacy_async_del_gendisk
|
||||
+ * and uses async as default.
|
||||
+ * We'll use sync mode since 6.18 rather than async mode. So in future
|
||||
+ * the kernel parameter will be removed.
|
||||
+ */
|
||||
+ if (get_linux_version() >= 6018000)
|
||||
+ ret = set_md_mod_parameter(MD_MOD_ASYNC_DEL_GENDISK, "N");
|
||||
+
|
||||
+ return ret;
|
||||
+}
|
||||
--
|
||||
2.51.0
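The constant `6018000` in `init_md_mod_param()` reads naturally as kernel 6.18.0 if the version is encoded as major * 1000000 + minor * 1000 + patch. That encoding is an assumption here, inferred from the constant and from the "since 6.18" comment. A sketch of such an encoder, with the hypothetical name `encode_linux_version`:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Encode a kernel release string such as "6.18.0" or
 * "5.14.0-611.28.1.el9_7.x86_64" as major*1000000 + minor*1000 + patch.
 * Anything after the third numeric field is ignored.
 */
static int encode_linux_version(const char *release)
{
    char *cp;
    unsigned long a, b = 0, c = 0;

    assert(release);
    a = strtoul(release, &cp, 10);
    if (*cp == '.')
        b = strtoul(cp + 1, &cp, 10);
    if (*cp == '.')
        c = strtoul(cp + 1, &cp, 10);
    return (int)(a * 1000000 + b * 1000 + c);
}
```

Under this encoding a distribution kernel that reports `5.14.0-...` compares below `6018000` regardless of what has been backported into it.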
|
||||
|
||||
133
SOURCES/0043-mdadm-load-md_mod-first.patch
Normal file
@ -0,0 +1,133 @@
|
||||
From b166a2042615aa81a5e60b6b9f553f101827609e Mon Sep 17 00:00:00 2001
|
||||
From: Xiao Ni <xni@redhat.com>
|
||||
Date: Sun, 4 Jan 2026 15:18:48 +0800
|
||||
Subject: [PATCH 1/1] mdadm: load md_mod first
|
||||
|
||||
Load md_mod first before setting module parameter legacy_async_del_gendisk.
Everything works well if md_mod is built into the kernel. If not, create and
assemble will fail.
|
||||
|
||||
Fixes: d354d314db86 ("mdadm: Create array with sync del gendisk mode")
|
||||
Signed-off-by: Xiao Ni <xni@redhat.com>
|
||||
---
|
||||
mdadm.h | 2 +-
|
||||
mdopen.c | 30 ++++++------------------------
|
||||
util.c | 35 +++++++++++++++++++++++++++++++++--
|
||||
3 files changed, 40 insertions(+), 27 deletions(-)
|
||||
|
||||
diff --git a/mdadm.h b/mdadm.h
|
||||
index b63dded31a17..9b7052dabee4 100644
|
||||
--- a/mdadm.h
|
||||
+++ b/mdadm.h
|
||||
@@ -860,7 +860,7 @@ extern int restore_stripes(int *dest, unsigned long long *offsets,
|
||||
unsigned long long start, unsigned long long length,
|
||||
char *src_buf);
|
||||
extern bool sysfs_is_libata_allow_tpm_enabled(const int verbose);
|
||||
-extern bool init_md_mod_param(void);
|
||||
+extern bool init_md_mod(void);
|
||||
|
||||
#ifndef Sendmail
|
||||
#define Sendmail "/usr/lib/sendmail -t"
|
||||
diff --git a/mdopen.c b/mdopen.c
|
||||
index b685603d6de5..9af0284b6914 100644
|
||||
--- a/mdopen.c
|
||||
+++ b/mdopen.c
|
||||
@@ -38,33 +38,15 @@ int create_named_array(char *devnm)
|
||||
};
|
||||
|
||||
fd = open(new_array_file, O_WRONLY);
|
||||
- if (fd < 0 && errno == ENOENT) {
|
||||
- char buf[PATH_MAX] = {0};
|
||||
- char *env_ptr;
|
||||
-
|
||||
- env_ptr = getenv("PATH");
|
||||
- /*
|
||||
- * When called by udev worker context, path of modprobe
|
||||
- * might not be in env PATH. Set sbin paths into PATH
|
||||
- * env to avoid potential failure when run modprobe here.
|
||||
- */
|
||||
- if (env_ptr)
|
||||
- snprintf(buf, PATH_MAX - 1, "%s:%s", env_ptr,
|
||||
- "/sbin:/usr/sbin:/usr/local/sbin");
|
||||
- else
|
||||
- snprintf(buf, PATH_MAX - 1, "%s",
|
||||
- "/sbin:/usr/sbin:/usr/local/sbin");
|
||||
-
|
||||
- setenv("PATH", buf, 1);
|
||||
-
|
||||
- if (system("modprobe md_mod") == 0)
|
||||
- fd = open(new_array_file, O_WRONLY);
|
||||
- }
|
||||
if (fd >= 0) {
|
||||
n = write(fd, devnm, strlen(devnm));
|
||||
close(fd);
|
||||
+ } else {
|
||||
+ pr_err("Fail to open %s\n", new_array_file);
|
||||
+ return 0;
|
||||
}
|
||||
- if (fd < 0 || n != (int)strlen(devnm)) {
|
||||
+
|
||||
+ if (n != (int)strlen(devnm)) {
|
||||
pr_err("Fail to create %s when using %s, fallback to creation via node\n",
|
||||
devnm, new_array_file);
|
||||
return 0;
|
||||
@@ -148,7 +130,7 @@ int create_mddev(char *dev, char *name, int trustworthy,
|
||||
char devnm[32];
|
||||
char cbuf[400];
|
||||
|
||||
- if (!init_md_mod_param()) {
|
||||
+ if (!init_md_mod()) {
|
||||
pr_err("init md module parameters fail\n");
|
||||
return -1;
|
||||
}
|
||||
diff --git a/util.c b/util.c
|
||||
index 146f38fddd82..cdc55435a707 100644
|
||||
--- a/util.c
|
||||
+++ b/util.c
|
||||
@@ -2583,10 +2583,41 @@ bool set_md_mod_parameter(const char *name, const char *value)
|
||||
return ret;
|
||||
}
|
||||
|
||||
-/* Init kernel md_mod parameters here if needed */
|
||||
-bool init_md_mod_param(void)
|
||||
+/* Init kernel md_mod and parameters here if needed */
|
||||
+bool init_md_mod(void)
|
||||
{
|
||||
bool ret = true;
|
||||
+ char module_path[32];
|
||||
+ FILE *fp;
|
||||
+
|
||||
+ snprintf(module_path, sizeof(module_path), "/sys/module/md_mod");
|
||||
+ fp = fopen(module_path, "r");
|
||||
+ if (fp == NULL) {
|
||||
+
|
||||
+ char buf[PATH_MAX] = {0};
|
||||
+ char *env_ptr;
|
||||
+
|
||||
+ env_ptr = getenv("PATH");
|
||||
+ /*
|
||||
+ * When called by udev worker context, path of modprobe
|
||||
+ * might not be in env PATH. Set sbin paths into PATH
|
||||
+ * env to avoid potential failure when run modprobe here.
|
||||
+ */
|
||||
+ if (env_ptr)
|
||||
+ snprintf(buf, PATH_MAX - 1, "%s:%s", env_ptr,
|
||||
+ "/sbin:/usr/sbin:/usr/local/sbin");
|
||||
+ else
|
||||
+ snprintf(buf, PATH_MAX - 1, "%s",
|
||||
+ "/sbin:/usr/sbin:/usr/local/sbin");
|
||||
+
|
||||
+ setenv("PATH", buf, 1);
|
||||
+
|
||||
+ if (system("modprobe md_mod") != 0) {
|
||||
+ pr_err("Can't load kernel module md_mod\n");
|
||||
+ return false;
|
||||
+ }
|
||||
+ } else
|
||||
+ fclose(fp);
|
||||
|
||||
/*
|
||||
* In kernel 9e59d609763f calls del_gendisk in sync way. So device
|
||||
--
|
||||
2.51.0
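The patch decides whether `md_mod` is loaded by calling `fopen()` on `/sys/module/md_mod`, which is a directory. As an alternative way to express the same check, and not what the patch does, here is a minimal sketch that tests for a directory with `stat()`. The helper name `dir_exists` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Return true when 'path' exists and is a directory. A loaded (or built-in)
 * kernel module normally shows up as a directory under /sys/module.
 */
static bool dir_exists(const char *path)
{
    struct stat st;

    assert(path);
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}
```

A caller would use it as `if (!dir_exists("/sys/module/md_mod")) { /* run modprobe */ }`, keeping the PATH handling from the patch unchanged.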
|
||||
|
||||
@ -0,0 +1,101 @@
|
||||
From 6c84dd7fec7eff826793613db664fe74d2d0683c Mon Sep 17 00:00:00 2001
|
||||
From: Richard Li <tianqi.li@oracle.com>
|
||||
Date: Thu, 14 Aug 2025 21:41:13 +0000
|
||||
Subject: [PATCH] mdadm: Fix IMSM Raid assembly after disk link failure and
|
||||
reboot
|
||||
MIME-Version: 1.0
|
||||
Content-Type: text/plain; charset=UTF-8
|
||||
Content-Transfer-Encoding: 8bit
|
||||
|
||||
This patch addresses a scenario observed in production where disk links
|
||||
go down. After a system reboot, depending on which disk becomes
|
||||
available first, the IMSM RAID array may either fully assemble or
|
||||
come up with missing disks.
|
||||
|
||||
Below is an example of the production case simulating disk link failures
|
||||
and subsequent system reboot.
|
||||
|
||||
(note: "echo "1" | sudo tee /sys/class/scsi_device/x:x:x:x/device/delete"
|
||||
is used here to fail/unplug/disconnect disks)
|
||||
|
||||
Raid Configuration: IMSM Raid1 with two disks
|
||||
|
||||
- When sda is unplugged first, then sdb, and after reboot sdb is
|
||||
reconnected first followed by sda, the container (/dev/md127) and
|
||||
subarrays (/dev/md125, /dev/md126) correctly assemble and become active.
|
||||
- However, when sda is reconnected first, then sdb, the subarrays fail to
|
||||
fully reconstruct — sda remains missing from the assembled subarrays.
|
||||
|
||||
This patch addresses this issue in monitor.c. Specifically, when an IMSM
|
||||
RAID is detected and the faulty disk found does not yet exist
|
||||
under /sys/block/CONTAINER_NAME (we do this check so the behavior of
|
||||
"mdadm --fail" is not impacted), the disk will be marked as a spare
|
||||
instead, allowing it to be reused during array reconstruction.
|
||||
|
||||
The patch improves resilience by ensuring consistent array reconstruction
|
||||
regardless of disk detection order. This aligns system behavior with
|
||||
expected RAID redundancy and reduces risk of unnecessary manual recovery
|
||||
steps after reboots in degraded hardware environments.
|
||||
|
||||
Orabug: 37635990
|
||||
Signed-off-by: Richard Li <tianqi.li@oracle.com>
|
||||
---
|
||||
monitor.c | 35 +++++++++++++++++++++++++++++++++--
|
||||
1 file changed, 33 insertions(+), 2 deletions(-)
|
||||
|
||||
diff --git a/monitor.c b/monitor.c
|
||||
index b771707..b08e32c 100644
|
||||
--- a/monitor.c
|
||||
+++ b/monitor.c
|
||||
@@ -393,6 +393,24 @@ static void signal_manager(void)
|
||||
* - request a sync_action
|
||||
*
|
||||
*/
|
||||
+static int find_disk_in_container(struct supertype *container, struct mdinfo *mdi)
|
||||
+{
|
||||
+ struct mdinfo *fdi, *di;
|
||||
+
|
||||
+ fdi = sysfs_read(-1, container->container_devnm, GET_DEVS);
|
||||
+ if (!fdi)
|
||||
+ return 0;
|
||||
+
|
||||
+ for (di = fdi->devs; di; di = di->next) {
|
||||
+ if (di->disk.major == mdi->disk.major &&
|
||||
+ di->disk.minor == mdi->disk.minor) {
|
||||
+ dprintf("%d:%d found in container in sysfs\n",
|
||||
+ mdi->disk.major, mdi->disk.minor);
|
||||
+ return 1;
|
||||
+ }
|
||||
+ }
|
||||
+ return 0;
|
||||
+}
|
||||
|
||||
#define ARRAY_DIRTY 1
|
||||
#define ARRAY_BUSY 2
|
||||
@@ -552,8 +570,21 @@ static int read_and_act(struct active_array *a)
|
||||
*/
|
||||
for (mdi = a->info.devs ; mdi ; mdi = mdi->next) {
|
||||
if (mdi->curr_state & DS_FAULTY) {
|
||||
- a->container->ss->set_disk(a, mdi->disk.raid_disk,
|
||||
- mdi->curr_state);
|
||||
+ /* Mark faulty disk as spare to allow it to be reused during IMSM array
|
||||
+ * reconstruction. This fixes the issue when disks links go down
|
||||
+ * and up again after a reboot, IMSM RAID array may come up
|
||||
+ * with missing disks.
|
||||
+ */
|
||||
+ if (strcmp(a->container->ss->name, "imsm") == 0 &&
|
||||
+ !find_disk_in_container(a->container, mdi) &&
|
||||
+ !(mdi->curr_state & DS_SPARE)) {
|
||||
+ dprintf("Marking %d:%d as spare for reuse\n",
|
||||
+ mdi->disk.major, mdi->disk.minor);
|
||||
+ a->container->ss->set_disk(a, mdi->disk.raid_disk, DS_SPARE);
|
||||
+ } else {
|
||||
+ a->container->ss->set_disk(a, mdi->disk.raid_disk, mdi->curr_state);
|
||||
+ }
|
||||
+
|
||||
check_degraded = 1;
|
||||
if (mdi->curr_state & DS_BLOCKED)
|
||||
mdi->next_state |= DS_UNBLOCK;
|
||||
--
|
||||
2.47.1
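The lookup that `find_disk_in_container()` performs is a linear search of a singly linked device list for a matching major:minor pair. The sketch below shows that search on its own; `struct disk_node` and `disk_in_list` are hypothetical names, and reading the list from sysfs is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node carrying only the fields the comparison needs. */
struct disk_node {
    int major;
    int minor;
    struct disk_node *next;
};

/* Return 1 when a node with the given major:minor is in the list, else 0. */
static int disk_in_list(const struct disk_node *head, int major, int minor)
{
    const struct disk_node *d;

    assert(major >= 0 && minor >= 0);
    for (d = head; d; d = d->next)
        if (d->major == major && d->minor == minor)
            return 1;
    return 0;
}
```

In the patch, a faulty disk that is not found this way (and is not already a spare) is passed to `set_disk()` as `DS_SPARE` instead of with its current state.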
|
||||
|
||||
0
SOURCES/md-auto-readd.sh
Normal file → Executable file
51
SOURCES/mdadm-get-rhel-version.patch
Normal file
@ -0,0 +1,51 @@
|
||||
--- a/util.c.orig 2026-02-10 10:32:30.910233101 +0800
|
||||
+++ b/util.c 2026-02-10 10:33:48.024018165 +0800
|
||||
@@ -2580,6 +2580,38 @@
|
||||
return ret;
|
||||
}
|
||||
|
||||
+int get_rhel_version(void)
|
||||
+{
|
||||
+ struct utsname name;
|
||||
+ char *cp;
|
||||
+ int rhel = 0;
|
||||
+
|
||||
+ if (uname(&name) < 0)
|
||||
+ return -1;
|
||||
+
|
||||
+ cp = name.release;
|
||||
+
|
||||
+ /* 5.14.0-611.28 */
|
||||
+ strtoul(cp, &cp, 10);
|
||||
+ if (*cp == '.') {
|
||||
+ strtoul(cp+1, &cp, 10);
|
||||
+ if (*cp == '.') {
|
||||
+ strtoul(cp+1, &cp, 10);
|
||||
+ } else
|
||||
+ return -1;
|
||||
+ } else
|
||||
+ return -1;
|
||||
+
|
||||
+ /* -611.28 */
|
||||
+ if (*cp == '-') {
|
||||
+ strtoul(cp+1, &cp, 10);
|
||||
+ if (*cp == '.')
|
||||
+ rhel = strtoul(cp+1, &cp, 10);
|
||||
+ }
|
||||
+
|
||||
+ return rhel;
|
||||
+}
|
||||
+
|
||||
/* Init kernel md_mod and parameters here if needed */
|
||||
bool init_md_mod(void)
|
||||
{
|
||||
@@ -2627,7 +2659,8 @@
|
||||
* We'll use sync mode since 6.18 rather than async mode. So in future
|
||||
* the kernel parameter will be removed.
|
||||
*/
|
||||
- if (get_linux_version() >= 6018000)
|
||||
+ /* kernel-5.14.0-611.28.1.el9_7 merged del sync mode */
|
||||
+ if (get_rhel_version() >= 28)
|
||||
ret = set_md_mod_parameter(MD_MOD_ASYNC_DEL_GENDISK, "N");
|
||||
|
||||
return ret;
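For a release string such as `5.14.0-611.28.1.el9_7`, `get_rhel_version()` above skips the three upstream fields and the build number (611) and returns only the field after it (28), which is then compared with 28. A sketch that keeps both fields and compares them as a pair follows. The names `struct rhel_build`, `parse_rhel_build` and `rhel_build_at_least` are hypothetical, and whether the extra field matters for the kernels this package targets is not something the source states.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical holder for the two fields after the '-' in the release. */
struct rhel_build {
    unsigned long build;    /* e.g. 611 in 5.14.0-611.28.1 */
    unsigned long minor;    /* e.g. 28  in 5.14.0-611.28.1 */
};

/*
 * Parse "<a>.<b>.<c>-<build>.<minor>...". Returns 0 on success and -1 when
 * the string does not have that shape (for example a plain "6.18.0").
 */
static int parse_rhel_build(const char *release, struct rhel_build *out)
{
    char *cp;

    assert(release && out);
    (void)strtoul(release, &cp, 10);        /* major */
    if (*cp != '.')
        return -1;
    (void)strtoul(cp + 1, &cp, 10);         /* minor */
    if (*cp != '.')
        return -1;
    (void)strtoul(cp + 1, &cp, 10);         /* patch */
    if (*cp != '-')
        return -1;
    out->build = strtoul(cp + 1, &cp, 10);
    if (*cp != '.')
        return -1;
    out->minor = strtoul(cp + 1, &cp, 10);
    return 0;
}

/* Compare (build, minor) against a threshold pair, build first. */
static int rhel_build_at_least(const struct rhel_build *v,
                               unsigned long build, unsigned long minor)
{
    assert(v);
    if (v->build != build)
        return v->build > build;
    return v->minor >= minor;
}
```

With the pair comparison, a release such as `5.14.0-570.30.1` does not pass a `611.28` threshold, whereas comparing the last field alone would accept it.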
|
||||
0
SOURCES/mdadm-raid-check-sysconfig
Normal file → Executable file
0
SOURCES/mdcheck
Normal file → Executable file
0
SOURCES/raid-check
Normal file → Executable file
@ -1,8 +1,8 @@
|
||||
Name: mdadm
|
||||
Version: 4.4
|
||||
# extraversion is used to define rhel internal version
|
||||
%define extraversion 2
|
||||
Release: %{extraversion}%{?dist}
|
||||
%define extraversion 4
|
||||
Release: %{extraversion}.0.1%{?dist}
|
||||
Summary: The mdadm program controls Linux md devices (software RAID arrays)
|
||||
URL: https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git
|
||||
License: GPLv2+
|
||||
@ -57,12 +57,22 @@ Patch034: 0035-mdadm-Remove-klibc-and-uclibc-support.patch
|
||||
Patch035: 0036-mdadm-include-asm-byteorder.h.patch
|
||||
Patch036: 0037-mdadm-use-kernel-raid-headers.patch
|
||||
Patch037: mdadm-use-standard-libc-nftw.patch
|
||||
Patch038: 0038-mdadm-enable-sync-file-for-udev-rules.patch
|
||||
Patch039: 0039-mdadm-assemble-Don-t-stop-array-after-creating-it.patch
|
||||
Patch040: 0040-mdadm-incremental-set-sysfs-name-after-assembling-im.patch
|
||||
Patch041: 0041-mdadm-imsm-use-creation_time-for-ctime-in-container-.patch
|
||||
Patch042: 0042-mdadm-Create-array-with-sync-del-gendisk-mode.patch
|
||||
Patch043: 0043-mdadm-load-md_mod-first.patch
|
||||
|
||||
# Fedora customization patches
|
||||
Patch196: mdadm-fix-building-errors.patch
|
||||
Patch197: mdadm-check-posix-name-before-setting-name-and-devna.patch
|
||||
Patch200: mdadm-udev.patch
|
||||
Patch201: mdadm-2.5.2-static.patch
|
||||
Patch202: mdadm-get-rhel-version.patch
|
||||
|
||||
# Oracle Patch
|
||||
Patch1001: 1001-mdadm-Fix-IMSM-Raid-assembly-after-disk-link-failure.patch
|
||||
|
||||
BuildRequires: make
|
||||
BuildRequires: systemd-rpm-macros binutils-devel gcc systemd-devel
|
||||
@ -136,6 +146,17 @@ install -m644 %{SOURCE5} %{buildroot}/etc/libreport/events.d
|
||||
/usr/share/mdadm/mdcheck
|
||||
|
||||
%changelog
|
||||
* Tue Apr 07 2026 EL Errata <el-errata_ww@oracle.com> - 4.4-4.0.1
|
||||
- mdadm: Fix IMSM Raid assembly after disk link failure and reboot [Orabug: 37635990]
|
||||
|
||||
* Tue Feb 10 2026 Xiao Ni <xni@redhat.com> 4.4-4
|
||||
- enable sync del mode and some booting fixes
|
||||
- Resolves RHEL-106747 RHEL-130808
|
||||
|
||||
* Wed Nov 26 2025 Xiao Ni <xni@redhat.com> 4.4-3
|
||||
- udev change and don't stop array during assemble
|
||||
- Resolves RHEL-130808 RHEL-106747
|
||||
|
||||
* Mon May 19 2025 Xiao Ni <xni@redhat.com> 4.4-2
|
||||
- mdadm grow command can't work
|
||||
- Resolves RHEL-92270
|
||||
|
||||