Upstream: RHEL-only
Resolves: https://issues.redhat.com/browse/RHEL-47130
kdump prefixes kdump- to a name of a network interface when its name
matches eth.*. However, the regex is incorrect and matches names like
baremetal*. As a consequence, kdump failed with the following message,
[ 305.950223] kdump.sh[790]: Device "kdump-baremetal" does not exist.
[ 305.992018] kdump[795]: wrong kdumpnic: kdump-baremetal
[ 306.029386] kdump[797]: get_host_ip exited with non-zero status!
[ 306.038085] systemd[1]: kdump-capture.service: Main process exited, code=exited, status=1/FAILURE
[ 306.050082] systemd[1]: kdump-capture.service: Failed with result 'exit-code'.
Fixes: ba7660f ("dracut-module-setup: NIC renamed with prefix "kdump-" for native ethX")
Reported-by: Dima Shtranvasser <dshtranv@redhat.com>
Suggeste-by: Dima Shtranvasser <dshtranv@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-25490
Upstream: Fedora
Conflict: None
commit 468336700d
Author: Lichen Liu <lichliu@redhat.com>
Date: Mon Jan 22 15:59:09 2024 +0800
dracut-module-setup: Skip initrd-cleanup and initrd-parse-etc in kdump
When using multipath devices as the target for kdump, if user_friendly_name
is also specified, devices default to names like "mpath*", e.g., mpatha.
In dracut, we obtain a persistent device name via get_persistent_dev. However,
dracut currently believes using /dev/mapper/mpath* could cause issues, thus
alternatively names are used, here it's /dev/disk/by-uuid/<FS_UUID>.
During the kdump boot progress, the /dev/disk/by-uuid/<FS_UUID> will exist as
soon as one of the path devices exists, but it won't be usable by systemd,
since multipathd will claim that device as a path device. Then multipathd will
get stopped before it can create the multipath device.
Without user_friendly_name, /dev/mapper/<WWID> is considered a persistent
device name, avoiding the issue.
The exit of multipathd is due to two dependencies in the current dracut module
90multipath/multipathd.service, "Before=initrd-cleanup.service" and
"Conflicts=initrd-cleanup.service".
As per man 5 systemd.unit, if A.service has "Conflicts=B.service", starting
B.service will stop A.service.
This is useful during normal boot. However, we will never switch-root after
capturing vmcore in kdump.
We need to ensure that multipathd is not killed due to such dependency issue.
Without modifying multipathd.service, we add ConditionPathExists=!/proc/vmcore
to skip initrd-cleanup.service in kdump. This approach is beneficial as
it avoid the potential termination of other services that conflict with
initrd-cleanup.service. Also skip initrd-parse-etc.service as it will try to
start initrd-cleanup.service. Both of these services are used for switch root,
so they can be safely skipped in kdump.
Suggested-by: Benjamin Marzinski <bmarzins@redhat.com>
Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-23831
Upstream: Fedora
Conflict: None
commit bc101086e2
Author: Coiby Xu <coxu@redhat.com>
Date: Mon Dec 12 18:37:25 2022 +0800
dracut-module-setup.sh: also install the driver of physical NIC for
Hyper-V VM with accelerated networking
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2151842
Currently, vmcore dumping to remote fs fails on Azure Hyper-V VM with
accelerated networking because it uses a physical NIC for accrelarated
networking [1]. In this case, the driver for this physical NIC should be
installed as well.
[1] https://learn.microsoft.com/en-us/azure/virtual-network/accelerated-networking-overview
Fixes: a65dde2d ("Reduce kdump memory consumption by only installing needed NIC drivers")
Reported-by: Xiaoqiang Xiong <xxiong@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: bz2229287
Upstream: RHEL-ONLY
Conflict: None
There is a use case where a separate NIC is used to handle DNS queries.
In this case this NIC should be added to the allowlist as well.
Fixes: e67e4bd ("Reduce kdump memory consumption by only installing needed NIC drivers")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: bz1958587
Upstream: Fedora
Conflict: None
commit 3b22cce1cb
Author: Coiby Xu <coxu@redhat.com>
Date: Wed Dec 14 10:12:17 2022 +0800
dracut-module-setup.sh: skip installing driver for the loopback
interface
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2151500
Currently, kdump initrd fails to be built when dumping vmcore to
localhost via ssh or nfs,
kdumpctl[3331]: Cannot get driver information: Operation not supported
kdumpctl[1991]: dracut: Failed to get the driver of lo
dracut[2020]: Failed to get the driver of lo
kdumpctl[1775]: kdump: mkdumprd: failed to make kdump initrd
kdumpctl[1775]: kdump: Starting kdump: [FAILED]
systemd[1]: kdump.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: kdump.service: Failed with result 'exit-code'.
systemd[1]: Failed to start Crash recovery kernel arming.
systemd[1]: kdump.service: Consumed 1.710s CPU time.
This is because the loopback interface is used for transferring vmcore and
ethtool can't get the driver of the loopback interface. In fact, once
COFNIG_NET is enabled, the loopback device is enabled and there is no driver
for the loopback device. So skip installing driver for the loopback device.
The loopback interface is implemented in linux/drivers/net/loopback.c
and always has the name "lo". So we can safely tell if a network
interface is the loopback interface by its name.
Fixes: a65dde2d ("Reduce kdump memory consumption by only installing needed NIC drivers")
Reported-by: Martin Pitt <mpitt@redhat.com>
Reported-by: Rich Megginson <rmeggins@redhat.com>
Reviewed-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: bz1958587
Upstream: Fedora
Conflict: 1. RHEL8's mkdumprd has different dracut_args from upstream's.
2. RHEL8's kdump_install_net is different from upstream's and
we should install needed NIC drivers in the end.
commit a65dde2d10
Author: Coiby Xu <coxu@redhat.com>
Date: Thu May 19 11:39:25 2022 +0800
Reduce kdump memory consumption by only installing needed NIC drivers
Even after having asked NM to stop managing a unneeded NIC, a NIC driver
may still waste memory. For example, mlx5_core uses a substantial amount
of memory during driver initialization,
======== Report format module_summary: ========
Module mlx5_core using 350.2MB (89650 pages), peak allocation 367.4MB (94056 pages)
Module squashfs using 13.1MB (3360 pages), peak allocation 13.1MB (3360 pages)
Module overlay using 2.1MB (550 pages), peak allocation 2.2MB (555 pages)
Module dns_resolver using 0.9MB (219 pages), peak allocation 5.2MB (1338 pages)
Module mlxfw using 0.7MB (172 pages), peak allocation 5.3MB (1349 pages)
======== Report format module_summary END ========
======== Report format module_top: ========
Top stack usage of module mlx5_core:
(null) Pages: 89650 (peak: 94056)
ret_from_fork (0xffffda088b4165f8) Pages: 60007 (peak: 60007)
kthread (0xffffda088b4bd7e4) Pages: 60007 (peak: 60007)
worker_thread (0xffffda088b4b48d0) Pages: 60007 (peak: 60007)
process_one_work (0xffffda088b4b3f40) Pages: 60007 (peak: 60007)
work_for_cpu_fn (0xffffda088b4aef00) Pages: 53906 (peak: 53906)
local_pci_probe (0xffffda088b9e1e44) Pages: 53906 (peak: 53906)
probe_one mlx5_core (0xffffda084f899cc8) Pages: 53518 (peak: 53518)
mlx5_init_one mlx5_core (0xffffda084f8994ac) Pages: 49756 (peak: 49756)
mlx5_function_setup.constprop.0 mlx5_core (0xffffda084f899100) Pages: 44434 (eak: 44434)
mlx5_satisfy_startup_pages mlx5_core (0xffffda084f8a4f24) Pages: 44434 (peak: 44434)
mlx5_function_setup.constprop.0 mlx5_core (0xffffda084f899078) Pages: 5285 (peak: 5285)
mlx5_cmd_init mlx5_core (0xffffda084f89e414) Pages: 4818 (peak: 4818)
mlx5_alloc_cmd_msg mlx5_core (0xffffda084f89aaa0) Pages: 4403 (peak: 4403)
This memory consumption is completely unnecessary when kdump doesn't need
this NIC. Only install needed NIC drivers to prevent this kind of waste.
Note
1. this patch depends on [1] to ask dracut to not install NIC drivers.
2. "ethtool -i" somehow fails to get the vlan driver
3. team.ko doesn't depend on the team mode drivers so we need to install
the team mode drivers manually.
[1] https://github.com/dracutdevs/dracut/pull/1789
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Thomas Haller <thaller@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1964822
Upstream: RHEL-only
Currently, vmcore dumping to remote fs gives a warning "eth0: Failed to
rename network interface 3 from 'eth0' to 'kdump-eth0': File exists" on
Azure Hyper-V VM with accelerated networking because it uses a physical
NIC for accelerated networking [1] and the backing physical NIC has the
same MAC address as the virtual NIC. In the kdump initrd, an udev rule
will try renaming NICs with the given MAC address and fails as expected
since there are two NICs having the same MAC address. This udev rule is
created automatically when specifying the dracut cmdline
"ifname=<interface>:<MAC>". For the case of Azure Hyper-V VM with
accelerated networking, only the virtual network interface need to be
renamed. So create an udev rule manually.
[1] https://learn.microsoft.com/en-us/azure/virtual-network/accelerated-networking-overview
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1964822
Upstream: RHEL-only
Currently, vmcore dumping to remote fs gives a warning "eth0: Failed to
rename network interface 3 from 'eth0' to 'kdump-eth0': File exists" on
Azure Hyper-V VM with accelerated networking because it uses a physical
NIC for accelerated networking [1] and the backing physical NIC has the
same MAC as the virtual NIC. There is no need to rename a Hypver-V
interface in this case which also leads the aforementioned warning.
[1] https://learn.microsoft.com/en-us/azure/virtual-network/accelerated-networking-overview
Signed-off-by: Coiby Xu <coxu@redhat.com>