Resolves: https://issues.redhat.com/browse/RHEL-104940
Conflict: None
commit 4094199402119ad4f97d073b40a35d890754dc8e
Author: Coiby Xu <coxu@redhat.com>
Date: Wed Sep 17 08:57:30 2025 +0800
Limit LUKS support to x86_64
The LUKS support depends on the kernel but only x86_64 kernel part is
ready. So limit this feature to x86_64.
And don't fail kdump.service even when x86_64 kernel doesn't have
/sys/kernel/config/crash_dm_crypt_keys in case users have already
manually made dumping to encrypted target work.
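The gating described above can be sketched as follows. This is a minimal illustration, not the actual kdumpctl code: the function name `luks_dump_supported`, its parameters, and its return codes are hypothetical. Only the x86_64 restriction and the configfs path come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether kernel-assisted LUKS dumping can be set up.
# $1: machine architecture (defaults to `uname -m`)
# $2: configfs directory (defaults to the path named in the commit message)
luks_dump_supported() {
    local arch=${1:-$(uname -m)}
    local keys_dir=${2:-/sys/kernel/config/crash_dm_crypt_keys}

    # Only the x86_64 kernel side is ready, so other architectures are skipped.
    [ "$arch" = "x86_64" ] || return 1

    # A missing configfs directory is reported but not treated as fatal by the
    # caller, since the user may have made the encrypted target work manually.
    if [ ! -d "$keys_dir" ]; then
        echo "warning: $keys_dir not found, skipping volume key setup" >&2
        return 2
    fi
    return 0
}
```

A caller would continue starting kdump for return code 2 and only skip the key setup step.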
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-104940
Conflict: None
commit ddd33c8d0f552cb46097aeade86178266637aa05
Author: Coiby Xu <coxu@redhat.com>
Date: Tue Sep 16 16:22:18 2025 +0800
Add kdumpctl setup-crypttab subcommand
Resolves: https://issues.redhat.com/browse/RHEL-29037
Relates: https://issues.redhat.com/browse/RHEL-29039
This subcommand is to add the 'link-volume-key' option to /etc/crypttab
so the volume keys can be passed to the crash kernel to unlock
LUKS-encrypted device automatically.
This API will also be called by kdump-anaconda-addon to set up
/etc/crypttab on installation.
Signed-off-by: Coiby Xu <coxu@redhat.com>
Assisted-by: Google Gemini
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-104940
Conflict: None
commit 9980a416759c58e67a206525ddb82d581932c3ad
Author: Coiby Xu <coxu@redhat.com>
Date: Tue Sep 16 11:52:00 2025 +0800
Return LUKS devices in the form of UUIDs directly
So the callers of kdump_check_crypt_targets don't need to convert the
result into UUIDs.
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-104940
Conflict: .github/workflows/main.yml only exists upstream
commit e25009be073adcf00885d42c5ae3856b49ce7188
Author: Coiby Xu <coxu@redhat.com>
Date: Wed Jul 23 13:32:42 2025 +0800
Fix SC2181 issues in kdump-udev-throttler
Fix the issues found by shellcheck,
In kdump-udev-throttler line 30:
if [ $? -ne 0 ]; then
^-- SC2181 (style): Check exit code directly with e.g. 'if ! mycmd;', not indirectly with $?.
In kdump-udev-throttler line 37:
if [ $? -ne 0 ]; then
^-- SC2181 (style): Check exit code directly with e.g. 'if ! mycmd;', not indirectly with $?.
Also add kdump-udev-throttler to shellcheck list for Github Action.
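For illustration, the pattern behind SC2181 looks roughly like this. `try_step` is a hypothetical stand-in for whatever command kdump-udev-throttler runs; the script's real commands are not shown in this log.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the command whose status was being tested.
try_step() { [ "$1" = "ok" ]; }

# Before: the status is read indirectly, so any command added between the
# call and the test would silently change what $? refers to.
check_indirect() {
    try_step "$1"
    if [ $? -ne 0 ]; then
        echo "failed"
        return 1
    fi
    echo "passed"
}

# After: test the command itself, as SC2181 recommends.
check_direct() {
    if ! try_step "$1"; then
        echo "failed"
        return 1
    fi
    echo "passed"
}
```

Both functions behave the same today; the direct form simply cannot be broken by a later edit that inserts a command before the test.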
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-104940
Conflict: None
commit dd8d34c4baf9878fa7ff2e5f6807cfc93ae40e0b
Author: Coiby Xu <coxu@redhat.com>
Date: Tue Jun 4 15:51:47 2024 +0800
LUKS: make /usr writable
Since systemd commit ffc1ec73b3 ("pid1: add ProtectSystem= as system-wide
configuration, and default it to true in the initrd"), systemd makes
/usr read-only by default and it will cause dracut to not wait for the
LUKS-encrypted devices to be unlocked,
dracut-cmdline[296]: mv: inter-device move failed: '/tmp/294-daemon-reload.sh' to '/lib/dracut/hooks/initqueue/daemon-reload.sh'; unable to remove target: Read-only file syste
dracut-cmdline[294]: /sbin/initqueue: line 71: /lib/dracut/hooks/initqueue/work: Read-only file system
dracut-cmdline[221]: /lib/dracut-dev-lib.sh: line 118: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fmyvg\x2fluks_lv.sh: Read-only file system
dracut-cmdline[221]: /lib/dracut-dev-lib.sh: line 103: /lib/dracut/hooks/emergency/80-\x2fdev\x2fmyvg\x2fluks_lv.sh: Read-only file system
Fix the above issue by making /usr writable.
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-104940
Conflict: None
commit 9de617170ad9bac97db53a2bf031e895fb482dba
Author: Coiby Xu <coxu@redhat.com>
Date: Wed Jun 25 11:26:53 2025 +0800
Address CPU/memory hot plugging for kdump LUKS support
We can reuse LUKS volume keys when there is CPU/memory hot plugging by
writing 1 to /sys/kernel/config/crash_dm_crypt_keys/reuse to reuse keys
already saved to kdump reserved memory.
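A minimal sketch of that write, assuming the configfs attribute is named `reuse`. The function name and its directory parameter are hypothetical and exist only so the sketch can be exercised outside a real configfs mount.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ask the kernel to reuse already-saved volume keys.
# $1: configfs directory, defaulting to the path from the commit message.
reuse_saved_luks_keys() {
    local keys_dir=${1:-/sys/kernel/config/crash_dm_crypt_keys}

    # Nothing to do when the kernel does not expose the interface.
    [ -e "$keys_dir/reuse" ] || return 1

    # Writing 1 tells the kernel to keep using the keys that are already
    # stored in kdump reserved memory.
    echo 1 > "$keys_dir/reuse"
}
```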
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-104940
Conflict: None
commit 0d39f4fd626fea070d9b8af624feacd89938e7db
Author: Coiby Xu <coxu@redhat.com>
Date: Wed Jun 18 09:16:26 2025 +0800
Use cryptsetup --link-vk-to-keyring to save volume keys
cryptsetup open --link-vk-to-keyring (man cryptsetup-open) will link
volume key to specified keyring after successfully unlocking the volume.
Use this feature to save key to @u::%logon:cryptsetup:$UUID to support
the following cases
- volume is unlocked automatically say using TPM-sealed key
- ask user to input passphrase to unlock the volume
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-104940
Conflict: Manually resolve conflict in kdumpctl and mkdumprd.
Install 99kdumpbase/kexec-crypt-setup.sh in kexec-tools.spec.
commit 5cbd7ddd2e164d13f2cd992373df89912fe6e79f
Author: Coiby Xu <coxu@redhat.com>
Date: Mon Aug 7 15:19:36 2023 +0800
Support dumping to a LUKS-encrypted target
Based on the new kernel feature that dm-crypt keys can persist for the
kdump kernel [1][2], this patch which is adapted from [3]
1) ask the 1st kernel to save a copy of the LUKS volume keys
2) ask the kdump kernel to add the copy of the LUKS volume keys to
specified keyring and then use --volume-key-keyring to unlock the
LUKS device.
Users need to set up the link-volume-key option in /etc/crypttab (man
crypttab) so kdumpctl can read the key from @u::%logon:cryptsetup:UUID
and instruct the kernel to save it to kdump-reserved memory.
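As an illustration of the crypttab setup, the sketch below adds the option to the entry for one UUID. The helper name `add_link_volume_key` is hypothetical and it prints the result instead of editing the file in place. It assumes entries reference the device as `UUID=<uuid>` in the second field and uses the key description format quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print crypttab content with link-volume-key added to
# the entry whose second field is UUID=<uuid>.
# $1: LUKS UUID, $2: path to the crypttab file to read
add_link_volume_key() {
    local uuid=$1 crypttab=$2
    awk -v uuid="$uuid" '
        # Leave comments, blank lines and other devices untouched.
        /^[[:space:]]*(#|$)/ || $2 != ("UUID=" uuid) { print; next }
        # Already configured: nothing to do.
        $4 ~ /(^|,)link-volume-key=/ { print; next }
        {
            opt = "link-volume-key=@u::%logon:cryptsetup:" uuid
            key = (NF >= 3) ? $3 : "none"
            opts = (NF >= 4) ? $4 "," opt : opt
            print $1, $2, key, opts
        }
    ' "$crypttab"
}
```

Whitespace in rewritten entries is normalized to single spaces; untouched lines are printed as they were.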
[1] https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kdump/kdump.rst#write-the-dump-file-to-encrypted-disk-volume
[2] 180cf31af7
[3] https://lists.fedorahosted.org/archives/list/kexec@lists.fedoraproject.org/message/Y3KUSJQPN3JHUUC2FPIK7H4HTSX2TUCX/
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-104940
Conflict: None
commit f2c18c4934777cf55a5ea9359c8423adb8f1388b
Author: Coiby Xu <coxu@redhat.com>
Date: Mon Mar 4 17:18:46 2024 +0800
Add a helper function to get uuid by MAJ:MIN
This helper function will be used for kdump LUKS support.
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: RHEL-49555
Upstream: rhkdump/kdump-utils
Conflicts: Slightly changed warning message to reference RHEL11 rather
than "the future".
commit ca6da9b484e7995d8f3ee7e74dd871ee8919e409
Author: Philipp Rudo <prudo@redhat.com>
Date: Wed Oct 2 15:26:34 2024 +0200
kdumpctl: deprecate --reboot for reset-crashkernel
The --reboot option for reset-crashkernel causes quite some confusion.
Especially, there are different expectations when --reboot shall reboot
the system. The current code only reboots the system when the
crashkernel parameter was updated during this run. But there are other
opinions, that --reboot shall also reboot the system if a previous run
of `kdumpctl reset-crashkernel` updated the crashkernel parameter and no
reboot occurred yet. While possible this would add extra complexity to
the code. Neither the confusion nor the extra complexity are justified,
given that --reboot only replaces a single additional command for the
user.
Thus deprecate the --reboot option so it can be removed later on.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-103429
Upstream: rhkdump/kdump-utils
Conflict: Need manually edit kdump.sysconfig files
commit ddb0bab1f7e1e43a802993aadad03f85a3c045a9
Author: Baoquan He <bhe@redhat.com>
Date: Wed Jun 25 23:30:24 2025 +0800
sysconfig: disable kfence in kdump kernel
In the current fedora and RHEL, below config items related to kfence
feature are set by default:
===
CONFIG_HAVE_ARCH_KFENCE=y
CONFIG_KFENCE=y
CONFIG_KFENCE_SAMPLE_INTERVAL=100
CONFIG_KFENCE_NUM_OBJECTS=255
CONFIG_KFENCE_STRESS_TEST_FAULTS=0
CONFIG_KFENCE_KUNIT_TEST=m
===
With them set, kfence costs 2M of extra memory on x86_64 with a 4K page
size, and 32M of extra memory on arm64 with a 64K page size. These figures
cover only the kfence objects and guard pages, not the memory needed to
initialize and run kfence itself.
However, it doesn't make any sense to have kfence in a kdump
kernel. Hence, disable kfence in kdump kernel to save crashkernel
memory.
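The 2M and 32M figures can be reproduced with a short calculation. The sketch assumes the pool is sized as (CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 pages; the commit message does not state this formula, but it matches both numbers. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# KFENCE pool size in KiB, assuming each object takes a data page plus a
# guard page and the pool carries one extra page pair.
# $1: CONFIG_KFENCE_NUM_OBJECTS, $2: page size in KiB
kfence_pool_kib() {
    local num_objects=$1 page_kib=$2
    echo $(( (num_objects + 1) * 2 * page_kib ))
}

kfence_pool_kib 255 4    # x86_64, 4K pages: prints 2048 (2M)
kfence_pool_kib 255 64   # arm64, 64K pages: prints 32768 (32M)
```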
Signed-off-by: Baoquan He <bhe@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-33413
Upstream: rhkdump/kdump-utils
Conflict: Miss upstream patch 0d90d580 ("dracut-module-setup:
consolidate s390 network device config (#1937048)") and
upstream switches to a different way of supporting OVS.
commit 0678138331f6de43aaee0b7fbacf8adb38e73ff0
Author: Coiby Xu <coxu@redhat.com>
Date: Thu Jul 17 17:25:38 2025 +0800
Support dumping to NVMe/TCP configured using NVMe Boot Firmware Table
Resolves: https://issues.redhat.com/browse/RHEL-100907
Resolves: https://issues.redhat.com/browse/RHEL-33413
The dracut nvmf module can take care of all things. It can parse ACPI
NVMe Boot Firmware Table (NBFT) tables, generate NetworkManager profiles
and discover and connect all subsystems.
Currently, the dracut kdump module will try to bring up the same network
connections as in 1st kernel. But a different set of NVMe connections
and active network interfaces will be used for the case of multipathing.
So the dracut kdump module should let dracut nvmf module do everything.
Note connecting everything and having network redundancy may require extra
memory and the default crashkernel may not work. We'll document this
issue and ask users to increase the crashkernel.
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-79413
Upstream: kdump-utils
Conflict: Yes, upstream uses has_command to check if a command is
available, but that function is not backported to RHEL-9.
commit f758448cc7f29a24d8f5ddd7418dc9dd2fc3fd35
Author: Lichen Liu <llc123456a@gmail.com>
Date: Thu May 8 17:22:17 2025 +0800
kdumpctl: check and generate /etc/vconsole.conf
For VMs created from KVM Guest images, /etc/vconsole.conf is missing,
so the dracut module 10i18n installs all kbd files.
```
# du -sh initramfs/squash/usr/lib/kbd/*
438K initramfs/squash/usr/lib/kbd/consolefonts
340K initramfs/squash/usr/lib/kbd/consoletrans
2.1M initramfs/squash/usr/lib/kbd/keymaps
232K initramfs/squash/usr/lib/kbd/unimaps
```
From man 5 vconsole.conf, KEYMAP= defaults to "us" if not set. We can
safely generate a /etc/vconsole.conf with KEYMAP=us by localectl to
reduce the initramfs size.
```
# du -sh initramfs/squash/usr/lib/kbd/*
11K initramfs/squash/usr/lib/kbd/consolefonts
121K initramfs/squash/usr/lib/kbd/keymaps
```
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-91274
Upstream: rhel-only
In the previous commit (3053d959), we adjusted the default value of
crashkernel. This changed the default behavior of RHEL-9.
Systems with less than 2G memory will not be able to start kdump by
default. This is a breaking change on small VMs that only use local
dump.
In order to keep compatibility, reserve crashkernel for systems with
1G-2G memory on RHEL-9.
Fixes: 3053d959 ("kdump-lib.sh: Adjust default crashkernel reservation for x86_64 and s390x")
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Upstream: kdump-utils
Resolves: RHEL-19064
commit 0fc7e887a20a9e8536411750b64f0f8a1315f01b
Author: Lichen Liu <lichliu@redhat.com>
Date: Tue Mar 25 14:05:49 2025 +0800
99-kdump.conf: Omit clevis related dracut modules
The clevis related dracut modules are unconditionally included into the
kdump initramfs after installing clevis-dracut package.
Normally, we don't use clevis in the kdump process, but it will increase
the size of kdump initramfs by about 11M, which is a relatively large
overhead for kdump. Omitting them by default can reduce the memory required
by kdump and avoid the OOM issue.
If the user really needs to use it, it can also be added to kdump
initramfs by using dracut_args --force-add in /etc/kdump.conf.
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-83645
Upstream: kdump-utils
commit 18fc5b7e7b89ee3caba04ce18bf32036aa19da6e
Author: Lichen Liu <lichliu@redhat.com>
Date: Fri Mar 7 12:22:05 2025 +0800
kdump-lib.sh: rounded up the total_mem to 128M in get_system_size
The kernel code to calculate reserving size rounded up the total_mem because
usually the memblock usable mem size is smaller than the physical dimm memory
size:
total_mem = roundup(total_mem, SZ_128M);
This patch is aimed to align with the kernel's behavior. A machine showing
4000MB of total memory will be treated as having 4G instead of 3G memory.
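The rounding can be checked with shell arithmetic. This is a sketch of the calculation only, not the body of get_system_size; the function name and its MiB-based parameters are hypothetical.

```shell
#!/usr/bin/env bash
# Round a memory size up to the next multiple of an alignment.
# $1: total memory in MiB, $2: alignment in MiB (128 by default)
roundup_mib() {
    local total=$1 align=${2:-128}
    echo $(( (total + align - 1) / align * align ))
}

roundup_mib 4000   # prints 4096, so the machine is treated as 4G, not 3G
roundup_mib 4096   # prints 4096, already aligned values are unchanged
```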
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2306493
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-83645
Upstream: kdump-utils
commit fe18f933baed9eaa18e0c2427aaca4640d8f6fa1
Author: Lichen Liu <lichliu@redhat.com>
Date: Thu Mar 6 10:21:43 2025 +0800
kdump-lib.sh: Adjust default crashkernel reservation for x86_64 and s390x
With new kernel features being added, both the kernel itself and the initramfs
have gradually increased in size.
Previously, we used squashfs to package and reduce the initramfs size. However,
since squashfs will be replaced by erofs, we have transitioned to erofs. The
compression algorithms supported by erofs differ from those used in squashfs.
In previous squashfs implementations, we used zstd compression, but in RHEL-10,
the erofs implementation in the kernel does not yet support zstd decompression.
As a result, we had to switch to other compression algorithms, leading to
changes in the initramfs size.
In some scenarios, the previous 192M crashkernel reservation is no longer
sufficient. Recent NFS testing has shown that at least 238M is required to
successfully capture a vmcore. Given this, we have updated the default
crashkernel reservation to start from 2G, with 256M allocated for crash
recovery.
Since 256M is a significant portion of memory on smaller systems, we have
decided not to reserve crashkernel memory by default on systems with less
than 2G of RAM. However, users can still manually adjust the `crashkernel=`
setting via tools like `grubby` if needed.
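To show how a ranged `crashkernel=` value maps system memory to a reservation, here is a small sketch. The helper names are hypothetical, only M and G suffixes are handled, and the spec string in the usage lines is illustrative: the commit message states only that reservation starts at 2G with 256M.

```shell
#!/usr/bin/env bash
# Convert a size such as 192M or 4G to MiB (only M and G suffixes handled).
to_mib() {
    local v=$1
    case $v in
        *G) echo $(( ${v%G} * 1024 )) ;;
        *M) echo "${v%M}" ;;
        *) return 1 ;;
    esac
}

# Print the reservation that a range spec selects for a given amount of
# system memory in MiB, or nothing if no range matches.
# Ranges are start-end with an inclusive start and an exclusive end;
# an empty end means "no upper limit".
crashkernel_for_mem() {
    local spec=$1 mem_mib=$2 entry range size start end
    local IFS=,
    for entry in $spec; do
        range=${entry%%:*}
        size=${entry#*:}
        start=$(to_mib "${range%%-*}") || return 1
        end=${range#*-}
        if [ -n "$end" ]; then
            end=$(to_mib "$end") || return 1
        fi
        if [ "$mem_mib" -ge "$start" ] && { [ -z "$end" ] || [ "$mem_mib" -lt "$end" ]; }; then
            echo "$size"
            return 0
        fi
    done
    return 0
}

crashkernel_for_mem '2G-64G:256M,64G-:512M' 4096   # prints 256M
crashkernel_for_mem '2G-64G:256M,64G-:512M' 1024   # prints nothing: below 2G
```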
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-84470
Upstream: kdump-utils
Conflict: None
commit d6e1edc677bfd15fb4553101ffb5a34130959861
Author: Coiby Xu <coxu@redhat.com>
Date: Wed Mar 13 14:12:58 2024 +0800
doc/kdump.conf: correctly align the options
Currently, the other options like "raw <partition>" become child items
of the auto_reset_crashkernel option,
auto_reset_crashkernel <yes|no>
...
raw <partition>
...
nfs <nfs mount>
...
...
Fix it by ending the auto_reset_crashkernel with ".RE".
Fixes: 73ced7f4 ("introduce the auto_reset_crashkernel option to kdump.conf")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-66065
Conflict: in upstream, 99-kdump.conf is introduced by
commit dacb34341 (Add kdump dracut config). And it
has been back ported to RHEL-9. So implementing the
omit based on the new skeleton.
commit 1732a3bd477b3bc0b078b0f070024f350c22ef2b
Author: Pingfan Liu <piliu@redhat.com>
Date: Mon Sep 9 11:52:00 2024 +0800
mkdumprd: Omit rdma module
Resolves: https://issues.redhat.com/browse/RHEL-57006
05rdma dracut module from rdma-core is installed unconditionally even if
kdump dumps the vmcore to local disk. Those kmods cost an
additional 200M of memory on x86, which likely triggers OOM.
Since InfiniBand (and in fact any RDMA device) is not supported in
kdump, exclude the rdma dracut module explicitly.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-67131
Conflict: Upstream uses a different path dracut/99kdumpbase/module-setup.sh.
Upstream Status: git@github.com:rhkdump/kdump-utils.git
commit 821a5e648dcb72d89065078e0cadb58a5313b183
Author: Coiby Xu <coxu@redhat.com>
Date: Thu Nov 28 18:07:49 2024 +0800
Fallback to get NIC driver by /sys/class/net/NIC/device/driver/module
Resolves: https://issues.redhat.com/browse/RHEL-67131
Some drivers like dwmac_tegra may report their names incorrectly. So
fall back to /sys/class/net/NIC/device/driver/module to address these
cases. Note this method only works for physical NICs, i.e. it doesn't work
for virtual NICs like bonding NICs.
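A minimal sketch of the sysfs lookup, assuming the `module` entry is a symlink whose target's last path component is the module name. The function name and the `sysfs_root` parameter are hypothetical; the primary detection method is not shown.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a NIC's driver name from sysfs.
# $1: interface name, $2: sysfs root (defaults to /sys)
nic_driver_from_sysfs() {
    local nic=$1 sysfs_root=${2:-/sys}
    local link=$sysfs_root/class/net/$nic/device/driver/module

    # Virtual NICs such as bonds have no device/driver/module link.
    [ -L "$link" ] || return 1

    # The link target ends in the module name.
    basename "$(readlink -f "$link")"
}
```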
Suggested-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-69568
Conflict: None
Upstream Status: git@github.com:rhkdump/kdump-utils.git
commit 200107e45ac2619bf7d865a19ff896a83b97fe56
Author: Coiby Xu <coxu@redhat.com>
Date: Tue Dec 10 16:38:23 2024 +0800
Note user-specified crashkernel value will be overwritten by default value
Resolves: https://issues.redhat.com/browse/RHEL-69568
When kdump-utils is updated automatically, i.e. not manually by the
user, users may not notice that a custom crashkernel value has been
overwritten, as the logs in /var/log/dnf.rpm.log may be ignored,
2024-12-10T02:57:12-0500 INFO kdump: For kernel=/boot/vmlinuz-6.11.0-28.el10.x86_64, crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M now. Please reboot the system for the change to take effect. Note if you don't want kexec-tools to manage the crashkernel kernel parameter, please set auto_reset_crashkernel=no in /etc/kdump.conf.
So add a note to kdump.conf.
Suggested-by: Ryosuke Yasuoka <ryasuoka@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://bugzilla.redhat.com/RHEL-52304
Upstream: kdump-utils
commit 77a0246cde3505777cfa1f9c2a1a834e76b7eed6
Author: Lichen Liu <lichliu@redhat.com>
Date: Mon Jan 13 17:39:56 2025 +0800
99-kdump.conf: Omit nouveau and amdgpu module
Resolves: https://issues.redhat.com/browse/RHEL-52304
The GPU module provides no significant utility in second kernel, and it
introduces firmware that occupies lots of memory, which is critical in
the constrained environment of kdump. Omitting it helps reduce memory usage
and optimize the crash recovery process.
See also:
https://access.redhat.com/solutions/6977793
https://access.redhat.com/solutions/7100186
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-61620
Upstream: kdump-utils
commit 6f32fab791746da3f075273e156bd70055db5d4c
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date: Fri Oct 4 20:09:35 2024 +0530
kdump.service: Replace ConditionKernelCommandLine with ExecCondition
Commit 0084806493 ("kdump.service: use ConditionKernelCommandLine=crashkernel")
added a condition based on the crashkernel kernel command line parameter to
control the start of the kdump service.
While ConditionKernelCommandLine=crashkernel works well for kdump, it causes
issues for fadump (specific to the PowerPC architecture), which also uses the
same service unit. Unlike kdump, crashkernel kernel command-line is not
mandatory for fadump.
Since ConditionKernelCommandLine doesn't support evaluating multiple kernel
command line parameters with dependencies between them, it has been replaced
with ExecCondition to resolve this limitation.
Now, if fadump is configured and the crashkernel parameter is NOT present in the
kernel command line, kdump service will start.
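The start condition can be sketched as a shell check over the kernel command line. This is an assumption about the logic, not the command used in the unit file: the function name is hypothetical, and treating `fadump=on` and `fadump=nocma` as "fadump is configured" is a choice made for this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical condition: succeed when the service has a reason to start.
# $1: kernel command line (for example the contents of /proc/cmdline)
kdump_service_should_start() {
    local cmdline=$1 arg
    for arg in $cmdline; do
        case $arg in
            # kdump needs a crashkernel reservation.
            crashkernel=*) return 0 ;;
            # fadump does not require crashkernel= to be present.
            fadump=on|fadump=nocma) return 0 ;;
        esac
    done
    return 1
}
```

With ExecCondition=, systemd skips the unit without marking it failed when the command exits with a status from 1 to 254, which is why the sketch returns 1 rather than a larger error code.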
Fixes: #35
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-52925
Upstream: kdump-utils
commit 53d8e6ecd6177568a245bf3ce3558798897195c4
Author: Hari Bathini <hbathini@linux.ibm.com>
Date: Sun Nov 17 09:15:48 2024 +0530
fadump: fix passing additional parameters for capture kernel
Currently, additional parameters for fadump capture kernel are being
set only with reload command. In fact, the check for bootargs_append
node before writing to it is incorrect too. Fix it and also set up
fadump additional parameters for the start/restart case as well.
Fixes: 77b80ce5 ("fadump: pass additional parameters for capture kernel")
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-52925
Upstream: kdump-utils
Conflict: apply the changes in kdump.sysconfig.ppc64le manually
commit 77b80ce5e369c7b5cf8321e4cdc20c139910f92c
Author: Hari Bathini <hbathini@linux.ibm.com>
Date: Sat Oct 19 00:28:34 2024 +0530
fadump: pass additional parameters for capture kernel
Since kernel commit 3416c9daa6b13 ("powerpc/fadump: pass additional
parameters when fadump is active"), fadump supports passing additional
parameters to dump capture kernel. Leverage that support here to pass
additional parameters to dump capture kernel.
Also, update fadump-howto.txt to make clear which options in
/etc/sysconfig/kdump are not relevant for fadump.
The default bootargs to append for fadump capture kernel boot are
chosen with the intent to optimize resources and reduce memory
footprint in dump capture environment.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Upstream: fedora
Resolves: RHEL-70214
Conflict: Yes, the conflict is the same as the original c9s commit
c5aa4609 ("Introduce vmcore creation notification to kdump")
9ec61f6c ("Return the correct exit code of rebuild initrd")
Also this patch cherry-picked the ipv6 fixed in [1].
[1]: https://github.com/rhkdump/kdump-utils/pull/60/files
commit 24e76222c740def1d03a506652400fe55959e024
Author: Tao Liu <ltao@redhat.com>
Date: Fri Nov 29 16:15:18 2024 +1300
Re-introduce vmcore creation notification to kdump
Motivation
==========
People may forget to recheck that kdump still works, with the result that
no vmcore may be generated after a real system crash. This is
unexpected for kdump.
It is highly recommended that people test kdump after any system modification,
such as:
a. after kernel patching or whole yum update, as it might break something
on which kdump is dependent, maybe due to introduction of any new bug etc.
b. after any change at hardware level, maybe storage, networking,
firmware upgrading etc.
c. after implementing any new application, like which involves 3rd party modules
etc.
Though these are beyond the scope of kdump, a simple vmcore creation
status notification is good to have for now.
Design
======
Kdump currently checks whether any related files/fs/drivers have been
modified to determine if the initrd should be rebuilt on (re)start. A
rebuild is an indicator of such modification, and kdump needs to be tested.
This will clear the vmcore creation status specified in
$VMCORE_CREATION_STATUS, and as a result, a notification of vmcore creation
test will be output.
To test kdump, there is an entry for doing that by "kdumpctl test". It
will generate a timestamp string as the ID of the current test, along
with a "pending" status in $VMCORE_CREATION_STATUS, then a real crash &
dump process will be triggered.
After system reboot back to normal, a vmcore creation check will start at
"kdumpctl (re)start/status", and will report the results as
success/fail/manual status to users.
To achieve that, the program first checks the status in $VMCORE_CREATION_STATUS.
If a "pending" status is found, the test result is undetermined and needs
to be retrieved from the remote/local dump folder. If the test
id is found in the dump folder and the vmcore is complete, "pending"
is overwritten by "success", which indicates a successful kdump
test. If the test id is found in the dump folder but the vmcore is incomplete,
it is a "fail" kdump test. If no test id is found, it is a "manual"
status, which indicates users should check the test results manually.
If $VMCORE_CREATION_STATUS is already success/fail/manual status, it indicates
the test result has already been determined, so the program will not access
the remote/local dump folder again. This limits unnecessary
access to the dump target and shortens the time taken.
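The decision rules above can be sketched as a pure function. The name `resolve_vmcore_status` and its yes/no arguments are hypothetical; how the test id and vmcore completeness are actually detected in the dump folder is outside this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map the recorded status and dump-folder findings to
# the status that should be reported.
# $1: recorded status, $2: "yes" if the test id was found in the dump folder,
# $3: "yes" if the vmcore found there is complete
resolve_vmcore_status() {
    local status=$1 id_found=$2 vmcore_complete=$3
    if [ "$status" != "pending" ]; then
        # Already determined: report as-is without touching the dump target.
        echo "$status"
    elif [ "$id_found" != "yes" ]; then
        echo "manual"
    elif [ "$vmcore_complete" = "yes" ]; then
        echo "success"
    else
        echo "fail"
    fi
}
```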
Users should check for the root cause of a fail/manual status when it is
reported.
$VMCORE_CREATION_STATUS is used for recording the vmcore creation status of
the current env. The format is like:
<status> kdump_test_id=<timestamp sec>-<timestamp nanosec>
e.g:
success kdump_test_id=1729823462-938751820
Which means, there has been a successful kdump test at
$(date -d "@1729823462") timestamp for the current env. The nanosecond
part of the timestamp only serves to make the id string unique.
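A short sketch of writing and reading an entry in this format. Both function names are hypothetical, and `date +%s-%N` assumes GNU date.

```shell
#!/usr/bin/env bash
# Build a new "pending" entry whose id is <seconds>-<nanoseconds>.
new_pending_entry() {
    printf 'pending kdump_test_id=%s\n' "$(date +%s-%N)"
}

# Split an entry into "<status> <seconds> <nanoseconds>".
parse_vmcore_status() {
    local line=$1 status id
    status=${line%% *}
    id=${line#*kdump_test_id=}
    echo "$status ${id%%-*} ${id#*-}"
}

# The seconds field can be fed to date to obtain a readable timestamp:
#   read -r status sec nsec <<< "$(parse_vmcore_status "$entry")"
#   date -d "@$sec"
```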
Difference
==========
Previously there is one commit 88525ebf ("Introduce vmcore creation
notification to kdump") merged and addressing the same issue, but
implemented differently:
The prev one:
Save the $VMCORE_CREATION_STATUS to local drive during the 2nd kernel
dumping. If vmcore dumping target is different from $VMCORE_CREATION_STATUS's
drive, then the latter one need to be mounted in 2nd kernel.
This one:
Save the $VMCORE_CREATION_STATUS to local drive only in 1st kernel, that
is, the test result is retrieved after 2nd kernel dumping. So it doesn't
load or mount other drive in 2nd kernel.
The advantage:
Extra mounting in the 2nd kernel introduces a higher risk of failure and,
as a result, lowers the success rate of vmcore dumping, which is
unacceptable. So keeping the code for the 2nd kernel as simple as possible
is preferred.
Usage
=====
[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: No vmcore creation test performed!
[root@localhost ~]# kdumpctl status
kdump: Kdump is operational
kdump: Notice: No vmcore creation test performed!
[root@localhost ~]# kdumpctl test
[root@localhost ~]# cat /var/lib/kdump/vmcore-creation.status
pending kdump_test_id=1729823462-938751820
[root@localhost ~]# kdumpctl status
kdump: Kdump is operational
kdump: Notice: Last successful vmcore creation on Fri Oct 25 02:31:02 AM UTC 2024
[root@localhost ~]# cat /var/lib/kdump/vmcore-creation.status
success kdump_test_id=1729823462-938751820
[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: Last successful vmcore creation on Fri Oct 25 02:31:02 AM UTC 2024
Note: the notification for kdumpctl (re)start/status can be disabled by
setting VMCORE_CREATION_NOTIFICATION in /etc/sysconfig/kdump. And fadump
is NOT supported for this feature.
Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: RHEL-70214
Upstream: fedora
Conflict: Yes, the conflict is the same as the original c9s commit
c5aa4609 ("Introduce vmcore creation notification to kdump")
9ec61f6c ("Return the correct exit code of rebuild initrd")
commit 96956928a66d9256cdf8bfed6a8963ddea35aac9
Author: Tao Liu <ltao@redhat.com>
Date: Fri Nov 29 14:42:01 2024 +1300
Revert "Introduce vmcore creation notification to kdump"
This patch will revert the following 2 patches:
88525ebf ("Introduce vmcore creation notification to kdump")
35449537 ("Return the correct exit code of rebuild initrd")
For the preparation of reimplementation of vmcore creation notification.
Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: RHEL-49590
Upstream: https://github.com/rhkdump/kdump-utils
Conflict: Yes, kexec-tools is split into 3 parts upstream, some changes should
be applied to kdump-utils Makefile, but RHEL-9 kexec-tools doesn't have that.
Also missing upstream commits:
- 1732a3b(mkdumprd: Omit rdma module)
- 7fec2f56(mkdumprd: simplify handling of dracut arguments)
commit dacb34341113fa925c15e28a7ce56a80dd370e2f
Author: Lichen Liu <lichliu@redhat.com>
Date: Tue Nov 5 12:07:42 2024 +0800
Add kdump dracut config
In some cases, customizing the first kernel's initrd is necessary by
modifying the dracut `omit_dracutmodules` options, such as in Bootc
or CoreOS scenarios [1]. However, these changes can unintentionally
break existing functionality in kdump. For instance, setting
`omit_dracutmodules='nfs'` prevents the `nfs` module from being added.
Additionally, some dracut configurations [2] use
`dracutmodules+='some modules'` instead of
`add_dracutmodules+='some modules'`. When `dracutmodules` is non-empty,
dracut includes only the specified modules, which can result in an
initrd that lacks necessary modules, causing kdump to fail.
Dracut upstream now supports --add-confdir, so kdump can use this
option when building the kdump initramfs.
This patch moves the hardcoded dracutmodules from mkdumprd to the new
conf file /lib/kdump/dracut.conf.d/99-kdump.conf, where it is easier to check
and modify to omit or add certain modules. This patch also initializes
dracutmodules to empty to avoid the influence of other configurations.
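For orientation, a kdump-specific dracut configuration file along these lines might look like the sketch below. The values are illustrative assumptions, not the shipped contents of 99-kdump.conf; dracut configuration files use shell variable assignment syntax.

```shell
# Sketch of a kdump-specific dracut configuration (illustrative values).

# Reset any exclusive module list set by other dracut configuration files,
# so a non-empty dracutmodules cannot strip modules kdump needs.
dracutmodules=''

# Modules kdump always needs; kdumpbase is assumed here to be the name of
# the kdump dracut module (99kdumpbase).
add_dracutmodules=' kdumpbase '

# Modules that only cost memory in the kdump environment (example entry).
omit_dracutmodules=' rdma '
```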
See also:
[1] https://github.com/rhkdump/kdump-utils/issues/11
[2] https://issues.redhat.com/browse/RHEL-49590?focusedId=25197134&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-25197134
Suggested-by: Dave Young <dyoung@redhat.com>
Suggested-by: Colin Walters <walters@verbum.org>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-35885
Upstream: https://github.com/rhkdump/kdump-utils/
Conflict: none
commit 27a9d1dc85283239ee0b0b29ce5a00597d3f965f
Author: Lichen Liu <lichliu@redhat.com>
Date: Sun Sep 29 16:02:04 2024 +0800
kdump-lib-initramfs: Improve mount point retrieval logic
We use findmnt to find the real mount point by mount source, however,
when there is a bind mount target, findmnt will give more than one
result and what we actually want is the one without fsroot.
Fall back to the previous method if $_mntpoint is empty.
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-63047
Upstream: fedora
Conflict: Yes, due to upstream commit d4e87721 ("kdumpctl:
make do_estimate more robust") is not backported.
commit f6e00948aba7c31f722af79ed72c4020868dcad7
Author: Tao Liu <ltao@redhat.com>
Date: Fri Oct 18 21:45:03 2024 +1300
Return the correct exit code of rebuild initrd
Resolves: https://issues.redhat.com/browse/RHEL-63047
The exit code of rebuild_initrd() should be either of
rebuild_kdump/fadump_initrd(), rather than set_vmcore_creation_status(),
otherwise it will cause a regression when rebuild initrd fails.
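The fix amounts to capturing the rebuild's exit code before any follow-up call can replace it. A minimal sketch, in which `rebuild_image` and `record_status` are hypothetical stand-ins for the real rebuild and status functions:

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the real functions.
rebuild_image() { return "${REBUILD_RC:-0}"; }
record_status() { return 0; }

rebuild_initrd_sketch() {
    local ret=0
    # Capture the rebuild result immediately.
    rebuild_image || ret=$?
    # The bookkeeping call must not decide the function's exit code.
    record_status || true
    return "$ret"
}
```

Without the saved `ret`, the function would return the status of `record_status`, and a failed rebuild would look like a success to the caller.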
Fixes: 88525ebf ("Introduce vmcore creation notification to kdump")
Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Upstream: fedora
Resolves: RHEL-32060
Conflict: Yes, there are several conflicts. 1) Upstream has moved
dracut-kdump.sh into kdump-utils/dracut/99kdumpbase/kdump.sh,
so the target files are changed. 2) There are several
patchsets ([1] [2]) which are not backported to rhel9, so some
formatting conflicts were encountered. But no functional
change has been made in backporting the patch.
[1]: https://github.com/rhkdump/kdump-utils/pull/18/commits
[2]: https://github.com/rhkdump/kdump-utils/pull/33/commits
commit 88525ebf5e43cc86aea66dc75ec83db58233883b
Author: Tao Liu <ltao@redhat.com>
Date: Thu Sep 5 15:49:07 2024 +1200
Introduce vmcore creation notification to kdump
Motivation
==========
People may forget to recheck that kdump still works, with the result that
no vmcore may be generated after a real system crash. This is
unexpected for kdump.
It is highly recommended that people recheck kdump after any system
modification, such as:
a. after kernel patching or whole yum update, as it might break something
on which kdump is dependent, maybe due to introduction of any new bug etc.
b. after any change at hardware level, maybe storage, networking,
firmware upgrading etc.
c. after implementing any new application, like which involves 3rd party modules
etc.
Though these exceed the range of kdump, however a simple vmcore creation
status notification is good to have for now.
Design
======
When (re)starting, kdump currently checks whether any related files/fs/drivers
have been modified before determining whether the initrd should be rebuilt. A
rebuild is an indicator of such a modification, so kdump needs to be rechecked.
A rebuild therefore clears the vmcore creation status recorded in
$VMCORE_CREATION_STATUS.
The vmcore creation check happens at "kdumpctl (re)start/status" and
reports the creation success/fail status to users. A "success" status indicates
that a vmcore has previously been generated successfully in the current
env, so it is more likely that a vmcore will be generated when a real crash
happens. A "fail" status indicates that either no vmcore has been
generated, or a vmcore creation has failed, in the current env. Users
should check the 2nd kernel log or kexec-dmesg.log for the reason of the failure.
$VMCORE_CREATION_STATUS records the vmcore creation status of
the current env. The format is:
success 1718682002
which means that a vmcore was generated successfully at this
timestamp in the current env.
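A status file in this format could be read and reported roughly as follows. This is a sketch only: report_vmcore_status is a hypothetical helper, the wording of the "fail" and unknown-status messages is illustrative, and GNU date is assumed for the "-d @<epoch>" conversion.

```shell
# Hypothetical helper: print a notice based on a "<status> <epoch>" file.
# $1: path to the status file (the role played by $VMCORE_CREATION_STATUS)
report_vmcore_status() {
    local _file=$1 _status _ts

    # A missing or empty file means no test has been recorded yet.
    if [ ! -s "$_file" ]; then
        echo "Notice: No vmcore creation test performed!"
        return 0
    fi

    # First field is the status, second is the epoch timestamp.
    read -r _status _ts < "$_file" || true

    case "$_status" in
    success)
        echo "Notice: Last successful vmcore creation on $(date -d "@$_ts")"
        ;;
    fail)
        # Illustrative wording, not taken from kdumpctl.
        echo "Warning: Last vmcore creation failed on $(date -d "@$_ts")"
        ;;
    *)
        echo "Notice: Unrecognized vmcore creation status: $_status"
        ;;
    esac
}
```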
Usage
=====
[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: No vmcore creation test performed!
[root@localhost ~]# kdumpctl test
[root@localhost ~]# kdumpctl status
kdump: Kdump is operational
kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024
[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024
The notification for kdumpctl (re)start/status can be disabled by
setting VMCORE_CREATION_NOTIFICATION in /etc/sysconfig/kdump
Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-56832
Upstream Status: RHEL-only
This reverts commit 099aead590.
Currently get_mntpoint_from_target incorrectly returns an empty result for
targets that contain a square bracket '[', e.g.
- eng.redhat.com:/srv/[nfs]
- [2620:52:0:a1:217:38ff:fe01:131]:/srv/[nfs]
- /dev/mapper/rhel[disk]
get_mntpoint_from_target is also used in several places. To avoid
RHEL-56832 and other possible regressions, revert the bad commit.
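The targets above make a handy regression check for any lookup of this kind. The sketch below is not get_mntpoint_from_target: mount_table, its mount points, and lookup_mntpoint are hypothetical, and the two-column table format is an assumption. It only shows that an exact string comparison treats '[' as an ordinary character.

```shell
# Hypothetical mount table, "<source> <mountpoint>" per line (illustrative data).
mount_table() {
    cat << 'EOF'
eng.redhat.com:/srv/[nfs] /mnt/nfs
[2620:52:0:a1:217:38ff:fe01:131]:/srv/[nfs] /mnt/nfs6
/dev/mapper/rhel[disk] /mnt/disk
EOF
}

# Hypothetical lookup: compare the source column to the target as a plain
# string (awk's == on strings), so brackets are never interpreted as a pattern.
lookup_mntpoint() {
    mount_table | awk -v t="$1" '$1 == t { print $2 }'
}

lookup_mntpoint '/dev/mapper/rhel[disk]'
```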
Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-33465
Conflict: C9S misses the following two commits,
- 1397006 ("dracut-module-setup: Remove remove_cpu_online_rule() since PowerPC uses nr_cpus")
- 73c9eb7 ("dracut-module-setup: remove old s390 network device config (#1937048)")
Upstream Status: git@github.com:rhkdump/kdump-utils.git
commit 224d3102c54749eae98bfa1af8932aade8e4d2da
Author: Coiby Xu <coxu@redhat.com>
Date: Mon Apr 22 15:02:42 2024 +0800
Support setting up Open vSwitch (Ovs) Bridge network
Resolves: https://issues.redhat.com/browse/RHEL-33465
This patch supports setting up an Ovs bridge in kdump initrd. An Ovs
bridge is similar to a classic Linux bridge, but we use ovs-vsctl to find
out the Ethernet device (the one having the same MAC address as the bridge)
added to an Ovs bridge. Once we copy all the needed NetworkManager (NM)
connection profiles and all the other necessary files to kdump initrd, NM will
create an Ovs bridge automatically in kdump initrd.
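The MAC-based lookup could look roughly like this. It is a sketch under assumptions, not the code added by this patch: find_bridge_phys_dev is a hypothetical helper, the sysfs directory is passed in so the logic can be exercised without Open vSwitch, and "br-ex" in the usage comment is an assumed bridge name.

```shell
# Hypothetical helper: print the first candidate interface whose MAC address
# equals the bridge's MAC address.
# $1: bridge name, $2: sysfs net directory, remaining args: candidate interfaces
find_bridge_phys_dev() {
    local _bridge=$1 _sysnet=$2 _br_mac _iface
    shift 2

    _br_mac=$(cat "$_sysnet/$_bridge/address") || return 1

    for _iface in "$@"; do
        # Skip the bridge's own port and anything without a MAC address.
        [ "$_iface" = "$_bridge" ] && continue
        [ -r "$_sysnet/$_iface/address" ] || continue
        if [ "$(cat "$_sysnet/$_iface/address")" = "$_br_mac" ]; then
            echo "$_iface"
            return 0
        fi
    done
    return 1
}

# Usage on a live system (requires Open vSwitch):
#   find_bridge_phys_dev br-ex /sys/class/net $(ovs-vsctl list-ifaces br-ex)
```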
In the case of OpenShift Container Platform (OCP),
ovs-configuration.service [1] is responsible for setting up an Ovs bridge.
In theory, we can also try to bring up the original physical network
interface before ovs-configuration.service. But this approach is
cumbersome because it breaks our assumption that we should bring up the
same network in kdump initrd as in the 1st kernel (establishing the same network
in kdump initrd only requires copying the needed NM connection profiles,
so we don't need to learn how different network setups work under the
hood).
How to test this patch with the help of configure-ovs.sh?
=========================================================
1. Extract configure-ovs.sh from [2]
2. Install necessary packages for configure-ovs.sh
dnf install openvswitch -yq
dnf install NetworkManager-ovs nmap-ncat -yq
systemctl enable --now openvswitch
# restart NM so the ovs plugin can be activated
systemctl restart NetworkManager
3. Assume the network interface used for creating an Ovs bridge is
"ens2", use configure-ovs.sh to create an Ovs bridge,
interface=ens2
mkdir -p /etc/ovnk
echo $interface > /etc/ovnk/iface_default_hint
bash configure-ovs.sh OVNKubernetes
4. (Optional) If you want the created Ovs bridge to survive a
reboot, simply make the NM connections created by
configure-ovs.sh persistent,
cp /run/NetworkManager/system-connections/ovs-* /etc/NetworkManager/system-connections/
If you need to create an Ovs bridge on top of a bonding network, use the
following commands for step 3,
nmcli con add type bond ifname bond0
nmcli con add type ethernet ifname eth0 master bond0
nmcli con add type ethernet ifname eth1 master bond0
echo bond0 > /etc/ovnk/iface_default_hint
bash configure-ovs.sh OVNKubernetes
Note
1. For RHEL, openvswitch3.3 may be installed so we need to get the
package name by "rpm -qf /usr/lib/systemd/system/openvswitch.service"
2. For RHEL9, the openvswitch package needs to be installed from another repo,
cat << 'EOF' > /etc/yum.repos.d/ovs.repo
[rhosp-rhel-9-fdp-cdn]
name=Red Hat Enterprise Linux Fast Datapath $releasever - $basearch cdn
baseurl=http://rhsm-pulp.corp.redhat.com/content/dist/layered/rhel9/$basearch/fast-datapath/os/
enabled=1
gpgcheck=0
EOF
dnf install openvswitch3.3 -yq
3. We instruct ovsdb-server to ignore NM connection file changes by
"--ovsdb-server-options='--disable-file-column-diff'". In the
future, this may not be needed: once we come up with different
solutions for the following cases, we can simply copy all active NM
connection profiles to kdump initrd without changing them,
1. Some environments, like some Azure machines, don't use persistent
NIC names. The current solution is to modify an NM connection
profile to match a device by MAC address (for details, check
commit 568623e).
2. If a NIC has an IPv4 or IPv6 address, set the corresponding
may-fail property to no. Otherwise, dumping vmcore over IPv6
could fail because only the IPv4 network is ready, or vice versa. The
current solution is to disable IPv6 if only IPv4 is used and vice versa
(for details, check commit 9dfcacf).
3. Some NICs need a longer connection.wait-device-timeout; otherwise
the connection will fail to be established (commit 6b586a9).
[1] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/units/ovs-configuration.service.yaml
[2] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/files/configure-ovs-network.yaml
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>