Resolves: https://issues.redhat.com/browse/RHEL-104940
Conflict: None
commit f2c18c4934777cf55a5ea9359c8423adb8f1388b
Author: Coiby Xu <coxu@redhat.com>
Date: Mon Mar 4 17:18:46 2024 +0800
Add a helper function to get uuid by MAJ:MIN
This helper function will be used for kdump LUKS support.
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: RHEL-49555
Upstream: rhkdump/kdump-utils
Conflicts: Slightly changed warning message to reference RHEL11 rather
than "the future".
commit ca6da9b484e7995d8f3ee7e74dd871ee8919e409
Author: Philipp Rudo <prudo@redhat.com>
Date: Wed Oct 2 15:26:34 2024 +0200
kdumpctl: deprecate --reboot for reset-crashkernel
The --reboot option for reset-crashkernel causes considerable confusion.
In particular, expectations differ about when --reboot should reboot
the system. The current code only reboots the system when the
crashkernel parameter was updated during this run. Others expect
--reboot to also reboot the system if a previous run
of `kdumpctl reset-crashkernel` updated the crashkernel parameter and no
reboot has occurred since. While possible, this would add extra
complexity to the code. Neither the confusion nor the extra complexity
is justified, given that --reboot only saves the user a single
additional command.
Thus deprecate the --reboot option so it can be removed later on.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-103429
Upstream: rhkdump/kdump-utils
Conflict: Need to manually edit kdump.sysconfig files
commit ddb0bab1f7e1e43a802993aadad03f85a3c045a9
Author: Baoquan He <bhe@redhat.com>
Date: Wed Jun 25 23:30:24 2025 +0800
sysconfig: disable kfence in kdump kernel
In current Fedora and RHEL, the following config items related to the
kfence feature are set by default:
===
CONFIG_HAVE_ARCH_KFENCE=y
CONFIG_KFENCE=y
CONFIG_KFENCE_SAMPLE_INTERVAL=100
CONFIG_KFENCE_NUM_OBJECTS=255
CONFIG_KFENCE_STRESS_TEST_FAULTS=0
CONFIG_KFENCE_KUNIT_TEST=m
===
With them set, kfence costs 2M of extra memory on x86_64 with a 4K page
size, and 32M of extra memory on arm64 with a 64K page size. This only
counts the kfence objects and guard pages, not the memory cost of
initializing and running kfence itself. Since it doesn't make any sense
to have kfence in a kdump kernel, disable kfence there to save
crashkernel memory.
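The two figures can be reproduced with a rough sketch of the arithmetic,
assuming the kfence pool holds (CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 pages
(one object page plus one guard page each). The helper name is
hypothetical and is not part of this patch.

```shell
# Hypothetical helper: estimate the kfence pool size in KiB.
# Assumption: the pool holds (num_objects + 1) * 2 pages.
kfence_pool_kib() (
    num_objects=$1
    page_kib=$2
    echo $(( (num_objects + 1) * 2 * page_kib ))
)

kfence_pool_kib 255 4     # 4K pages:  2048 KiB, i.e. 2M
kfence_pool_kib 255 64    # 64K pages: 32768 KiB, i.e. 32M
```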
Signed-off-by: Baoquan He <bhe@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-33413
Upstream: rhkdump/kdump-utils
Conflict: Miss upstream patch 0d90d580 ("dracut-module-setup:
consolidate s390 network device config (#1937048)") and
upstream switches to a different way of supporting OVS.
commit 0678138331f6de43aaee0b7fbacf8adb38e73ff0
Author: Coiby Xu <coxu@redhat.com>
Date: Thu Jul 17 17:25:38 2025 +0800
Support dumping to NVMe/TCP configured using NVMe Boot Firmware Table
Resolves: https://issues.redhat.com/browse/RHEL-100907
Resolves: https://issues.redhat.com/browse/RHEL-33413
The dracut nvmf module can take care of everything: it can parse ACPI
NVMe Boot Firmware Table (NBFT) tables, generate NetworkManager profiles,
and discover and connect all subsystems.
Currently, the dracut kdump module will try to bring up the same network
connections as in the 1st kernel. But in the multipathing case, a
different set of NVMe connections and active network interfaces will be
used. So the dracut kdump module should let the dracut nvmf module do
everything.
Note that connecting everything and having network redundancy may
require extra memory, so the default crashkernel may not work. We'll
document this issue and ask users to increase the crashkernel.
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-79413
Upstream: kdump-utils
Conflict: Yes, upstream uses has_command to check if a command is
available, but that function is not backported to RHEL-9.
commit f758448cc7f29a24d8f5ddd7418dc9dd2fc3fd35
Author: Lichen Liu <llc123456a@gmail.com>
Date: Thu May 8 17:22:17 2025 +0800
kdumpctl: check and generate /etc/vconsole.conf
For VMs created from KVM Guest images, /etc/vconsole.conf is missing,
so the dracut module 10i18n will install all kbd files.
```
# du -sh initramfs/squash/usr/lib/kbd/*
438K initramfs/squash/usr/lib/kbd/consolefonts
340K initramfs/squash/usr/lib/kbd/consoletrans
2.1M initramfs/squash/usr/lib/kbd/keymaps
232K initramfs/squash/usr/lib/kbd/unimaps
```
According to vconsole.conf(5), KEYMAP= defaults to "us" if not set. We
can safely generate /etc/vconsole.conf with KEYMAP=us via localectl to
reduce the initramfs size.
```
# du -sh initramfs/squash/usr/lib/kbd/*
11K initramfs/squash/usr/lib/kbd/consolefonts
121K initramfs/squash/usr/lib/kbd/keymaps
```
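The "generate only if missing" idea can be sketched as below. This is a
minimal illustration, not the kdumpctl code: the commit generates the
file through localectl, while this hypothetical helper writes it
directly so that it can be tried against a scratch path.

```shell
# Hypothetical helper: create a vconsole.conf with KEYMAP=us if missing.
# Assumption: writing the file directly stands in for the localectl call.
ensure_vconsole_conf() (
    conf=$1
    # Leave an existing file untouched.
    [ -e "$conf" ] && return 0
    echo "KEYMAP=us" > "$conf"
)

# Exercise the helper against a scratch file, not /etc/vconsole.conf.
scratch=$(mktemp -d)
ensure_vconsole_conf "$scratch/vconsole.conf"
cat "$scratch/vconsole.conf"
rm -rf "$scratch"
```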
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-91274
Upstream: rhel-only
In the previous commit (3053d959), we adjusted the default value of
crashkernel, which changed the default behavior of RHEL-9:
systems with less than 2G of memory are no longer able to start kdump
by default. This is a breaking change for small VMs that only use local
dump.
In order to keep compatibility, reserve crashkernel for systems with
1G-2G memory on RHEL-9.
Fixes: 3053d959 ("kdump-lib.sh: Adjust default crashkernel reservation for x86_64 and s390x")
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Upstream: kdump-utils
Resolves: RHEL-19064
commit 0fc7e887a20a9e8536411750b64f0f8a1315f01b
Author: Lichen Liu <lichliu@redhat.com>
Date: Tue Mar 25 14:05:49 2025 +0800
99-kdump.conf: Omit clevis related dracut modules
The clevis-related dracut modules are unconditionally included in the
kdump initramfs after installing the clevis-dracut package.
Normally, we don't use clevis in the kdump process, but it increases
the size of the kdump initramfs by about 11M, which is a relatively large
overhead for kdump. Omitting them by default reduces the memory required
by kdump and avoids the OOM issue.
If the user really needs clevis, it can still be added to the kdump
initramfs by using dracut_args --force-add in /etc/kdump.conf.
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-83645
Upstream: kdump-utils
commit 18fc5b7e7b89ee3caba04ce18bf32036aa19da6e
Author: Lichen Liu <lichliu@redhat.com>
Date: Fri Mar 7 12:22:05 2025 +0800
kdump-lib.sh: rounded up the total_mem to 128M in get_system_size
The kernel code that calculates the reservation size rounds up total_mem
because the memblock usable memory size is usually smaller than the
physical DIMM memory size:
total_mem = roundup(total_mem, SZ_128M);
This patch aligns get_system_size with the kernel's behavior. A machine
showing 4000MB of total memory will be treated as having 4G instead of
3G of memory.
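The rounding can be illustrated with a small sketch. The helper name is
hypothetical and this is not the actual get_system_size code; it only
shows the same round-up-to-128M arithmetic on a size given in MiB.

```shell
# Hypothetical helper, not the actual get_system_size code:
# round a size in MiB up to the next multiple of 128 MiB.
roundup_128m() (
    mem_mb=$1
    echo $(( (mem_mb + 127) / 128 * 128 ))
)

# 4000 MiB rounds up to 4096 MiB, i.e. 4G rather than 3G.
total_mb=$(roundup_128m 4000)
echo "$total_mb MiB -> $(( total_mb / 1024 ))G"
```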
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2306493
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-83645
Upstream: kdump-utils
commit fe18f933baed9eaa18e0c2427aaca4640d8f6fa1
Author: Lichen Liu <lichliu@redhat.com>
Date: Thu Mar 6 10:21:43 2025 +0800
kdump-lib.sh: Adjust default crashkernel reservation for x86_64 and s390x
With new kernel features being added, both the kernel itself and the initramfs
have gradually increased in size.
Previously, we used squashfs to package and reduce the initramfs size. However,
since squashfs will be replaced by erofs, we have transitioned to erofs. The
compression algorithms supported by erofs differ from those used in squashfs.
In previous squashfs implementations, we used zstd compression, but in RHEL-10,
the erofs implementation in the kernel does not yet support zstd decompression.
As a result, we had to switch to other compression algorithms, leading to
changes in the initramfs size.
In some scenarios, the previous 192M crashkernel reservation is no longer
sufficient. Recent NFS testing has shown that at least 238M is required to
successfully capture a vmcore. Given this, we have updated the default
crashkernel reservation to start from 2G, with 256M allocated for crash
recovery.
Since 256M is a significant portion of memory on smaller systems, we have
decided not to reserve crashkernel memory by default on systems with less
than 2G of RAM. However, users can still manually adjust the `crashkernel=`
setting via tools like `grubby` if needed.
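To illustrate how a ranged crashkernel value maps system memory to a
reservation, here is a minimal sketch. The helper name and the sample
value "2G-64G:256M,64G-:512M" are assumptions for illustration only (the
text above only states that 256M is reserved starting from 2G), and the
sketch assumes a range start is inclusive and a range end is exclusive.

```shell
# Hypothetical helper: pick the reservation for a memory size (in GiB)
# from a "<range>:<size>,..." string. Only "G" units in ranges are
# handled, to keep the sketch short.
crashkernel_for() (
    spec=$1
    mem_gb=$2
    # Split the spec on commas into the positional parameters.
    IFS=,
    set -- $spec
    for entry in "$@"; do
        range=${entry%%:*}
        size=${entry#*:}
        start=${range%%-*}
        end=${range#*-}
        start=${start%G}
        end=${end%G}
        # An empty end means the range is open-ended.
        if [ "$mem_gb" -ge "$start" ] &&
            { [ -z "$end" ] || [ "$mem_gb" -lt "$end" ]; }; then
            echo "$size"
            return 0
        fi
    done
    # No range matched: nothing is reserved.
    return 1
)

crashkernel_for "2G-64G:256M,64G-:512M" 4
```

With this sample value, a system below 2G matches no range, which
corresponds to the "no reservation by default" behavior described above.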
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-84470
Upstream: kdump-utils
Conflict: None
commit d6e1edc677bfd15fb4553101ffb5a34130959861
Author: Coiby Xu <coxu@redhat.com>
Date: Wed Mar 13 14:12:58 2024 +0800
doc/kdump.conf: correctly align the options
Currently, the other options like "raw <partition>" become child items
of the auto_reset_crashkernel option,
auto_reset_crashkernel <yes|no>
...
raw <partition>
...
nfs <nfs mount>
...
...
Fix it by ending the auto_reset_crashkernel with ".RE".
Fixes: 73ced7f4 ("introduce the auto_reset_crashkernel option to kdump.conf")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-66065
Conflict: in upstream, 99-kdump.conf is introduced by
commit dacb34341 (Add kdump dracut config), which
has been backported to RHEL-9. So implement the
omit based on the new skeleton.
commit 1732a3bd477b3bc0b078b0f070024f350c22ef2b
Author: Pingfan Liu <piliu@redhat.com>
Date: Mon Sep 9 11:52:00 2024 +0800
mkdumprd: Omit rdma module
Resolves: https://issues.redhat.com/browse/RHEL-57006
The 05rdma dracut module from rdma-core is installed unconditionally even
if kdump dumps the vmcore to a local disk. Those kmods cost an
additional 200M of memory on x86, which likely triggers OOM.
Since Infiniband (and in fact any RDMA device) is not supported in
kdump, exclude the rdma dracut module explicitly.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-67131
Conflict: Upstream uses a different path dracut/99kdumpbase/module-setup.sh.
Upstream Status: git@github.com:rhkdump/kdump-utils.git
commit 821a5e648dcb72d89065078e0cadb58a5313b183
Author: Coiby Xu <coxu@redhat.com>
Date: Thu Nov 28 18:07:49 2024 +0800
Fallback to get NIC driver by /sys/class/net/NIC/device/driver/module
Resolves: https://issues.redhat.com/browse/RHEL-67131
Some drivers like dwmac_tegra may report their names incorrectly. So
fall back to /sys/class/net/NIC/device/driver/module to address these
cases. Note this method only works for physical NICs, i.e. it doesn't
work for virtual NICs like a bonding NIC.
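A minimal sketch of the fallback lookup, assuming the "module" entry is
a symlink whose last path component is the module name. The helper name
and the optional sysfs root argument are hypothetical; the latter exists
only so the sketch can be tried against a fake tree.

```shell
# Hypothetical helper: resolve a NIC's kernel module through sysfs.
# sysfs_root defaults to /sys; it is a parameter only so the sketch
# can be exercised against a fake directory tree.
nic_driver_module() (
    nic=$1
    sysfs_root=${2:-/sys}
    link="$sysfs_root/class/net/$nic/device/driver/module"
    # Virtual NICs (e.g. bonding) have no such link, so report failure.
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
)

# Example (on a real system): nic_driver_module eth0
```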
Suggested-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-69568
Conflict: None
Upstream Status: git@github.com:rhkdump/kdump-utils.git
commit 200107e45ac2619bf7d865a19ff896a83b97fe56
Author: Coiby Xu <coxu@redhat.com>
Date: Tue Dec 10 16:38:23 2024 +0800
Note user-specified crashkernel value will be overwritten by default value
Resolves: https://issues.redhat.com/browse/RHEL-69568
When kdump-utils gets updated automatically, i.e. not manually by the
user, users won't notice that a custom crashkernel value gets
overwritten, as the logs in /var/log/dnf.rpm.log may be ignored,
2024-12-10T02:57:12-0500 INFO kdump: For kernel=/boot/vmlinuz-6.11.0-28.el10.x86_64, crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M now. Please reboot the system for the change to take effect. Note if you don't want kexec-tools to manage the crashkernel kernel parameter, please set auto_reset_crashkernel=no in /etc/kdump.conf.
So add a note to kdump.conf.
Suggested-by: Ryosuke Yasuoka <ryasuoka@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://bugzilla.redhat.com/RHEL-52304
Upstream: kdump-utils
commit 77a0246cde3505777cfa1f9c2a1a834e76b7eed6
Author: Lichen Liu <lichliu@redhat.com>
Date: Mon Jan 13 17:39:56 2025 +0800
99-kdump.conf: Omit nouveau and amdgpu module
Resolves: https://issues.redhat.com/browse/RHEL-52304
The GPU modules provide no significant utility in the second kernel, and
they introduce firmware that occupies a lot of memory, which is critical
in the constrained environment of kdump. Omitting them helps reduce
memory usage and optimize the crash recovery process.
See also:
https://access.redhat.com/solutions/6977793
https://access.redhat.com/solutions/7100186
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-61620
Upstream: kdump-utils
commit 6f32fab791746da3f075273e156bd70055db5d4c
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date: Fri Oct 4 20:09:35 2024 +0530
kdump.service: Replace ConditionKernelCommandLine with ExecCondition
Commit 0084806493 ("kdump.service: use ConditionKernelCommandLine=crashkernel")
added a condition based on the crashkernel kernel command line parameter to
control the start of the kdump service.
While ConditionKernelCommandLine=crashkernel works well for kdump, it causes
issues for fadump (specific to the PowerPC architecture), which also uses the
same service unit. Unlike kdump, fadump does not require the crashkernel
kernel command-line parameter.
Since ConditionKernelCommandLine doesn't support evaluating multiple kernel
command line parameters with dependencies between them, it has been replaced
with ExecCondition to resolve this limitation.
Now, if fadump is configured and the crashkernel parameter is NOT present in the
kernel command line, kdump service will start.
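The dependency between the two parameters can be sketched as a small
shell check. This is an illustration of the condition only, not the
command used in the unit file; the helper name is hypothetical, and
treating "fadump=on" and "fadump=nocma" as "fadump is configured" is an
assumption.

```shell
# Hypothetical check mirroring the condition described above.
# The command line is passed in as a string; a real unit would read
# /proc/cmdline instead.
should_start_kdump() (
    cmdline=$1
    set -f    # do not glob-expand words of the command line
    for arg in $cmdline; do
        case $arg in
            crashkernel=*) return 0 ;;            # kdump case
            fadump=on | fadump=nocma) return 0 ;; # fadump needs no crashkernel=
        esac
    done
    return 1
)

should_start_kdump "root=/dev/sda1 fadump=on" && echo "start kdump service"
```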
Fixes: #35
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-52925
Upstream: kdump-utils
commit 53d8e6ecd6177568a245bf3ce3558798897195c4
Author: Hari Bathini <hbathini@linux.ibm.com>
Date: Sun Nov 17 09:15:48 2024 +0530
fadump: fix passing additional parameters for capture kernel
Currently, additional parameters for the fadump capture kernel are
set only with the reload command. In fact, the check for the
bootargs_append node before writing to it is incorrect too. Fix it and
also set up fadump additional parameters for the start/restart case.
Fixes: 77b80ce5 ("fadump: pass additional parameters for capture kernel")
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-52925
Upstream: kdump-utils
Conflict: apply the changes in kdump.sysconfig.ppc64le manually
commit 77b80ce5e369c7b5cf8321e4cdc20c139910f92c
Author: Hari Bathini <hbathini@linux.ibm.com>
Date: Sat Oct 19 00:28:34 2024 +0530
fadump: pass additional parameters for capture kernel
Since kernel commit 3416c9daa6b13 ("powerpc/fadump: pass additional
parameters when fadump is active"), fadump supports passing additional
parameters to dump capture kernel. Leverage that support here to pass
additional parameters to dump capture kernel.
Also, update fadump-howto.txt to clarify which options in
/etc/sysconfig/kdump are not relevant for fadump.
The default bootargs to append for fadump capture kernel boot are
chosen with the intent to optimize resources and reduce memory
footprint in dump capture environment.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Upstream: fedora
Resolves: RHEL-70214
Conflict: Yes, the conflict is the same as the original c9s commit
c5aa4609 ("Introduce vmcore creation notification to kdump")
9ec61f6c ("Return the correct exit code of rebuild initrd")
Also this patch cherry-picked the ipv6 fixed in [1].
[1]: https://github.com/rhkdump/kdump-utils/pull/60/files
commit 24e76222c740def1d03a506652400fe55959e024
Author: Tao Liu <ltao@redhat.com>
Date: Fri Nov 29 16:15:18 2024 +1300
Re-introduce vmcore creation notification to kdump
Motivation
==========
People may forget to recheck that kdump works, with the result that
no vmcore may be generated after a real system crash. This is
unexpected for kdump.
It is highly recommended that people test kdump after any system
modification, such as:
a. after kernel patching or a whole yum update, as it might break
something on which kdump depends, e.g. due to the introduction of a new
bug.
b. after any change at the hardware level, e.g. storage, networking, or
firmware upgrades.
c. after deploying any new application, e.g. one which involves 3rd
party modules.
Although these are beyond the scope of kdump, a simple vmcore creation
status notification is good to have for now.
Design
======
Kdump currently checks whether any related files/fs/drivers were
modified to determine if the initrd should be rebuilt on (re)start. A
rebuild is an indicator of such a modification, and kdump needs to be
tested. This will clear the vmcore creation status specified in
$VMCORE_CREATION_STATUS, and as a result, a notification about the
vmcore creation test will be printed.
To test kdump, run "kdumpctl test". It
will generate a timestamp string as the ID of the current test, along
with a "pending" status in $VMCORE_CREATION_STATUS, then trigger a real
crash & dump process.
After the system reboots back to normal, a vmcore creation check will
start at "kdumpctl (re)start/status", and will report the result as a
success/fail/manual status to users.
To achieve that, the program first checks the status in
$VMCORE_CREATION_STATUS. If a "pending" status is found, the test result
is undetermined and needs to be retrieved from the remote/local dump
folder. If the test id is found in the dump folder and the vmcore is
complete, then "pending" is overwritten by "success", which indicates a
successful kdump test. If the test id is found in the dump folder but
the vmcore is incomplete, then it is a "fail" kdump test. If no test id
is found, then it is a "manual" status, which indicates users should
check the test results manually.
If $VMCORE_CREATION_STATUS already holds a success/fail/manual status,
the test result has already been determined, so the program will not
access the remote/local dump folder again. This limits unnecessary
access to the dump target and saves time.
Users should look for the root cause when a fail/manual status is
reported.
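The status transitions described above can be sketched as follows. This
is a minimal illustration, not the kdumpctl implementation: the helper
name is hypothetical, and the two "yes"/"no" flags stand in for whatever
the caller learns from inspecting the dump folder.

```shell
# Hypothetical helper: compute the next status from the current one.
# id_found and vmcore_complete are "yes"/"no" flags the caller would
# derive from the remote/local dump folder.
next_vmcore_status() (
    current=$1
    id_found=$2
    vmcore_complete=$3
    if [ "$current" != "pending" ]; then
        # Already determined: keep it, no dump folder access needed.
        echo "$current"
    elif [ "$id_found" != "yes" ]; then
        echo "manual"
    elif [ "$vmcore_complete" = "yes" ]; then
        echo "success"
    else
        echo "fail"
    fi
)

next_vmcore_status pending yes yes
next_vmcore_status pending yes no
next_vmcore_status pending no no
```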
$VMCORE_CREATION_STATUS is used for recording the vmcore creation status of
the current env. The format is like:
<status> kdump_test_id=<timestamp sec>-<timestamp nanosec>
e.g:
success kdump_test_id=1729823462-938751820
Which means, there has been a successful kdump test at
$(date -d "@1729823462") timestamp for the current env. The nanosecond
timestamp only serves to make the id string unique.
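A line in this format can be split into its parts with plain parameter
expansion. The helper below is a hypothetical sketch for illustration,
not code from this patch; it prints the status and the seconds part of
the test id.

```shell
# Hypothetical parser for one status line in the format shown above.
parse_vmcore_status() (
    line=$1
    status=${line%% *}            # text before the first space
    id=${line#*kdump_test_id=}    # "<sec>-<nanosec>"
    sec=${id%%-*}                 # seconds part of the test id
    echo "$status $sec"
)

parse_vmcore_status "success kdump_test_id=1729823462-938751820"
```

The seconds value can then be passed to date -d "@<sec>" as in the
example above.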
Difference
==========
Previously, commit 88525ebf ("Introduce vmcore creation
notification to kdump") was merged to address the same issue, but
implemented differently:
The previous one:
Save the $VMCORE_CREATION_STATUS to the local drive during 2nd kernel
dumping. If the vmcore dumping target is different from
$VMCORE_CREATION_STATUS's drive, then the latter needs to be mounted in
the 2nd kernel.
This one:
Save the $VMCORE_CREATION_STATUS to the local drive only in the 1st
kernel, that is, the test result is retrieved after 2nd kernel dumping.
So it doesn't load or mount another drive in the 2nd kernel.
The advantage:
Extra mounting in the 2nd kernel introduces a higher risk of failure
and thus lowers the success rate of vmcore dumping, which is
unacceptable. So keeping the code for the 2nd kernel as simple as
possible is preferred.
Usage
=====
[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: No vmcore creation test performed!
[root@localhost ~]# kdumpctl status
kdump: Kdump is operational
kdump: Notice: No vmcore creation test performed!
[root@localhost ~]# kdumpctl test
[root@localhost ~]# cat /var/lib/kdump/vmcore-creation.status
pending kdump_test_id=1729823462-938751820
[root@localhost ~]# kdumpctl status
kdump: Kdump is operational
kdump: Notice: Last successful vmcore creation on Fri Oct 25 02:31:02 AM UTC 2024
[root@localhost ~]# cat /var/lib/kdump/vmcore-creation.status
success kdump_test_id=1729823462-938751820
[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: Last successful vmcore creation on Fri Oct 25 02:31:02 AM UTC 2024
Note: the notification for kdumpctl (re)start/status can be disabled by
setting VMCORE_CREATION_NOTIFICATION in /etc/sysconfig/kdump. And fadump
is NOT supported for this feature.
Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: RHEL-70214
Upstream: fedora
Conflict: Yes, the conflict is the same as the original c9s commit
c5aa4609 ("Introduce vmcore creation notification to kdump")
9ec61f6c ("Return the correct exit code of rebuild initrd")
commit 96956928a66d9256cdf8bfed6a8963ddea35aac9
Author: Tao Liu <ltao@redhat.com>
Date: Fri Nov 29 14:42:01 2024 +1300
Revert "Introduce vmcore creation notification to kdump"
This patch will revert the following 2 patches:
88525ebf ("Introduce vmcore creation notification to kdump")
35449537 ("Return the correct exit code of rebuild initrd")
This prepares for the reimplementation of vmcore creation notification.
Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: RHEL-49590
Upstream: https://github.com/rhkdump/kdump-utils
Conflict: Yes, kexec-tools is split into 3 parts upstream, some changes should
be applied to kdump-utils Makefile, but RHEL-9 kexec-tools doesn't have that.
Also missing upstream commits:
- 1732a3b(mkdumprd: Omit rdma module)
- 7fec2f56(mkdumprd: simplify handling of dracut arguments)
commit dacb34341113fa925c15e28a7ce56a80dd370e2f
Author: Lichen Liu <lichliu@redhat.com>
Date: Tue Nov 5 12:07:42 2024 +0800
Add kdump dracut config
In some cases, customizing the first kernel's initrd is necessary by
modifying the dracut `omit_dracutmodules` options, such as in Bootc
or CoreOS scenarios [1]. However, these changes can unintentionally
break existing functionality in kdump. For instance, setting
`omit_dracutmodules='nfs'` prevents the `nfs` module from being added.
Additionally, some dracut configurations [2] use
`dracutmodules+='some modules'` instead of
`add_dracutmodules+='some modules'`. When `dracutmodules` is non-empty,
dracut includes only the specified modules, which can result in an
initrd that lacks necessary modules, causing kdump to fail.
Dracut upstream supports --add-confdir now, so kdump can use this
option when building the kdump initramfs.
This patch moves the hardcoded dracutmodules from mkdumprd to the new
conf file /lib/kdump/dracut.conf.d/99-kdump.conf, where it is easier to
check and modify to omit or add certain modules. This patch also
initializes dracutmodules to empty to avoid the influence of other
configurations.
See also:
[1] https://github.com/rhkdump/kdump-utils/issues/11
[2] https://issues.redhat.com/browse/RHEL-49590?focusedId=25197134&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-25197134
Suggested-by: Dave Young <dyoung@redhat.com>
Suggested-by: Colin Walters <walters@verbum.org>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-35885
Upstream: https://github.com/rhkdump/kdump-utils/
Conflict: none
commit 27a9d1dc85283239ee0b0b29ce5a00597d3f965f
Author: Lichen Liu <lichliu@redhat.com>
Date: Sun Sep 29 16:02:04 2024 +0800
kdump-lib-initramfs: Improve mount point retrieval logic
We use findmnt to find the real mount point by mount source. However,
when there is a bind mount target, findmnt gives more than one
result, and what we actually want is the one without fsroot.
Fall back to the previous method if $_mntpoint is empty.
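One way to pick "the one without fsroot" is sketched below. It is an
assumption-laden illustration, not this patch's code: the helper name is
hypothetical, and the input is assumed to be "TARGET FSROOT" pairs such
as findmnt's TARGET and FSROOT columns would give, where a plain
(non-bind) mount has "/" as its fsroot. Targets containing spaces are
not handled.

```shell
# Hypothetical filter: read "TARGET FSROOT" pairs from stdin and print
# the first target whose fsroot is "/", i.e. not a bind mount.
first_non_bind_target() {
    while read -r target fsroot; do
        if [ "$fsroot" = "/" ]; then
            echo "$target"
            return 0
        fi
    done
    # Nothing matched: the caller would fall back to the previous method.
    return 1
}

# Sample input: one bind mount of a subdirectory and one plain mount.
printf '%s\n' '/mnt/bind /data' '/var/crash /' | first_non_bind_target
```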
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-63047
Upstream: fedora
Conflict: Yes, due to upstream commit d4e87721 ("kdumpctl:
make do_estimate more robust") is not backported.
commit f6e00948aba7c31f722af79ed72c4020868dcad7
Author: Tao Liu <ltao@redhat.com>
Date: Fri Oct 18 21:45:03 2024 +1300
Return the correct exit code of rebuild initrd
Resolves: https://issues.redhat.com/browse/RHEL-63047
The exit code of rebuild_initrd() should be that of either
rebuild_kdump/fadump_initrd(), rather than set_vmcore_creation_status(),
otherwise it will cause a regression when rebuilding the initrd fails.
Fixes: 88525ebf ("Introduce vmcore creation notification to kdump")
Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Upstream: fedora
Resolves: RHEL-32060
Conflict: Yes, there are several conflicts. 1) Upstream has moved
dracut-kdump.sh into kdump-utils/dracut/99kdumpbase/kdump.sh,
so the target files are changed. 2) There are several
patchsets ([1] [2]) which are not backported to rhel9, so some
formatting conflicts were encountered. But no functional
change has been made in backporting the patch.
[1]: https://github.com/rhkdump/kdump-utils/pull/18/commits
[2]: https://github.com/rhkdump/kdump-utils/pull/33/commits
commit 88525ebf5e43cc86aea66dc75ec83db58233883b
Author: Tao Liu <ltao@redhat.com>
Date: Thu Sep 5 15:49:07 2024 +1200
Introduce vmcore creation notification to kdump
Motivation
==========
People may forget to recheck that kdump works, with the result that
no vmcore may be generated after a real system crash. This is
unexpected for kdump.
It is highly recommended that people recheck kdump after any system
modification, such as:
a. after kernel patching or a whole yum update, as it might break
something on which kdump depends, e.g. due to the introduction of a new
bug.
b. after any change at the hardware level, e.g. storage, networking, or
firmware upgrades.
c. after deploying any new application, e.g. one which involves 3rd
party modules.
Although these are beyond the scope of kdump, a simple vmcore creation
status notification is good to have for now.
Design
======
Kdump currently checks whether any related files/fs/drivers were
modified to determine if the initrd should be rebuilt on (re)start. A
rebuild is an indicator of such a modification, and kdump needs to be
rechecked. This will clear the vmcore creation status specified in
$VMCORE_CREATION_STATUS.
The vmcore creation check happens at "kdumpctl (re)start/status", and
reports the creation success/fail status to users. A "success" status
indicates that a vmcore has previously been generated successfully based
on the current env, so it is more likely a vmcore will be generated
later when a real crash happens. A "fail" status indicates that
previously no vmcore was generated, or that a vmcore creation failed
based on the current env. Users should check the 2nd kernel log or the
kexec-dmesg.log for the reason for the failure.
$VMCORE_CREATION_STATUS is used for recording the vmcore creation status of
the current env. The format will be like:
success 1718682002
Which means, there has been a vmcore generated successfully at this
timestamp for the current env.
Usage
=====
[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: No vmcore creation test performed!
[root@localhost ~]# kdumpctl test
[root@localhost ~]# kdumpctl status
kdump: Kdump is operational
kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024
[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024
The notification for kdumpctl (re)start/status can be disabled by
setting VMCORE_CREATION_NOTIFICATION in /etc/sysconfig/kdump.
Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-56832
Upstream Status: RHEL-only
This reverts commit 099aead590.
Currently get_mntpoint_from_target incorrectly returns an empty result
for targets that contain a square bracket '[', e.g.
- eng.redhat.com:/srv/[nfs]
- [2620:52:0:a1:217:38ff:fe01:131]:/srv/[nfs]
- /dev/mapper/rhel[disk]
get_mntpoint_from_target is also used in several places. To avoid
RHEL-56832 and other possible regressions, revert the bad commit.
Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-33465
Conflict: C9S misses the following two commits,
- 1397006 ("dracut-module-setup: Remove remove_cpu_online_rule() since PowerPC uses nr_cpus")
- 73c9eb7 ("dracut-module-setup: remove old s390 network device config (#1937048)")
Upstream Status: git@github.com:rhkdump/kdump-utils.git
commit 224d3102c54749eae98bfa1af8932aade8e4d2da
Author: Coiby Xu <coxu@redhat.com>
Date: Mon Apr 22 15:02:42 2024 +0800
Support setting up Open vSwitch (Ovs) Bridge network
Resolves: https://issues.redhat.com/browse/RHEL-33465
This patch supports setting up an Ovs bridge in the kdump initrd. An Ovs
bridge is similar to a classic Linux bridge, but we use ovs-vsctl to find
out the Ethernet device (having the same MAC address as the bridge)
added to an Ovs bridge. Once we copy all the needed NetworkManager (NM)
connection profiles and all the necessary files to the kdump initrd, NM
will create an Ovs bridge automatically in the kdump initrd.
In the case of OpenShift Container Platform (OCP),
ovs-configuration.service [1] is responsible for setting up an Ovs bridge.
In theory, we can also try to bring up the original physical network
interface before ovs-configuration.service. But this approach is
cumbersome because it breaks our assumption that we should bring up the
same network in the kdump initrd as in the 1st kernel (establishing the
same network in the kdump initrd only requires copying the needed NM
connection profiles, thus we don't need to learn how different network
setups work under the hood).
How to test this patch with the help of configure-ovs.sh?
=========================================================
1. Extract configure-ovs.sh from [2]
2. Install necessary packages for configure-ovs.sh
dnf install openvswitch -yq
dnf install NetworkManager-ovs nmap-ncat -yq
systemctl enable --now openvswitch
# restart NM so the ovs plugin can be activated
systemctl restart NetworkManager
3. Assume the network interface used for creating an Ovs bridge is
"ens2", use configure-ovs.sh to create an Ovs bridge,
interface=ens2
mkdir -p /etc/ovnk
echo $interface > /etc/ovnk/iface_default_hint
bash configure-ovs.sh OVNKubernetes
4. (Optional) If you want the created Ovs bridge to survive a
reboot, simply make the NM connections created by
configure-ovs.sh persist,
cp /run/NetworkManager/system-connections/ovs-* /etc/NetworkManager/system-connections/
If you need to create an Ovs bridge on top of a bonding network, use the
following commands for step 3,
nmcli con add type bond ifname bond0
nmcli con add type ethernet ifname eth0 master bond0
nmcli con add type ethernet ifname eth1 master bond0
echo bond0 > /etc/ovnk/iface_default_hint
bash configure-ovs.sh OVNKubernetes
Note
1. For RHEL, openvswitch3.3 may be installed, so we need to get the
package name by "rpm -qf /usr/lib/systemd/system/openvswitch.service"
2. For RHEL9, the openvswitch package needs to be installed from another repo,
cat << 'EOF' > /etc/yum.repos.d/ovs.repo
[rhosp-rhel-9-fdp-cdn]
name=Red Hat Enterprise Linux Fast Datapath $releasever - $basearch cdn
baseurl=http://rhsm-pulp.corp.redhat.com/content/dist/layered/rhel9/$basearch/fast-datapath/os/
enabled=1
gpgcheck=0
EOF
dnf install openvswitch3.3 -yq
3. We instruct ovsdb-server to ignore NM connection files changes by
"--ovsdb-server-options='--disable-file-column-diff'". In the
future, this may not be needed if we simply copy all active NM
connection profiles to kdump initrd without changing them after
coming up with different solutions for the following cases,
1. Some environments like some Azure machines don't use persistent
NIC names. Current solution is to modify a NM connection
profile to match a device by MAC address (for details check
commit 568623e).
2. If a NIC has an IPv4 or IPv6 address, set the corresponding
may-fail property to no. Otherwise, dumping vmcore over IPv6
could fail because only IPv4 network is ready or vice versa. Current
solution is to disable IPv6 if only IPv4 is used and vice versa,
for details check commit 9dfcacf.
3. Some NICs need longer connection.wait-device-timeout otherwise
the connection will fail to be established (commit 6b586a9).
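The three per-profile adjustments listed above map roughly onto
NetworkManager properties as sketched below. This is only an illustration
under stated assumptions, not the code from the referenced commits: the
helper name print_kdump_nm_tweaks is hypothetical, it only prints the nmcli
commands instead of running them, clearing connection.interface-name is an
assumed way to let the MAC address decide the match, and the 30000 ms
timeout is an arbitrary example value.

```shell
# Hypothetical helper: print (do not run) nmcli commands for one profile.
# $1: profile name, $2: MAC address, $3: address family in use (ipv4|ipv6)
print_kdump_nm_tweaks() {
    local con mac family other
    con=$1
    mac=$2
    family=$3
    case "$family" in
        ipv4) other=ipv6 ;;
        ipv6) other=ipv4 ;;
        *) echo "family must be ipv4 or ipv6" >&2; return 1 ;;
    esac
    # Case 1: match the device by MAC address instead of by NIC name
    # (assumption: the interface name is cleared so only the MAC matches).
    printf 'nmcli connection modify "%s" connection.interface-name "" 802-3-ethernet.mac-address "%s"\n' "$con" "$mac"
    # Case 2: the family in use must not be allowed to fail, and the
    # unused family is disabled.
    printf 'nmcli connection modify "%s" %s.may-fail no %s.method disabled\n' "$con" "$family" "$other"
    # Case 3: wait longer for a slow NIC to appear (value in milliseconds,
    # example value only).
    printf 'nmcli connection modify "%s" connection.wait-device-timeout 30000\n' "$con"
}
```

For example, print_kdump_nm_tweaks ens2 52:54:00:12:34:56 ipv4 prints three
commands that can be reviewed before being run by hand.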
[1] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/units/ovs-configuration.service.yaml
[2] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/files/configure-ovs-network.yaml
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: RHEL-35885
Note: the fix for RHEL-35885 was made in a hurry, before the patch was
merged upstream, so the final upstream patch may differ from this
downstream one. Be careful if conflicts arise when the upstream patch
is backported later.
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: RHEL-35885
commit 9252d6b1b492016aa11a73340f286822e6d545f2
Author: Colin Walters <walters@verbum.org>
Date: Fri Jul 19 11:44:09 2024 -0400
lib: Ensure we don't find bind mounts for device target
There's a comment here that `--source` somehow avoids bind
mounts, but that appears not to be the case in my
testing. I think we just happened to be lucky before
now with the `--first` picking the value we wanted.
Instead of using `--first` and hoping for the best,
parse the mounts and skip ones which are bind mounts
explicitly.
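A minimal sketch of the "parse the mounts and skip bind mounts" idea follows.
It is not the code from this commit. It assumes that findmnt reports the
source of a bind mount as "device[/subpath]", so a '[' in the SOURCE column
is used as the marker; the function name first_non_bind_target is
hypothetical. Note that btrfs subvolume mounts are reported in the same
bracketed form and would be skipped as well.

```shell
# Read "SOURCE TARGET" pairs on stdin and print the first TARGET whose
# SOURCE does not look like a bind mount. Returns 1 if none is found.
first_non_bind_target() {
    local source target
    while read -r source target; do
        case "$source" in
            *'['*) continue ;;   # e.g. /dev/vda3[/ostree/deploy]: skip it
        esac
        printf '%s\n' "$target"
        return 0
    done
    return 1
}

# Possible use with util-linux findmnt (raw output, no header):
#   findmnt -k -n -r -o SOURCE,TARGET --source /dev/vda3 | first_non_bind_target
```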
Signed-off-by: Colin Walters <walters@verbum.org>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-35885
commit 42cdc05a8c99a2c0834377faca04b583404cb86f
Author: Colin Walters <walters@verbum.org>
Date: Fri Jul 19 14:23:39 2024 -0400
dracut: Disable ostree-prepare-root
In some images such as the recent fedora/rhel bootc base image,
the ostree dracut module is statically enabled:
40df0eb382/tier-0/initramfs.yaml (L9)
And also recently, we changed the ostree systemd unit
to enter emergency.target if it fails in:
05b3b66275
These two things combined mean we'll fail before kdump gets
a chance to run.
For our use case we don't need ostree in the initrd.
I tried to override this with `--omit=ostree` in our dracut
invocation, but that causes an error (dracut doesn't let the
cmdline override static config).
For now, let's just mask the service in our initrd.
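Masking a unit inside an initrd can be sketched as below: systemd treats a
unit file that is a symlink to /dev/null as masked. This is an illustration,
not the exact change from this commit; the helper name mask_unit_in_initrd
is hypothetical, and the staging directory is assumed to be the tree that
dracut modules receive as $initdir.

```shell
# Mask a systemd unit inside an initrd staging tree by linking its unit
# file to /dev/null.
# $1: staging root (assumed: dracut's $initdir), $2: unit name
mask_unit_in_initrd() {
    local root unit
    root=$1
    unit=$2
    mkdir -p "$root/etc/systemd/system" || return 1
    ln -sf /dev/null "$root/etc/systemd/system/$unit"
}

# Possible use from a dracut module's install() function:
#   mask_unit_in_initrd "$initdir" ostree-prepare-root.service
```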
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: RHEL-46777
commit 2f21fa2acfac9f6e19e071330f917e60aafc4600
Author: Philipp Rudo <prudo@redhat.com>
Date: Mon Jun 24 17:34:35 2024 +0200
kdump-lib: Drop 'file' dependency in is_uki
The 'file' utility is no longer installed per default. In addition there
was an update to it so it now reports the file type as
application/vnd.microsoft.portable-executable. Thus fall back to use
objdump to avoid adding yet another dependency for kdump-utils and deal
with different versions of 'file'.
Note: This has the small drawback that objdump is arch specific. I.e.
examining an aarch64 UKI on an x86_64 machine will return a 'file format
not recognized' error.
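An objdump-based check could look like the sketch below. It is an
approximation under stated assumptions, not necessarily the code of this
commit: the function name is_uki_sketch is hypothetical, the BFD target
names "pei-x86-64" and "pei-aarch64-little" are assumed to be what the
installed binutils reports for PE images, and objdump is assumed to exit
non-zero when the section given with -j is missing.

```shell
# Return 0 if $1 looks like a UKI: a PE image that has a ".linux" section.
is_uki_sketch() {
    local img
    img=$1
    [ -f "$img" ] || return 1
    # objdump -a prints the detected file format; PE images use "pei-*".
    objdump -a "$img" 2>/dev/null |
        grep -Eq 'pei-(x86-64|aarch64-little)' || return 1
    # A UKI carries the kernel image in a section named ".linux".
    objdump -h -j .linux "$img" >/dev/null 2>&1
}
```

As the note above says, the result depends on the architectures the local
objdump supports, so a foreign-arch UKI is reported as "not a UKI" here.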
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Resolves: RHEL-42442
Resolves: RHEL-22171
commit 2c741555d9749e9a137378332e561382f9e25739
Author: Philipp Rudo <prudo@redhat.com>
Date: Mon Jul 1 12:52:39 2024 +0200
kdumpctl.8: Add description to reset-crashkernel --reboot
There is no description for parameter --reboot for reset-crashkernel.
Thus add one.
Suggested-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Resolves: RHEL-42442
Resolves: RHEL-22171
Upstream: RHEL-only
This bug was fixed upstream in 574f8f5 ("kdumpctl: Simplify fadump
handling in reset_crashkernel"). Backporting this commit to RHEL9 would
also require backporting other commits that would change the behavior of
reset-crashkernel, e.g. no longer setting the default kernel command
line in the grub config. These changes are too invasive for RHEL9. Thus
go with a minimalistic RHEL-only fix.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Resolves: RHEL-39494
Conflicts: Small difference in context of 2nd hunk.
commit 3028529915d3026e62b59d8f3faadddd410baa75
Author: Philipp Rudo <prudo@redhat.com>
Date: Fri Jun 14 11:48:24 2024 +0200
kdumpctl: Drop default kexec '-d' option
Kernel commits cbc2fe9d9cb2 ("kexec_file: add kexec_file flag to control
debug printing") and a85ee18c7900 ("kexec_file: print out debugging
message if required") added debug messages to the kexec_file_load system
call when option -d is provided to the kexec user space tool. As
kexec_file_load is the default and option -d is set by default these
messages are always printed when a crash kernel is loaded. This not only
clutters the kernel log but also potentially leaks confidential kernel
information to users. As the messages are printed to the kernel log, not
stderr, the redirection to /var/log/kdump.log won't catch them. This
will become even more problematic as for RHEL10 the kernel will be built
without support for the kexec_load system call. So kexec_file_load will
be the only choice in the future.
The redirection also caused confusion in a recent bug report. There a
user moved a working /etc/sysconfig/kdump from ppc to s390 with
KEXEC_ARGS containing the --dt-no-old-root option. This option is arch
specific and does not exist on s390. Thus the kexec-tools failed with an
'unrecognized option' error followed by the usage(). The problem was
that the 'unrecognized option' error is printed to stderr, which got
redirected to /var/log/kdump.log, while the usage() is printed to
stdout, which ended up in the systemd journal. This caused confusion as
the user only checked the journal and found the usage() without any
error message.
Thus remove the default -d option and the redirection of stderr to
/var/log/kdump.log for the kexec-tools user space tool.
This commit ultimately reverts 88a8b94 ("kdumpctl: add the '-d' option to
enable the kexec loading debugging messages").
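The stdout/stderr split described above can be reproduced with a stand-in
command. This is a self-contained illustration only: fake_kexec and
run_with_stderr_redirect are hypothetical names and do not call the real
kexec tool; the log file path is passed in by the caller.

```shell
# Stand-in for a tool that rejects an option: the error goes to stderr,
# the usage text goes to stdout, and the exit status is non-zero.
fake_kexec() {
    echo "fake_kexec: unrecognized option '$1'" >&2
    echo "Usage: fake_kexec [OPTION]... [kernel]"
    return 1
}

# Run the stand-in with only stderr appended to a log file ($1).
# Whatever is still printed is what a service manager would capture.
run_with_stderr_redirect() {
    fake_kexec --dt-no-old-root 2>>"$1"
}
```

Running run_with_stderr_redirect /tmp/demo.log shows only the usage text on
the terminal, while the actual error message ends up in the log file, which
is the situation that confused the reporter.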
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>