kdumpctl only accepts a specified set of options. Add
auto_reset_crashkernel to this set.
Fixes: 73ced7f ("introduce the auto_reset_crashkernel option to kdump.conf")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Tao Liu <ltao@redhat.com>
When a file doesn't exist or isn't readable, grep complains as follows,
grep: /proc/cmdline: No such file or directory
grep: /etc/kernel/cmdline: No such file or directory
/proc/cmdline doesn't exist when installing package for an OS image and
/etc/kernel/cmdline may not exist if osbuild doesn't want set custom
kernel cmdline.
Use "-s" to suppress the error messages.
Fixes: 0adb0f4 ("try to reset kernel crashkernel when kexec-tools updates the default crashkernel value")
Fixes: ddd428a ("set up kernel crashkernel for osbuild in kernel hook")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Tao Liu <ltao@redhat.com>
The '|' in 'failure_action|default' should be replaced with '\|' when
passed to kdump_get_conf_val function. Because '|' needs to be escaped
to mean OR operation in sed regex, otherwise it will consider
'failure_action|default' as a whole string.
Fixes: ab1ef78 ("kdump-lib.sh: use kdump_get_conf_val to read config values")
v1 -> v2:
Rephased the commit message.
Replaced " with '.
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
zstd has better compression ratio and time consumption balance.
When no customized compression method specified in kdump.conf,
we will use zstd as the default compression method.
**The test method:
I installed kexec-tools with and without the patch, executing the following
command for 4 times, and calculate the averange time:
$ rm -f /boot/initramfs-*kdump.img && time kdumpctl rebuild && \
ls -ail /boot/initramfs-*kdump.img
**The test result:
Bare metal x86_64 machine:
dracut with squash module
zlib lzo xz lz4 zstd
real 10.6282 11.0398 11.395 8.6424 10.1676
user 9.8932 11.9072 14.2304 2.8286 8.6468
sys 3.523 3.4626 3.6028 3.5 3.4942
size of
kdump.img 30575616 31419392 27102208 36666368 29236224
dracut without squash module
zlib lzo xz lz4 zstd
real 9.509 19.4876 11.6724 9.0338 10.267
user 10.6028 14.516 17.8662 4.0476 9.0936
sys 2.942 2.9184 3.0662 2.9232 3.0662
size of
kdump.img 19247949 19958120 14505056 21112544 17007764
PowerVM hosted ppc64le VM:
dracut with squash module | dracut without sqaush module
zlib zstd | zlib zstd
real 10.6742 10.7572 | 9.7676 10.5722
user 18.754 19.8338 | 20.7932 13.179
sys 1.8358 1.864 | 1.637 1.663
|
size of |
kdump.img 36917248 35467264 | 21441323 19007108
**discussion
zstd has a better compression ratio and time consumption balance.
v1 -> v2:
Use kdump_get_conf_val() to get dracut_args values of kdump.conf
v2 -> v3:
Attached testing benchmark
v3 -> v4:
Re-measured and re-attached the testing benchmark of x86_64 and ppc64le.
Changed regex '.*[[:space:]]' to '(^|[[:space:]])'
v4 -> v5:
Attacked lzo/xz/lz4 testing benchmark.
v5 -> v6:
Add zstd as required in kexec-tools.spec
Hello Coiby, you may use "RELEASE=34 make test-run", for
CONFIG_RD_ZSTD is enabled since fc-cloud-34
Acked-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Update crashkernel-howto since crashkernel.default has been removed. The
documentation is also simplified as a result.
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
osbuild is a tool to build OS images. It uses bwrap to install packages
inside a sandbox/container. Since the kernel package recommends
kexec-tools which in turn recommends grubby, the installation order would
be grubby -> kexec-tools -> kernel. So we can use the kernel hook
92-crashkernel.install provided by kexec-tools to set up kernel
crashkernel for the target OS image. But in osbuild's case, there is no
current running kernel and running `uname -r` in the container/sandbox
actually returns the host kernel release. To set up kernel crashkernel for
the OS image built by osbuild, a different logic is needed.
We will check if kernel hook is running inside the osbuild container
then set up kernel crashkernel only if osbuild hasn't specified a
custome value. osbuild exposes [1] the container=bwrap-osbuild environment
variable. According to [2], the environment variable is not inherited down
the process tree, so we need to check /proc/1/environ to detect this
environment variable to tell if the kernel hook is running inside a
bwrap-osbuild container. After that we need to know if osbuild wants to use
custom crashkernel value. This is done by checking if /etc/kernel/cmdline
has crashkernel set [3]. /etc/kernel/cmdline is written before packages
are installed.
[1] https://github.com/osbuild/osbuild/pull/926
[2] https://systemd.io/CONTAINER_INTERFACE/
[3] https://bugzilla.redhat.com/show_bug.cgi?id=2024976#c5
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
When kexec-tools updates the default crashkernel value, it will try to
reset the existing installed kernels including the currently running
kernel. So the running kernel could have different kernel cmdline
parameters from /proc/cmdline. When installing a kernel after updating
kexec-tools, /usr/lib/kernel/install.d/20-grub.install would be called
by kernel-install [1] which would use /proc/cmdline to set up new kernel's
cmdline. To address this special case, reset the new kernel's crashkernel
and fadump value to the value that would be used by running kernel after
rebooting by the installation hook. One side effect of this commit is it
would reset the installed kernel's crashkernel even currently running kernel
don't use the default crashkernel value after rebooting. But I think this
side effect is a benefit for the user.
The implementation depends on kernel-install which run the scripts in
/usr/lib/kernel/install.d passing the following arguments,
add KERNEL-VERSION $BOOT/MACHINE-ID/KERNEL-VERSION/ KERNEL-IMAGE [INITRD-FILE ...]
An concrete example is given as follows,
add 5.11.12-300.fc34.x86_64 /boot/e986846f63134c7295458cf36300ba5b/5.11.12-300.fc34.x86_64 /lib/modules/5.11.12-300.fc34.x86_64/vmlinuz
kernel-install could be started by the kernel package's RPM scriplet [2].
As mentioned in previous commit "try to reset kernel crashkernel when
kexec-tools updates the default crashkernel value", kdumpctl has difficulty
running in RPM scriptlet fore CoreOS. But rpm-ostree ignores all kernel hooks,
there is no need to disable the kernel hook for CoreOS/Atomic/Silverblue. But a
collaboration between rpm-ostree and kexec-tools is needed [3] to take care
of this special case.
Note the crashkernel.default support is dropped.
[1] https://www.freedesktop.org/software/systemd/man/kernel-install.html
[2] https://src.fedoraproject.org/rpms/kernel/blob/rawhide/f/kernel.spec#_2680
[3] https://github.com/coreos/rpm-ostree/issues/2894
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
kexec-tools could update the default crashkernel value.
When auto_reset_crashkernel=yes, reset kernel to new crashkernel
value in the following two cases,
- crashkernel=auto is found in the kernel cmdline
- the kernel crashkernel was previously set by kexec-tools i.e.
the kernel is using old default crashkernel value
To tell if the user is using a custom value for the kernel crashkernel
or not, we assume the user would never use the default crashkernel value
as custom value. When kexec-tools gets updated,
1. save the default crashkernel value of the older package to
/tmp/crashkernel (for POWER system, /tmp/crashkernel_fadump is saved
as well).
2. If auto_reset_crashkernel=yes, iterate all installed kernels.
For each kernel, compare its crashkernel value with the old
default crashkernel and reset it if yes
The implementation makes use of two RPM scriptlets [2],
- %pre is run before a package is installed so we can use it to save
old default crashkernel value
- %post is run after a package installed so we can use it to try to reset
kernel crashkernel
There are several problems when running kdumpctl in the RPM scripts
for CoreOS/Atomic/Silverblue, for example, the lock can't be acquired by
kdumpctl, "rpm-ostree kargs" can't be run and etc.. So don't enable this
feature for CoreOS/Atomic/Silverblue.
Note latest shellcheck (0.8.0) gives false positives about the
associative array as of this commit. And Fedora's shellcheck is 0.7.2
and can't even correctly parse the shell code because of the associative
array.
[1] https://github.com/koalaman/shellcheck/issues/2399
[2] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
This option will determine whether to reset kernel crashkernel
to new default value or not when kexec-tools updates the default
crashkernel value and existing kernels using the old default kernel
crashkernel value. Default to yes.
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Rewrite kdumpctl reset-crashkernel KERNEL_PATH as
kdumpctl reset-crashkernel [--fadump=[on|off|nocma]] [--kernel=path_to_kernel] [--reboot]
This interface would reset a specific kernel to the default crashkernel value
given the kernel path. And it also supports grubby's syntax so there are the
following special cases,
- if --kernel not specified,
- use KDUMP_KERNELVER if it's defined in /etc/sysconfig/kdump
- otherwise use current running kernel, i.e. `uname -r`
- if --kernel=DEFAULT, the default boot kernel is chosen
- if --kernel=ALL, all kernels would have its crashkernel reset to the
default value and the /etc/default/grub is updated as well
--fadump=[on|off|nocma] toggles fadump on/off for the kernel provided
in KERNEL_PATH. If --fadump is omitted, the dump mode is determined by
parsing the kernel command line for the kernel(s) to update.
CoreOS/Atomic/Silverblue needs to be treated as a special case because,
- "rpm-ostree kargs" is used to manage kernel command line parameters
so --kernel doesn't make sense and there is no need to find current
running kernel
- "rpm-ostree kargs" itself would prompt the user to reboot the system
after modify the kernel command line parameter
- POWER is not supported so we can assume the dump mode is always kdump
This interface will also be called by kexec-tools RPM scriptlets [1]
to reset crashkernel.
Note the support of crashkenrel.default is dropped.
[1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
grubby --info=kernel-path or --add-kernel=kernel-path accepts a kernel
path (e.g. /boot/vmlinuz-5.14.14-200.fc34.x86_64) instead of kernel release
(e.g 5.14.14-200.fc34.x86_64). So we need to know the kernel path given
a kernel release. Although for Fedora/RHEL, the kernel path is
"/boot/vmlinuz-<KERNEL_RELEASE>", a path kernel could also be
/boot/<machine-id>/<KERNEL_RELEASE>/vmlinuz. So the most reliable way to
find the kernel path given a kernel release is to use "grubby --info".
For osbuild, a kernel path may not yet exist but it's valid for
"grubby --update-kernel=KERNEL_PATH". For example, "grubby -info" may
output something as follows,
index=0
kernel="/var/cache/osbuild-worker/osbuild-store/tmp/tmp2prywdy5object/tree/boot/vmlinuz-5.15.10-100.fc34.x86_64"
args="ro no_timer_check net.ifnames=0 console=tty1 console=ttyS0,115200n8"
root="UUID=76a22bf4-f153-4541-b6c7-0332c0dfaeac"
initrd="/var/cache/osbuild-worker/osbuild-store/tmp/tmp2prywdy5object/tree/boot/initramfs-5.15.10-100.fc34.x86_64.img"
There is no need to check if path like
/var/cache/osbuild-worker/osbuild-store/tmp/tmp2prywdy5object/tree/boot/vmlinuz-5.15.10-100.fc34.x86_64
physically exists.
Note these helper functions doesn't support CoreOS/Atomic/Silverblue
since grubby isn't used by them.
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Add a helper function to get dump mode. The dump mode would be
- fadump if fadump=on or fadump=nocma
- kdump if fadump=off or empty fadump
Otherwise return 1.
Also add another helper function to return a kernel's dump mode.
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
This helper function will be used to retrieve the value of kernel
cmdline parameters including crashkernel, fadump, swiotlb and etc.
Suggested-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Provide "kdumpctl get-default-crashkernel" for kdump_anaconda_addon
so crashkernel.default isn't needed.
When fadump is on, kdump_anaconda_addon would need to specify the dump
mode, i.e. "kdumpctl get-default-crashkernel fadump".
This interface would also be used by RPM scriptlet [1] to fetch default
crashkernel value.
[1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Factor out kdump_get_arch_recommend_crashkernel to prepare for
kdump-anaconda-plugin for example to retrieve the default crashkernel
value.
Note the support of crashkenrel.default is dropped.
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
It has been decided to increase default crashkernel value to reduce the
possibility of OOM.
Fixes: 7b7ddab ("kdump-lib.sh: kdump_get_arch_recommend_size uses crashkernel.default")
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
It seems the save_core function and vmcore detection was used a long
time ago when kdump shares same userspace in first and second kernel.
It's now heavily deprecated (only support cp, hardcoded path, dumpoops
no longer exists) and not used.
Now vmcore will never show up in first kernel for both kdump and fadump
case, and kdumpctl is only used in first kernel, so just remove them.
Signed-off-by: Kairui Song <kasong@tencent.com>
Acked-by: Coiby Xu <coxu@redhat.com>
For earlykdump, kdump-lib-initramfs.sh is sourced by kdump-lib.sh,
however it is not installed in dracut-early-kdump-module-setup.sh. Same
as xargs, which is used by kdump-lib.sh. Otherwise earlykdump will report
file not found errors.
Fixes: a5faa052d4
("kdump-lib-initramfs.sh: prepare to be a POSIX compatible lib")
Fixes: 4f01cb1b0a
("kdump-lib.sh: fix variable quoting issue")
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
Onlining secondary cpus breaks kdump completely on KVM on Power hosts
Though we use maxcpus=1 by default but 40-redhat.rules will bring up all
possible cpus by default.
Thus before we get the kernel fix and the systemd rule fix let's remove
the cpu rule in 40-redhat.rules for ppc64/ppc64le kdump initramfs.
This is back ported from RHEL, and original credit goes to Dave Young
<dyoung@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Tao Liu <ltao@redhat.com>
When kexec-tools is newly installed, kdump migration action hasn't
registered and the following error could occur,
INF dnf.rpm: Could not find a registered notification tool with the specified command ('/usr/lib/kdump/kdump-migrate-action.sh').
"servicelog_notify --list" could list registered notification tools for
a command but it outputs the above error as well. So simply redirect the
error to /dev/null when running "servicelog_notify --remove".
Fixes: commit 146f662622
("kdump/ppc64: migration action registration clean up")
Acked-by: Tao Liu <ltao@redhat.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
When secureboot is enabled, kdumpctl needs to use keyctl to add/remove
a key to/from the .ima keyring.
Fixes: commit 596fa0a07f
("kdumpctl: enable secure boot on ppc64le LPARs")
Signed-off-by: Coiby Xu <coxu@redhat.com>
The Zstandard (zstd) compression method is not enabled:
$ makedumpfile -v
makedumpfile: version 1.7.0 (released on 8 Nov 2021)
lzo enabled
snappy enabled
zstd disabled
This patch will enable it when building kexec-tools rpm package.
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
When there more than one binaries, quoting "$val" would make
dracut-install treat multiple binaries as one binary. Take
"extra_bins /usr/sbin/ping /usr/sbin/ip" as an example, the
following error would occur when building initrd,
dracut-install: ERROR: installing '/usr/sbin/ping /usr/sbin/ip'
dracut: FAILED: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.ODrioZ/initramfs -a /usr/sbin/ping /usr/sbin/ip
Fix it by not quoting the variable and bypassing SC2086 shellcheck.
Fixes: commit 86538ca6e2
("bash scripts: fix variable quoting issue")
Acked-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
nfs service will append extra mount options to kernel mount options.
Such as mountaddr/mountproto options. These options only represent
current mounting details of the 1st kernel, but may not appropriate
for the 2nd kernel for the same reason as commit
d4f04afa47 ("mkdumprd: drop some nfs
mount options when reading from kernel"). This patch will remove
these options.
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
The return value of set_ck_kernel or set_grub_ck is wrongly being used
as the exit code. This hook should exit with 0 or it may result in
unexpected behavior of kernel-install.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
To make things cleaner and more human readable, add a short comment for
the POSIX scripts.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
All non-POSIX syntax in second kernel are gone, tested on Fedora 34
with latest dracut, dash now works fine.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
This is a batch update done with:
shfmt -s -w kdump-lib.sh
Clean up code style and reduce code base size, no behaviour change.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
Fix a few ambiguous syntax issues and remove some unused variables.
Also refactor some code to make it more robust.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
Get rid of let, and remove useless '$' on arithmetic variables.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
Replace echo "$(cmd)" and "var=$(cmd); echo $var" with just `cmd`.
And remove some useless cat.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
This fixes word splitting issue with nmcli args. Current kexec-tools
scripts won't call nmcli with correct arguments when there are space in
network interface name.
nmcli expects multiple parameters, but get_nmcli_value_by_field only
accepts two params and depends on shell word splitting to split the
_nm_show_cmd into multiple params, which is very fragile.
So switch the param order, simplified this function and now multiple
params can be used properly.
And get_nmcli_connection_show_cmd_by_ifname returns multiple
nmcli params in a single variable, it depend on shell word splitting to
split the words when calling nmcli. But this is very fragile and break
easily when there are any special character in the connection path.
This function is only introduced to get and cache the nmcli command
which contains the "connection name".
Actually only cache the "connection path" is enough. Callers should
just call get_nmcli_connection_apath_by_ifname to cache the path, and
a new helper get_nmcli_field_by_conpath is introduced here to get value
from nmcli. This way "connection path" can contain any character.
Also get rid of another nmcli_cmd usage in
get_nmcli_connection_apath_by_ifname which stores multiple params in a
single bash variable separated by space.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
Updated file syntax with following command:
sed -i -e 's/\(\s\)\[\s\([^]]*\)\s\]/\1\[\[\ \2 \]\]/g' kdump-lib.sh
(replace '[ ]' with '[[ ]]')
sed -i -e 's/`\([^`]*\)`/\$(\1)/g' kdump-lib.sh
(replace `...` with $(...))
And manually updated [[ ... -a ... ]] and [[ ... -o ... ]] with && and
||.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
POSIX doesn't support keyword local, so add double underscore and prefix
to variable names, and reduce variable usage, to avoid any variable name
conflict.
Also reformat the code with `shfmt -s -w kdump-lib-initramfs.sh`.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
This is done with `shfmt -w -s dracut-kdump.sh`. There is no behaviour
change.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
POSIX doesn't support keyword `local`, so this commit reduced variable usage.
Heredoc ("<<<") operation is also not supported, so kdump.conf is now pre-parse
into a temp file. Also fixes many POSIX syntax errors.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
Set pipefail will cause POSIX shell to exit with failure. So only do
that in bash.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>