For CoreOS based systems we use Ignition for provisioning machines
in the initramfs on first boot. We trigger Ignition right now by
the presence of `ignition.firstboot` in the kernel command line. The
kernel argument is only present on first boot so after a reboot it
no longer is in the kernel command line.
If a kernel crash happens before the first reboot of a machine we
want the `ignition.firstboot` kernel argument to be removed and not
passed on to the crash kernel.
This patch rewrites get_recommend_size to get rid of the following
limitations,
1. only supports ranges in crashkernel sorted in increasing order
2. the first entry of crashkernel should have only a single digit and
it's in gigabytes
Suggested-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Recently, it's found 'kdumpctl estimate' returns 512M while the system
reserves 1024M kdump memory in a case. This happens because the ranges
in /proc/iomem are inclusively. For example, "0-1: System RAM" means 2
bytes of system memory other than 1 byte. Fix this error by adding one
more byte.
Note
1. the function has been simplified as well.
2. define PROC_IOMEM as /proc/iomem for the sake of unit tests
Reported-by: Ruowen Qin <ruqin@redhat.com>
Fixes: 1813189 ("kdump-lib.sh: introduce functions to return recommened mem size")
Suggested-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Make log saving the last step of kdump.sh, so it can catch more info,
for example, the output of post.d hooks will be covered by the log now.
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
1. yum is deprecated so use dnf instead
2. use the "kdumpctl reset-crashkernel --fadump=on" API
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
1. yum is deprecated so use dnf instead
2. use the "kdumpctl reset-crashkernel" API
3. ask the users to refer to crashkernel-howto.txt for setting custom
crashkernel value
4. fix a typo
Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
1. clean up left crashkernel.default
2. fix a few typos and grammar mistakes
3. ask the users to refer to `man kdumpctl` for reset-crashkernel
Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
This test prevents the mistake of adding an option to kdump.conf
without changing check_config as is the case with commit 73ced7f
("introduce the auto_reset_crashkernel option to kdump.conf").
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
kdump_get_conf_val allows to retrieves config value defined in
kdump.conf and it also supports sed regex like
"ext[234]\|xfs\|btrfs\|minix\|raw\|nfs\|ssh".
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
This commit adds a relatively thorough test suite for
kdumpctl reset-crashkernel [--fadump=[on|off|nocma]] [--kernel=path_to_kernel] [--reboot]
as implemented in commit 140da74 ("rewrite reset_crashkernel to support
fadump and to used by RPM scriptlet").
grubby have a few options to support its own testing,
- --no-etc-grub-update, not update /etc/default/grub
- --bad-image-okay, don't check the validity of the image
- --env, specify custom grub2 environment block file to avoid modifying
the default /boot/grub2/grubenv
- --bls-directory, specify custom BootLoaderSpec config files to avoid
modifying the default /boot/loader/entries
So the grubby called by kdumpctl is mocked as
@grubby --grub2 --no-etc-grub-update --bad-image-okay --env=$SPEC_TEST_DIR/env_temp -b $SPEC_TEST_DIR/boot_load_entries "$@"
in the tests. To be able to call the actual grubby in the mock function [1],
ShellSpec provides the following command
$ shellspec --gen-bin @grubby
to generate spec/support/bins/@grubby which is used to call the actual grubby.
kdumpctl has implemented its own version of updating /etc/default/grub
in _update_kernel_cmdline_in_grub_etc_default. To avoiding writing to
/etc/default/grub, this function is mocked as outputting its name and
received arguments similar to python unitest's assert_called_with.
[1] https://github.com/shellspec/shellspec#execute-the-actual-command-within-a-mock-function
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
AfterAll is an example group hook [1] which would be run after the group
tests are executed. Use this hook to clean up the files created by mktemp.
[1] https://github.com/shellspec/shellspec#beforeall-afterall---example-group-hook
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Currently there are two issues with unit-testing the functions defined
in kdumpctl and other shell scripts after sourcing them,
- kdumpctl would call main which requires root permission and would
create single instance lock (/var/lock/kdump)
- kdumpctl and other shell scripts directly source files under /usr/lib/kdump/
When ShellSpec load a script via "Include", it defines the__SOURCED__
variable. By making use of __SOURCED__, we can
1. let kdumpctl not call main when kdumpctl is "Include"d by ShellSpec
2. instruct kdumpctl and kdump-lib.sh to source the files in the repo
when running ShelSpec tests
Note coverage/ is added to .gitignore because ShellSpec generates code
coverage results in this folder.
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Make use of the new ${OPT[]} array and simplify local_fs_dump_target to
remove one more file operations.
While at it rename the local_fs_dump_target to is_local_target
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
With the introduction of ${OPT[fstype]} this call to kdump_get_conf_val
can be removed now as well.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
The variable is only used for ssh dump targets. Furthermore it is
identical to the value stored in ${OPT[_target]}. Thus drop DUMP_TARGET and
use ${OPT[_target]} instead.
In order to be able to distinguish between the different target types
introduce the internal ${OPT[_fstype]}.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
The variable is only used for ssh dump targets. Furthermore it is
identical to the value stored in ${OPT[sshkey]}. Thus drop
SSH_KEY_LOCATION and use ${OPT[sshkey]} instead.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
The variable is only used for ssh dump targets. Furthermore it is
identical to the value stored in ${OPT[path]}. Thus drop SAVE_PATH and
use ${OPT[path]} instead.
Also make sure that ${OPT[path]} is always set to the default value when
no entry in kdump.conf is found.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Every call to kdump_get_conf_val parses kdump.conf although the file has
already been parsed in check_config. Thus store the values parsed in
check_config in an array and use them later instead of re-parsing the
file over and over again.
While at it rename check_config to parse_config.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
check_config and check_ssh_config both parse /etc/kdump.conf and are
usually used together. The difference between both is that
check_ssh_config does some extra checks on the format of the provided
ssh destination but ignores invalid or deprecated options in the config.
Thus merge check_ssh_config into check_config. Leave the additional
checks on the ssh destination in check_ssh_config but treat it like the
checks done for e.g. the failure_action.
This slightly changes the behavior of 'kdumpctl propagate', which now
fails if kdump.conf contains an invalid value unrelated to ssh. This
change in behavior isn't problematic because 'kdumpctl propagate' always
needs to be followed by a 'kdumpctl start' to have a working kdump
environment. For the situations where 'propagate' fails now the 'start'
would have failed in the past. So the failure only moved one step ahead
in the sequence.
While at it drop check_ssh_target and call check_and_wait_network_ready
directly.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
The function has multiple problems:
1) SSH_{USER,SERVER} aren't defined local
2) Weird use of cut and sed to parse the DUMP_TARGET for the user and
host although check_ssh_config guarantees that it has the format
<user>@<host>.
3) Unnecessary use of a variable for the return value
4) Weird behavior to first unpack the DUMP_TARGET to SSH_USER and
SSH_SERVER and then putting it back together again
5) Definition of variable errmsg that is only used once but breaks
grep-ability of error message.
6) Wrong order when redirecting output of ssh-keygen, see SC2069 [1]
Fix them now.
While at it also improve the error messages in the function.
[1] https://www.shellcheck.net/wiki/SC2069
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
For ssh targets kdumpctl only verifies that the config value has the
correct <user>@<host> format itself. For all other tests, e.g. if the
destination can be reached, it relies on ssh. This allows users to
provide a <host> that isn't the proper hostname but an alias defined in
the ssh_config without failing the tests. If this is done
dracut-module-setup.sh:kdump_get_remote_ip will fail to obtain the
targets ip address. This failure is not detected and thus will not fail
the initramfs creation. The resulting initramfs however doesn't have the
necessary information for setting up the network and thus will fail to
boot.
Prevent the use of alias hostnames by verifying that the given hostname
is the same one ssh would use after parsing the ssh_config.
Note: Don't use getent ahosts to verify that the given host can be
resolved as this requires the network to be up which cannot be
guaranteed when the kdump.conf is parsed.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
The time out was increased to 180 seconds in 680c0d3 ("kdumpctl:
distinguish the failed reason of ssh"). Update the comment to reflect
that change.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
There are currently three identical definitions for the default ssh key.
Combine them into one in kdump-lib-initramfs.sh.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
in prepare_kdump_bootinfo s/defaut/default/.
While at it declare it with the other local variables as local.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Using syslog for StandardOutput in a service file was deprecated in
systemd v246 with commit f3dc6af20f ("core: automatically update
StandardOuput=syslog to =journal (and similar for StandardError=)").
Thus the following warnings are printed in the crash kernel when
creating a dump.
systemd[1]: /usr/lib/systemd/system/kdump-capture.service:23: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
systemd[1]: /usr/lib/systemd/system/kdump-capture.service:24: Standard output type syslog+console is obsolete, automatically updating to journal+console. Please update your unit file, and consider removing the setting altogether.
Fix this by redirecting the stdout and stderr to the journal.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
do_estimate prints the warning that the reserved crashkernel is lower
than the recommended one even then when both values are identical. This
might cause confusion. So omit printing the warning when both values are
equal.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
There is a system-wide sync call at the end of mkdumprd, move it to
kdumpctl after rebuild initrd and add another one for mkfadumprd.
Sync only the $TARGET_INITRD to avoid a system-wide sync taking too
long on a system with high disk activity.
Also update the sync in kdumpctl:restore_default_initrd which will
mv the $DEFAULT_INITRD_BAK to $DEFAULT_INITRD.
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
If GRUB_ETC_DEFAULT use crashkernel=auto or
crashkernel=OLD_DEFAULT_CRASHKERNEL, it should be updated as well.
Add a helper function to read kernel cmdline parameter from
GRUB_ETC_DEFAULT. This function is used to read kernel cmdline
parameter like fadump or crashkernel.
Reviewed-by: Philipp Rudo <prudo@redhat.com>
There is the case where there are multiple entries of the same parameter on
the command line, e.g.
GRUB_CMDLINE_LINUX="crashkernel=110M crashkernel=220M fadump=on crashkernel=330M".
In such an situation _update_kernel_cmdline_in_grub_etc_default only
updates/removes the last entry which is usually not what you want as the
kernel (for crashkernel) takes the last entry it can find.
Thus make sure the case with multiple entries of the same parameter is
handled properly by removing all occurrences of given parameter first.
Note
1. sed command group and conditional control has been used to get rid of
grep.
2. Fully supporting kernel cmdline as documented in
Documentation/admin-guide/kernel-parameters.rst is complex and in
foreseeable future a full implementation is not needed. So simply
document the unsupported cases instead.
Fixes: 140da74 ("rewrite reset_crashkernel to support fadump and to used by RPM scriptlet")
Reported-by: Philipp Rudo <prudo@redhat.com>
Suggested-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
When doing in-place upgrading using leapp on x86_64, kdumpcl can't
acquire instance lock when running in %post RPM scriplet on x86_64,
localhost upgrade[1306]: /bin/kdumpctl: line 49: /var/lock/kdump: No such file or directory
localhost upgrade[1306]: kdump: Create file lock failed
and running "touch /var/lock/dkump" also fails with
"No such file or directory". Thus kdumpctl can't be run in %post
scriptlet. But kdumpctl can be run in %posttrans RPM scriplet.
Besides, it's better to update crashkernel after the kernel has been
updated. So let's update kernel crashkernel in the %posttrans
scriptlet which will be run in the end of a transaction i.e. after
the kernel has been updated.
Note for %posttrans scriptlet, "$1 == 1" means both installing a new
package and upgrading a package.
[1] https://github.com/apptainer/singularity/issues/2386#issuecomment-474747054
Reported-by: Jie Li <jieli@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Previously the output of blkid is not checked. If the output
is empty, the eval will report the following error message:
/lib/kdump/kdump-lib.sh: eval: line 925: syntax error near unexpected token `;'
/lib/kdump/kdump-lib.sh: eval: line 925: `; echo $TYPE'
For example, we can observe such a failing when blkid is invoked
against a lvm thinpool block device:
$ blkid -u filesystem,crypto -o export -- "/dev/block/253\:2"
$ echo $?
2
$ udevadm info /dev/block/253\:2|grep S\:
S: mapper/vg00-thinpoll_tmeta
In this patch, we will use sed instead of eval, to output the
fstype of block device if any.
Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
It's found that the kernel cmdline crashkernel=auto doesn't get updated
when upgrading kexec-tools. This happens because _get_all_kernels_from_grubby
is called with no argument by reset_crashkernel_after_update. When retrieving
all kernel paths on the system, "grubby --info ALL" should be used. Fix this
error by passing "ALL" argument.
Fixes: 0adb0f4 ("try to reset kernel crashkernel when kexec-tools updates the default crashkernel value")
Reported-by: Jie Li <jieli@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
_is_osbuild fails because it expects the 1st and 2nd function parameter
to be the environment variable and environ file path respectively. Fix
it by swapping the parameters in read_proc_environ_var.
Note the osbuild environ file path is defined in _OSBUILD_ENVIRON_PATH
so _is_osbuild can be unit-tested by overwriting _OSBUILD_ENVIRON_PATH.
Fixes: 6a3ce83 ("fix the error of parsing the container environ variable for osbuild")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
shellcheck finds the following problem,
$ shellcheck kdump-lib.sh
In kdump-lib.sh line 876:
get_recommend_size "$sys_mem" "$ck_cmdline"
^---------^ SC2154: ck_cmdline is referenced but not assigned (did you mean '_ck_cmdline'?).
s/ck_cmdline/_ck_cmdline to fix kdump_get_arch_recommend_size.
Note s/sys_mem/_sys_mem as well to make the changes consistent.
Fixes: 105c016 ("factor out kdump_get_arch_recommend_crashkernel")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Tao Liu <ltao@redhat.com>
This patch makes the default crashkernel value consistent with previous
one.
Fixes: 105c016 ("factor out kdump_get_arch_recommend_crashkernel")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
The environment variable entries in /proc/[pid]/environ are separated by
null bytes instead of by spaces. Update the sed regex to fix this issue.
Note that,
1. this patch also fixes a issue which is kdumpctl would try to reset
crashkernel even osbuild has provided custom crashkernel value.
2. kernel hook 92-crashkernel.install installed by kexec-tools is
guaranteed to be ran by kernel-install. kexec-tools doesn't recommend
kernel so there is no guarantee kernel is installed after kexec-tools.
But dnf invokes kernel-install in the posttrans scriptlet (of kernel-core)
which is always ran after all packages including kexec-tools and kernel
in a dnf transaction.
3. To be able to do unit tests, the logic of reading environment variable
has been extracted as a separate function.
Fixes: ddd428a ("set up kernel crashkernel for osbuild in kernel hook")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Resolves: bz2025860
Upstream: git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
commit 186e7b0752d8fce1618fa37519671c834c46340e
Author: Alexander Egorenkov <egorenar@linux.ibm.com>
Date: Wed Dec 15 18:48:53 2021 +0100
s390: handle R_390_PLT32DBL reloc entries in machine_apply_elf_rel()
Starting with gcc 11.3, the C compiler will generate PLT-relative function
calls even if they are local and do not require it. Later on during linking,
the linker will replace all PLT-relative calls to local functions with
PC-relative ones. Unfortunately, the purgatory code of kexec/kdump is
not being linked as a regular executable or shared library would have been,
and therefore, all PLT-relative addresses remain in the generated purgatory
object code unresolved. This in turn lets kexec-tools fail with
"Unknown rela relocation: 0x14 0x73c0901c" for such relocation types.
Furthermore, the clang C compiler has always behaved like described above
and this commit should fix the purgatory code built with the latter.
Because the purgatory code is no regular executable or shared library,
contains only calls to local functions and has no PLT, all R_390_PLT32DBL
relocation entries can be resolved just like a R_390_PC32DBL one.
* https://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_zSeries/x1633.html#AEN1699
Relocation entries of purgatory code generated with gcc 11.3
------------------------------------------------------------
$ readelf -r purgatory/purgatory.o
Relocation section '.rela.text' at offset 0x6e8 contains 27 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000000000c 000300000013 R_390_PC32DBL 0000000000000000 .data + 2
00000000001a 001000000014 R_390_PLT32DBL 0000000000000000 sha256_starts + 2
000000000030 001100000014 R_390_PLT32DBL 0000000000000000 sha256_update + 2
000000000046 001200000014 R_390_PLT32DBL 0000000000000000 sha256_finish + 2
000000000050 000300000013 R_390_PC32DBL 0000000000000000 .data + 102
00000000005a 001300000014 R_390_PLT32DBL 0000000000000000 memcmp + 2
...
000000000118 001600000014 R_390_PLT32DBL 0000000000000000 setup_arch + 2
00000000011e 000300000013 R_390_PC32DBL 0000000000000000 .data + 2
00000000012c 000f00000014 R_390_PLT32DBL 0000000000000000 verify_sha256_digest + 2
000000000142 001700000014 R_390_PLT32DBL 0000000000000000
post_verification[...] + 2
Relocation entries of purgatory code generated with gcc 11.2
------------------------------------------------------------
$ readelf -r purgatory/purgatory.o
Relocation section '.rela.text' at offset 0x6e8 contains 27 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000000000e 000300000013 R_390_PC32DBL 0000000000000000 .data + 2
00000000001c 001000000013 R_390_PC32DBL 0000000000000000 sha256_starts + 2
000000000036 001100000013 R_390_PC32DBL 0000000000000000 sha256_update + 2
000000000048 001200000013 R_390_PC32DBL 0000000000000000 sha256_finish + 2
000000000052 000300000013 R_390_PC32DBL 0000000000000000 .data + 102
00000000005c 001300000013 R_390_PC32DBL 0000000000000000 memcmp + 2
...
00000000011a 001600000013 R_390_PC32DBL 0000000000000000 setup_arch + 2
000000000120 000300000013 R_390_PC32DBL 0000000000000000 .data + 122
000000000130 000f00000013 R_390_PC32DBL 0000000000000000 verify_sha256_digest + 2
000000000146 001700000013 R_390_PC32DBL 0000000000000000 post_verification[...] + 2
Corresponding s390 kernel discussion:
* https://lore.kernel.org/linux-s390/20211208105801.188140-1-egorenar@linux.ibm.com/T/#u
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reported-by: Tao Liu <ltao@redhat.com>
Suggested-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
[hca@linux.ibm.com: changed commit message as requested by Philipp Rudo]
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
v2:
- Moved patch 601 -> 401
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>