Commit Graph

264 Commits

Author SHA1 Message Date
Coiby Xu
0ffce0ef4e Only try to reset crashkernel when kdump.service is enabled
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2243068

Currently, when kexec-tools is installed, the kernel will automatically
have the crashkernel parameter set up. In the case where users only want
the kexec reboot feature, this is not what users want as a 1G-RAM system
will lose 192M memory. Considering Fedora's systemd preset policy has
kdump.service disabled and RHEL has kdump.service enabled, this patch
makes kexec-tools only reset crashkernel when kdump.service is enabled.
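
A minimal sketch of such a gate, assuming the check is done with `systemctl is-enabled`. The helper name `kdump_service_enabled` is hypothetical and not taken from the patch itself.

```shell
# Hypothetical helper, not taken from kdumpctl or the spec file.
# Succeeds when "systemctl is-enabled" exits with status 0 for kdump.service.
kdump_service_enabled() {
    systemctl is-enabled --quiet kdump.service 2>/dev/null
}

if kdump_service_enabled; then
    echo "kdump.service is enabled: crashkernel may be reset"
else
    echo "kdump.service is not enabled: leaving crashkernel untouched"
fi
```

On a system without systemd the helper simply fails, so the crashkernel parameter would be left alone.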

Reported-by: Chris Murphy <bugzilla@colorremedies.com>
Cc: Philipp Rudo <prudo@redhat.com>
Cc: Adam Williamson <awilliam@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-10-17 13:45:30 +08:00
Nayna Jain
4fa17b2ee4 powerpc: update kdumpctl to load kernel signing key for fadump
On secure boot enabled systems with static keys, kexec with kexec_file_load(-s)
fails with "Permission Denied" when fadump is enabled.

Similar to kdump, load kernel signing key for fadump as well.

Reported-by: Sachin P Bappalige <sachinpb@linux.vnet.ibm.com>
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
2023-10-10 08:42:01 +08:00
Nayna Jain
fe6eb30e67 powerpc: update kdumpctl to remove deletion of kernel signing key once loaded
Kernel signing key is deleted once kdump is loaded. This causes confusion in
debugging since key is no longer visible. Unless someone knows how
kdumpctl script works, it is difficult to find out how kdump could be
loaded when there is no key on .ima keyring.

Remove deletion of kernel signing key once loaded. And then to prevent
multiple loading of same key when kdump service is disabled/enabled, update
key description field as well.

Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-10-10 08:42:01 +08:00
Philipp Rudo
8175924e89 kdumpctl: Stop updating grub config in reset_crashkernel
With multiple kernel variants on the same architecture, e.g. the 4k and
64k kernel on aarch64, we can no longer assume that the crashkernel
value for the currently running kernel will work for all installed
kernels. This also means that we can no longer update the grub config as
we don't know which value to set it to. Thus get the crashkernel value
for each kernel and stop updating the grub config.

While at it merge the _new_fadump and _fadump_val variables and remove
_read_kernel_arg_in_grub_etc_default which has no user.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
2023-09-14 15:01:52 +08:00
Philipp Rudo
099434b993 kdumpctl: Prevent option --fadump on non-PPC in reset_crashkernel
Prevent the --fadump option from being used on non-PPC systems. This not only
prevents user errors but also guarantees that _dump_mode and _fadump_val are
empty on these systems.
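
A sketch of such a guard, assuming the architecture is taken from `uname -m` and that PPC shows up as `ppc64` or `ppc64le`. The function name `fadump_supported` is hypothetical.

```shell
# Hypothetical check; the architecture is passed in so it can be tested.
fadump_supported() {
    case "$1" in
    ppc64 | ppc64le) return 0 ;;
    *) return 1 ;;
    esac
}

arch=$(uname -m)
if ! fadump_supported "$arch"; then
    echo "--fadump is only valid on PPC, not on $arch" >&2
fi
```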

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
2023-09-14 15:01:52 +08:00
Philipp Rudo
f5785c60aa kdumpctl: simplify _update_kernel_cmdline
_update_kernel_cmdline handles two cmdline parameters at once. This does not
only make the function itself but also its callers more complicated than
necessary. For example in _update_crashkernel the fadump gets "updated" to
the value that has been read from grubby. Thus simplify
_update_kernel_cmdline to only update one parameter at once.

While at it shorten some variable names in the callers.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
2023-09-14 15:01:52 +08:00
Philipp Rudo
1049e1c79c kdumpctl: drop condrestart subcommand
condrestart is a left over from the time of SysVinit that is no longer
needed since the kexec-tools switched to systemd (10c91a1 ("Removing
sysvinit files") plus the one before). What's especially intriguing is
that from the beginning (0112f36 ("- Add a kdump sysconfig file and init
script - Spec file additions for pre/post install/uninstall")) the
sub-command never did any actual work (other than not returning an
error). Thus simply remove the condrestart sub-command.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
2023-09-14 15:01:52 +08:00
Philipp Rudo
b9738affc9 kdumpctl: drop _get_current_running_kernel_path
_get_current_running_kernel_path is identical to
_find_kernel_path_by_release $(uname -r) so simply use this instead of
defining a new function.

While at it simplify reset_crashkernel slightly. This changes the
behavior of the function for the case when KDUMP_KERNELVER is defined
but no kernel with this version is installed. Before, the missing
kernel is silently ignored and the currently running kernel is used
instead. Now, kdumpctl will exit with an error.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
2023-09-14 15:01:52 +08:00
Philipp Rudo
026edc2b59 Fix various shellcheck findings
This includes fixes for

SC2295 (info): Expansions inside ${..} need to be quoted separately, otherwise they match as patterns.
SC2005 (style): Useless echo? Instead of 'echo $(cmd)', just use 'cmd'.
SC2162 (info): read without -r will mangle backslashes.
SC2086 (info): Double quote to prevent globbing and word splitting.
SC2317 (info): Command appears to be unreachable. Check usage (or ignore if invoked indirectly).

In addition add some source hints to prevent false positive findings.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
2023-09-14 15:01:52 +08:00
Sourabh Jain
4b7b7736ee Introduce a function to get reserved memory size
The size of the reserved memory in the functions show_reserved_mem,
check_crash_mem_reserved, and do_estimate are fetched from the sysfs
node `/sys/kernel/kexec_crash_size`. However, in the case of fadump,
the reserved area size is instead present in
/sys/kernel/fadump/mem_reserved.

For example:

$ kdumpctl showmem
kdump: Dump mode is fadump
kdump: Reserved 0MB memory for crash kernel

The above command showed 0MB of reserved memory, which is incorrect; the
actual reservation was 2048MB.

To resolve this issue a new helper function is introduced to fetch
reserved memory size based on the dump mode. For "fadump" mode,
it looks in `/sys/kernel/fadump/mem_reserved`, otherwise, it uses
`/sys/kernel/kexec_crash_size`. All functions that previously
fetched the reserved memory size directly from the `/sys/kernel/kexec_crash_size`
sysfs node are now updated to use this new function to get the reserved
memory size.

With the fix in place, the `kdumpctl showmem` command will now display
correct reserved memory size.

$ kdumpctl showmem
kdump: Dump mode is fadump
kdump: Reserved 2048MB memory for crash kernel
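
The selection logic can be sketched as follows. Only the two sysfs paths come from the message above; the helper names and the assumption that both nodes report a size in bytes are illustrative.

```shell
# Hypothetical helper names; only the two sysfs paths come from the
# commit message above.
reserved_mem_node() {
    if [ "$1" = fadump ]; then
        echo /sys/kernel/fadump/mem_reserved
    else
        echo /sys/kernel/kexec_crash_size
    fi
}

# Assumption: the node holds a size in bytes.
bytes_to_mb() {
    echo $(($1 / 1024 / 1024))
}

echo "kdump node:  $(reserved_mem_node kdump)"
echo "fadump node: $(reserved_mem_node fadump)"
echo "2147483648 bytes = $(bytes_to_mb 2147483648)MB"
```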

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reported-by: Sachin P Bappalige <sachinpb@linux.vnet.ibm.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-08-15 13:51:14 +08:00
Philipp Rudo
dda81d72c2 kdumpctl: Fix temporary directory location
The temporary directory is currently created under the current working
directory. That alone isn't ideal but works most of the time. However,
it will fail when the current working directory is not writable. So make
sure the directory is created within TMPDIR.
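
One way to do this is to pass `mktemp` a template rooted at TMPDIR. This is a sketch; the helper name and the `kdumpctl.XXXXXX` template are assumptions, not the names used in kdumpctl.

```shell
# Hypothetical helper: create a private scratch directory below TMPDIR
# (falling back to /tmp) instead of the current working directory.
make_workdir() {
    mktemp -d "${TMPDIR:-/tmp}/kdumpctl.XXXXXX"
}

workdir=$(make_workdir) || exit 1
echo "scratch directory: $workdir"
rm -rf -- "$workdir"
```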

Fixes: ea00b7d ("kdumpctl: Move temp file in get_kernel_size to global temp dir")
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-06-20 11:17:43 +08:00
Pingfan Liu
64d93c886f kdumpctl: Fix the matching of plus symbol by grep's EREs
After introducing the 64k variant kernel on aarch64, an example kernel name
looks like "vmlinuz-5.14.0-316.el9.aarch64+64k". To match the plus
symbol, it must be escaped.
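
The effect can be reproduced with a small sketch. `escape_plus` is a hypothetical helper and only handles the plus sign; dots in the release string still act as regex wildcards here.

```shell
# Hypothetical helper: escape every '+' so that grep -E matches it literally.
escape_plus() {
    printf '%s\n' "$1" | sed 's/+/\\+/g'
}

release='5.14.0-316.el9.aarch64+64k'
kernel="/boot/vmlinuz-$release"

# Unescaped, "4+" means "one or more 4", so the literal '+' is not matched.
printf '%s\n' "$kernel" | grep -Eq "vmlinuz-$release\$" || echo "unescaped: no match"
# Escaped, the pattern matches the kernel name.
printf '%s\n' "$kernel" | grep -Eq "vmlinuz-$(escape_plus "$release")\$" && echo "escaped: match"
```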

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-06-14 17:33:16 +08:00
Pingfan Liu
05c4861443 kdump-lib: add support for 64K aarch64
On aarch64, both 4K and 64K kernels can be installed, and they demand
different sizes of reserved memory for the kdump kernel.

'getconf PAGE_SIZE' does not work when a 64K kernel is installed while
a 4K kernel is running. Hence resort to the kernel release naming rules.
At present, the 64K kernel has the keyword '64k' in its suffix.

The baseline for 64K is derived from the 4K one. A difference of 100M is
chosen because on a high-end machine without smmu enabled, the difference
in MemFree is 82M.

As for the smmu case, there is a huge difference in memory consumption
between the 64k and 4k drivers, and it should be calculated separately.
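
A sketch of the naming-based detection, with hypothetical helper names. The 256M baseline is a made-up value for illustration; only the '64k' suffix and the 100M difference come from the message.

```shell
# Hypothetical helper: detect a 64K kernel from its release string.
is_64k_release() {
    case "$1" in
    *64k) return 0 ;;
    *) return 1 ;;
    esac
}

# base_mb is a made-up 4K baseline; the 100M delta is from the commit message.
crashkernel_mb() {
    release=$1
    base_mb=$2
    if is_64k_release "$release"; then
        echo $((base_mb + 100))
    else
        echo "$base_mb"
    fi
}

crashkernel_mb 5.14.0-316.el9.aarch64+64k 256
crashkernel_mb 5.14.0-316.el9.aarch64 256
```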

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-06-14 17:33:16 +08:00
Coiby Xu
07b99ecab7 Add ShellSpec tests for managing the crashkernel kernel parameter
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-05-29 14:40:57 +08:00
Coiby Xu
5b31b099ae Simplify the management of the kernel parameter crashkernel
Currently, kexec-tools updates the crashkernel to a new default
value only when both of the following conditions are met,
 - auto_reset_crashkernel=yes in kdump.conf
 - existing kernels or current running kernel should use the old default
   value.

To address seen corner cases, the logic to tell if the second condition
is met becomes quite complex. Instead of making the logic more complex
to support aarch64-64k, this patch drops the second condition to
simplify the management of the crashkernel kernel parameter.

Another change brought by this simplification is kexec-tools will also
set up the kernel crashkernel parameter for a fresh install (previously
it's limited to osbuild).

Note
1. This patch also stops trying to update /etc/default/grub because
   a) it only affects the static file /boot/grub2/grub.cfg
   b) grubby is recommended to change the kernel command-line parameters
      for both Fedora [1] and RHEL9 [2][3]
   c) For the cases of aarch64 and POWER, different kernels could have
      different default crashkernel value.

2. Starting with Fedora 37, posttrans RPM scriptlets distinguish between
   package install and upgrade.

[1] https://fedoraproject.org/wiki/GRUB_2
[2] https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/managing_monitoring_and_updating_the_kernel/configuring-kernel-command-line-parameters_managing-monitoring-and-updating-the-kernel#changing-kernel-command-line-parameters-for-all-boot-entries_configuring-kernel-command-line-parameters
[3] https://access.redhat.com/solutions/1136173

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-05-29 14:40:57 +08:00
Coiby Xu
cdc0253a3c Let _update_kernel_cmdline return the correct return code
Currently, for non-s390x systems, the return code is 1 even when
_update_kernel_cmdline is correctly executed. This makes callers like
reset_crashkernel_after_update fail to print a message if a kernel has
its crashkernel updated. Fix it by putting the code inside the if block for
s390x.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-05-29 14:40:57 +08:00
Jeremy Linton
8af05dc45a kdumpctl: Add support for systemd-boot paths
The default systemd-boot installed kernels on fedora end up in the form:

/boot/efi/36b54597c46383/6.4.0-0.rc0.20230427git6e98b09da931.5.fc39.aarch64/linux

Where the kernel version is a directory containing the kernel (linux)
and the initrd. Thus _find_kernel_path_by_release needs to be a bit less
strict and allow some further characters in the grubby (really bootctl)
output.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-05-16 09:21:13 +08:00
Philipp Rudo
ea7be0608e kdumpctl: Add basic UKI support
A Unified Kernel Image (UKI) is a single EFI PE executable combining an
EFI stub, a kernel image, an initrd image, and the kernel command line.
They are defined in the Boot Loader Specification [1] as type #2
entries. UKIs have the advantage that all code as well as meta data that
is required to boot the system, not only the kernel image, is combined
in a single PE file and can be signed for EFI SecureBoot. This extends
the coverage of SecureBoot extensively.

For RHEL, support for UKIs was included in kernel-ark with 16c7e3ee836e
("redhat: Add sub-RPM with a EFI unified kernel image for virtual
machines").

There are two problems with UKIs from the kdump point of view at the
moment. First, they cannot be directly loaded via kexec_file_load and
second, the initrd included isn't suitable for kdump. In order to enable
kdump on systems with UKIs, build the kdump initrd as usual and extract
the kernel image before loading the crash kernel.

[1] https://uapi-group.org/specifications/specs/boot_loader_specification/

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-05-16 09:21:13 +08:00
Philipp Rudo
ea00b7db43 kdumpctl: Move temp file in get_kernel_size to global temp dir
Others will need to use temporary files, too. In order to avoid
potential clashes of multiple trap handlers move the local temp file
into a global temp dir.

While at it make sure that the trap handler returns the correct exit
code.
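
A common way to keep the exit code in a cleanup handler is to capture `$?` before doing anything else. The sketch below is an illustration under that assumption, not the handler from kdumpctl; `run_with_tmpdir` and the directory template are hypothetical.

```shell
# Hypothetical sketch: run a command with a scratch directory and
# propagate the command's exit status after cleanup.
run_with_tmpdir() (
    tmpdir=$(mktemp -d "${TMPDIR:-/tmp}/kdump-demo.XXXXXX") || exit 1
    # Capture $? first, so that the status of rm does not replace it.
    trap 'ret=$?; rm -rf -- "$tmpdir"; exit "$ret"' EXIT
    "$@" "$tmpdir"
)

run_with_tmpdir test -d
echo "exit status: $?"
```

The function body is a subshell, so the EXIT trap fires when the function finishes rather than when the calling script ends.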

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-05-16 09:21:13 +08:00
Philipp Rudo
81d89c885f kdumpctl: Move get_kernel_size to kdumpctl
The function is only used in do_estimate. Move it to kdumpctl to
prevent confusion.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-05-16 09:20:59 +08:00
Philipp Rudo
0ff44ca6e8 kdumpctl: fix is_dracut_mod_omitted
The function is pretty broken right now. To start with the -o/--omit
option allows a quoted, space separated list of modules. But using 'set'
breaks quotation and thus only considers the first element in the list.
Furthermore dracut uses getopt internally. This means that it is also
possible to pass the list via --omit=.

Fix the function by making use of getopt for parsing the dracut_args.
While at it also add test cases to cover the function.
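
A sketch of getopt-based parsing, assuming the util-linux `getopt` is available. `is_mod_omitted` is a hypothetical stand-in that takes the dracut argument string as a parameter instead of reading kdump.conf, and it is only exercised here with the omit options themselves.

```shell
# Hypothetical re-implementation; takes the module name and the dracut
# argument string as parameters.
is_mod_omitted() {
    mod=$1
    # Re-split the argument string while honouring its quoting.
    eval "set -- $2"
    # getopt normalizes "-o LIST", "-oLIST", "--omit LIST" and "--omit=LIST".
    parsed=$(getopt -q -o 'o:' -l 'omit:' -- "$@")
    eval "set -- $parsed"
    while [ $# -gt 0 ]; do
        case $1 in
        -o | --omit)
            # The option value is a space separated list of modules.
            for m in $2; do
                [ "$m" = "$mod" ] && return 0
            done
            shift 2
            ;;
        --) break ;;
        *) shift ;;
        esac
    done
    return 1
}

args='-o "plymouth resume" --omit=ifcfg'
is_mod_omitted resume "$args" && echo "resume is omitted"
is_mod_omitted nfs "$args" || echo "nfs is not omitted"
```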

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-17 14:49:51 +08:00
Philipp Rudo
f81e6ca8da kdump-lib: move is_dracut_mod_omitted to kdumpctl
The function is only used in kdumpctl. Thus move it there to keep
kdump-lib small and simple.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-17 14:49:51 +08:00
Lichen Liu
d619b6dabe kdumpctl: lower the log level in reset_crashkernel_for_installed_kernel
Although upgrading the kernel with `rpm -Uvh` is not recommended, the
kexec-tools plugin prints confusing error logs when a customer upgrades the
kernel through it.

```
kdump: kernel 5.14.0-80.el9.x86_64 doesn't exist
kdump: Couldn't find current running kernel
```

Not finding the currently running kernel will only make kdump unable to copy the
grub entry parameters to the newly installed kernel, so lower the log level.

Signed-off-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-10 12:20:15 +08:00
Coiby Xu
12e6cd2b76 Use the correct command to get architecture
`uname -r` was used by mistake. As a result, kexec-tools failed to
update crashkernel=auto during in-place upgrade from RHEL8 to RHEL9.

`uname -m` should be used to get architecture instead.

Fixes: 5951b5e2 ("Don't try to update crashkernel when bootloader is not installed")

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Lichen Liu <lichliu@redhat.com>
2023-02-21 11:33:06 +08:00
Philipp Rudo
d4e877214c kdumpctl: make do_estimate more robust
At the beginning of do_estimate it currently checks whether the
TARGET_INITRD exists and if not fails with an error message. This not
only requires the user to manually trigger the build of the initrd but
also ignores all cases where the TARGET_INITRD exists but needs to be
rebuilt, for example when there were changes to kdump.conf or when the
system switches from kdump to fadump. All these changes will impact the
outcome of do_estimate. Thus properly check whether the initrd needs to
be rebuilt and, if it does, trigger the rebuild automatically.

To do so move the check whether the TARGET_INITRD has fadump enabled to
is_system_modified and call this function. With this force_(no_)rebuild
options in kdump.conf are ignored to avoid unnecessary rebuilds.

While at it clean up check_system_modified and rename it to
is_system_modified. Furthermore move printing the info that the initrd
gets rebuilt to rebuild_initrd so that callers don't each repeat the same line.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:37:23 +08:00
Philipp Rudo
37577b93ed kdumpctl: refactor check_rebuild
check_rebuild uses a bunch of local variables to store the result of the
different checks performed. At the end of the function it then evaluates
which check failed to print an appropriate info and trigger a rebuild if
needed. This not only makes the function hard to read but also requires
all checks to be executed even if an earlier one already determined that
the initrd needs to be rebuilt. Thus refactor check_rebuild such that
it only checks whether the initrd needs to be rebuilt and leaves triggering
the rebuild to the caller (if needed). While at it rename the function to
need_initrd_rebuild.

Furthermore also move setup_initrd to the caller so it is more consistent
with the other users of the function.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:37:23 +08:00
Philipp Rudo
5eefcf2e94 kdumpctl: cleanup 'stop'
Like for 'start' move the printing of the error message to the calling
function. This not only makes the code more consistent with 'start' but
also prevents 'kdumpctl restart' from calling 'start' in case 'stop' has
failed. This doesn't impact the case when 'kdumpctl restart' is run
without any crash kernel being loaded as kexec will still return success
in that case.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:37:23 +08:00
Philipp Rudo
33b307af20 kdumpctl: cleanup 'start'
The function has many blocks of the kind

if ! cmd; then
  derror "Starting kdump: [FAILED]"
  return 1
fi

This duplicates code and makes the function hard to read. Thus move the
block to the calling function.
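
The resulting shape can be sketched like this. `derror` is a stand-in for kdumpctl's logger, and `check_config`, `load_kdump` and `main_start` are hypothetical names used only for illustration.

```shell
# Stand-in for kdumpctl's derror logger.
derror() { echo "kdump: $*" >&2; }

# Hypothetical steps; each one just reports success or failure.
check_config() { return 0; }
load_kdump() { return 1; }

# After the cleanup, start only propagates failures ...
start() {
    check_config || return 1
    load_kdump || return 1
}

# ... and the caller prints the message once.
main_start() {
    if ! start; then
        derror "Starting kdump: [FAILED]"
        return 1
    fi
}

main_start || echo "start failed with status $?"
```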

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:37:23 +08:00
Philipp Rudo
d55a056558 kdumpctl: move aws workaround to kdump-lib
Move the workaround for aws graviton cpus from load_kdump to
prepare_cmdline. This (1) makes the workaround available also for other
callers of prepare_cmdline (although not needed at the moment) and (2)
makes it easier to fix the problems found by the unit test included
earlier as all changes to the cmdline are done at one place now.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:37:23 +08:00
Philipp Rudo
b9fd7a4076 kdumpctl: merge check_current_{kdump,fadump}_status
Both functions are almost identical. The only differences are (1) the
sysfs node the status is read from and (2) the fact the fadump version
doesn't verify if the file it's trying to read actually exists. Thus
merge the two functions and get rid of the check_current_status wrapper.

While at it rename the function to is_kernel_loaded which explains
better what the function does.

Finally, after moving FADUMP_REGISTER_SYS_NODE shellcheck can no longer
access the definition and starts complaining about it not being quoted.
Thus quote all uses of FADUMP_REGISTER_SYS_NODE.
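
A sketch of what a merged check might look like. The function below is a hypothetical re-implementation that takes the sysfs node as a parameter; `/sys/kernel/kexec_crash_loaded` is used as the kdump node, and the assumption is that a node contains "1" when a kernel is loaded or registered.

```shell
# Hypothetical re-implementation: the sysfs node is passed in by the caller.
# Assumption: the node contains "1" when a kernel is loaded or registered.
is_kernel_loaded() {
    node=$1
    # A missing node counts as "not loaded" instead of raising an error.
    [ -f "$node" ] || return 1
    [ "$(cat "$node")" = 1 ]
}

if is_kernel_loaded /sys/kernel/kexec_crash_loaded; then
    echo "crash kernel is loaded"
else
    echo "crash kernel is not loaded"
fi
```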

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:37:23 +08:00
Philipp Rudo
d5faaee62b kdumpctl: simplify check_failure_action_config
With the deprecation of the 'default' option in kdump.conf
check_failure_action_config needed to track which option was used
(default or failure_action). This made the function quite complex. Thus
make option 'default' a true alias of 'failure_action' when parsing
kdump.conf and simplify check_failure_action_config.

Do the same simplifications for check_final_action_config as both
functions are basically identical.
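
Treating 'default' as a true alias could look roughly like the sketch below, which rewrites the option name while parsing. `normalize_conf` is a hypothetical helper, not the parser used by kdumpctl.

```shell
# Hypothetical parser sketch: print "option value" pairs from kdump.conf
# style input on stdin, rewriting the deprecated "default" to "failure_action".
normalize_conf() {
    while read -r opt val; do
        case "$opt" in
        '' | '#'*) continue ;;
        default) opt=failure_action ;;
        esac
        printf '%s %s\n' "$opt" "$val"
    done
}

printf '%s\n' '# sample' 'default reboot' 'path /var/crash' | normalize_conf
```

Later checks then only need to look at `failure_action`, whichever spelling the user wrote.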

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:37:23 +08:00
Coiby Xu
5951b5e268 Don't try to update crashkernel when bootloader is not installed
Currently when using anaconda to install the OS, the following errors
occur,

    INF packaging: Configuring (running scriptlet for): kernel-core-5.14.0-70.el9.x86_64 ...
    INF dnf.rpm: grep: /boot/grub2/grubenv: No such file or directory
    grep: /boot/grub2/grubenv: No such file or directory
    grep: /boot/grub2/grubenv: No such file or directory
    grep: /boot/grub2/grubenv: No such file or directory
    ...
    INF packaging: Configuring (running scriptlet for): kexec-tools-2.0.23-9.el9.x86_64 ...
    INF dnf.rpm: grep: /boot/grub2/grubenv: No such file or directory
    grep: /boot/grub2/grubenv: No such file or directory
    grep: /boot/grub2/grubenv: No such file or directory

Or for s390, the following errors occur,

    INF packaging: Configuring (running scriptlet for): kernel-core-5.14.0-71.el9.s390x ...
    03:37:51,232 INF dnf.rpm: grep: /etc/zipl.conf: No such file or directory
    grep: /etc/zipl.conf: No such file or directory
    grep: /etc/zipl.conf: No such file or directory

    INF packaging: Configuring (running scriptlet for): kexec-tools-2.0.23-9_1.el9_0.s390x ...
    INF dnf.rpm: grep: /etc/zipl.conf: No such file or directory

This is because when anaconda installs the packages, bootloader hasn't
been installed and /boot/grub2/grubenv or /etc/zipl.conf doesn't exist.
So don't try to update crashkernel when bootloader isn't ready to avoid
the above errors.

Note this is the second attempt to fix this issue. Previously a file
/tmp/kexec_tools_package_install was created to avoid running the
related code thus to avoid the above errors but unfortunately that
approach has two issues a) somehow osbuild doesn't delete it for RHEL b)
this file could still exist if users manually remove kexec-tools.

Fixes: e218128 ("Only try to reset crashkernel for osbuild during package install")
Reported-by: Jan Stodola <jstodola@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-12-22 10:33:00 +08:00
Hari Bathini
a833624fe5 fadump: avoid status check while starting in fadump mode
With kernel commit 607451ce0aa9b ("powerpc/fadump: register for fadump
as early as possible"), 'kdumpctl start' prematurely returns with the
below message:

    "Kdump already running: [WARNING]"

instead of setting default initrd with dump capture capability as
required for fadump. Skip status check in fadump mode to avoid this
problem.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-12-07 09:43:41 +08:00
Hari Bathini
25411da966 fadump: fix default initrd backup and restore logic
In case of fadump, default initrd is rebuilt with dump capturing
capability, as the same initrd is used for booting production kernel
as well as capture kernel.

The original initrd file is backed up with a checksum, to restore
it as the default initrd when fadump is disabled. As the checksum
file is not kernel version specific, switching between different
kernel versions and kdump/fadump dump mode breaks the default initrd
backup/restore logic. Fix this by having a kernel version specific
checksum file.

Also, if backing up initrd fails, retaining the checksum file isn't
useful. Remove it.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-12-07 09:42:29 +08:00
Lichen Liu
5eb77ee3fa kdumpctl: Optimize _find_kernel_path_by_release regex string
Currently _find_kernel_path_by_release uses grubby and grep to
find the kernel path. If both the normal kernel and its debug
variant exist, grep will return more than one kernel string.

```
kernel="/boot/vmlinuz-5.14.0-139.kpq0.el9.s390x+debug"
kernel="/boot/vmlinuz-5.14.0-139.kpq0.el9.s390x"
```

This will cause an error when installing debug kernel.

```
The param "/boot/vmlinuz-5.14.0-139.kpq0.el9.s390x+debug
/boot/vmlinuz-5.14.0-139.kpq0.el9.s390x" is incorrect
```
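
Anchoring the pattern at the closing quote is one way to get a single match. This is a sketch under that assumption, not the exact regex from the patch; `match_kernel_line` is a hypothetical helper.

```shell
# Hypothetical helper: keep only the grubby-style kernel= line that ends
# right after the release string, so "+debug" variants are rejected.
# Note: dots and other regex metacharacters in $1 are not escaped here.
match_kernel_line() {
    grep -E "^kernel=\".*$1\"\$"
}

printf '%s\n' \
    'kernel="/boot/vmlinuz-5.14.0-139.kpq0.el9.s390x+debug"' \
    'kernel="/boot/vmlinuz-5.14.0-139.kpq0.el9.s390x"' |
    match_kernel_line 5.14.0-139.kpq0.el9.s390x
```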

Fixes: 945cbbd ("add helper functions to get kernel path by kernel release and the path of current running kernel")

Signed-off-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-11-25 17:27:15 +08:00
Coiby Xu
a3da46d6c4 Skip reset_crashkernel_after_update during package install
Currently, kexec-tools tries to reset crashkernel when using anaconda to
install the system. But grubby isn't ready and complains that,
  10:34:17,014 INF packaging: Configuring (running scriptlet for): kexec-tools-2.0.23-9.el9.x86_64 1646034766 53ff7158f8808774f4e3c3c87e504aa7a6d677b537754dac86c87925c8f0a397
  10:34:17,205 INF dnf.rpm: grep: /boot/grub2/grubenv: No such file or directory
  grep: /boot/grub2/grubenv: No such file or directory
  grep: /boot/grub2/grubenv: No such file or directory

kexec-tools is supposed to update the kernel crashkernel parameter after
package upgrade. Unfortunately, the posttrans RPM scriptlet doesn't
distinguish between package install and upgrade. This patch skips
reset_crashkernel_after_update, similar to e218128e ("Only try to
reset crashkernel for osbuild during package install").

Reported-by: Jan Stodola <jstodola@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-11-18 17:22:39 +08:00
Tao Liu
3ae8cf8876 Don't check fs modified when dump target is lvm2 thinp
When the dump target is lvm2 thinp, if we didn't mount
the dump target first, get_fs_type_from_target will get
empty output:

Before mount:
$ get_fs_type_from_target /dev/vg00/thinlv

After mount:
$ mount /dev/vg00/thinlv /mnt
$ get_fs_type_from_target /dev/vg00/thinlv
ext4

As a result, kdumpctl start will fail with:
$ kdumpctl start
kdump: Dump target is invalid
kdump: Starting kdump: [FAILED]

This patch fixes the issue by bypassing check_fs_modified
when the dump target is lvm2 thinp.

Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <prudo@redhat.com>
2022-11-11 10:29:02 +08:00
Tao Liu
10ca970940 lvm.conf should be check modified if lvm2 thinp enabled
lvm2 relies on /etc/lvm/lvm.conf to determine its behaviour. The
important configs such as thin_pool_autoextend_threshold and
thin_pool_autoextend_percent will be used during kdump in 2nd
kernel. So if the file is modified, the initramfs should be
rebuilt to include the latest version.

Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-11-01 12:20:34 +08:00
Coiby Xu
fdad7d9869 Skip reading /etc/default/grub for s390x
Currently, updating kexec-tools on s390x gives the warning
sed: can't read /etc/default/grub: No such file or directory

This happens because s390x doesn't use GRUB and /etc/default/grub
doesn't exist. We need to skip both reading and writing to
/etc/default/grub.

Reported-by: Jie Li <jieli@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-10-27 14:42:27 +08:00
Coiby Xu
6ce4b85bb3 Include the memory overhead cost of cryptsetup when estimating the memory requirement for LUKS-encrypted target
Currently, "kdumpctl estimate" neglects the memory overhead cost of
cryptsetup itself. Unfortunately, there is no golden formula to
calculate the overhead cost [1]. So estimate the overhead cost as 50M
for aarch64 and 20M for other architectures based on the following
empirical data,

| Overhead (M) | OS                                        | arch    |
| ------------ | ----------------------------------------- | ------- |
| 14.1         | RHEL-9.2.0-20220829.d.1                   | ppc64le |
| 14           | Fedora-37-20220830.n.0 Everything ppc64le | ppc64le |
| 17           | Fedora 36                                 | ppc64le |
| 8.8          | Fedora 35                                 | s390x   |
| 10.1         | Fedora-Rawhide-20220829.n.0, fc38         | s390x   |
| 42           | Fedora-Rawhide-20220829.n.0, fc38         | aarch64 |
| 40           | F35                                       | aarch64 |
| 42           | F36                                       | aarch64 |
| 42           | Fedora-Rawhide-20220901.n.0               | aarch64 |
| 10           | F35                                       | x86_64  |
| 10           | Fedora-Rawhide-20220901.n.0               | x86_64  |
| 11           | Fedora-Rawhide-20220901.n.0               | x86_64  |

[1] https://lore.kernel.org/cryptsetup/20220616044339.376qlipk5h2omhx2@Rk/T/#u
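
The per-architecture estimate amounts to a simple lookup. The function name below is hypothetical; the 50M and 20M values are the ones stated above.

```shell
# Hypothetical helper: cryptsetup overhead in MB for a given architecture,
# using the 50M/20M estimates from the commit message.
cryptsetup_overhead_mb() {
    case "$1" in
    aarch64) echo 50 ;;
    *) echo 20 ;;
    esac
}

echo "overhead on $(uname -m): $(cryptsetup_overhead_mb "$(uname -m)")M"
```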

Fixes: e9e6a2c ("kdumpctl: Add kdumpctl estimate")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-10-26 15:38:21 +08:00
Coiby Xu
50a8461fc7 Choosing the most memory-consuming key slot when estimating the memory requirement for LUKS-encrypted target
When there are multiple key slots, "kdumpctl estimate" uses the least
memory-consuming key slot. For example, when there are two key slots
created with --pbkdf-memory=1048576 (1G) and --pbkdf-memory=524288 (512M),
"kdumpctl estimate" thinks the extra memory requirement is only 512M.
This will of course lead to OOM if the user uses the more
memory-consuming key slot. Fix it by sorting in reverse order.
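
The idea of sorting in reverse order can be illustrated as below, assuming one PBKDF memory cost in KiB per input line. `max_pbkdf_memory` is a hypothetical helper, not the code in kdumpctl.

```shell
# Hypothetical helper: read one PBKDF memory cost (in KiB) per line and
# print the largest one.
max_pbkdf_memory() {
    sort -rn | head -n 1
}

printf '%s\n' 524288 1048576 | max_pbkdf_memory
```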

Fixes: e9e6a2c ("kdumpctl: Add kdumpctl estimate")
Reviewed-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-10-26 15:34:08 +08:00
Coiby Xu
15122b3f98 Fix grep warnings "grep: warning: stray \ before -"
Latest grep (3.8) warns about unneeded backslashes when building
kdump initrd [1],
    kdump: Rebuilding /boot/initramfs-6.0.0-0.rc5.a335366bad13.40.test.fc38.aarch64kdump.img
    grep: warning: stray \ before -
    grep: warning: stray \ before -
    grep: warning: stray \ before -
    grep: warning: stray \ before -
    grep: warning: stray \ before -

Some warnings can be avoided by using "sed -n" to remove grep and the
others can use the -- argument.
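
Both techniques can be shown on made-up input. The sample lines below are illustrative and are not the ones touched by the patch.

```shell
# A pattern that starts with '-' would be parsed as an option by grep;
# "--" ends option parsing, so no backslash is needed.
printf '%s\n' '--omit "plymouth"' '--add "kdumpbase"' | grep -- '--omit'

# sed -n with the p flag can select and strip in a single step.
printf '%s\n' 'path /var/crash' 'core_collector makedumpfile' |
    sed -n 's/^path[[:space:]]*//p'
```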

[1] https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2022/09/17/redhat:643020269/build_aarch64_redhat:643020269_aarch64/tests/4/results_0001/job.01/recipes/12617739/tasks/5/logs/taskout.log

Reported-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Suggested-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-10-26 14:16:04 +08:00
Coiby Xu
e218128e28 Only try to reset crashkernel for osbuild during package install
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2060319

Currently, kexec-tools tries to reset crashkernel when using anaconda to
install the system. But grubby isn't ready and complains that,
  10:33:17,631 INF packaging: Configuring (running scriptlet for): kernel-core-5.14.0-70.el9.x86_64 1645746534 03dcd32db234b72440ee6764d59b32347c5f0cd98ac3fb55beb47214a76f33b4
  10:34:16,696 INF dnf.rpm: grep: /boot/grub2/grubenv: No such file or directory
  grep: /boot/grub2/grubenv: No such file or directory

We only need to try resetting crashkernel for osbuild. Skip it for other
cases. To tell if it's package install instead of package upgrade, make
use of %pre to write a file /tmp/kexec-tools-install when "$1 == 1" [1].

[1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/#_syntax

Reported-by: Jan Stodola <jstodola@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Lichen Liu <lichenliu@redhat.com>
2022-10-20 13:54:10 +08:00
Coiby Xu
a7ead187a4 Prefix reset-crashkernel-{for-installed_kernel,after-update} with underscore
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2048690

To indicate they are for internal use only, prefix them with an underscore.

Reported-by: rcheerla@redhat.com
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Lichen Liu <lichenliu@redhat.com>
2022-10-20 13:54:10 +08:00
Tao Liu
c743881ae6 virtiofs support for kexec-tools
This patch adds virtiofs support for kexec-tools by introducing a new option
for /etc/kdump.conf:

virtiofs myfs

Where myfs is a variable tag name specified in qemu cmdline
"-device vhost-user-fs-pci,tag=myfs".

The patch covers the following cases:
1) Dumping VM's vmcore to a virtiofs shared directory;
2) When the VM's rootfs is a virtiofs shared directory and dumping the
   VM's vmcore to its subdirectory, such as /var/crash;
3) The combination of case 1 & 2: The VM's rootfs is a virtiofs shared
   directory and dumping the VM's vmcore to another virtiofs shared
   directory.

Cases 2 and 3 need dracut >= 057; otherwise the VM cannot boot from a
virtiofs shared rootfs. That is not a kexec-tools issue, though.

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
2022-09-29 12:22:49 +08:00
Lichen Liu
4edcd9a400 kdumpctl: make the kdump.log root-readable-only
Decrease the risk of leaking information that could potentially
be used to exploit the crash further (think location of keys).
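
One way to get a root-readable-only log is sketched below. This is an assumption about the approach, not the code from the patch, and create_private_log is a hypothetical helper name.

```shell
# create_private_log is a hypothetical helper, not part of kdumpctl.
# $1: path of the log file to create or tighten
create_private_log() {
    log_file=$1
    # Create the file with a restrictive umask so it is never world-readable,
    # not even briefly. The subshell keeps the umask change local.
    ( umask 077 && : >> "$log_file" )
    # Also tighten the mode in case the file already existed.
    chmod 600 "$log_file"
}
```

When run as root, the resulting file has mode 0600, so only root can read or write it.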

Signed-off-by: Lichen Liu <lichliu@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-09-06 20:21:31 +08:00
Coiby Xu
58eef4582a remove useless --zipl when calling grubby to update kernel command line
"grubby --zipl" only takes effect when setting the default kernel. It's
useless to add "--zipl" when updating the kernel command line. Also rename
_update_grub to _update_kernel_cmdline since s390x doesn't use GRUB.

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-08-03 11:09:45 +08:00
Coiby Xu
e8ae897595 skip updating /etc/default/grub for s390x
Resolves: bz2104534

When running "kdumpctl reset-crashkernel --kernel=ALL" on s390x,
sed: can't read /etc/default/grub: No such file or directory
sed: can't read /etc/default/grub: No such file or directory

This happens because s390x doesn't use the grub bootloader and
/etc/default/grub doesn't exist.

Reported-by: smitterl@redhat.com
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-08-03 11:09:37 +08:00
Coiby Xu
da0ca0d205 Allow to update kexec-tools using virt-customize for cloud base image
Resolves: bz2089871

Currently, kexec-tools can't be updated using virt-customize because
older versions of kdumpctl can't acquire the instance lock for the
get-default-crashkernel subcommand. The reason is that /var/lock is linked
to /run/lock, which doesn't exist in the case of virt-customize.

This patch fixes this problem by using /tmp/kdump.lock as the lock
file if /run/lock doesn't exist.
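
The fallback can be sketched as follows. The helper name pick_lock_file, its directory parameter, and the lock file name inside /run/lock are assumptions made for illustration; only /run/lock and /tmp/kdump.lock come from the description above.

```shell
# pick_lock_file is a hypothetical helper, not the function used by kdumpctl.
# $1: lock directory to probe (defaults to /run/lock)
pick_lock_file() {
    lock_dir=${1:-/run/lock}
    if [ -d "$lock_dir" ]; then
        # Normal system: keep the lock under the lock directory.
        # The file name "kdump" is an assumption.
        echo "$lock_dir/kdump"
    else
        # e.g. virt-customize, where /run/lock doesn't exist.
        echo /tmp/kdump.lock
    fi
}
```

The directory parameter only exists so the sketch can be exercised without touching /run/lock.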

Note
1. The lock file is now created in /run/lock instead of /var/run/lock since
   Fedora has adopted /run [1] since F15.
2. The %pre scriptlet now always returns success so that package update
   won't be blocked.

[1] https://fedoraproject.org/wiki/Features/var-run-tmpfs

Fixes: 0adb0f4 ("try to reset kernel crashkernel when kexec-tools updates the default crashkernel value")

Reported-by: Nicolas Hicher <nhicher@redhat.com>
Suggested-by: Laszlo Ersek <lersek@redhat.com>
Suggested-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-08-02 18:36:34 +08:00
Pingfan Liu
d593bfa6fc KDUMP_COMMANDLINE: remove irqpoll parameter on aws aarch64 platform
Currently, kdump may fail on some AWS aarch64 platforms.
The final symptom is:

    [   79.145089] printk: console [ttyS0] disabled
Then the system no longer responds, and after reboot no vmcore is
generated under /var/crash/. See [1] for more details.

In short, it is caused by the irqpoll policy and some unknown
acpi issue. The serial device is hot-removed as a pci device.

In more detail, the irqpoll policy demands iterating over all interrupt
handlers; if the interrupt line is shared, the handler is dispatched.
The acpi handler acpi_irq() is on a shared interrupt line, so it is
called. But for some unknown reason, the acpi hardware regs hold the
wrong state, and the acpi driver decides that a hot-remove event has
happened on a pci slot, which finally removes the pci serial device.

Tackle this issue by removing the irqpoll parameter on the aws aarch64
platform until the real root cause in acpi is found and resolved.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2080468#c0

Signed-off-by: Pingfan Liu <piliu@redhat.com>

Acked-by: Coiby Xu <coxu@redhat.com>
2022-07-21 19:03:37 +08:00