Commit Graph

1563 Commits

Author SHA1 Message Date
Tao Liu
0a5b71d123 Add lvm2 thin provision dump target checker
We need to check if a directory or a device is lvm2 thinp target.

First, we use get_block_dump_target() to convert dump path into
block device, then we check if the device is lvm2 thinp target by
cmd lvs.

is_lvm2_thinp_device is now located in kdump-lib-initramfs.sh, for it
will be used in 2nd kernel. is_lvm2_thinp_dump_target is located in
kdump-lib.sh, for it is only used in 1st kernel, and it has dependencies
which exist in kdump-lib.sh.

Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-11-01 12:20:34 +08:00
Coiby Xu
995ee24903 Release 2.0.25-2
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-10-27 16:00:11 +08:00
Coiby Xu
fdad7d9869 Skip reading /etc/defaut/grub for s390x
Currently, updating kexec-tools on s390x gives the warning
sed: can't read /etc/default/grub: No such file or directory

This happens because s390x doesn't use GRUB and /etc/default/grub
doesn't exist. We need to skip both reading and writing to
/etc/default/grub.

Reported-by: Jie Li <jieli@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-10-27 14:42:27 +08:00
Coiby Xu
6ce4b85bb3 Include the memory overhead cost of cryptsetup when estimating the memory requirement for LUKS-encrypted target
Currently, "kdumpctl estimate" neglects the memory overhead cost of
cryptsetup itself. Unfortunately, there is no golden formula to
calculate the overhead cost [1]. So estimate the overhead cost as 50M
for aarch64 and 20M for other architectures based on the following
empirical data,

| Overhead (M) | OS                                        | arch    |
| ------------ | ----------------------------------------- | ------- |
| 14.1         | RHEL-9.2.0-20220829.d.1                   | ppc64le |
| 14           | Fedora-37-20220830.n.0 Everything ppc64le | ppc64le |
| 17           | Fedora 36                                 | ppc64le |
| 8.8          | Fedora 35                                 | s390x   |
| 10.1         | Fedora-Rawhide-20220829.n.0, fc38         | s390x   |
| 42           | Fedora-Rawhide-20220829.n.0, fc38         | arch64  |
| 40           | F35                                       | arch64  |
| 42           | F36                                       | arch64  |
| 42           | Fedora-Rawhide-20220901.n.0               | arch64  |
| 10           | F35                                       | x86_64  |
| 10           | Fedora-Rawhide-20220901.n.0               | x86_64  |
| 11           | Fedora-Rawhide-20220901.n.0               | x86_64  |

[1] https://lore.kernel.org/cryptsetup/20220616044339.376qlipk5h2omhx2@Rk/T/#u

Fixes: e9e6a2c ("kdumpctl: Add kdumpctl estimate")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-10-26 15:38:21 +08:00
Coiby Xu
50a8461fc7 Choosing the most memory-consuming key slot when estimating the
memory requirement for LUKS-encrypted target

When there are multiple key slots, "kdumpctl estimate" uses the least
memory-consuming key slot. For example, when there are two memory slots
created with --pbkdf-memory=1048576 (1G) and --pbkdf-memory=524288 (512M),
"kdumpctl estimate" thinks the extra memory requirement is only 512M.
This will of course lead to OOM if the user uses the more
memory-consuming key slot. Fix it by sorting in reverse order.

Fixes: e9e6a2c ("kdumpctl: Add kdumpctl estimate")
Signed-off-by: Coiby Xu <coxu@redhat.com>

Reviewed-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-10-26 15:34:08 +08:00
Coiby Xu
15122b3f98 Fix grep warnings "grep: warning: stray \ before -"
Latest grep (3.8) warnings about unneeded backslashes when building
kdump initrd [1],
    kdump: Rebuilding /boot/initramfs-6.0.0-0.rc5.a335366bad13.40.test.fc38.aarch64kdump.img
    grep: warning: stray \ before -
    grep: warning: stray \ before -
    grep: warning: stray \ before -
    grep: warning: stray \ before -
    grep: warning: stray \ before -

Some warnings can be avoided by using "sed -n" to remove grep and the
others can use the -- argument.

[1] https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2022/09/17/redhat:643020269/build_aarch64_redhat:643020269_aarch64/tests/4/results_0001/job.01/recipes/12617739/tasks/5/logs/taskout.log

Reported-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Suggested-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-10-26 14:16:04 +08:00
Coiby Xu
e218128e28 Only try to reset crashkernel for osbuild during package install
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2060319

Currently, kexec-tools tries to reset crashkernel when using anaconda to
install the system. But grubby isn't ready and complains that,
  10:33:17,631 INF packaging: Configuring (running scriptlet for): kernel-core-5.14.0-70.el9.x86_64 1645746534 03dcd32db234b72440ee6764d59b32347c5f0cd98ac3fb55beb47214a76f33b4
  10:34:16,696 INF dnf.rpm: grep: /boot/grub2/grubenv: No such file or directory
  grep: /boot/grub2/grubenv: No such file or directory

We only need to try resetting crashkernel for osbuild. Skip it for other
cases. To tell if it's package install instead of package upgrade, make
use of %pre to write a file /tmp/kexec-tools-install when "$1 == 1" [1].

[1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/#_syntax

Reported-by: Jan Stodola <jstodola@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Lichen Liu <lichenliu@redhat.com>
2022-10-20 13:54:10 +08:00
Coiby Xu
a7ead187a4 Prefix reset-crashkernel-{for-installed_kernel,after-update} with underscore
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2048690

To indicate they are for internal use only, underscore them.

Reported-by: rcheerla@redhat.com
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Lichen Liu <lichenliu@redhat.com>
2022-10-20 13:54:10 +08:00
Tao Liu
fc1c79ffd2 Seperate dracut and dracut-squash compressor for zstd
Previously kexec-tools will pass "--compress zstd" to dracut. It
will make dracut to decide whether: a) call mksquashfs to make a
zstd format squash-root.img, b) call cmd zstd to make a initramfs.

Since dracut(>= 057) has decoupled the compressor for dracut and
dracut-squash, So in this patch, we will pass the compressor seperately.

Note:

The is_squash_available && !dracut_has_option --squash-compressor
&& !is_zsdt_command_available case is left unprocessed on purpose.

Actually, the situation when we want to call zstd compression is:
1) If squash function OK, we want dracut to invoke mksquashfs to make
a zstd format squash-root.img within initramfs.
2) If squash function is not OK, and cmd zstd presents, we want dracut
to invoke cmd zstd to make a zstd format initramfs.

is_zstd_command_available check can handle case 2 completely.

However, for the is_squash_available check, it cannot handle case 1
completely. It only checks if the kernel supports squashfs, it doesn't
check whether the squash module has been added by dracut when making
initramfs. In fact, in kexec-tools we are unable to do the check,
there are multiple ways to forbit dracut to load a module, such as
"dracut -o module" and "omit_dracutmodules in dracut.conf".

When squash dracut module is omitted, is_squash_available check will
still pass, so "--compress zstd" will be appended to dracut cmdline,
and it will call cmd zstd to do the compression. However cmd zstd may
not exist, so it fails.

The previous "--compress zstd" is ambiguous, after the intro of
"--squash-compressor", "--squash-compressor" only effect for
mksquashfs and "--compress" only effect for specific cmd.

So for the is_squash_available && !dracut_has_option
--squash-compressor && !is_zsdt_command_available case, we just leave
it to be handled the default way.

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
2022-10-20 12:26:37 +08:00
Tao Liu
bea6143178 Fix the sync issue for dump_fs
Previously the sync for dump_fs is problematic, it always
return success according to man 2 sync. So it cannot detect
the error of the dump target is full and not all of vmcore
data been written back the disk, which will leave the vmcore
imcomplete and report misleading log as "saving vmcore
complete".

In this patch, we will use "sync -f vmcore" instead, which
will return error if syncfs on the dump target fails. In
this way, vmcore sync related failures, such as autoextend
of lvm2 thinpool fails, can be detected and handled properly.

Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2022-10-08 15:46:11 +08:00
Tao Liu
c743881ae6 virtiofs support for kexec-tools
This patch add virtiofs support for kexec-tools by introducing a new option
for /etc/kdump.conf:

virtiofs myfs

Where myfs is a variable tag name specified in qemu cmdline
"-device vhost-user-fs-pci,tag=myfs".

The patch covers the following cases:
1) Dumping VM's vmcore to a virtiofs shared directory;
2) When the VM's rootfs is a virtiofs shared directory and dumping the
   VM's vmcore to its subdirectory, such as /var/crash;
3) The combination of case 1 & 2: The VM's rootfs is a virtiofs shared
   directory and dumping the VM's vmcore to another virtiofs shared
   directory.

Case 2 & 3 need dracut >= 057, otherwise VM cannot boot from virtiofs
shared rootfs. But it is not the issue of kexec-tools.

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
2022-09-29 12:22:49 +08:00
Hari Bathini
d905d49c08 fadump: avoid non-debug kernel use for fadump case
Since commit c5bdd2d8f1 ("kdump-lib: use non-debug kernels first"),
non-debug kernel is preferred, over the debug variant, as dump capture
kernel to reduce memory consumption. This works alright for kdump as
the capture kernel is loaded using kexec.

In case of fadump, regular boot loader is used to load the capture
kernel. So, the default kernel needs to be used as capture kernel as
well. But with commit c5bdd2d8f1, initrd of a different kernel is
made dump capture capable, breaking fadump's ability to capture dump
properly. Fix this by sticking with the debug variant in case of
fadump.

Fixes: c5bdd2d8f1 ("kdump-lib: use non-debug kernels first")

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Lichen Liu <lichliu@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-09-23 14:23:38 +08:00
Lichen Liu
4d52b7d548 mkdumprd: Improve error messages on non-existing NFS target directories
When kdump is configured with a NFS location, and the remote directory does
not exist, kdump.service fails with a confusing error message.

    kdumpctl[2172]: kdump: Dump path "/tmp/mkdumprd.ftWhOF/target/dumps"
    does not exist in dump target "10.111.113.2:/srv/kdump"

We just need to print the remote directory "dumps" in such case, because
"/tmp/mkdumprd.ftWhOF/target" is the local temporary mount point.

Signed-off-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Coiby Xu<coxu@redhat.com>
2022-09-21 13:30:14 +08:00
Lichen Liu
4edcd9a400 kdumpctl: make the kdump.log root-readable-only
Decrease the risk that of leaking information that could potentially
be used to exploit the crash further (think location of keys).

Signed-off-by: Lichen Liu <lichliu@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-09-06 20:21:31 +08:00
Kairui Song
677da8a59b sysconfig: use a simple generator script to maintain
These kdump.sysconfig.* files are almost identical with a bit difference
in several parameters, just use a simple script to generate them upon
packaging. This should make it easier to maintain, updating a comment or
param for a certain arch can be done in one place.

There are only some comment or empty option differences with the generated
version because some arch's sysconfig is not up-to-dated, this actually
fixes the issue, I used the following script to check these differences:

 # for arch in aarch64 i386 ppc64 ppc64le s390x x86_64; do
   ./gen-kdump-sysconfig.sh $arch > kdump.sysconfig.$arch.new
   git checkout HEAD^ kdump.sysconfig.$arch &>/dev/null
   echo "$arch:"
   diff kdump.sysconfig.$arch kdump.sysconfig.$arch.new; echo ""
   done; git reset;

Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-08-16 14:35:35 +08:00
Coiby Xu
aa84244346 Release 2.0.25-1
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-08-03 20:14:23 +08:00
Coiby Xu
2d5df7a512 tests: specify the Fedora version when running fedpkg sources
So fedpkg will fetch the sources that matches given Fedora version.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-08-03 20:14:12 +08:00
Coiby Xu
f91711ba8e tests: specify the backing format for the backing file when using qemu-img create
New version of qemu-img requires specifying the backing format for the
backing file otherwise it will abort.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-08-03 20:14:12 +08:00
Coiby Xu
d347ad591f tests: correctly mount the root and also the boot partitions for Fedora 35, 36 and rawhide Cloud Base Image
Fedora 33 and 34 Cloud Base Images have only one partition with the
following directory structure,
.
├── bin -> usr/bin
├── boot
├── dev
├── etc
├── home
├── root

By comparison, Fedora 35, 36 and 37 Cloud Base Images have multiple
partitions. The root partition which is the last partition has the
following directory,
.
├── home
└── root
    ├── bin -> usr/bin
    ├── boot
    ├── dev
    ├── etc
    ├── home
    ├── root

and the 2nd partition is the boot partition.

This patch address the above changes by mounting {LAST_PARTITION}/root as
to TEMP_ROOT and mount SECOND_PARTITION to TEMP_ROOT/boot. So the test
image can be built successfully.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-08-03 20:14:12 +08:00
Coiby Xu
4d1e02d340 remind the users to run zipl after calling grubby on s390x
s390x doesn't use GRUB. To make sure the boot entries are updated, call
zipl after running grubby.

Suggested-by: smitterl@redhat.com
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-08-03 11:09:55 +08:00
Coiby Xu
58eef4582a remove useless --zipl when calling grubby to update kernel command line
"grubby --zipl" only takes effect when setting default kernel. It's
useless to add "--zipl" when updating kernel command line. Also rename
_update_grub to _update_kernel_cmdline since s390x doesn't use GRUB.

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-08-03 11:09:45 +08:00
Coiby Xu
e8ae897595 skip updating /etc/default/grub for s390x
Resolves: bz2104534

When running "kdumpctl reset-crashkernel --kernel=ALL" on s390x,
sed: can't read /etc/default/grub: No such file or directory
sed: can't read /etc/default/grub: No such file or directory

This happens because s390x doesn't use the grub bootloader and
/etc/default/grub doesn't exist.

Reported-by: smitterl@redhat.com
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-08-03 11:09:37 +08:00
Coiby Xu
f6bcd819fc use /run/ostree-booted to tell if scriptlet is running on OSTree system
Resolves: bz2092012

According to the ostree team [1], the existence of /run/ostree-booted
> is the most stable way to signal/check that a system has been
> booted in ostree-style.  It is also used by rpm-ostree at
> compose/install time in the sandboxed environment where scriptlets run,
> in order to signal that the package is being installed/composed into
> an ostree commit (i.e. not directly on a live system).  See
> 8ddf5f40d9/src/libpriv/rpmostree-scripts.cxx (L350-L353)
> for reference.

By checking the existence of /run/ostree-booted, we could skip trying to
update kernel cmdline during OSTree compose time.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2092012#c3

Reported-by: Luca BRUNO <lucab@redhat.com>
Suggested-by: Luca BRUNO <lucab@redhat.com>
Fixes: 0adb0f4 ("try to reset kernel crashkernel when kexec-tools updates the default crashkernel value")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Acked-by: Timothée Ravier <siosm@fedoraproject.org>
2022-08-03 11:07:47 +08:00
Coiby Xu
da0ca0d205 Allow to update kexec-tools using virt-customize for cloud base image
Resolves: bz2089871

Currently, kexec-tools can't be updated using virt-customize because
older version of kdumpctl can't acquire instance lock for the
get-default-crashkernel subcommand. The reason is /var/lock is linked to
/run/lock which however doesn't exist in the case of virt-customize.

This patch fixes this problem by using /tmp/kdump.lock as the lock
file if /run/lock doesn't exist.

Note
1. The lock file is now created in /run/lock instead of /var/run/lock since
   Fedora has adopted adopted /run [2] since F15.
2. %pre scriptlet now always return success since package update won't
   be blocked

[1] https://fedoraproject.org/wiki/Features/var-run-tmpfs

Fixes: 0adb0f4 ("try to reset kernel crashkernel when kexec-tools updates the default crashkernel value")

Reported-by: Nicolas Hicher <nhicher@redhat.com>
Suggested-by: Laszlo Ersek <lersek@redhat.com>
Suggested-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-08-02 18:36:34 +08:00
Pingfan Liu
d593bfa6fc KDUMP_COMMANDLINE: remove irqpoll parameter on aws aarch64 platform
Currently, kdump may experience failure on some aws aarch64 platform.
The final scenario is:

    [   79.145089] printk: console [ttyS0] disabled
Then the system has no response any more. And after reboot, there is no
vmcore generated under /var/crash/. More detail [1].

In a short word, it is caused by the irqpoll policy and some unknown
acpi issue. The serial device is hot-removed as a pci device.

More detailed, the irqpoll policy demands to iterate over all interrupt
handler, if the interrupt line is shared, then the handler is
dispatched. And acpi handler acpi_irq() is on a shared interrupt line,
so it is called.  But for some unknown reason, the acpi hardware regs
hold wrong state, and the acpi driver decides that a hot-removed event
happens on a pci slot, which finally removes the pci serial device.

To tackle this issue by removing the irqpoll parameter on aws aarch64
platform, until the real root cause in acpi is found and resolved.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2080468#c0

Signed-off-by: Pingfan Liu <piliu@redhat.com>

Acked-by: Coiby Xu <coxu@redhat.com>
2022-07-21 19:03:37 +08:00
Coiby Xu
c735539b35 Release 2.0.24-4
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-07-21 16:42:43 +08:00
Baoquan He
1913ea9118 Checking the existence of 40-redhat.rules before modifying
Resolves: bz2106645

The code of commit 163c02970e takes effect in rhel firstly, later
pulled to Fedora. However, Fedora OS doesn't have 40-redhat.rules
in systemd-udev package. With this commit applied, a false positive
warning message can always been seen as below.

So fixing it by checking if 40-redhat.rules exists before handling.
With this change, the false warning is gone.

[root@ ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: No kdump initial ramdisk found.
kdump: Rebuilding /boot/initramfs-5.19.0-rc6+kdump.img
sed: can't read /var/tmp/dracut.NnAV2g/initramfs/usr/lib/udev/rules.d/40-redhat.rules: No such file or directory
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2022-07-21 15:02:32 +08:00
Lichen Liu
ed9cbec2ee kdump-lib: Add the CoreOS kernel dir to the boot_dirlist
The kernel of CoreOS is not in the standard locations, add
/boot/ostree/* to the boot_dirlist to find the vmlinuz.

Signed-off-by: Lichen Liu <lichliu@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-06-30 16:00:06 +08:00
Dusty Mabe
f9c32372d2 kdump-lib: attempt to fix BOOT_IMAGE detection
Currently $boot_img can get bad data if running on a platform
that doesn't set BOOT_IMAGE in the kernel command line. For
example, currently:

- s390x Fedora CoreOS machine:

```
[root@cosa-devsh ~]# sed "s/^BOOT_IMAGE=\((\S*)\)\?\(\S*\) .*/\2/" /proc/cmdline
mitigations=auto,nosmt ignition.platform.id=qemu ostree=/ostree/boot.0/fedora-coreos/2a72567ac8f7ed678c3ac89408f795e6ccd4e97b41e14af5f471b6a807e858b9/0 root=UUID=2a88436a-3b6b-4706-b33a-b8270bd87cde rw rootflags=prjquota boot=UUID=f4b2eaa5-9317-4798-85cf-308c477fee4c crashkernel=600M
```

where on a platform that uses GRUB we get:

- x86_64 Fedora CoreOS machine:

```
[root@cosa-devsh ~]# sed "s/^BOOT_IMAGE=\((\S*)\)\?\(\S*\) .*/\2/" /proc/cmdline
/ostree/fedora-coreos-af4f6cc7b9ff486cfa647680b180e989c72c8eed03a34a42e7328e49332bd20e/vmlinuz-5.18.5-200.fc36.x86_64
```

We should change the setting of the boot_img variable such that it will
be empty if BOOT_IMAGE doesn't exist.

With this change on the s390x machine:

```
[root@cosa-devsh ~]# grep -P -o '^BOOT_IMAGE=(\S+)' /proc/cmdline | sed "s/^BOOT_IMAGE=\((\S*)\)\?\(\S*\)/\2/"
[root@cosa-devsh ~]#
```

This change mattered much more before the change in c5bdd2d which changed
the following line from [[ -n $boot_img ]] to [[ "$boot_img" == *"$kdump_kernelver" ]].
Still I think this change has merit.

Signed-off-by: Dusty Mabe <dusty@dustymabe.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-06-30 16:00:06 +08:00
Dusty Mabe
a1ebf0b565 kdump-lib: change how ostree based systems are detected
The current recommendation is to check for /run/ostree-booted.

See https://bugzilla.redhat.com/show_bug.cgi?id=2092012#c0

Signed-off-by: Dusty Mabe <dusty@dustymabe.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-06-30 16:00:06 +08:00
Dusty Mabe
980f10aa40 kdump-lib: clear up references to Atomic/CoreOS
There are many variants on OSTree based systems these days so
we should probably refer to the class of systems as "OSTree
based systems". Also, Atomic Host is dead.

Signed-off-by: Dusty Mabe <dusty@dustymabe.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-06-30 16:00:06 +08:00
Pingfan Liu
b92bc6e0a7 crashkernel: optimize arm64 reserved size if PAGE_SIZE=4k
On RHEL9 and Fedora, the arm64 platform only supports 4KB page size.
the reserved memory size can be aligned to that on x86_64.

Introducing a new formula for 4KB on arm64, which bases on x86_64 plus
extra 64MB.

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
2022-06-15 09:08:03 +08:00
Lichen Liu
c5bdd2d8f1 kdump-lib: use non-debug kernels first
Kdump uses currently running kernel as default, but when currently
running kernel is a debug kernel, it will consume more memory,
which may cause out-of-memory and fail to collect vmcore.

Now we will try to use non-debug kernels first if possible.

Also extract the logic of determine KDUMP_KERNEL from
prepare_kdump_bootinfo into a function. This function will return
KDUMP_KERNEL given a kernel version.

Signed-off-by: Lichen Liu <lichliu@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-06-14 09:36:03 +08:00
Coiby Xu
6f9653b918 Release 2.0.24-3
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-23 18:26:42 +08:00
Coiby Xu
8f7ffb1a00 Update makedumpfile to 1.7.1
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-23 18:26:42 +08:00
Dusty Mabe
218d9917c0
kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE
For CoreOS based systems we use Ignition for provisioning machines
in the initramfs on first boot. We trigger Ignition right now by
the presence of `ignition.firstboot` in the kernel command line. The
kernel argument is only present on first boot so after a reboot it
no longer is in the kernel command line.

If a kernel crash happens before the first reboot of a machine we
want the `ignition.firstboot` kernel argument to be removed and not
passed on to the crash kernel.
2022-05-16 14:04:12 -04:00
Coiby Xu
1101190e1c unit tests: add tests for get_system_size and get_recommend_size
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-05-13 11:03:22 +08:00
Coiby Xu
4f702c81e9 improve get_recommend_size
This patch rewrites get_recommend_size to get rid of the following
limitations,
1. only supports ranges in crashkernel sorted in increasing order
2. the first entry of crashkernel should have only a single digit and
   it's in gigabytes

Suggested-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-05-13 11:03:22 +08:00
Coiby Xu
5c23b6ebb7 fix a calculation error in get_system_size
Recently, it's found 'kdumpctl estimate' returns 512M while the system
reserves 1024M kdump memory in a case. This happens because the ranges
in /proc/iomem are inclusively. For example, "0-1: System RAM" means 2
bytes of system memory other than 1 byte. Fix this error by adding one
more byte.

Note
1. the function has been simplified as well.
2. define PROC_IOMEM as /proc/iomem for the sake of unit tests

Reported-by: Ruowen Qin <ruqin@redhat.com>
Fixes: 1813189 ("kdump-lib.sh: introduce functions to return recommened mem size")
Suggested-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-05-13 11:03:22 +08:00
Kairui Song
3d70f8b049 logger: save log after all kdump progress finished
Make log saving the last step of kdump.sh, so it can catch more info,
for example, the output of post.d hooks will be covered by the log now.

Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-04-29 16:22:41 +08:00
Coiby Xu
1facd0c118 fix incorrect date format in changelog
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-04-24 11:38:48 +08:00
Coiby Xu
d5b01d7ef0 Release 2.0.24-2
A issue of bogus date in %changelog is fixed as well.

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-04-24 10:38:04 +08:00
Coiby Xu
be20580b06 remove the upper bound of default crashkernel value example
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-04-24 09:35:05 +08:00
Coiby Xu
695e5b8676 update fadump-howto
1. yum is deprecated so use dnf instead
2. use the "kdumpctl reset-crashkernel --fadump=on" API

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-04-24 09:35:05 +08:00
Coiby Xu
1e7df3e1f3 update kexec-kdump-howto
1. yum is deprecated so use dnf instead
2. use the "kdumpctl reset-crashkernel" API
3. ask the users to refer to crashkernel-howto.txt for setting custom
   crashkernel value
4. fix a typo

Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-04-24 09:35:05 +08:00
Coiby Xu
683ff87821 update crashkernel-howto
1. clean up left crashkernel.default
2. fix a few typos and grammar mistakes
3. ask the users to refer to `man kdumpctl` for reset-crashkernel

Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-04-24 09:35:05 +08:00
Coiby Xu
a1c63fa644 add man documentation for kdumpctl get-default-crashkernel
A few typos and grammar issues are fixed as well.

Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-04-24 09:35:05 +08:00
Coiby Xu
3d4cb38d96 unit tests: add check_config with with the default kdump.conf
This test prevents the mistake of adding an option to kdump.conf
without changing check_config as is the case with commit 73ced7f
("introduce the auto_reset_crashkernel option to kdump.conf").

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-04-14 11:45:33 +08:00
Coiby Xu
e28a1399a3 unit tests: add tests for kdump_get_conf_val in kdump-lib-initramfs.sh
kdump_get_conf_val allows to retrieves config value defined in
kdump.conf and it also supports sed regex like
"ext[234]\|xfs\|btrfs\|minix\|raw\|nfs\|ssh".

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-04-14 11:45:18 +08:00
Coiby Xu
e00b45d75f unit tests: add tests for "kdumpctl reset-crashkernel"
This commit adds a relatively thorough test suite for
kdumpctl reset-crashkernel [--fadump=[on|off|nocma]]  [--kernel=path_to_kernel] [--reboot]
as implemented in commit 140da74 ("rewrite reset_crashkernel to support
fadump and to used by RPM scriptlet").

grubby have a few options to support its own testing,
 - --no-etc-grub-update, not update /etc/default/grub
 - --bad-image-okay, don't check the validity of the image
 - --env, specify custom grub2 environment block file to avoid modifying
   the default /boot/grub2/grubenv
 - --bls-directory, specify custom BootLoaderSpec config files to avoid
   modifying the default /boot/loader/entries

So the grubby called by kdumpctl is mocked as
@grubby --grub2 --no-etc-grub-update --bad-image-okay --env=$SPEC_TEST_DIR/env_temp -b $SPEC_TEST_DIR/boot_load_entries "$@"
in the tests. To be able to call the actual grubby in the mock function [1],
ShellSpec provides the following command
$ shellspec --gen-bin @grubby
to generate spec/support/bins/@grubby which is used to call the actual grubby.

kdumpctl has implemented its own version of updating /etc/default/grub
in _update_kernel_cmdline_in_grub_etc_default. To avoiding writing to
/etc/default/grub, this function is mocked as outputting its name and
received arguments similar to python unitest's assert_called_with.

[1]  https://github.com/shellspec/shellspec#execute-the-actual-command-within-a-mock-function

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-04-14 11:45:08 +08:00