Commit Graph

372 Commits

Author SHA1 Message Date
Coiby Xu
81b414d100 Reduce kdump memory consumption by only installing needed NIC drivers
Resolves: bz2076416
Upstream: Fedora
Conflict: None

commit a65dde2d10
Author: Coiby Xu <coxu@redhat.com>
Date:   Thu May 19 11:39:25 2022 +0800

    Reduce kdump memory consumption by only installing needed NIC drivers

    Even after having asked NM to stop managing an unneeded NIC, a NIC driver
    may still waste memory. For example, mlx5_core uses a substantial amount
    of memory during driver initialization:

    ======== Report format module_summary: ========
    Module mlx5_core using 350.2MB (89650 pages), peak allocation 367.4MB (94056 pages)
    Module squashfs using 13.1MB (3360 pages), peak allocation 13.1MB (3360 pages)
    Module overlay using 2.1MB (550 pages), peak allocation 2.2MB (555 pages)
    Module dns_resolver using 0.9MB (219 pages), peak allocation 5.2MB (1338 pages)
    Module mlxfw using 0.7MB (172 pages), peak allocation 5.3MB (1349 pages)
    ======== Report format module_summary END ========

    ======== Report format module_top: ========
    Top stack usage of module mlx5_core:
      (null) Pages: 89650 (peak: 94056)
        ret_from_fork (0xffffda088b4165f8) Pages: 60007 (peak: 60007)
          kthread (0xffffda088b4bd7e4) Pages: 60007 (peak: 60007)
            worker_thread (0xffffda088b4b48d0) Pages: 60007 (peak: 60007)
              process_one_work (0xffffda088b4b3f40) Pages: 60007 (peak: 60007)
                work_for_cpu_fn (0xffffda088b4aef00) Pages: 53906 (peak: 53906)
                  local_pci_probe (0xffffda088b9e1e44) Pages: 53906 (peak: 53906)
                    probe_one mlx5_core (0xffffda084f899cc8) Pages: 53518 (peak: 53518)
                      mlx5_init_one mlx5_core (0xffffda084f8994ac) Pages: 49756 (peak: 49756)
                        mlx5_function_setup.constprop.0 mlx5_core (0xffffda084f899100) Pages: 44434 (peak: 44434)
                          mlx5_satisfy_startup_pages mlx5_core (0xffffda084f8a4f24) Pages: 44434 (peak: 44434)
                        mlx5_function_setup.constprop.0 mlx5_core (0xffffda084f899078) Pages: 5285 (peak: 5285)
                          mlx5_cmd_init mlx5_core (0xffffda084f89e414) Pages: 4818 (peak: 4818)
                            mlx5_alloc_cmd_msg mlx5_core (0xffffda084f89aaa0) Pages: 4403 (peak: 4403)

    This memory consumption is completely unnecessary when kdump doesn't need
    this NIC. Only install needed NIC drivers to prevent this kind of waste.

    Note
    1. this patch depends on [1] to ask dracut to not install NIC drivers.
    2. "ethtool -i" somehow fails to get the vlan driver
    3. team.ko doesn't depend on the team mode drivers so we need to install
       the team mode drivers manually.

    [1] https://github.com/dracutdevs/dracut/pull/1789

    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Thomas Haller <thaller@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-11-23 09:42:18 +08:00
Coiby Xu
95a39f602b Reduce kdump memory consumption by not letting NetworkManager manage unneeded network interfaces
Resolves: bz2076416
Upstream: Fedora
Conflict: None

commit 586fe410aa
Author: Coiby Xu <coxu@redhat.com>
Date:   Thu Sep 9 11:50:00 2021 +0800

    Reduce kdump memory consumption by not letting NetworkManager manage unneeded network interfaces

    By default, NetworkManager will manage all the network interfaces and
    try to set interface IFF_UP to get carrier state. Regardless of whether
    the network interface is connected to a cable or not, the NIC driver
    will allocate memory resources for e.g. ring buffers when setting IFF_UP.
    This could be a waste of memory. For example, i40e was found to consume
    ~15GB on a power machine. On this machine, i40e manages four interfaces
    but only one interface is valid. This patch uses "managed=false" to tell
    NetworkManager not to manage network interfaces that are not needed by
    kdump by putting 10-kdump-netif_allowlist.conf in the initramfs.

    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Thomas Haller <thaller@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-11-23 09:42:18 +08:00
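
The "managed=false" approach described in commit 95a39f602b above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function name print_netif_allowlist_conf and the section name [device-others] are assumptions made for the example. It relies on NetworkManager's "except:" device match syntax, so every interface that is not listed becomes unmanaged.

```bash
# Hypothetical helper (not taken from kexec-tools): print a NetworkManager
# configuration snippet that leaves only the listed interfaces managed.
# Usage: print_netif_allowlist_conf eth0 bond0
print_netif_allowlist_conf() {
    # At least one needed interface is required.
    [ "$#" -gt 0 ] || return 1
    _spec=""
    for _netif in "$@"; do
        # Join the negative matches with ";" without a trailing separator.
        _spec="${_spec:+$_spec;}except:interface-name:${_netif}"
    done
    # A device section whose match-device holds only negative matches
    # applies to every other device.
    printf '[device-others]\nmatch-device=%s\nmanaged=false\n' "$_spec"
}
```

The output would then be written to a file such as 10-kdump-netif_allowlist.conf inside the initramfs; where exactly that file is placed is not covered by this sketch.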
Coiby Xu
420f55c096 Set up kdump network by directly copying NM connection profile to initrd
Resolves: bz2076416
Upstream: Fedora
Conflict: None

commit 63c3805c48
Author: Coiby Xu <coxu@redhat.com>
Date:   Fri Sep 17 13:02:07 2021 +0800

    Set up kdump network by directly copying NM connection profile to initrd

    This patch sets up the kdump network by directly copying NM connection
    profile(s) for different network setups including bond, bridge, vlan, and
    team. For vlan networks, rename phydev to parent_netif to improve code
    readability.

    With the new approach, the related code that builds dracut cmdline
    parameters such as rd.route, ip, etc. can be cleaned up. There is also no
    need to set up DNS when copying .nmconnection directly to the initrd.
    Note the bootdev dracut command line parameter is only used by dracut's
    35network-legacy and network-manager doesn't use it, so remove the
    related code as well.

    Note
    1. kdump_setup_vlan/bond/... are no longer called in subshells in order
       to modify global variables like unique_netifs
    2. The original kdump_install_net is renamed to better reflect its
       current function

    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Thomas Haller <thaller@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-11-23 09:42:18 +08:00
Coiby Xu
1141e03fa1 Stop dracut 35network-manager from running nm-initrd-generator
Resolves: bz2076416
Upstream: Fedora
Conflict: None

commit 62355ebe5a
Author: Coiby Xu <coxu@redhat.com>
Date:   Fri Sep 23 22:16:49 2022 +0800

    Stop dracut 35network-manager from running nm-initrd-generator

    kexec-tools depends on dracut's 35network-manager module which will
    call nm-initrd-generator. We don't want nm-initrd-generator to generate
    connection profiles since we will copy them from 1st kernel to
    kdump kernel initramfs. NetworkManager >= 1.35.2 won't generate connection
    profiles if there's a connection dir with rd.neednet. For Fedora/RHEL,
    this connection dir is /etc/NetworkManager/system-connections. For the
    details, please refer to the NetworkManager commit 79885656d3
    ("initrd: don't add a connection if there's a connection dir with
    rd.neednet") [1]. Before the release of NetworkManager >= 1.35.2, we
    need to mask /usr/libexec/nm-initrd-generator.

    [1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1010

    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Thomas Haller <thaller@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-11-23 09:42:18 +08:00
Coiby Xu
214e9d0bef Apply the timeout configuration of nm-initrd-generator
Resolves: bz2076416
Upstream: Fedora
Conflict: None

commit 6b586a9036
Author: Coiby Xu <coxu@redhat.com>
Date:   Thu Sep 22 22:08:43 2022 +0800

    Apply the timeout configuration of nm-initrd-generator

    nm-wait-online-initrd.service installed by dracut's 35network-manager
    module calls nm-online with "-s" which means it returns immediately when
    NetworkManager logs "startup complete" after certain timeouts are
    reached. "startup complete" doesn't necessarily mean network connectivity
    has been established. nm-initrd-generator has a set of timeouts that, in
    most cases when applied, make "startup complete" mean network
    connectivity has been established. So apply them when setting up the
    kdump network.

    Suggested-by: Thomas Haller <thaller@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Thomas Haller <thaller@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-11-23 09:42:18 +08:00
Coiby Xu
668875e186 Determine whether IPv4 or IPv6 is needed
Resolves: bz2076416
Upstream: Fedora
Conflict: None

commit 9dfcacf72d
Author: Coiby Xu <coxu@redhat.com>
Date:   Thu Sep 8 17:06:19 2022 +0800

    Determine whether IPv4 or IPv6 is needed

    According to `man nm-online`,
      "By default, connections have the ipv4.may-fail and
      ipv6.may-fail properties set to yes; this means that
      NetworkManager waits for one of the two address families to
      complete configuration before considering the connection
      activated. If you need a specific address family configured
      before network-online.target is reached, set the corresponding
      may-fail property to no."

    If a NIC has an IPv4 or IPv6 address, set the corresponding may-fail
    property to no. Otherwise, dumping vmcore over IPv6 could fail because
    only IPv4 network is ready or vice versa.

    Also disable IPv6 if only IPv4 is used and vice versa.

    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Thomas Haller <thaller@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-11-23 09:42:18 +08:00
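
The address-family rule from commit 668875e186 above can be illustrated with a small sketch. The function name ip_family_settings and its yes/no arguments are hypothetical; the printed property names (ipv4.may-fail, ipv6.may-fail, ipv4.method, ipv6.method) are regular NetworkManager connection properties, but how the patch actually applies them is not shown here.

```bash
# Hypothetical helper (not taken from kexec-tools).
# $1: "yes" if the NIC has an IPv4 address, otherwise "no"
# $2: "yes" if the NIC has an IPv6 address, otherwise "no"
# Prints "property value" pairs, one per line, suitable for passing to
# something like "nmcli connection modify <profile> <property> <value>".
ip_family_settings() {
    _has_v4=$1
    _has_v6=$2
    if [ "$_has_v4" = yes ]; then
        # IPv4 is needed, so activation must wait for it.
        echo "ipv4.may-fail no"
    elif [ "$_has_v6" = yes ]; then
        # Only IPv6 is used, so IPv4 can be disabled.
        echo "ipv4.method disabled"
    fi
    if [ "$_has_v6" = yes ]; then
        echo "ipv6.may-fail no"
    elif [ "$_has_v4" = yes ]; then
        echo "ipv6.method disabled"
    fi
}
```

For an IPv4-only NIC, "ip_family_settings yes no" prints "ipv4.may-fail no" followed by "ipv6.method disabled".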
Coiby Xu
94f8eed573 Add functions to copy NetworkManage connection profiles to the initramfs
Resolves: bz2076416
Upstream: Fedora
Conflict: None

commit d25b1ee31c
Author: Coiby Xu <coxu@redhat.com>
Date:   Thu Sep 9 11:35:52 2021 +0800

    Add functions to copy NetworkManage connection profiles to the initramfs

    Each network interface is managed by a NM connection. Given a list of
    network interface names, copy the NetworkManager (NM) connection
    profiles, i.e. .nmconnection files, to the kdump initramfs.

    Before copying a connection file, clone it to automatically convert a
    legacy ifcfg-*[1] file to a .nmconnection file and for the convenience of
    editing the connection profile.

    [1] https://fedoraproject.org/wiki/Changes/NetworkManager_keyfile_instead_of_ifcfg_rh

    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Thomas Haller <thaller@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-11-23 09:42:18 +08:00
Coiby Xu
ec669a8e8b Fix error for vlan over team network interface
Related: bz2076416
Upstream: Fedora
Conflict: None

commit b7e58619d1
Author: Coiby Xu <coxu@redhat.com>
Date:   Mon Sep 13 22:13:44 2021 +0800

    Fix error for vlan over team network interface

    6f9235887f ("module-setup.sh: enable
    vlan on team interface") skips establishing teaming network by mistake.
    Although it could use one of slave netifs to establish connection
    to transfer vmcore to remote fs, it breaks the implicit assumption of
    creating an identical network topology to the 1st kernel.

    Fixes: 6f92358 ("module-setup.sh: enable vlan on team interface")
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Thomas Haller <thaller@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-11-23 09:42:18 +08:00
Tao Liu
812f2c967f Release 2.0.25-5
Resolves: bz2083475

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-11-11 13:25:21 +08:00
Tao Liu
fb9545bb2a Don't check fs modified when dump target is lvm2 thinp
upstream: fedora
resolves: bz2083475
conflict: none

commit 3ae8cf8876
Author: Tao Liu <ltao@redhat.com>
Date:   Thu Nov 10 10:25:58 2022 +0800

    Don't check fs modified when dump target is lvm2 thinp

    When the dump target is lvm2 thinp, if we didn't mount
    the dump target first, get_fs_type_from_target will get
    empty output:

    Before mount:
    $ get_fs_type_from_target /dev/vg00/thinlv

    After mount:
    $ mount /dev/vg00/thinlv /mnt
    $ get_fs_type_from_target /dev/vg00/thinlv
    ext4

    As a result, kdumpctl start will fail with:
    $ kdumpctl start
    kdump: Dump target is invalid
    kdump: Starting kdump: [FAILED]

    This patch fixes the issue by bypassing check_fs_modified
    when the dump target is lvm2 thinp.

    Signed-off-by: Tao Liu <ltao@redhat.com>
    Reviewed-by: Coiby Xu <prudo@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-11-11 11:53:13 +08:00
Tao Liu
10e0a513a4 Add dependency of dracut lvmthinpool-monitor module
upstream: fedora
resolves: bz2083475
conflict: Yes, use "grep -q <<< $(cmd)" instead of
          "cmd | grep -q", because the latter will
          fail with strange reason.

commit f11721077a
Author: Tao Liu <ltao@redhat.com>
Date:   Sat Oct 8 15:41:41 2022 +0800

    Add dependency of dracut lvmthinpool-monitor module

    The 80lvmthinpool-monitor module is needed to monitor and
    autoextend the size of the thin pool in the 2nd kernel. The module was
    integrated in dracut version 057.

    If the lvmthinpool-monitor module is not found, we only print a warning,
    because we don't want to block the kdump process when the thin pool
    capacity is sufficient and no monitor-and-autoextend is actually needed.

    Signed-off-by: Tao Liu <ltao@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-11-11 11:49:32 +08:00
Tao Liu
b57b206d62 lvm.conf should be check modified if lvm2 thinp enabled
resolves: bz2083475
upstream: fedora
conflict: none

commit 10ca970940
Author: Tao Liu <ltao@redhat.com>
Date:   Sat Oct 8 15:41:40 2022 +0800

    lvm.conf should be check modified if lvm2 thinp enabled

    lvm2 relies on /etc/lvm/lvm.conf to determine its behaviour. The
    important configs such as thin_pool_autoextend_threshold and
    thin_pool_autoextend_percent will be used during kdump in the 2nd
    kernel. So if the file is modified, the initramfs should be
    rebuilt to include the latest version.

    Signed-off-by: Tao Liu <ltao@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-11-09 15:57:28 +08:00
Tao Liu
94988e9e3d Add lvm2 thin provision dump target checker
resolves: bz2083475
upstream: fedora
conflict: none

commit 0a5b71d123
Author: Tao Liu <ltao@redhat.com>
Date:   Sat Oct 8 15:41:39 2022 +0800

    Add lvm2 thin provision dump target checker

    We need to check if a directory or a device is lvm2 thinp target.

    First, we use get_block_dump_target() to convert dump path into
    block device, then we check if the device is lvm2 thinp target by
    cmd lvs.

    is_lvm2_thinp_device is now located in kdump-lib-initramfs.sh, for it
    will be used in 2nd kernel. is_lvm2_thinp_dump_target is located in
    kdump-lib.sh, for it is only used in 1st kernel, and it has dependencies
    which exist in kdump-lib.sh.

    Signed-off-by: Tao Liu <ltao@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-11-09 15:56:18 +08:00
Tao Liu
4776d9f8fa Fix the sync issue for dump_fs
related: bz2083475
upstream: fedora
conflict: none

commit bea6143178
Author: Tao Liu <ltao@redhat.com>
Date:   Sat Oct 8 14:53:21 2022 +0800

    Fix the sync issue for dump_fs

    Previously the sync for dump_fs was problematic: it always
    returned success, according to man 2 sync. So it could not detect
    the error when the dump target is full and not all of the vmcore
    data has been written back to the disk, which leaves the vmcore
    incomplete and reports a misleading log message, "saving vmcore
    complete".

    In this patch, we use "sync -f vmcore" instead, which
    returns an error if syncfs on the dump target fails. In
    this way, vmcore sync related failures, such as a failed
    autoextend of the lvm2 thin pool, can be detected and handled properly.

    Signed-off-by: Tao Liu <ltao@redhat.com>
    Reviewed-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-11-09 15:54:14 +08:00
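
The difference described in commit 4776d9f8fa above can be sketched as follows. The function name sync_vmcore is hypothetical and the error handling is an assumption; only the use of "sync -f" on the vmcore file comes from the commit message.

```bash
# Hypothetical helper (not taken from kexec-tools).
# $1: path to the saved vmcore file
# "sync -f FILE" syncs the file system that contains FILE and exits with a
# non-zero status if that fails, whereas a bare "sync" has no failure to
# report because sync(2) always succeeds.
sync_vmcore() {
    if ! sync -f "$1"; then
        echo "Failed to sync $1" >&2
        return 1
    fi
}
```

A caller could then treat a non-zero return value as a failed dump instead of logging that the vmcore was saved.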
Tao Liu
cd64a1a851 Release 2.0.25-4
Resolves: bz2120914
Resolves: bz2076206
Resolves: bz2133129
Resolves: bz2060319
Resolves: bz2048690

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-11-01 11:10:13 +08:00
Tao Liu
557230d1e8 Rebase makedumpfile to 1.7.2
Resolves: bz2120914

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-11-01 11:04:23 +08:00
Coiby Xu
5309c08efa Include the memory overhead cost of cryptsetup when estimating the memory requirement for LUKS-encrypted target
Resolves: bz2076206
Upstream: Fedora
Conflict: None

commit 6ce4b85bb3
Author: Coiby Xu <coxu@redhat.com>
Date:   Mon Sep 5 18:08:44 2022 +0800

    Include the memory overhead cost of cryptsetup when estimating the memory requirement for LUKS-encrypted target

    Currently, "kdumpctl estimate" neglects the memory overhead cost of
    cryptsetup itself. Unfortunately, there is no golden formula to
    calculate the overhead cost [1]. So estimate the overhead cost as 50M
    for aarch64 and 20M for other architectures based on the following
    empirical data,

    | Overhead (M) | OS                                        | arch    |
    | ------------ | ----------------------------------------- | ------- |
    | 14.1         | RHEL-9.2.0-20220829.d.1                   | ppc64le |
    | 14           | Fedora-37-20220830.n.0 Everything ppc64le | ppc64le |
    | 17           | Fedora 36                                 | ppc64le |
    | 8.8          | Fedora 35                                 | s390x   |
    | 10.1         | Fedora-Rawhide-20220829.n.0, fc38         | s390x   |
    | 42           | Fedora-Rawhide-20220829.n.0, fc38         | aarch64 |
    | 40           | F35                                       | aarch64 |
    | 42           | F36                                       | aarch64 |
    | 42           | Fedora-Rawhide-20220901.n.0               | aarch64 |
    | 10           | F35                                       | x86_64  |
    | 10           | Fedora-Rawhide-20220901.n.0               | x86_64  |
    | 11           | Fedora-Rawhide-20220901.n.0               | x86_64  |

    [1] https://lore.kernel.org/cryptsetup/20220616044339.376qlipk5h2omhx2@Rk/T/#u

    Fixes: e9e6a2c ("kdumpctl: Add kdumpctl estimate")
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-11-01 02:38:40 +00:00
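
The empirical rule stated in commit 5309c08efa above (50M for aarch64, 20M for other architectures) amounts to a simple lookup. The sketch below is only an illustration; the function name cryptsetup_overhead_mb is hypothetical.

```bash
# Hypothetical helper (not taken from kexec-tools).
# $1: architecture name as printed by "uname -m"
# Prints the estimated cryptsetup memory overhead in megabytes.
cryptsetup_overhead_mb() {
    case "$1" in
        aarch64) echo 50 ;;
        *) echo 20 ;;
    esac
}
```

It could be called as cryptsetup_overhead_mb "$(uname -m)" and the result added to the rest of the estimate.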
Coiby Xu
4757b08830 Choosing the most memory-consuming key slot when estimating the memory requirement for LUKS-encrypted target
Related: bz2076206
Upstream: Fedora
Conflict: None

commit 50a8461fc7
Author: Coiby Xu <coxu@redhat.com>
Date:   Mon Sep 5 17:49:18 2022 +0800

    Choosing the most memory-consuming key slot when estimating the
    memory requirement for LUKS-encrypted target

    When there are multiple key slots, "kdumpctl estimate" uses the least
    memory-consuming key slot. For example, when there are two key slots
    created with --pbkdf-memory=1048576 (1G) and --pbkdf-memory=524288 (512M),
    "kdumpctl estimate" thinks the extra memory requirement is only 512M.
    This will of course lead to OOM if the user uses the more
    memory-consuming key slot. Fix it by sorting in reverse order.

    Fixes: e9e6a2c ("kdumpctl: Add kdumpctl estimate")
    Signed-off-by: Coiby Xu <coxu@redhat.com>

    Reviewed-by: Lichen Liu <lichliu@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-11-01 02:38:40 +00:00
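
The "sort in reverse order" fix from commit 4757b08830 above can be sketched like this. The function name max_pbkdf_memory_kb is hypothetical, and the input format (lines of the form "Memory: <kilobytes>", as printed per key slot by "cryptsetup luksDump" for LUKS2) is an assumption of the example.

```bash
# Hypothetical helper (not taken from kexec-tools).
# Reads key slot information on stdin and prints the largest "Memory:"
# value, i.e. the requirement of the most memory-consuming key slot.
max_pbkdf_memory_kb() {
    # Extract the numeric field, sort numerically in descending order and
    # keep the first (largest) value.
    awk '/Memory:/ { print $2 }' | sort -n -r | head -n 1
}
```

With the two key slots from the commit message (1048576 and 524288), the helper prints 1048576, so the estimate covers whichever key slot the user ends up unlocking.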
Coiby Xu
7266bb9a7e Skip reading /etc/defaut/grub for s390x
Resolves: bz2133129
Upstream: Fedora
Conflict: None

commit fdad7d9869
Author: Coiby Xu <coxu@redhat.com>
Date:   Thu Sep 29 12:35:00 2022 +0800

    Skip reading /etc/defaut/grub for s390x

    Currently, updating kexec-tools on s390x gives the warning
    sed: can't read /etc/default/grub: No such file or directory

    This happens because s390x doesn't use GRUB and /etc/default/grub
    doesn't exist. We need to skip both reading and writing to
    /etc/default/grub.

    Reported-by: Jie Li <jieli@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-10-27 14:47:57 +08:00
Coiby Xu
a9968490a2 Only try to reset crashkernel for osbuild during package install
Resolves: bz2060319
Upstream: Fedora
Conflict: None

commit e218128e28
Author: Coiby Xu <coxu@redhat.com>
Date:   Thu Sep 8 14:30:02 2022 +0800

    Only try to reset crashkernel for osbuild during package install

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2060319

    Currently, kexec-tools tries to reset crashkernel when using anaconda to
    install the system. But grubby isn't ready and complains that,
      10:33:17,631 INF packaging: Configuring (running scriptlet for): kernel-core-5.14.0-70.el9.x86_64 1645746534 03dcd32db234b72440ee6764d59b32347c5f0cd98ac3fb55beb47214a76f33b4
      10:34:16,696 INF dnf.rpm: grep: /boot/grub2/grubenv: No such file or directory
      grep: /boot/grub2/grubenv: No such file or directory

    We only need to try resetting crashkernel for osbuild. Skip it for other
    cases. To tell if it's package install instead of package upgrade, make
    use of %pre to write a file /tmp/kexec-tools-install when "$1 == 1" [1].

    [1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/#_syntax

    Reported-by: Jan Stodola <jstodola@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Lichen Liu <lichenliu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-10-27 14:47:57 +08:00
Coiby Xu
b1b95d234b Prefix reset-crashkernel-{for-installed_kernel,after-update} with underscore
Resolves: bz2048690
Upstream: Fedora
Conflict: None

commit a7ead187a4
Author: Coiby Xu <coxu@redhat.com>
Date:   Thu Sep 8 14:08:42 2022 +0800

    Prefix reset-crashkernel-{for-installed_kernel,after-update} with underscore

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2048690

    To indicate they are for internal use only, underscore them.

    Reported-by: rcheerla@redhat.com
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Lichen Liu <lichenliu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-10-27 14:47:57 +08:00
Coiby Xu
a091409f10 use /run/ostree-booted to tell if scriptlet is running on OSTree system
Related: bz2048690
Upstream: Fedora
Conflict: None

commit f6bcd819fc
Author: Coiby Xu <coxu@redhat.com>
Date:   Fri Jul 15 15:11:44 2022 +0800

    use /run/ostree-booted to tell if scriptlet is running on OSTree system

    Resolves: bz2092012

    According to the ostree team [1], the existence of /run/ostree-booted
    > is the most stable way to signal/check that a system has been
    > booted in ostree-style.  It is also used by rpm-ostree at
    > compose/install time in the sandboxed environment where scriptlets run,
    > in order to signal that the package is being installed/composed into
    > an ostree commit (i.e. not directly on a live system).  See
    > 8ddf5f40d9/src/libpriv/rpmostree-scripts.cxx (L350-L353)
    > for reference.

    By checking the existence of /run/ostree-booted, we could skip trying to
    update kernel cmdline during OSTree compose time.

    [1] https://bugzilla.redhat.com/show_bug.cgi?id=2092012#c3

    Reported-by: Luca BRUNO <lucab@redhat.com>
    Suggested-by: Luca BRUNO <lucab@redhat.com>
    Fixes: 0adb0f4 ("try to reset kernel crashkernel when kexec-tools updates the default crashkernel value")
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Acked-by: Timothée Ravier <siosm@fedoraproject.org>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-10-27 14:47:57 +08:00
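
The check described in commit a091409f10 above boils down to testing for a marker file. The following is a minimal sketch; the function name is_ostree_environment and the optional path argument (added only so the check can be exercised without a real OSTree system) are hypothetical.

```bash
# Hypothetical helper (not taken from kexec-tools).
# $1 (optional): marker path, defaults to /run/ostree-booted
# Succeeds if the marker exists, i.e. the scriptlet runs on an OSTree-style
# system or inside an OSTree compose.
is_ostree_environment() {
    [ -e "${1:-/run/ostree-booted}" ]
}
```

A scriptlet could then skip updating the kernel command line with something like "is_ostree_environment && exit 0".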
Tao Liu
1d4de1f185 Release 2.0.25-3
Resolves: bz2085347
Resolves: bz2045949
Resolves: bz2044804

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-10-26 15:47:38 +08:00
Tao Liu
dcaec956e8 virtiofs support for kexec-tools
upstream: fedora
resolves: bz2085347
conflict: yes, small conflict due to patch
          "kdumpctl: drop DUMP_TARGET variable" not
          backported to rhel9.

commit c743881ae6
Author: Tao Liu <ltao@redhat.com>
Date:   Fri Sep 23 18:13:11 2022 +0800

    virtiofs support for kexec-tools

    This patch adds virtiofs support for kexec-tools by introducing a new option
    for /etc/kdump.conf:

    virtiofs myfs

    Where myfs is a variable tag name specified in qemu cmdline
    "-device vhost-user-fs-pci,tag=myfs".

    The patch covers the following cases:
    1) Dumping VM's vmcore to a virtiofs shared directory;
    2) When the VM's rootfs is a virtiofs shared directory and dumping the
       VM's vmcore to its subdirectory, such as /var/crash;
    3) The combination of case 1 & 2: The VM's rootfs is a virtiofs shared
       directory and dumping the VM's vmcore to another virtiofs shared
       directory.

    Cases 2 & 3 need dracut >= 057, otherwise the VM cannot boot from a
    virtiofs shared rootfs. But that is not an issue of kexec-tools.

    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Tao Liu <ltao@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-10-26 10:24:57 +08:00
Tao Liu
b5a9e54629 Seperate dracut and dracut-squash compressor for zstd
Upstream: fedora
Resolves: bz2045949
Resolves: bz2044804
Conflict: none

commit fc1c79ffd2
Author: Tao Liu <ltao@redhat.com>
Date:   Sat Oct 8 12:09:08 2022 +0800

    Seperate dracut and dracut-squash compressor for zstd

    Previously kexec-tools passed "--compress zstd" to dracut. It
    made dracut decide whether to: a) call mksquashfs to make a
    zstd format squash-root.img, or b) call cmd zstd to make an initramfs.

    Since dracut (>= 057) has decoupled the compressor for dracut and
    dracut-squash, in this patch we pass the compressors separately.

    Note:

    The is_squash_available && !dracut_has_option --squash-compressor
    && !is_zstd_command_available case is left unprocessed on purpose.

    Actually, the situations in which we want zstd compression are:
    1) If the squash function is OK, we want dracut to invoke mksquashfs to
    make a zstd format squash-root.img within the initramfs.
    2) If the squash function is not OK and cmd zstd is present, we want
    dracut to invoke cmd zstd to make a zstd format initramfs.

    is_zstd_command_available check can handle case 2 completely.

    However, the is_squash_available check cannot handle case 1
    completely. It only checks whether the kernel supports squashfs; it doesn't
    check whether the squash module has been added by dracut when making the
    initramfs. In fact, kexec-tools is unable to do that check, since
    there are multiple ways to forbid dracut from loading a module, such as
    "dracut -o module" and "omit_dracutmodules in dracut.conf".

    When squash dracut module is omitted, is_squash_available check will
    still pass, so "--compress zstd" will be appended to dracut cmdline,
    and it will call cmd zstd to do the compression. However cmd zstd may
    not exist, so it fails.

    The previous "--compress zstd" is ambiguous. After the introduction of
    "--squash-compressor", "--squash-compressor" only takes effect for
    mksquashfs and "--compress" only takes effect for the specific cmd.

    So for the is_squash_available && !dracut_has_option
    --squash-compressor && !is_zstd_command_available case, we just leave
    it to be handled the default way.

    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Tao Liu <ltao@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-10-26 10:16:16 +08:00
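
The decision logic described in commit b5a9e54629 above can be summarized in a short sketch. This is not the code from the patch: the function name choose_zstd_args and its yes/no arguments are hypothetical stand-ins for the is_squash_available, dracut_has_option --squash-compressor and is_zstd_command_available checks named in the commit message.

```bash
# Hypothetical helper (not taken from kexec-tools).
# $1: "yes" if squash is available (is_squash_available)
# $2: "yes" if dracut supports --squash-compressor
# $3: "yes" if the zstd command is available
# Prints the dracut arguments to add, or nothing to keep dracut's default.
choose_zstd_args() {
    _squash_ok=$1
    _has_squash_opt=$2
    _has_zstd_cmd=$3
    if [ "$_squash_ok" = yes ] && [ "$_has_squash_opt" = yes ]; then
        # mksquashfs does the zstd compression of squash-root.img.
        echo "--squash-compressor zstd"
    elif [ "$_has_zstd_cmd" = yes ]; then
        # The zstd command compresses the initramfs itself.
        echo "--compress zstd"
    fi
    # Otherwise nothing is printed, which covers the case that the commit
    # message leaves to be handled the default way.
}
```

For example, "choose_zstd_args yes no no" prints nothing, matching the case that is intentionally left unprocessed.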
Tao Liu
8b72bcfbab Release 2.0.25-2
Resolves: bz2090534
Resolves: bz2129842

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-10-17 13:34:00 +08:00
Lichen Liu
bdb60065b2 mkdumprd: Improve error messages on non-existing NFS target directories
Resolves: bz2090534
Upstream: Fedora
Conflict: None

commit 4d52b7d548
Author: Lichen Liu <lichliu@redhat.com>
Date:   Tue Sep 6 11:15:15 2022 +0800

    mkdumprd: Improve error messages on non-existing NFS target directories

    When kdump is configured with an NFS location, and the remote directory does
    not exist, kdump.service fails with a confusing error message.

        kdumpctl[2172]: kdump: Dump path "/tmp/mkdumprd.ftWhOF/target/dumps"
        does not exist in dump target "10.111.113.2:/srv/kdump"

    We just need to print the remote directory "dumps" in such case, because
    "/tmp/mkdumprd.ftWhOF/target" is the local temporary mount point.

    Signed-off-by: Lichen Liu <lichliu@redhat.com>
    Reviewed-by: Coiby Xu<coxu@redhat.com>

Signed-off-by: Lichen Liu <lichliu@redhat.com>
2022-10-13 02:34:51 +00:00
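
The message cleanup described in commit bdb60065b2 above comes down to removing the temporary mount point from the path before printing it. A minimal sketch, assuming a hypothetical function name remote_dump_path:

```bash
# Hypothetical helper (not taken from kexec-tools).
# $1: full local path, e.g. /tmp/mkdumprd.ftWhOF/target/dumps
# $2: local temporary mount point, e.g. /tmp/mkdumprd.ftWhOF/target
# Prints the path relative to the mount point, which is the directory the
# user actually needs to create on the NFS server.
remote_dump_path() {
    # Strip the mount point and the following "/" from the front of $1.
    echo "${1#"$2"/}"
}
```

With the values from the commit message, the helper prints "dumps", so the error can refer to the remote directory rather than the local temporary mount point.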
Lichen Liu
6c26a30505 fadump: avoid non-debug kernel use for fadump case
Resolves: bz2129842
Upstream: Fedora
Conflict: None

commit d905d49c08
Author: Hari Bathini <hbathini@linux.ibm.com>
Date:   Fri Sep 16 19:07:24 2022 +0530

    fadump: avoid non-debug kernel use for fadump case

    Since commit c5bdd2d8f1 ("kdump-lib: use non-debug kernels first"),
    non-debug kernel is preferred, over the debug variant, as dump capture
    kernel to reduce memory consumption. This works alright for kdump as
    the capture kernel is loaded using kexec.

    In case of fadump, regular boot loader is used to load the capture
    kernel. So, the default kernel needs to be used as capture kernel as
    well. But with commit c5bdd2d8f1, initrd of a different kernel is
    made dump capture capable, breaking fadump's ability to capture dump
    properly. Fix this by sticking with the debug variant in case of
    fadump.

    Fixes: c5bdd2d8f1 ("kdump-lib: use non-debug kernels first")

    Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
    Acked-by: Lichen Liu <lichliu@redhat.com>
    Acked-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Lichen Liu <lichliu@redhat.com>
2022-09-28 11:56:52 +08:00
Tao Liu
5f3d92c802 Release 2.0.25-1
Resolves: bz2120916
Resolves: bz2104534
Resolves: bz2089871
Resolves: bz2111857

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-09-21 10:17:22 +08:00
Tao Liu
c2ea3e9427 Rebase kexec-tools to v2.0.25
Resolves: bz2120916

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-09-21 10:12:16 +08:00
Coiby Xu
3c7270927b remind the users to run zipl after calling grubby on s390x
Related: bz2104534
Upstream: Fedora
Conflict: None

commit 4d1e02d340
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Jul 12 10:52:10 2022 +0800

    remind the users to run zipl after calling grubby on s390x

    s390x doesn't use GRUB. To make sure the boot entries are updated, call
    zipl after running grubby.

    Suggested-by: smitterl@redhat.com
    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-09-19 09:10:54 +08:00
Coiby Xu
e9088ae71a remove useless --zipl when calling grubby to update kernel command line
Related: bz2104534
Upstream: Fedora
Conflict: None

commit 58eef4582a
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Jul 12 16:07:37 2022 +0800

    remove useless --zipl when calling grubby to update kernel command line

    "grubby --zipl" only takes effect when setting default kernel. It's
    useless to add "--zipl" when updating kernel command line. Also rename
    _update_grub to _update_kernel_cmdline since s390x doesn't use GRUB.

    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-09-19 09:10:54 +08:00
Coiby Xu
3aeb03c97e skip updating /etc/default/grub for s390x
Resolves: bz2104534
Upstream: Fedora
Conflict: None

commit e8ae897595
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Jul 12 14:06:25 2022 +0800

    skip updating /etc/default/grub for s390x

    Resolves: bz2104534

    When running "kdumpctl reset-crashkernel --kernel=ALL" on s390x,
    sed: can't read /etc/default/grub: No such file or directory
    sed: can't read /etc/default/grub: No such file or directory

    This happens because s390x doesn't use the grub bootloader and
    /etc/default/grub doesn't exist.

    Reported-by: smitterl@redhat.com
    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-09-19 09:10:54 +08:00
Coiby Xu
928c386f97 Allow to update kexec-tools using virt-customize for cloud base image
Resolves: bz2089871
Upstream: Fedora
Conflict: None

commit da0ca0d205
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Jun 28 14:38:28 2022 +0800

    Allow to update kexec-tools using virt-customize for cloud base image

    Resolves: bz2089871

    Currently, kexec-tools can't be updated using virt-customize because
    older versions of kdumpctl can't acquire the instance lock for the
    get-default-crashkernel subcommand. The reason is that /var/lock is
    linked to /run/lock, which doesn't exist in the case of virt-customize.

    This patch fixes this problem by using /tmp/kdump.lock as the lock
    file if /run/lock doesn't exist.
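
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function name `pick_kdump_lock_file`, its optional directory argument, and the `kdump` file name under the lock directory are assumptions made for the example.

```shell
# Hypothetical helper: choose the lock file path.
# $1 is the lock directory (defaults to /run/lock); it is a parameter
# only so the fallback can be exercised without touching /run/lock.
pick_kdump_lock_file() {
    _lock_dir=${1:-/run/lock}
    if [ -d "$_lock_dir" ]; then
        # Assumed file name inside the lock directory.
        echo "$_lock_dir/kdump"
    else
        # Fallback named in the commit message for environments such as
        # virt-customize, where /run/lock does not exist.
        echo "/tmp/kdump.lock"
    fi
}
```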

    Note
    1. The lock file is now created in /run/lock instead of /var/run/lock since
       Fedora has adopted /run [1] since F15.
    2. %pre scriptlet now always returns success since package update won't
       be blocked

    [1] https://fedoraproject.org/wiki/Features/var-run-tmpfs

    Fixes: 0adb0f4 ("try to reset kernel crashkernel when kexec-tools updates the default crashkernel value")

    Reported-by: Nicolas Hicher <nhicher@redhat.com>
    Suggested-by: Laszlo Ersek <lersek@redhat.com>
    Suggested-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-09-19 09:10:54 +08:00
Lichen Liu
bd92125753 kdumpctl: make the kdump.log root-readable-only
Resolves: bz2111857
Upstream: Fedora
Conflict: None

commit 4edcd9a400
Author: Lichen Liu <lichliu@redhat.com>
Date:   Wed Aug 24 16:16:14 2022 +0800

    kdumpctl: make the kdump.log root-readable-only

    Decrease the risk of leaking information that could potentially
    be used to exploit the crash further (think location of keys).

    Signed-off-by: Lichen Liu <lichliu@redhat.com>
    Acked-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Lichen Liu <lichliu@redhat.com>
2022-09-07 11:35:38 +08:00
Tao Liu
c9bdde4cfb Release 2.0.24-5
Resolves: bz2076425

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-07-12 09:57:55 +08:00
Lichen Liu
987edab69a kdump-lib: use non-debug kernels first
Resolves: bz2076425
Upstream: Fedora
Conflict: None

commit c5bdd2d8f1
Author: Lichen Liu <lichliu@redhat.com>
Date:   Mon Jun 13 12:08:08 2022 +0800

    kdump-lib: use non-debug kernels first

    Kdump uses the currently running kernel by default, but when the
    currently running kernel is a debug kernel, it consumes more memory,
    which may cause an out-of-memory condition and a failure to collect
    the vmcore.

    Now we will try to use non-debug kernels first if possible.

    Also extract the logic of determining KDUMP_KERNEL from
    prepare_kdump_bootinfo into a function. This function will return
    KDUMP_KERNEL given a kernel version.
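
One way to derive a non-debug candidate from a kernel release string is sketched below. It assumes debug kernels carry a `+debug` suffix in their release string, which is not stated in the commit message; the function name `nondebug_kernel_release` is hypothetical, and the real patch may locate the kernel differently.

```shell
# Hypothetical helper: strip an assumed "+debug" suffix from a kernel
# release string. Release strings without the suffix pass through unchanged.
nondebug_kernel_release() {
    echo "${1%+debug}"
}
```

A caller would still need to check that a kernel image for the resulting release is installed, and fall back to the running kernel if it is not.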

    Signed-off-by: Lichen Liu <lichliu@redhat.com>
    Acked-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Lichen Liu <lichliu@redhat.com>
2022-06-17 11:20:15 +08:00
Lichen Liu
112d3e1891 kdump-lib: fix typo in variable name
Resolves: bz2076425
Upstream: Fedora
Conflict: None

commit aa9bb8f8ce
Author: Philipp Rudo <prudo@redhat.com>
Date:   Fri Mar 25 15:46:59 2022 +0100

    kdump-lib: fix typo in variable name

    in prepare_kdump_bootinfo s/defaut/default/.

    While at it declare it with the other local variables as local.

    Signed-off-by: Philipp Rudo <prudo@redhat.com>
    Reviewed-by: Tao Liu <ltao@redhat.com>
    Reviewed-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Lichen Liu <lichliu@redhat.com>
2022-06-17 11:05:48 +08:00
Tao Liu
a743eaeded Release 2.0.24-4
Resolves: bz2041729
Resolves: bz2096132

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-06-15 13:05:39 +08:00
Pingfan Liu
a3181169cd crashkernel: optimize arm64 reserved size if PAGE_SIZE=4k
Resolves: bz2041729
Upstream: Fedora
Conflict: None

commit b92bc6e0a7
Author: Pingfan Liu <piliu@redhat.com>
Date:   Mon Jun 13 10:25:26 2022 +0800

    crashkernel: optimize arm64 reserved size if PAGE_SIZE=4k

    On RHEL9 and Fedora, the arm64 platform only supports 4KB page size,
    so the reserved memory size can be aligned to that on x86_64.

    Introduce a new formula for 4KB pages on arm64, which is based on the
    x86_64 value plus an extra 64MB.
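
The "x86_64 plus 64MB" rule can be illustrated with a small sketch. The function name `add_64m_to_ranges` is hypothetical, and it assumes every entry has the form `range:sizeM` with the size in megabytes; any range string passed to it is illustrative input rather than a value taken from this commit.

```shell
# Hypothetical helper: add 64M to the size of every entry in a
# comma-separated crashkernel range string.
# The body runs in a subshell so the IFS change does not leak out.
add_64m_to_ranges() (
    out=""
    IFS=,
    for entry in $1; do
        range=${entry%%:*}      # part before the colon, e.g. 4G-64G
        size=${entry##*:}       # part after the colon, e.g. 256M
        size=${size%M}          # drop the unit; megabytes assumed
        out="${out:+$out,}$range:$((size + 64))M"
    done
    echo "$out"
)
```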

    Signed-off-by: Pingfan Liu <piliu@redhat.com>
    Acked-by: Baoquan He <bhe@redhat.com>

Signed-off-by: Pingfan Liu <piliu@redhat.com>
2022-06-15 03:35:06 +00:00
Tao Liu
c931d1ba8e kdump-lib.sh: Check the output of blkid with sed instead of eval
Upstream: Fedora
Resolves: bz2096132
Conflict: None

commit 2bbc7512a2
Author: Tao Liu <ltao@redhat.com>
Date:   Wed Feb 16 14:26:38 2022 +0800

    kdump-lib.sh: Check the output of blkid with sed instead of eval

    Previously the output of blkid was not checked. If the output
    is empty, the eval will report the following error message:

        /lib/kdump/kdump-lib.sh: eval: line 925: syntax error near unexpected token `;'
        /lib/kdump/kdump-lib.sh: eval: line 925: `; echo $TYPE'

    For example, we can observe such a failure when blkid is invoked
    against an LVM thinpool block device:

        $ blkid -u filesystem,crypto -o export -- "/dev/block/253\:2"
        $ echo $?
        2
        $ udevadm info /dev/block/253\:2|grep S\:
        S: mapper/vg00-thinpoll_tmeta

    In this patch, we will use sed instead of eval, to output the
    fstype of block device if any.
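
A minimal sketch of the sed-based approach follows. The function name `extract_fs_type` is hypothetical and the exact sed expression in the patch may differ; the point is that empty blkid output produces empty output instead of a syntax error from eval.

```shell
# Hypothetical helper: read `blkid -o export` style KEY=value lines on
# stdin and print the value of the TYPE= line, if there is one.
# Empty input, or input without a TYPE= line, prints nothing.
extract_fs_type() {
    sed -n 's/^TYPE=\(.*\)$/\1/p'
}

# Example use ($dev is a placeholder for a block device path):
#   blkid -u filesystem,crypto -o export -- "$dev" | extract_fs_type
```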

    Signed-off-by: Tao Liu <ltao@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-06-13 10:17:06 +08:00
Tao Liu
9daddc7878 Release 2.0.24-3
Resolves: bz2090533

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-06-07 09:45:44 +08:00
Lichen Liu
fcca486525 kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE
Resolves: bz2090533
Upstream: Fedora
Conflict: None

commit 218d9917c0
Author: Dusty Mabe <dusty@dustymabe.com>
Date:   Mon May 16 14:04:12 2022 -0400

    kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE

    For CoreOS based systems we use Ignition for provisioning machines
    in the initramfs on first boot. We trigger Ignition right now by
    the presence of `ignition.firstboot` in the kernel command line. The
    kernel argument is only present on first boot so after a reboot it
    no longer is in the kernel command line.

    If a kernel crash happens before the first reboot of a machine we
    want the `ignition.firstboot` kernel argument to be removed and not
    passed on to the crash kernel.
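
The effect of listing an argument in KDUMP_COMMANDLINE_REMOVE can be sketched as below. This is an illustration only: the function name `remove_cmdline_arg` is hypothetical and this is not the helper that kdump itself uses.

```shell
# Hypothetical helper: print the command line in $1 with the argument
# named in $2 removed, whether it appears bare or as name=value.
# The body runs in a subshell with globbing disabled so that words in
# the command line are never expanded as file name patterns.
remove_cmdline_arg() (
    set -f
    result=""
    for arg in $1; do
        case $arg in
            "$2" | "$2"=*) ;;                          # drop this argument
            *) result="${result:+$result }$arg" ;;      # keep everything else
        esac
    done
    echo "$result"
)
```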

Signed-off-by: Lichen Liu <lichliu@redhat.com>
2022-05-27 10:08:59 +08:00
Tao Liu
ba836a59b2 Release 2.0.24-2
Resolves: bz2059492
Resolves: bz2073676
Resolves: bz2074473

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-05-23 14:34:43 +08:00
Coiby Xu
1778bccc6e remove the upper bound of default crashkernel value example
Resolves: bz2059492
Upstream: Fedora
Conflict: None

commit be20580b06
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Mar 1 17:33:24 2022 +0800

    remove the upper bound of default crashkernel value example

    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-17 09:23:10 +00:00
Coiby Xu
c345f54e4f update fadump-howto
Resolves: bz2059492
Upstream: Fedora
Conflict: None

commit 695e5b8676
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Mar 1 17:30:30 2022 +0800

    update fadump-howto

    1. yum is deprecated so use dnf instead
    2. use the "kdumpctl reset-crashkernel --fadump=on" API

    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-17 09:23:10 +00:00
Coiby Xu
b749f72da4 update kexec-kdump-howto
Resolves: bz2059492
Upstream: Fedora
Conflict: None

commit 1e7df3e1f3
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Mar 1 17:30:50 2022 +0800

    update kexec-kdump-howto

    1. yum is deprecated so use dnf instead
    2. use the "kdumpctl reset-crashkernel" API
    3. ask the users to refer to crashkernel-howto.txt for setting custom
       crashkernel value
    4. fix a typo

    Philipp Rudo <prudo@redhat.com>

    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-17 09:23:10 +00:00
Coiby Xu
d3dba4cbc3 update crashkernel-howto
Resolves: bz2073676
Upstream: Fedora
Conflict: None

commit 683ff87821
Author: Coiby Xu <coxu@redhat.com>
Date:   Mon Apr 18 15:54:53 2022 +0800

    update crashkernel-howto

    1. clean up leftover crashkernel.default
    2. fix a few typos and grammar mistakes
    3. ask the users to refer to `man kdumpctl` for reset-crashkernel

    Philipp Rudo <prudo@redhat.com>

    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-17 09:23:10 +00:00
Coiby Xu
0e3a77330e add man documentation for kdumpctl get-default-crashkernel
Resolves: bz2073676
Upstream: Fedora
Conflict: None

commit a1c63fa644
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Mar 1 17:35:39 2022 +0800

    add man documentation for kdumpctl get-default-crashkernel

    A few typos and grammar issues are fixed as well.

    Philipp Rudo <prudo@redhat.com>

    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-17 09:23:10 +00:00
Coiby Xu
63deb78f73 improve get_recommend_size
Resolves: bz2074473
Upstream: Fedora
Conflict: None

commit 4f702c81e9
Author: Coiby Xu <coxu@redhat.com>
Date:   Thu May 12 10:48:31 2022 +0800

    improve get_recommend_size

    This patch rewrites get_recommend_size to get rid of the following
    limitations,
    1. only supports ranges in crashkernel sorted in increasing order
    2. the first entry of crashkernel should have only a single digit and
       it's in gigabytes
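
A range lookup without those two limitations might look like the sketch below. It is not the rewritten get_recommend_size itself: the names `size_to_mb` and `recommend_size`, the megabyte-based interface, and the `0M` result when no range matches are all assumptions made for the example.

```shell
# Hypothetical helper: convert a size such as 512M, 4G or 1T to megabytes.
size_to_mb() {
    case $1 in
        *T) echo "$(( ${1%T} * 1024 * 1024 ))" ;;
        *G) echo "$(( ${1%G} * 1024 ))" ;;
        *M) echo "${1%M}" ;;
        *) echo 0 ;;
    esac
}

# Hypothetical helper: print the reservation for a system with $1 MB of
# RAM, given a crashkernel range string in $2. Ranges may be listed in
# any order and range bounds may use M, G or T units.
# The body runs in a subshell so the IFS change does not leak out.
recommend_size() (
    mem_mb=$1
    IFS=,
    for entry in $2; do
        range=${entry%%:*}
        size=${entry##*:}
        start=$(size_to_mb "${range%%-*}")
        end=${range##*-}
        # An empty end bound means the range is open-ended.
        if [ -n "$end" ]; then
            end=$(size_to_mb "$end")
        fi
        if [ "$mem_mb" -ge "$start" ] && { [ -z "$end" ] || [ "$mem_mb" -lt "$end" ]; }; then
            echo "$size"
            exit 0
        fi
    done
    # Assumed result when no range covers the given memory size.
    echo "0M"
)
```

For example, `recommend_size 2048 "4G-64G:256M,1G-4G:192M,64G-:512M"` selects the `1G-4G` entry even though it is not listed first.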

    Suggested-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-13 11:22:07 +08:00