Commit Graph

290 Commits

Author SHA1 Message Date
Coiby Xu
3aeb03c97e skip updating /etc/default/grub for s390x
Resolves: bz2104534
Upstream: Fedora
Conflict: None

commit e8ae897595
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Jul 12 14:06:25 2022 +0800

    skip updating /etc/default/grub for s390x

    Resolves: bz2104534

    When running "kdumpctl reset-crashkernel --kernel=ALL" on s390x,
    sed: can't read /etc/default/grub: No such file or directory
    sed: can't read /etc/default/grub: No such file or directory

    This happens because s390x doesn't use the grub bootloader and
    /etc/default/grub doesn't exist.

    Reported-by: smitterl@redhat.com
    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-09-19 09:10:54 +08:00
Coiby Xu
928c386f97 Allow to update kexec-tools using virt-customize for cloud base image
Resolves: bz2089871
Upstream: Fedora
Conflict: None

commit da0ca0d205
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Jun 28 14:38:28 2022 +0800

    Allow to update kexec-tools using virt-customize for cloud base image

    Resolves: bz2089871

    Currently, kexec-tools can't be updated using virt-customize because
    older version of kdumpctl can't acquire instance lock for the
    get-default-crashkernel subcommand. The reason is /var/lock is linked to
    /run/lock which however doesn't exist in the case of virt-customize.

    This patch fixes this problem by using /tmp/kdump.lock as the lock
    file if /run/lock doesn't exist.

    Note
    1. The lock file is now created in /run/lock instead of /var/run/lock since
       Fedora has adopted adopted /run [2] since F15.
    2. %pre scriptlet now always return success since package update won't
       be blocked

    [1] https://fedoraproject.org/wiki/Features/var-run-tmpfs

    Fixes: 0adb0f4 ("try to reset kernel crashkernel when kexec-tools updates the default crashkernel value")

    Reported-by: Nicolas Hicher <nhicher@redhat.com>
    Suggested-by: Laszlo Ersek <lersek@redhat.com>
    Suggested-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-09-19 09:10:54 +08:00
Lichen Liu
bd92125753 kdumpctl: make the kdump.log root-readable-only
Resolves: bz2111857
Upstream: Fedora
Conflict: None

commit 4edcd9a400
Author: Lichen Liu <lichliu@redhat.com>
Date:   Wed Aug 24 16:16:14 2022 +0800

    kdumpctl: make the kdump.log root-readable-only

    Decrease the risk that of leaking information that could potentially
    be used to exploit the crash further (think location of keys).

    Signed-off-by: Lichen Liu <lichliu@redhat.com>
    Acked-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Lichen Liu <lichliu@redhat.com>
2022-09-07 11:35:38 +08:00
Tao Liu
c9bdde4cfb Release 2.0.24-5
Resolves: bz2076425

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-07-12 09:57:55 +08:00
Lichen Liu
987edab69a kdump-lib: use non-debug kernels first
Resolves: bz2076425
Upstream: Fedora
Conflict: None

commit c5bdd2d8f1
Author: Lichen Liu <lichliu@redhat.com>
Date:   Mon Jun 13 12:08:08 2022 +0800

    kdump-lib: use non-debug kernels first

    Kdump uses currently running kernel as default, but when currently
    running kernel is a debug kernel, it will consume more memory,
    which may cause out-of-memory and fail to collect vmcore.

    Now we will try to use non-debug kernels first if possible.

    Also extract the logic of determine KDUMP_KERNEL from
    prepare_kdump_bootinfo into a function. This function will return
    KDUMP_KERNEL given a kernel version.

    Signed-off-by: Lichen Liu <lichliu@redhat.com>
    Acked-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Lichen Liu <lichliu@redhat.com>
2022-06-17 11:20:15 +08:00
Lichen Liu
112d3e1891 kdump-lib: fix typo in variable name
Resolves: bz2076425
Upstream: Fedora
Conflict: None

commit aa9bb8f8ce
Author: Philipp Rudo <prudo@redhat.com>
Date:   Fri Mar 25 15:46:59 2022 +0100

    kdump-lib: fix typo in variable name

    in prepare_kdump_bootinfo s/defaut/default/.

    While at it declare it with the other local variables as local.

    Signed-off-by: Philipp Rudo <prudo@redhat.com>
    Reviewed-by: Tao Liu <ltao@redhat.com>
    Reviewed-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Lichen Liu <lichliu@redhat.com>
2022-06-17 11:05:48 +08:00
Tao Liu
a743eaeded Release 2.0.24-4
Resolves: bz2041729
Resolves: bz2096132

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-06-15 13:05:39 +08:00
Pingfan Liu
a3181169cd crashkernel: optimize arm64 reserved size if PAGE_SIZE=4k
Resolves: bz2041729
Upstream: Fedora
Conflict: None

commit b92bc6e0a7
Author: Pingfan Liu <piliu@redhat.com>
Date:   Mon Jun 13 10:25:26 2022 +0800

    crashkernel: optimize arm64 reserved size if PAGE_SIZE=4k

    On RHEL9 and Fedora, the arm64 platform only supports 4KB page size.
    the reserved memory size can be aligned to that on x86_64.

    Introducing a new formula for 4KB on arm64, which bases on x86_64 plus
    extra 64MB.

    Signed-off-by: Pingfan Liu <piliu@redhat.com>
    Acked-by: Baoquan He <bhe@redhat.com>

Signed-off-by: Pingfan Liu <piliu@redhat.com>
2022-06-15 03:35:06 +00:00
Tao Liu
c931d1ba8e kdump-lib.sh: Check the output of blkid with sed instead of eval
upstream: fedora
resolves: bz2096132
conflict: none

commit 2bbc7512a2
Author: Tao Liu <ltao@redhat.com>
Date:   Wed Feb 16 14:26:38 2022 +0800

    kdump-lib.sh: Check the output of blkid with sed instead of eval

    Previously the output of blkid is not checked. If the output
    is empty, the eval will report the following error message:

        /lib/kdump/kdump-lib.sh: eval: line 925: syntax error near unexpected token `;'
        /lib/kdump/kdump-lib.sh: eval: line 925: `; echo $TYPE'

    For example, we can observe such a failing when blkid is invoked
    against a lvm thinpool block device:

        $ blkid -u filesystem,crypto -o export -- "/dev/block/253\:2"
        $ echo $?
        2
        $ udevadm info /dev/block/253\:2|grep S\:
        S: mapper/vg00-thinpoll_tmeta

    In this patch, we will use sed instead of eval, to output the
    fstype of block device if any.

    Signed-off-by: Tao Liu <ltao@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-06-13 10:17:06 +08:00
Tao Liu
9daddc7878 Release 2.0.24-3
Resolves: bz2090533

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-06-07 09:45:44 +08:00
Lichen Liu
fcca486525 kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE
Resolves: bz2090533
Upstream: Fedora
Conflict: None

commit 218d9917c0
Author: Dusty Mabe <dusty@dustymabe.com>
Date:   Mon May 16 14:04:12 2022 -0400

    kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE

    For CoreOS based systems we use Ignition for provisioning machines
    in the initramfs on first boot. We trigger Ignition right now by
    the presence of `ignition.firstboot` in the kernel command line. The
    kernel argument is only present on first boot so after a reboot it
    no longer is in the kernel command line.

    If a kernel crash happens before the first reboot of a machine we
    want the `ignition.firstboot` kernel argument to be removed and not
    passed on to the crash kernel.

Signed-off-by: Lichen Liu <lichliu@redhat.com>
2022-05-27 10:08:59 +08:00
Tao Liu
ba836a59b2 Release 2.0.24-2
Resolves: bz2059492
Resolves: bz2073676
Resolves: bz2074473

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-05-23 14:34:43 +08:00
Coiby Xu
1778bccc6e remove the upper bound of default crashkernel value example
Resolves: bz2059492
Upstream: Fedora
Conflict: None

commit be20580b06
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Mar 1 17:33:24 2022 +0800

    remove the upper bound of default crashkernel value example

    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-17 09:23:10 +00:00
Coiby Xu
c345f54e4f update fadump-howto
Resolves: bz2059492
Upstream: Fedora
Conflict: None

commit 695e5b8676
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Mar 1 17:30:30 2022 +0800

    update fadump-howto

    1. yum is deprecated so use dnf instead
    2. use the "kdumpctl reset-crashkernel --fadump=on" API

    Reviewed-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-17 09:23:10 +00:00
Coiby Xu
b749f72da4 update kexec-kdump-howto
Resolves: bz2059492
Upstream: Fedora
Conflict: None

commit 1e7df3e1f3
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Mar 1 17:30:50 2022 +0800

    update kexec-kdump-howto

    1. yum is deprecated so use dnf instead
    2. use the "kdumpctl reset-crashkernel" API
    3. ask the users to refer to crashkernel-howto.txt for setting custom
       crashkernel value
    4. fix a typo

    Philipp Rudo <prudo@redhat.com>

    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-17 09:23:10 +00:00
Coiby Xu
d3dba4cbc3 update crashkernel-howto
Resolves: bz2073676
Upstream: Fedora
Conflict: None

commit 683ff87821
Author: Coiby Xu <coxu@redhat.com>
Date:   Mon Apr 18 15:54:53 2022 +0800

    update crashkernel-howto

    1. clean up left crashkernel.default
    2. fix a few typos and grammar mistakes
    3. ask the users to refer to `man kdumpctl` for reset-crashkernel

    Philipp Rudo <prudo@redhat.com>

    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-17 09:23:10 +00:00
Coiby Xu
0e3a77330e add man documentation for kdumpctl get-default-crashkernel
Resolves: bz2073676
Upstream: Fedora
Conflict: None

commit a1c63fa644
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Mar 1 17:35:39 2022 +0800

    add man documentation for kdumpctl get-default-crashkernel

    A few typos and grammar issues are fixed as well.

    Philipp Rudo <prudo@redhat.com>

    Signed-off-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-17 09:23:10 +00:00
Coiby Xu
63deb78f73 improve get_recommend_size
Resolves: bz2074473
Upstream: Fedora
Conflict: None

commit 4f702c81e9
Author: Coiby Xu <coxu@redhat.com>
Date:   Thu May 12 10:48:31 2022 +0800

    improve get_recommend_size

    This patch rewrites get_recommend_size to get rid of the following
    limitations,
    1. only supports ranges in crashkernel sorted in increasing order
    2. the first entry of crashkernel should have only a single digit and
       it's in gigabytes

    Suggested-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-13 11:22:07 +08:00
Coiby Xu
6439997b49 fix a calculation error in get_system_size
Resolves: bz2074473
Upstream: Fedora
Conflict: None

commit 5c23b6ebb7
Author: Coiby Xu <coxu@redhat.com>
Date:   Sat May 7 16:30:39 2022 +0800

    fix a calculation error in get_system_size

    Recently, it's found 'kdumpctl estimate' returns 512M while the system
    reserves 1024M kdump memory in a case. This happens because the ranges
    in /proc/iomem are inclusively. For example, "0-1: System RAM" means 2
    bytes of system memory other than 1 byte. Fix this error by adding one
    more byte.

    Note
    1. the function has been simplified as well.
    2. define PROC_IOMEM as /proc/iomem for the sake of unit tests

    Reported-by: Ruowen Qin <ruqin@redhat.com>
    Fixes: 1813189 ("kdump-lib.sh: introduce functions to return recommened mem size")
    Suggested-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-05-13 11:22:05 +08:00
Tao Liu
23f151d7b5 Release 2.0.24-1
Resolves: bz2081323
Resolves: bz2076159
Resolves: bz2076157

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-05-05 16:07:27 +08:00
Tao Liu
d180b099fd Rebase kexec-tools to v2.0.24
Resolves: bz2076157

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-05-05 16:04:09 +08:00
Tao Liu
b5d4920e52 Rebase makedumpfile to 1.7.1
Resolves: bz2076159

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-05-05 15:46:44 +08:00
Philipp Rudo
6348884a9d Avoid false-positive mem_section validation with vmlinux
Resolves: bz2081323
Upstream: github.com/makedumpfile/makedumpfile.git
Conflicts: None

commit 6d0d95ecc04a70f8448d562ff0fbbae237f5c929
Author: Kazuhito Hagio <k-hagio-ab@nec.com>
Date:   Thu Apr 21 08:58:29 2022 +0900

    [PATCH] Avoid false-positive mem_section validation with vmlinux

    Currently get_mem_section() validates if SYMBOL(mem_section) is the address
    of the mem_section array first.  But there was a report that the first
    validation wrongly returned TRUE with -x vmlinux and SPARSEMEM_EXTREME
    (4.15+) on s390x.  This leads to crash failing statup with the following
    seek error:

      crash: seek error: kernel virtual address: 67fffc2800  type: "memory section root table"

    Skip the first validation when satisfying the conditions.

    Reported-by: Dave Wysochanski <dwysocha@redhat.com>
    Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
    Reviewed-and-Tested-by: Philipp Rudo <prudo@redhat.com>
    Reviewed-by: Pingfan Liu <piliu@redhat.com>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-05-03 14:47:29 +02:00
Tao Liu
aab4b79f41 Release 2.0.23-10
Resolves: bz2060774
Resolves: bz2060824
Resolves: bz2069200

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-04-08 15:26:19 +08:00
Tao Liu
dd05affb9f try to update the crashkernel in GRUB_ETC_DEFAULT after kexec-tools updates the default crashkernel value
Upstream: fedora
Conflict: none
Resolves: bz2060774

commit 6d4062a936
Author: Coiby Xu <coxu@redhat.com>
Date:   Wed Feb 16 09:42:54 2022 +0800

    try to update the crashkernel in GRUB_ETC_DEFAULT after kexec-tools updates the default crashkernel value

    If GRUB_ETC_DEFAULT use crashkernel=auto or
    crashkernel=OLD_DEFAULT_CRASHKERNEL, it should be updated as well.

    Add a helper function to read kernel cmdline parameter from
    GRUB_ETC_DEFAULT. This function is used to read kernel cmdline
    parameter like fadump or crashkernel.

    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-04-08 15:20:04 +08:00
Tao Liu
738bf03f04 address the case where there are multiple values for the same kernel arg
Resolves: bz2060774
Upstream: fedora
Conflict: none

commit 37f4f2c1f6
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Feb 15 13:24:19 2022 +0800

    address the case where there are multiple values for the same kernel arg

    There is the case where there are multiple entries of the same parameter on
    the command line, e.g.
    GRUB_CMDLINE_LINUX="crashkernel=110M crashkernel=220M fadump=on crashkernel=330M".

    In such an situation _update_kernel_cmdline_in_grub_etc_default only
    updates/removes the last entry which is usually not what you want as the
    kernel (for crashkernel) takes the last entry it can find.

    Thus make sure the case with multiple entries of the same parameter is
    handled properly by removing all occurrences of given parameter first.

    Note
    1. sed command group and conditional control has been used to get rid of
       grep.
    2. Fully supporting kernel cmdline as documented in
       Documentation/admin-guide/kernel-parameters.rst is complex and in
       foreseeable future a full implementation is not needed. So simply
       document the unsupported cases instead.

    Fixes: 140da74 ("rewrite reset_crashkernel to support fadump and to used by RPM scriptlet")

    Reported-by: Philipp Rudo <prudo@redhat.com>
    Suggested-by: Philipp Rudo <prudo@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-04-08 15:19:47 +08:00
Philipp Rudo
673f93346e s390: add support for --reuse-cmdline
Resolves: bz2060824
Upstream: git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
Conflicts: None

commit 2e1ec106dc5aac951ba884ebe4cca036e9a2d45f
Author: Sven Schnelle <svens@linux.ibm.com>
Date:   Thu Dec 16 12:43:56 2021 +0100

    s390: add support for --reuse-cmdline

    --reuse-cmdline reads the command line of the currently
    running kernel from /proc/cmdline and uses that for the
    kernel that should be kexec'd.

    Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
    Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
    Signed-off-by: Simon Horman <horms@verge.net.au>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-04-06 18:05:47 +02:00
Philipp Rudo
1ac74b6d66 use slurp_proc_file() in get_command_line()
Resolves: bz2060824
Upstream: git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
Conflicts: None

commit d6516ba4c88f217fe14455db92c60cd0e9af18f8
Author: Sven Schnelle <svens@linux.ibm.com>
Date:   Thu Dec 16 12:43:55 2021 +0100

    use slurp_proc_file() in get_command_line()

    This way the size of the command line that get_command_line() can handle
    is no longer fixed.

    Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
    Signed-off-by: Simon Horman <horms@verge.net.au>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-04-06 18:05:47 +02:00
Philipp Rudo
9e3d3bc043 add slurp_proc_file()
Resolves: bz2060824
Upstream: git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
Conflicts: None

commit 193e51deccc62544f6423eb5e5eefc8a23aad679
Author: Sven Schnelle <svens@linux.ibm.com>
Date:   Thu Dec 16 12:43:54 2021 +0100

    add slurp_proc_file()

    slurp_file() cannot be used to read proc files, as they are returning
    a size of zero in stat(). Add a function slurp_proc_file() which is
    similar to slurp_file(), but doesn't require the size of the file to
    be known.

    Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
    Signed-off-by: Simon Horman <horms@verge.net.au>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-04-06 18:05:47 +02:00
Philipp Rudo
f8e3f42ec1 s390: use KEXEC_ALL_OPTIONS
Resolves: bz2060824
Upstream: git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
Conflicts: None

commit 91a3d0e00a5c18ee9bdd2c6c03ac64a6471e2559
Author: Sven Schnelle <svens@linux.ibm.com>
Date:   Thu Dec 16 12:43:53 2021 +0100

    s390: use KEXEC_ALL_OPTIONS

    KEXEC_ALL_OPTIONS could be used instead defining the same
    array several times. This makes code easier to maintain when
    new options are added.

    Suggested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
    Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
    Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
    Signed-off-by: Simon Horman <horms@verge.net.au>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-04-06 18:05:47 +02:00
Philipp Rudo
5c6595e4c0 s390: add variable command line size
Resolves: bz2060824
Upstream: git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
Conflicts: None

commit defb80a20bf1e4d778596ce2447e19d44f31ae5a
Author: Sven Schnelle <svens@linux.ibm.com>
Date:   Thu Dec 16 12:43:52 2021 +0100

    s390: add variable command line size

    Newer s390 kernels support a command line size longer than 896
    bytes. Such kernels contain a new member in the parameter area,
    which might be utilized by tools like kexec. Older kernels have
    the location initialized to zero, so we check whether there's a
    non-zero number present and use that. If there isn't, we fallback
    to the legacy command line size of 896 bytes.

    Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
    Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
    Signed-off-by: Simon Horman <horms@verge.net.au>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-04-06 18:05:47 +02:00
Philipp Rudo
574a202f62 util_lib/elf_info: harden parsing of printk buffer
Resolves: bz2069200
Upstream: git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
Conflicts: None

commit f4c59879b830c7d574a953e6ce970ddaf20910d7
Author: Philipp Rudo <prudo@redhat.com>
Date:   Wed Mar 23 16:35:36 2022 +0100

    util_lib/elf_info: harden parsing of printk buffer

    The old printk mechanism (> v3.5.0 and < v5.10.0) had a fixed size
    buffer (log_buf) that contains all messages. The location for the next
    message is stored in log_next_idx. In case the log_buf runs full
    log_next_idx wraps around and starts overwriting old messages at the
    beginning of the buffer. The wraparound is denoted by a message with
    msg->len == 0.

    Following the behavior described above blindly is dangerous as e.g. a
    memory corruption could overwrite (parts of) the log_buf. If the
    corruption adds a message with msg->len == 0 this leads to an endless
    loop when dumping the dmesg. Fix this by verifying that not wrapped
    around before when it encounters a message with msg->len == 0.

    While at it also verify that the index is within the log_buf and thus
    guard against corruptions with msg->len != 0.

    The same bug has been reported and fixed in makedumpfile [1].

    [1] http://lists.infradead.org/pipermail/kexec/2022-March/024272.html

    Signed-off-by: Philipp Rudo <prudo@redhat.com>
    Signed-off-by: Simon Horman <horms@verge.net.au>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-03-28 19:20:45 +02:00
Philipp Rudo
7620d676de print error when reading with unsupported compression
Resolves: bz2069200
Upstream: github.com/makedumpfile/makedumpfile.git
Conflicts: goto out_error --> return FALSE
	   due to missing 64b5b29 ("[PATCH 03/15] remove variable length
	   array in readpage_kdump_compressed()")

commit 5035c0821f07da3badda645cd0064d4b80e1667d
Author: Philipp Rudo <prudo@redhat.com>
Date:   Mon Mar 14 17:04:32 2022 +0100

    [PATCH] print error when reading with unsupported compression

    Currently makedumpfile only checks if the required compression algorithm
    was enabled during build when compressing a dump but not when reading
    from one. This can lead to situations where, one version of makedumpfile
    creates the dump using a compression algorithm an other version of
    makedumpfile doesn't support. When the second version now tries to, e.g.
    extract the dmesg from the dump it will fail with an error similar to

      # makedumpfile --dump-dmesg vmcore dmesg.txt
      __vtop4_x86_64: Can't get a valid pgd.
      readmem: Can't convert a virtual address(ffffffff92e18284) to physical address.
      readmem: type_addr: 0, addr:ffffffff92e18284, size:390
      check_release: Can't get the address of system_utsname.

      makedumpfile Failed.

    That's because readpage_kdump_compressed{_parallel} does not return
    with an error if the page it is trying to read is compressed with an
    unsupported compression algorithm. Thus readmem copies random data from
    the (uninitialized) cachebuf to its caller and thus causing the error
    above.

    Fix this by checking if the required compression algorithm is supported
    in readpage_kdump_compressed{_parallel} and print a proper error message
    if it isn't.

    Reported-by: Dave Wysochanski <dwysocha@redhat.com>
    Signed-off-by: Philipp Rudo <prudo@redhat.com>
    Reviewed-and-tested-by: Dave Wysochanski <dwysocha@redhat.com>
    Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-03-28 19:19:43 +02:00
Philipp Rudo
fb25c11f79 use cycle detection when parsing the prink log_buf
Resolves: bz2069200
Upstream: github.com/makedumpfile/makedumpfile.git
Conflicts: None

commit 68d120b30af5e930afafed81e79712af3c1a278c
Author: Philipp Rudo <prudo@redhat.com>
Date:   Mon Mar 14 17:04:31 2022 +0100

    [PATCH v2 3/3] use cycle detection when parsing the prink log_buf

    The old printk mechanism (> v3.5.0 and < v5.10.0) had a fixed size
    buffer (log_buf) that contains all messages. The location for the next
    message is stored in log_next_idx. In case the log_buf runs full
    log_next_idx wraps around and starts overwriting old messages at the
    beginning of the buffer. The wraparound is denoted by a message with
    msg->len == 0.

    Following the behavior described above blindly in makedumpfile is
    dangerous as e.g. a memory corruption could overwrite (parts of) the
    log_buf. If the corruption adds a message with msg->len == 0 this leads
    to an endless loop when dumping the dmesg with makedumpfile appending
    the messages up to the corruption over and over again to the output file
    until file system is full. Fix this by using cycle detection and aboard
    once one is detected.

    While at it also verify that the index is within the log_buf and thus
    guard against corruptions with msg->len != 0.

    Reported-by: Audra Mitchell <aubaker@redhat.com>
    Suggested-by: Dave Wysochanski <dwysocha@redhat.com>
    Signed-off-by: Philipp Rudo <prudo@redhat.com>
    Reviewed-and-tested-by: Dave Wysochanski <dwysocha@redhat.com>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-03-28 18:39:31 +02:00
Philipp Rudo
92f7d2ba5e use pointer arithmetics for dump_dmesg
Resolves: bz2069200
Upstream: github.com/makedumpfile/makedumpfile.git
Conflicts: None

commit e1d2e5302b016c6f7942f46ffa27aa31326686c5
Author: Philipp Rudo <prudo@redhat.com>
Date:   Mon Mar 14 17:04:30 2022 +0100

    [PATCH v2 2/3] use pointer arithmetics for dump_dmesg

    When parsing the printk buffer for the old printk mechanism (> v3.5.0+ and
    < 5.10.0) a log entry is currently specified by the offset into the
    buffer where the entry starts. Change this to use a pointers instead.
    This is done in preparation for using the new cycle detection mechanism.

    Signed-off-by: Philipp Rudo <prudo@redhat.com>
    Reviewed-and-tested-by: Dave Wysochanski <dwysocha@redhat.com>
    Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-03-28 18:39:04 +02:00
Philipp Rudo
90da030e61 add generic cycle detection
Resolves: bz2069200
Upstream: github.com/makedumpfile/makedumpfile.git
Conflicts: None

commit feae3d1754d2b0788ce1f18b0cd4b40098ff52ff
Author: Philipp Rudo <prudo@redhat.com>
Date:   Mon Mar 14 17:04:29 2022 +0100

    [PATCH v2 1/3] add generic cycle detection

    In order to work makedumpfile needs to interpret data read from the
    dump. This can cause problems as the data from the dump cannot be
    trusted (otherwise the kernel wouldn't have panicked in the first
    place). This also means that every loop which stop condition depend on
    data read from the dump has a chance to loop forever. Thus add a generic
    cycle detection mechanism that allows to detect and handle such
    situations appropriately.

    For cycle detection use Brent's algorithm [1] as it has constant memory
    usage. With this it can also be used in the kdump kernel without the
    danger that it runs oom when iterating large data structures.
    Furthermore it only depends on some pointer arithmetic. Thus the
    performance impact (as long as no cycle was detected) should be
    comparatively small.

    [1] https://en.wikipedia.org/wiki/Cycle_detection#Brent's_algorithm

    Suggested-by: Dave Wysochanski <dwysocha@redhat.com>
    Signed-off-by: Philipp Rudo <prudo@redhat.com>
    Reviewed-and-tested-by: Dave Wysochanski <dwysocha@redhat.com>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-03-28 18:38:30 +02:00
Tao Liu
1eadf2b8a8 Release 2.0.23-9
Resolves: bz2055498

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-02-25 09:58:26 +08:00
Tao Liu
6e2f52ede1 makedumpfile: sadump, kaslr: fix failure of calculating kaslr_offset
upstream: fedora
resolves: bz2055498
conflict: none

commit 59b1726fbcc251155140c8a1972384498fee4daf
Author: HATAYAMA Daisuke <d.hatayama@fujitsu.com>
Date:   Tue Jan 25 12:55:15 2022 +0000

    [PATCH] sadump, kaslr: fix failure of calculating kaslr_offset

    On kernels v5.8 or later, makedumpfile fails for memory dumps in the
    sadump-related formats as follows:

        # makedumpfile -f -l -d 31 -x ./vmlinux /dev/sdd4 /root/vmcore-ld31
        __vtop4_x86_64: Can't get a valid pud_pte.
        ...110 lines of the same message...
        __vtop4_x86_64: Can't get a valid pud_pte.
        calc_kaslr_offset: failed to calculate kaslr_offset and phys_base; default to 0
        readmem: type_addr: 1, addr:ffffffff85411858, size:8
        __vtop4_x86_64: Can't get pgd (page_dir:ffffffff85411858).
        readmem: Can't convert a virtual address(ffffffff059be980) to physical address.
        readmem: type_addr: 0, addr:ffffffff059be980, size:1024
        cpu_online_mask_init: Can't read cpu_online_mask memory.

        makedumpfile Failed.

    This is caused by the kernel commit 9d06c4027f21 ("x86/entry: Convert
    Divide Error to IDTENTRY") that renamed divide_error to
    asm_exc_divide_error, breaking logic for calculating kaslr offset.

    Fix this by adding initialization of asm_exc_divide_error.

    Signed-off-by: HATAYAMA Daisuke <d.hatayama@fujitsu.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-02-25 09:53:51 +08:00
Tao Liu
c19658357d Release 2.0.23-8
Resolves: bz2051822

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-02-21 10:37:44 +08:00
Coiby Xu
06b51aac41 update kernel crashkernel in posttrans RPM scriptlet when updating kexec-tools
Resolves: bz2051822
Upstream: Fedora
Conflict: None

commit 311b5b100b
Author: Coiby Xu <coxu@redhat.com>
Date:   Fri Feb 11 13:11:17 2022 +0800

    update kernel crashkernel in posttrans RPM scriptlet when updating kexec-tools

    When doing in-place upgrading using leapp on x86_64, kdumpcl can't
    acquire instance lock when running in %post RPM scriplet on x86_64,
      localhost upgrade[1306]: /bin/kdumpctl: line 49: /var/lock/kdump: No such file or directory
      localhost upgrade[1306]: kdump: Create file lock failed

    and running "touch /var/lock/dkump" also fails with
    "No such file or directory". Thus kdumpctl can't be run in %post
    scriptlet. But kdumpctl can be run in %posttrans RPM scriplet.

    Besides, it's better to update crashkernel after the kernel has been
    updated. So let's update kernel crashkernel in the %posttrans
    scriptlet which will be run in the end of a transaction i.e. after
    the kernel has been updated.

    Note for %posttrans scriptlet, "$1 == 1" means both installing a new
    package and upgrading a package.

    [1] https://github.com/apptainer/singularity/issues/2386#issuecomment-474747054

    Reported-by: Jie Li <jieli@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-02-18 09:05:25 +08:00
Tao Liu
e2d1f68d0f Release 2.0.23-7
Related: bz1895258
Resolves: bz2024976

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-02-14 12:37:55 +08:00
Coiby Xu
9b4f6070ec fix incorrect usage of _get_all_kernels_from_grubby
Related: bz1895258
Upstream: Fedora
Conflict: None

commit 41b8f9528c
Author: Coiby Xu <coxu@redhat.com>
Date:   Wed Feb 9 08:04:39 2022 +0800

    fix incorrect usage of _get_all_kernels_from_grubby

    It's found that the kernel cmdline crashkernel=auto doesn't get updated
    when upgrading kexec-tools. This happens because _get_all_kernels_from_grubby
    is called with no argument by reset_crashkernel_after_update. When retrieving
    all kernel paths on the system, "grubby --info ALL" should be used. Fix this
    error by passing "ALL" argument.

    Fixes: 0adb0f4 ("try to reset kernel crashkernel when kexec-tools updates the default crashkernel value")

    Reported-by: Jie Li <jieli@redhat.com>
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Tao Liu <ltao@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-02-14 11:04:47 +08:00
Coiby Xu
7a61e066b7 fix the mistake of swapping function parameters of read_proc_environ_var
Resolves: bz2024976
Upstream: Fedora
Conflict: None

commit 5111c01334
Author: Coiby Xu <coxu@redhat.com>
Date:   Mon Feb 7 08:08:01 2022 +0800

    fix the mistake of swapping function parameters of read_proc_environ_var

    _is_osbuild fails because it expects the 1st and 2nd function parameter
    to be the environment variable and environ file path respectively. Fix
    it by swapping the parameters in read_proc_environ_var.

    Note the osbuild environ file path is defined in _OSBUILD_ENVIRON_PATH
    so _is_osbuild can be unit-tested by overwriting _OSBUILD_ENVIRON_PATH.

    Fixes: 6a3ce83 ("fix the error of parsing the container environ variable for osbuild")
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Pingfan Liu <piliu@redhat.com>
2022-02-08 11:14:53 +08:00
Tao Liu
32276a35f2 Release 2.0.23-6
Resolves: bz2031736
Resolves: bz2017196
Resolves: bz2041911
Resolves: bz2017121
Resolves: bz2042726
Resolves: bz2045971
Resolves: bz2045969
Resolves: bz2024976

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-01-26 15:05:37 +08:00
Tao Liu
2bc8a0d274 Revert "Remove trace_buf_size and trace_event from the kernel bootparameters of the kdump kernel"
upstream: fedora
conflict: none
resolves: bz2042726

commit 99de77bba7
Author: Tao Liu <ltao@redhat.com>
Date:   Fri Jan 21 16:08:47 2022 +0800

    Revert "Remove trace_buf_size and trace_event from the kernel bootparameters of the kdump kernel"

    There is a mechanism to keep memory consumption minimum, i.e. equal
    to trace_buf_size=1, until tracing by ftrace is actually started:

        tracing: keep ring buffer to minimum size till used
        73c5162aa3

    Since ftrace is usually never used in the kdump 2nd kernel, the kdump
    2nd kernel behaves in the same way with or without trace_buf_size=1.
    So the issue which the patch want to solve never exists. Let's revert
    the patch for better maintainance and avoid confusion.

    ref link: https://bugzilla.redhat.com/show_bug.cgi?id=2034501#c20

    This reverts commit f39000f.

    Signed-off-by: Tao Liu <ltao@redhat.com>
    Acked-by: Coiby Xu <coxu@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-01-26 14:58:39 +08:00
Coiby Xu
80a37c95ee fix broken kdump_get_arch_recommend_size
Resolves: bz2045971
Upstream: Fedora
Conflict: None

commit 2df55984f6
Author: Coiby Xu <coxu@redhat.com>
Date:   Wed Jan 26 08:48:18 2022 +0800

    fix broken kdump_get_arch_recommend_size

    shellcheck finds the following problem,
    $ shellcheck kdump-lib.sh
    In kdump-lib.sh line 876:
            get_recommend_size "$sys_mem" "$ck_cmdline"
                                           ^---------^ SC2154: ck_cmdline is referenced but not assigned (did you mean '_ck_cmdline'?).

    s/ck_cmdline/_ck_cmdline to fix kdump_get_arch_recommend_size.

    Note s/sys_mem/_sys_mem as well to make the changes consistent.

    Fixes: 105c016 ("factor out kdump_get_arch_recommend_crashkernel")
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Acked-by: Tao Liu <ltao@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-01-26 14:41:01 +08:00
Coiby Xu
4e511ca370 remove the upper bound of 102400T for the range in default crashkernel
Resolves: bz2045969
Upstream: Fedora
Conflict: None

commit c67a836cde
Author: Coiby Xu <coxu@redhat.com>
Date:   Tue Jan 25 16:16:58 2022 +0800

    remove the upper bound of 102400T for the range in default crashkernel

    This patch makes the default crashkernel value consistent with previous
    one.

    Fixes: 105c016 ("factor out kdump_get_arch_recommend_crashkernel")
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-01-26 14:40:17 +08:00
Coiby Xu
e943713401 fix the error of parsing the container environ variable for osbuild
Resolves: bz2024976
Upstream: Fedora
Conflict: None

commit 6a3ce83a60
Author: Coiby Xu <coxu@redhat.com>
Date:   Wed Jan 19 11:16:29 2022 +0800

    fix the error of parsing the container environ variable for osbuild

    The environment variable entries in /proc/[pid]/environ are separated by
    null bytes instead of by spaces. Update the sed regex to fix this issue.

    Note that,
      1. this patch also fixes a issue which is kdumpctl would try to reset
         crashkernel even osbuild has provided custom crashkernel value.
      2. kernel hook 92-crashkernel.install installed by kexec-tools is
         guaranteed to be ran by kernel-install. kexec-tools doesn't recommend
         kernel so there is no guarantee kernel is installed after kexec-tools.
         But dnf invokes kernel-install in the posttrans scriptlet (of kernel-core)
         which is always ran after all packages including kexec-tools and kernel
         in a dnf transaction.
      3. To be able to do unit tests, the logic of reading environment variable
         has been extracted as a separate function.

    Fixes: ddd428a ("set up kernel crashkernel for osbuild in kernel hook")
    Signed-off-by: Coiby Xu <coxu@redhat.com>
    Reviewed-by: Pingfan Liu <piliu@redhat.com>
    Reviewed-by: Philipp Rudo <prudo@redhat.com>

Signed-off-by: Coiby Xu <coxu@redhat.com>
2022-01-26 14:39:13 +08:00
Pingfan Liu
149a6dc9b2 mkdumprd (RHEL-only): add nvme module by force
Resolves: bz2017121
Upstream: RHEL-only

The following dracut commit can not detect the nvme module in some special
case, which causes vmcore dump failure.

commit c86f4d286000d1e76fd405560b4114537e2cbbff
Author: Pingfan Liu <piliu@redhat.com>
Date:   Wed Jul 28 18:13:43 2021 +0800

    fix(kernel-modules): detect block device's hardware driver

As a workaround, adding nvme module by force at present, and after a
real fix in dracut, we can revert this patch.

Signed-off-by: Pingfan Liu <piliu@redhat.com>
2022-01-26 11:19:56 +08:00
Philipp Rudo
4559f7bbd5 Remove dropped patches
Resolves: bz2041911
Upstream: RHEL-only

When updating makedumpfile to 1.7.0 the spec file was updated to no
longer apply these patches without removing the actual patch files.
Remove them now.

Fixes: d77fd26 ("Update makedumpfile to 1.7.0")
Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-01-19 10:35:02 +01:00