fc66e25f7b
Upstream: fedora
Resolves: RHEL-70214
Conflict: Yes, the conflict is the same as in the original c9s commits
c5aa4609 ("Introduce vmcore creation notification to kdump")
9ec61f6c ("Return the correct exit code of rebuild initrd")

This patch also cherry-picks the IPv6 fix from [1].

[1]: https://github.com/rhkdump/kdump-utils/pull/60/files

commit 24e76222c740def1d03a506652400fe55959e024
Author: Tao Liu <ltao@redhat.com>
Date: Fri Nov 29 16:15:18 2024 +1300

    Re-introduce vmcore creation notification to kdump

    Motivation
    ==========

    People may forget to recheck that kdump still works, and as a result
    no vmcore may be generated after a real system crash. That is
    unexpected for kdump.

    It is highly recommended to test kdump after any system modification,
    such as:

    a. after kernel patching or a whole yum update, as it might break
       something on which kdump depends, for example by introducing a
       new bug.
    b. after any change at the hardware level, such as storage,
       networking, or a firmware upgrade.
    c. after deploying any new application, for example one that involves
       3rd-party modules.

    Although these are beyond the scope of kdump, a simple vmcore creation
    status notification is good to have for now.

    Design
    ======

    On (re)start, kdump currently checks whether any related
    files/fs/drivers have been modified in order to determine whether the
    initrd should be rebuilt. A rebuild indicates such a modification, and
    kdump then needs to be tested. A rebuild clears the vmcore creation
    status recorded in $VMCORE_CREATION_STATUS, and as a result a
    notification about the vmcore creation test is printed.

    Kdump can be tested with "kdumpctl test". It generates a timestamp
    string as the ID of the current test and writes it, along with a
    "pending" status, to $VMCORE_CREATION_STATUS. Then a real crash & dump
    process is triggered. After the system reboots back to normal, a
    vmcore creation check starts at "kdumpctl (re)start/status" and
    reports the result to users as a success/fail/manual status.

    To achieve that, the program first checks the status in
    $VMCORE_CREATION_STATUS. If a "pending" status is found, the test
    result is undetermined and needs to be retrieved from the remote/local
    dump folder:

    - If the test id is found in the dump folder and the vmcore is
      complete, "pending" is overwritten by "success", which indicates a
      successful kdump test.
    - If the test id is found in the dump folder but the vmcore is
      incomplete, it is a "fail" kdump test.
    - If no test id is found, it is a "manual" status, which indicates
      that users should check the test results manually.

    If $VMCORE_CREATION_STATUS already holds a success/fail/manual status,
    the test result has already been determined, so the program does not
    access the remote/local dump folder again. This limits unnecessary
    access to the dump target and shortens the time taken.

    Users should look for the root cause when a fail/manual status is
    reported.

    $VMCORE_CREATION_STATUS records the vmcore creation status of the
    current environment. The format is:

       <status> kdump_test_id=<timestamp sec>-<timestamp nanosec>

    e.g:

       success kdump_test_id=1729823462-938751820

    This means there has been a successful kdump test at
    $(date -d "@1729823462") for the current environment. The nanosecond
    part only serves to make the ID string unique.

    Difference
    ==========

    Previously, commit 88525ebf ("Introduce vmcore creation notification
    to kdump") was merged to address the same issue, but it was
    implemented differently:

    The previous one:
       Saves $VMCORE_CREATION_STATUS to the local drive during the 2nd
       kernel dumping. If the vmcore dumping target is different from the
       drive holding $VMCORE_CREATION_STATUS, the latter needs to be
       mounted in the 2nd kernel.

    This one:
       Saves $VMCORE_CREATION_STATUS to the local drive only in the 1st
       kernel, that is, the test result is retrieved after the 2nd kernel
       dumping. So it doesn't load or mount any other drive in the 2nd
       kernel.

    The advantage:
       Extra mounting in the 2nd kernel introduces a higher risk of
       failure and therefore lowers the success rate of vmcore dumping,
       which is unacceptable. So keeping the code for the 2nd kernel as
       simple as possible is preferred.

    Usage
    =====

    [root@localhost ~]# kdumpctl restart
    kdump: kexec: unloaded kdump kernel
    kdump: Stopping kdump: [OK]
    kdump: kexec: loaded kdump kernel
    kdump: Starting kdump: [OK]
    kdump: Notice: No vmcore creation test performed!

    [root@localhost ~]# kdumpctl status
    kdump: Kdump is operational
    kdump: Notice: No vmcore creation test performed!

    [root@localhost ~]# kdumpctl test

    [root@localhost ~]# cat /var/lib/kdump/vmcore-creation.status
    pending kdump_test_id=1729823462-938751820

    [root@localhost ~]# kdumpctl status
    kdump: Kdump is operational
    kdump: Notice: Last successful vmcore creation on Fri Oct 25 02:31:02 AM UTC 2024

    [root@localhost ~]# cat /var/lib/kdump/vmcore-creation.status
    success kdump_test_id=1729823462-938751820

    [root@localhost ~]# kdumpctl restart
    kdump: kexec: unloaded kdump kernel
    kdump: Stopping kdump: [OK]
    kdump: kexec: loaded kdump kernel
    kdump: Starting kdump: [OK]
    kdump: Notice: Last successful vmcore creation on Fri Oct 25 02:31:02 AM UTC 2024

    Note: the notification for kdumpctl (re)start/status can be disabled
    by setting VMCORE_CREATION_NOTIFICATION in /etc/sysconfig/kdump.
    fadump is NOT supported for this feature.

    Signed-off-by: Tao Liu <ltao@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
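The status-file format and the way a "pending" status is resolved, as described in the commit message above, can be sketched roughly as follows. This is a minimal illustration and not the kdumpctl implementation: the function names `parse_status_line` and `resolve_status` are hypothetical, and the checks against the dump folder are reduced to two yes/no arguments that the caller is assumed to supply.

```shell
# Hypothetical helpers for the documented status format:
#   <status> kdump_test_id=<timestamp sec>-<timestamp nanosec>

# Split a status line into STATUS, TEST_SEC and TEST_NSEC.
# Returns non-zero if the line does not match the documented format.
parse_status_line() {
    STATUS=${1%% *}   # text before the first space
    _id=${1#* }       # text after the first space
    case $STATUS in
        pending|success|fail|manual) ;;
        *) return 1 ;;
    esac
    case $_id in
        kdump_test_id=*-*) ;;
        *) return 1 ;;
    esac
    _id=${_id#kdump_test_id=}
    TEST_SEC=${_id%%-*}
    TEST_NSEC=${_id#*-}
    # Both halves of the id must be non-empty and purely numeric.
    for _part in "$TEST_SEC" "$TEST_NSEC"; do
        case $_part in
            ''|*[!0-9]*) return 1 ;;
        esac
    done
    return 0
}

# Print the final status.
#   $1: current status from the status file
#   $2: "yes" if the test id was found in the dump folder (assumed input)
#   $3: "yes" if the vmcore is complete (assumed input)
resolve_status() {
    if [ "$1" != pending ]; then
        # Already determined: the dump folder is not consulted again.
        echo "$1"
    elif [ "$2" != yes ]; then
        echo manual
    elif [ "$3" = yes ]; then
        echo success
    else
        echo fail
    fi
}

parse_status_line "pending kdump_test_id=1729823462-938751820"
echo "$STATUS $TEST_SEC $TEST_NSEC"    # prints: pending 1729823462 938751820
resolve_status "$STATUS" yes yes       # prints: success
```

Keeping the resolution step a pure function of three inputs mirrors the design goal stated in the commit message: once the status is no longer "pending", nothing needs to touch the dump target.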
61 lines
2.6 KiB
Plaintext
# Kernel Version string for the -kdump kernel, such as 2.6.13-1544.FC5kdump
# If no version is specified, then the init script will try to find a
# kdump kernel with the same version number as the running kernel.
KDUMP_KERNELVER=""

# The kdump commandline is the command line that needs to be passed off to
# the kdump kernel. This will likely match the contents of the grub kernel
# line. For example:
# KDUMP_COMMANDLINE="ro root=LABEL=/"
# Dracut depends on proper root= options, so please make sure that appropriate
# root= options are copied from /proc/cmdline. In general it is best to append
# command line options using "KDUMP_COMMANDLINE_APPEND=".
# If a command line is not specified, the default will be taken from
# /proc/cmdline
KDUMP_COMMANDLINE=""

# This variable lets us remove arguments from the current kdump commandline
# as taken from either KDUMP_COMMANDLINE above, or from /proc/cmdline
# NOTE: some arguments such as crashkernel will always be removed
KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet log_buf_len swiotlb cma hugetlb_cma ignition.firstboot"

# This variable lets us append arguments to the current kdump commandline
# after it has been processed by KDUMP_COMMANDLINE_REMOVE
KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 acpi_no_memhotplug transparent_hugepage=never nokaslr hest_disable novmcoredd cma=0 hugetlb_cma=0 pcie_ports=compat"

# Any additional kexec arguments required. In most situations, this should
# be left empty
#
# Example:
# KEXEC_ARGS="--elf32-core-headers"
KEXEC_ARGS="-s"

# Where to find the boot image
#KDUMP_BOOTDIR="/boot"

# What is the image type used for kdump
KDUMP_IMG="vmlinuz"

# What is the image's extension. Relocatable kernels don't have one
KDUMP_IMG_EXT=""

# Enable vmcore creation notification by default, disable by setting
# VMCORE_CREATION_NOTIFICATION=""
VMCORE_CREATION_NOTIFICATION="yes"

# Logging is controlled by following variables in the first kernel:
# - @var KDUMP_STDLOGLVL - logging level to standard error (console output)
# - @var KDUMP_SYSLOGLVL - logging level to syslog (by logger command)
# - @var KDUMP_KMSGLOGLVL - logging level to /dev/kmsg (only for boot-time)
#
# In the second kernel, kdump will use the rd.kdumploglvl option to set the
# log level in the above KDUMP_COMMANDLINE_APPEND.
# - @var rd.kdumploglvl - logging level to syslog (by logger command)
# - for example: add the rd.kdumploglvl=3 option to KDUMP_COMMANDLINE_APPEND
#
# Logging levels: no logging(0), error(1), warn(2), info(3), debug(4)
#
# KDUMP_STDLOGLVL=3
# KDUMP_SYSLOGLVL=0
# KDUMP_KMSGLOGLVL=0
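The comments above say that KDUMP_COMMANDLINE_REMOVE is applied first and KDUMP_COMMANDLINE_APPEND afterwards. The sketch below illustrates that ordering only. It is not the logic kdumpctl actually uses: the function `build_kdump_cmdline` is hypothetical, it matches arguments by name (either `name` or `name=value`), it does not handle quoted values containing spaces, and it does not perform the unconditional removal of arguments such as crashkernel mentioned in the NOTE.

```shell
# Hypothetical helper: print a kdump command line derived from
#   $1: base command line (e.g. KDUMP_COMMANDLINE or /proc/cmdline contents)
#   $2: space-separated argument names to remove
#   $3: arguments to append afterwards
build_kdump_cmdline() {
    local base=$1 remove=$2 append=$3
    local out="" arg name r drop

    # Word splitting on $base and $remove is intentional here.
    for arg in $base; do
        name=${arg%%=*}          # "hugepages=16" -> "hugepages"
        drop=no
        for r in $remove; do
            if [ "$name" = "$r" ]; then
                drop=yes
            fi
        done
        if [ "$drop" = no ]; then
            out="$out $arg"
        fi
    done

    # Appending happens only after the removal pass.
    if [ -n "$append" ]; then
        out="$out $append"
    fi
    echo "${out# }"              # strip the single leading space
}

build_kdump_cmdline "ro root=LABEL=/ quiet hugepages=16" \
    "hugepages quiet" "irqpoll nr_cpus=1"
# prints: ro root=LABEL=/ irqpoll nr_cpus=1
```

Because appending comes last, an argument listed in both variables (such as `cma` and `cma=0` in the defaults above) ends up present with the appended value.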