kexec-tools

Author	SHA1	Message	Date
Coiby Xu	6d4062a936	try to update the crashkernel in GRUB_ETC_DEFAULT after kexec-tools updates the default crashkernel value If GRUB_ETC_DEFAULT use crashkernel=auto or crashkernel=OLD_DEFAULT_CRASHKERNEL, it should be updated as well. Add a helper function to read kernel cmdline parameter from GRUB_ETC_DEFAULT. This function is used to read kernel cmdline parameter like fadump or crashkernel. Reviewed-by: Philipp Rudo <prudo@redhat.com>	2022-03-01 10:29:20 +08:00
Coiby Xu	37f4f2c1f6	address the case where there are multiple values for the same kernel arg There is the case where there are multiple entries of the same parameter on the command line, e.g. GRUB_CMDLINE_LINUX="crashkernel=110M crashkernel=220M fadump=on crashkernel=330M". In such a situation _update_kernel_cmdline_in_grub_etc_default only updates/removes the last entry which is usually not what you want as the kernel (for crashkernel) takes the last entry it can find. Thus make sure the case with multiple entries of the same parameter is handled properly by removing all occurrences of the given parameter first. Note 1. sed command group and conditional control have been used to get rid of grep. 2. Fully supporting kernel cmdline as documented in Documentation/admin-guide/kernel-parameters.rst is complex and in the foreseeable future a full implementation is not needed. So simply document the unsupported cases instead. Fixes: `140da74` ("rewrite reset_crashkernel to support fadump and to used by RPM scriptlet") Reported-by: Philipp Rudo <prudo@redhat.com> Suggested-by: Philipp Rudo <prudo@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com>	2022-03-01 10:28:53 +08:00
Coiby Xu	41b8f9528c	fix incorrect usage of _get_all_kernels_from_grubby It's found that the kernel cmdline crashkernel=auto doesn't get updated when upgrading kexec-tools. This happens because _get_all_kernels_from_grubby is called with no argument by reset_crashkernel_after_update. When retrieving all kernel paths on the system, "grubby --info ALL" should be used. Fix this error by passing "ALL" argument. Fixes: `0adb0f4` ("try to reset kernel crashkernel when kexec-tools updates the default crashkernel value") Reported-by: Jie Li <jieli@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com> Reviewed-by: Tao Liu <ltao@redhat.com>	2022-02-14 10:34:55 +08:00
Coiby Xu	5111c01334	fix the mistake of swapping function parameters of read_proc_environ_var _is_osbuild fails because it expects the 1st and 2nd function parameter to be the environment variable and environ file path respectively. Fix it by swapping the parameters in read_proc_environ_var. Note the osbuild environ file path is defined in _OSBUILD_ENVIRON_PATH so _is_osbuild can be unit-tested by overwriting _OSBUILD_ENVIRON_PATH. Fixes: `6a3ce83` ("fix the error of parsing the container environ variable for osbuild") Signed-off-by: Coiby Xu <coxu@redhat.com> Reviewed-by: Pingfan Liu <piliu@redhat.com>	2022-02-08 10:42:40 +08:00
Coiby Xu	6a3ce83a60	fix the error of parsing the container environ variable for osbuild The environment variable entries in /proc/[pid]/environ are separated by null bytes instead of by spaces. Update the sed regex to fix this issue. Note that, 1. this patch also fixes an issue where kdumpctl would try to reset crashkernel even if osbuild has provided a custom crashkernel value. 2. kernel hook 92-crashkernel.install installed by kexec-tools is guaranteed to be run by kernel-install. kexec-tools doesn't recommend kernel so there is no guarantee kernel is installed after kexec-tools. But dnf invokes kernel-install in the posttrans scriptlet (of kernel-core) which is always run after all packages including kexec-tools and kernel in a dnf transaction. 3. To be able to do unit tests, the logic of reading environment variable has been extracted as a separate function. Fixes: `ddd428a` ("set up kernel crashkernel for osbuild in kernel hook") Signed-off-by: Coiby Xu <coxu@redhat.com> Reviewed-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com>	2022-01-26 08:32:06 +08:00
Coiby Xu	ae0cbdf34a	fix "kdump: Invalid kdump config option auto_reset_crashkernel" error kdumpctl only accepts a specified set of options. Add auto_reset_crashkernel to this set. Fixes: `73ced7f` ("introduce the auto_reset_crashkernel option to kdump.conf") Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Tao Liu <ltao@redhat.com>	2022-01-07 12:20:37 +08:00
Coiby Xu	d5c31605f3	use grep -s to suppress error messages about nonexistent or unreadable files When a file doesn't exist or isn't readable, grep complains as follows, grep: /proc/cmdline: No such file or directory grep: /etc/kernel/cmdline: No such file or directory /proc/cmdline doesn't exist when installing packages for an OS image and /etc/kernel/cmdline may not exist if osbuild doesn't want to set a custom kernel cmdline. Use "-s" to suppress the error messages. Fixes: `0adb0f4` ("try to reset kernel crashkernel when kexec-tools updates the default crashkernel value") Fixes: `ddd428a` ("set up kernel crashkernel for osbuild in kernel hook") Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Tao Liu <ltao@redhat.com>	2022-01-07 12:20:21 +08:00
Coiby Xu	ddd428a1d0	set up kernel crashkernel for osbuild in kernel hook osbuild is a tool to build OS images. It uses bwrap to install packages inside a sandbox/container. Since the kernel package recommends kexec-tools which in turn recommends grubby, the installation order would be grubby -> kexec-tools -> kernel. So we can use the kernel hook 92-crashkernel.install provided by kexec-tools to set up kernel crashkernel for the target OS image. But in osbuild's case, there is no current running kernel and running `uname -r` in the container/sandbox actually returns the host kernel release. To set up kernel crashkernel for the OS image built by osbuild, a different logic is needed. We will check if the kernel hook is running inside the osbuild container, then set up kernel crashkernel only if osbuild hasn't specified a custom value. osbuild exposes [1] the container=bwrap-osbuild environment variable. According to [2], the environment variable is not inherited down the process tree, so we need to check /proc/1/environ to detect this environment variable to tell if the kernel hook is running inside a bwrap-osbuild container. After that we need to know if osbuild wants to use a custom crashkernel value. This is done by checking if /etc/kernel/cmdline has crashkernel set [3]. /etc/kernel/cmdline is written before packages are installed. [1] https://github.com/osbuild/osbuild/pull/926 [2] https://systemd.io/CONTAINER_INTERFACE/ [3] https://bugzilla.redhat.com/show_bug.cgi?id=2024976#c5 Reviewed-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com>	2022-01-05 09:40:24 +08:00
Coiby Xu	5e8c751c39	reset kernel crashkernel for the special case where the kernel is updated right after kexec-tools When kexec-tools updates the default crashkernel value, it will try to reset the existing installed kernels including the currently running kernel. So the running kernel could have different kernel cmdline parameters from /proc/cmdline. When installing a kernel after updating kexec-tools, /usr/lib/kernel/install.d/20-grub.install would be called by kernel-install [1] which would use /proc/cmdline to set up the new kernel's cmdline. To address this special case, use the installation hook to reset the new kernel's crashkernel and fadump values to the values that would be used by the running kernel after rebooting. One side effect of this commit is that it would reset the installed kernel's crashkernel even if the currently running kernel doesn't use the default crashkernel value after rebooting. But I think this side effect is a benefit for the user. The implementation depends on kernel-install which runs the scripts in /usr/lib/kernel/install.d passing the following arguments, add KERNEL-VERSION $BOOT/MACHINE-ID/KERNEL-VERSION/ KERNEL-IMAGE [INITRD-FILE ...] A concrete example is given as follows, add 5.11.12-300.fc34.x86_64 /boot/e986846f63134c7295458cf36300ba5b/5.11.12-300.fc34.x86_64 /lib/modules/5.11.12-300.fc34.x86_64/vmlinuz kernel-install could be started by the kernel package's RPM scriptlet [2]. As mentioned in previous commit "try to reset kernel crashkernel when kexec-tools updates the default crashkernel value", kdumpctl has difficulty running in RPM scriptlets for CoreOS. But since rpm-ostree ignores all kernel hooks, there is no need to disable the kernel hook for CoreOS/Atomic/Silverblue. But a collaboration between rpm-ostree and kexec-tools is needed [3] to take care of this special case. Note the crashkernel.default support is dropped. [1] https://www.freedesktop.org/software/systemd/man/kernel-install.html [2] https://src.fedoraproject.org/rpms/kernel/blob/rawhide/f/kernel.spec#_2680 [3] https://github.com/coreos/rpm-ostree/issues/2894 Reviewed-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com>	2022-01-05 09:40:24 +08:00
Coiby Xu	0adb0f4a8c	try to reset kernel crashkernel when kexec-tools updates the default crashkernel value kexec-tools could update the default crashkernel value. When auto_reset_crashkernel=yes, reset kernel to new crashkernel value in the following two cases, - crashkernel=auto is found in the kernel cmdline - the kernel crashkernel was previously set by kexec-tools i.e. the kernel is using old default crashkernel value To tell if the user is using a custom value for the kernel crashkernel or not, we assume the user would never use the default crashkernel value as custom value. When kexec-tools gets updated, 1. save the default crashkernel value of the older package to /tmp/crashkernel (for POWER system, /tmp/crashkernel_fadump is saved as well). 2. If auto_reset_crashkernel=yes, iterate over all installed kernels. For each kernel, compare its crashkernel value with the old default crashkernel and reset it if they match. The implementation makes use of two RPM scriptlets [2], - %pre is run before a package is installed so we can use it to save old default crashkernel value - %post is run after a package is installed so we can use it to try to reset kernel crashkernel There are several problems when running kdumpctl in the RPM scripts for CoreOS/Atomic/Silverblue, for example, the lock can't be acquired by kdumpctl, "rpm-ostree kargs" can't be run, etc. So don't enable this feature for CoreOS/Atomic/Silverblue. Note latest shellcheck (0.8.0) gives false positives about the associative array as of this commit. And Fedora's shellcheck is 0.7.2 and can't even correctly parse the shell code because of the associative array. [1] https://github.com/koalaman/shellcheck/issues/2399 [2] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/ Reviewed-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com>	2022-01-05 09:40:24 +08:00
Coiby Xu	140da74a34	rewrite reset_crashkernel to support fadump and to used by RPM scriptlet Rewrite kdumpctl reset-crashkernel KERNEL_PATH as kdumpctl reset-crashkernel [--fadump=[on|off|nocma]] [--kernel=path_to_kernel] [--reboot] This interface would reset a specific kernel to the default crashkernel value given the kernel path. And it also supports grubby's syntax so there are the following special cases, - if --kernel not specified, - use KDUMP_KERNELVER if it's defined in /etc/sysconfig/kdump - otherwise use current running kernel, i.e. `uname -r` - if --kernel=DEFAULT, the default boot kernel is chosen - if --kernel=ALL, all kernels would have their crashkernel reset to the default value and /etc/default/grub is updated as well --fadump=[on|off|nocma] toggles fadump on/off for the kernel provided in KERNEL_PATH. If --fadump is omitted, the dump mode is determined by parsing the kernel command line for the kernel(s) to update. CoreOS/Atomic/Silverblue needs to be treated as a special case because, - "rpm-ostree kargs" is used to manage kernel command line parameters so --kernel doesn't make sense and there is no need to find current running kernel - "rpm-ostree kargs" itself would prompt the user to reboot the system after modifying the kernel command line parameter - POWER is not supported so we can assume the dump mode is always kdump This interface will also be called by kexec-tools RPM scriptlets [1] to reset crashkernel. Note the support of crashkernel.default is dropped. [1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/ Reviewed-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com>	2022-01-05 09:40:24 +08:00
Coiby Xu	12ecbce359	fix incorrect usage of rpm-ostree to update kernel command line parameters CoreOS/Atomic/Silverblue use "rpm-ostree kargs" to manage kernel command line parameters. Fixes: `86130ec` ("kdumpctl: Add kdumpctl reset-crashkernel") Reviewed-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com>	2022-01-05 09:40:24 +08:00
Coiby Xu	945cbbd59b	add helper functions to get kernel path by kernel release and the path of current running kernel grubby --info=kernel-path or --add-kernel=kernel-path accepts a kernel path (e.g. /boot/vmlinuz-5.14.14-200.fc34.x86_64) instead of kernel release (e.g. 5.14.14-200.fc34.x86_64). So we need to know the kernel path given a kernel release. Although for Fedora/RHEL, the kernel path is "/boot/vmlinuz-<KERNEL_RELEASE>", a kernel path could also be /boot/<machine-id>/<KERNEL_RELEASE>/vmlinuz. So the most reliable way to find the kernel path given a kernel release is to use "grubby --info". For osbuild, a kernel path may not yet exist but it's valid for "grubby --update-kernel=KERNEL_PATH". For example, "grubby --info" may output something as follows, index=0 kernel="/var/cache/osbuild-worker/osbuild-store/tmp/tmp2prywdy5object/tree/boot/vmlinuz-5.15.10-100.fc34.x86_64" args="ro no_timer_check net.ifnames=0 console=tty1 console=ttyS0,115200n8" root="UUID=76a22bf4-f153-4541-b6c7-0332c0dfaeac" initrd="/var/cache/osbuild-worker/osbuild-store/tmp/tmp2prywdy5object/tree/boot/initramfs-5.15.10-100.fc34.x86_64.img" There is no need to check if a path like /var/cache/osbuild-worker/osbuild-store/tmp/tmp2prywdy5object/tree/boot/vmlinuz-5.15.10-100.fc34.x86_64 physically exists. Note these helper functions don't support CoreOS/Atomic/Silverblue since grubby isn't used by them. Reviewed-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com>	2022-01-05 09:40:24 +08:00
Coiby Xu	3d2079c31c	add helper functions to get dump mode Add a helper function to get dump mode. The dump mode would be - fadump if fadump=on or fadump=nocma - kdump if fadump=off or empty fadump Otherwise return 1. Also add another helper function to return a kernel's dump mode. Reviewed-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com>	2022-01-05 09:40:24 +08:00
Coiby Xu	fb9e6838ab	add a helper function to read kernel cmdline parameter from grubby --info This helper function will be used to retrieve the value of kernel cmdline parameters including crashkernel, fadump, swiotlb and etc. Suggested-by: Philipp Rudo <prudo@redhat.com> Reviewed-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com>	2022-01-05 09:40:24 +08:00
Coiby Xu	796d0f6fd2	provide kdumpctl get-default-crashkernel for kdump_anaconda_addon and RPM scriptlet Provide "kdumpctl get-default-crashkernel" for kdump_anaconda_addon so crashkernel.default isn't needed. When fadump is on, kdump_anaconda_addon would need to specify the dump mode, i.e. "kdumpctl get-default-crashkernel fadump". This interface would also be used by RPM scriptlet [1] to fetch default crashkernel value. [1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/ Reviewed-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com>	2022-01-05 09:40:24 +08:00
Kairui Song	546c81a205	kdumpctl: remove some legacy code It seems the save_core function and vmcore detection were used a long time ago when kdump shared the same userspace in the first and second kernel. This code is now heavily outdated (it only supports cp, uses a hardcoded path, and dumpoops no longer exists) and is not used. Now vmcore will never show up in the first kernel for both the kdump and fadump cases, and kdumpctl is only used in the first kernel, so just remove them. Signed-off-by: Kairui Song <kasong@tencent.com> Acked-by: Coiby Xu <coxu@redhat.com>	2021-12-31 11:37:19 +08:00
Kairui Song	0e4b66b1ab	bash scripts: reformat with shfmt This is a batch update done with: shfmt -s -w mkfadumprd mkdumprd kdumpctl *-module-setup.sh Clean up code style and reduce code base size, no behaviour change. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	4f75e16700	bash scripts: declare and assign separately Declare and assign separately to avoid masking return values: https://github.com/koalaman/shellcheck/wiki/SC2155 Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	a4648fc851	bash scripts: fix redundant exit code check As suggested by: https://github.com/koalaman/shellcheck/wiki/SC2181 Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	86538ca6e2	bash scripts: fix variable quoting issue Fixed quoting issues found by shellcheck, no feature change. This should fix many errors when there is space in any shell variables, eg. dump target's name/path/id. False positives are marked with "# shellcheck disable=SCXXXX", for example, args are expected to split so it should not be quoted. And replaced some `cut -d ' ' -fX` with `awk '{print $X}'` since cut is fragile, and doesn't work well with any quoted strings that have redundant space. Following quoting related issues are fixed (check the link for example code and what could go wrong): https://github.com/koalaman/shellcheck/wiki/SC2046 https://github.com/koalaman/shellcheck/wiki/SC2053 https://github.com/koalaman/shellcheck/wiki/SC2068 https://github.com/koalaman/shellcheck/wiki/SC2086 https://github.com/koalaman/shellcheck/wiki/SC2206 Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	70978c00e5	bash scripts: replace '[ ]' with '[[ ]]' for bash scripts kdumpctl, mkdumprd, *-module-setup.sh only target bash, since they only run in first kernel and depend on dracut, and dracut depends on bash. So use '[[ ]]' to replace '[ ]'. This is a batch update done with following command: `sed -i -e 's/\(\s\)\[\s\([^]]*\)\s\]/\1\[\[\ \2 \]\]/g' kdumpctl, mkdumprd, *-module-setup.sh` and replaced [ ... -a ... ] with [[ ... ]] && [[ ... ]] manually. See https://tldp.org/LDP/abs/html/testconstructs.html for more details on '[[ ]]', it's more versatile, safer, and slightly faster than '[ ]'. This will also help shfmt to clean up the code in later commits. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	54cc5c44be	bash scripts: use $(...) notation instead of legacy `...` This is a batch update done with following command: `sed -i -e 's/`\([^`]*\)`/\$(\1)/g' mkfadumprd mkdumprd \ kdumpctl dracut-module-setup.sh dracut-fadump-module-setup.sh \ dracut-early-kdump-module-setup.sh` And manually converted some corner cases. This fixes all related issues detected by shellcheck. Make it easier to do clean up in later commits. Check following link for reasons to switch to the new syntax: https://github.com/koalaman/shellcheck/wiki/SC2006 Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	a416930706	bash scripts: always use "read -r" This helps to strip spaces and avoid mangling backslashes: https://github.com/koalaman/shellcheck/wiki/SC2162 Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	c4d85142be	bash scripts: get rid of expr and let As suggested by: https://github.com/koalaman/shellcheck/wiki/SC2219 Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	6d45257cc1	bash scripts: remove useless cat Some `cat` calls are useless, remove them to make it cleaner. See: https://github.com/koalaman/shellcheck/wiki/SC2002 Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	80525aface	kdumpctl: refine grep usage Use `grep -q` instead of redirect to /dev/null. Use `grep -c` instead, as suggested in: https://github.com/koalaman/shellcheck/wiki/SC2126 Use `grep -E` instead of `egrep`. https://github.com/koalaman/shellcheck/wiki/SC2196 Signed-off-by: Kairui Song <kasong@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	dfb76467c9	kdumpctl: fix fragile loops over find output For loops over find output are fragile, use a while read loop: https://github.com/koalaman/shellcheck/wiki/SC2044 Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	01613b7211	kdumpctl: use kdump_get_conf_val to read config values Also fixed kdumpctl, use `awk` instead of `cut` to read core_collector's executable name correctly when its arguments are not separated by space. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	09ccf88405	kdump-lib.sh: add a config value retrive helper Add a helper kdump_get_conf_val to replace get_option_value. It can help cover more corner cases in the code, like when there are multiple spaces in the config file, a config value separated by a tab, leading spaces, or trailing comments. And this uses "sed group command" and "sed hold buffer", making it much faster than the previous `grep <config> | tail -1`. This helper is supposed to provide a universal way for kexec-tools scripts to read in config values. Currently, different scripts are reading the config in many different fragile ways. For example, the following code is found in the kexec-tools script code base: 1. grep ^force_rebuild $KDUMP_CONFIG_FILE echo $_force_rebuild | cut -d' ' -f2 2. grep ^kdump_post $KDUMP_CONFIG_FILE | cut -d\ -f2 3. awk '/^sshkey/ {print $2}' $conf_file 4. grep ^path $KDUMP_CONFIG_FILE | cut -d' ' -f2- 1, 2, and 4 will fail if the space is replaced by, e.g. a tab. 1 and 2 might fail if there are multiple spaces between config name and config value: "kdump_post /var/crash/scripts/kdump-post.sh" A space will be read instead of the config value. 1, 2, 3 will fail if there is a space in the file path, like: "kdump_post /var/crash/scripts dir/kdump-post.sh" 4 will fail if there are trailing comments: "path /var/crash # some comment here" And all will fail if there are leading spaces, " path /var/crash" And all will most likely cause problems if the config file contains the same option more than once. And all of them are slower than the new sed call. Old get_option_value is also very slow and doesn't handle leading spaces. Although we never claimed to support leading spaces or trailing comments before, it's harmless to be more robust on config reading, and many conf files in /etc support leading spaces. And having a faster and safer config reading helper makes it easier to clean up the code. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	a0282ab22c	kdump-lib.sh: add a config format and read helper Add a helper `kdump_read_conf` to replace read_strip_comments. `kdump_read_conf` does a few more things: - remove trailing spaces. - format the content, remove duplicated spaces between name and value. - read from KDUMP_CONFIG_FILE (/etc/kdump.conf) directly, avoid pasting "/etc/kdump.conf" path everywhere in the code. - check if config file exists, just in case. Also unify the environmental variable, now KDUMP_CONFIG_FILE stands for the default config location. This helps avoid some shell pitfalls about spaces when reading config. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com>	2021-09-14 03:25:29 +08:00
Kairui Song	097059dedc	Clear old crashkernl=auto in comment and doc Acked-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Kairui Song <kasong@redhat.com>	2021-08-05 17:54:20 +08:00
Kairui Song	bcd8d6a47b	kdumpctl: fix a typo Recommanded -> Recommended Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Coiby Xu <coxu@redhat.com>	2021-07-20 15:57:05 +08:00
Kairui Song	86130ec10f	kdumpctl: Add kdumpctl reset-crashkernel In newer kernel, crashkernel.default will contain the default crashkernel value of a kernel build. So introduce a new sub command to help user reset kernel crashkernel size to the default value. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Baoquan He <bhe@redhat.com>	2021-07-08 15:18:45 +08:00
Kairui Song	bf6671b60d	fadump: kdumpctl should check the modules used by the fadump initramfs After fadump embedded the fadump initramfs in the normal initramfs, kdumpctl will mistakenly rebuild the initramfs every time. kdumpctl checks the hostonly-kernel-modules.txt file in initramfs to check if required drivers are included, but the normal initramfs is built in non-hostonly mode, so it doesn't have a hostonly-kernel-modules.txt file. The check will always fail. So let mkfadumprd make a copy of the hostonly-kernel-modules.txt in the fadump initramfs and let kdumpctl check that file instead. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com>	2021-06-30 17:27:02 +08:00
Hari Bathini	fa9201b240	fadump: isolate fadump initramfs image within the default one In case of fadump, the initramfs image has to be built to boot into the production environment as well as to offload the active crash dump to the specified dump target (for boot after crash). As the same image would be used for both boot scenarios, it could not be built optimally while accommodating both cases. Use --include to include the initramfs image built for offloading active crash dump to the specified dump target. Also, introduce a new out-of-tree dracut module (99zz-fadumpinit) that installs a customized init program while moving the default /init to /init.dracut. This customized init program is leveraged to isolate fadump image within the default initramfs image by kicking off default boot process (exec /init.dracut) for regular boot scenario and activating fadump initramfs image, if the system is booting after a crash. If squash is available, ensure default initramfs image is also built with squash module to reduce memory consumption in capture kernel. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-29 21:35:58 +08:00
Kairui Song	e9e6a2c745	kdumpctl: Add kdumpctl estimate Add rough estimation support. Currently, the following memory usages are checked by this sub command: - System RAM - Kdump Initramfs size - Kdump Kernel image size - Kdump Kernel module size - Kdump userspace user and other runtime allocated memory (currently simply using a fixed value: 64M) - LUKS encryption memory usage The output of kdumpctl estimate looks like this: # kdumpctl estimate Reserved crashkernel: 256M Recommanded crashkernel: 160M Kernel image size: 47M Kernel modules size: 12M Initramfs size: 19M Runtime reservation: 64M Large modules: xfs: 1892352 nouveau: 2318336 And if the kdump target is encrypted: # kdumpctl estimate Encrypted kdump target requires extra memory, assuming using the keyslot with minimun memory requirement Reserved crashkernel: 256M Recommanded crashkernel: 655M Kernel image size: 47M Kernel modules size: 12M Initramfs size: 19M Runtime reservation: 64M LUKS required size: 512M Large modules: xfs: 1892352 nouveau: 2318336 WARNING: Current crashkernel size is lower than recommanded size 655M. The "Recommanded" value is calculated based on the memory usages mentioned above, and will be adjusted accordingly to be no less than the value provided by kdump_get_arch_recommend_size. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2021-05-19 15:27:43 +08:00
Kairui Song	6137956f79	kdumpctl: fix check_config error when kdump.conf is empty The kdump scripts already have default values for core_collector and path in many other places. An empty kdump.conf still works. Fix this corner case and fix the error message. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2021-04-28 18:05:12 +08:00
Kelvin Fan	75bdcb7399	Write to `/var/lib/kdump` if $KDUMP_BOOTDIR not writable The `/boot` directory on some operating systems might be read-only. If we cannot write to `$KDUMP_BOOTDIR` when generating the kdump initrd, attempt to place the generated initrd at `/var/lib/kdump` instead. Signed-off-by: Kelvin Fan <kelvinfan001@gmail.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-04-19 16:11:17 +08:00
Pingfan Liu	596fa0a07f	kdumpctl: enable secure boot on ppc64le LPARs On ppc64le LPAR, secure boot is a little different from bare metal, where host secure boot: /ibm,secure-boot/os-secureboot-enforcing DT property exists while guest secure boot: /ibm,secure-boot >= 2. Make kexec-tools adapt to LPAR. Signed-off-by: Pingfan Liu <piliu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-02-23 09:45:54 +08:00
Kairui Song	02202aa70f	logger: source the logger file individually Sourcing the logger file in kdump-lib.sh will leak kdump helpers to dracut, because module-setup.sh will source kdump-lib.sh. This will make kdump's functions override dracut's ones, and lead to unexpected behaviours. So include kdump-logger.sh individually and only source it where it is really needed. For module-setup.sh, simply using dracut's logger helper is good enough, so just source kdump-logger.sh in kdump-only scripts. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Lianbo Jiang <lijiang@redhat.com>	2021-01-20 14:13:44 +08:00
Pingfan Liu	0bd0c5b9f1	kdumpctl: fix a variable expansion in check_fence_kdump_config() Both $ipaddrs and $node can hold multiple strings, so use "" to brace them. Signed-off-by: Pingfan Liu <piliu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-01-06 13:28:46 +08:00
Kairui Song	4464bcf8f3	kdump-lib.sh: Use a more generic helper to detect omitted dracut module Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Lianbo Jiang <lijiang@redhat.com>	2020-11-30 15:25:26 +08:00
Kairui Song	647aa56b53	Fix the watchdog drivers detection code The watchdog detection code is currently broken: it gets the list of active watchdog drivers, then checks whether they are set in /etc/cmdline.d/* as preload modules. But after we switched to the squash module, /etc/cmdline.d/* is not directly visible. So just detect whether the currently needed driver is installed. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Lianbo Jiang <lijiang@redhat.com>	2020-11-30 15:25:19 +08:00
Kairui Song	276de0f810	Remove a redundant nfs check In check_fs_modified, is_nfs_dump_target is already called, the dump target can't be nfs. No need to check here. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Lianbo Jiang <lijiang@redhat.com>	2020-11-30 15:25:06 +08:00
Kairui Song	d54e5bab0f	kdumpctl: split the driver detection from fs dection function The driver detection has nothing to do with fs detection, and currently if the dump target is raw, the block driver detection is skipped, which is wrong. Just split it out and run the block driver detection when the dump target is fs or raw. Also simplified the code a bit. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Lianbo Jiang <lijiang@redhat.com>	2020-11-30 15:24:45 +08:00
Lianbo Jiang	cd85fe9165	Add code comments to help better understanding Let's add some code comments to help better understanding, and no code changes. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Baoquan He <bhe@redhat.com>	2020-11-12 13:59:21 +08:00
Kairui Song	b9a1f461a8	Fix error when using raw target with opalcore Commit `08276e9` wrongly raised this warning message to error level; fix this. Fixes: `08276e9` ('Rework check_config and warn on any duplicated option') Signed-off-by: Kairui Song <kasong@redhat.com>	2020-10-27 17:37:14 +08:00
Lianbo Jiang	88a8b94de9	kdumpctl: add the '-d' option to enable the kexec loading debugging messages Currently, the kexec option '--debug/-d' is not enabled by default, which means that users need to set it manually and wait for the next failure to capture the additional information. Therefore, let's enable the option '-d' for kexec loading by default. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2020-10-27 17:34:03 +08:00
Lianbo Jiang	3b743ae6ae	enable the logger for kdump Since the logger was introduced into kdump, let's enable it for kdump so that we can output kdump messages according to the log level and save these messages for debugging. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2020-10-27 17:33:54 +08:00
Kairui Song	08276e9f7a	Rework check_config and warn on any duplicated option Instead of reading and parsing kdump.conf multiple times, read it only once and use a single loop to handle the error checks, which is faster. Also check for any duplicated config option, and error out if there are duplicated ones. Now it checks for the following errors, most are unchanged from before: - Any duplicated config options. (Newly added) - Deprecated/Invalid kdump config option. - Duplicated kdump target, which has a different error message from other duplicated config options. - Duplicated --mount options in dracut_args. - Empty config values. All kdump configs should be in "<config_opt> <config_value>" format. - Whether a raw target is used in fadump mode. Also removed the detection of lines starting with a space; they will not break kdump anyway. The performance of the check_config function is measurably better than before. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2020-10-27 17:07:54 +08:00
Kairui Song	a37f36ad4d	Refactor kernel image and initrd detection code Kernel installation is not always in the fixed location /boot; there are multiple different styles of kernel installation, and the initramfs location changes with the kernel. The two files should be detected together and adapt to the different styles. To do so we use a list of known installation destinations, and a list of possible kernel image and initrd names. Iterate over the two lists to detect the installation location of the two files. If GRUB is in use, the BOOT_IMAGE= cmdline from GRUB will also be considered. The user-specified config is preferred if given. The previous atomic workaround is no longer needed as the new detection method can cover that case. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2020-08-27 11:29:17 +08:00
Pingfan Liu	f96172d353	kdumpctl: exit if either pre.d or post.d is missing It is hard to detect the time that /etc/kdump is removed. And this failure may cause an out-of-date kdump.initrd. To keep things simple, just exit if /etc/kdump/pre.d or post.d does not exist. Signed-off-by: Pingfan Liu <piliu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2020-07-30 16:47:10 +08:00
Pingfan Liu	25824d64cd	kdumpctl: detect modification of scripts by its directory's timestamp Checking modification against a file can not detect a removing file in "/etc/kdump/post.d/ /etc/kdump/pre.d/". Hence it also needs the modified time of directory to detect such changes. Signed-off-by: Pingfan Liu <piliu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2020-07-20 16:18:42 +08:00
Lianbo Jiang	073646998f	Revert "kdump-lib: switch to the kexec_file_load() syscall on x86_64 by default" This reverts commit `6a20bd5447`. Let's restore the logic of secureboot status check, and remove the option 'KDUMP_FILE_LOAD=on|off'. We will use the option KEXEC_ARGS="-s" to enable the kexec file load later, which can avoid failures when the secureboot is enabled. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2020-07-01 17:07:46 +08:00
Kairui Song	a29de38da5	Always wrap up call to dracut get_persistent_dev function Dracut's get_persistent_dev function doesn't recognize the UUID= or LABEL= format, so the caller should convert it to the path of the block device before calling it. There is already such a helper, "kdump_get_persistent_dev"; just move it to kdump-lib.sh and rename it so it can be reused. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2020-06-22 19:58:08 +08:00
onitsuka.shinic@fujitsu.com	bdd57a5864	kdumpctl: Check the update of the binary and script files in /etc/kdump/{pre.d,post.d} This patch adds the binary and script files in /etc/kdump/{pre.d,post.d} to modified checklist in order to update kdump initramfs when one adds new scripts or binaries or removes the existing ones under /etc/kdump/{pre.d, post.d}. Signed-off-by: Shinichi Onitsuka <onitsuka.shinic@fujitsu.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2020-06-11 12:59:15 +08:00
Kairui Song	61e016939c	Use get_mount_info to replace findmnt calls Use get_mount_info so that fstab is used as a fallback when looking for mount info. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2020-05-22 16:14:02 +08:00
Kairui Song	70deeb474b	Allow calling mkdumprd from kdumpctl even if target not mounted Ignore the mount check in kdumpctl; mkdumprd will still fail building and exit if the target is not mounted. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2020-05-22 16:13:49 +08:00
Kairui Song	0624148414	Add a is_mounted helper Use the is_mounted helper instead of calling findmnt directly or checking if the "mount" value is empty. If findmnt looks at fstab as well, some non-mounted entries will also return a value. Required to support non-mounted targets. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2020-05-22 16:13:24 +08:00
Kairui Song	43ea36b3e8	Introduce get_kdump_mntpoint_from_target and fix duplicated / Use a helper to get the path to mount the dump target in the kdump kernel, and fix the duplicated '/' in the mount path. Fixes: bz1785371 Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2020-05-22 16:13:02 +08:00
Kairui Song	c1c7f004c8	Remove is_dump_target_configured It's basically same with is_user_configured_dump_target and only have one caller. And the name is confusing, the dump target is always configured, it's either user configured or path based. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Lianbo Jiang <lijiang@redhat.com>	2020-03-30 22:05:31 +08:00
Kairui Song	632c369ec2	kdumpctl: fix driver change detection on latest Fedora Now modinfo will return "(builtin)" instead of empty string for builtin module. Sync the code logic. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2020-03-23 10:24:57 +08:00
Kairui Song	2fbcdf41e3	kdumpctl: check hostonly-kernel-modules.txt for kernel module Since Dracut commit a0d9ad6 loaded-kernel-modules is renamed to hostonly-kernel-modules and contains all hostonly modules. So check hostonly-kernel-modules instead for module change. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2020-03-18 15:11:43 +08:00
Hari Bathini	e3f2f926dd	powerpc: enable the scripts to capture dump on POWERNV platform With FADump support added on POWERNV platform, enable the scripts to capture /proc/vmcore. Also, if CONFIG_OPAL_CORE is enabled, OPAL core is preserved and exported on POWERNV platform. So, offload OPAL core, if it is available. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Kairui Song <kasong@redhat.com>	2020-02-06 22:13:06 +08:00
Lianbo Jiang	6a20bd5447	kdump-lib: switch to the kexec_file_load() syscall on x86_64 by default UEFI Secure boot is a signature verification mechanism, designed to prevent malicious code being loaded and executed at the early boot stage. This makes sure that code executed is trusted by firmware. Previously, with kexec_file_load() interface, kernel prevents unsigned kernel image from being loaded if secure boot is enabled. So kdump will detect whether secure boot is enabled firstly, then decide which interface is chosen to execute, kexec_load() or kexec_file_load(). Otherwise unsigned kernel loading will fail if secure boot enabled, and kexec_file_load() is entered. Now, the implementation of kexec_file_load() is adjusted in below commit. With this change, if CONFIG_KEXEC_SIG_FORCE is not set, unsigned kernel still has a chance to be allowed to load under some conditions. commit 99d5cadfde2b ("kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE") And in the current Fedora, the CONFIG_KEXEC_SIG_FORCE is not set, only the CONFIG_KEXEC_SIG and CONFIG_BZIMAGE_VERIFY_SIG are set on x86_64 by default. It's time to spread kexec_file_load() onto all systems of x86_64, including Secure-boot platforms and legacy platforms. Please refer to the following form. .----------------------------------------------------------------------. \| . \| signed kernel \| unsigned kernel \| \| . types \|-----------------------\|-----------------------\| \| . \|Secure boot\| Legacy \|Secure boot\| Legacy \| \| . \|-----------\|-----------\|-----------\|-----------\| \| options . \| prev\| now \| prev\| now \| \| \| prev\| now \| \| . \|(file\|(file\|(only\|(file\| prev\| now \|(only\|(file\| \| . 
\|load)\|load)\|load)\|load)\| \| \|load)\|load)\| \|----------------------\|-----\|-----\|-----\|-----\|-----\|-----\|-----\|-----\| \|KEXEC_SIG=y \| \| \| \| \| \| \| \| \| \|SIG_FORCE is not set \|succ \|succ \|succ \|succ \| X \| X \|succ \|succ \| \|BZIMAGE_VERIFY_SIG=y \| \| \| \| \| \| \| \| \| \|----------------------\|-----\|-----\|-----\|-----\|-----\|-----\|-----\|-----\| \|KEXEC_SIG=y \| \| \| \| \| \| \| \| \| \|SIG_FORCE is not set \| \| \| \| \| \| \| \| \| \|BZIMAGE_VERIFY_SIG is \|fail \|fail \|succ \|fail \| X \| X \|succ \|fail \| \|not set \| \| \| \| \| \| \| \| \| \|----------------------\|-----\|-----\|-----\|-----\|-----\|-----\|-----\|-----\| \|KEXEC_SIG=y \| \| \| \| \| \| \| \| \| \|SIG_FORCE=y \|succ \|succ \|succ \|fail \| X \| X \|succ \|fail \| \|BZIMAGE_VERIFY_SIG=y \| \| \| \| \| \| \| \| \| \|----------------------\|-----\|-----\|-----\|-----\|-----\|-----\|-----\|-----\| \|KEXEC_SIG=y \| \| \| \| \| \| \| \| \| \|SIG_FORCE=y \| \| \| \| \| \| \| \| \| \|BZIMAGE_VERIFY_SIG is \|fail \|fail \|succ \|fail \| X \| X \|succ \|fail \| \|not set \| \| \| \| \| \| \| \| \| \|----------------------\|-----\|-----\|-----\|-----\|-----\|-----\|-----\|-----\| \|KEXEC_SIG is not set \| \| \| \| \| \| \| \| \| \|SIG_FORCE is not set \| \| \| \| \| \| \| \| \| \|BZIMAGE_VERIFY_SIG is \|fail \|fail \|succ \|succ \| X \| X \|succ \|succ \| \|not set \| \| \| \| \| \| \| \| \| ---------------------------------------------------------------------- Note: [1] The 'X' indicates that the 1st kernel(unsigned) can not boot when the Secure boot is enabled. Hence, in this patch, if on x86_64, let's use the kexec_file_load() only. See if anything wrong happened in this case, in Fedora firstly for the time being. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2020-02-06 21:57:14 +08:00
Hari Bathini	0a9aabaadd	kdumpctl: make reload fail proof When large amount of memory, about 1TB, is removed with DLPAR memory remove operation, kdump reload could fail due to race condition with device tree property update. In such scenario, the subsequent kdump reload requests would also fail as reload() only proceeds if current load status is active. Since the possibility of this race condition couldn't be wished away due to the nature of the scenario, workaround it by proceeding to load even if current load status is not active as long as kdump service is active, which kdump udev rules already check for. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Kairui Song <kasong@redhat.com>	2019-11-12 13:22:52 +08:00
Pingfan Liu	72ed97683f	kdumpctl: bail out immediately if host key verification failed In kdump.conf, if sshkey points to an invalid ssh key, 'kdumpctl restart' can bail out immediately instead of retry. Signed-off-by: Pingfan Liu <piliu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2019-10-22 15:14:37 +08:00
Pingfan Liu	e07fc3e071	kdumpctl: echo msg when waiting for connection Print some message during the long wait period to reflect the process. The message will look like: Network dump target is not usable, waiting for it to be ready ... Signed-off-by: Pingfan Liu <piliu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2019-09-24 13:17:16 +08:00
Pingfan Liu	680c0d3414	kdumpctl: distinguish the failed reason of ssh On a host whose IP address is not ready before the kdump service starts, ssh returns exit status 255. If there is no ssh key, ssh also returns exit status 255. In both cases, the current kdump code prompts the user to run 'kdumpctl propagate'. This confuses users who have already installed an ssh key. In order to tell these two cases apart, the ssh warning message should be captured and parsed. For the no-ssh-key case, the warning message is "Permission denied" or "No such file or directory". For the other, the warning message is "Network Unreachable". This patch also makes a slight change to enlarge the timeout from 60s to 180s. This value meets the needs of testing for the time being. Signed-off-by: Pingfan Liu <piliu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2019-09-02 17:06:21 +08:00
Pingfan Liu	c1a06343df	kdumpctl: wait a while for network ready if dump target is ssh If the dump target is an ipv6 address, the host should have its ipv6 address ready before starting the kdump service. Otherwise, the kdump service fails to start due to the failure of "ssh dump_server_ip mkdir -p $SAVE_PATH", and the user sees a message like: "Could not create root@2620:52:0:10da:46a8:42ff:fe23:3272/var/crash" I observed a long period (about 30s) on some machines before they got an ipv6 address dynamically, which is never seen on ipv4 hosts. Hence the kdump service has a dependency on the ipv6 address, but there is no good way to resolve it. One way is asking the user to run the cmd "nmcli connection modify eth0 ipv6.may-fail false". But this will block systemd until the ipv6 address is ready. Despite doing so, kdump can try its best (wait 1 minute after it starts up) before failing. How to implement the wait is arguable. Explicitly waiting on the ipv6 address would involve too many technical details; instead, just lean on the 'ssh' return value to see the availability of the network. Signed-off-by: Pingfan Liu <piliu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2019-08-12 16:13:08 +08:00
Kairui Song	f0fa5c8e91	kdumpctl: check for ssh path availability when rebuild Currently kdumpctl rebuild will simply rebuild the initramfs, and only perform a basic config syntax check. But it should also check if the target path is available when using an SSH target, else kdump may fail in the second kernel. kdumpctl rebuild should cover this case, and create the path if it doesn't exist. This patch makes rebuild and restart behave the same; rebuild is now equal to restart, except it won't check for config changes or reload kdump resources. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2019-05-27 16:13:29 +08:00
Kairui Song	43c26b7312	kdumpctl: Check kdump.conf for error when rebuild is called Although "kdumpctl rebuild" was introduced to help users rebuild the initramfs without modifying kdump.conf, if kdump.conf is modified and "kdumpctl rebuild" is called, an initramfs with a faulty kdump.conf will be built. Kdump will refuse to load the initramfs when restarted, but kdumpctl reload may load the faulty initramfs. So we need to make sure the faulty build won't be generated in the first place. Check kdump.conf for errors before building the initramfs to ensure such a failure won't happen. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2019-05-27 13:57:55 +08:00
Kairui Song	2efc0f1854	kdumpctl: don't always rebuild when extra_modules is set We don't necessarily have to always rebuild the initramfs when extra_modules is set. Instead, just detect if any module is updated, and only rebuild initramfs if found any updated kernel module. Tested with in-tree kernel modules, out-of-tree kernel modules, weak modules, all worked as expected. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2019-05-20 17:01:25 +08:00
Kairui Song	30913fd667	kdumpctl: follow symlink when checking for modified files Previously only the symlink's timestamp was used for checking whether files were modified; this would not trigger a rebuild if the symlink target is modified. So check both the symlink timestamp and the symlink target timestamp, and rebuild the initramfs when either the symlink or the target changed. Also give a proper error message if the file doesn't exist. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2019-05-20 16:56:31 +08:00
Kairui Song	75d9132417	Get rid of duplicated strip_comments when reading config When reading kdump configs, a single parsing pass should be enough; this saves a lot of duplicated stripping calls, which speeds up the total load time. Speeds up building by about 2 seconds and reload by about 0.1 second in my tests. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2019-05-20 16:56:28 +08:00
Lianbo Jiang	9529191d95	earlykdump: provide a prompt message after the rebuilding of kdump initramfs. Early kdump inherits the settings of normal kdump, so any changes that caused normal kdump rebuilding also require rebuilding the system initramfs to make sure that the changes take effect for early kdump. Therefore, when the early kdump is enabled, provide a prompt message after the rebuilding of kdump initramfs is completed. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2019-05-20 16:56:19 +08:00
Kairui Song	1c1159a586	kdumpctl: Detect block device driver change for initramfs rebuild Previously we rebuilt the initramfs when the kernel's loaded module list changed, but this is not very stable as some async services may load/unload kernel modules and cause unnecessary initramfs rebuilds. Instead, it's better to just check if the module required to dump to the dump target is loaded or not, and rebuild if not loaded. This avoids most false positives, and ensures local target change is always covered. Currently only local fs dump target is covered, because this check requires the dump target to be mounted when building the initramfs; this guarantees that the module is in the loaded kernel module list, else we may still get some false positives. dracut-install could be leveraged to combine the modalias list with the kernel loaded module list as a more stable module list in the initramfs, but an upstream dracut change needs to be done first. Passed test on a KVM VM: changing the storage between SATA/USB/VirtIO triggers an initramfs rebuild, and I didn't notice any false positives. Also passed test on my laptop with no false positives. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2019-05-08 17:51:18 +08:00
Kairui Song	09f50350d9	Revert "kdumpctl: Rebuild initramfs if loaded kernel modules changed" This reverts commit `6b479b6572`. Deciding on an initramfs rebuild by looking at whether the loaded kernel modules list has changed is not very stable after all. Previously we were counting on udev to settle before kdump is started to ensure all modules are ready, but actually any service may cause a kernel module load, even after udev is settled. The previous commit was trying to work around an issue that a VM created with a disk snapshot may fail in the kdump initramfs. The better fix is to not include the kdump initramfs in the disk snapshot at all, as the kdump initramfs is not generated for generic use. And with the newly added "kdumpctl reload" command, admins could rebuild the image easily, and should rebuild the initramfs on hardware change manually. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2019-05-06 17:54:26 +08:00
Kairui Song	594ac119c5	kdumpctl: add rebuild support Use "kdumpctl rebuild" to rebuild the image directly. This could help admins to rebuild kdump image directly. Also merge fadump related initramfs backup/restore into setup_initrd, and do permission only when actually trying to rebuild the image. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2019-04-05 02:02:43 +08:00
Hari Bathini	689fca5af3	fadump: leverage kernel support to re-register FADump With kernel commit 0823c68b054b ("powerpc/fadump: re-register firmware-assisted dump if already registered") support is enabled to re-register when FADump is already registered. Leverage that option in kdump scripts. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Kairui Song <kasong@redhat.com>	2019-03-07 13:33:17 +08:00
Hari Bathini	da6b75f59b	fadump: use the original initrd to rebuild fadump initrdfrom The idea behind adding support for dracut '--rebuild' option was to ensure the initrd built for fadump takes into consideration all the build parameters passed to original initrd. Pass original initrd instead of current default initrd for rebuild as current initrd might already have build parameters from original initrd along with parameters from previous fadump initrd build making the build parameters look like this after a few iterations: -H --persistent-policy 'by-uuid' -f --quiet --hostonly --hostonly-cmdline --hostonly-i18n --hostonly-mode 'strict' -o 'plymouth dash resume ifcfg' --mount '/dev/mapper/rhel_zzfp219--lp3-home /kdumproot//home xfs defaults' -f --kver '4.18.0-60.el8.ppc64le' --quiet --hostonly --hostonly-cmdline --hostonly-i18n --hostonly-mode 'strict' -o 'plymouth dash resume ifcfg' --mount '/dev/mapper/rhel_zzfp219--lp3-home /kdumproot//home xfs defaults' -f --kver '4.18.0-60.el8.ppc64le' --quiet --hostonly --hostonly-cmdline --hostonly-i18n --hostonly-mode 'strict' -o 'plymouth dash resume ifcfg' --mount '/dev/mapper/rhel_zzfp219--lp3-home /kdumproot//home xfs defaults' -f --kver '4.18.0-60.el8.ppc64le' --include '/tmp/fadump.initramfs' '/etc/fadump.initramfs' --include '/tmp/fadump.initramfs' '/etc/fadump.initramfs' --include '/tmp/fadump.initramfs' '/etc/fadump.initramfs' -- Since it is not desirable to build initrd with stale and/or duplicate build parameters, use original initrd (backed up) to rebuild fadump initrd, instead of current default initrd. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Dave Young <dyoung@redhat.com>	2019-03-07 13:32:47 +08:00
Kazuhito Hagio	242da37c58	Add final_action option to kdump.conf If a crash occurs repeatedly after enabling kdump, the system goes into a crash loop and the dump target may get filled up by vmcores. This is likely especially with early kdump. This patch introduces 'final_action' option to kdump.conf, in order for users to be able to power off the system even after capturing a vmcore successfully. Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com> Cc: Dave Young <dyoung@redhat.com> Cc: Lianbo Jiang <lijiang@redhat.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Acked-by: Bhupesh Sharma <bhsharma@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Kairui Song <kasong@redhat.com>	2019-01-22 17:58:24 +08:00
Kazuhito Hagio	cc95f0a744	Add failure_action as alias of default and make default obsolete In preparation for adding 'final_action' option, since it's confusing to have the 'final_action' and 'default' options at the same time, this patch introduces 'failure_action' as an alias of the 'default' option to /etc/kdump.conf, and makes 'default' obsolete to be removed in the future. Also, the "default action" term is renamed to "failure action". Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com> Cc: Dave Young <dyoung@redhat.com> Cc: Lianbo Jiang <lijiang@redhat.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Acked-by: Bhupesh Sharma <bhsharma@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Kairui Song <kasong@redhat.com>	2019-01-22 17:57:53 +08:00
Kairui Song	32fc6070a6	Add missing usage info In commit `b34ce3a` reload support was added to kdumpctl but the usage info is not updated. Now add reload to usage output to let user aware of the new command. Signed-off-by: Kairui Song <kasong@redhat.com>	2018-11-09 11:17:00 +08:00
Kairui Song	b34ce3a7b4	kdumpctl: Add reload support Add reload support to kdumpctl, reload will simply unload current loaded kexec crash kernel and initramfs, and load it again. Changes in /etc/sysconfig/kdump will take effect with kdumpctl reload, but reloading will not check the content of /etc/kdump.conf and won't rebuild anything. reload is fast, the only time-consuming part of kdumpctl reload is loading kernel and initramfs with kexec which is always necessary. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2018-11-01 22:31:20 +08:00
Kenneth Dsouza	5b385cbd0c	kdumpctl: Print warning in case the raw device is formatted and contains filesystem. Currently the kdumpctl script doesn't check if the raw device is formatted which might destroy existing data at the time of dump capture. This patch addresses this issue, by ensuring kdumpctl prints a warning in case it finds the raw device to be formatted. Signed-off-by: Kenneth D'souza <kdsouza@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2018-10-15 10:47:08 +08:00
Kenneth Dsouza	d92b9364ae	kdumpctl: Error out if path is set more than once. Currently the kdumpctl script doesn't check if the path option is set more than once due to which a vmcore is not captured. This patch addresses this issue by ensuring that only one path is specified in /etc/kdump.conf file. Signed-off-by: Kenneth D'souza <kdsouza@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2018-08-22 15:23:32 +08:00
Kairui Song	6b479b6572	kdumpctl: Rebuild initramfs if loaded kernel modules changed Currently, we only rebuild the kdump initramfs on config file change, fs change, or watchdog related change. This will not cover the case where hardware changed but fs layout and other configurations stay the same, and kdump may fail. To cover such a case, we can detect and compare loaded kernel modules: if a hardware change requires the image to be rebuilt, loaded kernel modules must have changed. Starting from commit 7047294 dracut will record loaded kernel modules when the image is built if hostonly mode is enabled. With this patch, kdumpctl will compare the recorded value with currently loaded kernel modules, and rebuild the image on change. "kdumpctl start" will be a bit slower, as we have to call lsinitrd one more time to get the loaded kernel modules list. I measured the time consumption and we have an overall 0.2s increased loading time. Time consumption of command "kdumpctl restart": Before: real 0m0.587s user 0m0.481s sys 0m0.102s After: real 0m0.731s user 0m0.591s sys 0m0.133s Time consumption of command "kdumpctl restart" with image rebuild: Before (force rebuild): real 0m10.972s user 0m8.966s sys 0m1.318s After (inserted ~100 new modules): real 0m11.220s user 0m9.387s sys 0m1.337s Signed-off-by: Kairui Song <kasong@redhat.com>	2018-07-26 19:25:09 +08:00
Lianbo Jiang	b1fbeebd08	move some common functions from kdumpctl to kdump-lib.sh we move some common functions from kdumpctl to kdump-lib.sh, the functions could be used in other modules, such as early kdump. It has no bad effect. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Reviewed-by: Kazuhito Hagio <khagio@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2018-05-29 10:18:40 +08:00
Dave Young	3578c54ff2	Fix kdumpctl showmem showmem function mistakenly added some noise character before the real code, it could be some copy-paste error. Fixes: `1a6cb43a19`	2018-05-24 13:27:02 +08:00
Bhupesh Sharma	5221d4b90c	kdumpctl: Remove 'netroot' and 'iscsi initiator' entries from kdump cmdline In a iSCSI multipath environment (which uses iSCSI software initiator and target environment) when the vmcore file is saved on the target, kdump always fails to establish a iSCSI session and also fails to collect dump due to duplicate entries for 'netroot' and 'iscsi initiator' in the kdump bootargs: # echo c > /proc/sysrq-trigger [83471.842707] SysRq : Trigger a crash [83471.843233] BUG: unable to handle kernel NULL pointer dereference at (null) [83471.844155] IP: [<ffffffffac82ed16>] sysrq_handle_crash+0x16/0x20 [83471.844931] PGD 800000023f710067 PUD 229fd6067 PMD 0 [83471.845655] Oops: 0002 [#1] SMP <snip..> [83471.861889] Call Trace: [83471.862162] [<ffffffffac82f53d>] __handle_sysrq+0x10d/0x170 [83471.862771] [<ffffffffac82f9af>] write_sysrq_trigger+0x2f/0x40 [83471.863405] [<ffffffffac690630>] proc_reg_write+0x40/0x80 [83471.863984] [<ffffffffac61acd0>] vfs_write+0xc0/0x1f0 [83471.864536] [<ffffffffac61baff>] SyS_write+0x7f/0xf0 [83471.865075] [<ffffffffacb1f7d5>] system_call_fastpath+0x1c/0x21 [83471.865714] Code: eb 9b 45 01 f4 45 39 65 34 75 e5 4c 89 ef e8 e2 f7 ff ff eb db 0f 1f 44 00 00 55 48 89 e5 c7 05 41 47 81 00 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 5d c3 0f 1f 44 00 00 55 31 c0 c7 05 be [83471.868888] RIP [<ffffffffac82ed16>] sysrq_handle_crash+0x16/0x20 [83471.869700] RSP <ffff9e7fe77b7e58> [83471.870074] CR2: 0000000000000000 <snip..> Starting Login iSCSI Target iqn.2014-08.com.example:t1... [ OK ] Stopped Login iSCSI Target iqn.2014-08.com.example:t1. Starting Login iSCSI Target iqn.2014-08.com.example:t1... [ 6.607051] scsi host2: iSCSI Initiator over TCP/IP [FAILED] Failed to start Login iSCSI Target iqn.2014-08.com.example:t1. See 'systemctl status "iscsistart_\\x40...com.example:t1.service"' for details. [ 126.572911] dracut-initqueue[243]: Warning: dracut-initqueue timeout - starting timeout scripts Stopping Open-iSCSI... 
[ OK ] Stopped Open-iSCSI. Starting Open-iSCSI... [ OK ] Started Open-iSCSI. Starting Login iSCSI Target iqn.2014-08.com.example:t1... [ OK ] Stopped Login iSCSI Target iqn.2014-08.com.example:t1. Starting Login iSCSI Target iqn.2014-08.com.example:t1... [ 131.095897] scsi host3: iSCSI Initiator over TCP/IP [FAILED] Failed to start Login iSCSI Target iqn.2014-08.com.example:t1. See 'systemctl status "iscsistart_\\x40...com.example:t1.service"' for details. [ 251.085029] dracut-initqueue[243]: Warning: dracut-initqueue timeout - starting timeout scripts [ 251.594554] dracut-initqueue[243]: Warning: dracut-initqueue timeout - starting timeout scripts <snip..> This patch fixes the same by removing the 'netroot', 'rd.iscsi.initiator' and 'iscsi_initiator' entries from the kdump boot cmdline. One reason why this is safe is our kdump target setup does not depend on 1st kernel inherited cmdline params now since the work we dropped root dependency. Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2018-05-21 14:09:17 +08:00
Pingfan Liu	1a6cb43a19	kdumpctl: add showmem cmd port from rhel, original patch is contributed by Minfei Huang: Using /sys to determines crashkernel actual size is confusing since there is no unit of measure. Add a new command "kdumpctl showmem" to show the reserved memory kindly. Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Minfei Huang <mhuang@redhat.com> Acked-by: Bhupesh Sharma <bhsharma@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2018-05-21 14:06:30 +08:00
Lianbo Jiang	dbe8214586	kdumpctl: Check the modification time of core_collector When core_collector is changed, the kdump initramfs needs to be rebuilt before it is loaded. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2018-03-22 16:10:58 +08:00
Pingfan Liu	cde5944f93	kdumpctl: skip selinux-relabel for dracut_args --mount dump target When using "dracut_args --mount" to specify the dump target, e.g. nfs like: path / core_collector makedumpfile -d 31 dracut_args --mount "host:/path /var/crash nfs defaults" the kdump service should neither guarantee its correctness nor relabel it. With the current code, since dracut_args dump targets are likely not mounted, the kdump service mistakenly relabels the rootfs, which is meaningless and takes a very long time. Signed-off-by: Pingfan Liu <piliu@redhat.com>	2017-12-04 12:51:15 +08:00
Baoquan He	85156bfc66	Revert "kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors" This reverts commit `2040103bd7`. The reason is that it is based on the environment of the 1st kernel, where all present devices could be active and initialized during bootup, so all PCI devices will request IRQs, while kdump only brings up those devices which are necessary for vmcore dumping. So this commit is largely meaningless and unhelpful. It also prints a 'Warning' when the calculated result is larger than 1 cpu, which is a false positive most of the time. So revert the commit; the git history can be checked for later reference. [dyoung]: on some machines this warning message shows up, but later we found the irq numbers with and without nr_cpus=1 are quite different, so this needs more investigation since the formula is not accurate. Signed-off-by: Baoquan He <bhe@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2017-12-04 12:50:50 +08:00
Bhupesh Sharma	cb3d1c1c3f	kdumpctl: Error out in case there are white spaces before an option name Resolves: BZ1484945 https://bugzilla.redhat.com/show_bug.cgi?id=1484945 Currently the kdumpctl script doesn't handle whitespace (including TABs) which might be there before an option name in kdump.conf. This patch addresses this issue by ensuring that kdumpctl errors out in case it finds any stray space(s) or tab(s) before an option name. Reported-by: Kenneth D'souza <kdsouza@redhat.com> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Acked-by: Pratyush Anand <panand@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2017-10-11 09:57:31 +08:00
Hari Bathini	601766a3d9	fadump: rebuild default initrd with dump capture capability As default initrd is used for booting fadump capture kernel, it must be rebuilt with dump capture capability when dump mode is fadump. Check if default initrd is already fadump capable and rebuild, if necessary. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Reviewed-by: Xunlei Pang <xlpang@redhat.com>	2017-09-06 15:42:13 +08:00
Xunlei Pang	2c9a863fd3	kdumpctl: remove some cmdline inheritage from 1st kernel Now with the help of "--hostonly-cmdline", dracut will generate the needed cmdlines for the dump target, so we can avoid inheriting the corresponding duplicate or unnecessary parameters. Signed-off-by: Xunlei Pang <xlpang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2017-09-06 15:40:15 +08:00
Xunlei Pang	31dc60ad20	Change dump_to_rootfs to use "--mount" instead of "root=X" Currently, we keep "root=X" for the dump_to_rootfs case; this patch consolidates on "--mount" for all the kdump mounts. One advantage of this approach is that dracut can correctly mark root (in case dump_to_rootfs is specified) as the host device when "--no-hostonly-default-device" is added in the following patch. Changed the code style in passing, as the shellcheck tool reported: Use $(..) instead of deprecated `..` Signed-off-by: Xunlei Pang <xlpang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2017-09-06 15:39:41 +08:00
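
Commit 5221d4b90c in the list above removes the 'netroot', 'rd.iscsi.initiator' and 'iscsi_initiator' entries from the kdump boot cmdline. The following is a minimal sketch of that idea, not the code from kdumpctl itself: the function name `strip_cmdline_params` is hypothetical, and it assumes parameters are separated by whitespace with no quoted values containing spaces.

```shell
# Hypothetical helper (not taken from kdumpctl): print the given kernel
# cmdline with every occurrence of the named parameters removed.
# Usage: strip_cmdline_params "<cmdline>" <param>...
# The body runs in a subshell, so "set -f" and the variables stay local.
strip_cmdline_params() (
    cmdline=$1
    shift
    kept=""
    set -f  # disable globbing so tokens such as "foo=*" are not expanded
    for token in $cmdline; do
        drop=0
        for param in "$@"; do
            # "${token%%=*}" is the name part of "name=value" (or a bare flag)
            if [ "${token%%=*}" = "$param" ]; then
                drop=1
                break
            fi
        done
        if [ "$drop" -eq 0 ]; then
            kept="${kept:+$kept }$token"
        fi
    done
    printf '%s\n' "$kept"
)
```

For example, `strip_cmdline_params "$(cat /proc/cmdline)" netroot rd.iscsi.initiator iscsi_initiator` would print the running kernel's cmdline without those three parameters. Matching on the whole parameter name, rather than on a substring, keeps unrelated parameters that merely share a prefix.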


248 Commits