kexec-tools

Author	SHA1	Message	Date
WANG Chao	eedfc174a6	update to kdump-anaconda-addon-005-2-g86366ae.tar.gz It contains translations update. Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-12-01 15:55:52 +08:00
WANG Chao	b3d4edfe33	Release 2.0.8-4 Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-11-04 11:44:47 +08:00
WANG Chao	bd29906daa	Fix an installation issue on ppc64le I forgot to add kdump.sysconfig.ppc64le to "Source" directive to kexec-tools.spec. And on ppc64le, the default kdump.sysconfig will be installed to /etc/sysconfig/kdump. Now fix it. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2014-11-04 11:43:30 +08:00
WANG Chao	b95a63839a	kdump-lib: fix get_option_value() get_option_value() is used to get the value of $1 configured in /etc/kdump.conf. But when we use "get_option_value ssh", it can get the value of "sshkey" instead of "ssh". Fix the regexp pattern to get an exact match. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Baoquan He <bhe@redhat.com>	2014-11-04 11:42:04 +08:00
WANG Chao	5ee69fc5f2	Release 2.0.8-3 Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-10-28 11:17:16 +08:00
Baoquan He	a68bb200f8	save exact route to remote target Previously for solving static route issues, all routes which go through a specific dev will be saved in 1st kernel, and then added in 2nd kernel. Because we use below search pattern, an exception will happen: /sbin/ip route show \| grep -v default \| grep "^[[:digit:]].via. $_netdev" That exception is a corner case which happened when 2 machines connected directly by cable and the 2 network interfaces are configured in different network subnets. E.g there are 2 machines A and B: A:ens10 < ------ > B:ens9 A:ens10 inet 192.168.100.111/24 scope global ens10 route need be added in A: 192.168.110.0/24 dev ens10 B:ens9 inet 192.168.110.222/24 scope global ens9 route need be added in B 192.168.100.0/24 dev ens9 Now if A want to dump to B, the route "192.168.110.0/24 dev ens10" has to be saved and added in 2nd kernel. So in this patch "ip route get to $target" command is executed, then an exact route can be got for going to that target. By this, static route works and the corner case can be fixed too. Signed-off-by: Baoquan He <bhe@redhat.com> Acked-by: Marc Milgram <mmilgram@redhat.com> Acked-by: WANG Chao <chaowang@redhat.com>	2014-10-28 10:56:57 +08:00
Hari Bathini	d1483f9b28	kdump: fix save vmcore path for fadump With fadump support, dracut-kdump.sh script is installed into default initrd to capture vmcore generated by firmware assisted dump. Thus in fadump case, the same initrd is being used for normal boot as well as boot after system crash. Hence a device node, added by firmware while system crashes, is checked to identify if it is a normal boot or boot after crash to determine whether or not to capture vmcore. While testing fadump in fedora21 alpha, observed that vmcore capture is initiated even during normal boot, in spite of this check, with the below error: "kdump.sh[451]: /bin/kdump.sh: line 5: return: can only `return' from a function or sourced script" The below patch tries to fix this issue. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Acked-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-10-27 21:12:02 +08:00
WANG Chao	09951d997c	Fix bogus date The date is a typo and fix it. Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-10-21 14:59:13 +08:00
WANG Chao	5bc459ce64	Release 2.0.8-2 Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-10-21 13:42:34 +08:00
WANG Chao	d03fe08d92	spec: Fix rpmbuild issue on ARM Fix rpmbuild issue on ARM platform. Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-10-21 13:42:29 +08:00
WANG Chao	12b38ac1ad	Release 2.0.8-1 Rebase kexec-tools-2.0.8 Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-10-20 11:28:22 +08:00
WANG Chao	5379422cd4	Remove kexec-tools-eppic subpackage Remove this package and put eppic_makedumpfile.so and its sample scripts in kexec-tools package. makedumpfile does dlopen() on eppic_makedumpfile.so and that does not enforce any choice. One could either ship it in kexec-tools package or in a subpackage. Both will work. The real reason was that code for eppic_makedumpfile.so (extension_eppic.c) and some eppic scripts are in upstream makedumpfile project. And that project is distributed as part of kexec-tools package. Now breaking down that makedumpfile in two parts and shipping all eppic specific bits in a separate subpackage was creating confusion everytime we did some changes. So to avoid that confusion and to keep all of the makedumpfile related bits in a single package, this change is being done. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-10-20 11:26:42 +08:00
WANG Chao	b29d8e0d54	Rebase kdump-anaconda-addon-005 The new kdump-anaconda-addon add support of FADUMP. Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-10-14 11:01:42 +08:00
WANG Chao	fd3900dcaf	Release 2.0.7-11 Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-09-26 13:00:56 +08:00
WANG Chao	7d8e65d615	ppc64, ppc64le: disable kvm CMA reservation in kdump kernel By default on powerpc platform, kvm will reserve a relatively large CMA (128M aligned) at early boot. In kdump kernel, even KVM sounds useless but still it reserves 128M and makes kdump kernel fail to boot. Now fix this by adding the following to kernel command line: "kvm_cma_resv_ratio=0" which disable the CMA reservation. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-26 12:59:04 +08:00
WANG Chao	013bb485b8	module-setup: do not add duplicate ip=xxx In case of iscsi boot, kernel cmdline will contain ip=xxx kernel parameter for dracut setting up iscsi root in initramfs. For example: "root=xxx ip=192.168.3.26:::255.255.255.0:localhost.localdomain:eno19:none ..." dracut doesn't allow duplicate ip conf for the same network card. dracut will not ignore the either of the duplicate. Instead, it refuses to continue: [ 15.876306] dracut: FATAL: For argument 'ip=192.168.3.26:::255.255.255.0:localhost.localdomain:eno19:none'\n Duplication configurations for 'eno19' [ 16.055513] dracut: Refusing to continue ev argument for multiple ip= lines That's why in our code we don't add a duplicate ip conf when handling the same network card the second time. But we never consider the case that ip conf is already added in kernel cmdline for some special purpose, for example, iscsi boot. Now we also look up /proc/cmdline for ip conf. If it exists, we use the existing one. The existing one should work out of box because dracut will handle it in second kernel like it does for first kernel. That said, the network card will be brought up and root disk will be mounted under /sysroot. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-25 10:06:02 +08:00
WANG Chao	2e19ead4fd	spec: fix ppc64le build failure kexec-tools expects "powerpc64le" to pass to configure.ac, while we passed ppc64le. Otherwise the build fails. Now fix it like we did for ppc64. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-25 10:05:49 +08:00
WANG Chao	affcb6eeda	Release 2.0.7-10 Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-09-23 15:56:13 +08:00
WANG Chao	aa289b9140	ppc64le: Add arch specific sysconfig kdump.sysconfig.ppc64le is copied from kdump.sysconfig.ppc64. The default sysconfig won't work for ppc64le. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-23 11:22:45 +08:00
WANG Chao	e77fed1a83	spec: build makedumpfile on ppc64le Enable makedumpfile build on ppc64le. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-23 11:22:45 +08:00
Baoquan He	0aa4890cc5	mkdumprd: try to get mount options from fstab first Previously if a target needs mount info, the relevant mount options are got from /proc/mounts by below command: findmnt -k -f -n -r -o OPTIONS $_dev This will bring problems. Since /proc/mounts will give out a set which contains each option. Some options have value specified by user, some options just have default value if user doesn't specify. If some mount options are not supported very well, bugs occurred. The more options, the worse. So in this patch, we try to check fstab to get mount options firstly, this gives user a chance to decide which options they really want. If they don't give a fstab entry, then we trust all options in /proc/mounts. Signed-off-by: Baoquan He <bhe@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-23 11:22:45 +08:00
WANG Chao	c88a6bb5b6	Rebase makedumpfile-1.5.7 Rebase makedumpfile-1.5.7 and remove the useless patches. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-19 11:29:35 +08:00
WANG Chao	5b1065de3c	Add sample eppic scripts to kexec-tools-eppic package Upstream makedumpfile contains some sample eppic scripts for reference. Now pull the whole scripts directory into kexec-tools-eppic package. Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-09-18 11:19:50 +08:00
Vivek Goyal	b8b1667e2c	udev-rules: Restart kdump service on cpu ADD/REMOVE events This patch changes restart of kdump service from cpu online/offline events to cpu add/remove events. Some people have complained that they are running cpu online/offline tests at high frequency and kdump restarts at high frequency and systemd disables the service. As a temporary fix, we committed a patch to never disable kdump service. In general it probably is a good idea to restart kdump service on cpu add/remove events. Toshi Kani confirmed following. - File for /sys/devices/system/cpu/cpuX/crash_notes will be created first before ADD event goes out. That means we can not miss creating ELF notes for newly created cpu. - For REMOVE event files under /sys/devices/system/cpu/cpuX/ are removed first and then REMOVE event goes out. That means we will remove the elf note header for removed cpu. - There are some race conditions like a cpu is removed but system crashes before kdump service restarts. In that case vmcore.c has to be more robust to be able to inspect elf notes and discard empty ones. Also it is possible that after cpu remove, crash notes memory got reused for something else and after crash vmcore.c might see some random data. It does basic size checks and discards elf notes if checks don't pass. Above race conditions can happen even with OFFLINE event and there is no good way to remove these altogether. So making vmcore.c more robust is the right solution here. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: WANG Chao <chaowang@redhat.com>	2014-09-15 21:55:07 +08:00
WANG Chao	511ed60630	ppc64/kdump: Fix ELF header endianess Backport the following commit from kexec-tools upstream: commit 45b33eb Author: Laurent Dufour <ldufour@linux.vnet.ibm.com> Date: Fri Jul 25 17:07:49 2014 +0200 ppc64/kdump: Fix ELF header endianess The ELF header created among the loading of the kdump kernel should be flagged using the current endianess and not always as big endian. Without this patch the data exposed in /proc/vmcore are not readable when running in LE mode. Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au> This is part of the work to enable ppc64le. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-15 21:55:04 +08:00
WANG Chao	79449e6612	kexec/ppc64: disabling exception handling when building the purgatory Backport the following commit from upstream kexec-tools: commit 335bad7 Author: Laurent Dufour <ldufour@linux.vnet.ibm.com> Date: Tue Jul 22 18:22:28 2014 +0200 kexec/ppc64: disabling exception handling when building the purgatory Some Linux distributions would like to turn on the GCC exception handling by default. As this option introduces symbols in the built code that are defined in a separate shared library, this is not a good idea to have such an option activated when building the purgatory. This patch forces the exception handling to be turned off when building the purgatory on ppc64 BE and LE. Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au> This is part of the work to enable ppc64le. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-15 21:55:02 +08:00
WANG Chao	768d9ce47f	kexec/ppc64: move to device tree version 17 Backport the following commit from upstream kexec-tools: commit 2ca2203 Author: Laurent Dufour <ldufour@linux.vnet.ibm.com> Date: Mon Jun 16 14:42:43 2014 +0200 kexec/ppc64: move to device tree version 17 Kernel commit e6a6928c3ea1d0195ed75a091e345696b916c09b changed the way the device tree is processed in the kernel. Now version 2 is no more supported. This patch move the version of the device tree generated in ppc64 environment from 2 to 17, allowing to kexec kernel 3.16. In addition, automates the define of NEED_STRUCTURE_BLOCK_EXTRA_PAD which should not be set for DT version 16 and above. Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au> This is part of the work to enable ppc64le. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-15 21:54:47 +08:00
Baoquan He	a0023e92fc	Release 2.0.7-9	2014-09-10 10:51:04 +08:00
Vivek Goyal	38329992fe	kdumpctl: Use kexec file based syscall for secureboot enabled machines Now kexec file based syscall can be used with secureboot enabled machines. Automatically switch to using new syscall if secureboot is enabled on the machine. Also remove the old message where kdump service failed if secureboot is enabled. That's not the case anymore. v2: Renamed "secureboot" to "Secure Boot" in user visible message. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-10 10:45:46 +08:00
Vivek Goyal	d301d5e542	kdumpctl: Use kexec file based mode to unload kdump kernel Currently old kexec syscall denies unloading a kernel if secureboot is enabled. I think this is not right behavior and should be changed. But for now, use new syscall if secureboot is enabled and that allows unloading kernel. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-10 10:45:41 +08:00
Vivek Goyal	7041917dbd	kdumpctl: Do not redirect error messages to /dev/null Does anybody know why are we redirecting stderr to /dev/null when using kexec load/unload commands? This sounds wrong to me. In case of error I have no idea what went wrong. Systemctl already puts all the information in journal. So if we are worried that user will be bombarded with error messages, that should not be a concern. So do not redirect stderr to /dev/null. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-10 10:45:37 +08:00
Baoquan He	3a9200bc9f	kexec: Provide an option to use new kexec system call This is a back port from upstream. commit 046d1755d2bd723a11a180c265e61a884990712e Author: Vivek Goyal <vgoyal@redhat.com> Date: Mon Aug 18 11:22:32 2014 -0400 kexec: Provide an option to use new kexec system call Hi, This is v2 of the patch. Since v1, I moved syscall implemented check a little earlier in the function as per the feedback. Now a new kexec syscall (kexec_file_load()) has been merged in upstream kernel. This system call takes file descriptors of kernel and initramfs as input (as opposed to list of segments to be loaded). This new system call allows for signature verification of the kernel being loaded. One use of signature verification of kernel is secureboot systems where we want to allow kexec into a kernel only if it is validly signed by a key system trusts. This patch provides an option --kexec-file-syscall (-s), to force use of new system call for kexec. Default is to continue to use old syscall. Currently only bzImage64 on x86_64 can be loaded using this system call. As kernel adds support for more arches and for more image types, kexec-tools can be modified accordingly. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Baoquan He <bhe@redhat.com>	2014-09-10 10:44:24 +08:00
WANG Chao	838d8046b7	Release 2.0.7-8 Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-08-29 13:20:00 +08:00
Dave Young	fae72772d7	Removing firstboot module Since we have added kdump anaconda addon, thus removing firstboot module User can setup kdump in anaconda install phase, and change the kdump.conf details in s-c-kdump Delete the firstboot po files as well. Signed-off-by: Dave Young <dyoung@redhat.com>	2014-08-29 13:17:14 +08:00
WANG Chao	8b0cc435e0	update to kdump-anaconda-addon-003.tar.gz Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-08-29 13:02:18 +08:00
WANG Chao	1ff6192737	kdump-emergency.service: executable uses absolute path Bao noticed the following systemd warning: systemd[1]: [/usr/lib/systemd/system/emergency.service:17] Executable path is not absolute, ignoring: systemctl --no-block isolate kdump-error-handler.service It turns out that now systemd doesn't allow relative path for an executable, we must adapt that, make the change. Signed-off-by: WANG Chao <chaowang@redhat.com> Tested-by: Baoquan He <bhe@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-08-28 13:05:56 +08:00
WANG Chao	e242ae873b	Release 2.0.7-7 Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-08-21 10:32:08 +08:00
WANG Chao	d2d16f0521	update to kdump-anaconda-addon-002.tar.gz Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-08-21 10:32:08 +08:00
WANG Chao	890d2fabf4	spec: install udev rules 98-kexec.rules to /usr/lib not /etc Resolves: rhbz#1131169 Zbigniew (systemd developer) pointed out that our udev rules should install to /usr/lib/ not /etc. Because /etc is supposed to be used by sysadmins only and package should install by default into /usr/lib. As advised here: http://www.freedesktop.org/software/systemd/man/udev.html#Rules%20Files Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-08-21 10:31:52 +08:00
Peter Robinson	e46e590cf9	- Rebuilt for https://fedoraproject.org/wiki/Fedora_21_22_Mass_Rebuild	2014-08-16 23:35:59 +00:00
WANG Chao	082043e117	dracut-module-setup: allow short hostname in cluster configuration Node could be referenced by short hostname (hostname -s) in cluster configuration: [root@virt-068 /]# pcs status nodes Pacemaker Nodes: Online: virt-066 virt-067 virt-068 Standby: Offline: We didn't know it before. Martin noticed the kdump failure, and provide this fix. Thanks to Martin. Signed-off-by: WANG Chao <chaowang@redhat.com> Tested-by: Martin Juricek <mjuricek@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-08-12 13:16:08 +08:00
WANG Chao	b17977dfb8	Release 2.0.7-5 Signed-off-by: WANG Chao <chaowang@redhat.com>	2014-08-06 12:06:04 +08:00
WANG Chao	5a77531f8a	Let systemd handle unmount Since we use systemd to control the shutdown path, there's no need for us to unmount the filesystem, systemd will do that for us just like it does in a normal boot. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-08-06 12:02:59 +08:00
WANG Chao	1719a8aa92	do not force shutdown It's more safe to use systemd (init) to control the shutdown path for us in either reboot or power off or halt action. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-08-06 12:02:59 +08:00
WANG Chao	83d14e0f39	kdumpctl, fadump: only use lsinitrd when initramfs exists in fadump mode When there's no kdump initramfs for lsinitrd to inspect with, there will be an error: # kdumpctl start /boot/initramfs-3.16.0-rc7+kdump.img does not exist Usage: lsinitrd [options] [<initramfs file> [<filename> [<filename> [...] ]]] Usage: lsinitrd [options] -k <kernel version> -h, --help print a help message and exit. -s, --size sort the contents of the initramfs by size. -m, --mod list modules. -f, --file <filename> print the contents of <filename>. -k, --kver <kernel version> inspect the initramfs of <kernel version>. No kdump initial ramdisk found. Rebuilding /boot/initramfs-3.16.0-rc7+kdump.img [..] In addition, lsinitrd is a slow operation. We only run it when it's fadump mode, to speed up in kdump mode. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-08-06 12:02:59 +08:00
Baoquan He	f7f8361af9	Add static route into cmdline if target address is not local If one target address is not local and its route is different than default gateway, the specific route to this target address need be added. E.g, target is 192.168.200.222. sh> ip route show default via 192.168.122.1 dev eth0 proto static metric 1024 192.168.200.0/24 via 192.168.100.222 dev ens10 proto static metric 1 In this patch, get the route to the specific target address and store it as cmdline, here is /etc/cmdline.d/45-route-static.conf. And the route options are separated by semicolon like below. Then the stored route can be parsed when kdump kernel boot up. 192.168.200.0/24:192.168.100.222:ens10 Signed-off-by: Baoquan He <bhe@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-08-05 14:05:53 +08:00
Hari Bathini	adb585a336	kdumpctl: fix error handling in fadump case In fadump, in case of failure while rebuilding initrd, the error status is not handled properly. See code snippet below: $MKDUMPRD $target_initrd_tmp --rebuild $TARGET_INITRD --kver $kdump_kver \ -i /tmp/fadump.initramfs /etc/fadump.initramfs rm -f /tmp/fadump.initramfs if [ $? != 0 ]; then echo "mkdumprd: failed to rebuild initrd with fadump support" >&2 return 1 fi This patch fixes this issue Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-08-05 13:53:24 +08:00
WANG Chao	2276b8561c	Introduce kdump capture service This patch introduce a new kdump-capture.service which is used to run kdump.sh. kdump-capture.service has OnFailure=emergency.target and OnFailureIsolate=yes set. When kdump.sh fails, the kdump emergency service will be triggered and enter the error handling path. In 2nd kernel, the default target for systemd is initrd.target, so we put kdump-capture.service in initrd.target.wants/ and by that, system will start kdump-capture as part of the boot process. kdump.sh used to run in dracut-pre-pivot hook. Now kdump-capture.service is placed after dracut-pre-pivot.service and other dependencies are all copied from dracut-pre-pivot.service. So the start point of kdump.sh will be almost the same as it used to be. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2014-08-05 13:13:32 +08:00
WANG Chao	002337c671	Introduce kdump error handling service Now upon failure kdump script might not be called at all and it might not be able to execute default action. It results in a hang. Because we disable emergency shell and rely on kdump.sh being invoked through dracut-pre-pivot hook. But it might happen that we never call into dracut-pre-pivot hook because certain systemd targets could not reach due to failure in their dependencies. In those cases error handling code does not run and system hangs. For example: sysroot-var-crash.mount --> initrd-root-fs.target --> initrd.target \ --> dracut-pre-pivot.service --> kdump.sh If /sysroot/var/crash mount fails, initrd-root-fs.target will not be reached. And then initrd.target will not be reached, dracut-pre-pivot.service wouldn't run. Finally kdump.sh wouldn't run. To solve this problem, we need to separate the error handling code from dracut-pre-pivot hook, and every time when a failure shows up, the separated code can be called by the emergency service. By default systemd provides an emergency service which will drop us into shell every time upon a critical failure. It's very convenient for us to re-use the framework of systemd emergency, because we don't have to touch the other parts of systemd. We can use our own script instead of the default one. This new scheme will overwrite emergency shell and replace with kdump error handling code. And this code will do the error handling as needed. Now, we will not rely on dracut-pre-pivot hook running always. Instead whenever error happens and it is serious enough that emergency shell needed to run, now kdump error handler will run. dracut-emergency is also replaced by kdump error handler and it's enabled again all the way down. So all the failure (including systemd and dracut) in 2nd kernel could be captured, and trigger kdump error handler. dracut-initqueue is a special case, which calls "systemctl start emergency" directly, not via "OnFailure=emergency". In case of failure, emergency is started, but not in an isolation mode, which means dracut-initqueue is still running. On the other hand, emergency will call dracut-initqueue again when default action is dump_to_rootfs. systemd would block on the last dracut-initqueue, waiting for the first instance to exit, which leaves us hang. It looks like the following: dracut-initqueue (running) --> call dracut-emergency: --> dracut-emergency (running) --> kdump-error-handler.sh (running) --> call dracut-initqueue: --> blocking and waiting for the original instance to exit. To fix this, I'd like to introduce a wrapper emergency service. This emergency service will replace both the systemd and dracut emergency. And this service does nothing but to isolate to real kdump error handler service: dracut-initqueue (running) --> call dracut-emergency: --> dracut-emergency isolate to kdump-error-handler.service --> dracut-emergency and dracut-initqueue will both be stopped and kdump-error-handler.service will run kdump-error-handler.sh. In a normal failure case, this still works: foo.service fails --> trigger emergency.service --> emergency.service isolates to kdump-error-handler.service --> kdump-error-handler.service will run kdump-error-handler.sh Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Dave Young <dyoung@redhat.com>	2014-08-05 13:13:32 +08:00
WANG Chao	de95c74a76	mkdumprd: append "x-initrd.mount" to the mount options. Now when mount in /etc/fstab fails, systemd would not consider it as critical and it would continue to boot. In fact, emergency service is triggered, but not in an isolation mode, and it results in the emergency service getting shutdown at some point later of the boot process. We need isolation otherwise we won't see any emergency service. That is because in kdump initramfs, mount units specified in /etc/fstab are required by "local-fs.target". When any of these mounts fails, local-fs.target fails. For kdump initramfs, we need to isolate to emergency service on any of the mount failure, that said, every service should be stopped and only emergency service would run. But local-fs.target won't trigger that on its failure. That means in case of mount failure, local-fs.target also enters failure state, but all the service will continue without any interruption. After digging into the source code of systemd-fstab-generator, I find "x-initrd.mount" using in initramfs mount, will make the mount units required by "initrd-root-fs.target" rather than it's used to be "local-fs.target". "initrd-root-fs.target" is suitable to us because if it fails, it will isolate to emergency service. That means in case of any mount failure, the emergency service will start and everything else will stop. We want this effect because we need to take kdump fail-safe action when there's a mount failure. From systemd unit point of view, "initrd-root-fs.target" has OnFailureIsolate=yes, but "local-fs.target" doesn't. From systemd.unit(5): OnFailureIsolate= Takes a boolean argument. If true, the unit listed in OnFailure= will be enqueued in isolation mode, i.e. all units that are not its dependency will be stopped. If this is set, only a single unit may be listed in OnFailure=. Defaults to false. NOTE: Harald who contributed "x-initrd.mount" in systemd, confirmed that this feature will stay. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-08-05 13:13:32 +08:00
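
Note on commit b95a63839a (kdump-lib: fix get_option_value()): the message describes a lookup for "ssh" that returns the value of "sshkey" because the pattern is not an exact match. The following is a minimal sketch of that kind of bug and one way to tighten the pattern. It is not the code from kdump-lib; the function names get_option_value_loose and get_option_value_exact, the temporary config file, and its contents are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for /etc/kdump.conf so the sketch runs anywhere.
KDUMP_CONF=$(mktemp)
cat > "$KDUMP_CONF" <<'EOF'
ssh user@192.168.110.222
path /var/crash
sshkey /root/.ssh/kdump_id_rsa
EOF

# Loose match (illustrative): "^ssh" also matches the "sshkey" line,
# and taking the last match returns the wrong value.
get_option_value_loose() {
    grep "^$1" "$KDUMP_CONF" | awk '{print $2}' | tail -1
}

# Exact match (illustrative): the option name must be followed by whitespace.
get_option_value_exact() {
    grep "^$1[[:space:]]" "$KDUMP_CONF" | awk '{print $2}' | tail -1
}

echo "loose: $(get_option_value_loose ssh)"   # prints "loose: /root/.ssh/kdump_id_rsa"
echo "exact: $(get_option_value_exact ssh)"   # prints "exact: user@192.168.110.222"
```

Requiring whitespace after the option name is only one way to get an exact match; comparing the first field in awk would work as well.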

718 Commits
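
Note on commit adb585a336 (kdumpctl: fix error handling in fadump case): in the snippet quoted in that message, "$?" is tested after "rm -f", so it reflects the exit status of the cleanup rather than of the initrd rebuild. The message does not say how the patch fixes this, so the sketch below only shows the problem and one possible remedy (saving the status before cleanup). The functions rebuild_initrd, rebuild_buggy and rebuild_checked are hypothetical stand-ins, not kdumpctl code.

```shell
#!/bin/bash
# Hypothetical stand-in for the mkdumprd rebuild step; it always fails.
rebuild_initrd() { return 1; }

# Mirrors the quoted snippet: $? is read after "rm -f", which succeeds,
# so the rebuild failure goes unnoticed.
rebuild_buggy() {
    local tmp
    tmp=$(mktemp)
    rebuild_initrd
    rm -f "$tmp"
    if [ $? != 0 ]; then
        echo "mkdumprd: failed to rebuild initrd with fadump support" >&2
        return 1
    fi
    return 0
}

# One possible fix (an assumption): capture the status before cleaning up.
rebuild_checked() {
    local tmp ret
    tmp=$(mktemp)
    rebuild_initrd
    ret=$?
    rm -f "$tmp"
    if [ "$ret" != 0 ]; then
        echo "mkdumprd: failed to rebuild initrd with fadump support" >&2
        return 1
    fi
    return 0
}

rebuild_buggy && echo "buggy: failure went unnoticed"
rebuild_checked || echo "checked: failure detected"
```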