WANG Chao 002337c671 Introduce kdump error handling service
Currently, when a failure occurs in the kdump kernel, the kdump script
might not be called at all, so the default action is never executed and
the system hangs.

This happens because we disable the emergency shell and rely on kdump.sh
being invoked through the dracut-pre-pivot hook. However, that hook may
never be called if certain systemd targets cannot be reached because one
of their dependencies failed. In those cases the error handling code
does not run and the system hangs. For example:

sysroot-var-crash.mount --> initrd-root-fs.target --> initrd.target \
  --> dracut-pre-pivot.service --> kdump.sh

If the /sysroot/var/crash mount fails, initrd-root-fs.target is not
reached. Consequently initrd.target is not reached,
dracut-pre-pivot.service does not run, and kdump.sh never runs.
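
The chain above can be modelled with a toy shell sketch. It is only an
illustration of the ordering problem, not how systemd resolves
dependencies; the CHAIN variable and the walk_chain function are
hypothetical names introduced here.

```shell
#!/bin/sh
# Toy model of the initrd ordering chain from the commit message.
# Each "unit" is only reached if every unit before it succeeded.
CHAIN="sysroot-var-crash.mount initrd-root-fs.target initrd.target dracut-pre-pivot.service kdump.sh"

# walk_chain FAILING_UNIT
# Prints each unit that is reached and stops at the failing one.
# Returns 1 if the chain was cut short, 0 if kdump.sh was reached.
walk_chain() {
    failing="$1"
    for unit in $CHAIN; do
        if [ "$unit" = "$failing" ]; then
            echo "FAILED $unit"
            return 1
        fi
        echo "reached $unit"
    done
    return 0
}

# A failing mount at the head of the chain means nothing after it runs.
walk_chain sysroot-var-crash.mount || echo "kdump.sh was never reached"
```

With an empty argument (no failing unit) every entry is reached,
including kdump.sh, which is the only case the old scheme handled.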

To solve this problem, we need to separate the error handling code from
the dracut-pre-pivot hook, so that whenever a failure shows up the
separated code can be called by the emergency service.

By default systemd provides an emergency service which drops us into a
shell upon a critical failure. It is convenient to reuse this emergency
framework, because we don't have to touch the other parts of systemd:
we simply use our own script instead of the default one.

This new scheme overrides the emergency shell and replaces it with the
kdump error handling code, which does the error handling as needed. We
no longer rely on the dracut-pre-pivot hook always running. Instead,
whenever an error is serious enough that the emergency shell would have
run, the kdump error handler runs.

dracut-emergency is also replaced by the kdump error handler, and it is
enabled again all the way down. So every failure in the 2nd kernel (from
either systemd or dracut) is captured and triggers the kdump error
handler.

dracut-initqueue is a special case: it calls "systemctl start
emergency" directly, not via "OnFailure=emergency". In case of failure,
emergency is started, but not in isolation mode, which means
dracut-initqueue is still running. On the other hand, emergency calls
dracut-initqueue again when the default action is dump_to_rootfs.
systemd then blocks on the second dracut-initqueue, waiting for the
first instance to exit, which leaves us hanging. It looks like the
following:

dracut-initqueue (running)
  --> call dracut-emergency:
    --> dracut-emergency (running)
      --> kdump-error-handler.sh (running)
        --> call dracut-initqueue:
          --> blocking and waiting for the original instance to exit.

To fix this, I'd like to introduce a wrapper emergency service. This
emergency service replaces both the systemd and dracut emergency
services, and it does nothing but isolate to the real kdump error
handler service:

dracut-initqueue (running)
  --> call dracut-emergency:
    --> dracut-emergency isolate to kdump-error-handler.service
      --> dracut-emergency and dracut-initqueue will both be stopped
          and kdump-error-handler.service will run kdump-error-handler.sh.

In a normal failure case, this still works:
foo.service fails
  --> trigger emergency.service
    --> emergency.service isolates to kdump-error-handler.service
      --> kdump-error-handler.service will run kdump-error-handler.sh
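
A minimal sketch of what such a wrapper and handler pair could look
like, written out from shell into a scratch directory. The unit
contents, the scratch directory and the handler path
/bin/kdump-error-handler.sh are assumptions for illustration; the
actual dracut-kdump-emergency.service and
dracut-kdump-error-handler.service from this commit are not reproduced
here. The directives used (DefaultDependencies=, AllowIsolate=,
Type=oneshot, "systemctl --no-block isolate") are standard systemd
options.

```shell
#!/bin/sh
# Sketch only: write a wrapper emergency unit and the handler unit it
# isolates to. Contents are assumptions, not the files shipped here.
unit_dir="$(mktemp -d)"

cat > "$unit_dir/emergency.service" <<'EOF'
[Unit]
Description=Kdump emergency wrapper (sketch)
DefaultDependencies=no

[Service]
Type=oneshot
# Hand over to the real handler. Isolating stops units the handler does
# not depend on, such as a still-running dracut-initqueue.
ExecStart=/usr/bin/systemctl --no-block isolate kdump-error-handler.service
EOF

cat > "$unit_dir/kdump-error-handler.service" <<'EOF'
[Unit]
Description=Kdump error handler (sketch)
DefaultDependencies=no
# Required for "systemctl isolate" to accept this unit as a target.
AllowIsolate=yes

[Service]
Type=oneshot
# Assumed install path of the handler script inside the initramfs.
ExecStart=/bin/kdump-error-handler.sh
EOF

ls "$unit_dir"
```

Keeping the wrapper free of any real work means both entry points
(OnFailure= and the direct "systemctl start emergency" call from
dracut-initqueue) end up in the same handler.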

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2014-08-05 13:13:32 +08:00
anaconda-addon Rename the subpackage kdump-anaconda-addon 2014-05-22 18:32:43 +08:00
po Translation, Makefile: add make tgz option to auto pack po files 2013-12-24 14:25:23 +08:00
.gitignore kdump-anaconda-addon: update to kdump-anaconda-addon-001-4-g03898ef.tar.gz 2014-07-24 12:56:57 +08:00
98-kexec.rules Fix kdump udev memory event restarts 2014-05-04 17:08:34 +08:00
dracut-kdump-emergency.service Introduce kdump error handling service 2014-08-05 13:13:32 +08:00
dracut-kdump-error-handler.service Introduce kdump error handling service 2014-08-05 13:13:32 +08:00
dracut-kdump-error-handler.sh Introduce kdump error handling service 2014-08-05 13:13:32 +08:00
dracut-kdump.sh Introduce kdump error handling service 2014-08-05 13:13:32 +08:00
dracut-module-setup.sh Introduce kdump error handling service 2014-08-05 13:13:32 +08:00
dracut-monitor_dd_progress monitor-dd-progress fix 2013-06-25 16:45:59 +08:00
fadump-howto.txt kdump: Add firmware-assisted dump howto document 2014-07-28 13:03:51 +08:00
firstboot_kdump.py firstboot:fix reserve mem ui spinbox step size 2012-12-12 17:15:10 +08:00
kdump-dep-generator.sh kdump-dep-generator: Add kdump service dependencies on the fly 2014-04-17 11:27:31 +08:00
kdump-in-cluster-environment.txt Add fence_kdump support for generic clusters 2014-04-03 14:43:06 +08:00
kdump-lib-initramfs.sh Introduce kdump error handling service 2014-08-05 13:13:32 +08:00
kdump-lib.sh pass mount info to dracut when default target is a separate disk 2014-04-17 11:27:31 +08:00
kdump.conf kdump.conf: renew the path section 2014-04-17 11:27:31 +08:00
kdump.conf.5 kdump.conf: renew the path section 2014-04-17 11:27:31 +08:00
kdump.service Disable the ratelimit of kdump service on hotplug events 2014-05-04 17:08:14 +08:00
kdump.sysconfig do not mount root twice 2012-07-12 11:15:35 +08:00
kdump.sysconfig.i386 kdump.sysconfig: default to "nofail" mount 2013-09-27 15:45:24 +08:00
kdump.sysconfig.ppc64 kdump.sysconfig: default to "nofail" mount 2013-09-27 15:45:24 +08:00
kdump.sysconfig.s390x s390x, sysconfig: Change maxcpus=1 to nr_cpus=1 for s390x 2014-01-22 12:52:02 +08:00
kdump.sysconfig.x86_64 Add acpi_no_memhotplug to kdump kernel 2014-01-29 16:25:57 +08:00
kdumpctl kdump: Check whether or not to invoke capturing vmcore 2014-07-28 13:03:48 +08:00
kexec-kdump-howto.txt kexec-kdump-howto.txt: renew the path section 2014-04-17 11:27:31 +08:00
kexec-tools-2.0.3-disable-kexec-test.patch Disable kexec_test 2012-01-21 16:56:07 +08:00
kexec-tools-2.0.4-makedumpfile-Fix-free-bitmap_buffer_cyclic-error.patch Fix free bitmap_buffer_cyclic error. 2014-07-16 14:57:00 +08:00
kexec-tools-2.0.4-makedumpfile-Fix-Makefile-for-eppic_makedumpfile.so-build.patch makedumpfile: Fix Makefile for eppic_makedumpfile.so build 2014-06-13 13:17:56 +08:00
kexec-tools-2.0.4-makedumpfile-Introduce-the-mdf_pfn_t-type.patch Introduce the mdf_pfn_t type. 2014-07-16 14:56:41 +08:00
kexec-tools-2.0.4-makedumpfile-Move-counting-pfn_memhole-for-cyclic-mode.patch Move counting pfn_memhole for cyclic mode. 2014-07-16 14:57:18 +08:00
kexec-tools-2.0.4-makedumpfile-Remove-the-1st-bitmap-buffer-from-the-ELF-.patch Remove the 1st bitmap buffer from the ELF path in cyclic mode. 2014-07-16 14:57:15 +08:00
kexec-tools-2.0.4-makedumpfile-Stop-maximizing-the-bitmap-buffer-to-reduc.patch Stop maximizing the bitmap buffer to reduce the risk of OOM. 2014-07-16 14:57:21 +08:00
kexec-tools.spec Introduce kdump error handling service 2014-08-05 13:13:32 +08:00
mkdumprd mkdumprd: append "x-initrd.mount" to the mount options. 2014-08-05 13:13:32 +08:00
mkdumprd.8 Remove comma which is redundant 2013-02-16 15:19:41 +08:00
README README: Add a README file 2014-04-02 10:45:36 +08:00
rhcrashkernel-param rhcrashkernel-param: echo crashkernel=auto for rhel7 2012-08-20 15:01:47 +08:00
sources kdump-anaconda-addon: update to kdump-anaconda-addon-001-4-g03898ef.tar.gz 2014-07-24 12:56:57 +08:00
zanata-notes.txt Add a notes for zanata process 2012-12-05 01:23:09 -05:00

Adding a patch to kexec-tools
=============================
There is a mailing list, kexec@lists.fedoraproject.org, where all
discussion related to Fedora kexec-tools happens. All patches are posted
there for inclusion and are committed to kexec-tools after review.

So if you want your patches to be included in the Fedora kexec-tools
package, post them to kexec@lists.fedoraproject.org.

One can subscribe to the list and browse its archives here:

https://admin.fedoraproject.org/mailman/listinfo/kexec
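
A rough sketch of one common way to post patches to the list, assuming
git send-email is installed and configured for your mail setup. The
helper name post_patches_cmds and the outgoing/ directory are
hypothetical. The helper only prints the commands so they can be
reviewed before being run by hand.

```shell
#!/bin/sh
# Mailing list address taken from the text above.
LIST="kexec@lists.fedoraproject.org"

# post_patches_cmds [COUNT]
# Print the git commands that would export the last COUNT commits
# (default 1) as patch files and mail them to the list.
post_patches_cmds() {
    count="${1:-1}"
    echo "git format-patch -${count} -o outgoing/"
    echo "git send-email --to=${LIST} outgoing/*.patch"
}

post_patches_cmds 1
```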