Commit Graph

747 Commits

Author SHA1 Message Date
WANG Chao
c88a6bb5b6 Rebase makedumpfile-1.5.7
Rebase makedumpfile-1.5.7 and remove the useless patches.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-19 11:29:35 +08:00
WANG Chao
5b1065de3c Add sample eppic scripts to kexec-tools-eppic package
Upstream makedumpfile contains some sample eppic scripts for reference.
Now pull the whole scripts directory into kexec-tools-eppic package.

Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-09-18 11:19:50 +08:00
Vivek Goyal
b8b1667e2c udev-rules: Restart kdump service on cpu ADD/REMOVE events
This patch changes restart of kdump service from cpu online/offline events
to cpu add/remove events.

Some people have complained that they are running cpu online/offline tests
at high frequency and kdump restarts at high frequency and systemd disables
the service. As a temporary fix, we committed a patch to never disable
kdump service.

In general it probably is a good idea to restart kdump service on cpu
add/remove events.

Toshi Kani confirmed following.

- File for /sys/devices/system/cpu/cpuX/crash_notes will be created first
  before ADD event goes out. That means we can not miss creating EFL notes
  for newly created cpu.

- For REMOVE event files under /sys/devices/system/cpu/cpuX/ are removed
  first and then REMOVE event goes out. That means we will remove the elf
  note header for removed cpu.

- There are some race conditions like a cpu is removed but system crashes
  before kdump service restarts. In that case vmcore.c has to be more robust
  to be able to inspect elf notes and discard empty ones.

  Also it is possible that after cpu remove, crash notes memory got reused
  for something else and after crash vmcore.c might see some random data.
  It does basic size checks and discards elf notes if checks don't pass.

  Above rance conditions can happen even with OFFLINE event and there is
  no good way to remove these altogether. So making vmcore.c more robust
  is the right solution here.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: WANG Chao <chaowang@redhat.com>
2014-09-15 21:55:07 +08:00
WANG Chao
511ed60630 ppc64/kdump: Fix ELF header endianess
Backport the following commit from kexec-tools upstream:

commit 45b33eb
Author: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Date:   Fri Jul 25 17:07:49 2014 +0200

    ppc64/kdump: Fix ELF header endianess

    The ELF header created among the loading of the kdump kernel should be
    flagged using the current endianess and not always as big endian.

    Without this patch the data exposed in /proc/vmcore are not readable when
    running in LE mode.

    Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
    Signed-off-by: Simon Horman <horms@verge.net.au>

This is part of the work to enable ppc64le.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-15 21:55:04 +08:00
WANG Chao
79449e6612 kexec/ppc64: disabling exception handling when building the purgatory
Backport the following commit from upstream kexec-tools:

commit 335bad7
Author: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Date:   Tue Jul 22 18:22:28 2014 +0200

    kexec/ppc64: disabling exception handling when building the purgatory

    Some Linux distributions would like to turn on the GCC exception handling
    by default. As this option introduces symbols in the built code that are
    defined in a separate shared library, this is not a good idea to have such
    an option activated when building the purgatory.

    This patch forces the exception handling to be turned off when building the
    purgatory on ppc64 BE and LE.

    Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
    Signed-off-by: Simon Horman <horms@verge.net.au>

This is part of the work to enable ppc64le.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-15 21:55:02 +08:00
WANG Chao
768d9ce47f kexec/ppc64: move to device tree version 17
Backport the following commit from upstream kexec-tools:

commit 2ca2203
Author: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Date:   Mon Jun 16 14:42:43 2014 +0200

    kexec/ppc64: move to device tree version 17

    Kernel commit e6a6928c3ea1d0195ed75a091e345696b916c09b changed the way the
    device tree is processed in the kernel. Now version 2 is no more supported.

    This patch move the version of the device tree generated in ppc64
    environment from 2 to 17, allowing to kexec kernel 3.16.

    In addition, automates the define of NEED_STRUCTURE_BLOCK_EXTRA_PAD which
    should not be set for DT version 16 and above.

    Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
    Signed-off-by: Simon Horman <horms@verge.net.au>

This is part of the work to enable ppc64le.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-15 21:54:47 +08:00
Baoquan He
a0023e92fc Release 2.0.7-9 2014-09-10 10:51:04 +08:00
Vivek Goyal
38329992fe kdumpctl: Use kexec file based syscall for secureboot enabled machines
Now kexec file based syscall can be used with secureboot enabled machines.
Automatically switch to using new syscall if secureboot is enabled on the
machine.

Also remove the old message where kdump service failed if secureboot is
enabled. That's not the case anymore.

v2:
  Renamed "secureboot" to "Secure Boot" in user visible message.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-10 10:45:46 +08:00
Vivek Goyal
d301d5e542 kdumpctl: Use kexec file based mode to unload kdump kernel
Currently old kexec syscall denies unloading a kernel if secureboot is enabled.
I think this is not right behavior and should be changed. But for now, use
new syscall if secureboot is enabled and that allows unloading kernel.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-10 10:45:41 +08:00
Vivek Goyal
7041917dbd kdumpctl: Do not redirect error messages to /dev/null
Does anybody know why are we redirecting stderr to /dev/null when using
kexec load/unload commands? This sounds wrong to me. In case of error I
have no idea what went wrong.

Systemctl already puts all the information in journal. So if we are worried
that user will be bombarded with error messages, that should not be a concern.

So do not redirect stderr to /dev/null.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-10 10:45:37 +08:00
Baoquan He
3a9200bc9f kexec: Provide an option to use new kexec system call
This is a back port from upstream.

commit 046d1755d2bd723a11a180c265e61a884990712e
Author: Vivek Goyal <vgoyal@redhat.com>
Date:   Mon Aug 18 11:22:32 2014 -0400

    kexec: Provide an option to use new kexec system call

    Hi,

    This is v2 of the patch. Since v1, I moved syscall implemented check littler
    earlier in the function as per the feedback.

    Now a new kexec syscall (kexec_file_load()) has been merged in upstream
    kernel. This system call takes file descriptors of kernel and initramfs
    as input (as opposed to list of segments to be loaded). This new system
    call allows for signature verification of the kernel being loaded.

    One use of signature verification of kernel is secureboot systems where
    we want to allow kexec into a kernel only if it is validly signed by
    a key system trusts.

    This patch provides and option --kexec-file-syscall (-s), to force use of
    new system call for kexec. Default is to continue to use old syscall.

    Currently only bzImage64 on x86_64 can be loaded using this system call.
    As kernel adds support for more arches and for more image types, kexec-tools
    can be modified accordingly.

    Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Signed-off-by: Simon Horman <horms@verge.net.au>

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
2014-09-10 10:44:24 +08:00
WANG Chao
838d8046b7 Release 2.0.7-8
Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-08-29 13:20:00 +08:00
Dave Young
fae72772d7 Removing firstboot module
Since we have added kdump anaconda addon, thus removing firstboot module
User can setup kdump in anaconda install phase, and change the kdump.conf
details in s-c-kdump

Delete the firstboot po files as well.

Signed-off-by: Dave Young <dyoung@redhat.com>
2014-08-29 13:17:14 +08:00
WANG Chao
8b0cc435e0 update to kdump-anaconda-addon-003.tar.gz
Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-08-29 13:02:18 +08:00
WANG Chao
1ff6192737 kdump-emergency.service: executable uses absolute path
Bao noticed the following systemd warning:

systemd[1]: [/usr/lib/systemd/system/emergency.service:17] Executable path is
not absolute, ignoring: systemctl --no-block isolate kdump-error-handler.service

It turns out that now systemd doesn't allow relative path for an executable, we
must adapt that, make the change.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Tested-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-08-28 13:05:56 +08:00
WANG Chao
e242ae873b Release 2.0.7-7
Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-08-21 10:32:08 +08:00
WANG Chao
d2d16f0521 update to kdump-anaconda-addon-002.tar.gz
Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-08-21 10:32:08 +08:00
WANG Chao
890d2fabf4 spec: install udev rules 98-kexec.rules to /usr/lib not /etc
Resolves: rhbz#1131169

Zbigniew (systemd developer) pointed out that our udev rules should
install to /usr/lib/ not /etc. Because /etc is supposed to be used by
sysadmins only and package should install by default into /usr/lib.

As advised here:
http://www.freedesktop.org/software/systemd/man/udev.html#Rules%20Files

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-08-21 10:31:52 +08:00
Peter Robinson
e46e590cf9 - Rebuilt for https://fedoraproject.org/wiki/Fedora_21_22_Mass_Rebuild 2014-08-16 23:35:59 +00:00
WANG Chao
082043e117 dracut-module-setup: allow short hostname in cluster configuration
Node could be referenced by short hostname (hostname -s) in cluster
configuration:

[root@virt-068 /]# pcs status nodes
Pacemaker Nodes:
 Online: virt-066 virt-067 virt-068
 Standby:
 Offline:

We didn't know it before. Martin noticed the kdump failure, and provide
this fix. Thanks to Martin.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Tested-by: Martin Juricek <mjuricek@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-08-12 13:16:08 +08:00
WANG Chao
b17977dfb8 Release 2.0.7-5
Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-08-06 12:06:04 +08:00
WANG Chao
5a77531f8a Let systemd handle unmount
Since we've use systemd to control the shutdown path, there's not need
for us to unmount the filesystem, systemd will do that for us just like
it does in a normal boot.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-08-06 12:02:59 +08:00
WANG Chao
1719a8aa92 do not force shutdown
It's more safe to use systemd (init) to control the shutdown path for us
in either reboot or power off or halt action.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-08-06 12:02:59 +08:00
WANG Chao
83d14e0f39 kdumpctl, fadump: only use lsinitrd when initramfs exists in fadump mode
When there's no kdump initramfs for lsinitrd to inspect with, there will
be an error:

  # kdumpctl start
  /boot/initramfs-3.16.0-rc7+kdump.img does not exist

  Usage: lsinitrd [options] [<initramfs file> [<filename> [<filename> [...] ]]]
  Usage: lsinitrd [options] -k <kernel version>

  -h, --help                  print a help message and exit.
  -s, --size                  sort the contents of the initramfs by size.
  -m, --mod                   list modules.
  -f, --file <filename>       print the contents of <filename>.
  -k, --kver <kernel version> inspect the initramfs of <kernel version>.

No kdump initial ramdisk found.
Rebuilding /boot/initramfs-3.16.0-rc7+kdump.img
[..]

In addition, lsinitrd is a slow operation. We only run it when it's
fadump mode, to speed up in kdump mode.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-08-06 12:02:59 +08:00
Baoquan He
f7f8361af9 Add static route into cmdline if target address is not local
If one target address is not local and its route is different than
default gateway, the specific route to this target address need be
added. E.g, target is 192.168.200.222.

sh> ip route show
default via 192.168.122.1 dev eth0  proto static  metric 1024
192.168.200.0/24 via 192.168.100.222 dev ens10  proto static  metric 1

In this patch, get the route to the specific target address and store
it as cmdline, here is /etc/cmdline.d/45-route-static.conf. And the
route options are separated by semicolon like below. Then the stored
route can be parsed when kdump kernel boot up.

192.168.200.0/24:192.168.100.222:ens10

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-08-05 14:05:53 +08:00
Hari Bathini
adb585a336 kdumpctl: fix error handling in fadump case
In fadump, in case of failure while rebuilding initrd, the error status
is not handled properly. See code snippet below:

    $MKDUMPRD $target_initrd_tmp --rebuild $TARGET_INITRD --kver $kdump_kver \
            -i /tmp/fadump.initramfs /etc/fadump.initramfs
    rm -f /tmp/fadump.initramfs
    if [ $? != 0 ]; then
        echo "mkdumprd: failed to rebuild initrd with fadump support" >&2
        return 1
    fi

This patch fixes this issue

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-08-05 13:53:24 +08:00
WANG Chao
2276b8561c Introduce kdump capture service
This patch introduce a new kdump-capture.service which is used to run
kdump.sh.

kdump-capture.service has OnFailure=emergency.target and
OnFailureIsolate=yes set. When kdump.sh fails, the kdump emergency
service will be triggered and enter the error handling path.

In 2nd kernel, the default target for systemd is initrd.target, so we
put kdump-capture.service in initrd.target.wants/ and by that, system
will start kdump-capture as part of the boot process.

kdump.sh used to run in dracut-pre-pivot hook. Now kdump-capture.service
is placed after dracut-pre-pivot.service and other dependencies are all
copied from dracut-pre-pivot.service. So the start point of
kdump.sh will be almost the same as it used to be.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2014-08-05 13:13:32 +08:00
WANG Chao
002337c671 Introduce kdump error handling service
Now upon failure kdump script might not be called at all and it might
not be able to execute default action. It results in a hang.

Because we disable emergency shell and rely on kdump.sh being invoked
through dracut-pre-pivot hook. But it might happen that we never call
into dracut-pre-pivot hook because certain systemd targets could not
reach due to failure in their dependencies. In those cases error
handling code does not run and system hangs. For example:

sysroot-var-crash.mount --> initrd-root-fs.target --> initrd.target \
  --> dracut-pre-pivot.service --> kdump.sh

If /sysroot/var/crash mount fails, initrd-root-fs.target will not be
reached. And then initrd.target will not be reached,
dracut-pre-pivot.service wouldn't run. Finally kdump.sh wouldn't run.

To solve this problem, we need to separate the error handling code from
dracut-pre-pivot hook, and every time when a failure shows up, the
separated code can be called by the emergency service.

By default systemd provides an emergency service which will drop us into
shell every time upon a critical failure. It's very convenient for us to
re-use the framework of systemd emergency, because we don't have to
touch the other parts of systemd. We can use our own script instead of
the default one.

This new scheme will overwrite emergency shell and replace with kdump
error handling code. And this code will do the error handling as needed.
Now, we will not rely on dracut-pre-pivot hook running always. Instead
whenever error happens and it is serious enough that emergency shell
needed to run, now kdump error handler will run.

dracut-emergency is also replaced by kdump error handler and it's
enabled again all the way down. So all the failure (including systemd
and dracut) in 2nd kernel could be captured, and trigger kdump error
handler.

dracut-initqueue is a special case, which calls "systemctl start
emergency" directly, not via "OnFailure=emergency". In case of failure,
emergency is started, but not in a isolation mode, which means
dracut-initqueue is still running. On the other hand, emergency will
call dracut-initqueue again when default action is dump_to_rootfs.
systemd would block on the last dracut-initqueue, waiting for the first
instance to exit, which leaves us hang. It looks like the following:

dracut-initqueue (running)
  --> call dracut-emergency:
    --> dracut-emergency (running)
      --> kdump-error-handler.sh (running)
        --> call dracut-initqueue:
          --> blocking and waiting for the original instance to exit.

To fix this, I'd like to introduce a wrapper emergency service. This
emegency service will replace both the systemd and dracut emergency. And
this service does nothing but to isolate to real kdump error handler
service:

dracut-initqueue (running)
  --> call dracut-emergency:
    --> dracut-emergency isolate to kdump-error-handler.service
      --> dracut-emergency and dracut-initqueue will both be stopped
          and kdump-error-handler.service will run kdump-error-handler.sh.

In a normal failure case, this still works:
foo.service fails
  --> trigger emergency.service
    --> emergency.service isolates to kdump-error-handler.service
      --> kdump-error-handler.service will run kdump-error-handler.sh

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2014-08-05 13:13:32 +08:00
WANG Chao
de95c74a76 mkdumprd: append "x-initrd.mount" to the mount options.
Now when mount in /etc/fstab fails, systemd would not consider it as
critical and it would continue to boot. In fact, emergency service is
triggered, but not in a isolation mode, and it results in the emergency
service getting shutdown at some point later of the boot process. We
need isolation otherwise we won't see any emergency service.

That is because in kdump initramfs, mount units specified in /etc/fstab
are required by "local-fs.target". When any of these mounts fails,
local-fs.target fails.

For kdump initramfs, we need to isolate to emergency service on any of
the mount failure, that said, every service should be stopped and onlu
emergency service would run. But local-fs.target won't trigger that on
its failure. That means in case of mount failure, local-fs.target also
enters failure state, but all the service will continue without any
interruption.

After digging looking into source code of systemd-fstab-generator. I
find "x-initrd.mount" using in initramfs mount, will make the mount
units required by "initrd-root-fs.target" rather than it's used to be
"local-fs.target".

"initrd-root-fs.target" is suitable to us because if it fails, it will
isolate to emergency service. That means in case of any mount failure,
the emergeny service will start and everything else will stop. We want
this effect because we need to take kdump fail-safe action when there's
a mount failure.

From systemd unit point of view, "initrd-root-fs.target" has
OnFailureIsolate=yes, but "local-fs.target" doesn't. From
systemd.unit(5):

OnFailureIsolate=
    Takes a boolean argument. If true, the unit listed in OnFailure=
    will be enqueued in isolation mode, i.e. all units that are not its
    dependency will be stopped. If this is set, only a single unit may
    be listed in OnFailure=. Defaults to false.

NOTE: Harald who contributed "x-initrd.mount" in systemd, confirmed that
      this feature will stay.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-08-05 13:13:32 +08:00
WANG Chao
0787accc4e cleanup: mount dump target under /sysroot in 2nd kernel
This patch does the following change in 2nd kernel:
 - dump target is mounted under /sysroot

With this change,  we don't need to track what we've mounted in 2nd
kernel. We can just umount recursively every mount in /sysroot by
command:

  umount -R /sysroot

It's very convenient to do so, because it's hard to track what we've
mounted when we're in error handling path (later patches). So mount
everything under /sysroot is reasonable and practical for us.

Also clean up a bit along with this patch.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2014-08-05 13:13:30 +08:00
WANG Chao
3b27570bea cleanup: extract functions from kdump.sh to kdump-lib-initramfs.sh
Extract functions from kdump.sh, and construct kdump-lib-initramfs.sh as
kdump common functions/varaibles library.

kdump-lib-initramfs.sh will include kdump-lib.sh, because it will use
the functions from there. IOW, kdump-lib-initramfs.sh will be a superset
of kdump-lib.sh

So after this cleanup:

- scripts running in 1st kernel only have to include kdump-lib.sh
- scripts running in 2nd kernel only have to include kdump-lib-initramfs.sh

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2014-08-05 13:13:11 +08:00
Vivek Goyal
7f2717aa0a dracut-kdump.sh: Issue a sync after saving vmcore-dmesg.txt
Recently somebody reported an issue where vmcore-dmesg.txt was saved
successfully but later saving vmcore failed to due to lack of space on disk.
System rebooted but after reboot there was nothing on disk. Not even
vmcore-dmesg.txt.

Issue a sync after saving vmcore-dmesg.txt to solve this issue.

I think this is happening because we are doing "reboot -f" instead of going
through systemd reboot path. Anyway, doing a sync now should take care of
this.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: WANG Chao <chaowang@redhat.com>
2014-07-28 13:22:26 +08:00
Hari Bathini
072fa3a65a kdump: Add firmware-assisted dump howto document
Resending this patch by updating 'fadump control flow' to reflect
the latest changes.

This patch adds fadump howto document to kexec-tools. The document
is prepared in reference to kexec-kdump-howto.txt document.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-28 13:03:51 +08:00
Hari Bathini
78589a3207 kdump: Check whether or not to invoke capturing vmcore
The script dracut-kdump.sh is  responsible for capturing vmcore during
second kernel boot.  Currently this  script  gets installed into kdump
initrd as part of kdumpbase dracut module.

With fadump support, 'dracut-kdump.sh' script also gets installed into
default initrd to capture  vmcore generated by firmware assisted dump.
Thus in fadump case, the  same initrd is  going to be used for  normal
boot as well as boot after system crash. Hence a  check is required to
see if it is a normal boot or boot after crash.

A new node "ibm,kernel-dump" is added, to the device tree, by firmware
to notify kernel if it is booting after crash.  The below patch adds a
check for this node  before executing  steps to  capture vmcore.  This
check will help bypassing  the vmcore capture steps during normal boot
process.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-28 13:03:48 +08:00
Hari Bathini
5bb5be045c kdump: Rebuild default initrd for firmware assisted dump
The current kdump infrastructure builds a separate initrd which then
gets loaded into memory by kexec-tools for use by  kdump kernel. But
firmware  assisted dump (FADUMP) does not  use kexec-based approach.
After crash, firmware reboots  the  partition and  loads grub loader
like the normal  booting process does. Hence in the FADUMP approach,
the second kernel  (after crash) will always  use the default initrd
(OS built). So, to support FADUMP, change is required,  as in to add
dump capturing steps, in this initrd.

The  current kdumpctl script implementation  already has the code to
build initrd  using mkdumprd. This patch  uses  the new  '--rebuild'
option introduced, in  dracut, to incrementally build  the initramfs
image. Before rebuilding, we may need to  probe the initrd image for
fadump support, to avoid  rebuilding the initrd image multiple times
unnecessarily. This can be done using "lsinitrd" tool with the newly
proposed '--mod' option  & inspecting the presence of "kdumpbase" in
the list of modules of default initrd image. We rebuild the image if
only "kdumpbase" module is missing in the initrd image. Also, before
rebuilding, a backup of default initrd image is taken.

Kexec-tools package in rhel7 is now enhanced to insert a out-of-tree
kdump  module for  dracut,  which is responsible for  adding  vmcore
capture  steps into initrd,  if dracut is  invoked  with  "IN_KDUMP"
environment variable set to 1.  mkdumprd script exports "IN_KDUMP=1"
environment variable before  invoking  dracut to build kdump initrd.
This patch relies on this current mechanism of kdump init script.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-28 13:03:44 +08:00
Hari Bathini
9f7b8b03b4 kdump: Modify kdump script to stop firmware assisted dump
During service kdump stop, if firmware assisted dump is enabled
and running,  then stop firmware assisted dump by echo'ing 0 to
'/sys/kernel/fadump_registered' file.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-28 13:03:41 +08:00
Hari Bathini
734790aa75 kdump: Modify kdump script to start the firmware assisted dump.
During service kdump start, if firmware assisted dump is not enabled then
fallback to starting of existing kexec based kdump.  If firmware assisted
is enabled but not running, then start firmware assisted dump by echo'ing
1 to '/sys/kernel/fadump_registered' file.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-28 13:03:38 +08:00
Hari Bathini
e0e70085e1 kdump: Modify status routine to check for firmware-assisted dump
This patch enables kdump script to  check if firmware-assisted dump is
enabled or not by reading value from '/sys/kernel/fadump_enabled'. The
determine_dump_mode() routine sets dump_mode to 'fadump', if fadump is
enabled. By default, dump_mode is set to 'kdump' mode.

Modify status routine to check if firmware assisted dump is registered
or not by  reading value from '/sys/kernel/fadump_registered' file. If
it is set to '1' then return status=0 else return status=1.

    0  <= Firmware assisted is enabled and running
    1  <= Firmware assisted is enabled but not running

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-28 13:02:28 +08:00
WANG Chao
0b63f4a522 Release 2.0.7-4
Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-07-24 13:01:19 +08:00
WANG Chao
ba7660f37e dracut-module-setup: NIC renamed with prefix "kdump-" for native ethX
We met a problem that eth0 ends up being eth1 and eth1 being eth0
between 1st and 2nd kernel. Because we pass ifname=eth0:$mac to force
it's named eth0 and since "eth0"is already taken by the other NIC, udev
fails to bring up the NIC we want, thus kdump fails.

kernel assigned network interface names are not persistent. So if first
kernel is using kernel assigned interface names, then force it to use
"kdump-" prefixed names in second kernel.

For ethX, we put a prefix "kdump-" before it, so in 2nd kernel, ethX
will name to "kdump-ethX". So that we can avoid the naming conflict.

We only need to change the ethernet card name, that means, for bridge,
vlan, bond, team devices' names , we never prefix them. Because these
names are assigned when they're created by userspace.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-24 12:58:06 +08:00
WANG Chao
fd4bd5552b dracut-module-setup: avoid writing the vlan.conf twice
We handle different types of device for vlan. For each type, it should
write different options for vlan.conf in each control path.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-24 12:57:53 +08:00
WANG Chao
10b2ee22ef kdump-anaconda-addon: update to kdump-anaconda-addon-001-4-g03898ef.tar.gz
Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-07-24 12:56:57 +08:00
Hari Bathini
69986be99d cleanup: Remove "function" keyword from all functions and correct typos
This cleanup patch removes unnecessary keyword "function" at all places in
kdumpctl script. Also, corrects couple of typos in the script.

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-21 17:47:21 +08:00
WANG Chao
61fcf06f98 Release 2.0.7-3
Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-07-21 14:50:43 +08:00
WANG Chao
a87b9ff7ef Update kexec-tools-anaconda-addon
update to kdump-anaconda-addon-20140721.tar.gz

Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-07-21 14:49:13 +08:00
WANG Chao
463742ad95 kdumpctl: display message while waiting for the single instance lock
Vivek suggested we should display message while waiting for the lock,
because the waiting could be long and user will have no idea what's
going on.

So we will repeat the following message every 5 seconds while waiting:

"Another app is currently holding the kdump lock; waiting for it to exit..."

Thanks Vivek for providing a more comprehensive message.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-18 13:31:54 +08:00
WANG Chao
993961a411 Fix a typo in %changelog
Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-07-16 15:10:00 +08:00
WANG Chao
c45783a2c3 Release 2.0.7-2
Signed-off-by: WANG Chao <chaowang@redhat.com>
2014-07-16 14:59:13 +08:00
Baoquan He
e5769870b1 Stop maximizing the bitmap buffer to reduce the risk of OOM.
This is a backport of the following upstream commit.

commit 0b732828091a545185ad13d0b2e6800600788d61
Author: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Date:   Tue Jun 10 13:57:29 2014 +0900

    [PATCH 3/3] Stop maximizing the bitmap buffer to reduce the risk of OOM.

    We tried to maximize the bitmap buffer to get the best performance,
    but the performance degradation caused by multi-cycle processing
    looks very small according to the benchmark on 2TB memory:

      https://lkml.org/lkml/2013/3/26/914

    This result means we don't need to make an effort to maximize the
    bitmap buffer, it will just increase the risk of OOM.

    This patch sets a small fixed value (4MB) as a safety limit,
    it may be safer and enough in most cases.

    Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-16 14:57:21 +08:00
Baoquan He
23d1c25fd5 Move counting pfn_memhole for cyclic mode.
This is a backport of the following upstream commit.

commit 2648a8f7caa63e3ec82fd4bce471cec0a895b704
Author: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Date:   Mon Jun 9 17:48:30 2014 +0900

    [PATCH 2/3] Move counting pfn_memhole for cyclic mode.

    In cyclic mode, memory holes are checked in initialize_2nd_bitmap_cyclic()
    in both the kdump path and the ELF path, so pfn_memhole should be
    counted there.

    Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-07-16 14:57:18 +08:00