(1) explain early kdump a little clearer in "Introduction"
(2) move the notes out to a new "Notes" section for readability and
add a note about need of reconfiguration after kernel update
(3) change journalctl -x option to -b option because -x is unnecessary
and -b will make it very faster if persistent journal is available
(4) shorten the example messages for readability
(5) add a note to "Limitation" about the earliness of early kdump
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Acked-by: Lianbo Jiang <lijiang@redhat.com>
For step2 in early-kdump-howto.txt, --force option of dracut
is necessary to rebuild system initramfs. Without --force option,
executing step2 fails because system initramfs already exists.
Signed-off-by: Shigeki Morishima <s.morishima@jp.fujitsu.com>
Acked-by: Kairui Song <kasong@redhat.com>
Currently while trying to save vmcore via vlan eth interface, the Kdump
kernel fails with network unreachable message.
This is because mkdumprd produces a vlan config that does not get
ip address for vlan on eth device.
Fix the same via this patch.
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
With FADump support added on POWERNV paltform, enable the scripts to
capture /proc/vmcore. Also, if CONFIG_OPAL_CORE is enabled, OPAL core
is preserved and exported on POWERNV platform. So, offload OPAL core,
if it is available.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Kairui Song <kasong@redhat.com>
UEFI Secure boot is a signature verification mechanism, designed to
prevent malicious code being loaded and executed at the early boot
stage. This makes sure that code executed is trusted by firmware.
Previously, with kexec_file_load() interface, kernel prevents unsigned
kernel image from being loaded if secure boot is enabled. So kdump will
detect whether secure boot is enabled firstly, then decide which interface
is chosen to execute, kexec_load() or kexec_file_load(). Otherwise unsigned
kernel loading will fail if secure boot enabled, and kexec_file_load() is
entered.
Now, the implementation of kexec_file_load() is adjusted in below commit.
With this change, if CONFIG_KEXEC_SIG_FORCE is not set, unsigned kernel
still has a chance to be allowed to load under some conditions.
commit 99d5cadfde2b ("kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG
and KEXEC_SIG_FORCE")
And in the current Fedora, the CONFIG_KEXEC_SIG_FORCE is not set, only the
CONFIG_KEXEC_SIG and CONFIG_BZIMAGE_VERIFY_SIG are set on x86_64 by default.
It's time to spread kexec_file_load() onto all systems of x86_64, including
Secure-boot platforms and legacy platforms. Please refer to the following
form.
.----------------------------------------------------------------------.
| . | signed kernel | unsigned kernel |
| . types |-----------------------|-----------------------|
| . |Secure boot| Legacy |Secure boot| Legacy |
| . |-----------|-----------|-----------|-----------|
| options . | prev| now | prev| now | | | prev| now |
| . |(file|(file|(only|(file| prev| now |(only|(file|
| . |load)|load)|load)|load)| | |load)|load)|
|----------------------|-----|-----|-----|-----|-----|-----|-----|-----|
|KEXEC_SIG=y | | | | | | | | |
|SIG_FORCE is not set |succ |succ |succ |succ | X | X |succ |succ |
|BZIMAGE_VERIFY_SIG=y | | | | | | | | |
|----------------------|-----|-----|-----|-----|-----|-----|-----|-----|
|KEXEC_SIG=y | | | | | | | | |
|SIG_FORCE is not set | | | | | | | | |
|BZIMAGE_VERIFY_SIG is |fail |fail |succ |fail | X | X |succ |fail |
|not set | | | | | | | | |
|----------------------|-----|-----|-----|-----|-----|-----|-----|-----|
|KEXEC_SIG=y | | | | | | | | |
|SIG_FORCE=y |succ |succ |succ |fail | X | X |succ |fail |
|BZIMAGE_VERIFY_SIG=y | | | | | | | | |
|----------------------|-----|-----|-----|-----|-----|-----|-----|-----|
|KEXEC_SIG=y | | | | | | | | |
|SIG_FORCE=y | | | | | | | | |
|BZIMAGE_VERIFY_SIG is |fail |fail |succ |fail | X | X |succ |fail |
|not set | | | | | | | | |
|----------------------|-----|-----|-----|-----|-----|-----|-----|-----|
|KEXEC_SIG is not set | | | | | | | | |
|SIG_FORCE is not set | | | | | | | | |
|BZIMAGE_VERIFY_SIG is |fail |fail |succ |succ | X | X |succ |succ |
|not set | | | | | | | | |
----------------------------------------------------------------------
Note:
[1] The 'X' indicates that the 1st kernel(unsigned) can not boot when the
Secure boot is enabled.
Hence, in this patch, if on x86_64, let's use the kexec_file_load() only.
See if anything wrong happened in this case, in Fedora firstly for the
time being.
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
The dracut initqueue may quit immediately and won't trigger any hook if
there is no "finished" hook still pending (finished hook will be deleted
once it return 0).
This issue start to appear with latest dracut, latest dracut use
network-manager to configure the network,
network-manager module only install "settled" hook, and we didn't
install any other hook. So NFS/SSH dump will fail. iSCSI dump works
because dracut iscsi module will install a "finished" hook to detect if
the iscsi target is up.
So for NFS/SSH we keep initqueue running until the host successfully get
a valid IP address, which means the network is ready.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
sed and awk is heavily used everywhere in the code, but it's not
explicitely installed by kdump dracut module. If the module in dracut
stop installing them (which already happened with latest dracut
upstream), kdump will break.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
For ssh/nfs dump, kdump need the 'ip' tool to get the host ip address
for naming the vmcore. But kdump-module-setup.sh never installed this
tool. kdump-module-setup.sh worked so far as dracut network module will
help install it.
After dracut changed to use 35network-manager for network setup, "ip"
command won't be installed in second kernel by default. So need to
ensure "ip" is installed when installing kdump dracut module.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Previously is_nfs_dump_target didn't cover the case that 'path <value>'
where <value> points to a nfs mount point. This function is never used
in first kernel before so so far it worked.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
This help deduplicate the code. Use a single function instead of
repeat the same logic.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Some of echo $(...) code segment is pointless, just call the command
directly, and remove some useless if / return statement.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Previous commit f13eab6 ('mkdumprd: simplify dracut args parsing')
break dracut arguments parsing for some use case, this should fix it
well.
Passed nfs/local/iscsi/ssh dump test, and with extra dracut_argss.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Currently "systemctl --fail --no-block default" will be executed on
kdump-error-handler exit due to the config. This may makes systemd try
to isolate to the default target again.
The execute chain will be like:
initrd.target -> kdump.sh (failed) -> kdump-error-handler.service ->
failure_action -> final_action -> ExecStopPost (go to initrd.target again)
Currently, reboot/shutdown/halt is called by either by executing
failure_action or final_action. So the loop will be stopped. However
nont of the reboot/shutdown/halt call is blocking so it might lead to
race issue.
Just drop the ExecStopPost to fix this potential issue.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Tested-by: HATAYAMA Daisuke <d.hatayama@fujitsu.com>
Previously dracut_args is stored as an array in the shell code,
so we can just pass this array as argument to dracut.
But when trying to append new dracut argument, we have to manually
handle the quotes and spaces to ensure the appended argument is
splitted into array elements in the right way.
This is complex and hard to read or maintain. Instead, just store the
dracut_args as a plain string, and let xargs help parse it and call
dracut.
Simply passing $dracut_args or "$dracut_args" will either let dracut
consider the whole string as a single argument, or loss all the
quotes. xargs can handle it well and cover more corner cases.
Eg. one corner case before, if we have:
dracut_args --mount "/dev/sda1 /mnt/test xfs rw"
Kdump will fail, because the function add_dracut_arg() will wrongly
change the arguments into: (Notice the extra space.)
dracut_args --mount " /dev/sda1 /mnt/test xfs rw"
Instead of fixing it just use xargs instead. Tested with above config
and multiple other dracut_args values.
Resolves: bz1700136
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
By default kernel have vm.zone_reclaim_mode = 0 and large page
allocation might fail as kernel is very conservative on memory
reclaiming. If the page allocation failure is not handled carefully
it could lead to more serious problems.
This issue can be reproduced by change with following steps:
- Fill up page cache use:
# dd if=/dev/urandom of=/test bs=1M count=1300
- Now the memory is filled with write cache:
# free -m
total used free shared buff/cache available
Mem: 1790 184 132 2 1473 1348
Swap: 2119 7 2112
- Insert a module which simply calls "kmalloc(SZ_1M, GFP_KERNEL)" for
512 times: (Notice: vmalloc don't have such problem)
# insmod debug_module.ko
- Got following allocation failure:
insmod: page allocation failure: order:8, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
- Clean up and repeat again with vm.zone_reclaim_mode = 3, OOM is not
observed.
In kdump kernel there is usually only one online CPU and limited memory,
so we set vm.zone_reclaim_mode = 3 to let kernel reclaim memory more
aggresively to avoid such issue.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
When large amount of memory, about 1TB, is removed with DLPAR memory
remove operation, kdump reload could fail due to race condition with
device tree property update. In such scenario, the subsequent kdump
reload requests would also fail as reload() only proceeds if current
load status is active. Since the possibility of this race condition
couldn't be wished away due to the nature of the scenario, workaround
it by proceeding to load even if current load status is not active as
long as kdump service is active, which kdump udev rules already check
for.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Kairui Song <kasong@redhat.com>
If failure_action is shutdown/reboot/halt, final_action is pointless as
the system will be already stopping. And if final_action is different
from failure_action, it will trigger a systemd race problem and cause
unexpected behavior to occur.
So let the error handler stop and exit after performing failure_action
successfully if failure_action is one of shutdown/reboot/halt.
This way, final_action will not be executed.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Merge kdump_setup_netdev into kdump_install_net.
kdump_install_net is a wrapper of calling kdump_setup_netdev, and
it do following three extra things:
1. Sanitize and resolve the hostname
2. Resolve the route to the destination
3. Set the default gateway for once
There is currently only one caller of kdump_setup_netdev, the iscsi
network setup code, and it's doing 1 and 2 by itself. And there should
only be one default gateway in kdump enviroment, so applying 3 here is
fine.
And the comment of kdump_install_net is wrong and obsoleted, update the
comment too.
Just merge kdump_setup_netdev into kdump_install_net and always use
kdump_install_net instead.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
check_size checks if the specified dump path on the ssh target have
enough space. And if the path doesn't exits, it will fail and exit.
mkdir_save_path_ssh should be called first to check if the path
exists, and create the path if it doesn't exits, so the size check
can always work properly.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
In commit a431a7e354 (module-setup: fix 99kdumpbase network dependency),
the statement for OR operation is still wrong.
The OR condition statement should be: if a || b
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
In kdump.conf, if sshkey points to an invalid ssh key, 'kdumpctl restart'
can bail out immediately instead of retry.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
For fadump, this helps to reduce the risk of boot failure, and
may also help speed up the boot by a bit.
For normal kdump, this will delay the dump target mounting, and no
longer depend on systemd to do the mounting job.
And currently there is a failure that caused by some mount handling
bug with kernel and systemd that is failing the system booting:
[FAILED] Failed to mount /kdumproot/home.
See 'systemctl status kdumproot-home.mount' for details.
[DEPEND] Dependency failed for Local File Systems.
[ OK ] Reached target Remote File Systems (Pre).
[ OK ] Reached target Remote File Systems.
Starting udev Coldplug all Devices...
Starting Create Volatile Files and Directories...
Starting Kdump Emergency...
This patch can bypass it. The fix of root cause is still WIP, but this
patch itself is a nice to have optimization so it's reasonable to do so.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
When trying to setup kdump for fedora-coreos, kdumpctl start fails to
find the correct boot directory since BOOT_IMAGE start with the grub
device name
Signed-off-by: Yuval Turgeman <yturgema@redhat.com>
Print some message during the long wait period to reflect the process.
The message will look like:
Network dump target is not usable, waiting for it to be ready
...
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from the makedumpfile devel branch in upstream.
commit 8425342a52b23d462f10bceeeb1c8a3a43d56bf0
Author: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Date: Fri Sep 6 09:50:34 2019 -0400
[PATCH] Fix inconsistent return value from find_vmemmap()
When -e option is given, the find_vmemmap() returns FAILED(1) if
it failed on x86_64, but on architectures other than that, it is
stub_false() and returns FALSE(0).
if (info->flag_excludevm) {
if (find_vmemmap() == FAILED) {
ERRMSG("Can't find vmemmap pages\n");
#define find_vmemmap() stub_false()
As a result, on the architectures other than x86_64, the -e option
does some unnecessary processing with no effect, and marks the dump
DUMP_DH_EXCLUDED_VMEMMAP unexpectedly.
Also, the functions for the -e option return COMPLETED or FAILED,
which are for command return value, not for function return value.
So let's fix the issue by following the common style that returns
TRUE or FALSE, and avoid confusion.
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from the makedumpfile devel branch in upstream.
commit b461971bfac0f193a0c274c3b657d158e07d4995
Author: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Date: Thu Aug 29 14:51:56 2019 -0400
[PATCH] Fix exclusion range in find_vmemmap_pages()
In the function, since pfn ranges are literally start and end, not start
and end+1, if the struct page of endpfn is at the last in a vmemmap page,
the vmemmap page is dropped by the following code, and not excluded.
npfns_offset = endpfn - vmapp->rep_pfn_start;
vmemmap_offset = npfns_offset * size_table.page;
// round down to page boundary
vmemmap_offset -= (vmemmap_offset % pagesize);
We can use (endpfn+1) here to fix.
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from the makedumpfile devel branch in upstream.
commit aa5ab4cf6c7335392094577380d2eaee8a0a8d52
Author: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Date: Thu Aug 29 12:26:34 2019 -0400
[PATCH] x86_64: Fix incorrect exclusion by -e option with KASLR
The -e option uses info->vmemmap_start for creating a table to determine
the positions of page structures that should be excluded, but it is a
hardcoded value even with KASLR-enabled vmcore. As a result, the option
excludes incorrect pages from it.
To fix this, get the vmemmap start address from info->mem_map_data.
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Acked-by: Kairui Song <kasong@redhat.com>
On a host with ipaddr not ready before kdump service, ssh return errno 255.
While if no ssh-key, ssh also return errno 255. For both of cases, the
current kdump code promote user to run 'kdumpctl propagate'. This confuses
user who already installs ssh-key.
In order to tell these two cases from each other, the ssh warning message
should be involved, and parsed.
For the no ssh-key case , warning message is "Permission denied" or "No
such file or directory". For the other, warning message is "Network
Unreachable"
This patch also does a slight change to enlarge the timeout from 60s to
180s. This value can meet test at the time being
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Currently there are two issues with device dump:
- It may use too much memory
- kdump won't automatically include required driver in second kernel
User should manually reserve enough memory, and include the required
driver by using extra_modules. Add some notes about the issues in
kexec-kdump-howto.txt
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Device dump may use a log of memory and cause OOM issue, so append
"novmcoredd" option for second kernel and disable it by default.
To use device dump, user should remove the vmcoredd parameter
manually.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Bond options in ifcfg is space separated, dracut expected it to be comma
separated, so it have to be parsed and converted during initramfs
building.
The currently parsing and convert pattern is flawed, for example:
" downdelay=0 miimon=100 mode=802.3ad updelay=0 "
is converted to :
":,downdelay=0 miimon=100 mode=802.3ad updelay=0 "
should be:
":downdelay=0,miimon=100,mode=802.3ad,updelay=0"
So fix this issue by using more simple but robust method for processing
the options.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
The localhost is filtered out in case of is_pcs_fence_kdump, do it too in
case of is_generic_fence_kdump.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
'hostname -A' can not get the alias, meanwhile 'hostname -a' is deprecated.
So we should do it by ourselves.
The parsing is based on the format of /etc/hosts, i.e.
IP_address canonical_hostname [aliases...]
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
If dump target is ipv6 address, a host should have ipv6 address ready
before starting kdump service. Otherwise, kdump service fails to start due
to the failure "ssh dump_server_ip mkdir -p $SAVE_PATH".
And user can see message like:
"Could not create root@2620:52:0:10da:46a8:42ff:fe23:3272/var/crash"
I observe a long period (about 30s) on some machine before they got ipv6
address dynamiclly, which is never seen on ipv4 host.
Hence kdump service has a dependency on ipv6 address. But there is no good
way to resolve it. One way is asking user to run the cmd "nmcli connection
modify eth0 ipv6.may-fail false". But this will block systemd until ipv6
address is ready. Despite doing so, kdump can try its best (wait 1 minutes
after it starts up) before failure.
How to implement the wait is arguable. It will involve too many technique
details if explicitly waiting on ipv6 address, instead, just lean on 'ssh'
return value to see the availability of network.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from the makedumpfile devel branch in upstream.
commit 7bdb468c2c99dd780c9a5321f93c79cbfdce2527
Author: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Date: Tue Jul 23 12:24:47 2019 -0400
[PATCH] Increase SECTION_MAP_LAST_BIT to 4
kernel commit 326e1b8f83a4 ("mm/sparsemem: introduce a SECTION_IS_EARLY
flag") added the flag to mem_section->section_mem_map value, and it caused
makedumpfile an error like the following:
readmem: Can't convert a virtual address(fffffc97d1000000) to physical address.
readmem: type_addr: 0, addr:fffffc97d1000000, size:32768
__exclude_unnecessary_pages: Can't read the buffer of struct page.
create_2nd_bitmap: Can't exclude unnecessary pages.
To fix this, SECTION_MAP_LAST_BIT needs to be updated. The bit has not
been used until the addition, so we can just increase the value.
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>