Backport from the makedumpfile devel branch in upstream.
commit aa5ab4cf6c7335392094577380d2eaee8a0a8d52
Author: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Date: Thu Aug 29 12:26:34 2019 -0400
[PATCH] x86_64: Fix incorrect exclusion by -e option with KASLR
The -e option uses info->vmemmap_start for creating a table to determine
the positions of page structures that should be excluded, but it is a
hardcoded value even with KASLR-enabled vmcore. As a result, the option
excludes incorrect pages from it.
To fix this, get the vmemmap start address from info->mem_map_data.
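A minimal sketch of the idea, not the upstream code: it assumes each
mem_map_data entry records the virtual address of its first page
structure, and that entries without a mem_map carry a sentinel value. The
dictionary layout, the sentinel value and the choice of the lowest
address are illustrative assumptions.

```python
# Hypothetical stand-in for makedumpfile's "no mem_map" marker.
NOT_MEMMAP_ADDR = 0

def vmemmap_start(mem_map_data):
    """Return the lowest valid mem_map address, or None if there is none."""
    valid = [m["mem_map"] for m in mem_map_data
             if m["mem_map"] != NOT_MEMMAP_ADDR]
    return min(valid) if valid else None

# Made-up entries; with KASLR the base is not a fixed constant.
entries = [
    {"mem_map": 0xfffffc97d1200000},
    {"mem_map": NOT_MEMMAP_ADDR},
    {"mem_map": 0xfffffc97d1000000},
]
print(hex(vmemmap_start(entries)))
```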
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from the makedumpfile devel branch in upstream.
commit 7bdb468c2c99dd780c9a5321f93c79cbfdce2527
Author: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Date: Tue Jul 23 12:24:47 2019 -0400
[PATCH] Increase SECTION_MAP_LAST_BIT to 4
kernel commit 326e1b8f83a4 ("mm/sparsemem: introduce a SECTION_IS_EARLY
flag") added the flag to mem_section->section_mem_map value, and it caused
makedumpfile an error like the following:
readmem: Can't convert a virtual address(fffffc97d1000000) to physical address.
readmem: type_addr: 0, addr:fffffc97d1000000, size:32768
__exclude_unnecessary_pages: Can't read the buffer of struct page.
create_2nd_bitmap: Can't exclude unnecessary pages.
To fix this, SECTION_MAP_LAST_BIT needs to be updated. The bit has not
been used until the addition, so we can just increase the value.
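The effect of the mask can be sketched as follows. This assumes the flag
bits sit in the low bits of section_mem_map, that the mask is derived as
~(SECTION_MAP_LAST_BIT - 1), and that the new flag occupies bit 3. The
address is made up for illustration.

```python
def section_map_mask(last_bit_shift, bits=64):
    """Mask that clears every flag bit below SECTION_MAP_LAST_BIT."""
    last_bit = 1 << last_bit_shift
    return ~(last_bit - 1) & ((1 << bits) - 1)

SECTION_IS_EARLY = 1 << 3  # flag position assumed from the kernel commit

# Illustrative encoded value: an aligned address plus low flag bits.
mem_map_addr = 0xfffffc97d1000000
encoded = mem_map_addr | SECTION_IS_EARLY | 0x7

old = encoded & section_map_mask(3)  # bit 3 leaks into the address
new = encoded & section_map_mask(4)  # all four flag bits are cleared

print(hex(old))
print(hex(new))
```

With the old mask the decoded address keeps the stray flag bit, which is
the kind of bogus virtual address that readmem then fails to convert.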
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from the makedumpfile devel branch in upstream.
commit c1b834f80311706db2b5070cbccdcba3aacc90e5
Author: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Date: Tue Jul 23 11:50:52 2019 -0400
[PATCH] Do not proceed when get_num_dumpable_cyclic() fails
Currently, when get_num_dumpable_cyclic() fails and returns FALSE in
create_dump_bitmap(), info->num_dumpable is set to 0 and makedumpfile
proceeds to write a broken dumpfile slowly with incorrect progress
indicator due to the value.
It should not proceed when get_num_dumpable_cyclic() fails.
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
When building for i386, an error occurred:
kexec/arch/i386/kexec-x86.c:39:22: error: 'multiboot2_x86_probe'
undeclared here (not in a function); did you mean 'multiboot_x86_probe'?
39 | { "multiboot2-x86", multiboot2_x86_probe, multiboot2_x86_load,
| ^~~~~~~~~~~~~~~~~~~~
| multiboot_x86_probe
kexec/arch/i386/kexec-x86.c:39:44: error: 'multiboot2_x86_load'
undeclared here (not in a function); did you mean 'multiboot_x86_load'?
39 | { "multiboot2-x86", multiboot2_x86_probe, multiboot2_x86_load,
| ^~~~~~~~~~~~~~~~~~~
| multiboot_x86_load
kexec/arch/i386/kexec-x86.c:40:4: error: 'multiboot2_x86_usage'
undeclared here (not in a function); did you mean 'multiboot_x86_usage'?
40 | multiboot2_x86_usage },
| ^~~~~~~~~~~~~~~~~~~~
| multiboot_x86_usage
Fix this issue by putting the definition in the right header; also tidy
up the Makefile.
Signed-off-by: Kairui Song <kasong@redhat.com>
Backport from the makedumpfile devel branch in upstream.
commit d222b01e516bba73ef9fefee4146734a5f260fa1 (HEAD -> devel)
Author: Lianbo Jiang <lijiang@redhat.com>
Date: Wed Jan 30 10:48:53 2019 +0800
[PATCH] x86_64: Add support for AMD Secure Memory Encryption
On AMD machine with Secure Memory Encryption (SME) feature, if SME is
enabled, page tables contain a specific attribute bit (C-bit) in their
entries to indicate whether a page is encrypted or unencrypted.
So get NUMBER(sme_mask) from vmcoreinfo, which stores the value of
the C-bit position, and drop it to obtain the true physical address.
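A rough sketch of dropping the C-bit, assuming sme_mask is a bit mask
with only the C-bit set and that bits 12..51 of an entry hold the frame
address. The bit position and the entry value below are made up.

```python
PTE_ADDR_MASK = 0x000ffffffffff000  # assumed: bits 12..51 hold the frame address

def pte_to_paddr(pte, sme_mask):
    """Strip the SME C-bit, then keep only the physical frame bits."""
    return (pte & ~sme_mask) & PTE_ADDR_MASK

sme_mask = 1 << 47        # illustrative C-bit position
pte = 0x0000800123456067  # made-up entry: C-bit set plus low flag bits

print(hex(pte_to_paddr(pte, sme_mask)))  # C-bit removed
print(hex(pte_to_paddr(pte, 0)))         # C-bit wrongly kept in the address
```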
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
The 'maxcpus' setting normally doesn't work on several kdump-enabled
systems due to a known udev issue.
Currently the fedora kdump configuration is set as the following on the
aarch64 systems:
# cat /etc/sysconfig/kdump
<..snip..>
# This variable lets us append arguments to the current kdump
# commandline after processed by KDUMP_COMMANDLINE_REMOVE
# KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1 reset_devices"
<..snip..>
Since the 'maxcpus' setting doesn't limit the number of SMP CPUs, the
kdump kernel still boots with all CPUs available on the system. For
example, on the Qualcomm Amberwing that is 46 CPUs:
# lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 46
On-line CPU(s) list: 0-45
Thread(s) per core: 1
Core(s) per socket: 46
Socket(s): 1
NUMA node(s): 1
Vendor ID: Qualcomm
Model: 1
Model name: Falkor
Stepping: 0x0
CPU max MHz: 2600.0000
CPU min MHz: 600.0000
BogoMIPS: 40.00
L1d cache: 32K
L1i cache: 64K
L2 cache: 512K
L3 cache: 58880K
NUMA node0 CPU(s): 0-45
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid asimdrdm
This causes the memory consumption in the kdump kernel to swell up and
we can end up having OOM issues in the kdump kernel boot.
Whereas if we use 'nr_cpus=1' in the bootargs, the number of SMP CPUs in
the kdump kernel gets limited to 1.
The 'swiotlb=noforce' setting in bootargs provides extra guarding, to
ensure the crash kernel size requirements do not swell on systems
which support swiotlb.
With the above settings, crashkernel boots properly (without OOM) on all
the aarch64 boards I could test on - qualcomm amberwings, hp-moonshots
and hpe-apache (thunderx2) for crash dump saving on local disk.
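The command line change described above can be sketched like this. The
helper name is hypothetical and this is not how kdumpctl processes
KDUMP_COMMANDLINE_APPEND; it only shows the intended before and after.

```python
def adjust_kdump_cmdline(cmdline):
    """Drop maxcpus=... and make sure nr_cpus=1 and swiotlb=noforce are set."""
    args = [a for a in cmdline.split() if not a.startswith("maxcpus=")]
    for extra in ("nr_cpus=1", "swiotlb=noforce"):
        if extra not in args:
            args.append(extra)
    return " ".join(args)

print(adjust_kdump_cmdline("irqpoll maxcpus=1 reset_devices"))
```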
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
On powerpc, after hot-adding a CPU and triggering a crash on the hot-added
CPU, the kdump kernel hangs after "I'm in purgatory".
The current udev rules expect the dtb to be rebuilt on a CPU add/remove
event. But since powerpc does not follow the standard CPU hot-add
framework, it only emits online/offline events to user space when a CPU is
hot added/removed, instead of add/remove events. Pingfan tried fixing that
but it didn't please the maintainer as it breaks some old userspace tools.
Because the dtb is not rebuilt, the kdump kernel fails to get the
'boot_cpuid' and eventually fails to boot [see early_init_dt_scan_cpus() in
arch/powerpc/kernel/prom.c] if the system crashes on a hot-added CPU.
Work around it by changing the udev rules on powerpc to online/offline.
The offline event is of no use on powerpc and can be dropped.
Explanation: on powerpc, /sys/devices/system/cpu/cpuX nodes are present
for all "possible" CPUs, irrespective of whether a CPU is hot-added/removed.
crash_notes are already built for all /sys/devices/system/cpu/cpuX nodes, and
these nodes are present for all "possible" CPUs
(online/offline/could-be-hot-removed/could-be-hot-added).
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
The eppic project has moved to GitHub. Update to the latest upstream
snapshot, and change the source link and tar file naming style to fit
GitHub's URL format.
This fixes the O0 warning reported by annocheck and passes all distro
package flag checks.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Backport the patches required to make the hardening build flags work with
kexec-tools and makedumpfile, and enable hardening flags in the spec file.
This clears all warnings reported by annocheck for kexec and
makedumpfile in the package.
No issues were found in basic tests with kexec and makedumpfile.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Previously, kdump restarted / reloaded many times on hotplug events,
especially memory hotplug events. Hotplugged memory may generate many
udev events, as memory is managed and hotplugged in small chunks by the
kernel.
This results in unnecessary system workload and actually delays both the
kdump reload and the hotplug event handling, as either udev gets blocked
or kdumpctl waits for other triggered operations.
To fix this, introduce kdump-udev-throttler, an agent which is called by
udev and merges concurrent kdump restart requests. Tested with a Hyper-V
VM which previously failed due to udev timeouts;
no new issues found.
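The merging behaviour can be modelled roughly as below. This is only a
sketch of the coalescing idea under an assumed fixed window; the commit
message does not describe the throttler's actual mechanism, and the
function and parameter names are made up.

```python
def coalesce_restarts(event_times, window):
    """Return the times at which a restart would run.

    Events arriving within `window` seconds of the first unserved event
    are merged into a single restart at the end of that window.
    """
    restarts = []
    deadline = None
    for t in sorted(event_times):
        if deadline is None or t > deadline:
            deadline = t + window
            restarts.append(deadline)
    return restarts

# Six hotplug events collapse into two restarts.
events = [0.0, 0.1, 0.2, 0.9, 5.0, 5.3]
print(coalesce_restarts(events, window=1.0))
```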
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
dracut-049 introduces a new squash module which can reduce the memory
usage of the kdump initramfs in the capture kernel. This helps a lot in
lowering the risk of OOM failure.
Tested with latest rawhide with NFS, SSH and local dump.
Signed-off-by: Kairui Song <kasong@redhat.com>
Resolves: bz1619122
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1619122
This patch fixes the "Unhandled rela relocation: R_X86_64_PLT32" error
that we are seeing with Fedora 29 (and newer kernels > 4.18) when
trying to run kexec/kdump on x86_64 machines.
The patch is being discussed upstream and has been ACK'ed by Baoquan and
myself (see <https://www.spinics.net/lists/kexec/msg21255.html>), and I
have also tested it on a Fedora 29/rawhide x86_64 machine:
Before the patch:
----------------
[root@hp-bl480c-01 ~]# kdumpctl restart
kexec: unloaded kdump kernel
Stopping kdump: [OK]
Unhandled rela relocation: R_X86_64_PLT32
kexec: failed to load kdump kernel
Starting kdump: [FAILED]
After the patch:
---------------
[root@hp-bl480c-01 ~]# kdumpctl restart
kexec: unloaded kdump kernel
Stopping kdump: [OK]
kexec: loaded kdump kernel
Starting kdump: [OK]
Suggested Upstream Fix:
In response to a change in binutils, commit b21ebf2fb4c
(x86: Treat R_X86_64_PLT32 as R_X86_64_PC32) was applied to
the linux kernel during the 4.16 development cycle and has
since been backported to earlier stable kernel series. The
change results in the failure message in $SUBJECT when
rebooting via kexec.
Fix this by replicating the change in kexec.
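What "treat R_X86_64_PLT32 as R_X86_64_PC32" means can be sketched as
follows, assuming the usual PC-relative formula S + A - P. The function
is a hypothetical illustration, not the kexec relocation code. The
relocation type numbers are those of the x86-64 psABI.

```python
R_X86_64_PC32 = 2
R_X86_64_PLT32 = 4

def apply_rela(r_type, sym_value, addend, place):
    """Return the 32-bit value to store for a PC-relative relocation."""
    if r_type in (R_X86_64_PC32, R_X86_64_PLT32):
        value = sym_value + addend - place  # S + A - P
        if not -(1 << 31) <= value < (1 << 31):
            raise OverflowError("relocation does not fit in 32 bits")
        return value & 0xffffffff
    raise ValueError(f"Unhandled rela relocation: {r_type}")

# Both types resolve to the same value for the same inputs.
print(hex(apply_rela(R_X86_64_PC32, 0x1000, -4, 0x2000)))
print(hex(apply_rela(R_X86_64_PLT32, 0x1000, -4, 0x2000)))
```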
Signed-off-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Kdump anaconda has been included as a subpackage for a long time, which
is not a good practice, as the anaconda plugin should be built as
noarch and it does not belong to kexec-tools. We have created a new
package 'kdump-anaconda-addon', so remove it here.
The release version should be bumped later so that kdump-anaconda-addon
could mark previous versions as obsoleted.
Signed-off-by: Kairui Song <kasong@redhat.com>
The armv7hl build failed because no makedumpfile* files are built there,
but the latest commit tries to install them.
Exclude armv7hl in that code chunk.
Signed-off-by: Dave Young <dyoung@redhat.com>
kexec_test seems to be no longer used upstream, so we had introduced
the 'kexec-tools-2.0.3-disable-kexec-test.patch' earlier to disable the
same from fedora kexec-tools as well.
However an earlier patch "Remove obsolete kdump tool" now explicitly
installs needed files via appropriate logic in .spec file, so we can
drop this patch now to reduce the maintenance burden.
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1441677
Testing: On x86_64 Fedora machine. After this patch kdump utility and related
man page cannot be found on this machine:
[root@tyan-gt24-09 ~]# which kdump
/usr/bin/which: no kdump in
(/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
[root@tyan-gt24-09 ~]# man kdump
No manual entry for kdump
Update the fedora 'kexec-tools.spec' to not install the obsolete
kdump tool.
I have submitted an upstream patch to obsolete the kdump tool from
upstream kexec-tools (which has been accepted), but after an internal
discussion we decided not to backport the upstream 'kexec-tools' patch
(which does the same) for fedora, as we would prefer to manage the
changes directly in the .spec file itself.
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Kdump always uses _proto=dhcp for both ipv4 and ipv6. But for ipv6,
address assignment is not like ipv4: there are different ways to do it,
stateless and stateful. See the document below:
https://fedoraproject.org/wiki/IPv6Guide
In the stateless case the kernel can do the address assignment, and dracut
uses _proto=auto6; in the stateful case, dracut uses _proto=dhcp6.
But it is hard to decide whether stateless or stateful is in effect,
hence dracut introduces the ip=either6 option, which tries both of these
methods automatically for us. For details, refer to dracut:
commit 67354ee 40network: introduce ip=either6 option
We did not see bug reports before because in most auto6 cases the
kernel assigns the ip address before dhclient runs, so kdump just
happened to work.
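The resulting choice can be sketched like this, assuming the fix selects
either6 whenever the dump target address is ipv6 and keeps dhcp for ipv4.
The helper name is hypothetical and the addresses are documentation
examples.

```python
import ipaddress

def kdump_ip_proto(address):
    """Pick the dracut ip= protocol keyword for a dump target address."""
    if ipaddress.ip_address(address).version == 6:
        return "either6"
    return "dhcp"

print(kdump_ip_proto("192.0.2.10"))
print(kdump_ip_proto("2001:db8::1"))
```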
Signed-off-by: Pingfan Liu <piliu@redhat.com>
The kdump service starts too late, so early crashes have no chance
of booting the kdump kernel, and the crash information is lost. It is
necessary to add a dracut module in order to load the crash kernel
and initramfs as early as possible. You can provide "rd.earlykdump"
on the grub command line to enable it; early kdump will then load
those files like the normal kdump. It is disabled by default.
The normal kdump service can check whether early kdump has already
loaded the crash kernel and initramfs, so it does not conflict with
early kdump.
If you rebuild the initramfs for early kdump, the new initramfs
will become larger, because it will contain the vmlinuz and the kdump
initramfs.
In addition, early kdump doesn't support fadump.
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Reviewed-by: Kazuhito Hagio <khagio@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Dracut has "--hostonly-cmdline", which can generate cmdlines (if any)
for the dump target; it is an existing facility we can use to
simplify the code. E.g. we already removed generate_lvm_cmdlines()
in favor of "--hostonly-cmdline".
But "--hostonly-cmdline" has other issues (e.g. BZ1451717): it adds
devices that kdump does not need, such as the root device.
Now dracut supports "--no-hostonly-default-device", which enables
us to add only the kdump target and so avoid needless devices
being recognized under kdump. Thus the "--hostonly-cmdline" side effects
can be avoided with the help of "--no-hostonly-default-device".
This patch applies dracut's "--hostonly-cmdline" together with
"--no-hostonly-default-device" to achieve the above-mentioned purpose.
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Added patch from panand which was accepted by upstream but not merged in upstream yet.
kexec-tools-2.0.15-makedumpfile-fix-SECTION_MAP_MASK-for-kernel-bigger-than-4.13.patch
https://bugzilla.redhat.com/show_bug.cgi?id=1474706
Makedumpfile failed with the error messages below, caused by kernel commit 65ade2f872b474fa8a04c2d397783350326634e6:
Buffer size for the cyclic mode: 95992
vtop4_x86_64: Can't get the symbol of init_level4_pgt.
readmem: Can't convert a virtual address(ffffffff8fe18284) to physical address.
readmem: type_addr: 0, addr:ffffffff8fe18284, size:390
check_release: Can't get the address of system_utsname.
Pull in Pratyush's fix in upstream makedumpfile (not merged yet but acked by
maintainer)
Signed-off-by: Dave Young <dyoung@redhat.com>
This patch fixes the whitespace errors reported by
'rpmlint' or 'fedpkg lint' when they are run on kexec-tools srpm:
kexec-tools.spec:242: W: mixed-use-of-spaces-and-tabs (spaces: line 107,
tab: line 242)
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
We met a problem where the kdump emergency service failed to
start when waiting for the dump target timed out (we passed
"rd.timeout=30" to kdump). It reported "Transaction is destructive"
messages:
[ TIME ] Timed out waiting for device dev-mapper-fedora\x2droot.device.
[DEPEND] Dependency failed for Initrd Root Device.
[ SKIP ] Ordering cycle found, skipping System Initialization
[DEPEND] Dependency failed for /sysroot.
[DEPEND] Dependency failed for Initrd Root File System.
[DEPEND] Dependency failed for Reload Configuration from the Real Root.
[ SKIP ] Ordering cycle found, skipping System Initialization
[ SKIP ] Ordering cycle found, skipping Initrd Default Target
[DEPEND] Dependency failed for File System Check on /dev/mapper/fedora-root.
[ OK ] Reached target Initrd File Systems.
[ OK ] Stopped dracut pre-udev hook.
[ OK ] Stopped dracut cmdline hook.
Starting Setup Virtual Console...
Starting Kdump Emergency...
[ OK ] Reached target Initrd Default Target.
[ OK ] Stopped dracut initqueue hook.
Failed to start kdump-error-handler.service: Transaction is destructive.
See system logs and 'systemctl status kdump-error-handler.service' for details.
[FAILED] Failed to start Kdump Emergency.
See 'systemctl status emergency.service' for details.
[DEPEND] Dependency failed for Emergency Mode.
This is because, in case of root failure, initrd-root-fs.target
triggers the systemd emergency target, which requires the systemd
emergency service (actually kdump-emergency.service under kdump). Our
kdump-emergency.service then starts kdump-error-handler.service with
"systemctl isolate" (see 99kdumpbase/kdump-emergency.service; we
replace systemd's with this one under kdump).
This leads systemd to queue two contradictory jobs as an
atomic transaction:
job 1) the emergency service gets started by initrd-root-fs.target
job 2) the emergency service gets stopped due to "systemctl isolate"
thereby throwing "Transaction is destructive".
To solve this, we can use "IgnoreOnIsolate=yes" for both
kdump-emergency.service and kdump-emergency.target. Units with the
attribute "IgnoreOnIsolate=yes" won't be stopped when isolating another
unit, so they can keep going as expected when triggered by any failure.
We add kdump-emergency.target, dedicated to kdump, in the same way
as we did for kdump-emergency.service (i.e. it replaces systemd's
emergency.target with kdump-emergency.target under kdump), and
add "IgnoreOnIsolate=yes" to both of them.
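The effect of the attribute can be pictured with a toy model. This is a
deliberately simplified sketch of the isolate semantics, not systemd's
job engine: a unit is stopped if it is active, not needed by the isolated
unit, and not marked IgnoreOnIsolate=yes. The function name is made up.

```python
def stopped_on_isolate(active, wanted, ignore_on_isolate=frozenset()):
    """Units a simplified 'isolate' would stop."""
    return sorted(u for u in active
                  if u not in wanted and u not in ignore_on_isolate)

active = {"kdump-emergency.target", "kdump-emergency.service"}
wanted = {"kdump-error-handler.service"}

# Without the attribute both emergency units get a stop job, which
# conflicts with the start job queued by initrd-root-fs.target.
before = stopped_on_isolate(active, wanted)
# With IgnoreOnIsolate=yes on both units, no stop job is queued.
after = stopped_on_isolate(active, wanted, ignore_on_isolate=active)

print(before)
print(after)
```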
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Pratyush Anand <panand@redhat.com>
[bhe: improve the patch log about "IgnoreOnIsolate="]
Patches have been taken from kexec-tools and makedumpfile to fix an issue
with `makedumpfile --mem-usage /proc/kcore`.
Two of the patches are from kexec-tools and the rest are from makedumpfile.
All the patches have been acked upstream and apply without conflict.
Kexec-tools patches:
kexec-tools-2.0.14-x86-x86_64-Fix-format-warning-with-die.patch fixes
a koji build issue.
kexec-tools-2.0.14-build_mem_phdrs-check-if-p_paddr-is-invalid.patch fixes
the regression caused by the kernel /proc/kcore fix that uses -1 as the
default value of p_paddr for pt_loads. Without this patch, kexec -p will
fail with the latest kernel.
The other makedumpfile patches are backported to support --mem-usage with
kernel KASLR enabled. For details, see the patch log of the individual
patches.
All the patches are backports of upstream commits.
The patches have been tested with kernel 4.11.0-0.rc1.git0.1.fc26.x86_64.
# makedumpfile --mem-usage /proc/kcore -f
The kernel version is not supported.
The makedumpfile operation may be incomplete.
TYPE            PAGES     EXCLUDABLE  DESCRIPTION
----------------------------------------------------------------------
ZERO            1960      yes         Pages filled with zero
NON_PRI_CACHE   22850     yes         Cache pages without private flag
PRI_CACHE       1517      yes         Cache pages with private flag
USER            32522     yes         User process pages
FREE            1898981   yes         Free pages
KERN_DATA       78721     no          Dumpable kernel data
page size: 4096
Total pages on system: 2036551
Total size on system: 8341712896 Byte
We won't need to pass -f once fedora kernel is rebased with v4.12.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Resolves: bz1399436
Since crashkernel= is now handled in the kdump anaconda addon,
we can safely remove the rhcrashkernel-param callback.
Signed-off-by: Tong Li <tonli@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Backport upstream kexec-tools commit for correct kaslr page_offset value
commit 9f62cbddddfc93d78d9aafbddf3e1208cb242f7b
Author: Thomas Garnier <thgarnie@google.com>
Date: Tue Sep 13 15:10:05 2016 +0800
kexec/arch/i386: Add support for KASLR memory randomization
Multiple changes were made on KASLR (right now in linux-next). One of
them is randomizing the virtual address of the physical mapping, vmalloc
and vmemmap memory sections. It breaks kdump's ability to read physical
memory.
This change identifies if KASLR memory randomization is used by
checking if the page_offset_base variable exists. It searches for the
correct PAGE_OFFSET value by looking at the loaded memory sections and
finding the lowest one aligned on PUD (the randomization level).
Related commits on linux-next:
- 0483e1fa6e09d4948272680f691dccb1edb9677f: Base for randomization
- 021182e52fe01c1f7b126f97fd6ba048dc4234fd: Enable for PAGE_OFFSET
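The PAGE_OFFSET search can be sketched as below. It assumes a PUD covers
1 GiB (4 KiB pages on x86_64) and that the candidate virtual start
addresses have already been limited to the physical mapping region. The
function name and the addresses are made up and this is not the kexec
implementation.

```python
PUD_SHIFT = 30             # assumed: one PUD entry maps 1 GiB
PUD_SIZE = 1 << PUD_SHIFT

def find_page_offset(segment_vaddrs):
    """Lowest segment start that is aligned on a PUD boundary, or None."""
    aligned = [v for v in segment_vaddrs if v % PUD_SIZE == 0]
    return min(aligned) if aligned else None

# Illustrative segment start addresses.
vaddrs = [
    0xffff9e8c00001000,  # not PUD aligned
    0xffff9e8d40000000,  # aligned, but not the lowest
    0xffff9e8c00000000,  # lowest aligned start
]
print(hex(find_page_offset(vaddrs)))
```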
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>