The current code only exclude the hostname, while localhost can have alias in
/etc/hosts. All of the alias should be excluded from the fence dump node to
avoid deadlock issue.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Squash module is used to save memory. For fadump this is not neccessary
and may slow down the build time, and make it more fragile.
fadump initramfs is used for normal boot as well, although squash module
is capable of being used for generic normal boot, but there are cases
where is doesn't work well. So disable it and make fadump more robust.
Signed-off-by: Kairui Song <kasong@redhat.com>
Tested-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Don't use any log storage and forward to console directly, this make
console output more useful, and also save more memory. On a fresh
installed Fedora 30 it saved ~5M of memory, and the amount of log being
printed to console is still accetable.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Some firmware will provide a ACPI HEST table with massive amount of
entries, and the way how kernel handles these entries will consume a lot
of memory which will lead to OOM issue in kdump kernel. During testing
on certain machine, disable HEST saved ~60M of memory.
Kdump is only for emergency use in case of a kernel panic, so temporarily
disable hardware error report & recovery related feature is acceptable
in general. So disable HEST support in kdump kernel to save memory.
Currently such issue is only observed on x86_64, so limit this change to
x86_64 only.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
systemd has the following message "OnFailureIsolate is deprecated. Please
use OnFailureJobMode= instead"
Changing the file to meet systemd's requirement
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from the makedumpfile devel branch in upstream.
commit d222b01e516bba73ef9fefee4146734a5f260fa1 (HEAD -> devel)
Author: Lianbo Jiang <lijiang@redhat.com>
Date: Wed Jan 30 10:48:53 2019 +0800
[PATCH] x86_64: Add support for AMD Secure Memory Encryption
On AMD machine with Secure Memory Encryption (SME) feature, if SME is
enabled, page tables contain a specific attribute bit (C-bit) in their
entries to indicate whether a page is encrypted or unencrypted.
So get NUMBER(sme_mask) from vmcoreinfo, which stores the value of
the C-bit position, and drop it to obtain the true physical address.
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Looking at the difference of the x86_64 and aarch64 kdump.sysconfig
options for Fedora, one can see the following options which are
different:
Present in kdump.sysconfig.x86_64 but not in kdump.sysconfig.aarch64:
---------------------------------------------------------------------
cgroup_disable=memory
mce=off
numa=off
udev.children-max=2
panic=10
acpi_no_memhotplug
transparent_hugepage=never
nokaslr
Present in kdump.sysconfig.aarch64 but not in kdump.sysconfig.x86_64:
---------------------------------------------------------------------
swiotlb=noforce
After going through all the options, it makes sense to add the
following options added to kdump.sysconfig.aarch64:
KDUMP_COMMANDLINE_APPEND="cgroup_disable=memory udev.children-max=2
panic=10 irqpoll nr_cpus=1
swiotlb=noforce reset_devices"
This has helped reduce the memory footprint of crashkernel on several
aarch64 machines available in the beaker lab. For e.g. I was seeing
OOM issues on large aws ec2 instances with the default crashkernel size
of 512M, and I had to use an increased crashkernel size of 786M on the
same to boot the crash dump kernel.
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
'maxcpus' setting normally don't work on several kdump enabled systems
due to a known udev issue.
Currently the fedora kdump configuration is set as the following on the
aarch64 systems:
# cat /etc/sysconfig/kdump
<..snip..>
# This variable lets us append arguments to the current kdump
# commandline after processed by KDUMP_COMMANDLINE_REMOVE
# KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1 reset_devices"
<..snip..>
Since the 'maxcpus' setting doesn't limit the number of SMP CPUs,
so the kdump kernel still boots with all CPUs available on the system.
For e.g on the qualcomm amberwing its 46 CPUs:
# lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 46
On-line CPU(s) list: 0-45
Thread(s) per core: 1
Core(s) per socket: 46
Socket(s): 1
NUMA node(s): 1
Vendor ID: Qualcomm
Model: 1
Model name: Falkor
Stepping: 0x0
CPU max MHz: 2600.0000
CPU min MHz: 600.0000
BogoMIPS: 40.00
L1d cache: 32K
L1i cache: 64K
L2 cache: 512K
L3 cache: 58880K
NUMA node0 CPU(s): 0-45
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid asimdrdm
This causes the memory consumption in the kdump kernel to swell up and
we can end up having OOM issues in the kdump kernel boot.
Whereas if we use 'nr_cpus=1' in the bootargs, the number of SMP CPUs in
the kdump kernel get limited to 1.
The 'swiotlb=noforce' setting in bootargs provide us extra guarding, to
ensure the crash kernel size requirements do not swell on systems
which support swiotlb.
With the above settings, crashkernel boots properly (without OOM) on all
the aarch64 boards I could test on - qualcomm amberwings, hp-moonshots
and hpe-apache (thunderx2) for crash dump saving on local disk.
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Currently kdumpctl rebuild will simply rebuild the initramfs, and
only perform basic config syntax check. But it should also check if the
target path is available when using SSH target, else kdump may fail.
is second kernel. kdumpctl rebuild should cover this case, and create
the path if it doesn't exist.
This patch make rebuild and restart behaves the same, rebuild is
now equal to restart, except it won't check config change or reload
kdump resource.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Although "kdumpctl rebuild" is introduced to help user rebuild the
initramfs without modifying the kdump.conf, if the kdump.conf is
modified and "kdumpctl rebuild" is called, a initramfs with a faulty
kdump.conf will be built.
Kdump will refuse to load the initramfs when restarted, but kdumpctl
reload may load the faulty initramfs. So need to make sure the faulty
build won't be generate in the first place.
Check for kdump.conf error before building the initramfs to ensure such
failure won't happen.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
We don't necessarily have to always rebuild the initramfs when
extra_modules is set. Instead, just detect if any module is updated,
and only rebuild initramfs if found any updated kernel module.
Tested with in-tree kernel modules, out-of-tree kernel modules, weak
modules, all worked as expected.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Previously only the symlink's timestamp is used for checking if file are
modified, this will not trigger a rebuild if the symlink target it
modified.
So check both symlink timestamp and symlink target timestamp, rebuild
the initramfs on both symlink changed and target changed.
Also give a proper error message if the file doesn't exist.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
When reading kdump configs, a single parsing should be enough and this
saves a lot of duplicated striping call which speed up the total load
speed.
Speed up about 2 second when building and 0.1 second for reload in my
tests.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Early kdump inherits the settings of normal kdump, so any changes that
caused normal kdump rebuilding also require rebuilding the system initramfs
to make sure that the changes take effect for early kdump.
Therefore, when the early kdump is enabled, provide a prompt message after
the rebuilding of kdump initramfs is completed.
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Currently kdump is not working well with encrypted targets, add document
about this issue.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Add some note about the limitation of kdumpctl's auto detect and rebuild
feature, and suggest the user to rebuild the initramfs manually on
major system change, and don't include the initramfs in disk images.
Put the note about system change in front part of the document so user
will less likely to miss it.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Previous we rebuild the initramfs when kenrel load module list changed,
but this is not very stable as some async services may load/unload
kernel modules, and cause unnecessary initramfs rebuild.
Instead, it's better to just check if the module required to dump to
the dump target is loaded or not, and rebuild if not loaded. This
avoids most false-positives, and ensure local target change is always
covered.
Currently only local fs dump target is covered, because this check
requires the dump target to be mounted when building the initramfs,
this guarantee that the module is in the loaded kernel module list,
else we may still get some false positive.
dracut-install could be leveraged to combine the modalias list with
kernel loaded module list as a more stable module list in the initramfs,
but upstream dracut change need to be done first.
Passed test on a KVM VM, changing the storage between SATA/USB/VirtIO
will trigger initramfs rebuild and didn't notice any false-positive.
Also passed test on my laptop with no false-positive.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
This reverts commit 6b479b6572.
Check initramfs rebuild by looking at if there is any change of load
kernel modules list is not very stable after all. Previously we are
counting on udev to settle before kdump is started to ensure all modules
is ready, but actually any service may cause a kernel module load, even
after udev is settled.
The previous commit is trying to workaround an issue that VM created
with disk snapshot may fail in the kdump initramfs. The better fix is to
not include the kdump initramfs in the disk snapshot at all, as the
kdump initramfs is not generated for a generic use. And With new added
"kdumpctl reload" command, admins could rebuild the image easily, and
should rebuild the initramfs on hardware change manually.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
On powerpc, after hot add cpu and trigger crash on the hot-added cpu, the
kdump kernel hangs after "I'm in purgatory".
The current udev rules expects the dtb to be rebuit on cpu add/remove event.
But since powerpc does not follow the standard cpu hot add framework, it
only ejects online/offline event to user space when cpu is hot
added/removed, instead of add/remove event. Pingfan tried fixing that but
it didn't please the maintainer as it breaks some old userspace tools.
Due to the failure of dtb's rebuilding, KDump kernel fails to get the
'boot_cpuid' and eventually fails to boot [see early_init_dt_scan_cpus() in
arch/powerpc/kernel/prom.c file] if system crashes on hot-added CPU.
Work around it by changing udev rules on powerpc to onlne/offline.
As for offline message, it is even useless on powerpc, and can be dropped.
See the explain: On powerpc, /sys/devices/system/cpu/cpuX nodes are present
for all "possible", irrespective of whether a CPU is hot-added/removed.
crash_notes are already built for all /sys/devices/system/cpu/cpuX nodes and
these nodes are present for all "possible" CPUs
(online/offline/could-be-hot-removed/could-be-hot-added)
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
kdumpctl fails to load a crash kernel if KASLR is enabled.
The incorrect configuration of the kptr_restrict parameter causes that
the kdumpctl fails to load a crash kernel. The problem occurs when the
Kernel Address Space Layout Randomization (KASLR) mechanism is enabled.
The value of kptr_restrict decides how to expose the content of /proc
files. Inappropriate setting will cause that the /proc/kallsyms file is
being printed as all zeros. This configuration will break the kdump
loading process since it needs access to the conntent of /proc/kallsyms
if KASLR is enabled.
Set kptr_restrict to 1 to avoid the problem described above.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
These two command are not documented, update the man page to let user
know how to use them.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Use "kdumpctl rebuild" to rebuild the image directly. This could help
admins to rebuild kdump image directly.
Also merge fadump related initramfs backup/restore into setup_initrd,
and do permission only when actually trying to rebuild the image.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Seems some dead codes are left here for historical reason, just remove
them, read the config and strip comments only for once.
This improve the speed by a lot (2.6s -> 0.498s for reading a
simple config in my test case, on HDD) and make the code cleaner.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
With kernel commit 0823c68b054b ("powerpc/fadump: re-register firmware-
assisted dump if already registered") support is enabled to re-register
when FADump is alredy registered. Leverage that option in kdump scripts.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Kairui Song <kasong@redhat.com>
The idea behind adding support for dracut '--rebuild' option was to
ensure the initrd built for fadump takes into consideration all the
build parameters passed to original initrd. Pass original initrd
instead of current default initrd for rebuild as current initrd
might already have build parameters from original initrd along
with parameters from previous fadump intird build making the
build parameters look like this after a few iterations:
-H --persistent-policy 'by-uuid' -f --quiet --hostonly --hostonly-
cmdline --hostonly-i18n --hostonly-mode 'strict' -o 'plymouth dash
resume ifcfg' --mount '/dev/mapper/rhel_zzfp219--lp3-home /kdumproot
//home xfs defaults' -f --kver '4.18.0-60.el8.ppc64le' --quiet
--hostonly --hostonly-cmdline --hostonly-i18n --hostonly-mode 'strict'
-o 'plymouth dash resume ifcfg' --mount '/dev/mapper/rhel_zzfp219--lp3-home
/kdumproot//home xfs defaults' -f --kver '4.18.0-60.el8.ppc64le' --quiet
--hostonly --hostonly-cmdline --hostonly-i18n --hostonly-mode 'strict'
-o 'plymouth dash resume ifcfg' --mount '/dev/mapper/rhel_zzfp219--lp3-home
/kdumproot//home xfs defaults' -f --kver '4.18.0-60.el8.ppc64le' --include
'/tmp/fadump.initramfs' '/etc/fadump.initramfs' --include
'/tmp/fadump.initramfs' '/etc/fadump.initramfs' --include
'/tmp/fadump.initramfs' '/etc/fadump.initramfs'
--
Since it is not desirable to build initrd with stale and/or duplicate
build parameters, use original initrd (backed up) to rebuild fadump
initrd, instead of current default initrd.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
eppic project have moved to github, update to latest upstream snapshot,
change source link and tar file naming style to fit github's URL format.
This fix the O0 warning reported by annocheck and passes all distro package
flag checking.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Backport the patches required to make the hardening build flags work with
kexec-tools and makedumpfile, and enabld hardening flags in spec file.
This will make the pacakge pass all warnings for kexec and makedumpfile
reported by annocheck.
Didn't find any issue with basic tests with kexec and makedumpfile.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Currently we use "\b" (word boundary) as the delimiter for ro option,
which is not correct. For mount options like
"defaults,errors=remount-ro" the ro on the tail will also be replaced
and result in an invalid mount option.
So we use a more strict logic on detecting ro mount option. It should
either starts with "," or "^" (begin of line) and ends with "," or "$"
(end of line), and keep the delimiter untouched. This should ensure
only valid mount option got detected and replaced.
This passed following tests:
defaults,ro,noauto,errors=remount-ro,nobootwait,nofail => defaults,rw,errors=remount-ro,
defaults,errors=remount-ro => defaults,errors=remount-ro
defaults,ro,relatime => defaults,rw,relatime
defaults,ro => defaults,rw
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Since early kdump is generally used for capturing vmcore when
boot-time panic occurs, if a system always reboots after capturing
vmcore, it can go into a crash loop.
To avoid this issue, this patch add a note of 'final_action' option
to the early kdump document.
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
If a crash occurs repeatedly after enabling kdump, the system goes
into a crash loop and the dump target may get filled up by vmcores.
This is likely especially with early kdump.
This patch introduces 'final_action' option to kdump.conf, in order
for users to be able to power off the system even after capturing
a vmcore successfully.
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
In preparation for adding 'final_action' option, since it's confusing
to have the 'final_action' and 'default' options at the same time,
this patch introduces 'failure_action' as an alias of the 'default'
option to /etc/kdump.conf, and makes 'default' obsolete to be removed
in the future.
Also, the "default action" term is renamed to "failure action".
Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
earlykdump is not suppose to be loaded for a kdump initramfs, and user
may add it into dracut's config file so it will be included by default.
It will also make the image building always fail because earlykdump
actually detect if it's being used for kdump image and raise an error if
so.
In that case, we always force drop this module to avoid such problem.
Signed-off-by: Kairui Song <ryncsn@gmail.com>
Acked-by: Dave Young <dyoung@redhat.com>
Previously we handled the case when the installed kernel version for
early kdump is different from dracut target, it will be better to
print a warning even if installation successed, to let the user know
that an different kernel is used.
No warn message will be given if the user specified a KDUMP_KERNELVER
value, as in such case a different kernel version is used on purpose.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Currently when earlykdump failed to install required kernel image or
initramfs, it will still install the earlykdump hook and other utils.
But it won't work due to the absent of kernel image or kdump initramfs,
so the hook and installed utils is meanless.
We can't simply fail dracut building, as if earlykdump is included by
dracut config file, this may fail kernel update, where kernel image is
installed but initramfs failed to generate, and then it will fail
booting.
So this patch let it skip earlydkump install if anything is missing and
give a clean error message to let the user better ware of the situation.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
There is currently a problem with earlykdump image building, when a user
is upgrading kernel, dracut will generate new initramfs for the new
kernel, and earlykdump will install currently running version of kernel
into the initramfs, and remain the version based kernel image naming
untouched. But after a reboot the new kernel is running, and it
will try to load the image corresponding to the new kernel version by
file naming.
This patch fixes the problem by creating a symlink with unified stable
naming to the installed kernel image and initramfs, and use the symlink
instand so it will always work despite the kernel version number change.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Early kdump always fails to load the vmlinuz-xxx after the 'binutils'
package has been installed, and outputs the following messages:
...
dracut-cmdline[309]: Cannot determine the file type of /boot/vmlinuz-4.18.0-51.el8.x86_64
dracut-cmdline[309]: kexec: failed to load early-kdump kernel
...
The reason is that the vmlinuz-xxx image is mistakenly stripped when
using dracut to generate the kdump initrd. Because dracut always find
all executable binary files to strip only if the 'binutils' package
is installed, otherwise it will skip the stripping.
Therefore, remove the executable permissions of the vmlinuz-xxx in
'${initrd}' in order to let dracut skip the mistakenly stripping.
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Without this patch, when there are two or more spaces after 'path'
configuration phrase with ssh or nfs setting, SAVE_PATH is set to
'/var/crash' in mkdumprd, and in most cases kdump service fails to
start by checking the /var/crash directory regardless of the path
value.
ssh kdump(a)192.168.122.1
path /kdump
^^
This behavior would be too sensitive and different from the other
configurations. With this patch, mkdumprd allows such spaces.
Signed-off-by: Kazuhito Hagio <k-hagio(a)ab.jp.nec.com>
Acked-by: Kairui Song <kasong@redhat.com>