When the 1st kernel uses KMS and crashes, the 2nd kernel can't reset to
nomodeset and the screen stays black. In this case, the user can't observe
the boot/dump progress or run commands in the shell.
So let's pull in the drm dracut module to fix this.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
For an ssh dump, if the user doesn't have write permission on the save path
of the server, the crash kernel can be loaded successfully, but kdump will
eventually fail because writing is not allowed.
Let's check this in the service start phase; if there is no write
permission, print an error message and exit.
To differentiate, rename the old function mkdir_save_path
to mkdir_save_path_fs.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
We shouldn't print which dracut modules are used when rebuilding the kdump
initrd. It's confusing to the user.
And since we've introduced dracut_args in kdump.conf, we can safely
remove this mandatory -M and let users add it as needed.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Dracut always mounts the root fs, but the mount is not guaranteed to succeed
because we are in crash/kdump context. So the selinux policy cannot
depend only on chroot load_policy.
Per discussion with Vivek and the SELinux people, relabel kdump files
when the service restarts.
Currently only the cases below are considered:
1. target mounted in 1st kernel
2. target mounted as rw; if users mount it as 'ro' they will have to
relabel the files by themselves.
3. save path is not masked; this means if /var/crash is mounted on another
disk which is different from the dump target, it will not be visible to the
user, so the user needs to relabel the files manually.
4. only local filesystem based targets.
Tested on F19 machine.
Tested local fs dump and network dump along with different save path
to address above mentioned cases.
Vivek: use function name is_dump_target_configured
use getfattr -m "security.selinux" instead of ".*"
Daniel: use restorecon instead of chcon.
dyoung: keep minix in local fs list since it has not been deprecated yet.
Vivek: wrap is_dump_target_configured checking in function path_to_be_relabeled
dyoung: use awk instead of cut to print config value for different
space delimiters
dyoung: mute df error message: `df $_mnt/$_path 2>/dev/null`
For nfs restorecon, since it will be in 3.11 kernel, we can add it when it's
ok in Fedora.
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Depend on 98selinux if selinux is enabled in the 1st kernel so we can
load_policy and correctly label the vmcore/vmcore-dmesg files.
Since dracut always mounts rootfs, 98selinux will chroot and load_policy,
so this will be ok for Fedora. In case the rootfs mount fails we have to check
and relabel the vmcore files; the kdumpctl relabeling code will be added in
another patch.
add 'dracut_args --printsize' to /etc/kdump.conf, it shows below added size:
selinux install size: 16k
Tested on F19:
With this patch applied, vmcore selinux attr is ok.
v1->v2: use sestatus 2>/dev/null to mute error messages
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
In dracut-kdump.sh, kdump did not umount rootfs after dump_to_rootfs, just
like dump_fs does. And in kdump, the FINAL_ACTION is "reboot -f", no umount
action is taken.
Even though "sync" has been executed, it's safer to take a "umount rootfs"
action. Anyway no harm to umount.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
When doing kdump, if the capture kernel crashes for some reason, the default
behavior appears to be hanging the system without rebooting. We at least
need an option to reset if the capture kernel crashes. Business critical
customers tend to want the system to reboot without manual intervention.
The kernel provides a parameter "panic=n" to solve this problem. If this parameter
is given, the capture kernel will reboot after n seconds in case it panics.
Now add this parameter to "KDUMP_COMMANDLINE_APPEND", and set the default
waiting time to 10 seconds.
It's tested on KVM f19, and passed.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
To reserve a chunk of memory for the kdump kernel, args need to be
appended to the kernel cmdline. Different architectures use different
bootloaders and related config files, which is a little
annoying. With grubby, it is very easy to append a single
arg to the kernel cmdline, and this saves words in the howto document.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: WANG Chao <chaowang@redhat.com>
Currently the kexec-tools package contains udev rules for kdump
that reload kdump in case of memory or CPU hotplug:
$ cat /etc/udev/rules.d/98-kexec.rules
SUBSYSTEM=="cpu", ACTION=="online", PROGRAM="/bin/systemctl try-restart kdump.service"
SUBSYSTEM=="cpu", ACTION=="offline", PROGRAM="/bin/systemctl try-restart kdump.service"
SUBSYSTEM=="memory", ACTION=="add", PROGRAM="/bin/systemctl try-restart kdump.service"
SUBSYSTEM=="memory", ACTION=="remove", PROGRAM="/bin/systemctl try-restart kdump.service"
On other architectures the rules are necessary because the memory
and CPU layout is stored in the kdump in-memory ELF header at kdump
load time. Therefore the kdump kernel has to be reloaded each
time the CPU or memory configuration changes.
This has drawbacks:
1. During kdump reload the system can't be dumped.
2. On systems with many hotplug events (e.g. on s390 with cpuplugd)
this creates significant overhead
The reload is not necessary on s390 because there the ELF header is
created in the 2nd (kdump) kernel. Therefore, to improve things,
remove the rules for s390.
Log is from IBM, and the patch has been tested by IBM and works well.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Upon encountering a failure, dracut can drop the user to an emergency shell.
But in the kdump environment the kdump module wants to do the error handling
and handle the error as specified by the user in the kdump.conf file (halt,
reboot etc). Now dracut provides an option action_on_fail=continue,
which means dracut just ignores the failure, continues, and expects the
module to handle the error.
Modify kdump.sysconfig to pass action_on_fail=continue to dracut.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
In kdump.conf, the space character is used as the delimiter by default.
In kdump_install_conf of dracut-module-setup.sh, if core_collector is
specified with a tab delimiter, the tool may not be
copied into kdump-initrd.
E.g, core_collector scp -v
And in dump_ssh of dracut-kdump.sh, dumping will fail because of
the tab character in core_collector.
Change the code to allow a tab as delimiter when specifying
core_collector.
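As a minimal sketch of delimiter-tolerant parsing (not the code from this patch), the hypothetical helper get_conf_value below uses awk's default field splitting, which accepts any run of spaces or tabs between the option name and its value:

```shell
# Hypothetical helper, for illustration only: print the value of a
# kdump.conf-style option, whatever mix of spaces/tabs separates it
# from the option name.
get_conf_value() {
    # $1 = option name, $2 = config file
    awk -v key="$1" '
        $1 == key && NF > 1 {
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")   # drop the key and the delimiter
            sub(/[ \t]+$/, "")                # drop trailing blanks
            print
            exit                              # first match wins
        }' "$2"
}
```

Calling it as `get_conf_value core_collector /etc/kdump.conf` would print the collector command line regardless of which delimiter the user typed.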
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
dracut pre-pivot systemd service has below settings:
StandardOutput=syslog
StandardError=syslog+console
Thus kdump_pre/kdump_post output will disappear. The output is useful
for users: in case of any failure the user can watch the console log to see
what's wrong.
Dracut/Systemd people do not want to change the service settings.
So let's redirect stdout to stderr to fix it.
Per vivek: redirect whole kdump.sh stdout to stderr instead of only fix
for kdump_pre and kdump_post.
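The redirection itself is a one-liner. The sketch below wraps it in a hypothetical function (run_with_stdout_on_stderr, not part of kdump.sh) so that the effect can be tried without changing the calling shell:

```shell
# Hypothetical wrapper, for illustration only: run a command with its
# stdout pointed at whatever stderr currently refers to.
run_with_stdout_on_stderr() {
    (
        exec 1>&2    # from here on, stdout is a duplicate of stderr
        "$@"
    )
}
```

In a script that should be redirected as a whole, a bare `exec 1>&2` near the top has the same effect for everything that follows.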
Tested on F19.
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Save vmcore-dmesg.txt before saving vmcore. For ssh targets, it assumes
that ssh is enabled. No scp logic as I don't have a local copy of
kernel log buffers and saving one will consume extra memory. We
can possibly enhance this logic to save kernel log buffers first locally
and then scp it (For setups which allow scp but disable ssh access).
(log is from Vivek Goyal <vgoyal@redhat.com>)
And add 1 section to describe it in kexec-kdump-howto.txt
v3->v4:
Remove old description of dmesg in kexec-kdump-howto.txt, now
add a new section to describe it, and note user kernel log
buffers won't be available if dump target is raw device.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Currently we overwrite 40ip.conf to make ip and ifname both at the first
line. But getarg() of dracut doesn't have the limitation that all
cmdline args should be at the first line. Therefore, we can remove the
overwrite safely.
After applying this patch, in 2nd kernel,
kdump:/# cat /etc/cmdline.d/40ip.conf
ip=eth0:dhcp
ifname=eth0:52:54:00:b2:98:05
kdump:/# source /usr/lib/dracut/dracut-lib.sh
kdump:/# getarg ip
ip=eth0:dhcp
kdump:/# getarg ifname
ifname=eth0:52:54:00:b2:98:05
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
The kdump module doesn't check whether a bridge is stacked on another complex
interface and set up the proper dracut cmdline. That makes dracut fail to set
up a working network environment in the 2nd kernel.
This patch adds the ability to set up the proper dracut cmdline for bridge over
bond/team/vlan. Although at this time dracut only supports bridge over
bond among these three complex network types, it's worth fixing the other two
(bridge over team/vlan) along with it. It will be much easier for us once
the dracut part is done.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
dracut takes bond=<bondname>[:<bondslaves>:[:<options>]] syntax to parse
bond. For example:
bond=bond0:eth0,eth1:mode=balance-rr
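A rough sketch of assembling such an argument follows. The helper name build_bond_arg is hypothetical, and joining multiple options with commas is an assumption about the options field rather than something stated in this log:

```shell
# Hypothetical helper, for illustration only: build a dracut bond= argument.
# Assumption: slaves and options are each passed as one space-separated
# string and are joined with commas in the result.
build_bond_arg() (
    # $1 = bond name, $2 = slaves, $3 = options (may be empty)
    slaves=$(printf '%s' "$2" | tr -s ' ' ',')
    opts=$(printf '%s' "$3" | tr -s ' ' ',')
    printf 'bond=%s:%s%s\n' "$1" "$slaves" "${opts:+:$opts}"
)
```

With the values from the example above, `build_bond_arg bond0 "eth0 eth1" "mode=balance-rr"` reproduces the same string.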
Update v2:
- Get bonding options from corresponding ifcfg. Because it's hard to keep
track of all the runtime configurable options under /sys/class/net/$netif/
- Remove kdump_get_bond_mode, since it's useless now.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
mkdumprd calls dracut to rebuild the kdump initrd, and sometimes passing extra
dracut args is helpful. For example, the user can enable debug output with
--debug, use --printsize to print the rough initramfs size increase of each
module, use --omit-drivers to omit kernel modules, etc.
This patch enables the dracut_args option for passing extra args to dracut.
It also modifies add_dracut_arg() to treat a quoted string as a single
string, because for dracut options which take their own args, the args need
to be quoted and space separated.
If add_dracut_arg() gets a string read from kdump.conf and that string
contains double quotes, then while converting to positional parameters
those double quotes are not interpreted. Hence if /etc/kdump.conf contains
the following:
dracut_args --add-drivers "driver1 driver2"
then add_dracut_arg() sees the following positional parameters:
$1= --add-drivers
$2= "driver1
$3= driver2"
Notice that the double quotes have not been interpreted and the parameters
have been split on white space.
Modify add_dracut_arg() to look for parameters starting with " and,
if one is found, merge all the following parameters until one
is found with an ending double quote, effectively simulating
the following behavior:
$1= --add-drivers
$2= "driver1 driver2"
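The merging step can be sketched as below. This is not the add_dracut_arg() from the patch; merge_quoted_args is a hypothetical stand-in that prints one merged argument per line and, as an assumption for readability, strips the surrounding quotes:

```shell
# Hypothetical sketch, for illustration only: re-join words that were split
# inside a double-quoted substring. Runs in a subshell so its variables
# do not leak into the caller.
merge_quoted_args() (
    merged=""      # words collected so far for the current quoted argument
    in_quote=0     # 1 while between an opening and a closing quote
    for word in "$@"; do
        if [ "$in_quote" -eq 0 ]; then
            case "$word" in
                \"*\")  # a complete quoted word, e.g. "foo"
                    word=${word#\"}
                    printf '%s\n' "${word%\"}" ;;
                \"*)    # opening quote: start collecting
                    merged=${word#\"}
                    in_quote=1 ;;
                *)      # ordinary word: pass through
                    printf '%s\n' "$word" ;;
            esac
        else
            case "$word" in
                *\")    # closing quote: finish the merged argument
                    word=${word%\"}
                    merged="${merged:+$merged${word:+ }}$word"
                    printf '%s\n' "$merged"
                    in_quote=0 ;;
                *)      # still inside the quotes: keep collecting
                    merged="${merged:+$merged }$word" ;;
            esac
        fi
    done
)
```

Because a lone `"` word contributes nothing to the merged string, leading and trailing spaces inside the quotes are dropped as a side effect.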
[v1->v2]: address quoted substring in dracut_args, also handle the leading
and ending spaces in substring.
[v2->v3]: fix dracut arguments separator in kdump.conf.
[v3->v4]: improve changelog, thanks vivek.
[v4->v5]: make the manpage more verbose [vivek].
Tested with below dracut_args test cases:
1. dracut_args --add-drivers "pcspkr virtio_net" --omit-drivers "sdhci-pci hid-logitech-dj e1000"
2. dracut_args --add-drivers " pcspkr virtio_net " --omit-drivers "sdhci-pci hid-logitech-dj e1000"
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
commit 97e107b "Add support for team devices" introduced ethtool to
get permanent address.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
We do not support dumping to an encrypted disk now, so add functions to
error out if any of the dump targets is encrypted.
This patch is based on the check resettable patches from BaoQuan which added
some dracut functions for iterating block devices.
Currently dracut supports an encrypted rootfs, but it needs the passcode to
be entered interactively. It might be possible to use a keyfile to pass the
key check. But let's first check and error out. If such a requirement comes
up in the future, we can look into it then.
Tested in F18 with encrypted root, encrypted disk other than root and
dump_to_rootfs with encrypted root.
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Below patches were applied to kexec-tools-2.0.3, the latest
kexec-tools-2.0.4 has included them. Delete them here.
kexec-tools-2.0.3-Load-bzImages-smaller-than-32-KiB.patch
kexec-tools-2.0.3-kdump-pass-acpi_rsdp-to-2nd-kernel-for-efi-booting.patch
kexec-tools-2.0.3-ppc-exec-stack-fix.patch
kexec-tools-2.0.3-ppc-ppc64-compile-purgatory-code-with-gcc-option-msoft-float.patch
kexec-tools-2.0.3-vmcore-dmesg-Do-not-write-beyond-end-of-buffer.patch
kexec-tools-2.0.3-vmcore-dmesg-vmcore-dmesg-Make-it-work-with-new-stru.patch
Some Smart Array (hpsa/cciss) adapters don't support reset, so we need
to disable kdump on those devices, like rhel6 did.
In this patch, the dump target is checked according to the criteria
below if it's a block device.
If it's a cciss disk and is resettable, it can be used as dump target.
If it's a cciss disk but is not resettable, it cannot be used as dump
target.
If it's a cciss disk and not resettable, but the user sets OVERRIDE_RESETTABLE
to 1 in /etc/sysconfig/kdump, it can be used as dump target, because the
user knows the situation and wants to give it a try.
In this patch, added codes include 4 parts:
1)Add an option "override_resettable <0 | 1>" into kdump.conf, and
add related section into kdump.conf man page. In mkdumprd, will check
whether user has set a value, get that value if yes. By default, the
value is 0.
2)port utility functions from dracut-functions.sh.
3)The check_resettable function checks if dump target is a resettable
block device. This includes the case where default action dump_to_rootfs
is set.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
We use functions such as get_persistent_dev to pass stdout to a variable,
but in some cases they echo an error message and exit. Instead of
redirecting every echo to stderr, this patch adds a function perror_exit
to handle this and to simplify/clean up the related code.
Also add another function perror() for cases where there is no need to exit.
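A minimal sketch of what such helpers could look like is shown below. The bodies are an assumption based on the description above, and get_dev_or_fail is a hypothetical caller added only to show why stderr matters when stdout is captured:

```shell
# Sketch only: print a message on stderr so it cannot pollute stdout that
# a caller captures with $(...).
perror() {
    echo "$@" >&2
}

# Sketch only: same as perror, then terminate the current (sub)shell.
perror_exit() {
    perror "$@"
    exit 1
}

# Hypothetical caller: stdout carries only the result.
get_dev_or_fail() {
    [ -n "$1" ] || perror_exit "no device given"
    echo "/dev/$1"
}
```

With this split, `dev=$(get_dev_or_fail "$name")` either yields a clean value or leaves `dev` empty with a non-zero status, and the message still reaches the console.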
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
The current blacklist option is different from the option in rhel6. In the
current implementation blacklist just means omitting the driver, but it should
really prevent the driver from being loaded in the initramfs.
To stay consistent, mark the option as deprecated. Users are advised
to use the dracut kernel cmdline option rd.driver.blacklist instead.
[v1->v2]: improve man page description, thanks Vivek.
Tested in kvm guest with rd.driver.blacklist in kdump sysconfig
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
makedumpfile is not supported on ppc and s390, so it makes
no sense to create the eppic_makedumpfile sub-package if there is
no makedumpfile binary to run it with.
Remove eppic contents related to ppc and s390 in kexec-tools.spec. This
will not build and install eppic on ppc and s390.
There's one mistake in the rules related to eppic in kexec-tools.spec
that caused kexec-tools-eppic to fail installation on i386. This
patch removes that rule line.
Meanwhile update eppic_030413.tar.gz.
This feature enables us to specify rules to scrub data in a
dumpfile with eppic macro instead of the current configuration
file (makedumpfile.conf). Currently, this feature works only
for symbols in vmlinux while the current feature can work also
for module symbols.
This library is backported from upstream, integrated and tested by
Dave Anderson.
If CORE_COLLECTOR is makedumpfile, "-F" is only allowed on ssh/raw,
removing it when dump_to_rootfs is necessary.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Firstly rename dump_rootfs to dump_to_rootfs to remove the ambiguity
about dump_rootfs. Then add it as one of default options. That means
user can specify dump_to_rootfs to be default action manually, then
it will take action when specified target dump failed.
Secondly, in rhel7 and fedora, when default action is not specified,
the default 'default' is dump_to_rootfs. Namely when specified target
dump failed, the kdump initrd will mount root and save kdump from
initramfs context. However in rhel6, the default 'default' is 'reboot'.
That means when specified target dump failed, the kdump initrd will
reboot systems. For being consistent with rhel6, change the default
'default' back to 'reboot'. And this can also keep logic simple, easier
to understand. Primarily, our default dump target is the root filesystem.
So keeping "default" as "dump_to_rootfs" and trying to dump to root
filesystem again when first attempt fails does not make much sense.
Meanwhile add the relevant description into kdump.conf,kdump.conf.5
and kexec-kdump-howto.txt.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
blkid does not support UUID/LABEL with quotes. Remove the quotes before converting
to dev name, or the resulting devname will be empty.
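Stripping the quotes can be done with a single tr call. The helper name strip_quotes below is hypothetical, and the sketch stops short of the device lookup itself:

```shell
# Hypothetical helper, for illustration only: drop double quotes from a
# dump-target spec such as UUID="..." or LABEL="..." before it is handed
# to a lookup tool.
strip_quotes() {
    printf '%s\n' "$1" | tr -d '"'
}
```

For example, a target written as `UUID="abcd-1234"` in the config comes out as `UUID=abcd-1234`.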
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Add a function check_config to check kdump config file.
1. move multi dump target checking into this function
2. check invalid config options and obsolete config options
3. check null config value.
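The three checks can be sketched roughly as follows. This is not the check_config from the patch: the function name check_config_sketch, the option lists, and the message wording are all illustrative assumptions:

```shell
# Hypothetical sketch, for illustration only. The lists of valid options
# and dump targets are examples, not the real kdump.conf option set.
check_config_sketch() (
    targets=0
    while read -r key value; do
        case "$key" in
            ''|\#*) continue ;;                        # blank line or comment
            ssh|nfs|raw) targets=$((targets + 1)) ;;   # counts as a dump target
            path|core_collector|default|dracut_args) ;;
            *)  echo "Invalid kdump config option $key" >&2
                exit 1 ;;
        esac
        if [ -z "$value" ]; then
            echo "Invalid kdump config value for option $key" >&2
            exit 1
        fi
    done < "$1"
    if [ "$targets" -gt 1 ]; then
        echo "More than one dump target specified" >&2
        exit 1
    fi
)
```

The function body runs in a subshell, so `exit 1` only sets the function's return status and the caller decides how to report the failure.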
[v2->v3]: add detail doc about deprecated options in kdump.conf manpage.
[v3->v4]: print out the bad config option in case it is not valid.
[v4->v5]: improve documentation according to comments from Vivek.
[v5->v6]: s/Deprecated/Invalid for invalid config options.
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Marc Milgram <mmilgram@redhat.com>
check_config actually checks the file timestamps and rebuilds the initrd.
Rename it to check_rebuild so that check_config can be used for checking
whether the config file is valid.
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Move the target mount check a little earlier to ensure the
dump target is mounted, and fail early before any other handling.
This change also cleans up the related code a bit.
Tested UUID/devname local dump, also tested the non-exist kdump target.
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Previously to_dev_name used blkid to get the dev name from the dump target,
but blkid cannot handle UUID/LABEL with quotes, so to_dev_name would
silently fail.
Because we enforce that the dump target is mounted before creating the kdump
initrd, switching to findmnt is fine. findmnt can handle input
params with quotes.
to_dev_name is not necessary anymore, so just remove it.
The other user of to_dev_name checks whether the dev is root
or not; change it to use findmnt as well.
Tested the rootfs dump, UUID with/without quotes dump.
Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Caspar Zhang <czhang@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
In the old code, kdumpctl exits directly when check_ssh_target fails,
without printing "Starting kdump: FAILED". So when "kdumpctl restart" is
invoked manually, it only prints "Stopping kdump: OK", but no "Starting
kdump: FAILED". That is unreasonable.
In this patch, change check_ssh_target() to return when it fails. Then
check the returned value in the start() function and print the status if the
returned value is not 0.
Meanwhile change spaces to tabs in check_ssh_target() to make
it consistent with the rest of the script file.
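The return-and-report pattern described here can be sketched with two hypothetical functions. check_target_sketch is a placeholder that only tests for an empty argument; it does not contact any ssh server:

```shell
# Hypothetical stand-in for check_ssh_target(): signal failure through the
# return status instead of calling exit.
check_target_sketch() {
    [ -n "$1" ] || return 1
    return 0
}

# Hypothetical stand-in for start(): inspect the status so that a final
# status line is always printed.
start_sketch() {
    if ! check_target_sketch "$1"; then
        echo "Starting kdump: FAILED"
        return 1
    fi
    echo "Starting kdump: OK"
}
```

Because the check returns rather than exits, the caller stays in control and a restart prints both the stop and the start status.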
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
In kdumpctl, some messages are incomplete, like "Starting kdump:" or
"Stopping kdump:". Now add the service status to the end of such
messages.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
In fedora, systemd takes control of services. During bootup and when
"systemctl restart kdump.service" is invoked manually, standard output/error
are both redirected to journal/syslog. So a dedicated LOGGER is useless
in kdumpctl.
In this patch, remove the code related to LOGGER. To still notify the user,
print the replacement messages to standard output/error.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
tune sysconfig to save 2nd kernel memory usage
The memory in the 2nd kernel is limited, so we need to use as little memory as we can
to ensure the vmcore is captured successfully.
I'm doing below improvements in this patch:
1)
numa support is not necessary for kdump kernel, so disable it by adding numa=off
to save some kernel mm memory usage.
2)
Also add udev.children-max=2 to cmdline to limit max udev children processes.
3)
For ppc64, ehea driver will by default enable multi queue feature which will
use a lot of memory. Almost each ppc machine will oom for network(ssh/nfs)
kdump. The module param use_mcs=0 is used to disable multi queue feature.
Tested these params on an IBM machine with 2 numa nodes which ooms even for
local dump to rootfs.
With this patch oom does not happen for local/ssh dump, but for nfs dump oom
still happens in the middle of makedumpfile vmcore copying. So further
improvements are still needed.
For ehea driver there's other params we can use, but because it's hard to
measure the saved memory, I'm waiting for input from IBM people. We can add
them later.
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
This reverts commit 05b67ee95c.
The old commit was merged as an urgent bug fix on release 1.5.1
of makedumpfile. Upstream has now been updated to v1.5.3, which
already includes this patch.
Revert it in order to update to makedumpfile v1.5.3.