Commit Graph

57 Commits

Author SHA1 Message Date
Philipp Rudo
258d285c63 kdump-lib-initramfs: remove is_fs_dump_target
The function isn't used anywhere. Thus remove it to keep
kdump-lib-initramfs small and simple.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-17 14:49:51 +08:00
Philipp Rudo
ca306cd403 kdump-lib-initramfs: harden is_mounted
If the device/mountpoint for findmnt is omitted findmnt will list all
mounted filesystems. In that case it will always return "true". So
explicitly check if an argument was passed to prevent false-positives.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-17 14:49:51 +08:00
Coiby Xu
12d9eff9dc Show how much time kdump has waited for the network to be ready
Relates: https://bugzilla.redhat.com/show_bug.cgi?id=2151504

Currently, when the network isn't ready, kdump would repeatedly print
the same info,

    [   29.537230] kdump[671]: Bad kdump network destination: 192.123.1.21
    [   30.559418] kdump[679]: Bad kdump network destination: 192.123.1.21
    [   31.580189] kdump[687]: Bad kdump network destination: 192.123.1.21

This is not user-friendly and users may think kdump has got stuck. So
also show much time has waited for the network to be ready,

    [   29.546258] kdump[673]: Waiting for network to be ready (50s / 10min)
    ...
    [   32.608967] kdump[697]: Waiting for network to be ready (56s / 10min)

Note kdump_get_ip_route no longer prints an error message and it's up to
the caller to determine the log level and print relevant messages. And
kdump_collect_netif_usage aborts when kdump_get_ip_route fails.

Reported-by: Martin Pitt <mpitt@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-04-15 06:39:17 +08:00
Coiby Xu
568623e69a Address the cases where a NIC has a different name in kdump kernel
A NIC may get a different name in the kdump kernel from 1st kernel
in cases like,
 - kernel assigned network interface names are not persistent e.g. [1]
 - there is an udev rule to rename the NIC in the 1st kernel but the
   kdump initrd may not have that rule e.g. [2]

If NM tries to match a NIC with a connection profile based on NIC name
i.e. connection.interface-name, it will fail the above bases. A simple
solution is to ask NM to match a connection profile by MAC address.
Note we don't need to do this for user-created NICs like vlan, bridge and
bond.

An remaining issue is passing the name of a NIC via the kdumpnic dracut
command line parameter which requires passing ifname=<interface>:<MAC> to
have fixed NIC name. But we can simply drop this requirement. kdumpnic
is needed because kdump needs to get the IP by NIC name and use the IP
to created a dumping folder named "{IP}-{DATE}". We can simply pass the
IP to the kdump kernel directly via a new dracut command line parameter
kdumpip instead. In addition to the benefit of simplifying the code,
there are other three benefits brought by this approach,
  - make use of whatever network to transfer the vmcore. Because  as long
    as we have the network to we don't care which NIC is active.
  - if obtained IP in the kdump kernel is different from the one in the
    1st kernel. "{IP}-{DATE}" would better tell where the dumped vmcore
    comes from.
  - without passing ifname=<interface>:<MAC> to kdump initrd, the
    issue of there are two interfaces with the same MAC address for
    Azure Hyper-V NIC SR-IOV [3] is resolved automatically.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1121778
[2] https://bugzilla.redhat.com/show_bug.cgi?id=810107
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1962421

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Thomas Haller <thaller@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-11-23 06:39:27 +08:00
Tao Liu
10ca970940 lvm.conf should be check modified if lvm2 thinp enabled
lvm2 relies on /etc/lvm/lvm.conf to determine its behaviour. The
important configs such as thin_pool_autoextend_threshold and
thin_pool_autoextend_percent will be used during kdump in 2nd
kernel. So if the file is modified, the initramfs should be
rebuild to include the latest.

Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-11-01 12:20:34 +08:00
Tao Liu
0a5b71d123 Add lvm2 thin provision dump target checker
We need to check if a directory or a device is lvm2 thinp target.

First, we use get_block_dump_target() to convert dump path into
block device, then we check if the device is lvm2 thinp target by
cmd lvs.

is_lvm2_thinp_device is now located in kdump-lib-initramfs.sh, for it
will be used in 2nd kernel. is_lvm2_thinp_dump_target is located in
kdump-lib.sh, for it is only used in 1st kernel, and it has dependencies
which exist in kdump-lib.sh.

Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-11-01 12:20:34 +08:00
Coiby Xu
15122b3f98 Fix grep warnings "grep: warning: stray \ before -"
Latest grep (3.8) warnings about unneeded backslashes when building
kdump initrd [1],
    kdump: Rebuilding /boot/initramfs-6.0.0-0.rc5.a335366bad13.40.test.fc38.aarch64kdump.img
    grep: warning: stray \ before -
    grep: warning: stray \ before -
    grep: warning: stray \ before -
    grep: warning: stray \ before -
    grep: warning: stray \ before -

Some warnings can be avoided by using "sed -n" to remove grep and the
others can use the -- argument.

[1] https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2022/09/17/redhat:643020269/build_aarch64_redhat:643020269_aarch64/tests/4/results_0001/job.01/recipes/12617739/tasks/5/logs/taskout.log

Reported-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Suggested-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-10-26 14:16:04 +08:00
Tao Liu
c743881ae6 virtiofs support for kexec-tools
This patch add virtiofs support for kexec-tools by introducing a new option
for /etc/kdump.conf:

virtiofs myfs

Where myfs is a variable tag name specified in qemu cmdline
"-device vhost-user-fs-pci,tag=myfs".

The patch covers the following cases:
1) Dumping VM's vmcore to a virtiofs shared directory;
2) When the VM's rootfs is a virtiofs shared directory and dumping the
   VM's vmcore to its subdirectory, such as /var/crash;
3) The combination of case 1 & 2: The VM's rootfs is a virtiofs shared
   directory and dumping the VM's vmcore to another virtiofs shared
   directory.

Case 2 & 3 need dracut >= 057, otherwise VM cannot boot from virtiofs
shared rootfs. But it is not the issue of kexec-tools.

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
2022-09-29 12:22:49 +08:00
Philipp Rudo
7cd3f232d5 kdump-lib-initramfs: merge definitions for default ssh key
There are currently three identical definitions for the default ssh key.
Combine them into one in kdump-lib-initramfs.sh.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2022-04-02 16:24:32 +08:00
Pingfan Liu
72ef6929a6 move variable FENCE_KDUMP_SEND from kdump-lib.sh to kdump-lib-initramfs.sh
Since kdump-lib-initramfs.sh is included by kdump-lib.sh, and
FENCE_KDUMP_SEND is used by both 1st and 2nd kernel, moving
FENCE_KDUMP_SEND from kdump-lib.sh to kdump-lib-initramfs.sh.

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Tao Liu <ltao@redhat.com>
2022-01-18 14:22:24 +08:00
Kairui Song
ee337c6f49 Add header comment for POSIX compliant scripts
To make things cleaner and more human readable, add a short comment for
the POSIX scripts.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-15 23:11:37 +08:00
Kairui Song
5debf397fe kdump-lib-initramfs.sh: make it POSIX compatible
POSIX doesn't support keyword local, so add double underscore and prefix
to variable names, and reduce variable usage, to avoid any variable name
conflict.

Also reformat the code with `shfmt -s -w kdump-lib-initramfs.sh`.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-15 23:11:37 +08:00
Kairui Song
a1205effaa kdump-lib-initramfs.sh: move dump related functions to kdump.sh
These dump related functions are only used by dracut-kdump.sh.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:54 +08:00
Kairui Song
a5faa052d4 kdump-lib-initramfs.sh: prepare to be a POSIX compatible lib
Move all functions needed in the second kernel from kdump-lib.sh
to kdump-lib-initramfs.sh, and update shebang headers.

Now, kdump-lib-initramfs.sh is an independent lib script, no longer
depend on kdump-lib.sh, and kdump-lib.sh is no longer needed for
the second kernel.

In later commits, functions in kdump-lib-initramfs.sh will be reworked
to be POSIX compatible, kdump-lib.sh will contain bash only functions.

POSIX shell have very limited features, eg. `local` keyword doesn't
exist in POSIX but we rely on that heavily. So kdump-lib.sh will
use bash syntax and contain the most complex helper and codes.

kdump-lib-initramfs.sh will contain the minimum set of helpers,
and be shared by both the first and second kernel.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:46 +08:00
Kairui Song
a0282ab22c kdump-lib.sh: add a config format and read helper
Add a helper `kdump_read_conf` to replace read_strip_comments.
`kdump_read_conf` does a few more things:

  - remove trailing spaces.
  - format the content, remove duplicated spaces between name and value.
  - read from KDUMP_CONFIG_FILE (/etc/kdump.conf) directly, avoid pasting
    "/etc/kdump.conf" path everywhere in the code.
  - check if config file exists, just in case.

Also unify the environmental variable, now KDUMP_CONFIG_FILE stands for
the default config location.

This helps avoid some shell pitfalls about spaces when reading config.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
660cf4ac03 Make dump_to_rootfs wait for 90s for real
When `failure_action` is set to `dump_to_rootfs`, the message:
"Waiting for rootfs mount, will timeout after 90 seconds"
is actually wrong. Kdump will simply call `systemctl start sysroot.mount`,
but the timeout value of sysroot.mount depends on the unit service and
dracut parameters. And by default, dracut will set
JobRunningTimeoutSec=0 and JobTimeoutSec=0 for the device units,
which means it will wait forever. (see wait_for_dev function in dracut)

For some devices, this can be fixed by setting rd.timeout=90. But when
initqueue is set enabled during initramfs build, dracut will force set
timeout for host devices to `0`. (see 99base/module-setup.sh).

Depending on dracut / systemd can make things unpredictable and break as
parameters or code change. To make things easy to understand and
maintain, just call `systemctl` with `--no-block` params, and implement
a standalone wait loop.  Now `dump_to_rootfs` will actually wait for
90s then timeout.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2021-07-21 15:40:38 +08:00
Kairui Song
2603ba7187 Cleanup dead systemd services before start sysroot.mount
When kdump failed due to initqueue timeout, the sysroot.mount and other
serivces could be stuck in `start` but `dead` status:

Example output of systemctl:

dev-disk-by\x2duuid-530830d1\x2df2c7\x2d4c9a\x2d9a82\x2d148609097521.device loaded inactive   dead    start
<... snip ...>
squash-root.mount		loaded active     mounted       /squash/root
squash.mount			loaded active     mounted       /squash
sysroot.mount			loaded inactive   dead    start /sysroot
<... snip ...>
dracut-cmdline.service		loaded active     exited        dracut cmdline hook
dracut-initqueue.service	loaded activating start   start dracut initqueue hook
dracut-mount.service		loaded inactive   dead    start dracut mount hook

At this point calling `systemctl start sysroot.mount` will just hang as
systemd will just wait for the services that are stuck in `start`
status. So call `systemctl cancel` here to cancel all pending jobs and
have a clean start for mounting sysroot.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2021-07-12 16:53:34 +08:00
Kairui Song
41980f30d9 Use a customized emergency shell
Use a modified and minimized version of emergency shell.
The differences of this kdump shell and dracut emergency shell are:

  - Kdump shell won't generate a rdsosreport automatically
  - Customized prompts
  - Never ask root password
  - Won't tangle with dracut's emergency_action. If emergency_action is
    set, dracut emergency shell will perform dracut's emergency_action
    instead of kdump final_action on exit.
  - If rd.shell=no is set, kdump shell will still work, dracut emergency
    shell won't, even if kdump failure_action is set to shell.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2021-06-04 14:26:51 +08:00
Kairui Song
108258139a Don's try to restart dracut-initqueue if it's already there
kdump's dump_to_rootfs will try to start initqueue unconditionally.
dump_to_rootfs will run after systemd isolate to emergency
target, so this is currently accetable.

But there is a problem when initqueue starts the emergency action
because of initqueue timeout. dump_to_rootfs will start initqueue and
lead to timeout again.

So following patch will remove the previous isolation wrapper, and
detect the service status here. Previous isolation makes the detection
impossible. Now this detection will be valid and helpful to prevent
double timeout or hang.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2021-06-04 14:25:37 +08:00
Tao Liu
ca05b754af Fix incorrect file permissions of vmcore-dmesg-incomplete.txt
vmcore-dmesg-incomplete.txt is generated by shell redirection,
which taking the default umask value. When dmesg collector exits
with non-zero, the file will exist and anyone can have access to
it.

This patch fixed the issue by chmod the file, making it accessible
only to its owner.

Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-05-11 02:11:50 +08:00
Hari Bathini
d0e9c51e0d fadump: fix dump capture failure to root disk
If the dump target is the root disk, kdump scripts add an entry in
/etc/fstab for root disk with /sysroot as the mount point. The root
disk, passed through root=<> kernel commandline parameter, is mounted
at /sysroot in read-only mode before switching from initial ramdisk.
So, in fadump mode, a remount of /sysroot to read-write mode is needed
to capture dump successfully, because /sysroot is already mounted as
read-only based on root=<> boot parameter.

Commit e8ef4db8ff ("Fix dump_fs mount point detection and fallback
mount") removed initialization of $_op variable, the variable holding
the options the dump target was mounted with, leading to the below
error as remount was skipped:

  kdump[586]: saving to /sysroot/var/crash/127.0.0.1-2021-04-22-07:22:08/
  kdump.sh[587]: mkdir: cannot create directory '/sysroot/var/crash/127.0.0.1-2021-04-22-07:22:08/': Read-only file system
  kdump[589]: saving vmcore failed

Restore $_op variable initialization in dump_fs() function to fix this.

Fixes: e8ef4db8ff ("Fix dump_fs mount point detection and fallback mount")
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-04-28 15:41:33 +08:00
Tao Liu
91c802ff52 Fix incorrect permissions on kdump dmesg file
Also known as CVE-2021-20269. The kdump dmesg log files(kexec-dmesg.log,
vmcore-dmesg.txt) are generated by shell redirection, which take the
default umask value, making the files readable for group and others.

This patch chmod these files, making them only accessible to owner.

Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-03-23 16:39:18 +08:00
Lianbo Jiang
a571b0da9f fix kdump failure of saving vmcore with the scp + ipv6 method
Currently, kdump will fail to save vmcore when using the scp and ipv6.
The reason is that the scp requires IPv6 addresses to be enclosed in
square brackets, but ssh doesn’t require this.

Let's enclose the ipv6 address in square brackets for scp dump.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2021-01-21 15:03:20 +08:00
Kairui Song
02202aa70f logger: source the logger file individually
Sourcing logger file in kdump-lib.sh will leak kdump helper to dracut,
because module-setup.sh will source kdump-lib.sh. This will make kdump's
function override dracut's ones, and lead to unexpected behaviours.

So include kdump-logger.sh individually and only source it where it really
needed. for module-setup.sh, simply use dracut's logger helper is good
enough so just source kdump-logger.sh in kdump only scripts.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Lianbo Jiang <lijiang@redhat.com>
2021-01-20 14:13:44 +08:00
Kairui Song
e8ef4db8ff Fix dump_fs mount point detection and fallback mount
Simplify the code and fix mount point detection. The code logic is now
much simpler: if $1 is not a mount point, call "mount --target $1" again
to try mount it. "mount --target" cmd itself can handle all the /etc/fstab
parsing job, so drop the buggy and complex bash code.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2021-01-14 01:39:02 +08:00
Kairui Song
7f1f8f229f Revert "Don's try to restart dracut-initqueue if it's already failed"
systemctl is-failed will not work after dracut isolated to the emergency
target, so this judgement is invalid. And the restart is basically
harmless, so just revert this commit.

This reverts commit ad6a93b00d.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2021-01-14 01:38:58 +08:00
Lianbo Jiang
cd86148804 Save the final failure information to log file if saving vmcore failed
Currently, if saving vmcore failed, the final failure information won't
be saved to the kexec-dmesg.log, because the action of saving the log
occurs before the final log is printed, it has no chance to save the
log(marked it with the '^^^' below) to the log file(kexec-dmesg.log).
For example:

[1] console log:
[    3.589967] kdump[453]: saving vmcore-dmesg.txt to /sysroot//var/crash/127.0.0.1-2020-11-26-14:19:17/
[    3.627261] kdump[458]: saving vmcore-dmesg.txt complete
[    3.633923] kdump[460]: saving vmcore
[    3.661020] kdump[465]: saving vmcore failed
                           ^^^^^^^^^^^^^^^^^^^^
[2] kexec-dmesg.log:
Nov 26 14:19:17 kvm-06-guest25.hv2.lab.eng.bos.redhat.com kdump[453]: saving vmcore-dmesg.txt to /sysroot//var/crash/127.0.0.1-2020-11-26-14:19:17/
Nov 26 14:19:17 kvm-06-guest25.hv2.lab.eng.bos.redhat.com kdump[458]: saving vmcore-dmesg.txt complete
Nov 26 14:19:17 kvm-06-guest25.hv2.lab.eng.bos.redhat.com kdump[460]: saving vmcore

Let's improve it in order to avoid the loss of important information.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2020-12-29 17:40:26 +08:00
Kairui Song
ad6a93b00d Don's try to restart dracut-initqueue if it's already failed
If dracut-initqueue failed in kdump kernel and failure action
is set to dump_to_rootfs, there is no point try again to start the
initqueue. It will also slow down the dump process, and the initqueue
will most like still not work if first attemp failed.

So just try to start sysroot.mount, if it failed, there is no luck.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2020-12-09 16:06:48 +08:00
Kairui Song
08d9846eba Make get_mount_info work with bind mount
Remove the --real when calling findmnt.

The option is only useful in capture kernel, to avoid
`findmnt` returning the pseudo 'rootfs' for non mounted path.

example, when /kdumproot/mnt/ is not mounted:
kdump:/# findmnt --target /kdumproot/mnt
TARGET SOURCE FSTYPE OPTIONS
/      rootfs rootfs rw,size=61368k,nr_inodes=15342

kdump:/# findmnt --target /kdumproot/mnt
<return 1 and empty output>

But this function will make findmnt also return empty value for bind
mount. So remove it and add an extra if statement for second kernel.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2020-11-30 15:26:47 +08:00
Lianbo Jiang
e345ed18e2 Add the rd.kdumploglvl option to control log level in the second kernel
Let's add the rd.kdumploglvl option to control log level in the second
kernel, which can make us avoid rebuilding the kdump initramfs after we
change the log level in /etc/sysconfig/kdump.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2020-11-13 02:43:49 +08:00
Lianbo Jiang
3221f4e91f increase makdumpfile default message level to 7
Currently, the makedumpfile option '--message-level' is set to 1 when
dumping the vmcore, it only displays the progress indicator message,
but there are no common message and error message, it is important to
report some additional messages, especially for the error message,
which is very useful for the debugging.

In view of this, let's change the message level to 7 by default.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2020-10-27 17:45:32 +08:00
Lianbo Jiang
d7054f4cd8 Improve debugging in the kdump kernel
Let's use the logger in the second kernel and collect the kernel ring
buffer(dmesg) of the second kernel.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2020-10-27 17:34:07 +08:00
Kairui Song
cfd93e2b7e Revert "Add a hook to wait for kdump target in initqueue"
This reverts commit cee618593c.

Upstream dracut have provided a parameter for adding mandantory network
requirement by appending "rd.neednet" parameter, so we should use that
instead.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2020-05-28 16:26:00 +08:00
Kairui Song
61e016939c User get_mount_info to replace findmnt calls
Use get_mount_info so that fstab is used as a failback when look for
mount info.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2020-05-22 16:14:02 +08:00
Kairui Song
0624148414 Add a is_mounted helper
Use is_mounted helper instaed of calling findmnt directly or checking if
"mount" value is empty.

If findmnt looks for fstab as well, some non mounted entry will also
return value. Required to support non-mounted target.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2020-05-22 16:13:24 +08:00
Kairui Song
33e79681d9 Fix the problem that kdump prints redundant /
In second kernel, kdump always prints redundant '/':

kdump: saving to /sysroot//var/crash/127.0.0.1-2020-03-12-21:32:54/

Just trim it.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2020-05-11 14:22:22 +08:00
Kazuhito Hagio
74081a2b64 Don't unmount the dump target just after saving vmcore
Since commit 6dee286467 ("Don't mount the dump target unless needed"),
dump_fs function unmounts the dump target just after saving vmcore.

This broke the condition that it's mounted when executing "kdump_post",
which had been stable since RHEL5, and a certain tool which uses the
kdump_post hook to save information of 2nd kernel to the dump target
started to fail.

As unmounting it is done by systemd-shutdown before reboot without
the umount command as below, so let's don't unmount it in dump_fs.

  systemd-shutdown[1]: Unmounting file systems.
  [547]: Remounting '/sysroot' read-only in with options '(null)'.
  EXT4-fs (dm-0): re-mounted. Opts: (null)
  [548]: Unmounting '/sysroot'.

Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Acked-by: Kairui Song <kasong@redhat.com>
2020-04-27 16:30:59 +08:00
Hari Bathini
e3f2f926dd powerpc: enable the scripts to capture dump on POWERNV platform
With FADump support added on POWERNV paltform, enable the scripts to
capture /proc/vmcore. Also, if CONFIG_OPAL_CORE is enabled, OPAL core
is preserved and exported on POWERNV platform. So, offload OPAL core,
if it is available.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Kairui Song <kasong@redhat.com>
2020-02-06 22:13:06 +08:00
Kairui Song
cee618593c Add a hook to wait for kdump target in initqueue
The dracut initqueue may quit immediately and won't trigger any hook if
there is no "finished" hook still pending (finished hook will be deleted
once it return 0).

This issue start to appear with latest dracut, latest dracut use
network-manager to configure the network,
network-manager module only install "settled" hook, and we didn't
install any other hook. So NFS/SSH dump will fail. iSCSI dump works
because dracut iscsi module will install a "finished" hook to detect if
the iscsi target is up.

So for NFS/SSH we keep initqueue running until the host successfully get
a valid IP address, which means the network is ready.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2020-01-29 08:12:45 +08:00
Kairui Song
39352d0cfc Don't execute final_action if failure_action terminates the system
If failure_action is shutdown/reboot/halt, final_action is pointless as
the system will be already stopping. And if final_action is different
from failure_action, it will trigger a systemd race problem and cause
unexpected behavior to occur.

So let the error handler stop and exit after performing failure_action
successfully if failure_action is one of shutdown/reboot/halt.
This way, final_action will not be executed.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2019-11-01 11:21:58 +08:00
Kairui Song
6dee286467 Don't mount the dump target unless needed
For fadump, this helps to reduce the risk of boot failure, and
may also help speed up the boot by a bit.

For normal kdump, this will delay the dump target mounting, and no
longer depend on systemd to do the mounting job.

And currently there is a failure that caused by some mount handling
bug with kernel and systemd that is failing the system booting:

[FAILED] Failed to mount /kdumproot/home.
See 'systemctl status kdumproot-home.mount' for details.
[DEPEND] Dependency failed for Local File Systems.
[  OK  ] Reached target Remote File Systems (Pre).
[  OK  ] Reached target Remote File Systems.
         Starting udev Coldplug all Devices...
         Starting Create Volatile Files and Directories...
         Starting Kdump Emergency...

This patch can bypass it. The fix of root cause is still WIP, but this
patch itself is a nice to have optimization so it's reasonable to do so.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2019-09-29 17:12:54 +08:00
Kairui Song
75d9132417 Get rid of duplicated strip_comments when reading config
When reading kdump configs, a single parsing should be enough and this
saves a lot of duplicated striping call which speed up the total load
speed.

Speed up about 2 second when building and 0.1 second for reload in my
tests.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2019-05-20 16:56:28 +08:00
Kazuhito Hagio
242da37c58 Add final_action option to kdump.conf
If a crash occurs repeatedly after enabling kdump, the system goes
into a crash loop and the dump target may get filled up by vmcores.
This is likely especially with early kdump.

This patch introduces 'final_action' option to kdump.conf, in order
for users to be able to power off the system even after capturing
a vmcore successfully.

Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
2019-01-22 17:58:24 +08:00
Kazuhito Hagio
cc95f0a744 Add failure_action as alias of default and make default obsolete
In preparation for adding 'final_action' option, since it's confusing
to have the 'final_action' and 'default' options at the same time,
this patch introduces 'failure_action' as an alias of the 'default'
option to /etc/kdump.conf, and makes 'default' obsolete to be removed
in the future.

Also, the "default action" term is renamed to "failure action".

Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
2019-01-22 17:57:53 +08:00
Pingfan Liu
89565289c6 kdump-lib-initramfs.sh: using -force option when poweroff
If default action is poweroff, we can observe that the machine is
rebooted, instead of poweroff. That is due to the following two race
processes:
    systemctl poweroff
    systemctl reboot -f
which is launched by kdump-error-handle.sh.

Unfortunately, although both of them are executed in systemd block
mode, but due to poweroff will tear down some internal things in
systemd, there is no guarantee for the block mode. As we can see
the msg "Failed to execute operation: Connection reset by peer",
which is thrown by "systemctl reboot -f".

poweroff and reboot share most of code, if one fails, then the other
should also fails, so it is meaningless to use reboot as the backup of
poweroff. Using "systemctl poweroff -f", the sdbus will teared down
immediately, which prevent the following "systemctl reboot -f" from
executing. Meanwhile, as man systemctl says:
    -f, --force
        When used with enable, overwrite any existing conflicting symlinks.

        When used with halt, poweroff, reboot or kexec, execute the selected
        operation without shutting down all units. However, all processes will
        be killed forcibly and all file systems are unmounted or remounted read-only.

Hence, replacing the 'poweroff' with 'systemctl poweroff -f'

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2018-12-10 14:37:57 +08:00
Kenneth Dsouza
ac55095191 kdump-lib-initramfs.sh: Add check to remount to rw mode only if dump target is ro.
Currently the script does not check if the dump target is read-only and would
always mount to read-write mode. This caused an issue with nfs mount as the
fstab options would be reconsidered while remounting to read-write mode.
The remount would fail with the below error as all options cannot be changed
runtime.

mount.nfs: mount(2): Invalid argument
mount.nfs: an incorrect mount option was specified

Which in result would not save the vmcore on the dump target.

This patch addresses this issue by checking the dump target status for read-only.
If yes, remount to read-write mode without reconsidering the fstab options.

Signed-off-by: Kenneth D'souza <kdsouza@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2018-10-15 10:46:44 +08:00
Pingfan Liu
0165cfa332 kdump-lib-initramfs.sh: ignore the failure of echo
The kdump-capture.service will fail, if the following conds are meet up.
-1. boot up a VM with the following cmd:
qemu-kvm   -name 'avocado-vt-vm1'  -sandbox off    -machine pc   -nodefaults  -vga cirrus \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=$guest_img \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=04 \
    -device virtio-net-pci,mac=9a:4d:4e:4f:50:51,id=id3DveCw,vectors=4,netdev=idgW5YRp,bus=pci.0,addr=05 \
    -netdev tap,id=idgW5YRp \
    -m 2048  \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2  \
    -cpu 'SandyBridge',+kvm_pv_unhalt \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio \
    -qmp tcp:localhost:4444,server,nowait
-2. in kernel cmdline with the following options: console=tty0 console=ttyS0,

Because the  "-nodefaults" option in qemu cmd excludes the emulation of serial port, the ttyS0 will
have no real backend device. We can observe such issue in 1st kernel by:
	echo teststring > /dev/console or
	echo teststring > /dev/ttyS0,
It gets the error "-bash: echo: write error: Input/output error".
Such conds cause small issue in 1st kernel, but it is a big problem for kdump-capture and emergency
service.

This patch aims to work aroundthe issue in kdump-capture service:
dump_fs() return value will affect the following code in dracut-kdump.sh
	DUMP_RETVAL=$?    <---
	do_kdump_post $DUMP_RETVAL
	if [ $? -ne 0 ]; then
	    echo "kdump: kdump_post script exited with non-zero status!"
	fi

Although kdump-capture saves the vmcore successfully, but it exit 1 and
fall on emergency service.

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Xunlei Pang <xlpang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2017-04-27 13:59:49 +08:00
Dave Young
b253434819 use "systemctl reboot -f" for reboot action
In latest rawhide kdump kernel reboot hangs because systemd reports a
conflict when kdump calls reboot during booting. Need further investigation
about the new systemd behavior.

Here is the error message copied from kdump session:
[snip]
kdump: saving vmcore complete
Failed to start reboot.target: Transaction contains conflicting jobs 'stop' and 'start' for shutdown.target. Probably contradicting requirement dependencies configured.
Failed to talk to init daemon.
[FAILED] Failed to start Kdump Vmcore Save Service.
[snip]

We previouly use "reboot -f" but later we changed to reboot because we want
systemd to take care of the shutdown path, mainly for umount filesystems.

Change back to "reboot -f" works but we still need umount by ourselves.
During my tests with "reboot -f" I get below dirty ext2 filesystem:

[root@localhost ~]# fsck /dev/vdb
fsck from util-linux 2.27
e2fsck 1.42.13 (17-May-2015)
/dev/vdb was not cleanly unmounted, check forced.

Actually "reboot -f" equals to "systemctl reboot -f -f"

systemctl manpage says "-f" and "-f -f" means different behavior:
When use -f with reboot, will execute reboot without shutting down all units.
However all processes will be killed forcibly and all file systems are
unmounted or remounted read-only. If -f is specified twice, will reboot
immediately without terminating any processes or unmounting any file systems.

Thus change to use "systemctl reboot -f" for our reboot actions. It can fix
the problem and at the same time it can ensure filesystems are umounted before
rebooting.

OTOH, a systemd changes cause the breakage, it may be a system service new
design, Later I can dig into systemd changes see which commit cause the
breakage.

Signed-off-by: Dave Young <dyoung@redhat.com
Signed-off-by: Dangyi Liu <dliu@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
2015-12-11 15:20:54 +08:00
Baoquan He
6f4940f198 Revert "execute kdump_post after do_default_action"
This reverts commit f4c45236bf.
Since that commit will change the behaviour of kdump_post. That is not
good.

Signed-off-by: Baoquan He <bhe@redhat.com>
2015-04-08 15:50:16 +08:00
WANG Chao
84f94be90b make kdump saving directory name consistent with RHEL6
Now we use the pattern:
  <machine/ipaddr>-YYYY.MM.DD-HH:MM:SS

while rhel6 uses the following:
  <machine/ipaddr>-YYYY-MM-DD-HH:MM:SS

This change may break someone's script and we should change it back to
keep consistent between releases.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Minfei Huang <mhuang@redhat.com>
2015-02-25 16:52:05 +08:00