Resolves: bz1968374
Upstream: Fedora
Conflict: None
commit 39a642b66b
Author: Hari Bathini <hbathini@linux.ibm.com>
Date: Tue Jun 1 16:46:58 2021 +0530
kdump-lib.sh: fix a warning in prepare_kdump_bootinfo()
Fix the warning observed when KDUMP_KERNELVER is specified:
kdumpctl[10926]: /lib/kdump/kdump-lib.sh: line 697: [: missing `]'
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: bz1965952
Upstream: Fedora
Conflict: None
commit 75bdcb7399
Author: Kelvin Fan <kfan@redhat.com>
Date: Fri Apr 16 22:31:13 2021 +0000
Write to `/var/lib/kdump` if $KDUMP_BOOTDIR not writable
The `/boot` directory on some operating systems might be read-only.
If we cannot write to `$KDUMP_BOOTDIR` when generating the kdump
initrd, attempt to place the generated initrd at `/var/lib/kdump`
instead.
Signed-off by: Kelvin Fan <kelvinfan001@gmail.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: bz1941905
Upstream: Fedora
Conflict: None
commit 7d47251568
Author: Coiby Xu <coxu@redhat.com>
Date: Mon Jun 7 07:26:03 2021 +0800
Iterate /sys/bus/ccwgroup/devices to tell if we should set up rd.znet
This patch fixes bz1941106 and bz1941905 which passed empty rd.znet to the
kernel command line in the following cases,
- The IBM (Z15) KVM guest uses virtio for all devices including network
device, so there is no znet device for IBM KVM guest. So we can't
assume a s390x machine always has a znet device.
- When a bridged network is used, kexec-tools tries to obtain the znet
configuration from the ifcfg script of the bridged network rather than
from the ifcfg script of znet device.
We can iterate /sys/bus/ccwgroup/devices to tell if there if there is
a znet network device. By getting an ifname from znet, we can also avoid
mistaking the slave netdev as a znet network device in a bridged network
or bonded network.
Note: This patch also assumes there is only one znet device as commit
7148c0a30d ("add s390x netdev setup")
which greatly simplifies the code. According to IBM [1], there could be
more than znet devices for a z/VM system and a z/VM system may have a
non-znet network device like ConnectX. Since kdump_setup_znet was
introduced in 2012 and so far there is no known customer complaint that
invalidates this assumption I think it's safe to assume an IBM z/VM
system only has one znet device. Besides, there is no z/VM system found
on beaker to test the alternative scenarios.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1941905#c13
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: bz1901024
Upstream: Fedora
Conflict: None
commit 41980f30d9
Author: Kairui Song <kasong@redhat.com>
Date: Mon Apr 26 17:09:57 2021 +0800
Use a customized emergency shell
Use a modified and minimized version of emergency shell.
The differences of this kdump shell and dracut emergency shell are:
- Kdump shell won't generate a rdsosreport automatically
- Customized prompts
- Never ask root password
- Won't tangle with dracut's emergency_action. If emergency_action is
set, dracut emergency shell will perform dracut's emergency_action
instead of kdump final_action on exit.
- If rd.shell=no is set, kdump shell will still work, dracut emergency
shell won't, even if kdump failure_action is set to shell.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Resolves: bz1901024
Upstream: Fedora
Conflict: None
commit a2306346bc
Author: Kairui Song <kasong@redhat.com>
Date: Mon Apr 26 17:09:56 2021 +0800
Remove the kdump error handler isolation wrapper
The wrapper is introduced in commit 002337c, according to the commit
message, the only usage of the wrapper is when dracut-initqueue calls
"systemctl start emergency" directly. In that case, emergency
is started, but not in a isolation mode, which means dracut-initqueue
is still running. On the other hand, emergency will call
"systemctl start dracut-initqueue" again when default action is dump_to_rootfs.
systemd would block on the last dracut-initqueue, waiting for the first
instance to exit, which leaves us hang.
In previous commit we added initqueue status detect in dump_to_rootfs,
so now even without the wrapper, it will not hang.
And actually, previously, with the wrapper, emergency might still hang
for like 30s. When dracut called emergency service because initqueue
timed out, dump_to_rootfs will try start initqueue again and timeout
again. Now with the wrapper removed, we can avoid these two kinds of
hangs, bacause without the isolation we can detect initqueue service
status correctly in such case.
Also remove the invalid header comments in service file, the service
is not part of systemd code. And sync the service spec with dracut.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Resolves: bz1901024
Upstream: Fedora
Conflict: None
commit 108258139a
Author: Kairui Song <kasong@redhat.com>
Date: Mon Apr 26 17:09:55 2021 +0800
Don's try to restart dracut-initqueue if it's already there
kdump's dump_to_rootfs will try to start initqueue unconditionally.
dump_to_rootfs will run after systemd isolate to emergency
target, so this is currently accetable.
But there is a problem when initqueue starts the emergency action
because of initqueue timeout. dump_to_rootfs will start initqueue and
lead to timeout again.
So following patch will remove the previous isolation wrapper, and
detect the service status here. Previous isolation makes the detection
impossible. Now this detection will be valid and helpful to prevent
double timeout or hang.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Resolves: bz1952342
Upstream: Fedora
Conflict: None
commit 45377836b0
Author: Pingfan Liu <piliu@redhat.com>
Date: Tue May 25 09:26:09 2021 +0800
kdump-lib.sh: fix the case if no enough total RAM for kdump in get_recommend_size()
For crashkernel=auto policy, if total RAM size is under a throttle,
there is no memory reserved for kdump.
Also correct a trivial bug by correcting the arch name.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: bz1951415
Upstream: fedora
Conflict: none
commit e9e6a2c745
Author: Kairui Song <kasong@redhat.com>
Date: Thu Apr 22 03:27:10 2021 +0800
kdumpctl: Add kdumpctl estimate
Add a rough esitimation support, currently, following memory usage are
checked by this sub command:
- System RAM
- Kdump Initramfs size
- Kdump Kernel image size
- Kdump Kernel module size
- Kdump userspace user and other runtime allocated memory (currently
simply using a fixed value: 64M)
- LUKS encryption memory usage
The output of kdumpctl estimate looks like this:
# kdumpctl estimate
Reserved crashkernel: 256M
Recommanded crashkernel: 160M
Kernel image size: 47M
Kernel modules size: 12M
Initramfs size: 19M
Runtime reservation: 64M
Large modules:
xfs: 1892352
nouveau: 2318336
And if the kdump target is encrypted:
# kdumpctl estimate
Encrypted kdump target requires extra memory, assuming using the keyslot with minimun memory requirement
Reserved crashkernel: 256M
Recommanded crashkernel: 655M
Kernel image size: 47M
Kernel modules size: 12M
Initramfs size: 19M
Runtime reservation: 64M
LUKS required size: 512M
Large modules:
xfs: 1892352
nouveau: 2318336
WARNING: Current crashkernel size is lower than recommanded size 655M.
The "Recommanded" value is calculated based on memory usages mentioned
above, and will be adjusted accodingly to be no less than the value provided
by kdump_get_arch_recommend_size.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Resolves: bz1951415
Upstream: fedora
Conflict: None
commit 85c725813b
Author: Kairui Song <kasong@redhat.com>
Date: Thu Apr 8 01:41:21 2021 +0800
mkdumprd: make use of the new get_luks_crypt_dev helper
Simplfy the code and also improve the performance. udevadm call is
heavy.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Resolves: bz1951415
Upstream: fedora
Conflict: None
commit 1c70cf51c7
Author: Kairui Song <kasong@redhat.com>
Date: Tue May 18 16:13:16 2021 +0800
kdump-lib.sh: introduce a helper to get all crypt dev used by kdump
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Resolves: bz1896247
Upstream: fedora
Conflict: none
commit ee160bf04d
Author: Kairui Song <kasong@redhat.com>
Date: Mon Apr 19 23:00:10 2021 +0800
Revert "Always set vm.zone_reclaim_mode = 3 in kdump kernel"
This reverts commit 5633e83318.
vm.zone_reclaim_mode may cause trashing on some machines. And after
second thought, vm.zone_reclaim_mode is barely helpful for machines
with high mem stress, so just revert it.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Resolves: bz1952652
Upstream: fedora
Conflict: none
commit d0e9c51e0d
Author: Hari Bathini <hbathini@linux.ibm.com>
Date: Thu Apr 22 18:21:59 2021 +0530
fadump: fix dump capture failure to root disk
If the dump target is the root disk, kdump scripts add an entry in
/etc/fstab for root disk with /sysroot as the mount point. The root
disk, passed through root=<> kernel commandline parameter, is mounted
at /sysroot in read-only mode before switching from initial ramdisk.
So, in fadump mode, a remount of /sysroot to read-write mode is needed
to capture dump successfully, because /sysroot is already mounted as
read-only based on root=<> boot parameter.
Commit e8ef4db8ff ("Fix dump_fs mount point detection and fallback
mount") removed initialization of $_op variable, the variable holding
the options the dump target was mounted with, leading to the below
error as remount was skipped:
kdump[586]: saving to /sysroot/var/crash/127.0.0.1-2021-04-22-07:22:08/
kdump.sh[587]: mkdir: cannot create directory '/sysroot/var/crash/127.0.0.1-2021-04-22-07:22:08/': Read-only file system
kdump[589]: saving vmcore failed
Restore $_op variable initialization in dump_fs() function to fix this.
Fixes: e8ef4db8ff ("Fix dump_fs mount point detection and fallback mount")
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: bz1924116
Upstream: fedora
Conflict: none
commit 6a2e820d87
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date: Sun Feb 21 17:23:37 2021 +0530
Stop reloading kdump service on CPU hotplug event for FADump
As FADump does not require an explicit elfcorehdr update whenever there is CPU
hotplug event so let's stop kdump service reload for FADump when CPU hotplug
event is triggered.
A new label is added to handle CPU and memory hotplug events separately. The
updated CPU hotplug event handler make sure that kdump service should not be
reloaded when FADump is configured.
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: bz1947928
Upstream: fedora
Conflict: none
commit 475e33030b
Author: Tao Liu <ltao@redhat.com>
Date: Sun Apr 25 17:05:42 2021 +0800
Make dracut-squash required for kexec-tools
This patch reverts commit "Make dracut-squash a weak dep".
Although kexec-tools can work without dracut-squash, it is essential
for kdump to run properly in cases [1][2] where minimal amount of memory
consumption is expected. Thus dracut-squash is needed for it.
[1] https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/message/SJX7CW3WLOYSFI2YJKGTUGDBWSCMZXVZ/
[2] https://www.spinics.net/lists/systemd-devel/msg05864.html
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: bz1955453
Upstream: fedora
Conflict: none
commit ca05b754af
Author: Tao Liu <ltao@redhat.com>
Date: Mon May 10 22:10:26 2021 +0800
Fix incorrect file permissions of vmcore-dmesg-incomplete.txt
vmcore-dmesg-incomplete.txt is generated by shell redirection,
which taking the default umask value. When dmesg collector exits
with non-zero, the file will exist and anyone can have access to
it.
This patch fixed the issue by chmod the file, making it accessible
only to its owner.
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: bz1950885
Upstream: fedora
Conflict: none
commit d5fe96cd7a
Author: Tao Liu <ltao@redhat.com>
Date: Tue Apr 27 17:58:40 2021 +0800
Disable CMA in kdump 2nd kernel
kexec-tools needs to disable CMA for kdump kernel cmdline,
otherwise kdump kernel may run out of memory.
This patch strips the inherited cma=, hugetlb_cma= cmd
line from 1st kernel, and sets to be 0 for 2nd kernel.
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: bz1919052
Upstream: Fedora
Conflict: None
commit d5f6d38173
Author: Coiby Xu <coxu@redhat.com>
Date: Thu Apr 1 15:32:13 2021 +0800
Set up bond cmdline by "nmcli --get-values"
Now kdumpctl will exit if failing to set up bond cmdline.
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: bz1919052
Upstream: Fedora
Conflict: None
commit 8b08b4f17b
Author: Coiby Xu <coxu@redhat.com>
Date: Thu Apr 1 15:32:11 2021 +0800
Set up s390 znet cmdline by "nmcli --get-values"
Now kdumpctl will abort when failing to set up znet.
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: bz1919052
Upstream: Fedora
Conflict: None
commit 10c309b5f7
Author: Coiby Xu <coxu@redhat.com>
Date: Thu Apr 1 15:32:08 2021 +0800
Add helper to get value by field using "nmcli --get-values"
nmcli --get-values <field> connection show /org/freedesktop/NetworkManager/ActiveConnection/1
returns the following value for the corresponding field respectively,
Field Value
IP4.DNS "10.19.42.41 | 10.11.5.19 | 10.5.30.160"
802-3-ethernet.s390-subchannels ""
bond.options "mode=balance-rr"
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: bz1950932
Upstream: Fedora
Conflict: None
commit 8a33ffffbc
Author: Coiby Xu <coxu@redhat.com>
Date: Thu May 6 09:20:27 2021 +0800
rd.route should use the name from kdump_setup_ifname
This fixes bz1854037 which happens because kexec-tools generates rd.route for
eth0 instead of for kdump-eth0,
1. "rd.route=168.63.129.16:10.0.0.1:eth0 rd.route=169.254.169.254:10.0.0.1:eth0" is passed to the dracut cmdline by kexec-tools
2. In the 2rd kernel, dracut/modules.d/35network-manager/nm-config.sh calls
/usr/libexec/nm-initrd-generator to generate two .nmconnection files
based on the dracut cmdline, i.e. kdump-eth0.nmconnection and eth0.nmconnection,
- /run/NetworkManager/system-connections/kdump-eth0.nmconnection
[connection]
id=kdump-eth0
uuid=3ef53b1b-3908-437e-a15f-cf1f3ea2678b
type=ethernet
autoconnect-retries=1
interface-name=kdump-eth0
multi-connect=1
permissions=
wait-device-timeout=60000
[ethernet]
mac-address-blacklist=
[ipv4]
address1=10.0.0.4/24,10.0.0.1
dhcp-timeout=90
dns=168.63.129.16;
dns-search=
may-fail=false
method=manual
[ipv6]
addr-gen-mode=eui64
dhcp-timeout=90
dns-search=
method=disabled
[proxy]
- /run/NetworkManager/system-connections/eth0.nmconnection
[connection]
id=eth0
uuid=f224dc22-2891-4d7b-8f66-745029df4b53
type=ethernet
autoconnect-retries=1
interface-name=eth0
multi-connect=1
permissions=
[ethernet]
mac-address-blacklist=
[ipv4]
dhcp-timeout=90
dns=168.63.129.16;
dns-search=
method=auto
route1=168.63.129.16/32,10.0.0.1
route2=169.254.169.254/32,10.0.0.1
[ipv6]
addr-gen-mode=eui64
dhcp-timeout=90
dns-search=
method=auto
[proxy]
3. Since there's eth0.nmconnection, NetworkManager will try to get an IP for eth0 regardless of the fact it's a slave NIC and time out
```
$ ip link show
2: kdump-eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:0d:3a:11:86:8b brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master kdump-eth0 state UP mode DEFAULT group default qlen 1000
```
Reported-by: Huijing Hei <hhei@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: bz1947347
Upstream: Fedora
Conflict: None
commit 1ca1b71780
Author: Coiby Xu <coxu@redhat.com>
Date: Thu Apr 8 11:44:26 2021 +0800
Implement IP netmask calculation to replace "ipcalc -m"
Recently, dracut-network drops depedency on dhcp-client which requires
ipcalc. Thus the dependency chain
"kexec-tools -> dracut-network -> dhcp-client -> ipcalc"
is broken. When NIC is configured to a static IP, kexec-tools depended
on "ipcalc -m" to get netmask. This commit implements the shell
equivalent of "ipcalc -m".
The following test code shows cal_netmask_by_prefix is consistent with
"ipcalc -m",
#!/bin/bash
. dracut-module-setup.sh
for i in {0..128}; do
mask_expected=$(ipcalc -m fe::/$i| cut -d"=" -f2)
mask_actual=$(cal_netmask_by_prefix $i "-6")
if [[ "$mask_expected" != "$mask_actual" ]]; then
echo "prefix="$i, "expected="$mask_expected, "acutal="$mask_actual
exit
fi
done
echo "IPv6 tests passed"
for i in {0..32}; do
mask_expected=$(ipcalc -m 8.8.8.8/$i| cut -d"=" -f2)
mask_actual=$(cal_netmask_by_prefix $i "")
if [[ "$mask_expected" != "$mask_actual" ]]; then
echo "prefix="$i, "expected="$mask_expected, "acutal="$mask_actual
exit
fi
done
echo "IPv4 tests passed"
i=-2
res=$(cal_netmask_by_prefix "$i" "")
if [[ $? -ne 1 ]]; then
echo "cal_netmask_by_prefix should exit when prefix<0"
exit
fi
res=$(cal_netmask_by_prefix "$i" "")
if [[ $? -ne 1 ]]; then
echo "cal_netmask_by_prefix should exit when prefix<0"
exit
fi
i=33
$(cal_netmask_by_prefix $i "")
if [[ $? -ne 1 ]]; then
echo "cal_netmask_by_prefix should exit when prefix>32 for IPv4"
exit
fi
i=129
$(cal_netmask_by_prefix $i "-6")
if [[ $? -ne 1 ]]; then
echo "cal_netmask_by_prefix should exit when prefix>128 for IPv4"
exit
fi
echo "Bad prefixes tests passed"
echo "All tests passed"
Reported-by: Jie Li <jieli@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: bz1892558
Upstream: Fedora
Conflict: None
commit 18131894b6
Author: Pingfan Liu <piliu@redhat.com>
Date: Thu Feb 4 09:45:36 2021 +0800
kdump-lib.sh: introduce functions to return recommened mem size
There is requirement to decide the recommended memory size for the current
system. And the algorithm is based on /proc/iomem, so it can align with the
algorithm used by reserve_crashkernel() in kernel.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Related: rhbz#1938165
Upstream: fedora
Conflict: none
commit 00785873ef
Author: Tao Liu <ltao@redhat.com>
Date: Fri Mar 19 18:07:51 2021 +0800
Fix incorrect vmcore permissions when dumped through ssh
Previously when dumping vmcore to a remote machine through ssh,
the files are created remotely and file permissions are taken
from the default umask value, which making the files accessible to
anyone on the remote machine.
This patch fixed the security issue by setting a customized umask value
before the file creation on the remote machine.
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: rhbz#1938165
Upstream: fedora
Conflict: none
commit 91c802ff52
Author: Tao Liu <ltao@redhat.com>
Date: Thu Mar 18 16:52:46 2021 +0800
Fix incorrect permissions on kdump dmesg file
Also known as CVE-2021-20269. The kdump dmesg log files(kexec-dmesg.log,
vmcore-dmesg.txt) are generated by shell redirection, which take the
default umask value, making the files readable for group and others.
This patch chmod these files, making them only accessible to owner.
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>