When running with squash module enabled for both initramfs, /dev and
/run are also mounted by squash-init, so move them to newroot as well,
else they might leak.
Also pass `-d` to umount so loop devices (if used) will be force freed.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
After fadump embedded the fadump initramfs in the normal initramfs,
kdumpctl will mistakenly rebuild the initramfs everytime.
kdumpctl checks the hostonly-kernel-modules.txt file in initramfs
to check if required drivers are included, but the normal initramfs
is built in non-hostonly mode, so it doesn't have a
hostonly-kernel-modules.txt file. The check will always fail.
So let mkfadumprd make a copy of the hostonly-kernel-modules.txt in the
fadump initramfs and let kdumpctl check that file instead.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
In case of fadump, the initramfs image has to be built to boot into
the production environment as well as to offload the active crash dump
to the specified dump target (for boot after crash). As the same image
would be used for both boot scenarios, it could not be built optimally
while accommodating both cases.
Use --include to include the initramfs image built for offloading
active crash dump to the specified dump target. Also, introduce a new
out-of-tree dracut module (99zz-fadumpinit) that installs a customized
init program while moving the default /init to /init.dracut. This
customized init program is leveraged to isolate fadump image within
the default initramfs image by kicking off default boot process
(exec /init.dracut) for regular boot scenario and activating fadump
initramfs image, if the system is booting after a crash.
If squash is available, ensure default initramfs image is also built
with squash module to reduce memory consumption in capture kernel.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Change spaces to tab to fix alignment issue.
Fixes: commit 7d47251568
("Iterate /sys/bus/ccwgroup/devices to tell if we should set up rd.znet")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
/sys/bus/ccwgroup/devices doesn't exist for non-s390x machines which leads to
the warning "find: '/sys/bus/ccwgroup/devices': No such file or directory".
This warning can be eliminated by checking the existence of
"/sys/bus/ccwgroup/devices" beforehand.
Fixes: commit 7d47251568
("Iterate /sys/bus/ccwgroup/devices to tell if we should set up rd.znet")
Reported-by: Ruowen Qin <ruqin@redhat.com>
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1974618
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from upstream.
commit 9a6f589d99dcef114c89fde992157f5467028c8f
Author: Tao Liu <ltao@redhat.com>
Date: Fri Jun 18 18:28:04 2021 +0800
[PATCH] check for invalid physical address of /proc/kcore when making ELF dumpfile
Previously when executing makedumpfile with -E option against
/proc/kcore, makedumpfile will fail:
# makedumpfile -E -d 31 /proc/kcore kcore.dump
...
write_elf_load_segment: Can't convert physaddr(ffffffffffffffff) to an offset.
makedumpfile Failed.
It's because /proc/kcore contains PT_LOAD program headers which have
physaddr (0xffffffffffffffff). With -E option, makedumpfile will
try to convert the physaddr to an offset and fails.
Skip the PT_LOAD program headers which have such physaddr.
Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from upstream.
commit 38d921a2ef50ebd36258097553626443ffe27496
Author: Coiby Xu <coxu@redhat.com>
Date: Tue Jun 15 18:26:31 2021 +0800
[PATCH] check for invalid physical address of /proc/kcore when finding max_paddr
Kernel commit 464920104bf7adac12722035bfefb3d772eb04d8 ("/proc/kcore:
update physical address for kcore ram and text") sets an invalid paddr
(0xffffffffffffffff = -1) for PT_LOAD segments of not direct mapped
regions:
$ readelf -l /proc/kcore
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
NOTE 0x0000000000000120 0x0000000000000000 0x0000000000000000
0x0000000000002320 0x0000000000000000 0x0
LOAD 0x1000000000010000 0xd000000000000000 0xffffffffffffffff
^^^^^^^^^^^^^^^^^^
0x0001f80000000000 0x0001f80000000000 RWE 0x10000
makedumpfile uses max_paddr to calculate the number of sections for
sparse memory model thus wrong number is obtained based on max_paddr
(-1). This error could lead to the failure of copying /proc/kcore
for RHEL-8.5 on ppc64le machine [1]:
$ makedumpfile /proc/kcore vmcore1
get_mem_section: Could not validate mem_section.
get_mm_sparsemem: Can't get the address of mem_section.
makedumpfile Failed.
Let's check if the phys_start of the segment is a valid physical
address to fix this problem.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1965267
Reported-by: Xiaoying Yan <yiyan@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from upstream.
commit 646456862df8926ba10dd7330abf3bf0f887e1b6
Author: Kazuhito Hagio <k-hagio-ab@nec.com>
Date: Wed May 26 14:31:26 2021 +0900
[PATCH] Increase SECTION_MAP_LAST_BIT to 5
* Required for kernel 5.12
Kernel commit 1f90a3477df3 ("mm: teach pfn_to_online_page() about
ZONE_DEVICE section collisions") added a section flag
(SECTION_TAINT_ZONE_DEVICE) and causes makedumpfile an error on
some machines like this:
__vtop4_x86_64: Can't get a valid pmd_pte.
readmem: Can't convert a virtual address(ffffe2bdc2000000) to physical address.
readmem: type_addr: 0, addr:ffffe2bdc2000000, size:32768
__exclude_unnecessary_pages: Can't read the buffer of struct page.
create_2nd_bitmap: Can't exclude unnecessary pages.
Increase SECTION_MAP_LAST_BIT to 5 to fix this. The bit had not
been used until the change, so we can just increase the value.
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
For the log entry that has multiple lines, "makedumpfile --dump-dmesg"
would indent the remaining lines while vmcore-dmesg doesn't. For
example, vmcore-dmesg.txt and vmcore-dmesg.txt.2 are the outputs of
vmcore-dmesg and "makedumpfile --dump-dmesg" respectively,
```
diff -u vmcore-dmesg.txt vmcore-dmesg.txt.2
--- vmcore-dmesg.txt 2021-03-28 22:13:09.986000000 -0400
+++ vmcore-dmesg.txt.2 2021-03-28 22:13:39.920106131 -0400
@@ -397,9 +397,9 @@
[ 1.710742] vc vcsa: hash matches
[ 1.711938] RAS: Correctable Errors collector initialized.
[ 1.713736] Unstable clock detected, switching default tracing clock to "global"
-If you want to keep using the local clock, then add:
- "trace_clock=local"
-on the kernel command line
+ If you want to keep using the local clock, then add:
+ "trace_clock=local"
+ on the kernel command line
[ 1.750539] ata1.01: NODEV after polling detection
[ 1.750973] ata1.00: ATA-7: QEMU HARDDISK, 2.5+, max UDMA/100
[ 1.752885] ata1.00: 8388608 sectors, multi 16: LBA48
```
Quite often, all three tests could fail because of the above difference. So
let's ignore all the spaces. This patch could fix bz1952299 [1].
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1952299
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
test_base_image should depend on EXTRA_RPMS so it gets rebuild when
EXTRA_RPMS changes.
Fixes: commit bbc064f958
("selftest: add EXTRA_RPMs so dracut RPMs can be installed onto
the image to run the tests")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
When kdump service fails, the current errors do not display the
absolute path of dump location(marked it as "^"), for example:
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: Detected change(s) in the following file(s): /etc/kdump.conf
kdump: Rebuilding /boot/initramfs-4.18.0-304.el8.x86_64kdump.img
kdump: Dump path "/var1/crash" does not exist in dump target "UUID=c202ef45-3ac3-4adb-85e7-307a916757f0"
^^^^^^^^^^^
kdump: mkdumprd: failed to make kdump initrd
kdump: Starting kdump: [FAILED]
Here, it should output the absolute path of dump location with this
format: "<mount path>/<path>". To fix it, let's extend the relative
pathname to the absolute pathname in check_user_configured_target().
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
This patch fixes bz1941106 and bz1941905 which passed empty rd.znet to the
kernel command line in the following cases,
- The IBM (Z15) KVM guest uses virtio for all devices including network
device, so there is no znet device for IBM KVM guest. So we can't
assume a s390x machine always has a znet device.
- When a bridged network is used, kexec-tools tries to obtain the znet
configuration from the ifcfg script of the bridged network rather than
from the ifcfg script of znet device.
We can iterate /sys/bus/ccwgroup/devices to tell if there if there is
a znet network device. By getting an ifname from znet, we can also avoid
mistaking the slave netdev as a znet network device in a bridged network
or bonded network.
Note: This patch also assumes there is only one znet device as commit
7148c0a30d ("add s390x netdev setup")
which greatly simplifies the code. According to IBM [1], there could be
more than znet devices for a z/VM system and a z/VM system may have a
non-znet network device like ConnectX. Since kdump_setup_znet was
introduced in 2012 and so far there is no known customer complaint that
invalidates this assumption I think it's safe to assume an IBM z/VM
system only has one znet device. Besides, there is no z/VM system found
on beaker to test the alternative scenarios.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1941905#c13
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Use a modified and minimized version of emergency shell.
The differences of this kdump shell and dracut emergency shell are:
- Kdump shell won't generate a rdsosreport automatically
- Customized prompts
- Never ask root password
- Won't tangle with dracut's emergency_action. If emergency_action is
set, dracut emergency shell will perform dracut's emergency_action
instead of kdump final_action on exit.
- If rd.shell=no is set, kdump shell will still work, dracut emergency
shell won't, even if kdump failure_action is set to shell.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
The wrapper is introduced in commit 002337c, according to the commit
message, the only usage of the wrapper is when dracut-initqueue calls
"systemctl start emergency" directly. In that case, emergency
is started, but not in a isolation mode, which means dracut-initqueue
is still running. On the other hand, emergency will call
"systemctl start dracut-initqueue" again when default action is dump_to_rootfs.
systemd would block on the last dracut-initqueue, waiting for the first
instance to exit, which leaves us hang.
In previous commit we added initqueue status detect in dump_to_rootfs,
so now even without the wrapper, it will not hang.
And actually, previously, with the wrapper, emergency might still hang
for like 30s. When dracut called emergency service because initqueue
timed out, dump_to_rootfs will try start initqueue again and timeout
again. Now with the wrapper removed, we can avoid these two kinds of
hangs, bacause without the isolation we can detect initqueue service
status correctly in such case.
Also remove the invalid header comments in service file, the service
is not part of systemd code. And sync the service spec with dracut.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
kdump's dump_to_rootfs will try to start initqueue unconditionally.
dump_to_rootfs will run after systemd isolate to emergency
target, so this is currently accetable.
But there is a problem when initqueue starts the emergency action
because of initqueue timeout. dump_to_rootfs will start initqueue and
lead to timeout again.
So following patch will remove the previous isolation wrapper, and
detect the service status here. Previous isolation makes the detection
impossible. Now this detection will be valid and helpful to prevent
double timeout or hang.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
Fix the warning observed when KDUMP_KERNELVER is specified:
kdumpctl[10926]: /lib/kdump/kdump-lib.sh: line 697: [: missing `]'
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Kairui Song <kasong@redhat.com>
For crashkernel=auto policy, if total RAM size is under a throttle,
there is no memory reserved for kdump.
Also correct a trivial bug by correcting the arch name.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Add a rough esitimation support, currently, following memory usage are
checked by this sub command:
- System RAM
- Kdump Initramfs size
- Kdump Kernel image size
- Kdump Kernel module size
- Kdump userspace user and other runtime allocated memory (currently
simply using a fixed value: 64M)
- LUKS encryption memory usage
The output of kdumpctl estimate looks like this:
# kdumpctl estimate
Reserved crashkernel: 256M
Recommanded crashkernel: 160M
Kernel image size: 47M
Kernel modules size: 12M
Initramfs size: 19M
Runtime reservation: 64M
Large modules:
xfs: 1892352
nouveau: 2318336
And if the kdump target is encrypted:
# kdumpctl estimate
Encrypted kdump target requires extra memory, assuming using the keyslot with minimun memory requirement
Reserved crashkernel: 256M
Recommanded crashkernel: 655M
Kernel image size: 47M
Kernel modules size: 12M
Initramfs size: 19M
Runtime reservation: 64M
LUKS required size: 512M
Large modules:
xfs: 1892352
nouveau: 2318336
WARNING: Current crashkernel size is lower than recommanded size 655M.
The "Recommanded" value is calculated based on memory usages mentioned
above, and will be adjusted accodingly to be no less than the value provided
by kdump_get_arch_recommend_size.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Simplfy the code and also improve the performance. udevadm call is
heavy.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
kexec-tools needs to disable CMA for kdump kernel cmdline,
otherwise kdump kernel may run out of memory.
This patch strips the inherited cma=, hugetlb_cma= cmd
line from 1st kernel, and sets to be 0 for 2nd kernel.
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
apath (a D-Bus active connection path) is used for nmcli connection operations, e.g.
$ nmcli connection show $apath
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
nmcli --get-values <field> connection show /org/freedesktop/NetworkManager/ActiveConnection/1
returns the following value for the corresponding field respectively,
Field Value
IP4.DNS "10.19.42.41 | 10.11.5.19 | 10.5.30.160"
802-3-ethernet.s390-subchannels ""
bond.options "mode=balance-rr"
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
This fixes bz1854037 which happens because kexec-tools generates rd.route for
eth0 instead of for kdump-eth0,
1. "rd.route=168.63.129.16:10.0.0.1:eth0 rd.route=169.254.169.254:10.0.0.1:eth0" is passed to the dracut cmdline by kexec-tools
2. In the 2rd kernel, dracut/modules.d/35network-manager/nm-config.sh calls
/usr/libexec/nm-initrd-generator to generate two .nmconnection files
based on the dracut cmdline, i.e. kdump-eth0.nmconnection and eth0.nmconnection,
- /run/NetworkManager/system-connections/kdump-eth0.nmconnection
[connection]
id=kdump-eth0
uuid=3ef53b1b-3908-437e-a15f-cf1f3ea2678b
type=ethernet
autoconnect-retries=1
interface-name=kdump-eth0
multi-connect=1
permissions=
wait-device-timeout=60000
[ethernet]
mac-address-blacklist=
[ipv4]
address1=10.0.0.4/24,10.0.0.1
dhcp-timeout=90
dns=168.63.129.16;
dns-search=
may-fail=false
method=manual
[ipv6]
addr-gen-mode=eui64
dhcp-timeout=90
dns-search=
method=disabled
[proxy]
- /run/NetworkManager/system-connections/eth0.nmconnection
[connection]
id=eth0
uuid=f224dc22-2891-4d7b-8f66-745029df4b53
type=ethernet
autoconnect-retries=1
interface-name=eth0
multi-connect=1
permissions=
[ethernet]
mac-address-blacklist=
[ipv4]
dhcp-timeout=90
dns=168.63.129.16;
dns-search=
method=auto
route1=168.63.129.16/32,10.0.0.1
route2=169.254.169.254/32,10.0.0.1
[ipv6]
addr-gen-mode=eui64
dhcp-timeout=90
dns-search=
method=auto
[proxy]
3. Since there's eth0.nmconnection, NetworkManager will try to get an IP for eth0 regardless of the fact it's a slave NIC and time out
```
$ ip link show
2: kdump-eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:0d:3a:11:86:8b brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master kdump-eth0 state UP mode DEFAULT group default qlen 1000
```
Reported-by: Huijing Hei <hhei@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
vmcore-dmesg-incomplete.txt is generated by shell redirection,
which taking the default umask value. When dmesg collector exits
with non-zero, the file will exist and anyone can have access to
it.
This patch fixed the issue by chmod the file, making it accessible
only to its owner.
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
This reverts commit 5633e83318.
vm.zone_reclaim_mode may cause trashing on some machines. And after
second thought, vm.zone_reclaim_mode is barely helpful for machines
with high mem stress, so just revert it.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Kdump scirpt already have default values for core_collector, path in
many other place. Empty kdump.conf still works. Fix this corner case and
fix the error message.
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Backport from upstream:
commit 0ef2ca6c9fa2f61f217a4bf5d7fd70f24e12b2eb
Author: Kazuhito Hagio <k-hagio-ab@nec.com>
Date: Thu Feb 4 16:29:06 2021 +0900
[PATCH] Show write byte size in report messages
Show write byte size in report messages. This value can be different
from the size of the actual file because of some holes on dumpfile
data structure.
$ makedumpfile --show-stats -l -d 1 vmcore dump.ld1
...
Total pages : 0x0000000000080000
Write bytes : 377686445
...
# ls -l dump.ld1
-rw------- 1 root root 377691573 Feb 4 16:28 dump.ld1
Note that this value should not be used with /proc/kcore to determine
how much disk space is needed for crash dump, because the real memory
usage when a crash occurs can vary widely.
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from upstream:
commit 6f3e75a558ed50d6ff0b42e3f61c099b2005b7bb
Author: Julien Thierry <jthierry@redhat.com>
Date: Tue Nov 24 10:45:25 2020 +0000
[PATCH 2/2] Add shorthand --show-stats option to show report stats
Provide shorthand --show-stats option to enable report messages
without needing to set a particular value for message-level.
Signed-off-by: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
Backport from upstream.
commit 3422e1d6bc3511c5af9cb05ba74ad97dd93ffd7f
Author: Julien Thierry <jthierry@redhat.com>
Date: Tue Nov 24 10:45:24 2020 +0000
[PATCH 1/2] Add --dry-run option to prevent writing the dumpfile
Add a --dry-run option to run all operations without writing the
dump to the output file.
Signed-off-by: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
If the dump target is the root disk, kdump scripts add an entry in
/etc/fstab for root disk with /sysroot as the mount point. The root
disk, passed through root=<> kernel commandline parameter, is mounted
at /sysroot in read-only mode before switching from initial ramdisk.
So, in fadump mode, a remount of /sysroot to read-write mode is needed
to capture dump successfully, because /sysroot is already mounted as
read-only based on root=<> boot parameter.
Commit e8ef4db8ff ("Fix dump_fs mount point detection and fallback
mount") removed initialization of $_op variable, the variable holding
the options the dump target was mounted with, leading to the below
error as remount was skipped:
kdump[586]: saving to /sysroot/var/crash/127.0.0.1-2021-04-22-07:22:08/
kdump.sh[587]: mkdir: cannot create directory '/sysroot/var/crash/127.0.0.1-2021-04-22-07:22:08/': Read-only file system
kdump[589]: saving vmcore failed
Restore $_op variable initialization in dump_fs() function to fix this.
Fixes: e8ef4db8ff ("Fix dump_fs mount point detection and fallback mount")
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Kairui Song <kasong@redhat.com>
This fix bz1854037 which happens because kexec-tools generates rd.route for
eth0 instead of for kdump-eth0,
1. "rd.route=168.63.129.16:10.0.0.1:eth0 rd.route=169.254.169.254:10.0.0.1:eth0" is passed to the dracut cmdline by kexec-tools
2. In the 2rd kernel,
- dracut/modules.d/40network/net-lib.sh will write /tmp/net.route.eth0 based on rd.route
- dracut/modules.d/45ifcfg/write-ifcfg.sh will copy /tmp/net.route.eth0 to /tmp/icfg and then copytree /tmp/ifcfg to /run/initramfs/state/etc/sysconfig/network-scripts
3. NetworkManager will try to get an IP for eth0 regardless of the fact it's a slave NIC and time out
```
$ ip link show
2: kdump-eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:0d:3a:11:86:8b brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master kdump-eth0 state UP mode DEFAULT group default qlen 1000
```
Reported-by: Huijing Hei <hhei@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com