Unnamed repository
Go to file
Pingfan Liu 870ec2ec93 ppc64: tackle SRCU hang issue
Resolves: bz2158296
Upstream: RHEL-only

On PowerPC platform, the following hang is witnessed:

Welcome to
Red Hat Enterprise Linux 9.2 Beta (Plow) dracut-057-13.git20220816.el9 (Initramfs)
!

[    1.631210] systemd[1]: Hostname set to <ibm-p9z-18-lp11.virt.pnr.lab.eng.rdu2.redhat.com>.
[-- MARK -- Mon Sep 26 01:45:00 2022]
[  243.681283] INFO: task systemd:1 blocked for more than 122 seconds.
[  243.681303]       Not tainted 5.14.0-167.el9.ppc64le #1
[  243.681315] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.681329] task:systemd         state:D stack:    0 pid:    1 ppid:     0 flags:0x00042000
[  243.681349] Call Trace:
[  243.681356] [c00000001a603640] [c00000004f990100] 0xc00000004f990100 (unreliable)
[  243.681378] [c00000001a603830] [c00000001001e9cc] __switch_to+0x12c/0x220
[  243.681400] [c00000001a603890] [c000000010ec5b40] __schedule+0x230/0x720
[  243.681418] [c00000001a603950] [c000000010ec6090] schedule+0x60/0x110
[  243.681435] [c00000001a603980] [c000000010ecd948] schedule_timeout+0x168/0x1c0
[  243.681454] [c00000001a603a60] [c000000010ec7214] __wait_for_common+0x134/0x360
[  243.681473] [c00000001a603b00] [c00000001017c98c] __flush_work.isra.0+0x1dc/0x3d0
[  243.681493] [c00000001a603ba0] [c0000000105cbd88] fsnotify_wait_marks_destroyed+0x28/0x40
[  243.681512] [c00000001a603bc0] [c0000000105cb800] fsnotify_destroy_group+0x60/0x150
[  243.681531] [c00000001a603c30] [c0000000105cf640] inotify_release+0x30/0xa0
[  243.681548] [c00000001a603ca0] [c00000001054fad8] __fput+0xc8/0x350
[  243.681565] [c00000001a603cf0] [c000000010183174] task_work_run+0xe4/0x160
[  243.681583] [c00000001a603d40] [c000000010021874] do_notify_resume+0x134/0x140
[  243.681602] [c00000001a603d70] [c000000010030168] interrupt_exit_user_prepare_main+0x198/0x270
[  243.681622] [c00000001a603de0] [c0000000100305ac] syscall_exit_prepare+0x6c/0x180
[  243.681641] [c00000001a603e10] [c00000001000bff4] system_call_vectored_common+0xf4/0x278
[  243.681661] --- interrupt: 3000 at 0x7fffb3015ba4
[  243.681673] NIP:  00007fffb3015ba4 LR: 0000000000000000 CTR: 0000000000000000
[  243.681687] REGS: c00000001a603e80 TRAP: 3000   Not tainted  (5.14.0-167.el9.ppc64le)
[  243.681703] MSR:  800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE>  CR: 42044440  XER: 00000000
[  243.681737] IRQMASK: 0
[  243.681737] GPR00: 0000000000000006 00007fffd24a31a0 00007fffb3127200 0000000000000000
[  243.681737] GPR04: 0000000000000002 000000000000000a 0000000000000000 0000000000000000
[  243.681737] GPR08: 0000010009ea2d40 0000000000000000 0000000000000000 0000000000000000
[  243.681737] GPR12: 0000000000000000 00007fffb3834bc0 0000000000000000 0000000000000000
[  243.681737] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  243.681737] GPR20: 000000012c74ddf0 000000000000000e 000000000017cd3f 0000000000000000
[  243.681737] GPR24: 00007fffd24a3570 0000000000000005 0000010009eb5490 0000010009ea24e0
[  243.681737] GPR28: 0000010009ea2900 0000010009eb4850 0000010009ea2d70 00007fffb382dd98
[  243.681896] NIP [00007fffb3015ba4] 0x7fffb3015ba4
[  243.681907] LR [0000000000000000] 0x0
[  243.681917] --- interrupt: 3000
[  243.681928] INFO: task kworker/u16:1:34 blocked for more than 122 seconds.
[  243.681941]       Not tainted 5.14.0-167.el9.ppc64le #1
[  243.681951] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.681964] task:kworker/u16:1   state:D stack:    0 pid:   34 ppid:     2 flags:0x00000800
[  243.681982] Workqueue: events_unbound fsnotify_mark_destroy_workfn
[  243.681998] Call Trace:
[  243.682005] [c00000001a9336d0] [c00000004f990100] 0xc00000004f990100 (unreliable)
[  243.682023] [c00000001a9338c0] [c00000001001e9cc] __switch_to+0x12c/0x220
[  243.682042] [c00000001a933920] [c000000010ec5b40] __schedule+0x230/0x720
[  243.682059] [c00000001a9339e0] [c000000010ec6090] schedule+0x60/0x110
[  243.682075] [c00000001a933a10] [c000000010ecd948] schedule_timeout+0x168/0x1c0
[  243.682094] [c00000001a933af0] [c000000010ec7214] __wait_for_common+0x134/0x360
[  243.682113] [c00000001a933b90] [c000000010213370] __synchronize_srcu.part.0+0xa0/0xe0
[  243.682132] [c00000001a933c00] [c0000000105cc154] fsnotify_mark_destroy_workfn+0xc4/0x1a0
[  243.682151] [c00000001a933c70] [c00000001017acb8] process_one_work+0x298/0x580
[  243.682169] [c00000001a933d10] [c00000001017b048] worker_thread+0xa8/0x630
[  243.682185] [c00000001a933da0] [c000000010188348] kthread+0x1b8/0x1c0
[  243.682203] [c00000001a933e10] [c00000001000cd64] ret_from_kernel_thread+0x5c/0x64
[  366.561279] INFO: task systemd:1 blocked for more than 245 seconds.

The right solution should be in kernel, but since the patch [1] for SRCU
will not be merged into the mainline in near future, it had better to
have a userspace workaround to overcome this test blocker.

The workaround method is to pass the kernel parameter "srcutree.big_cpu_lim=0", so
that the SRCU system will always use srcu_node array.

[1]: https://lore.kernel.org/rcu/20221026032716.78674-1-kernelfans@gmail.com/T/#m6534975507c2abca497a94d81c7abbfea1d0978d

Signed-off-by: Pingfan Liu <piliu@redhat.com>
2023-01-06 11:26:03 +08:00
tests Merged update from upstream sources 2020-11-20 12:35:49 +00:00
.editorconfig kdump-lib-initramfs.sh: prepare to be a POSIX compatible lib 2021-11-09 21:45:15 +08:00
.gitignore RHEL 9.0.0 Alpha bootstrap 2020-10-15 14:45:57 +02:00
60-fadump.install fadump: add a kernel install hook to clean up fadump initramfs 2022-12-22 14:36:23 +08:00
60-kdump.install Write to `/var/lib/kdump` if $KDUMP_BOOTDIR not writable 2021-06-23 09:34:40 +08:00
92-crashkernel.install Prefix reset-crashkernel-{for-installed_kernel,after-update} with underscore 2022-10-27 14:47:57 +08:00
98-kexec.rules RHEL 9.0.0 Alpha bootstrap 2020-10-15 14:45:57 +02:00
98-kexec.rules.ppc64 Stop reloading kdump service on CPU hotplug event for FADump 2021-05-14 14:27:03 +08:00
README RHEL 9.0.0 Alpha bootstrap 2020-10-15 14:45:57 +02:00
crashkernel-howto.txt remind the users to run zipl after calling grubby on s390x 2022-09-19 09:10:54 +08:00
dracut-early-kdump-module-setup.sh dracut-early-kdump-module-setup.sh: install xargs and kdump-lib-initramfs.sh 2022-01-06 14:31:33 +08:00
dracut-early-kdump.sh dracut-early-kdump.sh: make it POSIX compatible 2021-11-10 10:27:00 +08:00
dracut-fadump-init-fadump.sh fadump-init: clean up mount points properly 2021-07-20 15:43:43 +08:00
dracut-fadump-module-setup.sh fadump: isolate fadump initramfs image within the default one 2021-07-20 15:43:11 +08:00
dracut-kdump-capture.service RHEL 9.0.0 Alpha bootstrap 2020-10-15 14:45:57 +02:00
dracut-kdump-emergency.service Merge kdump-error-handler.sh into kdump.sh 2021-11-09 21:45:31 +08:00
dracut-kdump-emergency.target RHEL 9.0.0 Alpha bootstrap 2020-10-15 14:45:57 +02:00
dracut-kdump.sh Wait for the network to be truly ready before dumping vmcore 2022-11-23 09:42:33 +08:00
dracut-module-setup.sh dracut-module-setup.sh: skip installing driver for the loopback interface 2022-12-27 05:31:11 +00:00
dracut-monitor_dd_progress RHEL 9.0.0 Alpha bootstrap 2020-10-15 14:45:57 +02:00
early-kdump-howto.txt RHEL 9.0.0 Alpha bootstrap 2020-10-15 14:45:57 +02:00
fadump-howto.txt update fadump-howto 2022-05-17 09:23:10 +00:00
gating.yaml Add gating.yaml to RHEL-9 kexec-tools 2021-06-08 20:03:41 +08:00
gen-kdump-conf.sh kdump.conf: use a simple generator script to maintain 2022-12-01 11:01:17 +08:00
kdump-dep-generator.sh Merged update from upstream sources 2021-01-22 08:12:00 +00:00
kdump-in-cluster-environment.txt RHEL 9.0.0 Alpha bootstrap 2020-10-15 14:45:57 +02:00
kdump-lib-initramfs.sh Address the cases where a NIC has a different name in kdump kernel 2022-11-23 09:42:18 +08:00
kdump-lib.sh Add lvm2 thin provision dump target checker 2022-11-09 15:56:18 +08:00
kdump-logger.sh Add header comment for POSIX compliant scripts 2021-11-10 10:26:54 +08:00
kdump-migrate-action.sh kdump/ppc64: rebuild initramfs image after migration 2021-12-03 18:13:09 +08:00
kdump-restart.sh kdump/ppc64: rebuild initramfs image after migration 2021-12-03 18:13:09 +08:00
kdump-udev-throttler RHEL 9.0.0 Alpha bootstrap 2020-10-15 14:45:57 +02:00
kdump.conf.5 introduce the auto_reset_crashkernel option to kdump.conf 2022-01-06 03:55:25 +00:00
kdump.service Merged update from upstream sources 2020-11-05 05:34:29 +00:00
kdump.sysconfig kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE 2022-05-27 10:08:59 +08:00
kdump.sysconfig.aarch64 kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE 2022-05-27 10:08:59 +08:00
kdump.sysconfig.i386 kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE 2022-05-27 10:08:59 +08:00
kdump.sysconfig.ppc64 ppc64: tackle SRCU hang issue 2023-01-06 11:26:03 +08:00
kdump.sysconfig.ppc64le ppc64: tackle SRCU hang issue 2023-01-06 11:26:03 +08:00
kdump.sysconfig.s390x kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE 2022-05-27 10:08:59 +08:00
kdump.sysconfig.x86_64 kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE 2022-05-27 10:08:59 +08:00
kdumpctl Don't try to update crashkernel when bootloader is not installed 2022-12-27 03:11:43 +00:00
kdumpctl.8 add man documentation for kdumpctl get-default-crashkernel 2022-05-17 09:23:10 +00:00
kexec-kdump-howto.txt update kexec-kdump-howto 2022-05-17 09:23:10 +00:00
kexec-tools-2.0.25-ppc64-ppc64-remove-rma_top-limit.patch kexec-tools: ppc64: remove rma_top limit 2022-12-06 11:01:48 +08:00
kexec-tools.spec Release 2.0.25-10 2022-12-27 15:11:50 +08:00
live-image-kdump-howto.txt RHEL 9.0.0 Alpha bootstrap 2020-10-15 14:45:57 +02:00
mkdumprd Reduce kdump memory consumption by only installing needed NIC drivers 2022-11-23 09:42:18 +08:00
mkdumprd.8 Merged update from upstream sources 2020-12-23 10:00:07 +00:00
mkfadumprd fadump: use 'zstd' as the default compression method 2022-12-22 14:36:23 +08:00
sources Rebase makedumpfile to 1.7.2 2022-11-01 11:04:23 +08:00
supported-kdump-targets.txt Update supported-kdump-targets.txt 2022-12-27 05:31:11 +00:00
zanata-notes.txt RHEL 9.0.0 Alpha bootstrap 2020-10-15 14:45:57 +02:00

README

Adding a patch to kexec-tools
=============================
There is a mailing list kexec@lists.fedoraproject.org where all the dicussion
related to fedora kexec-tools happen. All the patches are posted there for
inclusion and committed to kexec-tools after review.

So if you want your patches to be included in fedora kexec-tools package,
post these to kexec@lists.fedoraproject.org.

One can subscribe to list and browse through archives here.

https://admin.fedoraproject.org/mailman/listinfo/kexec