870ec2ec93
Resolves: bz2158296
Upstream: RHEL-only

On the PowerPC platform, the following hang is observed:

Welcome to Red Hat Enterprise Linux 9.2 Beta (Plow) dracut-057-13.git20220816.el9 (Initramfs)!

[ 1.631210] systemd[1]: Hostname set to <ibm-p9z-18-lp11.virt.pnr.lab.eng.rdu2.redhat.com>.
[-- MARK -- Mon Sep 26 01:45:00 2022]
[ 243.681283] INFO: task systemd:1 blocked for more than 122 seconds.
[ 243.681303] Not tainted 5.14.0-167.el9.ppc64le #1
[ 243.681315] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.681329] task:systemd state:D stack: 0 pid: 1 ppid: 0 flags:0x00042000
[ 243.681349] Call Trace:
[ 243.681356] [c00000001a603640] [c00000004f990100] 0xc00000004f990100 (unreliable)
[ 243.681378] [c00000001a603830] [c00000001001e9cc] __switch_to+0x12c/0x220
[ 243.681400] [c00000001a603890] [c000000010ec5b40] __schedule+0x230/0x720
[ 243.681418] [c00000001a603950] [c000000010ec6090] schedule+0x60/0x110
[ 243.681435] [c00000001a603980] [c000000010ecd948] schedule_timeout+0x168/0x1c0
[ 243.681454] [c00000001a603a60] [c000000010ec7214] __wait_for_common+0x134/0x360
[ 243.681473] [c00000001a603b00] [c00000001017c98c] __flush_work.isra.0+0x1dc/0x3d0
[ 243.681493] [c00000001a603ba0] [c0000000105cbd88] fsnotify_wait_marks_destroyed+0x28/0x40
[ 243.681512] [c00000001a603bc0] [c0000000105cb800] fsnotify_destroy_group+0x60/0x150
[ 243.681531] [c00000001a603c30] [c0000000105cf640] inotify_release+0x30/0xa0
[ 243.681548] [c00000001a603ca0] [c00000001054fad8] __fput+0xc8/0x350
[ 243.681565] [c00000001a603cf0] [c000000010183174] task_work_run+0xe4/0x160
[ 243.681583] [c00000001a603d40] [c000000010021874] do_notify_resume+0x134/0x140
[ 243.681602] [c00000001a603d70] [c000000010030168] interrupt_exit_user_prepare_main+0x198/0x270
[ 243.681622] [c00000001a603de0] [c0000000100305ac] syscall_exit_prepare+0x6c/0x180
[ 243.681641] [c00000001a603e10] [c00000001000bff4] system_call_vectored_common+0xf4/0x278
[ 243.681661] --- interrupt: 3000 at 0x7fffb3015ba4
[ 243.681673] NIP: 00007fffb3015ba4 LR: 0000000000000000 CTR: 0000000000000000
[ 243.681687] REGS: c00000001a603e80 TRAP: 3000 Not tainted (5.14.0-167.el9.ppc64le)
[ 243.681703] MSR: 800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE> CR: 42044440 XER: 00000000
[ 243.681737] IRQMASK: 0
[ 243.681737] GPR00: 0000000000000006 00007fffd24a31a0 00007fffb3127200 0000000000000000
[ 243.681737] GPR04: 0000000000000002 000000000000000a 0000000000000000 0000000000000000
[ 243.681737] GPR08: 0000010009ea2d40 0000000000000000 0000000000000000 0000000000000000
[ 243.681737] GPR12: 0000000000000000 00007fffb3834bc0 0000000000000000 0000000000000000
[ 243.681737] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 243.681737] GPR20: 000000012c74ddf0 000000000000000e 000000000017cd3f 0000000000000000
[ 243.681737] GPR24: 00007fffd24a3570 0000000000000005 0000010009eb5490 0000010009ea24e0
[ 243.681737] GPR28: 0000010009ea2900 0000010009eb4850 0000010009ea2d70 00007fffb382dd98
[ 243.681896] NIP [00007fffb3015ba4] 0x7fffb3015ba4
[ 243.681907] LR [0000000000000000] 0x0
[ 243.681917] --- interrupt: 3000
[ 243.681928] INFO: task kworker/u16:1:34 blocked for more than 122 seconds.
[ 243.681941] Not tainted 5.14.0-167.el9.ppc64le #1
[ 243.681951] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.681964] task:kworker/u16:1 state:D stack: 0 pid: 34 ppid: 2 flags:0x00000800
[ 243.681982] Workqueue: events_unbound fsnotify_mark_destroy_workfn
[ 243.681998] Call Trace:
[ 243.682005] [c00000001a9336d0] [c00000004f990100] 0xc00000004f990100 (unreliable)
[ 243.682023] [c00000001a9338c0] [c00000001001e9cc] __switch_to+0x12c/0x220
[ 243.682042] [c00000001a933920] [c000000010ec5b40] __schedule+0x230/0x720
[ 243.682059] [c00000001a9339e0] [c000000010ec6090] schedule+0x60/0x110
[ 243.682075] [c00000001a933a10] [c000000010ecd948] schedule_timeout+0x168/0x1c0
[ 243.682094] [c00000001a933af0] [c000000010ec7214] __wait_for_common+0x134/0x360
[ 243.682113] [c00000001a933b90] [c000000010213370] __synchronize_srcu.part.0+0xa0/0xe0
[ 243.682132] [c00000001a933c00] [c0000000105cc154] fsnotify_mark_destroy_workfn+0xc4/0x1a0
[ 243.682151] [c00000001a933c70] [c00000001017acb8] process_one_work+0x298/0x580
[ 243.682169] [c00000001a933d10] [c00000001017b048] worker_thread+0xa8/0x630
[ 243.682185] [c00000001a933da0] [c000000010188348] kthread+0x1b8/0x1c0
[ 243.682203] [c00000001a933e10] [c00000001000cd64] ret_from_kernel_thread+0x5c/0x64
[ 366.561279] INFO: task systemd:1 blocked for more than 245 seconds.

The proper fix belongs in the kernel, but since the SRCU patch [1] will not
be merged into mainline in the near future, a userspace workaround is needed
to overcome this test blocker.

The workaround is to pass the kernel parameter "srcutree.big_cpu_lim=0", so
that the SRCU subsystem always uses the srcu_node array.

[1]: https://lore.kernel.org/rcu/20221026032716.78674-1-kernelfans@gmail.com/T/#m6534975507c2abca497a94d81c7abbfea1d0978d

Signed-off-by: Pingfan Liu <piliu@redhat.com>
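To confirm that the workaround parameter actually reaches a kernel, one could test a command line for the exact argument. The following is a minimal sketch, not part of the kdump tooling: the helper name `has_cmdline_arg` and the sample command line are hypothetical, and on a running system the input would typically be the contents of /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper (not part of kexec-tools): succeed if the
# space-separated command line in $1 contains the exact argument $2.
has_cmdline_arg() {
    case " $1 " in
        *" $2 "*) return 0 ;;
    esac
    return 1
}

# Illustrative command line; on a real system one would use
# "$(cat /proc/cmdline)" instead.
sample_cmdline="ro root=LABEL=/ irqpoll maxcpus=1 srcutree.big_cpu_lim=0"

if has_cmdline_arg "$sample_cmdline" "srcutree.big_cpu_lim=0"; then
    echo "workaround present"
else
    echo "workaround missing"
fi
```

Matching on surrounding spaces avoids false positives from arguments that merely share a prefix, at the cost of assuming the command line is separated by single spaces.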
59 lines
2.5 KiB
Plaintext
# Kernel Version string for the -kdump kernel, such as 2.6.13-1544.FC5kdump
# If no version is specified, then the init script will try to find a
# kdump kernel with the same version number as the running kernel.
KDUMP_KERNELVER=""

# The kdump commandline is the command line that needs to be passed off to
# the kdump kernel. This will likely match the contents of the grub kernel
# line. For example:
# KDUMP_COMMANDLINE="ro root=LABEL=/"
# Dracut depends on proper root= options, so please make sure that appropriate
# root= options are copied from /proc/cmdline. In general it is best to append
# command line options using "KDUMP_COMMANDLINE_APPEND=".
# If a command line is not specified, the default will be taken from
# /proc/cmdline
KDUMP_COMMANDLINE=""

# This variable lets us remove arguments from the current kdump commandline
# as taken from either KDUMP_COMMANDLINE above, or from /proc/cmdline
# NOTE: some arguments such as crashkernel will always be removed
KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet log_buf_len swiotlb hugetlb_cma ignition.firstboot"

# This variable lets us append arguments to the current kdump commandline
# after it has been processed by KDUMP_COMMANDLINE_REMOVE
KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1 noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd hugetlb_cma=0 srcutree.big_cpu_lim=0"

# Any additional kexec arguments required. In most situations, this should
# be left empty
#
# Example:
# KEXEC_ARGS="--elf32-core-headers"
KEXEC_ARGS="--dt-no-old-root"

#Where to find the boot image
#KDUMP_BOOTDIR="/boot"

#What is the image type used for kdump
KDUMP_IMG="vmlinuz"

#What is the image's extension? Relocatable kernels don't have one
KDUMP_IMG_EXT=""

#Specify the action after failure

# Logging is controlled by the following variables in the first kernel:
# - @var KDUMP_STDLOGLVL - logging level to standard error (console output)
# - @var KDUMP_SYSLOGLVL - logging level to syslog (by logger command)
# - @var KDUMP_KMSGLOGLVL - logging level to /dev/kmsg (only for boot-time)
#
# In the second kernel, kdump will use the rd.kdumploglvl option to set the
# log level in the above KDUMP_COMMANDLINE_APPEND.
# - @var rd.kdumploglvl - logging level to syslog (by logger command)
# - for example: add the rd.kdumploglvl=3 option to KDUMP_COMMANDLINE_APPEND
#
# Logging levels: no logging(0), error(1), warn(2), info(3), debug(4)
#
# KDUMP_STDLOGLVL=3
# KDUMP_SYSLOGLVL=0
# KDUMP_KMSGLOGLVL=0
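
The way KDUMP_COMMANDLINE_REMOVE and KDUMP_COMMANDLINE_APPEND combine, including the rd.kdumploglvl example from the logging comments, can be sketched as follows. This is an illustration under stated assumptions rather than the actual kdump implementation: the function name `remove_args`, the sample command line, and the shortened remove and append lists are all hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch, not the real kdump code: drop every argument whose
# name (the part before the first "=") appears in the remove list.
remove_args() {
    # $1: kernel command line, $2: space-separated argument names to drop
    out=""
    for arg in $1; do
        name=${arg%%=*}
        keep=1
        for r in $2; do
            if [ "$name" = "$r" ]; then
                keep=0
            fi
        done
        if [ "$keep" -eq 1 ]; then
            out="$out${out:+ }$arg"
        fi
    done
    printf '%s\n' "$out"
}

# Illustrative inputs; real values come from /proc/cmdline and this file.
cmdline="ro root=LABEL=/ quiet hugepages=16 crashkernel=256M"
remove="hugepages quiet crashkernel"
append="irqpoll maxcpus=1 srcutree.big_cpu_lim=0 rd.kdumploglvl=3"

# Removal runs first; the append list is then added to what remains.
final="$(remove_args "$cmdline" "$remove") $append"
echo "$final"
```

Because removal happens before appending, an argument listed in both variables (such as hugetlb_cma in this file) is stripped from the inherited command line and then re-added with the value given in the append list.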