8 Commits

Author SHA1 Message Date
Pingfan Liu 2b2b6b84c0 Revert "ppc64: tackle SRCU hang issue"
Resolves: bz2177574
Upstream: RHEL-only

This reverts commit 870ec2ec93.

Now that the real fix has gone into the RHEL-9 kernel [1], the temporary
workaround can be removed.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2129726

Signed-off-by: Pingfan Liu <piliu@redhat.com>
2023-03-21 07:50:06 +00:00
Pingfan Liu 870ec2ec93 ppc64: tackle SRCU hang issue
Resolves: bz2158296
Upstream: RHEL-only

On the PowerPC platform, the following hang is observed:

Welcome to Red Hat Enterprise Linux 9.2 Beta (Plow) dracut-057-13.git20220816.el9 (Initramfs)!

[    1.631210] systemd[1]: Hostname set to <ibm-p9z-18-lp11.virt.pnr.lab.eng.rdu2.redhat.com>.
[-- MARK -- Mon Sep 26 01:45:00 2022]
[  243.681283] INFO: task systemd:1 blocked for more than 122 seconds.
[  243.681303]       Not tainted 5.14.0-167.el9.ppc64le #1
[  243.681315] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.681329] task:systemd         state:D stack:    0 pid:    1 ppid:     0 flags:0x00042000
[  243.681349] Call Trace:
[  243.681356] [c00000001a603640] [c00000004f990100] 0xc00000004f990100 (unreliable)
[  243.681378] [c00000001a603830] [c00000001001e9cc] __switch_to+0x12c/0x220
[  243.681400] [c00000001a603890] [c000000010ec5b40] __schedule+0x230/0x720
[  243.681418] [c00000001a603950] [c000000010ec6090] schedule+0x60/0x110
[  243.681435] [c00000001a603980] [c000000010ecd948] schedule_timeout+0x168/0x1c0
[  243.681454] [c00000001a603a60] [c000000010ec7214] __wait_for_common+0x134/0x360
[  243.681473] [c00000001a603b00] [c00000001017c98c] __flush_work.isra.0+0x1dc/0x3d0
[  243.681493] [c00000001a603ba0] [c0000000105cbd88] fsnotify_wait_marks_destroyed+0x28/0x40
[  243.681512] [c00000001a603bc0] [c0000000105cb800] fsnotify_destroy_group+0x60/0x150
[  243.681531] [c00000001a603c30] [c0000000105cf640] inotify_release+0x30/0xa0
[  243.681548] [c00000001a603ca0] [c00000001054fad8] __fput+0xc8/0x350
[  243.681565] [c00000001a603cf0] [c000000010183174] task_work_run+0xe4/0x160
[  243.681583] [c00000001a603d40] [c000000010021874] do_notify_resume+0x134/0x140
[  243.681602] [c00000001a603d70] [c000000010030168] interrupt_exit_user_prepare_main+0x198/0x270
[  243.681622] [c00000001a603de0] [c0000000100305ac] syscall_exit_prepare+0x6c/0x180
[  243.681641] [c00000001a603e10] [c00000001000bff4] system_call_vectored_common+0xf4/0x278
[  243.681661] --- interrupt: 3000 at 0x7fffb3015ba4
[  243.681673] NIP:  00007fffb3015ba4 LR: 0000000000000000 CTR: 0000000000000000
[  243.681687] REGS: c00000001a603e80 TRAP: 3000   Not tainted  (5.14.0-167.el9.ppc64le)
[  243.681703] MSR:  800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE>  CR: 42044440  XER: 00000000
[  243.681737] IRQMASK: 0
[  243.681737] GPR00: 0000000000000006 00007fffd24a31a0 00007fffb3127200 0000000000000000
[  243.681737] GPR04: 0000000000000002 000000000000000a 0000000000000000 0000000000000000
[  243.681737] GPR08: 0000010009ea2d40 0000000000000000 0000000000000000 0000000000000000
[  243.681737] GPR12: 0000000000000000 00007fffb3834bc0 0000000000000000 0000000000000000
[  243.681737] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  243.681737] GPR20: 000000012c74ddf0 000000000000000e 000000000017cd3f 0000000000000000
[  243.681737] GPR24: 00007fffd24a3570 0000000000000005 0000010009eb5490 0000010009ea24e0
[  243.681737] GPR28: 0000010009ea2900 0000010009eb4850 0000010009ea2d70 00007fffb382dd98
[  243.681896] NIP [00007fffb3015ba4] 0x7fffb3015ba4
[  243.681907] LR [0000000000000000] 0x0
[  243.681917] --- interrupt: 3000
[  243.681928] INFO: task kworker/u16:1:34 blocked for more than 122 seconds.
[  243.681941]       Not tainted 5.14.0-167.el9.ppc64le #1
[  243.681951] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.681964] task:kworker/u16:1   state:D stack:    0 pid:   34 ppid:     2 flags:0x00000800
[  243.681982] Workqueue: events_unbound fsnotify_mark_destroy_workfn
[  243.681998] Call Trace:
[  243.682005] [c00000001a9336d0] [c00000004f990100] 0xc00000004f990100 (unreliable)
[  243.682023] [c00000001a9338c0] [c00000001001e9cc] __switch_to+0x12c/0x220
[  243.682042] [c00000001a933920] [c000000010ec5b40] __schedule+0x230/0x720
[  243.682059] [c00000001a9339e0] [c000000010ec6090] schedule+0x60/0x110
[  243.682075] [c00000001a933a10] [c000000010ecd948] schedule_timeout+0x168/0x1c0
[  243.682094] [c00000001a933af0] [c000000010ec7214] __wait_for_common+0x134/0x360
[  243.682113] [c00000001a933b90] [c000000010213370] __synchronize_srcu.part.0+0xa0/0xe0
[  243.682132] [c00000001a933c00] [c0000000105cc154] fsnotify_mark_destroy_workfn+0xc4/0x1a0
[  243.682151] [c00000001a933c70] [c00000001017acb8] process_one_work+0x298/0x580
[  243.682169] [c00000001a933d10] [c00000001017b048] worker_thread+0xa8/0x630
[  243.682185] [c00000001a933da0] [c000000010188348] kthread+0x1b8/0x1c0
[  243.682203] [c00000001a933e10] [c00000001000cd64] ret_from_kernel_thread+0x5c/0x64
[  366.561279] INFO: task systemd:1 blocked for more than 245 seconds.

The right fix belongs in the kernel, but since the SRCU patch [1] will
not be merged into mainline in the near future, a userspace workaround
is needed to overcome this test blocker.

The workaround is to pass the kernel parameter "srcutree.big_cpu_lim=0", so
that the SRCU system always uses the srcu_node array.
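
The idea of passing an extra kernel parameter can be sketched as below. This is only an illustration, not the code from this change: the helper name `append_param` and the sample command line are hypothetical, and where the real change stores the parameter is not shown here.

```python
def append_param(cmdline: str, param: str) -> str:
    """Return cmdline with param appended, unless it is already present."""
    args = cmdline.split()
    if param not in args:
        args.append(param)
    return " ".join(args)


# Hypothetical command line, for illustration only.
print(append_param("ro quiet", "srcutree.big_cpu_lim=0"))
```

Checking for the parameter first keeps the helper idempotent, so calling it twice does not duplicate the argument.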

[1]: https://lore.kernel.org/rcu/20221026032716.78674-1-kernelfans@gmail.com/T/#m6534975507c2abca497a94d81c7abbfea1d0978d

Signed-off-by: Pingfan Liu <piliu@redhat.com>
2023-01-06 11:26:03 +08:00
Lichen Liu fcca486525 kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE
Resolves: bz2090533
Upstream: Fedora
Conflict: None

commit 218d9917c03f25bc9872f076491c587815d16efb
Author: Dusty Mabe <dusty@dustymabe.com>
Date:   Mon May 16 14:04:12 2022 -0400

    kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE

    For CoreOS-based systems we use Ignition for provisioning machines
    in the initramfs on first boot. Ignition is currently triggered by
    the presence of `ignition.firstboot` in the kernel command line. The
    kernel argument is only present on first boot, so after a reboot it
    is no longer in the kernel command line.

    If a kernel crash happens before the first reboot of a machine, we
    want the `ignition.firstboot` kernel argument to be removed and not
    passed on to the crash kernel.
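
The effect of listing an argument for removal can be sketched as follows. This is a simplified illustration under stated assumptions, not the kexec-tools implementation: the helper name `remove_params` and the sample command line are hypothetical, and quoted values containing spaces are not handled.

```python
from typing import Iterable


def remove_params(cmdline: str, names: Iterable[str]) -> str:
    """Drop every argument whose key (the text before '=') is in names."""
    drop = set(names)
    kept = [arg for arg in cmdline.split() if arg.split("=", 1)[0] not in drop]
    return " ".join(kept)


# Hypothetical first-boot command line, for illustration only.
first_boot = "ro ignition.firstboot ignition.platform.id=qemu quiet"
print(remove_params(first_boot, ["ignition.firstboot"]))
```

Matching on the key rather than the whole argument means both the bare form and a `key=value` form of the same argument are removed, while other `ignition.*` arguments are left alone.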

Signed-off-by: Lichen Liu <lichliu@redhat.com>
2022-05-27 10:08:59 +08:00
Pingfan Liu 888c24c90b kdump.sysconfig: make kexec_file_load as default option on ppc64le
Resolves: bz1881876
Upstream: Fedora
Conflict: None

commit a239a939237ced11c35d52d722a7eecb84091de6
Author: Pingfan Liu <piliu@redhat.com>
Date:   Thu Oct 21 10:13:10 2021 +0800

    sysconfig: make kexec_file_load as default option on ppc64le

    Signed-off-by: Pingfan Liu <piliu@redhat.com>

Signed-off-by: Pingfan Liu <piliu@redhat.com>
2021-11-12 09:47:42 +08:00
Tao Liu 2b7d3aa34d Disable CMA in kdump 2nd kernel
Resolves: bz1950885
Upstream: Fedora
Conflict: None

commit d5fe96cd7a779984bf2ba4c8dc51cd10c7e37efd
Author: Tao Liu <ltao@redhat.com>
Date:   Tue Apr 27 17:58:40 2021 +0800

    Disable CMA in kdump 2nd kernel

    kexec-tools needs to disable CMA on the kdump kernel command line;
    otherwise the kdump kernel may run out of memory.

    This patch strips the cma= and hugetlb_cma= parameters inherited
    from the 1st kernel's command line and sets both to 0 for the 2nd kernel.
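
The strip-then-reset step described above can be sketched like this. It is a minimal illustration, not the patch itself: the helper name `disable_cma` and the sample command line are hypothetical, and quoted values containing spaces are not handled.

```python
# Parameters named in the commit message.
CMA_KEYS = ("cma", "hugetlb_cma")


def disable_cma(cmdline: str) -> str:
    """Strip inherited cma=/hugetlb_cma= arguments, then set both to 0."""
    kept = [arg for arg in cmdline.split() if arg.split("=", 1)[0] not in CMA_KEYS]
    kept.extend(f"{key}=0" for key in CMA_KEYS)
    return " ".join(kept)


# Hypothetical 1st-kernel command line, for illustration only.
print(disable_cma("ro cma=64M hugetlb_cma=1G quiet"))
```

Removing the inherited values before appending the zeroed ones avoids leaving two conflicting settings of the same parameter on the command line.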

    Signed-off-by: Tao Liu <ltao@redhat.com>
    Acked-by: Kairui Song <kasong@redhat.com>

Signed-off-by: Tao Liu <ltao@redhat.com>
2021-05-14 14:27:03 +08:00
DistroBaker 17a51515f0 Merged update from upstream sources
This is an automated DistroBaker update from upstream sources.
If you do not know what this is about or would like to opt out,
contact the OSCI team.

Source: https://src.fedoraproject.org/rpms/kexec-tools.git#4f492cf73ea11ff74f5b062e18fcea45cb5e7eeb
2020-11-20 12:35:49 +00:00
DistroBaker 5cac7c3f96 Merged update from upstream sources
This is an automated DistroBaker update from upstream sources.
If you do not know what this is about or would like to opt out,
contact the OSCI team.

Source: https://src.fedoraproject.org/rpms/kexec-tools.git#bfd06661e81465d077bac435c90b4082134adf19
2020-11-05 05:34:29 +00:00
Petr Šabata f5bf4978d8 RHEL 9.0.0 Alpha bootstrap
The content of this branch was automatically imported from Fedora ELN
with the following as its source:
https://src.fedoraproject.org/rpms/kexec-tools#041ba89902961b5490a7143d9596dc00d732cba0
2020-10-15 14:45:57 +02:00