fc66e25f7b
11 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Tao Liu
|
fc66e25f7b |
Re-introduce vmcore creation notification to kdump
Upstream: fedora
Resolves: RHEL-70214
Conflict: Yes, the conflict is the same as the original c9s commit |
||
Tao Liu
|
79aec45f8c |
Revert "Introduce vmcore creation notification to kdump"
Resolves: RHEL-70214
Upstream: fedora
Conflict: Yes, the conflict is the same as the original c9s commit |
||
Tao Liu
|
c5aa460992 |
Introduce vmcore creation notification to kdump
Upstream: fedora
Resolves: RHEL-32060
Conflict: Yes, there are several conflicts.
1) Upstream has moved dracut-kdump.sh into kdump-utils/dracut/99kdumpbase/kdump.sh, so the target files have changed.
2) Several patchsets ([1] [2]) are not backported to rhel9, so some formatting conflicts were encountered. The backport makes no functional change to the patch.

[1]: https://github.com/rhkdump/kdump-utils/pull/18/commits
[2]: https://github.com/rhkdump/kdump-utils/pull/33/commits

commit 88525ebf5e43cc86aea66dc75ec83db58233883b
Author: Tao Liu <ltao@redhat.com>
Date: Thu Sep 5 15:49:07 2024 +1200

Introduce vmcore creation notification to kdump

Motivation
==========

People may forget to recheck that kdump still works, with the result that no vmcore is generated after a real system crash. That is unexpected for kdump.

It is highly recommended to recheck kdump after any system modification, such as:

a. after kernel patching or a whole yum update, as it might break something kdump depends on, for example by introducing a new bug.
b. after any change at the hardware level, such as storage, networking, or a firmware upgrade.
c. after deploying any new application, for example one that involves 3rd party modules.

Although these checks are beyond the scope of kdump, a simple vmcore creation status notification is good to have for now.

Design
======

On (re)start, kdump currently checks whether any related files/fs/drivers have been modified before determining whether the initrd should be rebuilt. A rebuild indicates such a modification, and that kdump needs to be rechecked. A rebuild clears the vmcore creation status specified in $VMCORE_CREATION_STATUS.

The vmcore creation check happens at "kdumpctl (re)start/status" and reports the creation success/fail status to users.

A "success" status indicates that a vmcore was previously generated successfully in the current env, so a vmcore is more likely to be generated when a real crash happens. A "fail" status indicates that no vmcore was previously generated, or that a vmcore creation failed, in the current env. Users should check the 2nd kernel log or kexec-dmesg.log for the reason of the failure.

$VMCORE_CREATION_STATUS is used for recording the vmcore creation status of the current env. The format looks like:

    success 1718682002

This means that a vmcore was generated successfully at this timestamp for the current env.

Usage
=====

    [root@localhost ~]# kdumpctl restart
    kdump: kexec: unloaded kdump kernel
    kdump: Stopping kdump: [OK]
    kdump: kexec: loaded kdump kernel
    kdump: Starting kdump: [OK]
    kdump: Notice: No vmcore creation test performed!

    [root@localhost ~]# kdumpctl test

    [root@localhost ~]# kdumpctl status
    kdump: Kdump is operational
    kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024

    [root@localhost ~]# kdumpctl restart
    kdump: kexec: unloaded kdump kernel
    kdump: Stopping kdump: [OK]
    kdump: kexec: loaded kdump kernel
    kdump: Starting kdump: [OK]
    kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024

The notification for kdumpctl (re)start/status can be disabled by setting VMCORE_CREATION_NOTIFICATION in /etc/sysconfig/kdump

Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com> |
||
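The status format described in c5aa460992 can be illustrated with a minimal shell sketch. It assumes the status is a file holding one line such as `success 1718682002`. The function name `vmcore_creation_notice`, the handling of a missing or empty file, and the wording of the failure message are assumptions made for illustration; this is not the code kdumpctl ships.

```shell
#!/bin/sh
# Hypothetical helper: turn a status file ("<status> <unix timestamp>")
# into a one-line notice. Names and the "fail" wording are assumptions.
vmcore_creation_notice() {
    status_file=$1

    # Assumption: a missing or empty file means no test has been run yet.
    if [ ! -s "$status_file" ]; then
        echo "Notice: No vmcore creation test performed!"
        return 0
    fi

    # Split the first line into the status word and the timestamp.
    read -r status timestamp < "$status_file"

    case "$status" in
        success)
            # "date -d @SECONDS" is GNU date syntax.
            echo "Notice: Last successful vmcore creation on $(date -d "@$timestamp")"
            ;;
        fail)
            # Hypothetical wording; the commit message does not show this line.
            echo "Warning: Last vmcore creation failed on $(date -d "@$timestamp")"
            ;;
        *)
            echo "Warning: Unrecognized vmcore creation status '$status'"
            ;;
    esac
}
```

Calling `vmcore_creation_notice /path/to/status-file` prints one notice line; the date is rendered in the local timezone, so the exact text varies by system.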
Pingfan Liu
|
d9904e1794 |
ppc64le: replace kernel cmdline maxcpu=1 with nr_cpus=1
Resolves: https://issues.redhat.com/browse/RHEL-43581
Upstream: Fedora
Conflict: Applied manually

commit 44a1b7da908a52c15a2b7ed286b59cfe7319b4c9
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date: Wed Feb 28 22:51:15 2024 +0530

ppc64le: replace kernel cmdline maxcpu=1 with nr_cpus=1

With patch series [1], PowerPC supports nr_cpus=1, so use nr_cpus=1 instead of maxcpu=1 in the kdump environment.

Note that this change depends on the kernel changes in [1].

[1] https://lore.kernel.org/all/170800202447.601034.7290612623478478380.b4-ty@ellerman.id.au/#t

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com> |
||
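The substitution made in d9904e1794 can be sketched as a small shell function that rewrites a kernel command-line string. The function name `use_nr_cpus` is hypothetical, and the sketch matches `maxcpus=1`, which is how the kernel spells the parameter the commit title calls `maxcpu=1`. It illustrates the idea only and is not the change applied to the kdump sysconfig files.

```shell
#!/bin/sh
# Hypothetical helper: replace a whole-word "maxcpus=1" with "nr_cpus=1"
# in the kernel command line passed as the first argument.
use_nr_cpus() {
    # Match only when the parameter is delimited by whitespace or the
    # string boundaries, so values such as "maxcpus=16" stay untouched.
    printf '%s\n' "$1" |
        sed -E 's/(^|[[:space:]])maxcpus=1([[:space:]]|$)/\1nr_cpus=1\2/g'
}
```

For example, `use_nr_cpus "irqpoll maxcpus=1 reset_devices"` yields `irqpoll nr_cpus=1 reset_devices`.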
Pingfan Liu
|
2b2b6b84c0 |
Revert "ppc64: tackle SRCU hang issue"
Resolves: bz2177574
Upstream: RHEL-only
This reverts commit
|
||
Pingfan Liu
|
870ec2ec93 |
ppc64: tackle SRCU hang issue
Resolves: bz2158296
Upstream: RHEL-only

On the PowerPC platform, the following hang is observed:

    Welcome to Red Hat Enterprise Linux 9.2 Beta (Plow) dracut-057-13.git20220816.el9 (Initramfs)!
    [ 1.631210] systemd[1]: Hostname set to <ibm-p9z-18-lp11.virt.pnr.lab.eng.rdu2.redhat.com>.
    [-- MARK -- Mon Sep 26 01:45:00 2022]
    [ 243.681283] INFO: task systemd:1 blocked for more than 122 seconds.
    [ 243.681303] Not tainted 5.14.0-167.el9.ppc64le #1
    [ 243.681315] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [ 243.681329] task:systemd state:D stack: 0 pid: 1 ppid: 0 flags:0x00042000
    [ 243.681349] Call Trace:
    [ 243.681356] [c00000001a603640] [c00000004f990100] 0xc00000004f990100 (unreliable)
    [ 243.681378] [c00000001a603830] [c00000001001e9cc] __switch_to+0x12c/0x220
    [ 243.681400] [c00000001a603890] [c000000010ec5b40] __schedule+0x230/0x720
    [ 243.681418] [c00000001a603950] [c000000010ec6090] schedule+0x60/0x110
    [ 243.681435] [c00000001a603980] [c000000010ecd948] schedule_timeout+0x168/0x1c0
    [ 243.681454] [c00000001a603a60] [c000000010ec7214] __wait_for_common+0x134/0x360
    [ 243.681473] [c00000001a603b00] [c00000001017c98c] __flush_work.isra.0+0x1dc/0x3d0
    [ 243.681493] [c00000001a603ba0] [c0000000105cbd88] fsnotify_wait_marks_destroyed+0x28/0x40
    [ 243.681512] [c00000001a603bc0] [c0000000105cb800] fsnotify_destroy_group+0x60/0x150
    [ 243.681531] [c00000001a603c30] [c0000000105cf640] inotify_release+0x30/0xa0
    [ 243.681548] [c00000001a603ca0] [c00000001054fad8] __fput+0xc8/0x350
    [ 243.681565] [c00000001a603cf0] [c000000010183174] task_work_run+0xe4/0x160
    [ 243.681583] [c00000001a603d40] [c000000010021874] do_notify_resume+0x134/0x140
    [ 243.681602] [c00000001a603d70] [c000000010030168] interrupt_exit_user_prepare_main+0x198/0x270
    [ 243.681622] [c00000001a603de0] [c0000000100305ac] syscall_exit_prepare+0x6c/0x180
    [ 243.681641] [c00000001a603e10] [c00000001000bff4] system_call_vectored_common+0xf4/0x278
    [ 243.681661] --- interrupt: 3000 at 0x7fffb3015ba4
    [ 243.681673] NIP: 00007fffb3015ba4 LR: 0000000000000000 CTR: 0000000000000000
    [ 243.681687] REGS: c00000001a603e80 TRAP: 3000 Not tainted (5.14.0-167.el9.ppc64le)
    [ 243.681703] MSR: 800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE> CR: 42044440 XER: 00000000
    [ 243.681737] IRQMASK: 0
    [ 243.681737] GPR00: 0000000000000006 00007fffd24a31a0 00007fffb3127200 0000000000000000
    [ 243.681737] GPR04: 0000000000000002 000000000000000a 0000000000000000 0000000000000000
    [ 243.681737] GPR08: 0000010009ea2d40 0000000000000000 0000000000000000 0000000000000000
    [ 243.681737] GPR12: 0000000000000000 00007fffb3834bc0 0000000000000000 0000000000000000
    [ 243.681737] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    [ 243.681737] GPR20: 000000012c74ddf0 000000000000000e 000000000017cd3f 0000000000000000
    [ 243.681737] GPR24: 00007fffd24a3570 0000000000000005 0000010009eb5490 0000010009ea24e0
    [ 243.681737] GPR28: 0000010009ea2900 0000010009eb4850 0000010009ea2d70 00007fffb382dd98
    [ 243.681896] NIP [00007fffb3015ba4] 0x7fffb3015ba4
    [ 243.681907] LR [0000000000000000] 0x0
    [ 243.681917] --- interrupt: 3000
    [ 243.681928] INFO: task kworker/u16:1:34 blocked for more than 122 seconds.
    [ 243.681941] Not tainted 5.14.0-167.el9.ppc64le #1
    [ 243.681951] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [ 243.681964] task:kworker/u16:1 state:D stack: 0 pid: 34 ppid: 2 flags:0x00000800
    [ 243.681982] Workqueue: events_unbound fsnotify_mark_destroy_workfn
    [ 243.681998] Call Trace:
    [ 243.682005] [c00000001a9336d0] [c00000004f990100] 0xc00000004f990100 (unreliable)
    [ 243.682023] [c00000001a9338c0] [c00000001001e9cc] __switch_to+0x12c/0x220
    [ 243.682042] [c00000001a933920] [c000000010ec5b40] __schedule+0x230/0x720
    [ 243.682059] [c00000001a9339e0] [c000000010ec6090] schedule+0x60/0x110
    [ 243.682075] [c00000001a933a10] [c000000010ecd948] schedule_timeout+0x168/0x1c0
    [ 243.682094] [c00000001a933af0] [c000000010ec7214] __wait_for_common+0x134/0x360
    [ 243.682113] [c00000001a933b90] [c000000010213370] __synchronize_srcu.part.0+0xa0/0xe0
    [ 243.682132] [c00000001a933c00] [c0000000105cc154] fsnotify_mark_destroy_workfn+0xc4/0x1a0
    [ 243.682151] [c00000001a933c70] [c00000001017acb8] process_one_work+0x298/0x580
    [ 243.682169] [c00000001a933d10] [c00000001017b048] worker_thread+0xa8/0x630
    [ 243.682185] [c00000001a933da0] [c000000010188348] kthread+0x1b8/0x1c0
    [ 243.682203] [c00000001a933e10] [c00000001000cd64] ret_from_kernel_thread+0x5c/0x64
    [ 366.561279] INFO: task systemd:1 blocked for more than 245 seconds.

The right fix belongs in the kernel, but since the SRCU patch [1] will not be merged into mainline in the near future, a userspace workaround is needed to overcome this test blocker.

The workaround is to pass the kernel parameter "srcutree.big_cpu_lim=0", so that the SRCU system always uses the srcu_node array.

[1]: https://lore.kernel.org/rcu/20221026032716.78674-1-kernelfans@gmail.com/T/#m6534975507c2abca497a94d81c7abbfea1d0978d

Signed-off-by: Pingfan Liu <piliu@redhat.com> |
||
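The workaround described in 870ec2ec93 amounts to making sure `srcutree.big_cpu_lim=0` is present on the kdump kernel command line. A minimal shell sketch of that idea follows; the helper name `append_cmdline_param` is hypothetical and the sketch does not reproduce how the RHEL-only patch wired the parameter in.

```shell
#!/bin/sh
# Hypothetical helper: append a parameter ($2) to a kernel command line ($1)
# unless that exact parameter is already present as a whole word.
append_cmdline_param() {
    case " $1 " in
        *" $2 "*)
            # Already present: return the command line unchanged.
            printf '%s\n' "$1"
            ;;
        *)
            # Append, inserting a separating space only if $1 is non-empty.
            printf '%s\n' "${1:+$1 }$2"
            ;;
    esac
}
```

For example, `append_cmdline_param "irqpoll nr_cpus=1" "srcutree.big_cpu_lim=0"` yields `irqpoll nr_cpus=1 srcutree.big_cpu_lim=0`, and calling it again on that result leaves the string unchanged.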
Lichen Liu
|
fcca486525 |
kdump.sysconfig*: add ignition.firstboot to KDUMP_COMMANDLINE_REMOVE
Resolves: bz2090533
Upstream: Fedora
Conflict: None
commit
|
||
Tao Liu
|
2b7d3aa34d |
Disable CMA in kdump 2nd kernel
Resolves: bz1950885
Upstream: fedora
Conflict: none
commit
|
||
DistroBaker
|
17a51515f0 |
Merged update from upstream sources
This is an automated DistroBaker update from upstream sources. If you do not know what this is about or would like to opt out, contact the OSCI team. Source: https://src.fedoraproject.org/rpms/kexec-tools.git#4f492cf73ea11ff74f5b062e18fcea45cb5e7eeb |
||
DistroBaker
|
5cac7c3f96 |
Merged update from upstream sources
This is an automated DistroBaker update from upstream sources. If you do not know what this is about or would like to opt out, contact the OSCI team. Source: https://src.fedoraproject.org/rpms/kexec-tools.git#bfd06661e81465d077bac435c90b4082134adf19 |
||
Petr Šabata
|
f5bf4978d8 |
RHEL 9.0.0 Alpha bootstrap
The content of this branch was automatically imported from Fedora ELN with the following as its source: https://src.fedoraproject.org/rpms/kexec-tools#041ba89902961b5490a7143d9596dc00d732cba0 |