426b9269b8
We should use PatchXX to apply upstream kdump-utils patches instead of directly merging them into git.

Resolves: RHEL-36415
Resolves: RHEL-37670
Upstream: https://github.com/rhkdump/kdump-utils/
Conflict: None
Signed-off-by: Lichen Liu <lichliu@redhat.com>
62 lines
2.7 KiB
Diff
From ada6f5edf1ae06fc88759aa2f94d09e2a98d21ef Mon Sep 17 00:00:00 2001
From: Tao Liu <ltao@redhat.com>
Date: Wed, 1 May 2024 16:53:19 +0800
Subject: [PATCH 6/7] sysconfig: add pcie_ports compat to
 KDUMP_COMMANDLINE_APPEND on x86_64

There have been some cases of kdump failing in the 2nd kernel, where
usually only one CPU is enabled by "nr_cpus=1" but there is a large
number of devices, which may easily exceed the IRQ resources that one
CPU can handle. As a result, the 2nd kernel hangs and kdump fails. This
issue is often observed on machines with many CPUs and many devices.

On those systems, pcieports consume a sizable proportion of the IRQ
resources; many messages like the following can be seen in the dmesg log:

    pcieport 0000:18:01.0: PME: Signaling with IRQ 109

According to the kernel doc[1], applying "pcie_ports=compat" disables
native PCIe services (PME, AER, DPC, PCIe hotplug). Those services cover
power management events, error reporting, downstream port containment,
and hotplug, none of which is a must-have for kdump. In addition,
testing showed no side effects, such as being unable to write the
vmcore to sdx, nvme, etc.

This patch disables native PCIe services for the 2nd kernel to save
scarce IRQ resources and increase the kdump success rate.

Attaching Prarit's comments:

This makes sense to me. The only concern anyone should have is that a PCIE
error could have been responsible for taking down the kernel in the first
place, and booting into the second kernel could then also have a fatal
problem. I'm not sure we can ever fix that type of cascade of panics :)
so it makes sense to disable these features.

[1]: https://www.kernel.org/doc/html/v6.9-rc1/admin-guide/kernel-parameters.html

Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
---
 gen-kdump-sysconfig.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gen-kdump-sysconfig.sh b/gen-kdump-sysconfig.sh
index 78b0bb7..1a2cd92 100755
--- a/gen-kdump-sysconfig.sh
+++ b/gen-kdump-sysconfig.sh
@@ -104,7 +104,7 @@ s390x)
 x86_64)
 	update_param KEXEC_ARGS "-s"
 	update_param KDUMP_COMMANDLINE_APPEND \
-		"irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 acpi_no_memhotplug transparent_hugepage=never nokaslr hest_disable novmcoredd cma=0 hugetlb_cma=0"
+		"irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 acpi_no_memhotplug transparent_hugepage=never nokaslr hest_disable novmcoredd cma=0 hugetlb_cma=0 pcie_ports=compat"
 	;;
 *)
 	echo "Warning: Unknown architecture '$1', using default sysconfig template." >&2
--
2.44.0
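As a rough illustration of what the patch does, the sketch below shows two small POSIX shell helpers: one that appends a kernel parameter such as "pcie_ports=compat" to a command-line string only if it is not already there, and one that counts "pcieport ... IRQ N" lines in a dmesg-style log. Both function names (append_param_once, count_pcieport_irqs) are hypothetical and are not part of kdump-utils; the patch itself simply edits the literal string passed to update_param.

```shell
# Hypothetical helpers for illustration only; not part of kdump-utils.

# Print the command line in $1 with the parameter in $2 appended,
# unless that exact parameter is already present as a whole word.
append_param_once() {
    cmdline=$1
    param=$2
    case " $cmdline " in
    *" $param "*) printf '%s\n' "$cmdline" ;;
    *) printf '%s\n' "${cmdline:+$cmdline }$param" ;;
    esac
}

# Read a dmesg-style log on stdin and print how many lines report a
# pcieport service being assigned an IRQ.
count_pcieport_irqs() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c 'pcieport .* IRQ [0-9]' || true
}
```

For example, `dmesg | count_pcieport_irqs` on the first kernel gives a rough idea of how many IRQs the PCIe port services claim, and `append_param_once "$KDUMP_COMMANDLINE_APPEND" "pcie_ports=compat"` shows the resulting command line without duplicating the parameter when it is already set.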