Commit Graph

4 Commits

Author SHA1 Message Date
Tao Liu
60e22375f8
sysconfig: add pcie_ports compat to KDUMP_COMMANDLINE_APPEND on x86_64
There have been some of failing cases of kdump in 2nd kernel, where
ususally only one cpu is enabled by "nr_cpus=1", but with a large
number of devices, which may easily exceed the maximum IRQ resources of
one cpu can handle. As a result, the 2nd kernel will hang and kdump
fails. This issue is often observed on machines with many cpus and many
devices.

On those systems, pcieports consume quite proportion of IRQ resources,
many following message can be seen in dmesg log:

   pcieport 0000:18:01.0: PME: Signaling with IRQ 109

According to kernel doc[1], when "pcie_ports=compat" applied, it will disable
native PCIe services (PME, AER, DPC, PCIe hotplug). Those functions are
power management events, error reporting, performance, hotplug related,
which are not the must-have functions for kdump. In addition, after
testing, no side effects such as cannot writing vmcore into sdx, nvme
etc been noticed.

This patch will disable native PCIe services for 2nd kernel, to saving the
scarce IRQ resources and increase the kdump success.

Attach Prarit's comments:

This makes sense to me. The only concern anyone should have is that a PCIE
error could have been responsible for taking down the kernel in the first
place, and booting into the second kernel could then also have a fatal
problem. I'm not sure we can ever fix that type of cascade of panics :)
so it makes sense to disable these features.

[1]: https://www.kernel.org/doc/html/v6.9-rc1/admin-guide/kernel-parameters.html

Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2024-05-31 11:32:26 +08:00
Lichen Liu
29fe563644 kdump.conf: redirect unknown architecture warning to stderr
The warning messages should not be included in the generated files.
Redirecting the warning for an unknown architecture to stderr.

Signed-off-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-06-09 10:19:04 +08:00
Philipp Rudo
d9dfea12da sysconfig: add zfcp.allow_lun_scan to KDUMP_COMMANDLINE_REMOVE on s390
Probing unnecessary I/O devices wastes memory and in extreme cases can
cause the crashkernel to run OOM. That's why the s390-tools maintain
their own module, 95zdev-kdump [1], that disables auto LUN scanning and
only configures zfcp devices that can be used as dump target. So remove
zfcp.allow_lun_scan from the kernel command line to prevent that we
accidentally overwrite the default set by the module.

[1] https://github.com/ibm-s390-linux/s390-tools/blob/master/zdev/dracut/95zdev-kdump/module-setup.sh

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-03-13 15:30:00 +08:00
Kairui Song
677da8a59b sysconfig: use a simple generator script to maintain
These kdump.sysconfig.* files are almost identical with a bit difference
in several parameters, just use a simple script to generate them upon
packaging. This should make it easier to maintain, updating a comment or
param for a certain arch can be done in one place.

There are only some comment or empty option differences with the generated
version because some arch's sysconfig is not up-to-dated, this actually
fixes the issue, I used the following script to check these differences:

 # for arch in aarch64 i386 ppc64 ppc64le s390x x86_64; do
   ./gen-kdump-sysconfig.sh $arch > kdump.sysconfig.$arch.new
   git checkout HEAD^ kdump.sysconfig.$arch &>/dev/null
   echo "$arch:"
   diff kdump.sysconfig.$arch kdump.sysconfig.$arch.new; echo ""
   done; git reset;

Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-08-16 14:35:35 +08:00