Resolves: RHEL-42442
Resolves: RHEL-22171
commit 2c741555d9749e9a137378332e561382f9e25739
Author: Philipp Rudo <prudo@redhat.com>
Date: Mon Jul 1 12:52:39 2024 +0200
kdumpctl.8: Add description to reset-crashkernel --reboot
There is no description for parameter --reboot for reset-crashkernel.
Thus add one.
Suggested-by: Lichen Liu <lichliu@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Resolves: RHEL-42442
Resolves: RHEL-22171
Upstream: RHEL-only
This bug was fixed upstream in 574f8f5 ("kdumpctl: Simplify fadump
handling in reset_crashkernel"). Backporting this commit to RHEL9 would
require also to backport other commits that would change the behavior of
reset-crashkernel, e.g. no longer setting the default kernel command
line in the grub config. These changes are too invasive for RHEL9. Thus
go with a minimalistic RHEL-only fix.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Resolves: RHEL-39494
Conflicts: Small difference in context of 2nd hunk.
commit 3028529915d3026e62b59d8f3faadddd410baa75
Author: Philipp Rudo <prudo@redhat.com>
Date: Fri Jun 14 11:48:24 2024 +0200
kdumpctl: Drop default kexec '-d' option
Kernel commits cbc2fe9d9cb2 ("kexec_file: add kexec_file flag to control
debug printing") and a85ee18c7900 ("kexec_file: print out debugging
message if required") added debug messages to the kexec_file_load system
call when option -d is provided to the kexec user space tool. As
kexec_file_load is the default and option -d is set by default these
messages are always printed when a crash kernel is loaded. This not only
clutters the kernel log but also potentially leaks confidential kernel
information to users. As the messages are printed to the kernel log, not
stderr, the redirection to /var/log/kdump.log won't catch them. This
will become even more problematic as for RHEL10 the kernel will be built
without support for the kexec_load system call. So kexec_file_load will
be the only choice in the future.
The redirection also caused confusion in a recent bug report. There a
user moved a working /etc/sysconfig/kdump from ppc to s390 with
KEXEC_ARGS containing the --dt-no-old-root option. This option is arch
specific and does not exist on s390. Thus the kexec-tools failed with an
'unrecognized option' error followed by the usage(). The problem was
that the 'unrecognized option' error is printed to stderr, which got
redirected to /var/log/kdump.log, while the usage() is printed to
stdout, which ended up in the systemd journal. This caused confusion as
the user only checked the journal and found the usage() without any
error message.
Thus remove the default -d option and the redirection of stderr to
/var/log/kdump.log for the kexec-tools user space tool.
This commit ultimately reverts 88a8b94 ("kdumpctl: add the '-d' option to
enable the kexec loading debugging messages").
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-43581
Upstream: Fedora
Conflict: Applied by manual
commit 44a1b7da908a52c15a2b7ed286b59cfe7319b4c9
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date: Wed Feb 28 22:51:15 2024 +0530
ppc64le: replace kernel cmdline maxcpu=1 with nr_cpus=1
With patch series [1], PowerPC supports nr_cpus=1,
so use nr_cpus=1 instead of maxcpu=1 in the kdump environment.
Note this changes is dependent on kernel changes [1]
[1] https://lore.kernel.org/all/170800202447.601034.7290612623478478380.b4-ty@ellerman.id.au/#t
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Upstream: fedora
Resolves: RHEL-3929
Conflict: Yes, for fedora there is no kdump.sysconfig.x86_64,
but gen-kdump-sysconfig.sh. So for backporting, the
modification is made on kdump.sysconfig.x86_64.
commit ada6f5edf1ae06fc88759aa2f94d09e2a98d21ef
Author: Tao Liu <ltao@redhat.com>
Date: Wed May 1 16:53:19 2024 +0800
sysconfig: add pcie_ports compat to KDUMP_COMMANDLINE_APPEND on x86_64
There have been some of failing cases of kdump in 2nd kernel, where
ususally only one cpu is enabled by "nr_cpus=1", but with a large
number of devices, which may easily exceed the maximum IRQ resources of
one cpu can handle. As a result, the 2nd kernel will hang and kdump
fails. This issue is often observed on machines with many cpus and many
devices.
On those systems, pcieports consume quite proportion of IRQ resources,
many following message can be seen in dmesg log:
pcieport 0000:18:01.0: PME: Signaling with IRQ 109
According to kernel doc[1], when "pcie_ports=compat" applied, it will disable
native PCIe services (PME, AER, DPC, PCIe hotplug). Those functions are
power management events, error reporting, performance, hotplug related,
which are not the must-have functions for kdump. In addition, after
testing, no side effects such as cannot writing vmcore into sdx, nvme
etc been noticed.
This patch will disable native PCIe services for 2nd kernel, to saving the
scarce IRQ resources and increase the kdump success.
Attach Prarit's comments:
This makes sense to me. The only concern anyone should have is that a PCIE
error could have been responsible for taking down the kernel in the first
place, and booting into the second kernel could then also have a fatal
problem. I'm not sure we can ever fix that type of cascade of panics :)
so it makes sense to disable these features.
[1]: https://www.kernel.org/doc/html/v6.9-rc1/admin-guide/kernel-parameters.html
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-35811
Conflict: None
Upstream Status: git@github.com:rhkdump/kdump-utils.git
commit 247c7a5f39b305f9a83bad2d936d00237165b7e0
Author: Mamoru Nishibe (Fujitsu) <nishibe.mamoru@fujitsu.com>
Date: Wed Apr 24 08:11:12 2024 +0000
mkdumprd: Fix makedumpfile parameter check.
If only "makedumpfile" is written in "core_collector" of /etc/kdump.conf
and try to run makedumpfile without options,
"makedumpfile --check-params" fails and terminates abnormally.
# grep ^core_collector /etc/kdump.conf
core_collector makedumpfile
# /usr/bin/kdumpctl start
:
Commandline parameter is invalid.
Try `makedumpfile --help' for more information.
kdump: makedumpfile parameter check failed.
kdump: mkdumprd: failed to make kdump initrd
kdump: Starting kdump: [FAILED]
On the other hand, "makedumpfile --check-params" works fine without any options.
# makedumpfile --check-params vmcore dumpfile
# echo $?
0
In addition, before verify_core_collector() was implemented,
initial RAM for kdump was successfully created using only "core_collector makedumpfile".
I consider it a regression.
This is due to a parameter extraction error in verify_core_collector().
Fix it to correctly extract only the options as follows.
Fixes: a1c28126 ("mkdumprd: Use makedumpfile --check-params option")
Signed-off-by: Mamoru Nishibe <nishibe.mamoru@fujitsu.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Related: https://issues.redhat.com/browse/RHEL-7028
Conflict: None
Upstream Status: git@github.com:rhkdump/kdump-utils.git
commit 7a8edc8de67dccae23b01461bc3b17c0ad42aa5f
Author: Coiby Xu <coxu@redhat.com>
Date: Wed Sep 27 09:31:39 2023 +0800
Install the driver of physical device for a SR-IOV virtual device
Currently, network dumping failed over a NIC that is a Single Root I/O
Virtualization (SR-IOV) virtual device. Usually the driver of the
virtual device won't specify the dependency on the driver of the
physical device. So to fix this issue, the driver of the physical device
needs to be found and installed as well.
Fixes: a65dde2d ("Reduce kdump memory consumption by only installing needed NIC drivers")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-7028
Conflict: None
Upstream Status: git@github.com:rhkdump/kdump-utils.git
commit d057153a1c3c36612a14143b29c0ff0be34e4fc2
Author: Coiby Xu <coxu@redhat.com>
Date: Thu Sep 21 11:50:14 2023 +0800
Try to install PHY and MDIO bus drivers explicitly
Resolves: https://issues.redhat.com/browse/RHEL-7028
Currently, nfs dumping fails on some machines that has a dedicated PHY
driver (dealing with the physical layer) or MDIO bus (connecting the MAC
to PHY devices) driver. This is because kexec-tools doesn't install
dedicated PHY or MDIO driver explicitly. Usually a NIC driver shouldn't
specify the dependency on the needed PHY or MDIO driver because it
shouldn't a NIC (medium access control, MAC) driver is for dealing with
the Data link layer and a PHY driver is for physical layer. So as long
as a MAC driver can talk to the PHY layer via APIs, it shouldn't care
which PHY driver or device it's talking to. So when the
dependency on a PHY driver or MDIO driver is not found by dracut's
instmods, the PHY or MDIO driver won't be installed.
This patch passes =drivers/net/phy and =drivers/net/mdio to dracut's
instmods which will only install in-use PHY or MDIO driver(s).
Note ideally we should find out which PHY driver is used by a NIC but
unfortunately currently no universal way can be found
(/sys/class/net/NIC_NAME/phydev/driver/module can be used to find the
name of the PHY driver for some NICs but it doesn't exist for some NICs
like Qualcomm Atheros AR8031). So is it for a MDIO bus driver.
Fortunately currently no huge memory consumption is found for a PHY or
MDIO driver.
Fixes: a65dde2d ("Reduce kdump memory consumption by only installing needed NIC drivers")
Reported-by: Doreen Alongi <dalongi@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-35640
Upstream Status: git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
Conflict: None
commit 9d9cf8de8b2ad8273861a30476a46f34cd34871a
Author: Baoquan He <bhe@redhat.com>
Date: Tue Nov 14 23:20:30 2023 +0800
kexec_file: add kexec_file flag to support debug printing
This add KEXEC_FILE_DEBUG to kexec_file_flags so that it can be passed
to kernel when '-d' is added with kexec_file_load interface. With that
flag enabled, kernel can enable the debugging message printing.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Baoquan He <bhe@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-13996
Upstream: Fedora
Conflict: None
commit 468336700d
Author: Lichen Liu <lichliu@redhat.com>
Date: Mon Jan 22 15:59:09 2024 +0800
dracut-module-setup: Skip initrd-cleanup and initrd-parse-etc in kdump
When using multipath devices as the target for kdump, if user_friendly_name
is also specified, devices default to names like "mpath*", e.g., mpatha.
In dracut, we obtain a persistent device name via get_persistent_dev. However,
dracut currently believes using /dev/mapper/mpath* could cause issues, thus
alternatively names are used, here it's /dev/disk/by-uuid/<FS_UUID>.
During the kdump boot progress, the /dev/disk/by-uuid/<FS_UUID> will exist as
soon as one of the path devices exists, but it won't be usable by systemd,
since multipathd will claim that device as a path device. Then multipathd will
get stopped before it can create the multipath device.
Without user_friendly_name, /dev/mapper/<WWID> is considered a persistent
device name, avoiding the issue.
The exit of multipathd is due to two dependencies in the current dracut module
90multipath/multipathd.service, "Before=initrd-cleanup.service" and
"Conflicts=initrd-cleanup.service".
As per man 5 systemd.unit, if A.service has "Conflicts=B.service", starting
B.service will stop A.service.
This is useful during normal boot. However, we will never switch-root after
capturing vmcore in kdump.
We need to ensure that multipathd is not killed due to such dependency issue.
Without modifying multipathd.service, we add ConditionPathExists=!/proc/vmcore
to skip initrd-cleanup.service in kdump. This approach is beneficial as
it avoid the potential termination of other services that conflict with
initrd-cleanup.service. Also skip initrd-parse-etc.service as it will try to
start initrd-cleanup.service. Both of these services are used for switch root,
so they can be safely skipped in kdump.
Suggested-by: Benjamin Marzinski <bmarzins@redhat.com>
Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-17451
Upstream: Fedora
Conflict: None
commit c752cbb2d3
Author: Coiby Xu <coxu@redhat.com>
Date: Wed Dec 6 16:15:50 2023 +0800
Explain the auto_reset_crashkernel option in more details
Resolves: https://issues.redhat.com/browse/RHEL-17451
Explain what factors affect the default crashkernel value and ask users
to reset it manually if needed.
Cc: Baoquan He <bhe@redhat.com>
Cc: Jie Li <jieli@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-11897
Upstream: Fedora
Conflict: None
commit 38d9990389
Author: Coiby Xu <coxu@redhat.com>
Date: Tue Dec 26 11:17:29 2023 +0800
Use the same /etc/resolve.conf in kdump initrd if it's managed manually
Resolves: https://issues.redhat.com/browse/RHEL-11897
Previously fix 0177e248 ("Use the same /etc/resolve.conf in kdump initrd
if it's managed manually") is problematic,
1) it generated .conf file unrecognized by NetowrkManager ;
2) this .conf file was installed to current file system instead of to the kdump initrd;
3) this incorrect .conf file prevented the starting of NetworkManager.
This patch fixes the above issues and also suppresses a harmless warning
when systemd-resolved.service doesn't exist,
# systemctl -q is-enabled systemd-resolved
Failed to get unit file state for systemd-resolved.service: No such file or directory
Fixes: 0177e248 ("Use the same /etc/resolve.conf in kdump initrd if it's managed manually")
Reported-by: Jie Li <jieli@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Reviewed-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-10484
Upstream: Fedora
Conflict: Missing upstream patch d4e8772("kdumpctl: make do_estimate more
robust")
commit 741861164e
Author: Lichen Liu <lichliu@redhat.com>
Date: Mon Oct 30 14:51:59 2023 +0800
kdumpctl: Only returns immediately after an error occurs in check_*_modified
Currently is_system_modified will return immediately when check_*_modified
return a non-zero value, and the remaining checks will not be executed.
For example, if there is a fs-related error exists, and someone changes the
kdump.conf, check_files_modified will return 1 and is_system_modified will
return 1 immediately. This will cause kdumpctl to skip check_fs/drivers_modified,
kdump.service will rebuild the initrd and start successfully, however, any
errors should prevent kdump.service from starting.
This patch will cause check_*_modifed to continue running until an error occurs
or all execution ends.
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Acked-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-8727
Resolves: https://issues.redhat.com/browse/RHEL-8710
Upstream: Fedora
Conflict: The 1st hunk is merged manualy because of conflict caused
by context.
commit 4841bc6a6d
Author: Baoquan He <bhe@redhat.com>
Date: Thu Sep 21 18:01:02 2023 +0800
kdump-lib.sh: add extra 64M to default crashkernel if sme/sev is active
It's reported that kdump kernel failed to boot and can't dump vmcore
when crashkernel=192M and SME/SEV is active.
This is because swiotlb will be enabled and reserves 64M memory by
default on system with SME/SEV enabled. Then kdump kernel will be out of
memory after taking 64M away for swiotlb init.
So here add extra 64M memory to default crashkernel value so that kdump
kernel can function well as before. When doing that, search journalctl
for the "Memory Encryption Features active: AMD" to check if SME or SEV
is active. This line of log is printed out in kernel function as below
and the type SME is mutual exclusive with type SEV.
***:
arch/x86/mm/mem_encrypt.c:print_mem_encrypt_feature_info()
Note:
1) The conditional check is relying on journalctl log because I didn't
find available system interface to check if SEV is active. Even
though we can check if SME is active via /proc/cpuinfo. For
consistency, I take the same check for both SME and SEV by searching
journalctl.
2) The conditional check is relying on journalctl log, means it won't
work for crashkernel setting in anoconda because the installation
kernel doesn't have the SME/SEV setting. So customer need manually
run 'kdumpctl reset-crashkernel' to reset crashkernel to add the
extra 64M after OS installation.
3) We need watch the line of log printing in
print_mem_encrypt_feature_info() in kernel just in case people may
change it in the future.
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-8727
Resolves: https://issues.redhat.com/browse/RHEL-8710
Upstream: Fedora
Conflict: spec/kdump-lib_spec.sh doesn't exist in RHEL, so I remove
the hunk directly.
commit 64f2827a4b
Author: Philipp Rudo <prudo@redhat.com>
Date: Wed Sep 6 10:49:47 2023 +0200
kdump-lib: Harden _crashkernel_add
_crashkernel_add currently always assumes the good case, i.e. that the
value of the crashkernel parameter has the correct syntax and that the
delta added is a number. Both doesn't have to be true when the values
are provided by users. Thus add some additional checks.
Furthermore require the delta to have a explicit unit, i.e. no longer
assume that is in megabytes, i.e. 100 -> 100M.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-14003
Upstream: Fedora
Conflict: None
commit 4fa17b2ee4
Author: Nayna Jain <nayna@linux.ibm.com>
Date: Tue Oct 3 23:41:47 2023 -0400
powerpc: update kdumpctl to load kernel signing key for fadump
On secure boot enabled systems with static keys, kexec with kexec_file_load(-s)
fails as "Permission Denied" when fadump is enabled.
Similar to kdump, load kernel signing key for fadump as well.
Reported-by: Sachin P Bappalige <sachinpb@linux.vnet.ibm.com>
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-14003
Upstream: Fedora
Conflict: None
commit fe6eb30e67
Author: Nayna Jain <nayna@linux.ibm.com>
Date: Tue Oct 3 23:41:46 2023 -0400
powerpc: update kdumpctl to remove deletion of kernel signing key once loaded
Kernel signing key is deleted once kdump is loaded. This causes confusion in
debugging since key is no longer visible. Unless someone knows how
kdumpctl script works, it is difficult to find out how kdump could be
loaded when there is no key on .ima keyring.
Remove deletion of kernel signing key once loaded. And then to prevent
multiple loading of same key when kdump service is disabled/enabled, update
key description field as well.
Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Resolves: RHEL-14000
Upstream: https://github.com/horms/kexec-tools
Conflict: None
commit bd0200c47c45dd420244b39ddabcecdab1fb9a8e
Author: Hari Bathini <hbathini@linux.ibm.com>
Date: Wed Sep 20 17:29:27 2023 +0530
kexec: update manpage with explicit mention of clean kexec
While the manpage does mention about kexec boot with a clean shutdown,
it is not explicit about it. Make it explicit.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: bz2232499
Upstream: Fedora Rawhide
Conflict: None
commit 4b7b7736ee
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date: Wed Aug 2 20:36:48 2023 +0530
Introduce a function to get reserved memory size
The size of the reserved memory in the functions show_reserved_mem,
check_crash_mem_reserved, and do_estimate are fetched from the sysfs
node `/sys/kernel/kexec_crash_size`. However, in the case of fadump,
the reserved area size is instead present in
/sys/kernel/fadump/mem_reserved.
For example:
$ kdumpctl showmem
kdump: Dump mode is fadump
kdump: Reserved 0MB memory for crash kernel
The above command showed 0MB for Reserved memory which is incorrect, the
actual reservation was 2048MB.
To resolve this issue a new helper function is introduced to fetch
reserved memory size based on the dump mode. For "fadump" mode,
it looks in `/sys/kernel/fadump/mem_reserved`, otherwise, it uses
`/sys/kernel/kexec_crash_size`. And all functions that previously
fetching reserved memory directly from `/sys/kernel/kexec_crash_size`
sysfs node are now updated to use this new function to get the reserved
memory size.
With the fix in place, the `kdumpctl showmem` command will now display
correct reserved memory size.
$ kdumpctl showmem
kdump: Dump mode is fadump
kdump: Reserved 2048MB memory for crash kernel
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reported-by: Sachin P Bappalige <sachinpb@linux.vnet.ibm.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: bz2232499
Upstream: Fedora Rawhide
Conflict: Some newer patches has been rebased, which caused git am to
encounter some problems.
commit fc7c65312a
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date: Thu Aug 17 16:38:35 2023 +0530
powerpc: update fadump sysfs node path
The fadump sysfs nodes /sys/kernel/fadump_[enabled|registered], have
been relocated to /sys/kernel/fadump/[enabled|registered] by kernel
commits d418b19f34ed ("powerpc/fadump: Reorganize /sys/kernel/fadump_*
sysfs files").
To ensure compatibility, symbolic links were added for each relocated
sysfs entry. Nonetheless, note that these symbolic links might be
removed later, as they have been deprecated by kernel commit
3f5f1f22ef10 ("Documentation/ABI: Mark /sys/kernel/fadump_* sysfs files
deprecated")
This patch updates the scripts to use the updated fadump sysfs files.
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: bz2232499
Upstream: Fedora Rawhide
Conflict: Some newer patches has been rebased, which caused git am to
encounter some problems.
commit b9fd7a4076
Author: Philipp Rudo <prudo@redhat.com>
Date: Thu Jan 12 16:31:02 2023 +0100
kdumpctl: merge check_current_{kdump,fadump}_status
Both functions are almost identical. The only differences are (1) the
sysfs node the status is read from and (2) the fact the fadump version
doesn't verify if the file it's trying to read actually exists. Thus
merge the two functions and get rid of the check_current_status wrapper.
While at it rename the function to is_kernel_loaded which explains
better what the function does.
Finally, after moving FADUMP_REGISTER_SYS_NODE shellcheck can no longer
access the definition and starts complaining about it not being quoted.
Thus quote all uses of FADUMP_REGISTER_SYS_NODE.
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: bz2165018
Upstream: Fedora Rawhide
Conflict: None
commit daa829f79e
Author: Lichen Liu <lichliu@redhat.com>
Date: Mon Jun 12 17:17:43 2023 +0800
spec: kdump/ppc64: make servicelog_notify silent when there are no errors
There is confusing message in /var/log/anaconda/packaging.log when installing
kexec-tools during the system installation on ppc64le:
Event Notification Registration successful (id: 1)
Make servicelog_notify slient when there are no erros.
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2215606
Upstream: Fedora Rawhide
Conflict: None
commit dda81d72c2
Author: Philipp Rudo <prudo@redhat.com>
Date: Mon Jun 19 14:31:48 2023 +0200
kdumpctl: Fix temporary directory location
The temporary directory is currently created under the current working
directory. That alone isn't ideal but works most of the time. However,
it will fail when the current working directory is not writable. So make
sure the directory is created within TMPDIR.
Fixes: ea00b7d ("kdumpctl: Move temp file in get_kernel_size to global temp dir")
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2165839
Upstream: Fedora
Conflict: None
commit f3139012f2
Author: Pingfan Liu <piliu@redhat.com>
Date: Tue Jun 20 08:50:31 2023 +0800
kdump-lib: Match 64k debug kernel in prepare_kdump_bootinfo()
For kernel 64k variant, it terminates with substring 64k-debug, e.g.
vmlinuz-5.14.0-327.el9.aarch64+64k-debug.
Providing an extra matching pattern to filter out it.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2160676
Upstream: Fedora rawhide
Conflict: None
commit 64d93c886f
Author: Pingfan Liu <piliu@redhat.com>
Date: Fri Jun 9 16:04:29 2023 +0800
kdumpctl: Fix the matching of plus symbol by grep's EREs
After introducing 64k variant kernel on aarch64, an example kernel name
looks like "vmlinuz-5.14.0-316.el9.aarch64+64k". To match the plus
symbol, it demands an escape charater.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2160676
Upstream: Fedora rawhide
Conflict: None
commit 7a2c4cbc3b
Author: Pingfan Liu <piliu@redhat.com>
Date: Tue Jun 13 17:43:23 2023 +0800
kdump-lib: Evaluate the memory consumption by smmu and mlx5 separately
On 4k and 64k kernels, the typical consumption values for SMMU are 36MB
and 384MB, respectively. Hence for 64k kernel, the consumption by smmu
should be taken into account carefully.
To do it by adding the extra 384MB value if installing a 64k kernel.
The upper limit value 384MB is calculated according to the formula in
the kernel smmu driver.
As for mlx5 network cards, it is measured by a pratical test, 200M for
64k variant, 150M for 4k variant
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2160676
Upstream: Fedora rawhide
Conflict: None
commit 05c4861443
Author: Pingfan Liu <piliu at redhat.com>
Date: Tue Jun 13 17:43:22 2023 +0800
kdump-lib: add support for 64K aarch64
On aarch64, both 4K and 64K kernel can be installed, while they demand
different size reserved memory for kdump kernel.
'get_conf PAGE_SIZE' can not work if installing a 64K kernel when
running a 4K kernel. Hence resorting to the kernel release naming rules.
At present, the 64K kernel has the keyword '64k' in its suffix.
The base line for 64K is decided based on 4K. The diff 100M is picked up
since on a high end machine without smmu enabled, the diff of MemFree is
82M.
As for the smmu case, a huge difference in the memory consumption lies
between 64k and 4k driver. And it should be calculated separatedly.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2160676
Upstream: Fedora rawhide
Conflict: None
commit d8b961be37
Author: Pingfan Liu <piliu@redhat.com>
Date: Tue Jun 13 17:43:21 2023 +0800
kdump-lib: Introduce parse_kver_from_path() to get kernel version from its path name
kdump_get_arch_recommend_crashkernel() expects the kernel version info,
while _update_kernel() provides the absolute path, which contains the
kernel version info.
This patch introduce a dedicated function parse_kver_from_path() to
extract the kernel info from the path
Credit to Philipp, who contributes the original code.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2160676
Upstream: Fedora rawhide
Conflict: Drop shellspec test case
commit 51efbcf83e
Author: Pingfan Liu <piliu@redhat.com>
Date: Tue Jun 13 17:43:20 2023 +0800
kdump-lib: Introduce a help function _crashkernel_add()
This help function can manipulate the crashkernel cmdline by adding an
number for each item. Also a basic test case for _crashkernel_add() is
provided in this patch.
Credit to Philipp, who contributes the original code.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2160676
Upstream: Fedora rawhide
Conflict: applied manually due to slight difference in context
commit 5b31b099ae
Author: Coiby Xu <coxu@redhat.com>
Date: Wed Apr 26 04:48:25 2023 +0800
Simplify the management of the kernel parameter crashkernel
Currently, kexec-tools only updates the crashkernel to a new default
value only when both two conditions are met,
- auto_reset_crashkernel=yes in kdump.conf
- existing kernels or current running kernel should use the old default
value.
To address seen corner cases, the logic to tell if the second condition
is met becomes quite complex. Instead of making the logic more complex
to support aarch64-64k, this patch drops the second condition to
simplify the management of the crashkernel kernel parameter.
Another change brought by this simplification is kexec-tools will also
set up the kernel crashkernel parameter for a fresh install (previously
it's limited to osbuild).
Note
1. This patch also stop trying to update /etc/default/grub because
a) it only affects the static file /boot/grub2/grub.cfg
b) grubby is recommended to change the kernel command-line parameters
for both Fedora [1] and RHEL9 [2][3]
c) For the cases of aarch64 and POWER, different kernels could have
different default crashkernel value.
2. Starting with Fedora 37, posttrans rpm scriplet distinguish between
package install and upgrade.
[1] https://fedoraproject.org/wiki/GRUB_2
[2] https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/managing_monitoring_and_updating_the_kernel/configuring-kernel-command-line-parameters_managing-monitoring-and-updating-the-kernel#changing-kernel-command-line-parameters-for-all-boot-entries_configuring-kernel-command-line-parameters
[3] https://access.redhat.com/solutions/1136173
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2165839
Upstream: Fedora
Conflict: None
commit 81d3cc344d
Author: Pingfan Liu <piliu@redhat.com>
Date: Thu Apr 20 11:26:34 2023 +0800
kdump-lib: fix the matching pattern for debug-kernel
On aarch64, a 64k kernel's name looks like:
vmlinuz-5.14.0-300.el9.aarch64+64k and the corresponding debug kernel's
name looks like: vmlinuz-5.14.0-300.el9.aarch64+64k-debug, which ends
with the suffix -debug instead of +debug.
Fix the matching pattern by [+|-]debug
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>