Commit Graph

10 Commits

Author SHA1 Message Date
Baoquan He
69d61bb3e3 update 98-kexec rules for crash hotplug
With kernel support for crash hotplug, a cpu/memory hotplug event only
requires the elfcorehdr to be updated, rather than the whole kdump
kernel being reloaded.

To realize this benefit, we need to prevent udev from reloading the
kdump kernel on hot plug/unplug events when it detects that the
crash_hotplug sysfs nodes are present.
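
As a rough illustration of the check described above, the Python sketch
below reports whether a crash_hotplug sysfs node is present and enabled.
The node locations (devices/system/cpu/crash_hotplug and
devices/system/memory/crash_hotplug) are assumed from the kernel's sysfs
documentation, not from this commit, and the function names and the
sysfs_root parameter are hypothetical. The actual change is made in the
udev rules file, not in Python.

```python
from pathlib import Path


def crash_hotplug_enabled(subsystem, sysfs_root="/sys"):
    """Return True if the crash_hotplug node for `subsystem` reads "1".

    `subsystem` is "cpu" or "memory". The node path is an assumption
    based on the kernel sysfs documentation.
    """
    node = Path(sysfs_root) / "devices" / "system" / subsystem / "crash_hotplug"
    try:
        return node.read_text().strip() == "1"
    except OSError:
        # Node missing or unreadable: treat crash hotplug as unsupported.
        return False


def needs_kdump_reload(subsystem, sysfs_root="/sys"):
    """A udev-triggered reload is only useful without crash hotplug support."""
    return not crash_hotplug_enabled(subsystem, sysfs_root)
```

Passing a different sysfs_root makes it possible to try the logic
against a scratch directory instead of the live /sys tree.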

Link: https://lore.kernel.org/lkml/20230814214446.6659-1-eric.devolder@oracle.com/
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d68b4b6f307d155475cce541f2aee938032ed22e
Signed-off-by: Baoquan He <bhe@redhat.com>
2024-05-31 11:32:26 +08:00
Kairui Song
3a36568581 Make udev reload rules quiet during bootup
In commit 1c97aee and commit 227c185 the udev rules were rewritten to
use systemd-run to run in a non-blocking mode. The problem is that this
is a bit noisy, especially during bootup: systemd always generates
extra logs for each service start, so you might see your journal full
of lines like these if you have many CPUs (each CPU generates a udev
event on boot):

...
Nov 22 22:23:05 localhost systemd[1]: Started /usr/lib/udev/kdump-udev-throttler.
Nov 22 22:23:05 localhost systemd[1]: Started /usr/lib/udev/kdump-udev-throttler.
Nov 22 22:23:05 localhost systemd[1]: Started /usr/lib/udev/kdump-udev-throttler.
Nov 22 22:23:05 localhost systemd[1]: Started /usr/lib/udev/kdump-udev-throttler.
...

While the system is still booting up, the kdump service is not started
yet, so the systemd-run calls end up doing nothing: the throttler called
by systemd-run just exits if kdump is not loaded.

This patch prevents systemd-run from being called in the first place if
the kdump service is not running, by checking the kdump.service status
in the udev rule, so there won't be unnecessary logs.

Also remove the kdump service checking logic in kdump-udev-throttler.
udev is the only expected caller of this script, and since the script is
now only called when the kdump service is running, the check is
redundant. Even if a user calls this script manually, it will still work
well, as it calls 'kdumpctl reload', which reloads the kdump resource
only if kdump is already loaded.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2018-12-06 17:44:03 +08:00
Kairui Song
1c97aee728 Throttle kdump reload request triggered by udev event
Previously, kdump would restart / reload many times on hotplug events,
especially memory hotplug events. Hotplugged memory may generate many
udev events, as memory is managed and hotplugged in small chunks by the
kernel.

This results in unnecessary system workload and actually a longer delay
for both the kdump reload and the hotplug event, as either udev gets
blocked or kdumpctl waits for other triggered operations.

To fix this, introduce kdump-udev-throttler as an agent which is called
by udev and merges concurrent kdump restart requests. Tested with a
Hyper-V VM which was previously failing due to udev timeouts; no new
issues found.
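
The idea of merging concurrent requests can be sketched in Python as
follows. This is only an illustration of request coalescing in general,
not the logic of the kdump-udev-throttler script itself; the class and
method names are hypothetical.

```python
import threading


class ReloadThrottler:
    """Merge reload requests that arrive while one is already pending."""

    def __init__(self, reload_fn):
        self._reload_fn = reload_fn
        self._lock = threading.Lock()
        self._pending = False

    def request(self):
        """Record a request.

        Returns True if this request created a new pending reload, and
        False if it was merged into one that is already pending.
        """
        with self._lock:
            if self._pending:
                return False
            self._pending = True
            return True

    def flush(self):
        """Run the reload once if anything is pending; return whether it ran."""
        with self._lock:
            if not self._pending:
                return False
            self._pending = False
        # Call outside the lock so new requests are not blocked by the reload.
        self._reload_fn()
        return True
```

With this shape, a burst of events (for example one per hotplugged
memory chunk) leads to a single reload when flush() is eventually
called.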

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2018-11-01 22:33:17 +08:00
Kairui Song
227c18506c Rewrite kdump's udev rules
According to udev's man page, PROGRAM is used either to determine the
device's name or to decide whether the device matches the rule, so we
should use RUN instead. Meanwhile, both RUN and PROGRAM only accept very
short-running foreground tasks, but a kdump restart may take a long
time if there are any device changes that lead to an image rebuild,
which may lead to buggy behavior.

On the other hand, memory / CPU hot plug should never trigger an
initramfs rebuild.

To solve this problem, we will use the newly introduced "kdumpctl reload"
instead, and use systemd-run to create a transient service unit for
the reload and run it in no-block mode, so udev won't be blocked by
anything.

We need to make systemd-run execute in non-blocking mode and not
synchronously wait for the operation to finish, because udev expects
the command line in RUN to finish immediately. However, kdumpctl
reload may take 0.5-1s for an ordinary reload, or even longer on some
machines. So we give systemd-run an explicit --no-block option to run
in non-blocking mode. Without --no-block, systemd-run will verify,
enqueue and wait for the operation to finish. With the --no-block
option, systemd-run will only verify and enqueue the unit, then
return. In this way, we make sure the command is executed
asynchronously, and the status will be monitored and logged by
systemd, which is reliable and non-blocking.

Another thing to mention is that --no-block is only needed from
systemd v220 onward. Before v220, systemd-run uses non-blocking mode
by default and the --no-block option is not available.
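
A minimal Python sketch of the version handling described above is
shown below. It only builds the argument list; the function name, the
systemd_version parameter and the bare "kdumpctl" program name are
assumptions for illustration, and "from v220 onward" is read as
"version >= 220".

```python
def build_reload_command(systemd_version, kdumpctl="kdumpctl"):
    """Build the systemd-run command line for a non-blocking kdump reload.

    From systemd v220 onward, systemd-run waits for the operation unless
    --no-block is given. Older versions are non-blocking by default and
    do not know the option, so it must be left out there.
    """
    argv = ["systemd-run"]
    if systemd_version >= 220:
        argv.append("--no-block")
    argv += [kdumpctl, "reload"]
    return argv
```

The resulting list could be handed to subprocess.run() on a system
where both systemd-run and kdumpctl are installed.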

Also reformat the udev rules to a more maintainable format.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2018-11-01 22:33:11 +08:00
Dave Young
fcaa683b01 Revert "Improve 'cpu add' udev rules"
This reverts commit 65d37d19c7.
Since we have no such bugs in Fedora and the original rule works fine, revert it.
2017-08-17 16:42:11 +08:00
Pratyush Anand
65d37d19c7 Improve 'cpu add' udev rules
Currently the kdump service is restarted whenever any new file is added to
the cpu subsystem. So, it can be restarted multiple times in cases like
loading of the acpi_cpufreq module or online/offline of any cpu.

However, we should see the kdump service restart only once when a new CPU
is added or removed. The cpu crash notes buffer is created when a new CPU
is added. Its location does not change when a CPU is onlined/offlined or
the acpi_cpufreq driver is loaded. Therefore, there is no need to restart
the kdump service in such cases.

Thus, we need to introduce an extra filter for the kernel name of the
directory created by cpu_add, which is KERNEL=="cpu[0-9]*". This will ensure
that the kdump service is not restarted when any new file is added to or
removed from the cpu subsystem.
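
To show which kernel names the pattern lets through, the Python sketch
below uses fnmatch as a stand-in for udev's shell-style pattern
matching. This is an approximation for illustration only, and the
function name is hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern taken from the rule above: "cpu" followed by at least one digit.
CPU_DEVICE_PATTERN = "cpu[0-9]*"


def is_cpu_device(kernel_name):
    """Return True if kernel_name looks like a per-CPU device (cpu0, cpu1, ...)."""
    return fnmatchcase(kernel_name, CPU_DEVICE_PATTERN)
```

Names such as cpu0 or cpu127 match, while cpufreq and cpuidle do not,
because the character after "cpu" has to be a digit.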

Signed-off-by: Pratyush Anand <panand@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2017-08-08 10:10:35 +08:00
Vivek Goyal
b8b1667e2c udev-rules: Restart kdump service on cpu ADD/REMOVE events
This patch changes restart of kdump service from cpu online/offline events
to cpu add/remove events.

Some people have complained that they are running cpu online/offline tests
at high frequency and kdump restarts at high frequency and systemd disables
the service. As a temporary fix, we committed a patch to never disable
kdump service.

In general it probably is a good idea to restart kdump service on cpu
add/remove events.

Toshi Kani confirmed the following.

- File for /sys/devices/system/cpu/cpuX/crash_notes will be created first
  before ADD event goes out. That means we can not miss creating ELF notes
  for newly created cpu.

- For REMOVE event files under /sys/devices/system/cpu/cpuX/ are removed
  first and then REMOVE event goes out. That means we will remove the elf
  note header for removed cpu.

- There are some race conditions like a cpu is removed but system crashes
  before kdump service restarts. In that case vmcore.c has to be more robust
  to be able to inspect elf notes and discard empty ones.

  Also it is possible that after cpu remove, crash notes memory got reused
  for something else and after crash vmcore.c might see some random data.
  It does basic size checks and discards elf notes if checks don't pass.

  The above race conditions can happen even with the OFFLINE event and there
  is no good way to remove these altogether. So making vmcore.c more robust
  is the right solution here.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: WANG Chao <chaowang@redhat.com>
2014-09-15 21:55:07 +08:00
Prarit Bhargava
e8f71f38d5 Fix kdump udev memory event restarts
While debugging another problem, issues were noted with the kdump udev
rules.  The kdump service is restarted on memory add and remove events.
These are the wrong events for these types of devices and result in an overly
aggressive restarting of the kdump service.

There are four udev events to consider, "add", "remove", "online", and
"offline".  The remove event is a complete removal from the system -- neither
the hardware nor the kernel know about the hardware; it has been physically
removed.  The add event is associated with hardware being physically added to
the system.  The kernel has some limited knowledge of the device; however,
it is not available for the kernel to use until it is brought online.  Online
events refer to the device being available for the kernel to use.  Opposite
to that is the offline event, which occurs when a device is no longer in
use by the kernel.

Note that in all four events the kernel *may* have some remaining information
stored about the device.

In the case of memory hotplug, kdump should be restarted when a memory module
is onlined or offlined.  This is because the memory is not in use by the
kernel until the memory is onlined, and it is unused when the memory is
offlined.
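
The event selection described above can be summarized in a small Python
sketch. Only the memory subsystem is modelled, since that is what this
change covers; the table and function names are hypothetical.

```python
# udev actions that should restart kdump, per subsystem.
# Only memory is listed here; other subsystems are out of scope.
RESTART_ACTIONS = {
    "memory": {"online", "offline"},
}


def should_restart_kdump(subsystem, action):
    """Return True if a udev event with this action should restart kdump."""
    return action in RESTART_ACTIONS.get(subsystem, set())
```

Under this table, memory add and remove events no longer trigger a
restart, while online and offline events do.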

Making these modifications results in smooth service on systems that do
heavy memory onlining and offlining.

Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-05-04 17:08:34 +08:00
Dave Young
d123cc3f2b udev rules fix
Resolves: bz808817

Use systemctl try-restart kdump.service instead of the old /etc/init.d/kdump restart.
systemctl try-restart will restart the kdump service only if the kdump service is running. The original behavior is wrong when the user has not enabled the service with chkconfig.

Tested the cpu online/offline events.

Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2012-05-28 09:50:47 +08:00
Neil Horman
558bea7d40 Mass Update of RHEL5 patches
2008-06-05 15:18:53 +00:00