b8b1667e2c
This patch changes the trigger for restarting the kdump service from cpu online/offline events to cpu add/remove events.

Some people have complained that when they run cpu online/offline tests at high frequency, kdump restarts at the same high frequency and systemd disables the service. As a temporary fix, we committed a patch to never disable the kdump service. In general it is probably a good idea to restart the kdump service on cpu add/remove events instead.

Toshi Kani confirmed the following.

- The file /sys/devices/system/cpu/cpuX/crash_notes is created before the ADD event goes out. That means we cannot miss creating ELF notes for a newly added cpu.

- For a REMOVE event, the files under /sys/devices/system/cpu/cpuX/ are removed first and then the REMOVE event goes out. That means we will remove the ELF note header for the removed cpu.

- There are some race conditions, for example a cpu is removed but the system crashes before the kdump service restarts. In that case vmcore.c has to be robust enough to inspect the ELF notes and discard empty ones. It is also possible that after a cpu remove the crash notes memory got reused for something else, and after a crash vmcore.c might see random data. It does basic size checks and discards ELF notes if the checks don't pass.

The above race conditions can happen even with the OFFLINE event, and there is no good way to remove them altogether. So making vmcore.c more robust is the right solution here.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: WANG Chao <chaowang@redhat.com>
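The kind of "basic size checks" mentioned above can be illustrated with a minimal Python sketch. This is not the code in vmcore.c; it only assumes the standard ELF note layout (three 32-bit words n_namesz, n_descsz, n_type, followed by the name and descriptor, each padded to 4 bytes) and a little-endian buffer. The names valid_notes_size and make_note are hypothetical helpers introduced for this illustration.

```python
import struct

# ELF note header: n_namesz, n_descsz, n_type (little-endian is an assumption)
NOTE_HDR = struct.Struct("<III")


def align4(n):
    """Round n up to the next multiple of 4, as ELF note fields are padded."""
    return (n + 3) & ~3


def valid_notes_size(buf):
    """Return how many leading bytes of buf hold well-formed ELF notes.

    Hypothetical helper: stops at the first empty note (n_namesz == 0)
    or at the first note whose claimed size exceeds the buffer.
    """
    offset = 0
    while offset + NOTE_HDR.size <= len(buf):
        namesz, descsz, _ntype = NOTE_HDR.unpack_from(buf, offset)
        if namesz == 0:
            break  # empty note, e.g. crash notes that were never filled in
        size = NOTE_HDR.size + align4(namesz) + align4(descsz)
        if offset + size > len(buf):
            break  # sizes look like random data; discard this note
        offset += size
    return offset


def make_note(name, desc, ntype=1):
    """Build one padded ELF note for testing (hypothetical helper)."""
    name_b = name + b"\0"
    return (NOTE_HDR.pack(len(name_b), len(desc), ntype)
            + name_b.ljust(align4(len(name_b)), b"\0")
            + desc.ljust(align4(len(desc)), b"\0"))
```

With this sketch, a buffer of zeros (a removed cpu whose notes were never written) yields a valid size of 0, and a note whose header claims more bytes than the buffer holds is dropped while earlier well-formed notes are kept.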
5 lines
352 B
Plaintext
SUBSYSTEM=="cpu", ACTION=="add", PROGRAM="/bin/systemctl try-restart kdump.service"
SUBSYSTEM=="cpu", ACTION=="remove", PROGRAM="/bin/systemctl try-restart kdump.service"
SUBSYSTEM=="memory", ACTION=="online", PROGRAM="/bin/systemctl try-restart kdump.service"
SUBSYSTEM=="memory", ACTION=="offline", PROGRAM="/bin/systemctl try-restart kdump.service"