From b8b1667e2c83ef9a0abd900a2852f1cd43ff2a5a Mon Sep 17 00:00:00 2001
From: Vivek Goyal
Date: Fri, 5 Sep 2014 16:16:25 -0400
Subject: [PATCH] udev-rules: Restart kdump service on cpu ADD/REMOVE events

This patch changes the restart of the kdump service from cpu online/offline
events to cpu add/remove events.

Some people have complained that they run cpu online/offline tests at
high frequency, kdump restarts at the same frequency, and systemd then
disables the service. As a temporary fix, we committed a patch to never
disable the kdump service.

In general it is probably a good idea to restart the kdump service on cpu
add/remove events instead.

Toshi Kani confirmed the following.

- The file /sys/devices/system/cpu/cpuX/crash_notes is created before
  the ADD event goes out. That means we can not miss creating ELF notes
  for a newly created cpu.

- For the REMOVE event, files under /sys/devices/system/cpu/cpuX/ are
  removed first and then the REMOVE event goes out. That means we will
  remove the ELF note header for the removed cpu.

- There are some race conditions, e.g. a cpu is removed but the system
  crashes before the kdump service restarts. In that case vmcore.c has to
  be robust enough to inspect ELF notes and discard empty ones. It is
  also possible that after cpu remove, the crash notes memory got reused
  for something else, and after a crash vmcore.c might see some random
  data. It does basic size checks and discards ELF notes if the checks
  don't pass.

The above race conditions can happen even with the OFFLINE event, and there
is no good way to remove them altogether. So making vmcore.c more robust
is the right solution here.

Signed-off-by: Vivek Goyal
Acked-by: WANG Chao
---
 98-kexec.rules | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/98-kexec.rules b/98-kexec.rules
index 162260d..e32ee13 100644
--- a/98-kexec.rules
+++ b/98-kexec.rules
@@ -1,4 +1,4 @@
-SUBSYSTEM=="cpu", ACTION=="online", PROGRAM="/bin/systemctl try-restart kdump.service"
-SUBSYSTEM=="cpu", ACTION=="offline", PROGRAM="/bin/systemctl try-restart kdump.service"
+SUBSYSTEM=="cpu", ACTION=="add", PROGRAM="/bin/systemctl try-restart kdump.service"
+SUBSYSTEM=="cpu", ACTION=="remove", PROGRAM="/bin/systemctl try-restart kdump.service"
 SUBSYSTEM=="memory", ACTION=="online", PROGRAM="/bin/systemctl try-restart kdump.service"
 SUBSYSTEM=="memory", ACTION=="offline", PROGRAM="/bin/systemctl try-restart kdump.service"
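
Note: the commit message relies on vmcore.c doing "basic size checks" and
discarding empty ELF notes. The following is only a rough userspace sketch
of what such a check could look like, not the kernel's actual code. It
assumes the standard ELF note header layout (three 32-bit words: n_namesz,
n_descsz, n_type), little-endian byte order, and 4-byte padding of the name
and descriptor. The names note_is_usable and max_size are hypothetical.

```python
import struct

# ELF note header: n_namesz, n_descsz, n_type (each a 32-bit word).
# Little-endian is assumed here for illustration.
NOTE_HDR = struct.Struct("<III")


def _align4(n: int) -> int:
    """Round n up to the next multiple of 4 (assumed note padding)."""
    return (n + 3) & ~3


def note_is_usable(buf: bytes, max_size: int) -> bool:
    """Return True if buf starts with a non-empty note that fits in max_size.

    Hypothetical helper: it mirrors the idea of discarding empty notes and
    notes whose declared sizes do not fit, nothing more.
    """
    if len(buf) < NOTE_HDR.size:
        return False  # not even a full header
    namesz, descsz, _ntype = NOTE_HDR.unpack_from(buf)
    if namesz == 0:
        return False  # empty note, e.g. zeroed crash_notes memory
    total = NOTE_HDR.size + _align4(namesz) + _align4(descsz)
    # Reject notes whose declared size exceeds the limit or the buffer,
    # which is what random data left in reused memory tends to produce.
    return total <= max_size and total <= len(buf)
```

A zero-filled buffer (a cpu whose notes were never written) and a buffer
with an implausibly large n_descsz would both be rejected by this sketch,
which matches the two failure cases the commit message describes.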