ras-events: quit loop in read_ras_event when kbuf data is broken

Resolves: RHEL-68127

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Aristeu Rozanski 2024-11-19 12:46:06 -05:00
parent 0d211e2538
commit 09ea5ccc0c
2 changed files with 41 additions and 1 deletions
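
The failure mode described in the upstream commit message (a corrupt record leaves the read cursor stuck, so the same invalid event is parsed forever) can be illustrated with a toy model. This is only a sketch under stated assumptions: `struct toy_kbuf`, `toy_curr_size()`, `toy_next()` and `toy_drain_page()` are hypothetical names invented here and are not part of libtraceevent or rasdaemon; they only mimic the roles that `kbuf->index`, `kbuffer_curr_size()` and `kbuffer_next_event()` play in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical toy cursor over one "page" of records; NOT the real struct kbuffer. */
struct toy_kbuf {
    const int *lens;  /* record sizes; a negative value models corrupt data */
    size_t n;         /* number of records in the page */
    size_t index;     /* position of the current record */
};

/* Size of the current record; negative when the record is corrupt. */
static int toy_curr_size(const struct toy_kbuf *kb)
{
    return kb->index < kb->n ? kb->lens[kb->index] : 0;
}

/* Advance to the next record. On a corrupt record the cursor does not
 * move, which models the index ending up at its current position again. */
static void toy_next(struct toy_kbuf *kb)
{
    if (kb->index < kb->n && kb->lens[kb->index] >= 0)
        kb->index++;
}

/* Walk one page and return how many loop iterations "parsed" a record.
 * guard != 0 enables the check the patch adds; max_iter is a safety
 * bound so the unguarded case terminates in this demonstration. */
static int toy_drain_page(const int *lens, size_t n, int guard, int max_iter)
{
    struct toy_kbuf kb = { lens, n, 0 };
    int iterations = 0;

    assert(lens != NULL || n == 0);

    while (kb.index < kb.n && iterations < max_iter) {
        if (guard && toy_curr_size(&kb) < 0) {
            /* Same decision as the patch: give up on this page. */
            fprintf(stderr, "invalid kbuf data, discard\n");
            break;
        }
        iterations++;   /* stands in for parse_ras_data() */
        toy_next(&kb);
    }
    return iterations;
}
```

With a page whose third record has size -8, the guarded loop stops after the two valid records, while the unguarded loop keeps revisiting the corrupt record until the artificial `max_iter` bound is hit. In the real daemon there is no such bound, which is why CPU utilization reaches 100%.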

@@ -0,0 +1,35 @@
commit 794530fbf270eae9f6f43c6d0bbd3ec6f2b210f3
Author: hubin <hubin73@huawei.com>
Date: Thu May 18 16:14:41 2023 +0800
ras-events: quit loop in read_ras_event when kbuf data is broken
When kbuf data is broken, kbuffer_next_event() may move kbuf->index back to
the current kbuf->index position, causing an infinite loop.
In this situation, rasdaemon repeatedly parses the same invalid event and
prints warnings like "ug! negative record size -8!", pushing CPU utilization
to 100%.
When kbuf data is broken, discard the current page and continue reading the
next page.
Signed-off-by: hubin <hubin73@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
diff --git a/ras-events.c b/ras-events.c
index 2662467..fced7ab 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -512,6 +512,11 @@ static int read_ras_event_all_cpus(struct pthread_data *pdata,
kbuffer_load_subbuffer(kbuf, page);
while ((data = kbuffer_read_event(kbuf, &time_stamp))) {
+ if (kbuffer_curr_size(kbuf) < 0) {
+ log(TERM, LOG_ERR, "invalid kbuf data, discard\n");
+ break;
+ }
+
parse_ras_data(&pdata[i],
kbuf, data, time_stamp);

@@ -1,6 +1,6 @@
Name: rasdaemon
Version: 0.6.7
-Release: 16%{?dist}
+Release: 17%{?dist}
Summary: Utility to receive RAS error tracings
License: GPL-2.0-only
URL: http://git.infradead.org/users/mchehab/rasdaemon.git
@@ -43,6 +43,7 @@ Patch34: ad0444190e02bca309a61a4bad51bc0e16c0aef5.patch
Patch35: b1ace39286e287282a275b6edc90dc2f64e60a3c.patch
Patch36: 045ab08eaa00172d50621df9502f6910f3fe3af4.patch
Patch37: 79065939fc4bc1da72a3718937fab80e73a6dd75.patch
+Patch38: 794530fbf270eae9f6f43c6d0bbd3ec6f2b210f3.patch
ExcludeArch: s390 s390x
BuildRequires: make
@@ -115,6 +116,7 @@ an utility for reporting current error counts from the EDAC sysfs files.
%patch35 -p1
%patch36 -p1
%patch37 -p1
+%patch38 -p1
# The tarball is locked in time the first time aclocal was ran and will keep
# requiring an older version of automake
@@ -150,6 +152,9 @@ sed -i "s/^PAGE_CE_ACTION=.*/PAGE_CE_ACTION=account/" %{buildroot}/%{_sysconfdir
%{_sysconfdir}/sysconfig/rasdaemon
%changelog
+* Tue Nov 19 2024 Aristeu Rozanski <aris@redhat.com> 0.6.7-17
+- ras-events: quit loop in read_ras_event when kbuf data is broken [RHEL-68127]
* Thu Sep 05 2024 Aristeu Rozanski <aris@redhat.com> 0.6.7-16
- rasdaemon: Add support to parse the PPIN field of mce tracepoint [RHEL-52911]
- rasdaemon: Add support to parse microcode field of mce tracepoint [RHEL-52911]