don't re-arm fault clearing timer on unrelated netlink events (#2172650)

Resolves: #2172650
This commit is contained in:
Miroslav Lichvar 2023-03-09 09:43:52 +01:00
parent eb4db35349
commit 58c2dd989d
2 changed files with 55 additions and 0 deletions

52
linuxptp-faultrearm.patch Normal file
View File

@ -0,0 +1,52 @@
commit 134dc3c4655fcd9f314a5e56cd50db2f87366f5a
Author: davidjm via Linuxptp-devel <linuxptp-devel@lists.sourceforge.net>
Date: Wed Nov 23 15:50:30 2022 -0800
Don't re-arm fault clearing timer on unrelated netlink events
Set the timer only when an event causes the port to transition to the
FAULTY state, rather than potentially re-arming the timeout when an
event occurs while the port was already FAULTY.
Concretely this occurs when a port is in fault, perhaps due to a
single time out while polling for tx-timestamp. If any other port in the
system (including unrelated ones ptp4l does not even know about) cause
netlink messages to be sent. As it stands, clock_poll() will note that
the port is in fault (from before, not due to the current event) and
reset the timeout to its original value.
If such unrelated netlink messages arrive at a regular enough cadence
the timeout may be repeatedly reset, not trigger on time (if at all) and
the port may not get a chance to clear its fault, perhaps indefinitely.
Signed-off-by: David Mirabito <davidjm@arista.com>
diff --git a/clock.c b/clock.c
index eea7983..451473e 100644
--- a/clock.c
+++ b/clock.c
@@ -1586,6 +1586,7 @@ void clock_set_sde(struct clock *c, int sde)
int clock_poll(struct clock *c)
{
int cnt, i;
+ enum port_state prior_state;
enum fsm_event event;
struct pollfd *cur;
struct port *p;
@@ -1609,6 +1610,7 @@ int clock_poll(struct clock *c)
/* Let the ports handle their events. */
for (i = 0; i < N_POLLFD; i++) {
if (cur[i].revents & (POLLIN|POLLPRI|POLLERR)) {
+ prior_state = port_state(p);
if (cur[i].revents & POLLERR) {
pr_err("port %d: unexpected socket error",
port_number(p));
@@ -1624,7 +1626,7 @@ int clock_poll(struct clock *c)
}
port_dispatch(p, event, 0);
/* Clear any fault after a little while. */
- if (PS_FAULTY == port_state(p)) {
+ if ((PS_FAULTY == port_state(p)) && (prior_state != PS_FAULTY)) {
clock_fault_timeout(p, 1);
break;
}

View File

@ -51,6 +51,8 @@ Patch14: linuxptp-vlanbond.patch
Patch15: linuxptp-eintr.patch
# check for unexpected changes in frequency offset
Patch16: linuxptp-freqcheck.patch
# don't re-arm fault clearing timer on unrelated netlink events
Patch17: linuxptp-faultrearm.patch
BuildRequires: gcc gcc-c++ make systemd
@ -80,6 +82,7 @@ Supporting legacy APIs and other platforms is not a goal.
%patch14 -p1 -b .vlanbond
%patch15 -p1 -b .eintr
%patch16 -p1 -b .freqcheck
%patch17 -p1 -b .faultrearm
mv linuxptp-testsuite-%{testsuite_ver}* testsuite
mv clknetsim-%{clknetsim_ver}* testsuite/clknetsim