Fix missed wakeup in POSIX thread condition variables (RHEL-2419)
Resolves: RHEL-2419
parent f604405c3e
commit bc9f002dda
447
glibc-RHEL-2419-1.patch
Normal file
@@ -0,0 +1,447 @@
commit 1db84775f831a1494993ce9c118deaf9537cc50a
Author: Frank Barrus <frankbarrus_sw@shaggy.cc>
Date: Wed Dec 4 07:55:02 2024 -0500

pthreads NPTL: lost wakeup fix 2

This fixes the lost wakeup (from a bug in signal stealing) with a change
in the usage of g_signals[] in the condition variable internal state.
It also completely eliminates the concept and handling of signal stealing,
as well as the need for signalers to block to wait for waiters to wake
up every time there is a G1/G2 switch. This greatly reduces the average
and maximum latency for pthread_cond_signal.

The g_signals[] field now contains a signal count that is relative to
the current g1_start value. Since it is a 32-bit field, and the LSB is
still reserved (though not currently used anymore), it has a 31-bit value
that corresponds to the low 31 bits of the sequence number in g1_start.
(since g1_start also has an LSB flag, this means bits 31:1 in g_signals
correspond to bits 31:1 in g1_start, plus the current signal count)
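
To make the encoding concrete, here is a minimal C sketch of the
waiter-side availability check (names are illustrative only; the real
logic appears in __pthread_cond_wait_common in the diff below):

    #include <stdint.h>

    /* Nonzero if a waiter in group slot g can consume a signal.  Each
       signal is worth 2 because the LSB of both words is reserved.  */
    static int
    signal_available (uint64_t g1_start, unsigned int signals, unsigned int g)
    {
      /* The LSB of g1_start holds the slot index of the current G2.  If
         our slot is that G2, no signals can have been posted to it, so
         baseline the count at our own word; otherwise our slot is G1 and
         the baseline is the low bits of g1_start (LSB cleared).  */
      unsigned int lowseq = ((g1_start & 1) == g)
                            ? signals : (unsigned int) g1_start & ~1U;

      /* Signed 32-bit difference tolerates wrap-around of the count.  */
      return (int) (signals - lowseq) >= 2;
    }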

By making the signal count relative to g1_start, there is no longer
any ambiguity or A/B/A issue, and thus any checks before blocking,
including the futex call itself, are guaranteed not to block if the G1/G2
switch occurs, even if the signal count remains the same. This allows
waiters to initially block safely in G2 until the switch to G1 occurs,
and then to transition from G1 to a new G1 or G2, while always being able
to distinguish the state change. This removes the race condition and
A/B/A problems that otherwise occurred if a late (pre-empted) waiter were
to resume just as the futex call attempted to block on g_signals, since
otherwise there was no last opportunity to re-check things like whether
the current G1 group was already closed.
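
That guarantee leans on the standard client-side futex contract: the
kernel compares the futex word against the value the waiter last
observed and refuses to sleep if it changed. A rough Linux-specific
sketch of that contract (helper name illustrative):

    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Block only while *word still holds the value this waiter observed.
       A concurrent G1/G2 switch advances __g_signals to a new lowseq, so
       the kernel returns with EAGAIN instead of sleeping.  */
    static long
    futex_wait_if_unchanged (unsigned int *word, unsigned int observed)
    {
      return syscall (SYS_futex, word, FUTEX_WAIT_PRIVATE, observed, NULL);
    }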

By fixing these issues, the signal stealing code can be eliminated,
since there is no concept of signal stealing anymore. The code to block
for all waiters to exit g_refs can also be removed, since any waiters
that are still in the g_refs region can be guaranteed to safely wake
up and exit. If there are still any left at this time, they are all
sent one final futex wakeup to ensure that they are not blocked any
longer, but there is no need for the signaller to block and wait for
them to wake up and exit the g_refs region.

The signal count is then effectively "zeroed" but since it is now
relative to g1_start, this is done by advancing it to a new value that
can be observed by any pending blocking waiters. Any late waiters can
always tell the difference, and can thus just cleanly exit if they are
in a stale G1 or G2. They can never steal a signal from the current
G1 if they are not in the current G1, since the signal value that has
to match in the cmpxchg has the low 31 bits of the g1_start value
contained in it, and that's first checked, and then it won't match if
there's a G1/G2 change.
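
In code terms the consume step is a plain compare-and-swap whose
expected value embeds those low bits, so a stale waiter simply fails
the exchange instead of stealing. A sketch using C11 atomics (the
patch itself uses glibc's atomic_compare_exchange_weak_acquire):

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Consume one signal (worth 2; the LSB stays reserved).  'expected'
       carries the low 31 bits of the g1_start this waiter observed, so
       the CAS fails if a G1/G2 switch advanced the word meanwhile.  */
    static bool
    try_consume_signal (_Atomic unsigned int *g_signals, unsigned int expected)
    {
      return atomic_compare_exchange_weak_explicit (g_signals, &expected,
                                                    expected - 2,
                                                    memory_order_acquire,
                                                    memory_order_relaxed);
    }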

Note: the 31-bit sequence number used in g_signals is designed to
handle wrap-around when checking the signal count. However, if the
entire 31-bit wraparound (2 billion signals) occurs while there is
still a late waiter that has not yet resumed, and it happens to then
match the current g1_start low bits, and the pre-emption occurs after
the normal "closed group" checks (which are 64-bit) but before the
futex syscall and signal-consuming code, then an A/B/A issue could
still result and cause an incorrect assumption about whether it
should block. This particular scenario seems unlikely in practice.
Note that once awake from the futex, the waiter would notice the
closed group before consuming the signal (since that's still a 64-bit
check that would not be aliased in the wrap-around in g_signals),
so the biggest impact would be blocking on the futex until the next
full wakeup from a G1/G2 switch.
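
As a back-of-the-envelope check of that window (all values made up):
each signal advances the word by 2 modulo 2^32, so after exactly 2^31
signals the 32-bit word aliases back to its old value, and only the
64-bit g1_start comparison can still unmask the stale waiter:

    #include <assert.h>

    int
    main (void)
    {
      unsigned int lowseq = 0x10u;                     /* hypothetical baseline */
      unsigned int signals = lowseq + 2u * (1u << 31); /* 2^31 signals later */
      assert (signals == lowseq);                      /* 32-bit wrap: aliased */
      return 0;
    }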

Signed-off-by: Frank Barrus <frankbarrus_sw@shaggy.cc>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

# Conflicts:
# nptl/pthread_cond_common.c (Missing spelling fixes)
# nptl/pthread_cond_wait.c (Likewise)

diff --git a/nptl/pthread_cond_common.c b/nptl/pthread_cond_common.c
index c35b9ef03afd2c64..b1565b780d175d3a 100644
--- a/nptl/pthread_cond_common.c
+++ b/nptl/pthread_cond_common.c
@@ -341,7 +341,6 @@ static bool __attribute__ ((unused))
__condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
unsigned int *g1index, int private)
{
- const unsigned int maxspin = 0;
unsigned int g1 = *g1index;

/* If there is no waiter in G2, we don't do anything. The expression may
@@ -362,84 +361,46 @@ __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
* New waiters arriving concurrently with the group switching will all go
into G2 until we atomically make the switch. Waiters existing in G2
are not affected.
- * Waiters in G1 will be closed out immediately by setting a flag in
- __g_signals, which will prevent waiters from blocking using a futex on
- __g_signals and also notifies them that the group is closed. As a
- result, they will eventually remove their group reference, allowing us
- to close switch group roles. */
-
- /* First, set the closed flag on __g_signals. This tells waiters that are
- about to wait that they shouldn't do that anymore. This basically
- serves as an advance notificaton of the upcoming change to __g1_start;
- waiters interpret it as if __g1_start was larger than their waiter
- sequence position. This allows us to change __g1_start after waiting
- for all existing waiters with group references to leave, which in turn
- makes recovery after stealing a signal simpler because it then can be
- skipped if __g1_start indicates that the group is closed (otherwise,
- we would have to recover always because waiters don't know how big their
- groups are). Relaxed MO is fine. */
- atomic_fetch_or_relaxed (cond->__data.__g_signals + g1, 1);
-
- /* Wait until there are no group references anymore. The fetch-or operation
- injects us into the modification order of __g_refs; release MO ensures
- that waiters incrementing __g_refs after our fetch-or see the previous
- changes to __g_signals and to __g1_start that had to happen before we can
- switch this G1 and alias with an older group (we have two groups, so
- aliasing requires switching group roles twice). Note that nobody else
- can have set the wake-request flag, so we do not have to act upon it.
-
- Also note that it is harmless if older waiters or waiters from this G1
- get a group reference after we have quiesced the group because it will
- remain closed for them either because of the closed flag in __g_signals
- or the later update to __g1_start. New waiters will never arrive here
- but instead continue to go into the still current G2. */
- unsigned r = atomic_fetch_or_release (cond->__data.__g_refs + g1, 0);
- while ((r >> 1) > 0)
- {
- for (unsigned int spin = maxspin; ((r >> 1) > 0) && (spin > 0); spin--)
- {
- /* TODO Back off. */
- r = atomic_load_relaxed (cond->__data.__g_refs + g1);
- }
- if ((r >> 1) > 0)
- {
- /* There is still a waiter after spinning. Set the wake-request
- flag and block. Relaxed MO is fine because this is just about
- this futex word.
-
- Update r to include the set wake-request flag so that the upcoming
- futex_wait only blocks if the flag is still set (otherwise, we'd
- violate the basic client-side futex protocol). */
- r = atomic_fetch_or_relaxed (cond->__data.__g_refs + g1, 1) | 1;
-
- if ((r >> 1) > 0)
- futex_wait_simple (cond->__data.__g_refs + g1, r, private);
- /* Reload here so we eventually see the most recent value even if we
- do not spin. */
- r = atomic_load_relaxed (cond->__data.__g_refs + g1);
- }
- }
- /* Acquire MO so that we synchronize with the release operation that waiters
- use to decrement __g_refs and thus happen after the waiters we waited
- for. */
- atomic_thread_fence_acquire ();
+ * Waiters in G1 will be closed out immediately by the advancing of
+ __g_signals to the next "lowseq" (low 31 bits of the new g1_start),
+ which will prevent waiters from blocking using a futex on
+ __g_signals since it provides enough signals for all possible
+ remaining waiters. As a result, they can each consume a signal
+ and they will eventually remove their group reference. */

/* Update __g1_start, which finishes closing this group. The value we add
will never be negative because old_orig_size can only be zero when we
switch groups the first time after a condvar was initialized, in which
- case G1 will be at index 1 and we will add a value of 1. See above for
- why this takes place after waiting for quiescence of the group.
+ case G1 will be at index 1 and we will add a value of 1.
Relaxed MO is fine because the change comes with no additional
constraints that others would have to observe. */
__condvar_add_g1_start_relaxed (cond,
(old_orig_size << 1) + (g1 == 1 ? 1 : - 1));

- /* Now reopen the group, thus enabling waiters to again block using the
- futex controlled by __g_signals. Release MO so that observers that see
- no signals (and thus can block) also see the write __g1_start and thus
- that this is now a new group (see __pthread_cond_wait_common for the
- matching acquire MO loads). */
- atomic_store_release (cond->__data.__g_signals + g1, 0);
+ unsigned int lowseq = ((old_g1_start + old_orig_size) << 1) & ~1U;
+
+ /* If any waiters still hold group references (and thus could be blocked),
+ then wake them all up now and prevent any running ones from blocking.
+ This is effectively a catch-all for any possible current or future
+ bugs that can allow the group size to reach 0 before all G1 waiters
+ have been awakened or at least given signals to consume, or any
+ other case that can leave blocked (or about to block) older waiters.. */
+ if ((atomic_fetch_or_release (cond->__data.__g_refs + g1, 0) >> 1) > 0)
+ {
+ /* First advance signals to the end of the group (i.e. enough signals
+ for the entire G1 group) to ensure that waiters which have not
+ yet blocked in the futex will not block.
+ Note that in the vast majority of cases, this should never
+ actually be necessary, since __g_signals will have enough
+ signals for the remaining g_refs waiters. As an optimization,
+ we could check this first before proceeding, although that
+ could still leave the potential for futex lost wakeup bugs
+ if the signal count was non-zero but the futex wakeup
+ was somehow lost. */
+ atomic_store_release (cond->__data.__g_signals + g1, lowseq);
+
+ futex_wake (cond->__data.__g_signals + g1, INT_MAX, private);
+ }

/* At this point, the old G1 is now a valid new G2 (but not in use yet).
No old waiter can neither grab a signal nor acquire a reference without
@@ -451,6 +412,10 @@ __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
g1 ^= 1;
*g1index ^= 1;

+ /* Now advance the new G1 g_signals to the new lowseq, giving it
+ an effective signal count of 0 to start. */
+ atomic_store_release (cond->__data.__g_signals + g1, lowseq);
+
/* These values are just observed by signalers, and thus protected by the
lock. */
unsigned int orig_size = wseq - (old_g1_start + old_orig_size);
diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
index dc8c511f1a72517a..c34280c6bc9e80fb 100644
--- a/nptl/pthread_cond_wait.c
+++ b/nptl/pthread_cond_wait.c
@@ -239,9 +239,7 @@ __condvar_cleanup_waiting (void *arg)
signaled), and a reference count.

The group reference count is used to maintain the number of waiters that
- are using the group's futex. Before a group can change its role, the
- reference count must show that no waiters are using the futex anymore; this
- prevents ABA issues on the futex word.
+ are using the group's futex.

To represent which intervals in the waiter sequence the groups cover (and
thus also which group slot contains G1 or G2), we use a 64b counter to
@@ -301,11 +299,12 @@ __condvar_cleanup_waiting (void *arg)
last reference.
* Reference count used by waiters concurrently with signalers that have
acquired the condvar-internal lock.
- __g_signals: The number of signals that can still be consumed.
+ __g_signals: The number of signals that can still be consumed, relative to
+ the current g1_start. (i.e. bits 31 to 1 of __g_signals are bits
+ 31 to 1 of g1_start with the signal count added)
* Used as a futex word by waiters. Used concurrently by waiters and
signalers.
- * LSB is true iff this group has been completely signaled (i.e., it is
- closed).
+ * LSB is currently reserved and 0.
__g_size: Waiters remaining in this group (i.e., which have not been
signaled yet.
* Accessed by signalers and waiters that cancel waiting (both do so only
@@ -329,18 +328,6 @@ __condvar_cleanup_waiting (void *arg)
sufficient because if a waiter can see a sufficiently large value, it could
have also consume a signal in the waiters group.

- Waiters try to grab a signal from __g_signals without holding a reference
- count, which can lead to stealing a signal from a more recent group after
- their own group was already closed. They cannot always detect whether they
- in fact did because they do not know when they stole, but they can
- conservatively add a signal back to the group they stole from; if they
- did so unnecessarily, all that happens is a spurious wake-up. To make this
- even less likely, __g1_start contains the index of the current g2 too,
- which allows waiters to check if there aliasing on the group slots; if
- there wasn't, they didn't steal from the current G1, which means that the
- G1 they stole from must have been already closed and they do not need to
- fix anything.
-
It is essential that the last field in pthread_cond_t is __g_signals[1]:
The previous condvar used a pointer-sized field in pthread_cond_t, so a
PTHREAD_COND_INITIALIZER from that condvar implementation might only
@@ -436,6 +423,9 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
{
while (1)
{
+ uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
+ unsigned int lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
+
/* Spin-wait first.
Note that spinning first without checking whether a timeout
passed might lead to what looks like a spurious wake-up even
@@ -447,35 +437,45 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
having to compare against the current time seems to be the right
choice from a performance perspective for most use cases. */
unsigned int spin = maxspin;
- while (signals == 0 && spin > 0)
+ while (spin > 0 && ((int)(signals - lowseq) < 2))
{
/* Check that we are not spinning on a group that's already
closed. */
- if (seq < (__condvar_load_g1_start_relaxed (cond) >> 1))
- goto done;
+ if (seq < (g1_start >> 1))
+ break;

/* TODO Back off. */

/* Reload signals. See above for MO. */
signals = atomic_load_acquire (cond->__data.__g_signals + g);
+ g1_start = __condvar_load_g1_start_relaxed (cond);
+ lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
spin--;
}

- /* If our group will be closed as indicated by the flag on signals,
- don't bother grabbing a signal. */
- if (signals & 1)
- goto done;
-
- /* If there is an available signal, don't block. */
- if (signals != 0)
+ if (seq < (g1_start >> 1))
+ {
+ /* If the group is closed already,
+ then this waiter originally had enough extra signals to
+ consume, up until the time its group was closed. */
+ goto done;
+ }
+
+ /* If there is an available signal, don't block.
+ If __g1_start has advanced at all, then we must be in G1
+ by now, perhaps in the process of switching back to an older
+ G2, but in either case we're allowed to consume the available
+ signal and should not block anymore. */
+ if ((int)(signals - lowseq) >= 2)
break;

/* No signals available after spinning, so prepare to block.
We first acquire a group reference and use acquire MO for that so
that we synchronize with the dummy read-modify-write in
__condvar_quiesce_and_switch_g1 if we read from that. In turn,
- in this case this will make us see the closed flag on __g_signals
- that designates a concurrent attempt to reuse the group's slot.
+ in this case this will make us see the advancement of __g_signals
+ to the upcoming new g1_start that occurs with a concurrent
+ attempt to reuse the group's slot.
We use acquire MO for the __g_signals check to make the
__g1_start check work (see spinning above).
Note that the group reference acquisition will not mask the
@@ -483,15 +483,24 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
an atomic read-modify-write operation and thus extend the release
sequence. */
atomic_fetch_add_acquire (cond->__data.__g_refs + g, 2);
- if (((atomic_load_acquire (cond->__data.__g_signals + g) & 1) != 0)
- || (seq < (__condvar_load_g1_start_relaxed (cond) >> 1)))
+ signals = atomic_load_acquire (cond->__data.__g_signals + g);
+ g1_start = __condvar_load_g1_start_relaxed (cond);
+ lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
+
+ if (seq < (g1_start >> 1))
{
- /* Our group is closed. Wake up any signalers that might be
- waiting. */
+ /* group is closed already, so don't block */
__condvar_dec_grefs (cond, g, private);
goto done;
}

+ if ((int)(signals - lowseq) >= 2)
+ {
+ /* a signal showed up or G1/G2 switched after we grabbed the refcount */
+ __condvar_dec_grefs (cond, g, private);
+ break;
+ }
+
// Now block.
struct _pthread_cleanup_buffer buffer;
struct _condvar_cleanup_buffer cbuffer;
@@ -502,7 +511,7 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
__pthread_cleanup_push (&buffer, __condvar_cleanup_waiting, &cbuffer);

err = __futex_abstimed_wait_cancelable64 (
- cond->__data.__g_signals + g, 0, clockid, abstime, private);
+ cond->__data.__g_signals + g, signals, clockid, abstime, private);

__pthread_cleanup_pop (&buffer, 0);

@@ -525,6 +534,8 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
signals = atomic_load_acquire (cond->__data.__g_signals + g);
}

+ if (seq < (__condvar_load_g1_start_relaxed (cond) >> 1))
+ goto done;
}
/* Try to grab a signal. Use acquire MO so that we see an up-to-date value
of __g1_start below (see spinning above for a similar case). In
@@ -533,69 +544,6 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
while (!atomic_compare_exchange_weak_acquire (cond->__data.__g_signals + g,
&signals, signals - 2));

- /* We consumed a signal but we could have consumed from a more recent group
- that aliased with ours due to being in the same group slot. If this
- might be the case our group must be closed as visible through
- __g1_start. */
- uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
- if (seq < (g1_start >> 1))
- {
- /* We potentially stole a signal from a more recent group but we do not
- know which group we really consumed from.
- We do not care about groups older than current G1 because they are
- closed; we could have stolen from these, but then we just add a
- spurious wake-up for the current groups.
- We will never steal a signal from current G2 that was really intended
- for G2 because G2 never receives signals (until it becomes G1). We
- could have stolen a signal from G2 that was conservatively added by a
- previous waiter that also thought it stole a signal -- but given that
- that signal was added unnecessarily, it's not a problem if we steal
- it.
- Thus, the remaining case is that we could have stolen from the current
- G1, where "current" means the __g1_start value we observed. However,
- if the current G1 does not have the same slot index as we do, we did
- not steal from it and do not need to undo that. This is the reason
- for putting a bit with G2's index into__g1_start as well. */
- if (((g1_start & 1) ^ 1) == g)
- {
- /* We have to conservatively undo our potential mistake of stealing
- a signal. We can stop trying to do that when the current G1
- changes because other spinning waiters will notice this too and
- __condvar_quiesce_and_switch_g1 has checked that there are no
- futex waiters anymore before switching G1.
- Relaxed MO is fine for the __g1_start load because we need to
- merely be able to observe this fact and not have to observe
- something else as well.
- ??? Would it help to spin for a little while to see whether the
- current G1 gets closed? This might be worthwhile if the group is
- small or close to being closed. */
- unsigned int s = atomic_load_relaxed (cond->__data.__g_signals + g);
- while (__condvar_load_g1_start_relaxed (cond) == g1_start)
- {
- /* Try to add a signal. We don't need to acquire the lock
- because at worst we can cause a spurious wake-up. If the
- group is in the process of being closed (LSB is true), this
- has an effect similar to us adding a signal. */
- if (((s & 1) != 0)
- || atomic_compare_exchange_weak_relaxed
- (cond->__data.__g_signals + g, &s, s + 2))
- {
- /* If we added a signal, we also need to add a wake-up on
- the futex. We also need to do that if we skipped adding
- a signal because the group is being closed because
- while __condvar_quiesce_and_switch_g1 could have closed
- the group, it might stil be waiting for futex waiters to
- leave (and one of those waiters might be the one we stole
- the signal from, which cause it to block using the
- futex). */
- futex_wake (cond->__data.__g_signals + g, 1, private);
- break;
- }
- /* TODO Back off. */
- }
- }
- }
-
done:

/* Confirm that we have been woken. We do that before acquiring the mutex
39
glibc-RHEL-2419-10.patch
Normal file
@@ -0,0 +1,39 @@
Partial revert of commit c36fc50781995e6758cae2b6927839d0157f213c
to restore the layout of pthread_cond_t and avoid a downstream
rpminspect and abidiff (libabigail tooling) spurious warning
about internal ABI changes. Without this change all RHEL developers
using pthread_cond_t would have to audit and waive the warning.
The alternative is to update the suppression lists used in abidiff,
propagate that to the rpminspect service, and wait for that to
complete before doing the update. The more conservative position
is the partial revert of the layout change.

This is a downstream-only change and is not required upstream.

diff --git a/sysdeps/nptl/bits/thread-shared-types.h b/sysdeps/nptl/bits/thread-shared-types.h
index 5cd33b765d9689eb..5644472323fe5424 100644
--- a/sysdeps/nptl/bits/thread-shared-types.h
+++ b/sysdeps/nptl/bits/thread-shared-types.h
@@ -109,7 +109,8 @@ struct __pthread_cond_s
unsigned int __high;
} __g1_start32;
};
- unsigned int __g_size[2] __LOCK_ALIGNMENT;
+ unsigned int __glibc_unused___g_refs[2] __LOCK_ALIGNMENT;
+ unsigned int __g_size[2];
unsigned int __g1_orig_size;
unsigned int __wrefs;
unsigned int __g_signals[2];
diff --git a/sysdeps/nptl/pthread.h b/sysdeps/nptl/pthread.h
index 7ea6001784783371..43146e91c9d9579b 100644
--- a/sysdeps/nptl/pthread.h
+++ b/sysdeps/nptl/pthread.h
@@ -152,7 +152,7 @@ enum


/* Conditional variable handling. */
-#define PTHREAD_COND_INITIALIZER { { {0}, {0}, {0, 0}, 0, 0, {0, 0} } }
+#define PTHREAD_COND_INITIALIZER { { {0}, {0}, {0, 0}, {0, 0}, 0, 0, {0, 0} } }


/* Cleanup buffers */
133
glibc-RHEL-2419-2.patch
Normal file
@@ -0,0 +1,133 @@
commit 0cc973160c23bb67f895bc887dd6942d29f8fee3
Author: Malte Skarupke <malteskarupke@fastmail.fm>
Date: Wed Dec 4 07:55:22 2024 -0500

nptl: Update comments and indentation for new condvar implementation

Some comments were wrong after the most recent commit. This fixes that.

Also fixing indentation where it was using spaces instead of tabs.

Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

diff --git a/nptl/pthread_cond_common.c b/nptl/pthread_cond_common.c
index b1565b780d175d3a..b355e38fb57862b1 100644
--- a/nptl/pthread_cond_common.c
+++ b/nptl/pthread_cond_common.c
@@ -361,8 +361,9 @@ __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
* New waiters arriving concurrently with the group switching will all go
into G2 until we atomically make the switch. Waiters existing in G2
are not affected.
- * Waiters in G1 will be closed out immediately by the advancing of
- __g_signals to the next "lowseq" (low 31 bits of the new g1_start),
+ * Waiters in G1 have already received a signal and been woken. If they
+ haven't woken yet, they will be closed out immediately by the advancing
+ of __g_signals to the next "lowseq" (low 31 bits of the new g1_start),
which will prevent waiters from blocking using a futex on
__g_signals since it provides enough signals for all possible
remaining waiters. As a result, they can each consume a signal
diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
index c34280c6bc9e80fb..7dabcb15d2d818e7 100644
--- a/nptl/pthread_cond_wait.c
+++ b/nptl/pthread_cond_wait.c
@@ -250,7 +250,7 @@ __condvar_cleanup_waiting (void *arg)
figure out whether they are in a group that has already been completely
signaled (i.e., if the current G1 starts at a later position that the
waiter's position). Waiters cannot determine whether they are currently
- in G2 or G1 -- but they do not have too because all they are interested in
+ in G2 or G1 -- but they do not have to because all they are interested in
is whether there are available signals, and they always start in G2 (whose
group slot they know because of the bit in the waiter sequence. Signalers
will simply fill the right group until it is completely signaled and can
@@ -413,7 +413,7 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
}

/* Now wait until a signal is available in our group or it is closed.
- Acquire MO so that if we observe a value of zero written after group
+ Acquire MO so that if we observe (signals == lowseq) after group
switching in __condvar_quiesce_and_switch_g1, we synchronize with that
store and will see the prior update of __g1_start done while switching
groups too. */
@@ -423,8 +423,8 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
{
while (1)
{
- uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
- unsigned int lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
+ uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
+ unsigned int lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;

/* Spin-wait first.
Note that spinning first without checking whether a timeout
@@ -448,21 +448,21 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,

/* Reload signals. See above for MO. */
signals = atomic_load_acquire (cond->__data.__g_signals + g);
- g1_start = __condvar_load_g1_start_relaxed (cond);
- lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
+ g1_start = __condvar_load_g1_start_relaxed (cond);
+ lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
spin--;
}

- if (seq < (g1_start >> 1))
+ if (seq < (g1_start >> 1))
{
- /* If the group is closed already,
+ /* If the group is closed already,
then this waiter originally had enough extra signals to
consume, up until the time its group was closed. */
goto done;
- }
+ }

/* If there is an available signal, don't block.
- If __g1_start has advanced at all, then we must be in G1
+ If __g1_start has advanced at all, then we must be in G1
by now, perhaps in the process of switching back to an older
G2, but in either case we're allowed to consume the available
signal and should not block anymore. */
@@ -484,22 +484,23 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
sequence. */
atomic_fetch_add_acquire (cond->__data.__g_refs + g, 2);
signals = atomic_load_acquire (cond->__data.__g_signals + g);
- g1_start = __condvar_load_g1_start_relaxed (cond);
- lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
+ g1_start = __condvar_load_g1_start_relaxed (cond);
+ lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;

- if (seq < (g1_start >> 1))
+ if (seq < (g1_start >> 1))
{
- /* group is closed already, so don't block */
+ /* group is closed already, so don't block */
__condvar_dec_grefs (cond, g, private);
goto done;
}

if ((int)(signals - lowseq) >= 2)
{
- /* a signal showed up or G1/G2 switched after we grabbed the refcount */
+ /* a signal showed up or G1/G2 switched after we grabbed the
+ refcount */
__condvar_dec_grefs (cond, g, private);
break;
- }
+ }

// Now block.
struct _pthread_cleanup_buffer buffer;
@@ -537,10 +538,8 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
if (seq < (__condvar_load_g1_start_relaxed (cond) >> 1))
goto done;
}
- /* Try to grab a signal. Use acquire MO so that we see an up-to-date value
- of __g1_start below (see spinning above for a similar case). In
- particular, if we steal from a more recent group, we will also see a
- more recent __g1_start below. */
+ /* Try to grab a signal. See above for MO. (if we do another loop
+ iteration we need to see the correct value of g1_start) */
while (!atomic_compare_exchange_weak_acquire (cond->__data.__g_signals + g,
&signals, signals - 2));
67
glibc-RHEL-2419-3.patch
Normal file
@@ -0,0 +1,67 @@
commit b42cc6af11062c260c7dfa91f1c89891366fed3e
Author: Malte Skarupke <malteskarupke@fastmail.fm>
Date: Wed Dec 4 07:55:50 2024 -0500

nptl: Remove unnecessary catch-all-wake in condvar group switch

This wake is unnecessary. We only switch groups after every sleeper in a group
has been woken. Sure, they may take a while to actually wake up and may still
hold a reference, but waking them a second time doesn't speed that up. Instead
this just makes the code more complicated and may hide problems.

In particular this safety wake wouldn't even have helped with the bug that was
fixed by Barrus' patch: The bug there was that pthread_cond_signal would not
switch g1 when it should, so we wouldn't even have entered this code path.

Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

diff --git a/nptl/pthread_cond_common.c b/nptl/pthread_cond_common.c
index b355e38fb57862b1..517ad52077829552 100644
--- a/nptl/pthread_cond_common.c
+++ b/nptl/pthread_cond_common.c
@@ -361,13 +361,7 @@ __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
* New waiters arriving concurrently with the group switching will all go
into G2 until we atomically make the switch. Waiters existing in G2
are not affected.
- * Waiters in G1 have already received a signal and been woken. If they
- haven't woken yet, they will be closed out immediately by the advancing
- of __g_signals to the next "lowseq" (low 31 bits of the new g1_start),
- which will prevent waiters from blocking using a futex on
- __g_signals since it provides enough signals for all possible
- remaining waiters. As a result, they can each consume a signal
- and they will eventually remove their group reference. */
+ * Waiters in G1 have already received a signal and been woken. */

/* Update __g1_start, which finishes closing this group. The value we add
will never be negative because old_orig_size can only be zero when we
@@ -380,29 +374,6 @@ __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,

unsigned int lowseq = ((old_g1_start + old_orig_size) << 1) & ~1U;

- /* If any waiters still hold group references (and thus could be blocked),
- then wake them all up now and prevent any running ones from blocking.
- This is effectively a catch-all for any possible current or future
- bugs that can allow the group size to reach 0 before all G1 waiters
- have been awakened or at least given signals to consume, or any
- other case that can leave blocked (or about to block) older waiters.. */
- if ((atomic_fetch_or_release (cond->__data.__g_refs + g1, 0) >> 1) > 0)
- {
- /* First advance signals to the end of the group (i.e. enough signals
- for the entire G1 group) to ensure that waiters which have not
- yet blocked in the futex will not block.
- Note that in the vast majority of cases, this should never
- actually be necessary, since __g_signals will have enough
- signals for the remaining g_refs waiters. As an optimization,
- we could check this first before proceeding, although that
- could still leave the potential for futex lost wakeup bugs
- if the signal count was non-zero but the futex wakeup
- was somehow lost. */
- atomic_store_release (cond->__data.__g_signals + g1, lowseq);
-
- futex_wake (cond->__data.__g_signals + g1, INT_MAX, private);
- }
-
/* At this point, the old G1 is now a valid new G2 (but not in use yet).
No old waiter can neither grab a signal nor acquire a reference without
noticing that __g1_start is larger.
107
glibc-RHEL-2419-4.patch
Normal file
@@ -0,0 +1,107 @@
commit 4f7b051f8ee3feff1b53b27a906f245afaa9cee1
Author: Malte Skarupke <malteskarupke@fastmail.fm>
Date: Wed Dec 4 07:56:13 2024 -0500

nptl: Remove unnecessary quadruple check in pthread_cond_wait

pthread_cond_wait was checking whether it was in a closed group no less than
four times. Checking once is enough. Here are the four checks:

1. While spin-waiting. This was dead code: maxspin is set to 0 and has been
for years.
2. Before deciding to go to sleep, and before incrementing grefs: I kept this.
3. After incrementing grefs. There is no reason to think that the group would
close while we do an atomic increment. Obviously it could close at any
point, but that doesn't mean we have to recheck after every step. This
check was equally good as check 2, except it has to do more work.
4. When we find ourselves in a group that has a signal. We only get here after
we check that we're not in a closed group. There is no need to check again.
The check would only have helped in cases where the compare_exchange in the
next line would also have failed. Relying on the compare_exchange is fine.

Removing the duplicate checks clarifies the code.

Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
index 7dabcb15d2d818e7..ba9a19bedc2c176f 100644
--- a/nptl/pthread_cond_wait.c
+++ b/nptl/pthread_cond_wait.c
@@ -367,7 +367,6 @@ static __always_inline int
__pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
clockid_t clockid, const struct __timespec64 *abstime)
{
- const int maxspin = 0;
int err;
int result = 0;

@@ -426,33 +425,6 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
unsigned int lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;

- /* Spin-wait first.
- Note that spinning first without checking whether a timeout
- passed might lead to what looks like a spurious wake-up even
- though we should return ETIMEDOUT (e.g., if the caller provides
- an absolute timeout that is clearly in the past). However,
- (1) spurious wake-ups are allowed, (2) it seems unlikely that a
- user will (ab)use pthread_cond_wait as a check for whether a
- point in time is in the past, and (3) spinning first without
- having to compare against the current time seems to be the right
- choice from a performance perspective for most use cases. */
- unsigned int spin = maxspin;
- while (spin > 0 && ((int)(signals - lowseq) < 2))
- {
- /* Check that we are not spinning on a group that's already
- closed. */
- if (seq < (g1_start >> 1))
- break;
-
- /* TODO Back off. */
-
- /* Reload signals. See above for MO. */
- signals = atomic_load_acquire (cond->__data.__g_signals + g);
- g1_start = __condvar_load_g1_start_relaxed (cond);
- lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
- spin--;
- }
-
if (seq < (g1_start >> 1))
{
/* If the group is closed already,
@@ -483,24 +455,6 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
an atomic read-modify-write operation and thus extend the release
sequence. */
atomic_fetch_add_acquire (cond->__data.__g_refs + g, 2);
- signals = atomic_load_acquire (cond->__data.__g_signals + g);
- g1_start = __condvar_load_g1_start_relaxed (cond);
- lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
-
- if (seq < (g1_start >> 1))
- {
- /* group is closed already, so don't block */
- __condvar_dec_grefs (cond, g, private);
- goto done;
- }
-
- if ((int)(signals - lowseq) >= 2)
- {
- /* a signal showed up or G1/G2 switched after we grabbed the
- refcount */
- __condvar_dec_grefs (cond, g, private);
- break;
- }

// Now block.
struct _pthread_cleanup_buffer buffer;
@@ -534,9 +488,6 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
/* Reload signals. See above for MO. */
signals = atomic_load_acquire (cond->__data.__g_signals + g);
}
-
- if (seq < (__condvar_load_g1_start_relaxed (cond) >> 1))
- goto done;
}
/* Try to grab a signal. See above for MO. (if we do another loop
iteration we need to see the correct value of g1_start) */
172
glibc-RHEL-2419-5.patch
Normal file
@@ -0,0 +1,172 @@
commit c36fc50781995e6758cae2b6927839d0157f213c
Author: Malte Skarupke <malteskarupke@fastmail.fm>
Date: Wed Dec 4 07:56:38 2024 -0500

nptl: Remove g_refs from condition variables

This variable used to be needed to wait in group switching until all sleepers
have confirmed that they have woken. This is no longer needed. Nothing waits
on this variable so there is no need to track how many threads are currently
asleep in each group.

Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

# Conflicts:
# nptl/tst-cond22.c (No atomic wide counter refactor)
# sysdeps/nptl/bits/thread-shared-types.h (Likewise)

diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
index ba9a19bedc2c176f..9652dbafe08dfde1 100644
--- a/nptl/pthread_cond_wait.c
+++ b/nptl/pthread_cond_wait.c
@@ -144,23 +144,6 @@ __condvar_cancel_waiting (pthread_cond_t *cond, uint64_t seq, unsigned int g,
}
}

-/* Wake up any signalers that might be waiting. */
-static void
-__condvar_dec_grefs (pthread_cond_t *cond, unsigned int g, int private)
-{
- /* Release MO to synchronize-with the acquire load in
- __condvar_quiesce_and_switch_g1. */
- if (atomic_fetch_add_release (cond->__data.__g_refs + g, -2) == 3)
- {
- /* Clear the wake-up request flag before waking up. We do not need more
- than relaxed MO and it doesn't matter if we apply this for an aliased
- group because we wake all futex waiters right after clearing the
- flag. */
- atomic_fetch_and_relaxed (cond->__data.__g_refs + g, ~(unsigned int) 1);
- futex_wake (cond->__data.__g_refs + g, INT_MAX, private);
- }
-}
-
/* Clean-up for cancellation of waiters waiting for normal signals. We cancel
our registration as a waiter, confirm we have woken up, and re-acquire the
mutex. */
@@ -172,8 +155,6 @@ __condvar_cleanup_waiting (void *arg)
pthread_cond_t *cond = cbuffer->cond;
unsigned g = cbuffer->wseq & 1;

- __condvar_dec_grefs (cond, g, cbuffer->private);
-
__condvar_cancel_waiting (cond, cbuffer->wseq >> 1, g, cbuffer->private);
/* FIXME With the current cancellation implementation, it is possible that
a thread is cancelled after it has returned from a syscall. This could
@@ -328,15 +309,6 @@ __condvar_cleanup_waiting (void *arg)
sufficient because if a waiter can see a sufficiently large value, it could
have also consume a signal in the waiters group.

- It is essential that the last field in pthread_cond_t is __g_signals[1]:
- The previous condvar used a pointer-sized field in pthread_cond_t, so a
- PTHREAD_COND_INITIALIZER from that condvar implementation might only
- initialize 4 bytes to zero instead of the 8 bytes we need (i.e., 44 bytes
- in total instead of the 48 we need). __g_signals[1] is not accessed before
- the first group switch (G2 starts at index 0), which will set its value to
- zero after a harmless fetch-or whose return value is ignored. This
- effectively completes initialization.
-

Limitations:
* This condvar isn't designed to allow for more than
@@ -441,21 +413,6 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
if ((int)(signals - lowseq) >= 2)
break;

- /* No signals available after spinning, so prepare to block.
- We first acquire a group reference and use acquire MO for that so
- that we synchronize with the dummy read-modify-write in
- __condvar_quiesce_and_switch_g1 if we read from that. In turn,
- in this case this will make us see the advancement of __g_signals
- to the upcoming new g1_start that occurs with a concurrent
- attempt to reuse the group's slot.
- We use acquire MO for the __g_signals check to make the
- __g1_start check work (see spinning above).
- Note that the group reference acquisition will not mask the
- release MO when decrementing the reference count because we use
- an atomic read-modify-write operation and thus extend the release
- sequence. */
- atomic_fetch_add_acquire (cond->__data.__g_refs + g, 2);
-
// Now block.
struct _pthread_cleanup_buffer buffer;
struct _condvar_cleanup_buffer cbuffer;
@@ -472,18 +429,11 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,

if (__glibc_unlikely (err == ETIMEDOUT || err == EOVERFLOW))
{
- __condvar_dec_grefs (cond, g, private);
- /* If we timed out, we effectively cancel waiting. Note that
- we have decremented __g_refs before cancellation, so that a
- deadlock between waiting for quiescence of our group in
- __condvar_quiesce_and_switch_g1 and us trying to acquire
- the lock during cancellation is not possible. */
+ /* If we timed out, we effectively cancel waiting. */
__condvar_cancel_waiting (cond, seq, g, private);
result = err;
goto done;
}
- else
- __condvar_dec_grefs (cond, g, private);

/* Reload signals. See above for MO. */
signals = atomic_load_acquire (cond->__data.__g_signals + g);
diff --git a/nptl/tst-cond22.c b/nptl/tst-cond22.c
index 64f19ea0a55af057..ebeeeaf666070076 100644
--- a/nptl/tst-cond22.c
+++ b/nptl/tst-cond22.c
@@ -106,10 +106,10 @@ do_test (void)
status = 1;
}

- printf ("cond = { %llu, %llu, %u/%u/%u, %u/%u/%u, %u, %u }\n",
+ printf ("cond = { %llu, %llu, %u/%u, %u/%u, %u, %u }\n",
c.__data.__wseq, c.__data.__g1_start,
- c.__data.__g_signals[0], c.__data.__g_refs[0], c.__data.__g_size[0],
- c.__data.__g_signals[1], c.__data.__g_refs[1], c.__data.__g_size[1],
+ c.__data.__g_signals[0], c.__data.__g_size[0],
+ c.__data.__g_signals[1], c.__data.__g_size[1],
c.__data.__g1_orig_size, c.__data.__wrefs);

if (pthread_create (&th, NULL, tf, (void *) 1l) != 0)
@@ -149,10 +149,10 @@ do_test (void)
status = 1;
}

- printf ("cond = { %llu, %llu, %u/%u/%u, %u/%u/%u, %u, %u }\n",
+ printf ("cond = { %llu, %llu, %u/%u, %u/%u, %u, %u }\n",
c.__data.__wseq, c.__data.__g1_start,
- c.__data.__g_signals[0], c.__data.__g_refs[0], c.__data.__g_size[0],
- c.__data.__g_signals[1], c.__data.__g_refs[1], c.__data.__g_size[1],
+ c.__data.__g_signals[0], c.__data.__g_size[0],
+ c.__data.__g_signals[1], c.__data.__g_size[1],
c.__data.__g1_orig_size, c.__data.__wrefs);

return status;
diff --git a/sysdeps/nptl/bits/thread-shared-types.h b/sysdeps/nptl/bits/thread-shared-types.h
index 44bf1e358dbdaaff..5cd33b765d9689eb 100644
--- a/sysdeps/nptl/bits/thread-shared-types.h
+++ b/sysdeps/nptl/bits/thread-shared-types.h
@@ -109,8 +109,7 @@ struct __pthread_cond_s
unsigned int __high;
} __g1_start32;
};
- unsigned int __g_refs[2] __LOCK_ALIGNMENT;
- unsigned int __g_size[2];
+ unsigned int __g_size[2] __LOCK_ALIGNMENT;
unsigned int __g1_orig_size;
unsigned int __wrefs;
unsigned int __g_signals[2];
diff --git a/sysdeps/nptl/pthread.h b/sysdeps/nptl/pthread.h
index 43146e91c9d9579b..7ea6001784783371 100644
--- a/sysdeps/nptl/pthread.h
+++ b/sysdeps/nptl/pthread.h
@@ -152,7 +152,7 @@ enum


/* Conditional variable handling. */
-#define PTHREAD_COND_INITIALIZER { { {0}, {0}, {0, 0}, {0, 0}, 0, 0, {0, 0} } }
+#define PTHREAD_COND_INITIALIZER { { {0}, {0}, {0, 0}, 0, 0, {0, 0} } }


/* Cleanup buffers */
91
glibc-RHEL-2419-6.patch
Normal file
@@ -0,0 +1,91 @@
commit 929a4764ac90382616b6a21f099192b2475da674
Author: Malte Skarupke <malteskarupke@fastmail.fm>
Date: Wed Dec 4 08:03:44 2024 -0500

nptl: Use a single loop in pthread_cond_wait instead of a nested loop

The loop was a little more complicated than necessary. There was only one
break statement out of the inner loop, and the outer loop was nearly empty.
So just remove the outer loop, moving its code to the one break statement in
the inner loop. This allows us to replace all gotos with break statements.

Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
index 9652dbafe08dfde1..4886056d136db138 100644
--- a/nptl/pthread_cond_wait.c
+++ b/nptl/pthread_cond_wait.c
@@ -383,17 +383,15 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
return err;
}

- /* Now wait until a signal is available in our group or it is closed.
- Acquire MO so that if we observe (signals == lowseq) after group
- switching in __condvar_quiesce_and_switch_g1, we synchronize with that
- store and will see the prior update of __g1_start done while switching
- groups too. */
- unsigned int signals = atomic_load_acquire (cond->__data.__g_signals + g);
-
- do
- {
+
while (1)
{
+ /* Now wait until a signal is available in our group or it is closed.
+ Acquire MO so that if we observe (signals == lowseq) after group
+ switching in __condvar_quiesce_and_switch_g1, we synchronize with that
+ store and will see the prior update of __g1_start done while switching
+ groups too. */
+ unsigned int signals = atomic_load_acquire (cond->__data.__g_signals + g);
uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
unsigned int lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;

@@ -402,7 +400,7 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
/* If the group is closed already,
then this waiter originally had enough extra signals to
consume, up until the time its group was closed. */
- goto done;
+ break;
}

/* If there is an available signal, don't block.
@@ -411,7 +409,16 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
G2, but in either case we're allowed to consume the available
signal and should not block anymore. */
if ((int)(signals - lowseq) >= 2)
- break;
+ {
+ /* Try to grab a signal. See above for MO. (if we do another loop
+ iteration we need to see the correct value of g1_start) */
+ if (atomic_compare_exchange_weak_acquire (
+ cond->__data.__g_signals + g,
+ &signals, signals - 2))
+ break;
+ else
+ continue;
+ }

// Now block.
struct _pthread_cleanup_buffer buffer;
@@ -432,19 +439,9 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
/* If we timed out, we effectively cancel waiting. */
__condvar_cancel_waiting (cond, seq, g, private);
result = err;
- goto done;
+ break;
}
-
- /* Reload signals. See above for MO. */
- signals = atomic_load_acquire (cond->__data.__g_signals + g);
}
- }
- /* Try to grab a signal. See above for MO. (if we do another loop
- iteration we need to see the correct value of g1_start) */
- while (!atomic_compare_exchange_weak_acquire (cond->__data.__g_signals + g,
- &signals, signals - 2));
-
- done:

/* Confirm that we have been woken. We do that before acquiring the mutex
to allow for execution of pthread_cond_destroy while having acquired the
138
glibc-RHEL-2419-7.patch
Normal file
@@ -0,0 +1,138 @@
commit ee6c14ed59d480720721aaacc5fb03213dc153da
Author: Malte Skarupke <malteskarupke@fastmail.fm>
Date:   Wed Dec 4 08:04:10 2024 -0500

    nptl: Fix indentation

    In my previous change I turned a nested loop into a simple loop. I'm doing
    the resulting indentation changes in a separate commit to make the diff on
    the previous commit easier to review.

    Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
index 4886056d136db138..6c130436b016977a 100644
--- a/nptl/pthread_cond_wait.c
+++ b/nptl/pthread_cond_wait.c
@@ -384,65 +384,65 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
     }
 
-   while (1)
-     {
-       /* Now wait until a signal is available in our group or it is closed.
-	  Acquire MO so that if we observe (signals == lowseq) after group
-	  switching in __condvar_quiesce_and_switch_g1, we synchronize with that
-	  store and will see the prior update of __g1_start done while switching
-	  groups too. */
-       unsigned int signals = atomic_load_acquire (cond->__data.__g_signals + g);
-       uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
-       unsigned int lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
-
-       if (seq < (g1_start >> 1))
-	 {
-	   /* If the group is closed already,
-	      then this waiter originally had enough extra signals to
-	      consume, up until the time its group was closed. */
-	   break;
-	 }
-
-       /* If there is an available signal, don't block.
-	  If __g1_start has advanced at all, then we must be in G1
-	  by now, perhaps in the process of switching back to an older
-	  G2, but in either case we're allowed to consume the available
-	  signal and should not block anymore. */
-       if ((int)(signals - lowseq) >= 2)
-	 {
-	   /* Try to grab a signal. See above for MO. (if we do another loop
-	      iteration we need to see the correct value of g1_start) */
-	   if (atomic_compare_exchange_weak_acquire (
-		cond->__data.__g_signals + g,
+  while (1)
+    {
+      /* Now wait until a signal is available in our group or it is closed.
+	 Acquire MO so that if we observe (signals == lowseq) after group
+	 switching in __condvar_quiesce_and_switch_g1, we synchronize with that
+	 store and will see the prior update of __g1_start done while switching
+	 groups too. */
+      unsigned int signals = atomic_load_acquire (cond->__data.__g_signals + g);
+      uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
+      unsigned int lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
+
+      if (seq < (g1_start >> 1))
+	{
+	  /* If the group is closed already,
+	     then this waiter originally had enough extra signals to
+	     consume, up until the time its group was closed. */
+	  break;
+	}
+
+      /* If there is an available signal, don't block.
+	 If __g1_start has advanced at all, then we must be in G1
+	 by now, perhaps in the process of switching back to an older
+	 G2, but in either case we're allowed to consume the available
+	 signal and should not block anymore. */
+      if ((int)(signals - lowseq) >= 2)
+	{
+	  /* Try to grab a signal. See above for MO. (if we do another loop
+	     iteration we need to see the correct value of g1_start) */
+	  if (atomic_compare_exchange_weak_acquire (
+		cond->__data.__g_signals + g,
 		&signals, signals - 2))
-	     break;
-	   else
-	     continue;
-	 }
-
-       // Now block.
-       struct _pthread_cleanup_buffer buffer;
-       struct _condvar_cleanup_buffer cbuffer;
-       cbuffer.wseq = wseq;
-       cbuffer.cond = cond;
-       cbuffer.mutex = mutex;
-       cbuffer.private = private;
-       __pthread_cleanup_push (&buffer, __condvar_cleanup_waiting, &cbuffer);
-
-       err = __futex_abstimed_wait_cancelable64 (
-	 cond->__data.__g_signals + g, signals, clockid, abstime, private);
-
-       __pthread_cleanup_pop (&buffer, 0);
-
-       if (__glibc_unlikely (err == ETIMEDOUT || err == EOVERFLOW))
-	 {
-	   /* If we timed out, we effectively cancel waiting. */
-	   __condvar_cancel_waiting (cond, seq, g, private);
-	   result = err;
	    break;
-	 }
+	  else
+	    continue;
	}
 
+      // Now block.
+      struct _pthread_cleanup_buffer buffer;
+      struct _condvar_cleanup_buffer cbuffer;
+      cbuffer.wseq = wseq;
+      cbuffer.cond = cond;
+      cbuffer.mutex = mutex;
+      cbuffer.private = private;
+      __pthread_cleanup_push (&buffer, __condvar_cleanup_waiting, &cbuffer);
+
+      err = __futex_abstimed_wait_cancelable64 (
+	cond->__data.__g_signals + g, signals, clockid, abstime, private);
+
+      __pthread_cleanup_pop (&buffer, 0);
+
+      if (__glibc_unlikely (err == ETIMEDOUT || err == EOVERFLOW))
+	{
+	  /* If we timed out, we effectively cancel waiting. */
+	  __condvar_cancel_waiting (cond, seq, g, private);
+	  result = err;
+	  break;
+	}
+    }
+
   /* Confirm that we have been woken. We do that before acquiring the mutex
      to allow for execution of pthread_cond_destroy while having acquired the
      mutex. */
147
glibc-RHEL-2419-8.patch
Normal file
@@ -0,0 +1,147 @@
commit 4b79e27a5073c02f6bff9aa8f4791230a0ab1867
Author: Malte Skarupke <malteskarupke@fastmail.fm>
Date:   Wed Dec 4 08:04:54 2024 -0500

    nptl: rename __condvar_quiesce_and_switch_g1

    This function no longer waits for threads to leave g1, so rename it to
    __condvar_switch_g1

    Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

diff --git a/nptl/pthread_cond_broadcast.c b/nptl/pthread_cond_broadcast.c
index f1275b2f15817788..3dff819952718892 100644
--- a/nptl/pthread_cond_broadcast.c
+++ b/nptl/pthread_cond_broadcast.c
@@ -61,7 +61,7 @@ ___pthread_cond_broadcast (pthread_cond_t *cond)
				cond->__data.__g_size[g1] << 1);
       cond->__data.__g_size[g1] = 0;
 
-      /* We need to wake G1 waiters before we quiesce G1 below. */
+      /* We need to wake G1 waiters before we switch G1 below. */
       /* TODO Only set it if there are indeed futex waiters. We could
	 also try to move this out of the critical section in cases when
	 G2 is empty (and we don't need to quiesce). */
@@ -70,7 +70,7 @@ ___pthread_cond_broadcast (pthread_cond_t *cond)
 
   /* G1 is complete. Step (2) is next unless there are no waiters in G2, in
      which case we can stop. */
-  if (__condvar_quiesce_and_switch_g1 (cond, wseq, &g1, private))
+  if (__condvar_switch_g1 (cond, wseq, &g1, private))
     {
       /* Step (3): Send signals to all waiters in the old G2 / new G1. */
       atomic_fetch_add_relaxed (cond->__data.__g_signals + g1,
diff --git a/nptl/pthread_cond_common.c b/nptl/pthread_cond_common.c
index 517ad52077829552..7b2b1e4605f163e7 100644
--- a/nptl/pthread_cond_common.c
+++ b/nptl/pthread_cond_common.c
@@ -329,16 +329,15 @@ __condvar_get_private (int flags)
     return FUTEX_SHARED;
 }
 
-/* This closes G1 (whose index is in G1INDEX), waits for all futex waiters to
-   leave G1, converts G1 into a fresh G2, and then switches group roles so that
-   the former G2 becomes the new G1 ending at the current __wseq value when we
-   eventually make the switch (WSEQ is just an observation of __wseq by the
-   signaler).
+/* This closes G1 (whose index is in G1INDEX), converts G1 into a fresh G2,
+   and then switches group roles so that the former G2 becomes the new G1
+   ending at the current __wseq value when we eventually make the switch
+   (WSEQ is just an observation of __wseq by the signaler).
    If G2 is empty, it will not switch groups because then it would create an
    empty G1 which would require switching groups again on the next signal.
    Returns false iff groups were not switched because G2 was empty. */
 static bool __attribute__ ((unused))
-__condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
+__condvar_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
		     unsigned int *g1index, int private)
 {
   unsigned int g1 = *g1index;
@@ -354,8 +353,7 @@ __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
	+ cond->__data.__g_size[g1 ^ 1]) == 0)
     return false;
 
-  /* Now try to close and quiesce G1. We have to consider the following kinds
-     of waiters:
+  /* We have to consider the following kinds of waiters:
      * Waiters from less recent groups than G1 are not affected because
        nothing will change for them apart from __g1_start getting larger.
      * New waiters arriving concurrently with the group switching will all go
@@ -363,12 +361,12 @@ __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
        are not affected.
      * Waiters in G1 have already received a signal and been woken. */
 
-  /* Update __g1_start, which finishes closing this group. The value we add
-     will never be negative because old_orig_size can only be zero when we
-     switch groups the first time after a condvar was initialized, in which
-     case G1 will be at index 1 and we will add a value of 1.
-     Relaxed MO is fine because the change comes with no additional
-     constraints that others would have to observe. */
+  /* Update __g1_start, which closes this group. The value we add will never
+     be negative because old_orig_size can only be zero when we switch groups
+     the first time after a condvar was initialized, in which case G1 will be
+     at index 1 and we will add a value of 1. Relaxed MO is fine because the
+     change comes with no additional constraints that others would have to
+     observe. */
   __condvar_add_g1_start_relaxed (cond,
				  (old_orig_size << 1) + (g1 == 1 ? 1 : - 1));
 
diff --git a/nptl/pthread_cond_signal.c b/nptl/pthread_cond_signal.c
index 171193b13e203290..4f7639e386fc207a 100644
--- a/nptl/pthread_cond_signal.c
+++ b/nptl/pthread_cond_signal.c
@@ -70,18 +70,17 @@ ___pthread_cond_signal (pthread_cond_t *cond)
   bool do_futex_wake = false;
 
   /* If G1 is still receiving signals, we put the signal there. If not, we
-     check if G2 has waiters, and if so, quiesce and switch G1 to the former
-     G2; if this results in a new G1 with waiters (G2 might have cancellations
-     already, see __condvar_quiesce_and_switch_g1), we put the signal in the
-     new G1. */
+     check if G2 has waiters, and if so, switch G1 to the former G2; if this
+     results in a new G1 with waiters (G2 might have cancellations already,
+     see __condvar_switch_g1), we put the signal in the new G1. */
   if ((cond->__data.__g_size[g1] != 0)
-      || __condvar_quiesce_and_switch_g1 (cond, wseq, &g1, private))
+      || __condvar_switch_g1 (cond, wseq, &g1, private))
     {
       /* Add a signal. Relaxed MO is fine because signaling does not need to
-	 establish a happens-before relation (see above). We do not mask the
-	 release-MO store when initializing a group in
-	 __condvar_quiesce_and_switch_g1 because we use an atomic
-	 read-modify-write and thus extend that store's release sequence. */
+	 establish a happens-before relation (see above). We do not mask the
+	 release-MO store when initializing a group in __condvar_switch_g1
+	 because we use an atomic read-modify-write and thus extend that
+	 store's release sequence. */
       atomic_fetch_add_relaxed (cond->__data.__g_signals + g1, 2);
       cond->__data.__g_size[g1]--;
       /* TODO Only set it if there are indeed futex waiters. */
diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
index 6c130436b016977a..173bd134164eed44 100644
--- a/nptl/pthread_cond_wait.c
+++ b/nptl/pthread_cond_wait.c
@@ -355,8 +355,7 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
      because we do not need to establish any happens-before relation with
      signalers (see __pthread_cond_signal); modification order alone
      establishes a total order of waiters/signals. We do need acquire MO
-     to synchronize with group reinitialization in
-     __condvar_quiesce_and_switch_g1. */
+     to synchronize with group reinitialization in __condvar_switch_g1. */
   uint64_t wseq = __condvar_fetch_add_wseq_acquire (cond, 2);
   /* Find our group's index. We always go into what was G2 when we acquired
      our position. */
@@ -388,9 +387,9 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
     {
       /* Now wait until a signal is available in our group or it is closed.
	 Acquire MO so that if we observe (signals == lowseq) after group
-	 switching in __condvar_quiesce_and_switch_g1, we synchronize with that
-	 store and will see the prior update of __g1_start done while switching
-	 groups too. */
+	 switching in __condvar_switch_g1, we synchronize with that store and
+	 will see the prior update of __g1_start done while switching groups
+	 too. */
       unsigned int signals = atomic_load_acquire (cond->__data.__g_signals + g);
       uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
       unsigned int lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
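To make the renamed function's job concrete, here is a rough, self-contained model of the switch step: close G1 by advancing g1_start, flip the group roles, and reset the new G1's signal word, with no quiescence wait. The struct and names (model_cond, model_switch_g1) are invented, atomics and the internal condvar lock are deliberately omitted, and the signal-word encoding anticipates the final patch below; treat it as a sketch of the idea, not the glibc function.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* All fields collapsed into plain integers; field names are invented. */
struct model_cond
{
  uint64_t wseq;             /* next waiter sequence number */
  uint64_t g1_start;         /* first sequence number belonging to G1 */
  unsigned int orig_size;    /* size of G1 when it was created */
  unsigned int g_size[2];    /* unsignaled waiters per group (G2's entry
                                accumulates cancellation adjustments) */
  unsigned int g_signals[2]; /* signal words, relative to g1_start */
  unsigned int g1;           /* index of the current G1 */
};

static bool
model_switch_g1 (struct model_cond *c)
{
  /* If the would-be new G1 (the current G2) is empty, do not switch;
     we would only have to switch again on the next signal. */
  uint64_t new_g1_start = c->g1_start + c->orig_size;
  if ((unsigned int) (c->wseq - new_g1_start) + c->g_size[c->g1 ^ 1] == 0)
    return false;

  /* Close the old G1 by advancing g1_start.  There is no waiting for old
     waiters to drain: they can see from g1_start that their group is gone. */
  c->g1_start = new_g1_start;
  c->g1 ^= 1;

  /* Give the new G1 an effective signal count of zero by aligning its
     signal word with the new g1_start. */
  c->g_signals[c->g1] = (unsigned int) new_g1_start;

  /* Record the new G1's size; add rather than assign so cancellation
     adjustments already stored for the old G2 are kept. */
  c->orig_size = (unsigned int) (c->wseq - new_g1_start);
  c->g_size[c->g1] += c->orig_size;
  return true;
}

int
main (void)
{
  /* One already-signaled G1 waiter (orig_size 1), two waiters in G2. */
  struct model_cond c = { .wseq = 3, .orig_size = 1, .g1 = 0 };
  bool switched = model_switch_g1 (&c);
  printf ("switched: %d, new G1 index: %u, new G1 size: %u\n",
          (int) switched, c.g1, c.g_size[c.g1]);
  return 0;
}

The point of the rename is visible in the model: nothing here blocks, so the signaler's worst-case latency no longer depends on how quickly preempted waiters get scheduled.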
179
glibc-RHEL-2419-9.patch
Normal file
@@ -0,0 +1,179 @@
commit 91bb902f58264a2fd50fbce8f39a9a290dd23706
Author: Malte Skarupke <malteskarupke@fastmail.fm>
Date:   Wed Dec 4 08:05:40 2024 -0500

    nptl: Use all of g1_start and g_signals

    The LSB of g_signals was unused. The LSB of g1_start was used to indicate
    which group is G2. This was used to always go to sleep in pthread_cond_wait
    if a waiter is in G2. A comment earlier in the file says that this is not
    correct to do:

    "Waiters cannot determine whether they are currently in G2 or G1 -- but they
    do not have to because all they are interested in is whether there are
    available signals"

    I either would have had to update the comment, or get rid of the check. I
    chose to get rid of the check. In fact I don't quite know why it was there.
    There will never be available signals for group G2, so we didn't need the
    special case. Even if there were, this would just be a spurious wake. This
    might have caught some cases where the count has wrapped around, but it
    wouldn't reliably do that (and even if it did, why would you want to force a
    sleep in that case?), and we don't support that many concurrent waiters
    anyway. Getting rid of it allows us to use one more bit, making us more
    robust to wraparound.

    Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

diff --git a/nptl/pthread_cond_broadcast.c b/nptl/pthread_cond_broadcast.c
index 3dff819952718892..6fd6cfe9d002c5d5 100644
--- a/nptl/pthread_cond_broadcast.c
+++ b/nptl/pthread_cond_broadcast.c
@@ -58,7 +58,7 @@ ___pthread_cond_broadcast (pthread_cond_t *cond)
     {
       /* Add as many signals as the remaining size of the group. */
       atomic_fetch_add_relaxed (cond->__data.__g_signals + g1,
-				cond->__data.__g_size[g1] << 1);
+				cond->__data.__g_size[g1]);
       cond->__data.__g_size[g1] = 0;
 
       /* We need to wake G1 waiters before we switch G1 below. */
@@ -74,7 +74,7 @@ ___pthread_cond_broadcast (pthread_cond_t *cond)
     {
       /* Step (3): Send signals to all waiters in the old G2 / new G1. */
       atomic_fetch_add_relaxed (cond->__data.__g_signals + g1,
-				cond->__data.__g_size[g1] << 1);
+				cond->__data.__g_size[g1]);
       cond->__data.__g_size[g1] = 0;
       /* TODO Only set it if there are indeed futex waiters. */
       do_futex_wake = true;
diff --git a/nptl/pthread_cond_common.c b/nptl/pthread_cond_common.c
index 7b2b1e4605f163e7..485aca4076a372d7 100644
--- a/nptl/pthread_cond_common.c
+++ b/nptl/pthread_cond_common.c
@@ -348,9 +348,9 @@ __condvar_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
      behavior.
      Note that this works correctly for a zero-initialized condvar too. */
   unsigned int old_orig_size = __condvar_get_orig_size (cond);
-  uint64_t old_g1_start = __condvar_load_g1_start_relaxed (cond) >> 1;
-  if (((unsigned) (wseq - old_g1_start - old_orig_size)
-       + cond->__data.__g_size[g1 ^ 1]) == 0)
+  uint64_t old_g1_start = __condvar_load_g1_start_relaxed (cond);
+  uint64_t new_g1_start = old_g1_start + old_orig_size;
+  if (((unsigned) (wseq - new_g1_start) + cond->__data.__g_size[g1 ^ 1]) == 0)
     return false;
 
   /* We have to consider the following kinds of waiters:
@@ -361,16 +361,10 @@ __condvar_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
        are not affected.
      * Waiters in G1 have already received a signal and been woken. */
 
-  /* Update __g1_start, which closes this group. The value we add will never
-     be negative because old_orig_size can only be zero when we switch groups
-     the first time after a condvar was initialized, in which case G1 will be
-     at index 1 and we will add a value of 1. Relaxed MO is fine because the
-     change comes with no additional constraints that others would have
-     to observe. */
-  __condvar_add_g1_start_relaxed (cond,
-				  (old_orig_size << 1) + (g1 == 1 ? 1 : - 1));
-
-  unsigned int lowseq = ((old_g1_start + old_orig_size) << 1) & ~1U;
+  /* Update __g1_start, which closes this group. Relaxed MO is fine because
+     the change comes with no additional constraints that others would have
+     to observe. */
+  __condvar_add_g1_start_relaxed (cond, old_orig_size);
 
   /* At this point, the old G1 is now a valid new G2 (but not in use yet).
      No old waiter can neither grab a signal nor acquire a reference without
@@ -382,13 +376,13 @@ __condvar_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
   g1 ^= 1;
   *g1index ^= 1;
 
-  /* Now advance the new G1 g_signals to the new lowseq, giving it
+  /* Now advance the new G1 g_signals to the new g1_start, giving it
      an effective signal count of 0 to start. */
-  atomic_store_release (cond->__data.__g_signals + g1, lowseq);
+  atomic_store_release (cond->__data.__g_signals + g1, (unsigned)new_g1_start);
 
   /* These values are just observed by signalers, and thus protected by the
      lock. */
-  unsigned int orig_size = wseq - (old_g1_start + old_orig_size);
+  unsigned int orig_size = wseq - new_g1_start;
   __condvar_set_orig_size (cond, orig_size);
   /* Use an addition to not lose track of cancellations in what was
      previously G2. */
diff --git a/nptl/pthread_cond_signal.c b/nptl/pthread_cond_signal.c
index 4f7639e386fc207a..9a5bac92fe8fc246 100644
--- a/nptl/pthread_cond_signal.c
+++ b/nptl/pthread_cond_signal.c
@@ -81,7 +81,7 @@ ___pthread_cond_signal (pthread_cond_t *cond)
	 release-MO store when initializing a group in __condvar_switch_g1
	 because we use an atomic read-modify-write and thus extend that
	 store's release sequence. */
-      atomic_fetch_add_relaxed (cond->__data.__g_signals + g1, 2);
+      atomic_fetch_add_relaxed (cond->__data.__g_signals + g1, 1);
       cond->__data.__g_size[g1]--;
       /* TODO Only set it if there are indeed futex waiters. */
       do_futex_wake = true;
diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
index 173bd134164eed44..944b241ec26b9b32 100644
--- a/nptl/pthread_cond_wait.c
+++ b/nptl/pthread_cond_wait.c
@@ -85,7 +85,7 @@ __condvar_cancel_waiting (pthread_cond_t *cond, uint64_t seq, unsigned int g,
      not hold a reference on the group. */
   __condvar_acquire_lock (cond, private);
 
-  uint64_t g1_start = __condvar_load_g1_start_relaxed (cond) >> 1;
+  uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
   if (g1_start > seq)
     {
       /* Our group is closed, so someone provided enough signals for it.
@@ -260,7 +260,6 @@ __condvar_cleanup_waiting (void *arg)
    * Waiters fetch-add while having acquired the mutex associated with the
      condvar. Signalers load it and fetch-xor it concurrently.
    __g1_start: Starting position of G1 (inclusive)
-     * LSB is index of current G2.
    * Modified by signalers while having acquired the condvar-internal lock
      and observed concurrently by waiters.
    __g1_orig_size: Initial size of G1
@@ -281,11 +280,9 @@ __condvar_cleanup_waiting (void *arg)
    * Reference count used by waiters concurrently with signalers that have
      acquired the condvar-internal lock.
    __g_signals: The number of signals that can still be consumed, relative to
-     the current g1_start. (i.e. bits 31 to 1 of __g_signals are bits
-     31 to 1 of g1_start with the signal count added)
+     the current g1_start. (i.e. g1_start with the signal count added)
    * Used as a futex word by waiters. Used concurrently by waiters and
      signalers.
-     * LSB is currently reserved and 0.
    __g_size: Waiters remaining in this group (i.e., which have not been
      signaled yet).
    * Accessed by signalers and waiters that cancel waiting (both do so only
@@ -392,9 +389,8 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
	 too. */
       unsigned int signals = atomic_load_acquire (cond->__data.__g_signals + g);
       uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
-      unsigned int lowseq = (g1_start & 1) == g ? signals : g1_start & ~1U;
 
-      if (seq < (g1_start >> 1))
+      if (seq < g1_start)
	{
	  /* If the group is closed already,
	     then this waiter originally had enough extra signals to
@@ -407,13 +403,13 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
	 by now, perhaps in the process of switching back to an older
	 G2, but in either case we're allowed to consume the available
	 signal and should not block anymore. */
-      if ((int)(signals - lowseq) >= 2)
+      if ((int)(signals - (unsigned int)g1_start) > 0)
	{
	  /* Try to grab a signal. See above for MO. (if we do another loop
	     iteration we need to see the correct value of g1_start) */
	  if (atomic_compare_exchange_weak_acquire (
		cond->__data.__g_signals + g,
-		&signals, signals - 2))
+		&signals, signals - 1))
	    break;
	  else
	    continue;
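Since __g_signals now carries the low 32 bits of g1_start plus the pending signal count, "is a signal available" becomes a wrap-safe relative comparison rather than an absolute test. The following self-contained check illustrates the idiom; signals_available is an invented helper for this sketch, not a glibc function.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

static int
signals_available (unsigned int signals, uint64_t g1_start)
{
  /* Unsigned subtraction followed by a signed test: the result stays
     correct even when the 32-bit values have wrapped past zero. */
  return (int) (signals - (unsigned int) g1_start) > 0;
}

int
main (void)
{
  /* Past the 32-bit boundary: g1_start's low word has wrapped to 2 and
     the signal word, one signal ahead, is 3. */
  assert (signals_available (3, 0x100000002ULL));

  /* Signal word just below the boundary, group start just above it:
     the difference is negative, so no signal is (wrongly) reported... */
  assert (!signals_available (0xFFFFFFFEU, 0x100000002ULL));

  /* ...and pending signals that straddle the wrap are still seen. */
  assert (signals_available (1, 0xFFFFFFFFULL));

  puts ("relative comparisons behave across wraparound");
  return 0;
}

This is also why the CAS in pthread_cond_wait cannot steal from a later group: the expected value embeds the low bits of g1_start, so any G1/G2 switch changes the word and the exchange fails.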
15
glibc.spec
@@ -157,7 +157,7 @@ end \
 Summary: The GNU libc libraries
 Name: glibc
 Version: %{glibcversion}
-Release: 163%{?dist}
+Release: 164%{?dist}
 
 # In general, GPLv2+ is used by programs, LGPLv2+ is used for
 # libraries.
@@ -1098,6 +1098,16 @@ Patch790: glibc-RHEL-67592-1.patch
 Patch791: glibc-RHEL-67592-2.patch
 Patch792: glibc-RHEL-67592-3.patch
 Patch793: glibc-RHEL-67592-4.patch
+Patch794: glibc-RHEL-2419-1.patch
+Patch795: glibc-RHEL-2419-2.patch
+Patch796: glibc-RHEL-2419-3.patch
+Patch797: glibc-RHEL-2419-4.patch
+Patch798: glibc-RHEL-2419-5.patch
+Patch799: glibc-RHEL-2419-6.patch
+Patch800: glibc-RHEL-2419-7.patch
+Patch801: glibc-RHEL-2419-8.patch
+Patch802: glibc-RHEL-2419-9.patch
+Patch803: glibc-RHEL-2419-10.patch
 
 ##############################################################################
 # Continued list of core "glibc" package information:
@@ -3091,6 +3101,9 @@ update_gconv_modules_cache ()
 %endif
 
 %changelog
+* Fri Feb 7 2025 Carlos O'Donell <carlos@redhat.com> - 2.34-164
+- Fix missed wakeup in POSIX thread condition variables (RHEL-2419)
+
 * Tue Feb 4 2025 DJ Delorie <dj@redhat.com> - 2.34-163
 - manual: sigaction's sa_flags field and SA_SIGINFO (RHEL-67592)
 