======================================
Sequence counters and sequential locks
======================================

Introduction
============

Sequence counters are a reader-writer consistency mechanism with
lockless readers (read-only retry loops), and no writer starvation. They
are used for data that's rarely written to (e.g. system time), where the
reader wants a consistent set of information and is willing to retry if
that information changes.

A data set is consistent when the sequence count at the beginning of the
read side critical section is even and the same sequence count value is
read again at the end of the critical section. The data in the set must
be copied out inside the read side critical section. If the sequence
count has changed between the start and the end of the critical section,
the reader must retry.

Writers increment the sequence count at the start and the end of their
critical section. After starting the critical section the sequence count
is odd and indicates to the readers that an update is in progress. At
the end of the write side critical section the sequence count becomes
even again which lets readers make progress.

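The even/odd protocol described above can be sketched in userspace with
C11 atomics. This is only an illustrative model, not the kernel
implementation: every ``model_*`` name is hypothetical, the protected
data is a made-up seconds/nanoseconds pair, and the read side simply
returns whatever count it sees and lets the retry check reject odd
values, whereas the kernel's read side may instead wait for an even
count before returning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_seqcount {
	atomic_uint sequence;
};

/* Protected data: relaxed atomics keep concurrent access well-defined. */
struct model_time {
	_Atomic uint64_t sec;
	_Atomic uint64_t nsec;
};

struct model_time_snapshot {
	uint64_t sec;
	uint64_t nsec;
};

static struct model_seqcount time_seq;
static struct model_time shared_time;

/* Writers must already be serialized by the caller. */
static void model_write_begin(struct model_seqcount *s)
{
	unsigned int seq = atomic_load_explicit(&s->sequence,
						memory_order_relaxed);

	assert(!(seq & 1));	/* even: no other write in progress */
	atomic_store_explicit(&s->sequence, seq + 1, memory_order_relaxed);
	/* Order the odd count before the data stores that follow. */
	atomic_thread_fence(memory_order_release);
}

static void model_write_end(struct model_seqcount *s)
{
	unsigned int seq = atomic_load_explicit(&s->sequence,
						memory_order_relaxed);

	/* Release: data stores become visible before the even count. */
	atomic_store_explicit(&s->sequence, seq + 1, memory_order_release);
}

/* Returns the count as read; an odd value forces a retry later. */
static unsigned int model_read_begin(struct model_seqcount *s)
{
	return atomic_load_explicit(&s->sequence, memory_order_acquire);
}

static int model_read_retry(struct model_seqcount *s, unsigned int start)
{
	/* Order the data loads before re-reading the count. */
	atomic_thread_fence(memory_order_acquire);
	return (start & 1) ||
	       atomic_load_explicit(&s->sequence,
				    memory_order_relaxed) != start;
}

static void model_time_write(uint64_t sec, uint64_t nsec)
{
	model_write_begin(&time_seq);
	atomic_store_explicit(&shared_time.sec, sec, memory_order_relaxed);
	atomic_store_explicit(&shared_time.nsec, nsec, memory_order_relaxed);
	model_write_end(&time_seq);
}

static void model_time_read(struct model_time_snapshot *out)
{
	unsigned int seq;

	do {
		seq = model_read_begin(&time_seq);
		/* Copy the data out inside the read side section. */
		out->sec = atomic_load_explicit(&shared_time.sec,
						memory_order_relaxed);
		out->nsec = atomic_load_explicit(&shared_time.nsec,
						 memory_order_relaxed);
	} while (model_read_retry(&time_seq, seq));
}
```

In this model a reader that starts while the count is odd, or that sees
the count change before it finishes, discards its copy and loops again.
The writer never waits for readers.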

A sequence counter write side critical section must never be preempted
or interrupted by read side sections. Otherwise the reader will spin for
the entire scheduler tick due to the odd sequence count value and the
interrupted writer. If that reader belongs to a real-time scheduling
class, it can spin forever and the kernel will livelock.

This mechanism cannot be used if the protected data contains pointers,
as the writer can invalidate a pointer that the reader is following.


.. _seqcount_t:

Sequence counters (``seqcount_t``)
==================================

This is the raw counting mechanism, which does not protect against
multiple writers.  Write side critical sections must thus be serialized
by an external lock.

If the write serialization primitive does not implicitly disable
preemption, preemption must be explicitly disabled before entering the
write side section. If the read section can be invoked from hardirq or
softirq contexts, interrupts or bottom halves, respectively, must also
be disabled before entering the write section.

To handle the sequence counter requirements of writer serialization and
non-preemptibility automatically, use :ref:`seqlock_t` instead.

Initialization::

	/* dynamic */
	seqcount_t foo_seqcount;
	seqcount_init(&foo_seqcount);

	/* static */
	static seqcount_t foo_seqcount = SEQCNT_ZERO(foo_seqcount);

	/* C99 struct init */
	struct {
		seqcount_t seq;
	} foo = {
		.seq   = SEQCNT_ZERO(foo.seq),
	};

Write path::

	/* Serialized context with disabled preemption */

	write_seqcount_begin(&foo_seqcount);

	/* ... [[write-side critical section]] ... */

	write_seqcount_end(&foo_seqcount);

Read path::

	unsigned int seq;

	do {
		seq = read_seqcount_begin(&foo_seqcount);

		/* ... [[read-side critical section]] ... */

	} while (read_seqcount_retry(&foo_seqcount, seq));


.. _seqcount_locktype_t:

Sequence counters with associated locks (``seqcount_LOCKNAME_t``)
-----------------------------------------------------------------

As discussed at :ref:`seqcount_t`, sequence count write side critical
sections must be serialized and non-preemptible. This variant of
sequence counters associates the lock used for writer serialization at
initialization time, which enables lockdep to validate that the write
side critical sections are properly serialized.

This lock association is a no-op if lockdep is disabled and has neither
storage nor runtime overhead. If lockdep is enabled, the lock pointer is
stored in struct seqcount and lockdep's "lock is held" assertions are
injected at the beginning of the write side critical section to validate
that it is properly protected.

For lock types which do not implicitly disable preemption, preemption
protection is enforced in the write side function.

The following sequence counters with associated locks are defined:

  - ``seqcount_spinlock_t``
  - ``seqcount_raw_spinlock_t``
  - ``seqcount_rwlock_t``
  - ``seqcount_mutex_t``
  - ``seqcount_ww_mutex_t``

The sequence counter read and write APIs can take either a plain
``seqcount_t`` or any of the ``seqcount_LOCKNAME_t`` variants above.

Initialization (replace "LOCKNAME" with one of the supported locks)::

	/* dynamic */
	seqcount_LOCKNAME_t foo_seqcount;
	seqcount_LOCKNAME_init(&foo_seqcount, &lock);

	/* static */
	static seqcount_LOCKNAME_t foo_seqcount =
		SEQCNT_LOCKNAME_ZERO(foo_seqcount, &lock);

	/* C99 struct init */
	struct {
		seqcount_LOCKNAME_t seq;
	} foo = {
		.seq   = SEQCNT_LOCKNAME_ZERO(foo.seq, &lock),
	};

Write path: same as in :ref:`seqcount_t`, while running from a context
with the associated write serialization lock acquired.

Read path: same as in :ref:`seqcount_t`.


.. _seqcount_latch_t:

Latch sequence counters (``seqcount_latch_t``)
----------------------------------------------

Latch sequence counters are a multiversion concurrency control mechanism
where the even/odd value of the embedded ``seqcount_t`` counter is used
to switch between two copies of protected data. This allows the sequence
counter read path to safely interrupt its own write side critical
section.

Use ``seqcount_latch_t`` when the write side sections cannot be
protected from interruption by readers. This is typically the case when
the read side can be invoked from NMI handlers.

Check ``raw_write_seqcount_latch()`` for more information.

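The two-copy idea can be illustrated with a small userspace model. It is
a sketch under stated assumptions rather than the kernel code: all
``model_latch_*`` names are hypothetical, the protected data is a single
64-bit value, and sequentially consistent C11 atomics stand in for the
kernel's barriers. The write is split into two halves only so that an
interrupted writer can be simulated.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_latch {
	atomic_uint seq;
	_Atomic uint64_t data[2];	/* two copies of the protected value */
};

/* First half: steer readers to data[1], then update data[0]. */
static void model_latch_write_first_half(struct model_latch *l, uint64_t val)
{
	atomic_fetch_add(&l->seq, 1);	/* odd: readers use data[1] */
	atomic_store(&l->data[0], val);
}

/* Second half: steer readers to data[0], then update data[1]. */
static void model_latch_write_second_half(struct model_latch *l, uint64_t val)
{
	atomic_fetch_add(&l->seq, 1);	/* even: readers use data[0] */
	atomic_store(&l->data[1], val);
}

/* A complete update; writers must still be serialized by the caller. */
static void model_latch_write(struct model_latch *l, uint64_t val)
{
	model_latch_write_first_half(l, val);
	model_latch_write_second_half(l, val);
}

static uint64_t model_latch_read(struct model_latch *l)
{
	unsigned int seq;
	uint64_t val;

	do {
		seq = atomic_load(&l->seq);
		/* The low bit selects the copy not being modified. */
		val = atomic_load(&l->data[seq & 1]);
	} while (atomic_load(&l->seq) != seq);

	return val;
}
```

A reader that runs between the two halves does not spin on the odd
count. It reads the copy the writer is not touching and returns the
previous value, which is why this scheme tolerates readers that
interrupt the writer.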


.. _seqlock_t:

Sequential locks (``seqlock_t``)
================================

This contains the :ref:`seqcount_t` mechanism discussed earlier, plus an
embedded spinlock for writer serialization and non-preemptibility.

If the read side section can be invoked from hardirq or softirq context,
use the write side function variants which disable interrupts or bottom
halves respectively.

Initialization::

	/* dynamic */
	seqlock_t foo_seqlock;
	seqlock_init(&foo_seqlock);

	/* static */
	static DEFINE_SEQLOCK(foo_seqlock);

	/* C99 struct init */
	struct {
		seqlock_t seql;
	} foo = {
		.seql   = __SEQLOCK_UNLOCKED(foo.seql),
	};

Write path::

	write_seqlock(&foo_seqlock);

	/* ... [[write-side critical section]] ... */

	write_sequnlock(&foo_seqlock);

Read path, three categories:

1. Normal sequence readers, which never block a writer but must retry
   if a writer is in progress, as detected by a change in the sequence
   number.  Writers do not wait for a sequence reader::

	unsigned int seq;

	do {
		seq = read_seqbegin(&foo_seqlock);

		/* ... [[read-side critical section]] ... */

	} while (read_seqretry(&foo_seqlock, seq));

2. Locking readers, which will wait if a writer or another locking
   reader is in progress. A locking reader in progress will also block a
   writer from entering its critical section. This read lock is
   exclusive. Unlike rwlock_t, only one locking reader can acquire it::

	read_seqlock_excl(&foo_seqlock);

	/* ... [[read-side critical section]] ... */

	read_sequnlock_excl(&foo_seqlock);

3. Conditional lockless reader (as in 1), or locking reader (as in 2),
   according to a passed marker. This is used to avoid lockless reader
   starvation (too many retry loops) in case of a sharp spike in write
   activity. First, a lockless read is tried (even marker passed). If
   that trial fails (the sequence counter doesn't match), the marker is
   made odd for the next iteration, so the lockless read is transformed
   into a full locking read and no further retry is necessary::

	/* marker; odd initialization, made even before the first pass */
	int seq = 1;
	do {
		seq++; /* 2 on the 1st/lockless path, otherwise odd */
		read_seqbegin_or_lock(&foo_seqlock, &seq);

		/* ... [[read-side critical section]] ... */

	} while (need_seqretry(&foo_seqlock, seq));
	done_seqretry(&foo_seqlock, seq);

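The marker convention of the conditional reader can be modelled in
userspace as follows. This is a hedged sketch, not the kernel
implementation: every ``model_*`` name is hypothetical, an atomic flag
stands in for the embedded spinlock, and it is assumed that the caller
is the one who turns the marker odd after a failed lockless pass.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_seqlock {
	atomic_uint sequence;
	atomic_bool locked;	/* stands in for the embedded spinlock */
};

static void model_lock(struct model_seqlock *sl)
{
	while (atomic_exchange(&sl->locked, true))
		;	/* spin until the previous holder releases */
}

static void model_unlock(struct model_seqlock *sl)
{
	atomic_store(&sl->locked, false);
}

static void model_write(struct model_seqlock *sl)
{
	model_lock(sl);
	atomic_fetch_add(&sl->sequence, 1);	/* odd: update in progress */
	/* ... protected data would be updated here ... */
	atomic_fetch_add(&sl->sequence, 1);	/* even: update complete */
	model_unlock(sl);
}

/* Even marker: lockless attempt. Odd marker: take the lock. */
static void model_read_begin_or_lock(struct model_seqlock *sl, int *seq)
{
	if (!(*seq & 1)) {
		unsigned int cur;

		/* Wait out any writer so the stored marker stays even. */
		do {
			cur = atomic_load(&sl->sequence);
		} while (cur & 1);
		*seq = (int)cur;
	} else {
		model_lock(sl);
	}
}

/* Only a lockless pass (even marker) can ask for a retry. */
static bool model_need_retry(struct model_seqlock *sl, int seq)
{
	return !(seq & 1) && (int)atomic_load(&sl->sequence) != seq;
}

/* Drop the lock if the final pass was a locking one. */
static void model_done(struct model_seqlock *sl, int seq)
{
	if (seq & 1)
		model_unlock(sl);
}
```

With these helpers a reader runs at most two passes: one lockless pass,
and if a writer interfered, one pass under the lock, during which no
writer can enter and so no retry is requested.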


API documentation
=================

.. kernel-doc:: include/linux/seqlock.h