===================================
refcount_t API compared to atomic_t
===================================

.. contents:: :local:

Introduction
============

The goal of the refcount_t API is to provide a minimal API for implementing
an object's reference counters. While the generic architecture-independent
implementation in lib/refcount.c uses atomic operations underneath,
there are a number of differences between some of the ``refcount_*()`` and
``atomic_*()`` functions with regard to the memory ordering guarantees.
This document outlines the differences and provides respective examples
in order to help maintainers validate their code against the change in
these memory ordering guarantees.

The terms used throughout this document try to follow the formal LKMM defined in
tools/memory-model/Documentation/explanation.txt.

memory-barriers.txt and atomic_t.txt provide more background on
memory ordering in general and for atomic operations specifically.

Relevant types of memory ordering
=================================

.. note:: The following section only covers some of the memory
   ordering types that are relevant for the atomics and reference
   counters and used throughout this document. For a much broader picture
   please consult the memory-barriers.txt document.

In the absence of any memory ordering guarantees (i.e. fully unordered)
atomics & refcounters provide only atomicity and the
program order (po) relation (on the same CPU). This guarantees that
each ``atomic_*()`` and ``refcount_*()`` operation is atomic and that instructions
are executed in program order on a single CPU.
This is implemented using READ_ONCE()/WRITE_ONCE() and
compare-and-swap primitives.

A strong (full) memory ordering guarantees that all prior loads and
stores (all po-earlier instructions) on the same CPU are completed
before any po-later instruction is executed on the same CPU.
It also guarantees that all po-earlier stores on the same CPU
and all propagated stores from other CPUs must propagate to all
other CPUs before any po-later instruction is executed on the original
CPU (A-cumulative property). This is implemented using smp_mb().

A RELEASE memory ordering guarantees that all prior loads and
stores (all po-earlier instructions) on the same CPU are completed
before the operation. It also guarantees that all po-earlier
stores on the same CPU and all propagated stores from other CPUs
must propagate to all other CPUs before the release operation
(A-cumulative property). This is implemented using
smp_store_release().

An ACQUIRE memory ordering guarantees that all subsequent loads and
stores (all po-later instructions) on the same CPU are
completed after the acquire operation. It also guarantees that all
po-later stores on the same CPU must propagate to all other CPUs
after the acquire operation executes. This is implemented using
smp_acquire__after_ctrl_dep().

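The RELEASE and ACQUIRE guarantees described above are usually relied on
as a pair. The following is a minimal userspace sketch of that pairing,
assuming C11 ``<stdatomic.h>`` as a stand-in for the kernel primitives
(``memory_order_release`` and ``memory_order_acquire`` only approximate the
LKMM definitions). The names ``payload``, ``ready``, ``publish()`` and
``try_consume()`` are hypothetical and exist only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state for this sketch. */
static int payload;        /* plain data, written before publication */
static atomic_bool ready;  /* flag used to publish the payload */

/* Writer: the release store orders the payload write before the flag. */
static void publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, true, memory_order_release);
}

/*
 * Reader: the acquire load orders the flag read before the payload read.
 * Returns true and fills *out only if the payload has been published.
 */
static bool try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return false;
	*out = payload;
	return true;
}
```

If the reader observes ``ready`` as true, it is also guaranteed to observe
the value stored to ``payload`` before the release store.
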
A control dependency (on success) for refcounters guarantees that
if a reference for an object was successfully obtained (reference
counter increment or addition happened, function returned true),
then further stores are ordered against this operation.
Control dependencies on stores are not implemented using any explicit
barriers, but rely on the CPU not to speculate on stores. This is only
a single-CPU relation and provides no guarantees for other CPUs.


Comparison of functions
=======================

case 1) - non-"Read/Modify/Write" (RMW) ops
-------------------------------------------

Function changes:

 * atomic_set() --> refcount_set()
 * atomic_read() --> refcount_read()

Memory ordering guarantee changes:

 * none (both fully unordered)


case 2) - increment-based ops that return no value
--------------------------------------------------

Function changes:

 * atomic_inc() --> refcount_inc()
 * atomic_add() --> refcount_add()

Memory ordering guarantee changes:

 * none (both fully unordered)

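As an illustration of what "fully unordered" means for cases 1 and 2, the
sketch below models the set, read, increment and add operations with C11
relaxed atomics. This is a userspace analogy under stated assumptions, not
the kernel implementation: the ``sketch_ref_*()`` names are hypothetical, and
the saturation and warning checks that refcount_t performs are left out.

```c
#include <stdatomic.h>

/* Hypothetical analogue of refcount_set(): relaxed store, no ordering. */
static void sketch_ref_set(atomic_uint *refs, unsigned int n)
{
	atomic_store_explicit(refs, n, memory_order_relaxed);
}

/* Hypothetical analogue of refcount_read(): relaxed load, no ordering. */
static unsigned int sketch_ref_read(atomic_uint *refs)
{
	return atomic_load_explicit(refs, memory_order_relaxed);
}

/* Hypothetical analogue of refcount_inc(): atomic, but unordered. */
static void sketch_ref_inc(atomic_uint *refs)
{
	atomic_fetch_add_explicit(refs, 1, memory_order_relaxed);
}

/* Hypothetical analogue of refcount_add(): atomic, but unordered. */
static void sketch_ref_add(atomic_uint *refs, unsigned int n)
{
	atomic_fetch_add_explicit(refs, n, memory_order_relaxed);
}
```

Each call is atomic with respect to the counter itself, but none of them
orders accesses to other memory locations.
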
case 3) - decrement-based RMW ops that return no value
------------------------------------------------------

Function changes:

 * atomic_dec() --> refcount_dec()

Memory ordering guarantee changes:

 * fully unordered --> RELEASE ordering


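The practical effect of the change in case 3 can be sketched in userspace
C11 as follows. The structure ``struct sketch_obj`` and the function
``sketch_put_not_last()`` are hypothetical, and ``memory_order_release`` is
assumed here as an approximation of the kernel's RELEASE ordering.

```c
#include <stdatomic.h>

/* Hypothetical object with a plain field and a reference counter. */
struct sketch_obj {
	int last_user;     /* plain field updated while holding a reference */
	atomic_uint refs;
};

/*
 * Drop a reference that is known not to be the last one.
 * The release decrement orders the po-earlier store to last_user
 * before the decrement; a relaxed decrement would not.
 */
static void sketch_put_not_last(struct sketch_obj *obj, int user_id)
{
	obj->last_user = user_id;
	atomic_fetch_sub_explicit(&obj->refs, 1, memory_order_release);
}
```
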
case 4) - increment-based RMW ops that return a value
-----------------------------------------------------

Function changes:

 * atomic_inc_not_zero() --> refcount_inc_not_zero()
 * no atomic counterpart --> refcount_add_not_zero()

Memory ordering guarantee changes:

 * fully ordered --> control dependency on success for stores

.. note:: We really assume here that necessary ordering is provided as a
   result of obtaining a pointer to the object!


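A conditional increment of this kind is typically built from a
compare-and-swap loop. The sketch below shows that shape in userspace C11,
with the hypothetical name ``sketch_ref_inc_not_zero()``. Note that C11 has
no notion of a control dependency, so the sketch only mirrors the structure
(an unordered RMW followed by a branch on its result) and does not by itself
provide the guarantee described above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical analogue of refcount_inc_not_zero().
 * Returns true if the counter was non-zero and has been incremented.
 */
static bool sketch_ref_inc_not_zero(atomic_uint *refs)
{
	unsigned int old = atomic_load_explicit(refs, memory_order_relaxed);

	do {
		if (old == 0)
			return false;	/* object is already on its way out */
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak_explicit(refs, &old, old + 1,
							memory_order_relaxed,
							memory_order_relaxed));
	return true;
}
```

A caller would only store into the object after this function has returned
true, which is the point where the control dependency applies.
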
case 5) - generic dec/sub decrement-based RMW ops that return a value
---------------------------------------------------------------------

Function changes:

 * atomic_dec_and_test() --> refcount_dec_and_test()
 * atomic_sub_and_test() --> refcount_sub_and_test()

Memory ordering guarantee changes:

 * fully ordered --> RELEASE ordering + ACQUIRE ordering on success


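The combination "RELEASE ordering + ACQUIRE ordering on success" can be
sketched in userspace C11 as a release decrement followed by an acquire
fence that is executed only when the counter reaches zero. The names
``sketch_ref_dec_and_test()``, ``struct sketch_obj`` and ``sketch_obj_put()``
are hypothetical, and the C11 fence is assumed as a rough stand-in for
smp_acquire__after_ctrl_dep().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical analogue of refcount_dec_and_test(). */
static bool sketch_ref_dec_and_test(atomic_uint *refs)
{
	/* RELEASE: earlier accesses to the object complete before the drop. */
	unsigned int old = atomic_fetch_sub_explicit(refs, 1,
						     memory_order_release);

	if (old == 1) {
		/* ACQUIRE on success: the teardown below cannot move up. */
		atomic_thread_fence(memory_order_acquire);
		return true;
	}
	return false;
}

/* Hypothetical heap-allocated object. */
struct sketch_obj {
	atomic_uint refs;
	int data;
};

/* Drop one reference; free the object and return true if it was the last. */
static bool sketch_obj_put(struct sketch_obj *obj)
{
	if (!sketch_ref_dec_and_test(&obj->refs))
		return false;
	free(obj);
	return true;
}
```

The release half makes every holder's prior accesses visible before its
decrement, and the acquire half ensures that the thread freeing the object
sees them.
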
case 6) - other decrement-based RMW ops that return a value
-----------------------------------------------------------

Function changes:

 * no atomic counterpart --> refcount_dec_if_one()
 * ``atomic_add_unless(&var, -1, 1)`` --> ``refcount_dec_not_one(&var)``

Memory ordering guarantee changes:

 * fully ordered --> RELEASE ordering + control dependency

.. note:: atomic_add_unless() only provides full order on success.


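The two conditional decrements in this case can be sketched in userspace C11
as compare-and-swap operations with release ordering on success. The
``sketch_ref_*()`` names are hypothetical, the sketch assumes the caller
holds a reference (the counter is at least 1), and the saturation and
underflow checks of the real API are omitted.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical analogue of refcount_dec_if_one(): 1 -> 0, else no change. */
static bool sketch_ref_dec_if_one(atomic_uint *refs)
{
	unsigned int expected = 1;

	return atomic_compare_exchange_strong_explicit(refs, &expected, 0,
						       memory_order_release,
						       memory_order_relaxed);
}

/*
 * Hypothetical analogue of refcount_dec_not_one(): decrement unless the
 * counter is exactly 1. Assumes the counter is never 0 on entry.
 */
static bool sketch_ref_dec_not_one(atomic_uint *refs)
{
	unsigned int old = atomic_load_explicit(refs, memory_order_relaxed);

	do {
		if (old == 1)
			return false;	/* would drop the last reference */
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak_explicit(refs, &old, old - 1,
							memory_order_release,
							memory_order_relaxed));
	return true;
}
```
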
case 7) - lock-based RMW
------------------------

Function changes:

 * atomic_dec_and_lock() --> refcount_dec_and_lock()
 * atomic_dec_and_mutex_lock() --> refcount_dec_and_mutex_lock()

Memory ordering guarantee changes:

 * fully ordered --> RELEASE ordering + control dependency + hold
   spin_lock() on success

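The general shape of a "decrement and take the lock if it hits zero"
operation can be sketched in userspace C11 as below. Everything here is an
assumption made for illustration: ``struct sketch_lock`` is a toy spinlock
built on ``atomic_flag`` rather than a kernel spinlock, and
``sketch_dec_not_one()`` and ``sketch_ref_dec_and_lock()`` are hypothetical
names, not the kernel functions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal spinlock built on atomic_flag. */
struct sketch_lock {
	atomic_flag locked;
};

static void sketch_lock_acquire(struct sketch_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void sketch_lock_release(struct sketch_lock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Decrement unless the counter is exactly 1 (assumes it is never 0). */
static bool sketch_dec_not_one(atomic_uint *refs)
{
	unsigned int old = atomic_load_explicit(refs, memory_order_relaxed);

	do {
		if (old == 1)
			return false;
	} while (!atomic_compare_exchange_weak_explicit(refs, &old, old - 1,
							memory_order_release,
							memory_order_relaxed));
	return true;
}

/*
 * Returns true with the lock held if the counter dropped to zero,
 * otherwise returns false with the lock not held.
 */
static bool sketch_ref_dec_and_lock(atomic_uint *refs, struct sketch_lock *l)
{
	/* Fast path: not the last reference, no lock needed. */
	if (sketch_dec_not_one(refs))
		return false;

	/* Slow path: possibly the last reference, decide under the lock. */
	sketch_lock_acquire(l);
	if (atomic_fetch_sub_explicit(refs, 1, memory_order_release) != 1) {
		sketch_lock_release(l);
		return false;
	}
	return true;
}
```

On a true return the caller is expected to tear the object down and then
release the lock.
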