106 lines
		
	
	
		
			4.6 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
			
		
		
	
	
			106 lines
		
	
	
		
			4.6 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
============================
 | 
						||
NUMA resource associativity
 | 
						||
============================
 | 
						||
 | 
						||
Associativity represents the groupings of the various platform resources into
 | 
						||
domains of substantially similar mean performance relative to resources outside
 | 
						||
of that domain. Resources subsets of a given domain that exhibit better
 | 
						||
performance relative to each other than relative to other resources subsets
 | 
						||
are represented as being members of a sub-grouping domain. This performance
 | 
						||
characteristic is presented in terms of NUMA node distance within the Linux kernel.
 | 
						||
From the platform view, these groups are also referred to as domains.
 | 
						||
 | 
						||
PAPR interface currently supports different ways of communicating these resource
 | 
						||
grouping details to the OS. These are referred to as Form 0, Form 1 and Form2
 | 
						||
associativity grouping. Form 0 is the oldest format and is now considered deprecated.
 | 
						||
 | 
						||
Hypervisor indicates the type/form of associativity used via "ibm,architecture-vec-5 property".
 | 
						||
Bit 0 of byte 5 in the "ibm,architecture-vec-5" property indicates usage of Form 0 or Form 1.
 | 
						||
A value of 1 indicates the usage of Form 1 associativity. For Form 2 associativity
 | 
						||
bit 2 of byte 5 in the "ibm,architecture-vec-5" property is used.
 | 
						||
 | 
						||
Form 0
 | 
						||
------
 | 
						||
Form 0 associativity supports only two NUMA distances (LOCAL and REMOTE).
 | 
						||
 | 
						||
Form 1
 | 
						||
------
 | 
						||
With Form 1 a combination of ibm,associativity-reference-points, and ibm,associativity
 | 
						||
device tree properties are used to determine the NUMA distance between resource groups/domains.
 | 
						||
 | 
						||
The “ibm,associativity” property contains a list of one or more numbers (domainID)
 | 
						||
representing the resource’s platform grouping domains.
 | 
						||
 | 
						||
The “ibm,associativity-reference-points” property contains a list of one or more numbers
 | 
						||
(domainID index) that represents the 1 based ordinal in the associativity lists.
 | 
						||
The list of domainID indexes represents an increasing hierarchy of resource grouping.
 | 
						||
 | 
						||
ex:
 | 
						||
{ primary domainID index, secondary domainID index, tertiary domainID index.. }
 | 
						||
 | 
						||
Linux kernel uses the domainID at the primary domainID index as the NUMA node id.
 | 
						||
Linux kernel computes NUMA distance between two domains by recursively comparing
 | 
						||
if they belong to the same higher-level domains. For mismatch at every higher
 | 
						||
level of the resource group, the kernel doubles the NUMA distance between the
 | 
						||
comparing domains.
 | 
						||
 | 
						||
Form 2
 | 
						||
-------
 | 
						||
Form 2 associativity format adds separate device tree properties representing NUMA node distance
 | 
						||
thereby making the node distance computation flexible. Form 2 also allows flexible primary
 | 
						||
domain numbering. With numa distance computation now detached from the index value in
 | 
						||
"ibm,associativity-reference-points" property, Form 2 allows a large number of primary domain
 | 
						||
ids at the same domainID index representing resource groups of different performance/latency
 | 
						||
characteristics.
 | 
						||
 | 
						||
Hypervisor indicates the usage of FORM2 associativity using bit 2 of byte 5 in the
 | 
						||
"ibm,architecture-vec-5" property.
 | 
						||
 | 
						||
"ibm,numa-lookup-index-table" property contains a list of one or more numbers representing
 | 
						||
the domainIDs present in the system. The offset of the domainID in this property is
 | 
						||
used as an index while computing numa distance information via "ibm,numa-distance-table".
 | 
						||
 | 
						||
prop-encoded-array: The number N of the domainIDs encoded as with encode-int, followed by
 | 
						||
N domainID encoded as with encode-int
 | 
						||
 | 
						||
For ex:
 | 
						||
"ibm,numa-lookup-index-table" =  {4, 0, 8, 250, 252}. The offset of domainID 8 (2) is used when
 | 
						||
computing the distance of domain 8 from other domains present in the system. For the rest of
 | 
						||
this document, this offset will be referred to as domain distance offset.
 | 
						||
 | 
						||
"ibm,numa-distance-table" property contains a list of one or more numbers representing the NUMA
 | 
						||
distance between resource groups/domains present in the system.
 | 
						||
 | 
						||
prop-encoded-array: The number N of the distance values encoded as with encode-int, followed by
 | 
						||
N distance values encoded as with encode-bytes. The max distance value we could encode is 255.
 | 
						||
The number N must be equal to the square of m where m is the number of domainIDs in the
 | 
						||
numa-lookup-index-table.
 | 
						||
 | 
						||
For ex:
 | 
						||
ibm,numa-lookup-index-table = <3 0 8 40>;
 | 
						||
ibm,numa-distace-table = <9>, /bits/ 8 < 10  20  80 20  10 160 80 160  10>;
 | 
						||
 | 
						||
::
 | 
						||
 | 
						||
	  | 0    8   40
 | 
						||
	--|------------
 | 
						||
	  |
 | 
						||
	0 | 10   20  80
 | 
						||
	  |
 | 
						||
	8 | 20   10  160
 | 
						||
	  |
 | 
						||
	40| 80   160  10
 | 
						||
 | 
						||
A possible "ibm,associativity" property for resources in node 0, 8 and 40
 | 
						||
 | 
						||
{ 3, 6, 7, 0 }
 | 
						||
{ 3, 6, 9, 8 }
 | 
						||
{ 3, 6, 7, 40}
 | 
						||
 | 
						||
With "ibm,associativity-reference-points"  { 0x3 }
 | 
						||
 | 
						||
"ibm,lookup-index-table" helps in having a compact representation of distance matrix.
 | 
						||
Since domainID can be sparse, the matrix of distances can also be effectively sparse.
 | 
						||
With "ibm,lookup-index-table" we can achieve a compact representation of
 | 
						||
distance information.
 |