qemu-kvm/kvm-spapr-Fix-ibm-max-assoc...

150 lines
4.5 KiB
Diff

From f39913b42600b838c415f6fb561be940bea265dd Mon Sep 17 00:00:00 2001
From: Serhii Popovych <spopovyc@redhat.com>
Date: Wed, 9 Jan 2019 13:31:49 +0000
Subject: [PATCH 1/2] spapr: Fix ibm, max-associativity-domains property number
of nodes
RH-Author: Serhii Popovych <spopovyc@redhat.com>
Message-id: <1547040709-797-1-git-send-email-spopovyc@redhat.com>
Patchwork-id: 83920
O-Subject: [RHEL-8.0 qemu-kvm PATCH v2] spapr: Fix ibm, max-associativity-domains property number of nodes
Bugzilla: 1653114
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
RH-Acked-by: David Gibson <dgibson@redhat.com>
RH-Acked-by: Thomas Huth <thuth@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1653114
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19727263
Branch: rhel8/master-3.1.0
Upstream: Merged
Testing: Build and boot tested on rhel-7.6 with steps described in
comment 0. Issue no longer reproducible.
Laurent Vivier reported off by one with maximum number of NUMA nodes
provided by qemu-kvm being less by one than required according to
description of "ibm,max-associativity-domains" property in LoPAPR.
It appears that I incorrectly treated LoPAPR description of this
property assuming it provides last valid domain (NUMA node here)
instead of maximum number of domains.
### Before hot-add
(qemu) info numa
3 nodes
node 0 cpus: 0
node 0 size: 0 MB
node 0 plugged: 0 MB
node 1 cpus:
node 1 size: 1024 MB
node 1 plugged: 0 MB
node 2 cpus:
node 2 size: 0 MB
node 2 plugged: 0 MB
$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 0 MB
node 0 free: 0 MB
node 1 cpus:
node 1 size: 999 MB
node 1 free: 658 MB
node distances:
node 0 1
0: 10 40
1: 40 10
### Hot-add
(qemu) object_add memory-backend-ram,id=mem0,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem0,node=2
(qemu) [ 87.704898] pseries-hotplug-mem: Attempting to hot-add 4 ...
<there is no "Initmem setup node 2 [mem 0xHEX-0xHEX]">
[ 87.705128] lpar: Attempting to resize HPT to shift 21
... <HPT resize messages>
### After hot-add
(qemu) info numa
3 nodes
node 0 cpus: 0
node 0 size: 0 MB
node 0 plugged: 0 MB
node 1 cpus:
node 1 size: 1024 MB
node 1 plugged: 0 MB
node 2 cpus:
node 2 size: 1024 MB
node 2 plugged: 1024 MB
$ numactl -H
available: 2 nodes (0-1)
^^^^^^^^^^^^^^^^^^^^^^^^
Still only two nodes (and memory hot-added to node 0 below)
node 0 cpus: 0
node 0 size: 1024 MB
node 0 free: 1021 MB
node 1 cpus:
node 1 size: 999 MB
node 1 free: 658 MB
node distances:
node 0 1
0: 10 40
1: 40 10
After fix applied numactl(8) reports 3 nodes available and memory
plugged into node 2 as expected.
>From David Gibson:
------------------
Qemu makes a distinction between "non NUMA" (nb_numa_nodes == 0) and
"NUMA with one node" (nb_numa_nodes == 1). But from a PAPR guests's
point of view these are equivalent. I don't want to present two
different cases to the guest when we don't need to, so even though the
guest can handle it, I'd prefer we put a '1' here for both the
nb_numa_nodes == 0 and nb_numa_nodes == 1 case.
This consolidates everything discussed previously on mailing list.
Fixes: da9f80fbad21 ("spapr: Add ibm,max-associativity-domains property")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Serhii Popovych <spopovyc@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
(cherry picked from commit 3908a24fcb83913079d315de0ca6d598e8616dbb)
Signed-off-by: Serhii Popovych <spopovyc@redhat.com>
---
v2:
Rebased against rhel8/qemu-kvm-3.1.0 for RHEL Advanced Virtualization
product.
Added "Brach:" tag to commint message as suggested by Laurent Vivier.
hw/ppc/spapr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
---
hw/ppc/spapr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index d5d2eb4..bd2abb7 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1033,7 +1033,7 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
cpu_to_be32(0),
cpu_to_be32(0),
cpu_to_be32(0),
- cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
+ cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1),
};
_FDT(rtas = fdt_add_subnode(fdt, 0, "rtas"));
--
1.8.3.1