Branch synchronization with RHEL 8.8.0

This commit is contained in:
Miroslav Rezanina 2023-04-04 08:34:43 -04:00
parent ffb018130f
commit d9dd6a665d
133 changed files with 4414 additions and 13884 deletions

5
.gitignore vendored
View File

@ -1,2 +1,5 @@
SOURCES/qemu-6.2.0.tar.xz
/qemu-5.2.0-rc2.tar.xz
/qemu-5.2.0.tar.xz
/qemu-6.0.0.tar.xz
/qemu-6.2.0.tar.xz
SOURCES/qemu-6.2.0.tar.xz

View File

@ -1,313 +0,0 @@
From fc113ecd7c99646a7ced0b99570b5927ae6d595f Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Wed, 26 May 2021 10:56:02 +0200
Subject: Initial redhat build
This patch introduces redhat build structure in redhat subdirectory. In addition,
several issues are fixed in QEMU tree:
- Change of app name for sasl_server_init in VNC code from qemu to qemu-kvm
- As we use qemu-kvm as name in all places, this is updated to be consistent
- Man page renamed from qemu to qemu-kvm
- man page is installed using make install so we have to fix it in qemu tree
We disable make check due to issues with some of the tests.
This rebase is based on qemu-kvm-6.2.0-13.el9
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
--
Rebase changes (6.1.0):
- Move build to .distro
- Move changes for support file to related commit
- Added dependency for python3-sphinx-rtd_theme
- Removed --disable-sheepdog configure option
- Added new hw-display modules
- SASL initialization moved to ui/vnc-auth-sasl.c
- Add accel-qtest-<arch> and accel-tcg-x86_64 libraries
- Added hw-usb-host module
- Disable new configure options (bpf, nvmm, slirp-smbd)
- Use -pie for ksmctl build (annocheck complain fix)
Rebase changes (6.2.0):
- removed --disable-jemalloc and --disable-tcmalloc configure options
- added audio-oss.so
- added fdt requirement for x86_64
- tests/acceptance renamed to tests/avocado
- added multiboot_dma.bin
- Add -Wno-string-plus-int to extra flags
- Updated configure options
Rebase changes (7.0.0):
- Do not use -mlittle CFLAG on ppc64le
- Used upstream handling issue with ui/clipboard.c
- Use -mlittle-endian on ppc64le instead of deleteing it in configure
- Drop --disable-libxml2 option for configure (upstream)
- Remove vof roms
- Disable AVX2 support
- Use internal meson
- Disable new configure options (dbus-display and qga-vss)
- Change permissions on installing tests/Makefile.include
- Remove ssh block driver
Merged patches (6.0.0):
- 605758c902 Limit build on Power to qemu-img and qemu-ga only
Merged patches (6.1.0):
- f04f91751f Use cached tarballs
- 6581165c65 Remove message with running VM count
- 03c3cac9fc spec-file: build qemu-kvm without SPICE and QXL
- e0ae6c1f6c spec-file: Obsolete qemu-kvm-ui-spice
- 9d2e9f9ecf spec: Do not build qemu-kvm-block-gluster
- cf470b4234 spec: Do not link pcnet and ne2k_pci roms
- e981284a6b redhat: Install the s390-netboot.img that we've built
- 24ef557f33 spec: Remove usage of Group: tag
- c40d69b4f4 spec: Drop %defattr usage
- f8e98798ce spec: Clean up BuildRequires
- 47246b43ee spec: Remove iasl BuildRequires
- 170dc1cbe0 spec: Remove redundant 0 in conditionals
- 8718f6fa11 spec: Add more have_XXX conditionals
- a001269ce9 spec: Remove binutils versioned Requires
- 34545ee641 spec: Remove diffutils BuildRequires
- c2c82beac9 spec: Remove redundant Requires:
- 9314c231f4 spec: Add XXX_version macros
- c43db0bf0f spec: Add have_block_rbd
- 3ecb0c0319 qga: drop StandardError=syslog
- 018049dc80 Remove iscsi support
- a2edf18777 redhat: Replace the kvm-setup.service with a /etc/modules-load.d config file
- 387b5fbcfe redhat: Move qemu-kvm-docs dependency to qemu-kvm
- 4ead693178 redhat: introducting qemu-kvm-hw-usbredir
- 4dc6fc3035 redhat: use the standard vhost-user JSON path
- 84757178b4 Fix local build
- 8c394227dd spec: Restrict block drivers in tools
- b6aa7c1fae Move tools to separate package
- eafd82e509 Split qemu-pr-helper to separate package
- 2c0182e2aa spec: RPM_BUILD_ROOT -> %{buildroot}
- 91bd55ca13 spec: More use of %{name} instead of 'qemu-kvm'
- 50ba299c61 spec: Use qemu-pr-helper.service from qemu.git (partial)
- ee08d4e0a3 spec: Use %{_sourcedir} for referencing sources
- 039e7f7d02 spec: Add tools_only
- 884ba71617 spec: %build: Add run_configure helper
- 8ebd864d65 spec: %build: Disable more bits with %{disable_everything} (partial)
- f23fdb53f5 spec: %build: Add macros for some 'configure' parameters
- fe951a8bd8 spec: %files: Move qemu-guest-agent and qemu-img earlier
- 353b632e37 spec: %install: Remove redundant bits
- 9d2015b752 spec: %install: Add %{modprobe_kvm_conf} macro
- 6d05134e8c spec: %install: Remove qemu-guest-agent /etc/qemu-kvm usage
- 985b226467 spec: %install: clean up qemu-ga section
- dfaf9c600d spec: %install: Use a single %{tools_only} section
- f6978ddb46 spec: Make tools_only not cross spec sections
- 071c211098 spec: %install: Limit time spent in %{qemu_kvm_build}
- 1b65c674be spec: misc syntactic merges with Fedora
- 4da16294cf spec: Use Fedora's pattern for specifying rc version
- d7ee259a79 spec: %files: don't use fine grained -docs file list
- 64cad0c60f spec: %files: Add licenses to qemu-common too
- c3de4f080a spec: %install: Drop python3 shebang fixup
- 46fc216115 Update local build to work with spec file improvements
- bab9531548 spec: Remove buildldflags
- c8360ab6a9 spec: Use %make_build macro
- f6966c66e9 spec: Drop make install sharedir and datadir usage
- 86982421bc spec: use %make_install macro
- 191c405d22 spec: parallelize `make check`
- 251a1fb958 spec: Drop explicit --build-id
- 44c7dda6c3 spec: use %{build_ldflags}
- 0009a34354 Move virtiofsd to separate package
- 34d1b200b3 Utilize --firmware configure option
- 2800e1dd03 spec: Switch toolchain to Clang/LLVM (except process-patches.sh)
- e8a70f500f spec: Use safe-stack for x86_64
- e29445d50d spec: Reenable write support for VMDK etc. in tools
- a4fe2a3e16 redhat: Disable LTO on non-x86 architectures
Merged patches (6.2.0):
- 333452440b remove sgabios dependency
- 7d3633f184 enable pulseaudio
- bd898709b0 spec: disable use of gcrypt for crypto backends in favour of gnutls
- e4f0c6dee6 spec: Remove block-curl and block-ssh dependency
- 4dc13bfe63 spec: Build the VDI block driver
- d2f2ff3c74 spec: Explicitly include compress filter
- a7d047f9c2 Move ksmtuned files to separate package
Merged patches (7.0.0):
- 098d4d08d0 spec: Rename qemu-kvm-hw-usbredir to qemu-kvm-device-usb-redirect
- c2bd0d6834 spec: Split qemu-kvm-ui-opengl
- 2c9cda805d spec: Introduce packages for virtio-gpu-* modules (changed as rhel device tree not set)
- d0414a3e0b spec: Introduce device-display-virtio-vga* packages
- 3534ec46d4 spec: Move usb-host module to separate package
- ddc14d4737 spec: Move qtest accel module to tests package
- 6f2c4befa6 spec: Extend qemu-kvm-core description
- 6f11866e4e (rhel/rhel-9.0.0) Update to qemu-kvm-6.2.0-6.el9
- da0a28758f ui/clipboard: fix use-after-free regression
- 895d4d52eb spec: Remove qemu-virtiofsd
- c8c8c8bd84 spec: Fix obsolete for spice subpackages
- d46d2710b2 spec: Obsolete old usb redir subpackage
- 6f52a50b68 spec: Obsolete ssh driver
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
---
.distro/85-kvm.preset | 5 -
.distro/Makefile | 100 +
.distro/Makefile.common | 40 +
.distro/README.tests | 39 +
.distro/ksm.service | 13 -
.distro/ksm.sysconfig | 4 -
.distro/ksmctl.c | 77 -
.distro/ksmtuned | 139 -
.distro/ksmtuned.conf | 21 -
.distro/ksmtuned.service | 12 -
.distro/kvm-setup | 49 -
.distro/kvm-setup.service | 14 -
.distro/modules-load.conf | 4 +
.distro/qemu-guest-agent.service | 1 -
.distro/qemu-kvm.spec.template | 4034 +++++++++++++++++++++++
.distro/rpminspect.yaml | 6 +-
.distro/scripts/extract_build_cmd.py | 12 +
.gitignore | 1 +
README.systemtap | 43 +
meson.build | 4 +-
scripts/qemu-guest-agent/fsfreeze-hook | 2 +-
scripts/systemtap/conf.d/qemu_kvm.conf | 4 +
scripts/systemtap/script.d/qemu_kvm.stp | 1 +
tests/check-block.sh | 2 +
ui/vnc-auth-sasl.c | 2 +-
25 files changed, 4290 insertions(+), 339 deletions(-)
delete mode 100644 .distro/85-kvm.preset
create mode 100644 .distro/Makefile
create mode 100644 .distro/Makefile.common
create mode 100644 .distro/README.tests
delete mode 100644 .distro/ksm.service
delete mode 100644 .distro/ksm.sysconfig
delete mode 100644 .distro/ksmctl.c
delete mode 100644 .distro/ksmtuned
delete mode 100644 .distro/ksmtuned.conf
delete mode 100644 .distro/ksmtuned.service
delete mode 100644 .distro/kvm-setup
delete mode 100644 .distro/kvm-setup.service
create mode 100644 .distro/modules-load.conf
create mode 100644 .distro/qemu-kvm.spec.template
create mode 100644 README.systemtap
create mode 100644 scripts/systemtap/conf.d/qemu_kvm.conf
create mode 100644 scripts/systemtap/script.d/qemu_kvm.stp
diff --git a/README.systemtap b/README.systemtap
new file mode 100644
index 0000000000..ad913fc990
--- /dev/null
+++ b/README.systemtap
@@ -0,0 +1,43 @@
+QEMU tracing using systemtap-initscript
+---------------------------------------
+
+You can capture QEMU trace data all the time using systemtap-initscript. This
+uses SystemTap's flight recorder mode to trace all running guests to a
+fixed-size buffer on the host. Old trace entries are overwritten by new
+entries when the buffer size wraps.
+
+1. Install the systemtap-initscript package:
+ # yum install systemtap-initscript
+
+2. Install the systemtap scripts and the conf file:
+ # cp /usr/share/qemu-kvm/systemtap/script.d/qemu_kvm.stp /etc/systemtap/script.d/
+ # cp /usr/share/qemu-kvm/systemtap/conf.d/qemu_kvm.conf /etc/systemtap/conf.d/
+
+The set of trace events to enable is given in qemu_kvm.stp. This SystemTap
+script can be customized to add or remove trace events provided in
+/usr/share/systemtap/tapset/qemu-kvm-simpletrace.stp.
+
+SystemTap customizations can be made to qemu_kvm.conf to control the flight
+recorder buffer size and whether to store traces in memory only or disk too.
+See stap(1) for option documentation.
+
+3. Start the systemtap service.
+ # service systemtap start qemu_kvm
+
+4. Make the service start at boot time.
+ # chkconfig systemtap on
+
+5. Confirm that the service works.
+ # service systemtap status qemu_kvm
+ qemu_kvm is running...
+
+When you want to inspect the trace buffer, perform the following steps:
+
+1. Dump the trace buffer.
+ # staprun -A qemu_kvm >/tmp/trace.log
+
+2. Start the systemtap service because the preceding step stops the service.
+ # service systemtap start qemu_kvm
+
+3. Translate the trace record to readable format.
+ # /usr/share/qemu-kvm/simpletrace.py --no-header /usr/share/qemu-kvm/trace-events /tmp/trace.log
diff --git a/meson.build b/meson.build
index 861de93c4f..6f7e430f0f 100644
--- a/meson.build
+++ b/meson.build
@@ -2394,7 +2394,9 @@ if capstone_opt == 'internal'
# Include all configuration defines via a header file, which will wind up
# as a dependency on the object file, and thus changes here will result
# in a rebuild.
- '-include', 'capstone-defs.h'
+ '-include', 'capstone-defs.h',
+
+ '-Wp,-D_GLIBCXX_ASSERTIONS',
]
libcapstone = static_library('capstone',
diff --git a/scripts/qemu-guest-agent/fsfreeze-hook b/scripts/qemu-guest-agent/fsfreeze-hook
index 13aafd4845..e9b84ec028 100755
--- a/scripts/qemu-guest-agent/fsfreeze-hook
+++ b/scripts/qemu-guest-agent/fsfreeze-hook
@@ -8,7 +8,7 @@
# request, it is issued with "thaw" argument after filesystem is thawed.
LOGFILE=/var/log/qga-fsfreeze-hook.log
-FSFREEZE_D=$(dirname -- "$0")/fsfreeze-hook.d
+FSFREEZE_D=$(dirname -- "$(realpath $0)")/fsfreeze-hook.d
# Check whether file $1 is a backup or rpm-generated file and should be ignored
is_ignored_file() {
diff --git a/scripts/systemtap/conf.d/qemu_kvm.conf b/scripts/systemtap/conf.d/qemu_kvm.conf
new file mode 100644
index 0000000000..372d8160a4
--- /dev/null
+++ b/scripts/systemtap/conf.d/qemu_kvm.conf
@@ -0,0 +1,4 @@
+# Force load uprobes (see BZ#1118352)
+stap -e 'probe process("/usr/libexec/qemu-kvm").function("main") { printf("") }' -c true
+
+qemu_kvm_OPT="-s4" # per-CPU buffer size, in megabytes
diff --git a/scripts/systemtap/script.d/qemu_kvm.stp b/scripts/systemtap/script.d/qemu_kvm.stp
new file mode 100644
index 0000000000..c04abf9449
--- /dev/null
+++ b/scripts/systemtap/script.d/qemu_kvm.stp
@@ -0,0 +1 @@
+probe qemu.kvm.simpletrace.handle_qmp_command,qemu.kvm.simpletrace.monitor_protocol_*,qemu.kvm.simpletrace.migrate_set_state {}
diff --git a/tests/check-block.sh b/tests/check-block.sh
index f59496396c..d900d8b35e 100755
--- a/tests/check-block.sh
+++ b/tests/check-block.sh
@@ -48,6 +48,8 @@ if LANG=C bash --version | grep -q 'GNU bash, version [123]' ; then
skip "bash version too old ==> Not running the qemu-iotests."
fi
+exit 0
+
cd tests/qemu-iotests
# QEMU_CHECK_BLOCK_AUTO is used to disable some unstable sub-tests
diff --git a/ui/vnc-auth-sasl.c b/ui/vnc-auth-sasl.c
index 47fdae5b21..2a950caa2a 100644
--- a/ui/vnc-auth-sasl.c
+++ b/ui/vnc-auth-sasl.c
@@ -42,7 +42,7 @@
bool vnc_sasl_server_init(Error **errp)
{
- int saslErr = sasl_server_init(NULL, "qemu");
+ int saslErr = sasl_server_init(NULL, "qemu-kvm");
if (saslErr != SASL_OK) {
error_setg(errp, "Failed to initialize SASL auth: %s",
--
2.31.1

View File

@ -1,642 +0,0 @@
From 51ec7495d69fe4b4d0b61642ca6c0e7fd7a1032d Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Thu, 15 Jul 2021 03:22:36 -0400
Subject: Enable/disable devices for RHEL
This commit adds all changes related to changes in supported devices.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
--
Rebase notes (6.1.0):
- Added CONFIG_TPM (except s390x)
- default-configs moved to configs
- Use --with-device-<ARCH> configure option to use rhel configs
Rebase notes (6.2.0):
- Add CONFIG_ISA_FDC
- Do not remove -no-hpet documentation
Rebase notes (7.0.0):
- Added CONFIG_ARM_GIC_TCG option for aarch64
- Fixes necessary for layout change fixes
- Renamed CONFIG_ARM_GIC_TCG to CONFIG_ARM_GICV3_TCG
- Removed upstream devices
Merged patches (6.1.0):
- c51bf45304 Remove SPICE and QXL from x86_64-rh-devices.mak
- 02fc745601 aarch64-rh-devices: add CONFIG_PVPANIC_PCI
- f2fe835153 aarch64-rh-devices: add CONFIG_PXB
- b5431733ad disable CONFIG_USB_STORAGE_BOT
- 478ba0cdf6 Disable TPM passthrough
- 2504d68a7c aarch64: Add USB storage devices
- 51c2a3253c disable ac97 audio
Merged patches (6.2.0):
- 9f2f9fa2ba disable sga device
Merged patches (7.0.0):
- fd7c45a5a8 redhat: Enable virtio-mem as tech-preview on x86-64
- c9e68ea451 Enable SGX -- RH Only
---
.distro/qemu-kvm.spec.template | 18 +--
.../aarch64-softmmu/aarch64-rh-devices.mak | 34 ++++++
.../ppc64-softmmu/ppc64-rh-devices.mak | 35 ++++++
configs/devices/rh-virtio.mak | 10 ++
.../s390x-softmmu/s390x-rh-devices.mak | 15 +++
.../x86_64-softmmu/x86_64-rh-devices.mak | 103 ++++++++++++++++++
hw/acpi/ich9.c | 4 +-
hw/arm/meson.build | 2 +-
hw/block/fdc.c | 10 ++
hw/cpu/meson.build | 5 +-
hw/display/cirrus_vga.c | 5 +-
hw/ide/piix.c | 5 +-
hw/input/pckbd.c | 2 +
hw/net/e1000.c | 2 +
hw/ppc/spapr_cpu_core.c | 2 +
hw/usb/meson.build | 2 +-
target/arm/cpu_tcg.c | 10 ++
target/ppc/cpu-models.c | 9 ++
target/s390x/cpu_models_sysemu.c | 3 +
target/s390x/kvm/kvm.c | 8 ++
20 files changed, 269 insertions(+), 15 deletions(-)
create mode 100644 configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
create mode 100644 configs/devices/ppc64-softmmu/ppc64-rh-devices.mak
create mode 100644 configs/devices/rh-virtio.mak
create mode 100644 configs/devices/s390x-softmmu/s390x-rh-devices.mak
create mode 100644 configs/devices/x86_64-softmmu/x86_64-rh-devices.mak
diff --git a/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak b/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
new file mode 100644
index 0000000000..5f6ee1de5b
--- /dev/null
+++ b/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
@@ -0,0 +1,34 @@
+include ../rh-virtio.mak
+
+CONFIG_ARM_GIC_KVM=y
+CONFIG_ARM_GICV3_TCG=y
+CONFIG_ARM_GIC=y
+CONFIG_ARM_SMMUV3=y
+CONFIG_ARM_V7M=y
+CONFIG_ARM_VIRT=y
+CONFIG_EDID=y
+CONFIG_PCIE_PORT=y
+CONFIG_PCI_DEVICES=y
+CONFIG_PCI_TESTDEV=y
+CONFIG_PFLASH_CFI01=y
+CONFIG_SCSI=y
+CONFIG_SEMIHOSTING=y
+CONFIG_USB=y
+CONFIG_USB_XHCI=y
+CONFIG_USB_XHCI_PCI=y
+CONFIG_USB_STORAGE_CORE=y
+CONFIG_USB_STORAGE_CLASSIC=y
+CONFIG_VFIO=y
+CONFIG_VFIO_PCI=y
+CONFIG_VIRTIO_MMIO=y
+CONFIG_VIRTIO_PCI=y
+CONFIG_XIO3130=y
+CONFIG_NVDIMM=y
+CONFIG_ACPI_APEI=y
+CONFIG_TPM=y
+CONFIG_TPM_EMULATOR=y
+CONFIG_TPM_TIS_SYSBUS=y
+CONFIG_PTIMER=y
+CONFIG_ARM_COMPATIBLE_SEMIHOSTING=y
+CONFIG_PVPANIC_PCI=y
+CONFIG_PXB=y
diff --git a/configs/devices/ppc64-softmmu/ppc64-rh-devices.mak b/configs/devices/ppc64-softmmu/ppc64-rh-devices.mak
new file mode 100644
index 0000000000..6a3e3f0227
--- /dev/null
+++ b/configs/devices/ppc64-softmmu/ppc64-rh-devices.mak
@@ -0,0 +1,35 @@
+include ../rh-virtio.mak
+
+CONFIG_DIMM=y
+CONFIG_MEM_DEVICE=y
+CONFIG_NVDIMM=y
+CONFIG_PCI=y
+CONFIG_PCI_DEVICES=y
+CONFIG_PCI_TESTDEV=y
+CONFIG_PCI_EXPRESS=y
+CONFIG_PSERIES=y
+CONFIG_SCSI=y
+CONFIG_SPAPR_VSCSI=y
+CONFIG_TEST_DEVICES=y
+CONFIG_USB=y
+CONFIG_USB_OHCI=y
+CONFIG_USB_OHCI_PCI=y
+CONFIG_USB_SMARTCARD=y
+CONFIG_USB_STORAGE_CORE=y
+CONFIG_USB_STORAGE_CLASSIC=y
+CONFIG_USB_XHCI=y
+CONFIG_USB_XHCI_NEC=y
+CONFIG_USB_XHCI_PCI=y
+CONFIG_VFIO=y
+CONFIG_VFIO_PCI=y
+CONFIG_VGA=y
+CONFIG_VGA_PCI=y
+CONFIG_VHOST_USER=y
+CONFIG_VIRTIO_PCI=y
+CONFIG_VIRTIO_VGA=y
+CONFIG_WDT_IB6300ESB=y
+CONFIG_XICS=y
+CONFIG_XIVE=y
+CONFIG_TPM=y
+CONFIG_TPM_SPAPR=y
+CONFIG_TPM_EMULATOR=y
diff --git a/configs/devices/rh-virtio.mak b/configs/devices/rh-virtio.mak
new file mode 100644
index 0000000000..94ede1b5f6
--- /dev/null
+++ b/configs/devices/rh-virtio.mak
@@ -0,0 +1,10 @@
+CONFIG_VIRTIO=y
+CONFIG_VIRTIO_BALLOON=y
+CONFIG_VIRTIO_BLK=y
+CONFIG_VIRTIO_GPU=y
+CONFIG_VIRTIO_INPUT=y
+CONFIG_VIRTIO_INPUT_HOST=y
+CONFIG_VIRTIO_NET=y
+CONFIG_VIRTIO_RNG=y
+CONFIG_VIRTIO_SCSI=y
+CONFIG_VIRTIO_SERIAL=y
diff --git a/configs/devices/s390x-softmmu/s390x-rh-devices.mak b/configs/devices/s390x-softmmu/s390x-rh-devices.mak
new file mode 100644
index 0000000000..d3b38312e1
--- /dev/null
+++ b/configs/devices/s390x-softmmu/s390x-rh-devices.mak
@@ -0,0 +1,15 @@
+include ../rh-virtio.mak
+
+CONFIG_PCI=y
+CONFIG_S390_CCW_VIRTIO=y
+CONFIG_S390_FLIC=y
+CONFIG_S390_FLIC_KVM=y
+CONFIG_SCLPCONSOLE=y
+CONFIG_SCSI=y
+CONFIG_VFIO=y
+CONFIG_VFIO_AP=y
+CONFIG_VFIO_CCW=y
+CONFIG_VFIO_PCI=y
+CONFIG_VHOST_USER=y
+CONFIG_VIRTIO_CCW=y
+CONFIG_WDT_DIAG288=y
diff --git a/configs/devices/x86_64-softmmu/x86_64-rh-devices.mak b/configs/devices/x86_64-softmmu/x86_64-rh-devices.mak
new file mode 100644
index 0000000000..d0c9e66641
--- /dev/null
+++ b/configs/devices/x86_64-softmmu/x86_64-rh-devices.mak
@@ -0,0 +1,103 @@
+include ../rh-virtio.mak
+
+CONFIG_ACPI=y
+CONFIG_ACPI_PCI=y
+CONFIG_ACPI_CPU_HOTPLUG=y
+CONFIG_ACPI_MEMORY_HOTPLUG=y
+CONFIG_ACPI_NVDIMM=y
+CONFIG_ACPI_SMBUS=y
+CONFIG_ACPI_VMGENID=y
+CONFIG_ACPI_X86=y
+CONFIG_ACPI_X86_ICH=y
+CONFIG_AHCI=y
+CONFIG_APIC=y
+CONFIG_APM=y
+CONFIG_BOCHS_DISPLAY=y
+CONFIG_DIMM=y
+CONFIG_E1000E_PCI_EXPRESS=y
+CONFIG_E1000_PCI=y
+CONFIG_EDU=y
+CONFIG_FDC=y
+CONFIG_FDC_SYSBUS=y
+CONFIG_FDC_ISA=y
+CONFIG_FW_CFG_DMA=y
+CONFIG_HDA=y
+CONFIG_HYPERV=y
+CONFIG_HYPERV_TESTDEV=y
+CONFIG_I2C=y
+CONFIG_I440FX=y
+CONFIG_I8254=y
+CONFIG_I8257=y
+CONFIG_I8259=y
+CONFIG_I82801B11=y
+CONFIG_IDE_CORE=y
+CONFIG_IDE_PCI=y
+CONFIG_IDE_PIIX=y
+CONFIG_IDE_QDEV=y
+CONFIG_IOAPIC=y
+CONFIG_IOH3420=y
+CONFIG_ISA_BUS=y
+CONFIG_ISA_DEBUG=y
+CONFIG_ISA_TESTDEV=y
+CONFIG_LPC_ICH9=y
+CONFIG_MC146818RTC=y
+CONFIG_MEM_DEVICE=y
+CONFIG_NVDIMM=y
+CONFIG_OPENGL=y
+CONFIG_PAM=y
+CONFIG_PC=y
+CONFIG_PCI=y
+CONFIG_PCIE_PORT=y
+CONFIG_PCI_DEVICES=y
+CONFIG_PCI_EXPRESS=y
+CONFIG_PCI_EXPRESS_Q35=y
+CONFIG_PCI_I440FX=y
+CONFIG_PCI_TESTDEV=y
+CONFIG_PCKBD=y
+CONFIG_PCSPK=y
+CONFIG_PC_ACPI=y
+CONFIG_PC_PCI=y
+CONFIG_PFLASH_CFI01=y
+CONFIG_PVPANIC_ISA=y
+CONFIG_PXB=y
+CONFIG_Q35=y
+CONFIG_RTL8139_PCI=y
+CONFIG_SCSI=y
+CONFIG_SERIAL=y
+CONFIG_SERIAL_ISA=y
+CONFIG_SERIAL_PCI=y
+CONFIG_SEV=y
+CONFIG_SMBIOS=y
+CONFIG_SMBUS_EEPROM=y
+CONFIG_TEST_DEVICES=y
+CONFIG_USB=y
+CONFIG_USB_EHCI=y
+CONFIG_USB_EHCI_PCI=y
+CONFIG_USB_SMARTCARD=y
+CONFIG_USB_STORAGE_CORE=y
+CONFIG_USB_STORAGE_CLASSIC=y
+CONFIG_USB_UHCI=y
+CONFIG_USB_XHCI=y
+CONFIG_USB_XHCI_NEC=y
+CONFIG_USB_XHCI_PCI=y
+CONFIG_VFIO=y
+CONFIG_VFIO_PCI=y
+CONFIG_VGA=y
+CONFIG_VGA_CIRRUS=y
+CONFIG_VGA_PCI=y
+CONFIG_VHOST_USER=y
+CONFIG_VHOST_USER_BLK=y
+CONFIG_VIRTIO_MEM=y
+CONFIG_VIRTIO_PCI=y
+CONFIG_VIRTIO_VGA=y
+CONFIG_VMMOUSE=y
+CONFIG_VMPORT=y
+CONFIG_VTD=y
+CONFIG_WDT_IB6300ESB=y
+CONFIG_WDT_IB700=y
+CONFIG_XIO3130=y
+CONFIG_TPM=y
+CONFIG_TPM_CRB=y
+CONFIG_TPM_TIS_ISA=y
+CONFIG_TPM_EMULATOR=y
+CONFIG_SGX=y
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index bd9bbade70..de1e401cdf 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -435,8 +435,8 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm)
static const uint32_t gpe0_len = ICH9_PMIO_GPE0_LEN;
pm->acpi_memory_hotplug.is_enabled = true;
pm->cpu_hotplug_legacy = true;
- pm->disable_s3 = 0;
- pm->disable_s4 = 0;
+ pm->disable_s3 = 1;
+ pm->disable_s4 = 1;
pm->s4_val = 2;
pm->use_acpi_hotplug_bridge = true;
pm->keep_pci_slot_hpc = true;
diff --git a/hw/arm/meson.build b/hw/arm/meson.build
index 721a8eb8be..87ed4dd914 100644
--- a/hw/arm/meson.build
+++ b/hw/arm/meson.build
@@ -31,7 +31,7 @@ arm_ss.add(when: 'CONFIG_VEXPRESS', if_true: files('vexpress.c'))
arm_ss.add(when: 'CONFIG_ZYNQ', if_true: files('xilinx_zynq.c'))
arm_ss.add(when: 'CONFIG_SABRELITE', if_true: files('sabrelite.c'))
-arm_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('armv7m.c'))
+#arm_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('armv7m.c'))
arm_ss.add(when: 'CONFIG_EXYNOS4', if_true: files('exynos4210.c'))
arm_ss.add(when: 'CONFIG_PXA2XX', if_true: files('pxa2xx.c', 'pxa2xx_gpio.c', 'pxa2xx_pic.c'))
arm_ss.add(when: 'CONFIG_DIGIC', if_true: files('digic.c'))
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 347875a0cd..ca1776121f 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -49,6 +49,8 @@
#include "qom/object.h"
#include "fdc-internal.h"
+#include "hw/boards.h"
+
/********************************************************/
/* debug Floppy devices */
@@ -2338,6 +2340,14 @@ void fdctrl_realize_common(DeviceState *dev, FDCtrl *fdctrl, Error **errp)
FDrive *drive;
static int command_tables_inited = 0;
+ /* Restricted for Red Hat Enterprise Linux: */
+ MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+ if (!strstr(mc->name, "-rhel7.")) {
+ error_setg(errp, "Device %s is not supported with machine type %s",
+ object_get_typename(OBJECT(dev)), mc->name);
+ return;
+ }
+
if (fdctrl->fallback == FLOPPY_DRIVE_TYPE_AUTO) {
error_setg(errp, "Cannot choose a fallback FDrive type of 'auto'");
return;
diff --git a/hw/cpu/meson.build b/hw/cpu/meson.build
index 9e52fee9e7..bb71c9f3e7 100644
--- a/hw/cpu/meson.build
+++ b/hw/cpu/meson.build
@@ -1,6 +1,7 @@
-softmmu_ss.add(files('core.c', 'cluster.c'))
+#softmmu_ss.add(files('core.c', 'cluster.c'))
+softmmu_ss.add(files('core.c'))
specific_ss.add(when: 'CONFIG_ARM11MPCORE', if_true: files('arm11mpcore.c'))
specific_ss.add(when: 'CONFIG_REALVIEW', if_true: files('realview_mpcore.c'))
specific_ss.add(when: 'CONFIG_A9MPCORE', if_true: files('a9mpcore.c'))
-specific_ss.add(when: 'CONFIG_A15MPCORE', if_true: files('a15mpcore.c'))
+#specific_ss.add(when: 'CONFIG_A15MPCORE', if_true: files('a15mpcore.c'))
diff --git a/hw/display/cirrus_vga.c b/hw/display/cirrus_vga.c
index 3bb6a58698..6447fdb02e 100644
--- a/hw/display/cirrus_vga.c
+++ b/hw/display/cirrus_vga.c
@@ -2945,7 +2945,10 @@ static void pci_cirrus_vga_realize(PCIDevice *dev, Error **errp)
PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(dev);
int16_t device_id = pc->device_id;
- /*
+ warn_report("'cirrus-vga' is deprecated, "
+ "please use a different VGA card instead");
+
+ /*
* Follow real hardware, cirrus card emulated has 4 MB video memory.
* Also accept 8 MB/16 MB for backward compatibility.
*/
diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index ce89fd0aa3..fbcf802b13 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -232,7 +232,8 @@ static void piix3_ide_class_init(ObjectClass *klass, void *data)
k->device_id = PCI_DEVICE_ID_INTEL_82371SB_1;
k->class_id = PCI_CLASS_STORAGE_IDE;
set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
- dc->hotpluggable = false;
+ /* Disabled for Red Hat Enterprise Linux: */
+ dc->user_creatable = false;
}
static const TypeInfo piix3_ide_info = {
@@ -261,6 +262,8 @@ static void piix4_ide_class_init(ObjectClass *klass, void *data)
k->class_id = PCI_CLASS_STORAGE_IDE;
set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
dc->hotpluggable = false;
+ /* Disabled for Red Hat Enterprise Linux: */
+ dc->user_creatable = false;
}
static const TypeInfo piix4_ide_info = {
diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
index 4efdf75620..5143ebaa27 100644
--- a/hw/input/pckbd.c
+++ b/hw/input/pckbd.c
@@ -814,6 +814,8 @@ static void i8042_class_initfn(ObjectClass *klass, void *data)
dc->vmsd = &vmstate_kbd_isa;
isa->build_aml = i8042_build_aml;
set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
+ /* Disabled for Red Hat Enterprise Linux: */
+ dc->user_creatable = false;
}
static const TypeInfo i8042_info = {
diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index f5bc81296d..282d01e374 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1821,6 +1821,7 @@ static const E1000Info e1000_devices[] = {
.revision = 0x03,
.phy_id2 = E1000_PHY_ID2_8254xx_DEFAULT,
},
+#if 0 /* Disabled for Red Hat Enterprise Linux 7 */
{
.name = "e1000-82544gc",
.device_id = E1000_DEV_ID_82544GC_COPPER,
@@ -1833,6 +1834,7 @@ static const E1000Info e1000_devices[] = {
.revision = 0x03,
.phy_id2 = E1000_PHY_ID2_8254xx_DEFAULT,
},
+#endif
};
static void e1000_register_types(void)
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 8a4861f45a..fcb5dfe792 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -379,10 +379,12 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
.instance_size = sizeof(SpaprCpuCore),
.class_size = sizeof(SpaprCpuCoreClass),
},
+#if 0 /* Disabled for Red Hat Enterprise Linux */
DEFINE_SPAPR_CPU_CORE_TYPE("970_v2.2"),
DEFINE_SPAPR_CPU_CORE_TYPE("970mp_v1.0"),
DEFINE_SPAPR_CPU_CORE_TYPE("970mp_v1.1"),
DEFINE_SPAPR_CPU_CORE_TYPE("power5+_v2.1"),
+#endif
DEFINE_SPAPR_CPU_CORE_TYPE("power7_v2.3"),
DEFINE_SPAPR_CPU_CORE_TYPE("power7+_v2.1"),
DEFINE_SPAPR_CPU_CORE_TYPE("power8_v2.0"),
diff --git a/hw/usb/meson.build b/hw/usb/meson.build
index de853d780d..0776ae6a20 100644
--- a/hw/usb/meson.build
+++ b/hw/usb/meson.build
@@ -52,7 +52,7 @@ softmmu_ss.add(when: 'CONFIG_USB_SMARTCARD', if_true: files('dev-smartcard-reade
if cacard.found()
usbsmartcard_ss = ss.source_set()
usbsmartcard_ss.add(when: 'CONFIG_USB_SMARTCARD',
- if_true: [cacard, files('ccid-card-emulated.c', 'ccid-card-passthru.c')])
+ if_true: [cacard, files('ccid-card-passthru.c')])
hw_usb_modules += {'smartcard': usbsmartcard_ss}
endif
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index 13d0e9b195..3826fa5122 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -22,6 +22,7 @@
/* CPU models. These are not needed for the AArch64 linux-user build. */
#if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
+#if 0 /* Disabled for Red Hat Enterprise Linux */
#if !defined(CONFIG_USER_ONLY) && defined(CONFIG_TCG)
static bool arm_v7m_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
{
@@ -375,6 +376,7 @@ static void cortex_a9_initfn(Object *obj)
cpu->ccsidr[1] = 0x200fe019; /* 16k L1 icache. */
define_arm_cp_regs(cpu, cortexa9_cp_reginfo);
}
+#endif /* disabled for RHEL */
#ifndef CONFIG_USER_ONLY
static uint64_t a15_l2ctlr_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -400,6 +402,7 @@ static const ARMCPRegInfo cortexa15_cp_reginfo[] = {
REGINFO_SENTINEL
};
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void cortex_a7_initfn(Object *obj)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -445,6 +448,7 @@ static void cortex_a7_initfn(Object *obj)
cpu->ccsidr[2] = 0x711fe07a; /* 4096K L2 unified cache */
define_arm_cp_regs(cpu, cortexa15_cp_reginfo); /* Same as A15 */
}
+#endif /* disabled for RHEL */
static void cortex_a15_initfn(Object *obj)
{
@@ -488,6 +492,7 @@ static void cortex_a15_initfn(Object *obj)
define_arm_cp_regs(cpu, cortexa15_cp_reginfo);
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void cortex_m0_initfn(Object *obj)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -928,6 +933,7 @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
cc->gdb_core_xml_file = "arm-m-profile.xml";
}
+#endif /* disabled for RHEL */
#ifndef TARGET_AARCH64
/*
@@ -1007,6 +1013,7 @@ static void arm_max_initfn(Object *obj)
#endif /* !TARGET_AARCH64 */
static const ARMCPUInfo arm_tcg_cpus[] = {
+#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "arm926", .initfn = arm926_initfn },
{ .name = "arm946", .initfn = arm946_initfn },
{ .name = "arm1026", .initfn = arm1026_initfn },
@@ -1022,7 +1029,9 @@ static const ARMCPUInfo arm_tcg_cpus[] = {
{ .name = "cortex-a7", .initfn = cortex_a7_initfn },
{ .name = "cortex-a8", .initfn = cortex_a8_initfn },
{ .name = "cortex-a9", .initfn = cortex_a9_initfn },
+#endif /* disabled for RHEL */
{ .name = "cortex-a15", .initfn = cortex_a15_initfn },
+#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "cortex-m0", .initfn = cortex_m0_initfn,
.class_init = arm_v7m_class_init },
{ .name = "cortex-m3", .initfn = cortex_m3_initfn,
@@ -1053,6 +1062,7 @@ static const ARMCPUInfo arm_tcg_cpus[] = {
{ .name = "pxa270-b1", .initfn = pxa270b1_initfn },
{ .name = "pxa270-c0", .initfn = pxa270c0_initfn },
{ .name = "pxa270-c5", .initfn = pxa270c5_initfn },
+#endif /* disabled for RHEL */
#ifndef TARGET_AARCH64
{ .name = "max", .initfn = arm_max_initfn },
#endif
diff --git a/target/ppc/cpu-models.c b/target/ppc/cpu-models.c
index 976be5e0d1..dd78883410 100644
--- a/target/ppc/cpu-models.c
+++ b/target/ppc/cpu-models.c
@@ -66,6 +66,7 @@
#define POWERPC_DEF(_name, _pvr, _type, _desc) \
POWERPC_DEF_SVR(_name, _desc, _pvr, POWERPC_SVR_NONE, _type)
+#if 0 /* Embedded and 32-bit CPUs disabled for Red Hat Enterprise Linux */
/* Embedded PowerPC */
/* PowerPC 405 family */
/* PowerPC 405 cores */
@@ -698,8 +699,10 @@
"PowerPC 7447A v1.2 (G4)")
POWERPC_DEF("7457a_v1.2", CPU_POWERPC_74x7A_v12, 7455,
"PowerPC 7457A v1.2 (G4)")
+#endif
/* 64 bits PowerPC */
#if defined(TARGET_PPC64)
+#if 0 /* Disabled for Red Hat Enterprise Linux */
POWERPC_DEF("970_v2.2", CPU_POWERPC_970_v22, 970,
"PowerPC 970 v2.2")
POWERPC_DEF("970fx_v1.0", CPU_POWERPC_970FX_v10, 970,
@@ -718,6 +721,7 @@
"PowerPC 970MP v1.1")
POWERPC_DEF("power5+_v2.1", CPU_POWERPC_POWER5P_v21, POWER5P,
"POWER5+ v2.1")
+#endif
POWERPC_DEF("power7_v2.3", CPU_POWERPC_POWER7_v23, POWER7,
"POWER7 v2.3")
POWERPC_DEF("power7+_v2.1", CPU_POWERPC_POWER7P_v21, POWER7,
@@ -897,12 +901,15 @@ PowerPCCPUAlias ppc_cpu_aliases[] = {
{ "7447a", "7447a_v1.2" },
{ "7457a", "7457a_v1.2" },
{ "apollo7pm", "7457a_v1.0" },
+#endif
#if defined(TARGET_PPC64)
+#if 0 /* Disabled for Red Hat Enterprise Linux */
{ "970", "970_v2.2" },
{ "970fx", "970fx_v3.1" },
{ "970mp", "970mp_v1.1" },
{ "power5+", "power5+_v2.1" },
{ "power5gs", "power5+_v2.1" },
+#endif
{ "power7", "power7_v2.3" },
{ "power7+", "power7+_v2.1" },
{ "power8e", "power8e_v2.1" },
@@ -912,6 +919,7 @@ PowerPCCPUAlias ppc_cpu_aliases[] = {
{ "power10", "power10_v2.0" },
#endif
+#if 0 /* Disabled for Red Hat Enterprise Linux */
/* Generic PowerPCs */
#if defined(TARGET_PPC64)
{ "ppc64", "970fx_v3.1" },
@@ -919,5 +927,6 @@ PowerPCCPUAlias ppc_cpu_aliases[] = {
{ "ppc32", "604" },
{ "ppc", "604" },
{ "default", "604" },
+#endif
{ NULL, NULL }
};
diff --git a/target/s390x/cpu_models_sysemu.c b/target/s390x/cpu_models_sysemu.c
index 05c3ccaaff..6a04ccab1b 100644
--- a/target/s390x/cpu_models_sysemu.c
+++ b/target/s390x/cpu_models_sysemu.c
@@ -36,6 +36,9 @@ static void check_unavailable_features(const S390CPUModel *max_model,
(max_model->def->gen == model->def->gen &&
max_model->def->ec_ga < model->def->ec_ga)) {
list_add_feat("type", unavailable);
+ } else if (model->def->gen < 11 && kvm_enabled()) {
+ /* Older CPU models are not supported on Red Hat Enterprise Linux */
+ list_add_feat("type", unavailable);
}
/* detect missing features if any to properly report them */
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 6acf14d5ec..74f089d87f 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2512,6 +2512,14 @@ void kvm_s390_apply_cpu_model(const S390CPUModel *model, Error **errp)
error_setg(errp, "KVM doesn't support CPU models");
return;
}
+
+ /* Older CPU models are not supported on Red Hat Enterprise Linux */
+ if (model->def->gen < 11) {
+ error_setg(errp, "KVM: Unsupported CPU type specified: %s",
+ MACHINE(qdev_get_machine())->cpu_type);
+ return;
+ }
+
prop.cpuid = s390_cpuid_from_cpu_model(model);
prop.ibc = s390_ibc_from_cpu_model(model);
/* configure cpu features indicated via STFL(e) */
--
2.31.1

View File

@ -1,619 +0,0 @@
From a525db3951dc68c469d1f51bdc69ab6e75e72c37 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Fri, 11 Jan 2019 09:54:45 +0100
Subject: Machine type related general changes
This patch is first part of original "Add RHEL machine types" patch we
split to allow easier review. It contains changes not related to any
architecture.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
--
Rebase notes (6.2.0):
- Do not duplicate minimal_version_id for piix4_pm
- Remove empty line chunks in serial.c
- Remove migration.h include in serial.c
- Update hw_compat_rhel_8_5 (from MR 66)
Rebase notes (7.0.0):
- Remove downstream changes leftovers in hw/rtc/mc146818rtc.c
- Remove unnecessary change in hw/usb/hcd-uhci.c
Merged patches (6.1.0):
- f2fb42a3c6 redhat: add missing entries in hw_compat_rhel_8_4
- 1949ec258e hw/arm/virt: Disable PL011 clock migration through hw_compat_rhel_8_3
- a3995e2eff Remove RHEL 7.0.0 machine type (only generic changes)
- ad3190a79b Remove RHEL 7.1.0 machine type (only generic changes)
- 84bbe15d4e Remove RHEL 7.2.0 machine type (only generic changes)
- 0215eb3356 Remove RHEL 7.3.0 machine types (only generic changes)
- af69d1ca6e Remove RHEL 7.4.0 machine types (only generic changes)
- 8f7a74ab78 Remove RHEL 7.5.0 machine types (only generic changes)
Merged patches (6.2.0):
- d687ac13d2 redhat: Define hw_compat_rhel_8_5
Merged patches (7.0.0):
- ef5afcc86d Fix virtio-net-pci* "vectors" compat
- 168f0d56e3 compat: Update hw_compat_rhel_8_5 with 6.2.0 RC2 changes
---
hw/acpi/piix4.c | 6 +-
hw/arm/virt.c | 2 +-
hw/core/machine.c | 186 +++++++++++++++++++++++++++++++++++
hw/display/vga-isa.c | 2 +-
hw/i386/pc_piix.c | 2 +
hw/i386/pc_q35.c | 2 +
hw/net/rtl8139.c | 4 +-
hw/smbios/smbios.c | 46 ++++++++-
hw/timer/i8254_common.c | 2 +-
hw/usb/hcd-xhci-pci.c | 59 ++++++++---
hw/usb/hcd-xhci-pci.h | 1 +
include/hw/boards.h | 21 ++++
include/hw/firmware/smbios.h | 5 +-
include/hw/i386/pc.h | 3 +
14 files changed, 316 insertions(+), 25 deletions(-)
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index fe5625d07a..28544e78c3 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -287,7 +287,7 @@ static bool vmstate_test_migrate_acpi_index(void *opaque, int version_id)
static const VMStateDescription vmstate_acpi = {
.name = "piix4_pm",
.version_id = 3,
- .minimum_version_id = 3,
+ .minimum_version_id = 2,
.post_load = vmstate_acpi_post_load,
.fields = (VMStateField[]) {
VMSTATE_PCI_DEVICE(parent_obj, PIIX4PMState),
@@ -653,8 +653,8 @@ static void piix4_send_gpe(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
static Property piix4_pm_properties[] = {
DEFINE_PROP_UINT32("smb_io_base", PIIX4PMState, smb_io_base, 0),
- DEFINE_PROP_UINT8(ACPI_PM_PROP_S3_DISABLED, PIIX4PMState, disable_s3, 0),
- DEFINE_PROP_UINT8(ACPI_PM_PROP_S4_DISABLED, PIIX4PMState, disable_s4, 0),
+ DEFINE_PROP_UINT8(ACPI_PM_PROP_S3_DISABLED, PIIX4PMState, disable_s3, 1),
+ DEFINE_PROP_UINT8(ACPI_PM_PROP_S4_DISABLED, PIIX4PMState, disable_s4, 1),
DEFINE_PROP_UINT8(ACPI_PM_PROP_S4_VAL, PIIX4PMState, s4_val, 2),
DEFINE_PROP_BOOL(ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, PIIX4PMState,
use_acpi_hotplug_bridge, true),
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index d2e5ecd234..6a84031fd7 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1596,7 +1596,7 @@ static void virt_build_smbios(VirtMachineState *vms)
smbios_set_defaults("QEMU", product,
vmc->smbios_old_sys_ver ? "1.0" : mc->name, false,
- true, SMBIOS_ENTRY_POINT_TYPE_64);
+ true, NULL, NULL, SMBIOS_ENTRY_POINT_TYPE_64);
smbios_get_tables(MACHINE(vms), NULL, 0,
&smbios_tables, &smbios_tables_len,
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 1e23fdc14b..ea430d844e 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -37,6 +37,192 @@
#include "hw/virtio/virtio.h"
#include "hw/virtio/virtio-pci.h"
+/*
+ * Mostly the same as hw_compat_6_0 and hw_compat_6_1
+ */
+GlobalProperty hw_compat_rhel_8_5[] = {
+ /* hw_compat_rhel_8_5 from hw_compat_6_0 */
+ { "gpex-pcihost", "allow-unmapped-accesses", "false" },
+ /* hw_compat_rhel_8_5 from hw_compat_6_0 */
+ { "i8042", "extended-state", "false"},
+ /* hw_compat_rhel_8_5 from hw_compat_6_0 */
+ { "nvme-ns", "eui64-default", "off"},
+ /* hw_compat_rhel_8_5 from hw_compat_6_0 */
+ { "e1000", "init-vet", "off" },
+ /* hw_compat_rhel_8_5 from hw_compat_6_0 */
+ { "e1000e", "init-vet", "off" },
+ /* hw_compat_rhel_8_5 from hw_compat_6_0 */
+ { "vhost-vsock-device", "seqpacket", "off" },
+ /* hw_compat_rhel_8_5 from hw_compat_6_1 */
+ { "vhost-user-vsock-device", "seqpacket", "off" },
+ /* hw_compat_rhel_8_5 from hw_compat_6_1 */
+ { "nvme-ns", "shared", "off" },
+};
+const size_t hw_compat_rhel_8_5_len = G_N_ELEMENTS(hw_compat_rhel_8_5);
+
+/*
+ * Mostly the same as hw_compat_5_2
+ */
+GlobalProperty hw_compat_rhel_8_4[] = {
+ /* hw_compat_rhel_8_4 from hw_compat_5_2 */
+ { "ICH9-LPC", "smm-compat", "on"},
+ /* hw_compat_rhel_8_4 from hw_compat_5_2 */
+ { "PIIX4_PM", "smm-compat", "on"},
+ /* hw_compat_rhel_8_4 from hw_compat_5_2 */
+ { "virtio-blk-device", "report-discard-granularity", "off" },
+ /* hw_compat_rhel_8_4 from hw_compat_5_2 */
+ /*
+ * Upstream incorrectly had "virtio-net-pci" instead of "virtio-net-pci-base",
+ * (https://bugzilla.redhat.com/show_bug.cgi?id=1999141)
+ */
+ { "virtio-net-pci-base", "vectors", "3"},
+};
+const size_t hw_compat_rhel_8_4_len = G_N_ELEMENTS(hw_compat_rhel_8_4);
+
+/*
+ * Mostly the same as hw_compat_5_1
+ */
+GlobalProperty hw_compat_rhel_8_3[] = {
+ /* hw_compat_rhel_8_3 from hw_compat_5_1 */
+ { "vhost-scsi", "num_queues", "1"},
+ /* hw_compat_rhel_8_3 from hw_compat_5_1 */
+ { "vhost-user-blk", "num-queues", "1"},
+ /* hw_compat_rhel_8_3 from hw_compat_5_1 */
+ { "vhost-user-scsi", "num_queues", "1"},
+ /* hw_compat_rhel_8_3 from hw_compat_5_1 */
+ { "virtio-blk-device", "num-queues", "1"},
+ /* hw_compat_rhel_8_3 from hw_compat_5_1 */
+ { "virtio-scsi-device", "num_queues", "1"},
+ /* hw_compat_rhel_8_3 from hw_compat_5_1 */
+ { "nvme", "use-intel-id", "on"},
+ /* hw_compat_rhel_8_3 from hw_compat_5_1 */
+ { "pvpanic", "events", "1"}, /* PVPANIC_PANICKED */
+ /* hw_compat_rhel_8_3 from hw_compat_5_1 */
+ { "pl011", "migrate-clk", "off" },
+ /* hw_compat_rhel_8_3 bz 1912846 */
+ { "pci-xhci", "x-rh-late-msi-cap", "off" },
+ /* hw_compat_rhel_8_3 from hw_compat_5_1 */
+ { "virtio-pci", "x-ats-page-aligned", "off"},
+};
+const size_t hw_compat_rhel_8_3_len = G_N_ELEMENTS(hw_compat_rhel_8_3);
+
+/*
+ * The same as hw_compat_4_2 + hw_compat_5_0
+ */
+GlobalProperty hw_compat_rhel_8_2[] = {
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "virtio-blk-device", "queue-size", "128"},
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "virtio-scsi-device", "virtqueue_size", "128"},
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "virtio-blk-device", "x-enable-wce-if-config-wce", "off" },
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "virtio-blk-device", "seg-max-adjust", "off"},
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "virtio-scsi-device", "seg_max_adjust", "off"},
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "vhost-blk-device", "seg_max_adjust", "off"},
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "usb-host", "suppress-remote-wake", "off" },
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "usb-redir", "suppress-remote-wake", "off" },
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "qxl", "revision", "4" },
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "qxl-vga", "revision", "4" },
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "fw_cfg", "acpi-mr-restore", "false" },
+ /* hw_compat_rhel_8_2 from hw_compat_4_2 */
+ { "virtio-device", "use-disabled-flag", "false" },
+ /* hw_compat_rhel_8_2 from hw_compat_5_0 */
+ { "pci-host-bridge", "x-config-reg-migration-enabled", "off" },
+ /* hw_compat_rhel_8_2 from hw_compat_5_0 */
+ { "virtio-balloon-device", "page-poison", "false" },
+ /* hw_compat_rhel_8_2 from hw_compat_5_0 */
+ { "vmport", "x-read-set-eax", "off" },
+ /* hw_compat_rhel_8_2 from hw_compat_5_0 */
+ { "vmport", "x-signal-unsupported-cmd", "off" },
+ /* hw_compat_rhel_8_2 from hw_compat_5_0 */
+ { "vmport", "x-report-vmx-type", "off" },
+ /* hw_compat_rhel_8_2 from hw_compat_5_0 */
+ { "vmport", "x-cmds-v2", "off" },
+ /* hw_compat_rhel_8_2 from hw_compat_5_0 */
+ { "virtio-device", "x-disable-legacy-check", "true" },
+};
+const size_t hw_compat_rhel_8_2_len = G_N_ELEMENTS(hw_compat_rhel_8_2);
+
+/*
+ * The same as hw_compat_4_1
+ */
+GlobalProperty hw_compat_rhel_8_1[] = {
+ /* hw_compat_rhel_8_1 from hw_compat_4_1 */
+ { "virtio-pci", "x-pcie-flr-init", "off" },
+};
+const size_t hw_compat_rhel_8_1_len = G_N_ELEMENTS(hw_compat_rhel_8_1);
+
+/* The same as hw_compat_3_1
+ * format of array has been changed by:
+ * 6c36bddf5340 ("machine: Use shorter format for GlobalProperty arrays")
+ */
+GlobalProperty hw_compat_rhel_8_0[] = {
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 */
+ { "pcie-root-port", "x-speed", "2_5" },
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 */
+ { "pcie-root-port", "x-width", "1" },
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 */
+ { "memory-backend-file", "x-use-canonical-path-for-ramblock-id", "true" },
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 */
+ { "memory-backend-memfd", "x-use-canonical-path-for-ramblock-id", "true" },
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 */
+ { "tpm-crb", "ppi", "false" },
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 */
+ { "tpm-tis", "ppi", "false" },
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 */
+ { "usb-kbd", "serial", "42" },
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 */
+ { "usb-mouse", "serial", "42" },
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 */
+ { "usb-tablet", "serial", "42" },
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 */
+ { "virtio-blk-device", "discard", "false" },
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 */
+ { "virtio-blk-device", "write-zeroes", "false" },
+ /* hw_compat_rhel_8_0 from hw_compat_4_0 */
+ { "VGA", "edid", "false" },
+ /* hw_compat_rhel_8_0 from hw_compat_4_0 */
+ { "secondary-vga", "edid", "false" },
+ /* hw_compat_rhel_8_0 from hw_compat_4_0 */
+ { "bochs-display", "edid", "false" },
+ /* hw_compat_rhel_8_0 from hw_compat_4_0 */
+ { "virtio-vga", "edid", "false" },
+ /* hw_compat_rhel_8_0 from hw_compat_4_0 */
+ { "virtio-gpu-device", "edid", "false" },
+ /* hw_compat_rhel_8_0 from hw_compat_4_0 */
+ { "virtio-device", "use-started", "false" },
+ /* hw_compat_rhel_8_0 from hw_compat_3_1 - that was added in 4.1 */
+ { "pcie-root-port-base", "disable-acs", "true" },
+};
+const size_t hw_compat_rhel_8_0_len = G_N_ELEMENTS(hw_compat_rhel_8_0);
+
+/* The same as hw_compat_3_0 + hw_compat_2_12
+ * except that
+ * there's nothing in 3_0
+ * migration.decompress-error-check=off was in 7.5 from bz 1584139
+ */
+GlobalProperty hw_compat_rhel_7_6[] = {
+ /* hw_compat_rhel_7_6 from hw_compat_2_12 */
+ { "hda-audio", "use-timer", "false" },
+ /* hw_compat_rhel_7_6 from hw_compat_2_12 */
+ { "cirrus-vga", "global-vmstate", "true" },
+ /* hw_compat_rhel_7_6 from hw_compat_2_12 */
+ { "VGA", "global-vmstate", "true" },
+ /* hw_compat_rhel_7_6 from hw_compat_2_12 */
+ { "vmware-svga", "global-vmstate", "true" },
+ /* hw_compat_rhel_7_6 from hw_compat_2_12 */
+ { "qxl-vga", "global-vmstate", "true" },
+};
+const size_t hw_compat_rhel_7_6_len = G_N_ELEMENTS(hw_compat_rhel_7_6);
+
GlobalProperty hw_compat_6_2[] = {
{ "PIIX4_PM", "x-not-migrate-acpi-index", "on"},
};
diff --git a/hw/display/vga-isa.c b/hw/display/vga-isa.c
index 46abbc5653..505467059b 100644
--- a/hw/display/vga-isa.c
+++ b/hw/display/vga-isa.c
@@ -88,7 +88,7 @@ static void vga_isa_realizefn(DeviceState *dev, Error **errp)
}
static Property vga_isa_properties[] = {
- DEFINE_PROP_UINT32("vgamem_mb", ISAVGAState, state.vram_size_mb, 8),
+ DEFINE_PROP_UINT32("vgamem_mb", ISAVGAState, state.vram_size_mb, 16),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index b72c03d0a6..c797e98312 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -177,6 +177,8 @@ static void pc_init1(MachineState *machine,
smbios_set_defaults("QEMU", "Standard PC (i440FX + PIIX, 1996)",
mc->name, pcmc->smbios_legacy_mode,
pcmc->smbios_uuid_encoded,
+ pcmc->smbios_stream_product,
+ pcmc->smbios_stream_version,
pcms->smbios_entry_point_type);
}
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 1780f79bc1..b695f88c45 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -200,6 +200,8 @@ static void pc_q35_init(MachineState *machine)
smbios_set_defaults("QEMU", "Standard PC (Q35 + ICH9, 2009)",
mc->name, pcmc->smbios_legacy_mode,
pcmc->smbios_uuid_encoded,
+ pcmc->smbios_stream_product,
+ pcmc->smbios_stream_version,
pcms->smbios_entry_point_type);
}
diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 6b65823b4b..75dacabc43 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -3179,7 +3179,7 @@ static int rtl8139_pre_save(void *opaque)
static const VMStateDescription vmstate_rtl8139 = {
.name = "rtl8139",
- .version_id = 5,
+ .version_id = 4,
.minimum_version_id = 3,
.post_load = rtl8139_post_load,
.pre_save = rtl8139_pre_save,
@@ -3260,7 +3260,9 @@ static const VMStateDescription vmstate_rtl8139 = {
VMSTATE_UINT32(tally_counters.TxMCol, RTL8139State),
VMSTATE_UINT64(tally_counters.RxOkPhy, RTL8139State),
VMSTATE_UINT64(tally_counters.RxOkBrd, RTL8139State),
+#if 0 /* Disabled for Red Hat Enterprise Linux bz 1420195 */
VMSTATE_UINT32_V(tally_counters.RxOkMul, RTL8139State, 5),
+#endif
VMSTATE_UINT16(tally_counters.TxAbt, RTL8139State),
VMSTATE_UINT16(tally_counters.TxUndrn, RTL8139State),
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 60349ee402..0edcc98434 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -57,6 +57,9 @@ static bool smbios_legacy = true;
static bool smbios_uuid_encoded = true;
/* end: legacy structures & constants for <= 2.0 machines */
+/* Set to true for modern Windows 10 HardwareID-6 compat */
+static bool smbios_type2_required;
+
uint8_t *smbios_tables;
size_t smbios_tables_len;
@@ -639,7 +642,7 @@ static void smbios_build_type_1_table(void)
static void smbios_build_type_2_table(void)
{
- SMBIOS_BUILD_TABLE_PRE(2, T2_BASE, false); /* optional */
+ SMBIOS_BUILD_TABLE_PRE(2, T2_BASE, smbios_type2_required);
SMBIOS_TABLE_SET_STR(2, manufacturer_str, type2.manufacturer);
SMBIOS_TABLE_SET_STR(2, product_str, type2.product);
@@ -914,7 +917,10 @@ void smbios_set_cpuid(uint32_t version, uint32_t features)
void smbios_set_defaults(const char *manufacturer, const char *product,
const char *version, bool legacy_mode,
- bool uuid_encoded, SmbiosEntryPointType ep_type)
+ bool uuid_encoded,
+ const char *stream_product,
+ const char *stream_version,
+ SmbiosEntryPointType ep_type)
{
smbios_have_defaults = true;
smbios_legacy = legacy_mode;
@@ -935,11 +941,45 @@ void smbios_set_defaults(const char *manufacturer, const char *product,
g_free(smbios_entries);
}
+ /*
+ * If @stream_product & @stream_version are non-NULL, then
+ * we're following rules for new Windows driver support.
+ * The data we have to report is defined in this doc:
+ *
+ * https://docs.microsoft.com/en-us/windows-hardware/drivers/install/specifying-hardware-ids-for-a-computer
+ *
+ * The Windows drivers are written to expect use of the
+ * scheme documented as "HardwareID-6" against Windows 10,
+ * which uses SMBIOS System (Type 1) and Base Board (Type 2)
+ * tables and will match on
+ *
+ * System Manufacturer = Red Hat (@manufacturer)
+ * System SKU Number = 8.2.0 (@stream_version)
+ * Baseboard Manufacturer = Red Hat (@manufacturer)
+ * Baseboard Product = RHEL-AV (@stream_product)
+ *
+ * NB, SKU must be changed with each RHEL-AV release
+ *
+ * Other fields can be freely used by applications using
+ * QEMU. For example apps can use the "System product"
+ * and "System version" to identify themselves.
+ *
+ * We get 'System Manufacturer' and 'Baseboard Manufacturer'
+ */
SMBIOS_SET_DEFAULT(type1.manufacturer, manufacturer);
SMBIOS_SET_DEFAULT(type1.product, product);
SMBIOS_SET_DEFAULT(type1.version, version);
+ SMBIOS_SET_DEFAULT(type1.family, "Red Hat Enterprise Linux");
+ if (stream_version != NULL) {
+ SMBIOS_SET_DEFAULT(type1.sku, stream_version);
+ }
SMBIOS_SET_DEFAULT(type2.manufacturer, manufacturer);
- SMBIOS_SET_DEFAULT(type2.product, product);
+ if (stream_product != NULL) {
+ SMBIOS_SET_DEFAULT(type2.product, stream_product);
+ smbios_type2_required = true;
+ } else {
+ SMBIOS_SET_DEFAULT(type2.product, product);
+ }
SMBIOS_SET_DEFAULT(type2.version, version);
SMBIOS_SET_DEFAULT(type3.manufacturer, manufacturer);
SMBIOS_SET_DEFAULT(type3.version, version);
diff --git a/hw/timer/i8254_common.c b/hw/timer/i8254_common.c
index 050875b497..32935da46c 100644
--- a/hw/timer/i8254_common.c
+++ b/hw/timer/i8254_common.c
@@ -231,7 +231,7 @@ static const VMStateDescription vmstate_pit_common = {
.pre_save = pit_dispatch_pre_save,
.post_load = pit_dispatch_post_load,
.fields = (VMStateField[]) {
- VMSTATE_UINT32_V(channels[0].irq_disabled, PITCommonState, 3),
+ VMSTATE_UINT32(channels[0].irq_disabled, PITCommonState), /* qemu-kvm's v2 had 'flags' here */
VMSTATE_STRUCT_ARRAY(channels, PITCommonState, 3, 2,
vmstate_pit_channel, PITChannelState),
VMSTATE_INT64(channels[0].next_transition_time,
diff --git a/hw/usb/hcd-xhci-pci.c b/hw/usb/hcd-xhci-pci.c
index e934b1a5b1..e18b05e528 100644
--- a/hw/usb/hcd-xhci-pci.c
+++ b/hw/usb/hcd-xhci-pci.c
@@ -104,6 +104,33 @@ static int xhci_pci_vmstate_post_load(void *opaque, int version_id)
return 0;
}
+/* RH bz 1912846 */
+static bool usb_xhci_pci_add_msi(struct PCIDevice *dev, Error **errp)
+{
+ int ret;
+ Error *err = NULL;
+ XHCIPciState *s = XHCI_PCI(dev);
+
+ ret = msi_init(dev, 0x70, s->xhci.numintrs, true, false, &err);
+ /*
+ * Any error other than -ENOTSUP(board's MSI support is broken)
+ * is a programming error
+ */
+ assert(!ret || ret == -ENOTSUP);
+ if (ret && s->msi == ON_OFF_AUTO_ON) {
+ /* Can't satisfy user's explicit msi=on request, fail */
+ error_append_hint(&err, "You have to use msi=auto (default) or "
+ "msi=off with this machine type.\n");
+ error_propagate(errp, err);
+ return true;
+ }
+ assert(!err || s->msi == ON_OFF_AUTO_AUTO);
+ /* With msi=auto, we fall back to MSI off silently */
+ error_free(err);
+
+ return false;
+}
+
static void usb_xhci_pci_realize(struct PCIDevice *dev, Error **errp)
{
int ret;
@@ -125,23 +152,12 @@ static void usb_xhci_pci_realize(struct PCIDevice *dev, Error **errp)
s->xhci.nec_quirks = true;
}
- if (s->msi != ON_OFF_AUTO_OFF) {
- ret = msi_init(dev, 0x70, s->xhci.numintrs, true, false, &err);
- /*
- * Any error other than -ENOTSUP(board's MSI support is broken)
- * is a programming error
- */
- assert(!ret || ret == -ENOTSUP);
- if (ret && s->msi == ON_OFF_AUTO_ON) {
- /* Can't satisfy user's explicit msi=on request, fail */
- error_append_hint(&err, "You have to use msi=auto (default) or "
- "msi=off with this machine type.\n");
+ if (s->msi != ON_OFF_AUTO_OFF && s->rh_late_msi_cap) {
+ /* This gives the behaviour from 5.2.0 onwards, lspci shows 90,a0,70 */
+ if (usb_xhci_pci_add_msi(dev, &err)) {
error_propagate(errp, err);
return;
}
- assert(!err || s->msi == ON_OFF_AUTO_AUTO);
- /* With msi=auto, we fall back to MSI off silently */
- error_free(err);
}
pci_register_bar(dev, 0,
PCI_BASE_ADDRESS_SPACE_MEMORY |
@@ -154,6 +170,14 @@ static void usb_xhci_pci_realize(struct PCIDevice *dev, Error **errp)
assert(ret > 0);
}
+ /* RH bz 1912846 */
+ if (s->msi != ON_OFF_AUTO_OFF && !s->rh_late_msi_cap) {
+ /* This gives the older RH machine behaviour, lspci shows 90,70,a0 */
+ if (usb_xhci_pci_add_msi(dev, &err)) {
+ error_propagate(errp, err);
+ return;
+ }
+ }
if (s->msix != ON_OFF_AUTO_OFF) {
/* TODO check for errors, and should fail when msix=on */
msix_init(dev, s->xhci.numintrs,
@@ -198,11 +222,18 @@ static void xhci_instance_init(Object *obj)
qdev_alias_all_properties(DEVICE(&s->xhci), obj);
}
+static Property xhci_pci_properties[] = {
+ /* RH bz 1912846 */
+ DEFINE_PROP_BOOL("x-rh-late-msi-cap", XHCIPciState, rh_late_msi_cap, true),
+ DEFINE_PROP_END_OF_LIST()
+};
+
static void xhci_class_init(ObjectClass *klass, void *data)
{
PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
DeviceClass *dc = DEVICE_CLASS(klass);
+ device_class_set_props(dc, xhci_pci_properties);
dc->reset = xhci_pci_reset;
dc->vmsd = &vmstate_xhci_pci;
set_bit(DEVICE_CATEGORY_USB, dc->categories);
diff --git a/hw/usb/hcd-xhci-pci.h b/hw/usb/hcd-xhci-pci.h
index c193f79443..086a1feb1e 100644
--- a/hw/usb/hcd-xhci-pci.h
+++ b/hw/usb/hcd-xhci-pci.h
@@ -39,6 +39,7 @@ typedef struct XHCIPciState {
XHCIState xhci;
OnOffAuto msi;
OnOffAuto msix;
+ bool rh_late_msi_cap; /* bz 1912846 */
} XHCIPciState;
#endif
diff --git a/include/hw/boards.h b/include/hw/boards.h
index c92ac8815c..c90a19b4d1 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -449,4 +449,25 @@ extern const size_t hw_compat_2_2_len;
extern GlobalProperty hw_compat_2_1[];
extern const size_t hw_compat_2_1_len;
+extern GlobalProperty hw_compat_rhel_8_5[];
+extern const size_t hw_compat_rhel_8_5_len;
+
+extern GlobalProperty hw_compat_rhel_8_4[];
+extern const size_t hw_compat_rhel_8_4_len;
+
+extern GlobalProperty hw_compat_rhel_8_3[];
+extern const size_t hw_compat_rhel_8_3_len;
+
+extern GlobalProperty hw_compat_rhel_8_2[];
+extern const size_t hw_compat_rhel_8_2_len;
+
+extern GlobalProperty hw_compat_rhel_8_1[];
+extern const size_t hw_compat_rhel_8_1_len;
+
+extern GlobalProperty hw_compat_rhel_8_0[];
+extern const size_t hw_compat_rhel_8_0_len;
+
+extern GlobalProperty hw_compat_rhel_7_6[];
+extern const size_t hw_compat_rhel_7_6_len;
+
#endif
diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index 4b7ad77a44..9acff96a86 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -272,7 +272,10 @@ void smbios_entry_add(QemuOpts *opts, Error **errp);
void smbios_set_cpuid(uint32_t version, uint32_t features);
void smbios_set_defaults(const char *manufacturer, const char *product,
const char *version, bool legacy_mode,
- bool uuid_encoded, SmbiosEntryPointType ep_type);
+ bool uuid_encoded,
+ const char *stream_product,
+ const char *stream_version,
+ SmbiosEntryPointType ep_type);
uint8_t *smbios_get_table_legacy(MachineState *ms, size_t *length);
void smbios_get_tables(MachineState *ms,
const struct smbios_phys_mem_area *mem_array,
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 1a27de9c8b..91331059d9 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -113,6 +113,9 @@ struct PCMachineClass {
bool smbios_defaults;
bool smbios_legacy_mode;
bool smbios_uuid_encoded;
+ /* New fields needed for Windows HardwareID-6 matching */
+ const char *smbios_stream_product;
+ const char *smbios_stream_version;
/* RAM / address space compat: */
bool gigabyte_align;
--
2.31.1

View File

@ -1,352 +0,0 @@
From 697aaa43e3c0f20fc312f06be6c1093f1ba907e1 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Fri, 19 Oct 2018 12:53:31 +0200
Subject: Add aarch64 machine types
Adding changes to add RHEL machine types for aarch64 architecture.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
---
Rebase notes (6.1.0):
- Use CONFIG_TPM check when using TPM structures
- Add support for default_bus_bypass_iommu
- ea4c0b32d9 arm/virt: Register highmem and gic-version as class properties
- 895e1fa86a hw/arm/virt: Add 8.5 and 9.0 machine types and remove older ones
Rebase notes (7.0.0):
- Added dtb-kaslr-seed option
- Set no_tcg_lpa2 to true
Merged patches (6.2.0):
- 9a3d4fde0e hw/arm/virt: Remove 9.0 machine type
- f7d04d6695 hw: arm: virt: Add hw_compat_rhel_8_5 to 8.5 machine type
Merged patches (7.0.0):
- 3b82be3dd3 redhat: virt-rhel8.5.0: Update machine type compatibility for QEMU 6.2.0 update
- c354a86c9b hw/arm/virt: Register "iommu" as a class property
- c1a2630dc9 hw/arm/virt: Register "its" as a class property
- 9d8c61dc93 hw/arm/virt: Rename default_bus_bypass_iommu
- a1d1b6eeb6 hw/arm/virt: Expose the 'RAS' option
- 47f8fe1b82 hw/arm/virt: Add 9.0 machine type and remove 8.5 one
- ed2346788f hw/arm/virt: Check no_tcg_its and minor style changes
---
hw/arm/virt.c | 234 +++++++++++++++++++++++++++++++++++++++++-
include/hw/arm/virt.h | 8 ++
2 files changed, 241 insertions(+), 1 deletion(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6a84031fd7..e06862d22a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -80,6 +80,7 @@
#include "hw/char/pl011.h"
#include "qemu/guest-random.h"
+#if 0 /* Disabled for Red Hat Enterprise Linux */
#define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
void *data) \
@@ -106,7 +107,48 @@
DEFINE_VIRT_MACHINE_LATEST(major, minor, true)
#define DEFINE_VIRT_MACHINE(major, minor) \
DEFINE_VIRT_MACHINE_LATEST(major, minor, false)
-
+#endif /* disabled for RHEL */
+
+#define DEFINE_RHEL_MACHINE_LATEST(m, n, s, latest) \
+ static void rhel##m##n##s##_virt_class_init(ObjectClass *oc, \
+ void *data) \
+ { \
+ MachineClass *mc = MACHINE_CLASS(oc); \
+ rhel##m##n##s##_virt_options(mc); \
+ mc->desc = "RHEL " # m "." # n "." # s " ARM Virtual Machine"; \
+ if (latest) { \
+ mc->alias = "virt"; \
+ mc->is_default = 1; \
+ } \
+ } \
+ static const TypeInfo rhel##m##n##s##_machvirt_info = { \
+ .name = MACHINE_TYPE_NAME("virt-rhel" # m "." # n "." # s), \
+ .parent = TYPE_RHEL_MACHINE, \
+ .class_init = rhel##m##n##s##_virt_class_init, \
+ }; \
+ static void rhel##m##n##s##_machvirt_init(void) \
+ { \
+ type_register_static(&rhel##m##n##s##_machvirt_info); \
+ } \
+ type_init(rhel##m##n##s##_machvirt_init);
+
+#define DEFINE_RHEL_MACHINE_AS_LATEST(major, minor, subminor) \
+ DEFINE_RHEL_MACHINE_LATEST(major, minor, subminor, true)
+#define DEFINE_RHEL_MACHINE(major, minor, subminor) \
+ DEFINE_RHEL_MACHINE_LATEST(major, minor, subminor, false)
+
+/* This variable is for changes to properties that are RHEL specific,
+ * different to the current upstream and to be applied to the latest
+ * machine type.
+ */
+GlobalProperty arm_rhel_compat[] = {
+ {
+ .driver = "virtio-net-pci",
+ .property = "romfile",
+ .value = "",
+ },
+};
+const size_t arm_rhel_compat_len = G_N_ELEMENTS(arm_rhel_compat);
/* Number of external interrupt lines to configure the GIC with */
#define NUM_IRQS 256
@@ -2250,6 +2292,7 @@ static void machvirt_init(MachineState *machine)
qemu_add_machine_init_done_notifier(&vms->machine_done);
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static bool virt_get_secure(Object *obj, Error **errp)
{
VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -2277,6 +2320,7 @@ static void virt_set_virt(Object *obj, bool value, Error **errp)
vms->virt = value;
}
+#endif /* disabled for RHEL */
static bool virt_get_highmem(Object *obj, Error **errp)
{
@@ -2402,6 +2446,7 @@ static void virt_set_ras(Object *obj, bool value, Error **errp)
vms->ras = value;
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static bool virt_get_mte(Object *obj, Error **errp)
{
VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -2415,6 +2460,7 @@ static void virt_set_mte(Object *obj, bool value, Error **errp)
vms->mte = value;
}
+#endif /* disabled for RHEL */
static char *virt_get_gic_version(Object *obj, Error **errp)
{
@@ -2818,6 +2864,7 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
return fixed_ipa ? 0 : requested_pa_size;
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void virt_machine_class_init(ObjectClass *oc, void *data)
{
MachineClass *mc = MACHINE_CLASS(oc);
@@ -3206,3 +3253,188 @@ static void virt_machine_2_6_options(MachineClass *mc)
vmc->no_pmu = true;
}
DEFINE_VIRT_MACHINE(2, 6)
+#endif /* disabled for RHEL */
+
+static void rhel_machine_class_init(ObjectClass *oc, void *data)
+{
+ MachineClass *mc = MACHINE_CLASS(oc);
+ HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
+
+ mc->family = "virt-rhel-Z";
+ mc->init = machvirt_init;
+ /* Maximum supported VCPU count for all virt-rhel* machines */
+ mc->max_cpus = 384;
+#ifdef CONFIG_TPM
+ machine_class_allow_dynamic_sysbus_dev(mc, TYPE_TPM_TIS_SYSBUS);
+#endif
+ mc->block_default_type = IF_VIRTIO;
+ mc->no_cdrom = 1;
+ mc->pci_allow_0_address = true;
+ /* We know we will never create a pre-ARMv7 CPU which needs 1K pages */
+ mc->minimum_page_bits = 12;
+ mc->possible_cpu_arch_ids = virt_possible_cpu_arch_ids;
+ mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
+ mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a57");
+ mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
+ mc->kvm_type = virt_kvm_type;
+ assert(!mc->get_hotplug_handler);
+ mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
+ hc->pre_plug = virt_machine_device_pre_plug_cb;
+ hc->plug = virt_machine_device_plug_cb;
+ hc->unplug_request = virt_machine_device_unplug_request_cb;
+ hc->unplug = virt_machine_device_unplug_cb;
+ mc->nvdimm_supported = true;
+ mc->auto_enable_numa_with_memhp = true;
+ mc->auto_enable_numa_with_memdev = true;
+ mc->default_ram_id = "mach-virt.ram";
+
+ object_class_property_add(oc, "acpi", "OnOffAuto",
+ virt_get_acpi, virt_set_acpi,
+ NULL, NULL);
+ object_class_property_set_description(oc, "acpi",
+ "Enable ACPI");
+
+ object_class_property_add_bool(oc, "highmem", virt_get_highmem,
+ virt_set_highmem);
+ object_class_property_set_description(oc, "highmem",
+ "Set on/off to enable/disable using "
+ "physical address space above 32 bits");
+
+ object_class_property_add_str(oc, "gic-version", virt_get_gic_version,
+ virt_set_gic_version);
+ object_class_property_set_description(oc, "gic-version",
+ "Set GIC version. "
+ "Valid values are 2, 3, host and max");
+
+ object_class_property_add_str(oc, "iommu", virt_get_iommu, virt_set_iommu);
+ object_class_property_set_description(oc, "iommu",
+ "Set the IOMMU type. "
+ "Valid values are none and smmuv3");
+
+ object_class_property_add_bool(oc, "default-bus-bypass-iommu",
+ virt_get_default_bus_bypass_iommu,
+ virt_set_default_bus_bypass_iommu);
+ object_class_property_set_description(oc, "default-bus-bypass-iommu",
+ "Set on/off to enable/disable "
+ "bypass_iommu for default root bus");
+
+ object_class_property_add_bool(oc, "ras", virt_get_ras,
+ virt_set_ras);
+ object_class_property_set_description(oc, "ras",
+ "Set on/off to enable/disable reporting host memory errors "
+ "to a KVM guest using ACPI and guest external abort exceptions");
+
+ object_class_property_add_bool(oc, "its", virt_get_its,
+ virt_set_its);
+ object_class_property_set_description(oc, "its",
+ "Set on/off to enable/disable "
+ "ITS instantiation");
+
+ object_class_property_add_str(oc, "x-oem-id",
+ virt_get_oem_id,
+ virt_set_oem_id);
+ object_class_property_set_description(oc, "x-oem-id",
+ "Override the default value of field OEMID "
+ "in ACPI table header."
+ "The string may be up to 6 bytes in size");
+
+
+ object_class_property_add_str(oc, "x-oem-table-id",
+ virt_get_oem_table_id,
+ virt_set_oem_table_id);
+ object_class_property_set_description(oc, "x-oem-table-id",
+ "Override the default value of field OEM Table ID "
+ "in ACPI table header."
+ "The string may be up to 8 bytes in size");
+
+ object_class_property_add_bool(oc, "dtb-kaslr-seed",
+ virt_get_dtb_kaslr_seed,
+ virt_set_dtb_kaslr_seed);
+ object_class_property_set_description(oc, "dtb-kaslr-seed",
+ "Set off to disable passing of kaslr-seed "
+ "dtb node to guest");
+}
+
+static void rhel_virt_instance_init(Object *obj)
+{
+ VirtMachineState *vms = VIRT_MACHINE(obj);
+ VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
+
+ /* EL3 is disabled by default and non-configurable for RHEL */
+ vms->secure = false;
+
+ /* EL2 is disabled by default and non-configurable for RHEL */
+ vms->virt = false;
+
+ /* High memory is enabled by default */
+ vms->highmem = true;
+ vms->gic_version = VIRT_GIC_VERSION_NOSEL;
+
+ vms->highmem_ecam = !vmc->no_highmem_ecam;
+
+ if (vmc->no_its) {
+ vms->its = false;
+ } else {
+ /* Default allows ITS instantiation */
+ vms->its = true;
+
+ if (vmc->no_tcg_its) {
+ vms->tcg_its = false;
+ } else {
+ vms->tcg_its = true;
+ }
+ }
+
+ /* Default disallows iommu instantiation */
+ vms->iommu = VIRT_IOMMU_NONE;
+
+ /* The default root bus is attached to iommu by default */
+ vms->default_bus_bypass_iommu = false;
+
+ /* Default disallows RAS instantiation and is non-configurable for RHEL */
+ vms->ras = false;
+
+ /* MTE is disabled by default and non-configurable for RHEL */
+ vms->mte = false;
+
+ /* Supply a kaslr-seed by default */
+ vms->dtb_kaslr_seed = true;
+
+ vms->irqmap = a15irqmap;
+
+ virt_flash_create(vms);
+
+ vms->oem_id = g_strndup(ACPI_BUILD_APPNAME6, 6);
+ vms->oem_table_id = g_strndup(ACPI_BUILD_APPNAME8, 8);
+}
+
+static const TypeInfo rhel_machine_info = {
+ .name = TYPE_RHEL_MACHINE,
+ .parent = TYPE_MACHINE,
+ .abstract = true,
+ .instance_size = sizeof(VirtMachineState),
+ .class_size = sizeof(VirtMachineClass),
+ .class_init = rhel_machine_class_init,
+ .instance_init = rhel_virt_instance_init,
+ .interfaces = (InterfaceInfo[]) {
+ { TYPE_HOTPLUG_HANDLER },
+ { }
+ },
+};
+
+static void rhel_machine_init(void)
+{
+ type_register_static(&rhel_machine_info);
+}
+type_init(rhel_machine_init);
+
+static void rhel900_virt_options(MachineClass *mc)
+{
+ VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
+ compat_props_add(mc->compat_props, arm_rhel_compat, arm_rhel_compat_len);
+
+ /* Disable FEAT_LPA2 since old kernels (<= v5.12) don't boot with that feature */
+ vmc->no_tcg_lpa2 = true;
+}
+DEFINE_RHEL_MACHINE_AS_LATEST(9, 0, 0)
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 7e76ee2619..9b1efe8f0e 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -179,9 +179,17 @@ struct VirtMachineState {
#define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
+#if 0 /* disabled for Red Hat Enterprise Linux */
#define TYPE_VIRT_MACHINE MACHINE_TYPE_NAME("virt")
OBJECT_DECLARE_TYPE(VirtMachineState, VirtMachineClass, VIRT_MACHINE)
+#else
+#define TYPE_RHEL_MACHINE MACHINE_TYPE_NAME("virt-rhel")
+typedef struct VirtMachineClass VirtMachineClass;
+typedef struct VirtMachineState VirtMachineState;
+DECLARE_OBJ_CHECKERS(VirtMachineState, VirtMachineClass, VIRT_MACHINE, TYPE_RHEL_MACHINE)
+#endif
+
void virt_acpi_setup(VirtMachineState *vms);
bool virt_is_acpi_enabled(VirtMachineState *vms);
--
2.31.1

View File

@ -1,528 +0,0 @@
From f61b3d7dc000886e23943457ee9baf1d4cae43b4 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Fri, 19 Oct 2018 13:27:13 +0200
Subject: Add ppc64 machine types
Adding changes to add RHEL machine types for ppc64 architecture.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Rebase notes (6.2.0):
- Fixed rebase conflict relicts
- Update machine type compat for 6.2 (from MR 66)
Merged patches (6.1.0):
- c438c25ac3 redhat: Define pseries-rhel8.5.0 machine type
- a3995e2eff Remove RHEL 7.0.0 machine type (only ppc64 changes)
- ad3190a79b Remove RHEL 7.1.0 machine type (only ppc64 changes)
- 84bbe15d4e Remove RHEL 7.2.0 machine type (only ppc64 changes)
- 0215eb3356 Remove RHEL 7.3.0 machine types (only ppc64 changes)
- af69d1ca6e Remove RHEL 7.4.0 machine types (only ppc64 changes)
- 8f7a74ab78 Remove RHEL 7.5.0 machine types (only ppc64 changes)
---
hw/ppc/spapr.c | 243 ++++++++++++++++++++++++++++++++++++++++
hw/ppc/spapr_cpu_core.c | 13 +++
include/hw/ppc/spapr.h | 4 +
target/ppc/compat.c | 13 ++-
target/ppc/cpu.h | 1 +
target/ppc/kvm.c | 27 +++++
target/ppc/kvm_ppc.h | 13 +++
7 files changed, 313 insertions(+), 1 deletion(-)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a4372ba189..5fdf8b506d 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1622,6 +1622,9 @@ static void spapr_machine_reset(MachineState *machine)
pef_kvm_reset(machine->cgs, &error_fatal);
spapr_caps_apply(spapr);
+ if (spapr->svm_allowed) {
+ kvmppc_svm_allow(&error_fatal);
+ }
first_ppc_cpu = POWERPC_CPU(first_cpu);
if (kvm_enabled() && kvmppc_has_cap_mmu_radix() &&
@@ -3317,6 +3320,20 @@ static void spapr_set_host_serial(Object *obj, const char *value, Error **errp)
spapr->host_serial = g_strdup(value);
}
+static bool spapr_get_svm_allowed(Object *obj, Error **errp)
+{
+ SpaprMachineState *spapr = SPAPR_MACHINE(obj);
+
+ return spapr->svm_allowed;
+}
+
+static void spapr_set_svm_allowed(Object *obj, bool value, Error **errp)
+{
+ SpaprMachineState *spapr = SPAPR_MACHINE(obj);
+
+ spapr->svm_allowed = value;
+}
+
static void spapr_instance_init(Object *obj)
{
SpaprMachineState *spapr = SPAPR_MACHINE(obj);
@@ -3395,6 +3412,12 @@ static void spapr_instance_init(Object *obj)
spapr_get_host_serial, spapr_set_host_serial);
object_property_set_description(obj, "host-serial",
"Host serial number to advertise in guest device tree");
+ object_property_add_bool(obj, "x-svm-allowed",
+ spapr_get_svm_allowed,
+ spapr_set_svm_allowed);
+ object_property_set_description(obj, "x-svm-allowed",
+ "Allow the guest to become a Secure Guest"
+ " (experimental only)");
}
static void spapr_machine_finalizefn(Object *obj)
@@ -4652,6 +4675,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
vmc->client_architecture_support = spapr_vof_client_architecture_support;
vmc->quiesce = spapr_vof_quiesce;
vmc->setprop = spapr_vof_setprop;
+ smc->has_power9_support = true;
}
static const TypeInfo spapr_machine_info = {
@@ -4703,6 +4727,7 @@ static void spapr_machine_latest_class_options(MachineClass *mc)
} \
type_init(spapr_machine_register_##suffix)
+#if 0 /* Disabled for Red Hat Enterprise Linux */
/*
* pseries-7.0
*/
@@ -4830,6 +4855,7 @@ static void spapr_machine_4_1_class_options(MachineClass *mc)
}
DEFINE_SPAPR_MACHINE(4_1, "4.1", false);
+#endif
/*
* pseries-4.0
@@ -4849,6 +4875,8 @@ static bool phb_placement_4_0(SpaprMachineState *spapr, uint32_t index,
*nv2atsd = 0;
return true;
}
+
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void spapr_machine_4_0_class_options(MachineClass *mc)
{
SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
@@ -5176,6 +5204,221 @@ static void spapr_machine_2_1_class_options(MachineClass *mc)
compat_props_add(mc->compat_props, hw_compat_2_1, hw_compat_2_1_len);
}
DEFINE_SPAPR_MACHINE(2_1, "2.1", false);
+#endif
+
+static void spapr_machine_rhel_default_class_options(MachineClass *mc)
+{
+ /*
+ * Defaults for the latest behaviour inherited from the base class
+ * can be overriden here for all pseries-rhel* machines.
+ */
+
+ /* Maximum supported VCPU count */
+ mc->max_cpus = 384;
+}
+
+/*
+ * pseries-rhel8.5.0
+ * like pseries-6.0
+ */
+
+static void spapr_machine_rhel850_class_options(MachineClass *mc)
+{
+ SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+
+ /* The default machine type must apply the RHEL specific defaults */
+ spapr_machine_rhel_default_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_5,
+ hw_compat_rhel_8_5_len);
+ smc->pre_6_2_numa_affinity = true;
+ mc->smp_props.prefer_sockets = true;
+}
+
+DEFINE_SPAPR_MACHINE(rhel850, "rhel8.5.0", true);
+
+/*
+ * pseries-rhel8.4.0
+ * like pseries-5.2
+ */
+
+static void spapr_machine_rhel840_class_options(MachineClass *mc)
+{
+ spapr_machine_rhel850_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_4,
+ hw_compat_rhel_8_4_len);
+}
+
+DEFINE_SPAPR_MACHINE(rhel840, "rhel8.4.0", false);
+
+/*
+ * pseries-rhel8.3.0
+ * like pseries-5.1
+ */
+
+static void spapr_machine_rhel830_class_options(MachineClass *mc)
+{
+ SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+
+ spapr_machine_rhel840_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_3,
+ hw_compat_rhel_8_3_len);
+
+ /* from pseries-5.1 */
+ smc->pre_5_2_numa_associativity = true;
+}
+
+DEFINE_SPAPR_MACHINE(rhel830, "rhel8.3.0", false);
+
+/*
+ * pseries-rhel8.2.0
+ * like pseries-4.2 + pseries-5.0
+ * except SPAPR_CAP_CCF_ASSIST that has been backported to pseries-rhel8.1.0
+ */
+
+static void spapr_machine_rhel820_class_options(MachineClass *mc)
+{
+ SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+ /* from pseries-5.0 */
+ static GlobalProperty compat[] = {
+ { TYPE_SPAPR_PCI_HOST_BRIDGE, "pre-5.1-associativity", "on" },
+ };
+
+ spapr_machine_rhel830_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_2,
+ hw_compat_rhel_8_2_len);
+ compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
+
+ /* from pseries-4.2 */
+ smc->default_caps.caps[SPAPR_CAP_FWNMI] = SPAPR_CAP_OFF;
+ smc->rma_limit = 16 * GiB;
+ mc->nvdimm_supported = false;
+
+ /* from pseries-5.0 */
+ mc->numa_mem_supported = true;
+ smc->pre_5_1_assoc_refpoints = true;
+}
+
+DEFINE_SPAPR_MACHINE(rhel820, "rhel8.2.0", false);
+
+/*
+ * pseries-rhel8.1.0
+ * like pseries-4.1
+ */
+
+static void spapr_machine_rhel810_class_options(MachineClass *mc)
+{
+ SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+ static GlobalProperty compat[] = {
+ /* Only allow 4kiB and 64kiB IOMMU pagesizes */
+ { TYPE_SPAPR_PCI_HOST_BRIDGE, "pgsz", "0x11000" },
+ };
+
+ spapr_machine_rhel820_class_options(mc);
+
+ /* from pseries-4.1 */
+ smc->linux_pci_probe = false;
+ smc->smp_threads_vsmt = false;
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_1,
+ hw_compat_rhel_8_1_len);
+ compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
+
+ /* from pseries-4.2 */
+ smc->default_caps.caps[SPAPR_CAP_CCF_ASSIST] = SPAPR_CAP_OFF;
+}
+
+DEFINE_SPAPR_MACHINE(rhel810, "rhel8.1.0", false);
+
+/*
+ * pseries-rhel8.0.0
+ * like pseries-3.1 and pseries-4.0
+ * except SPAPR_CAP_CFPC, SPAPR_CAP_SBBC and SPAPR_CAP_IBS
+ * that have been backported to pseries-rhel8.0.0
+ */
+
+static void spapr_machine_rhel800_class_options(MachineClass *mc)
+{
+ SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+
+ spapr_machine_rhel810_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_0,
+ hw_compat_rhel_8_0_len);
+
+ /* pseries-4.0 */
+ smc->phb_placement = phb_placement_4_0;
+ smc->irq = &spapr_irq_xics;
+ smc->pre_4_1_migration = true;
+
+ /* pseries-3.1 */
+ mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
+ smc->update_dt_enabled = false;
+ smc->dr_phb_enabled = false;
+ smc->broken_host_serial_model = true;
+ smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_OFF;
+}
+
+DEFINE_SPAPR_MACHINE(rhel800, "rhel8.0.0", false);
+
+/*
+ * pseries-rhel7.6.0
+ * like spapr_compat_2_12 and spapr_compat_3_0
+ * spapr_compat_0 is empty
+ */
+GlobalProperty spapr_compat_rhel7_6[] = {
+ { TYPE_POWERPC_CPU, "pre-3.0-migration", "on" },
+ { TYPE_SPAPR_CPU_CORE, "pre-3.0-migration", "on" },
+};
+const size_t spapr_compat_rhel7_6_len = G_N_ELEMENTS(spapr_compat_rhel7_6);
+
+
+static void spapr_machine_rhel760_class_options(MachineClass *mc)
+{
+ SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+
+ spapr_machine_rhel800_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_rhel_7_6, hw_compat_rhel_7_6_len);
+ compat_props_add(mc->compat_props, spapr_compat_rhel7_6, spapr_compat_rhel7_6_len);
+
+ /* from spapr_machine_3_0_class_options() */
+ smc->legacy_irq_allocation = true;
+ smc->nr_xirqs = 0x400;
+ smc->irq = &spapr_irq_xics_legacy;
+
+ /* from spapr_machine_2_12_class_options() */
+ /* We depend on kvm_enabled() to choose a default value for the
+ * hpt-max-page-size capability. Of course we can't do it here
+ * because this is too early and the HW accelerator isn't initialzed
+ * yet. Postpone this to machine init (see default_caps_with_cpu()).
+ */
+ smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 0;
+
+ /* SPAPR_CAP_WORKAROUND enabled in pseries-rhel800 by
+ * f21757edc554
+ * "Enable mitigations by default for pseries-4.0 machine type")
+ */
+ smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
+ smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
+ smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
+}
+
+DEFINE_SPAPR_MACHINE(rhel760, "rhel7.6.0", false);
+
+/*
+ * pseries-rhel7.6.0-sxxm
+ *
+ * pseries-rhel7.6.0 with speculative execution exploit mitigations enabled by default
+ */
+
+static void spapr_machine_rhel760sxxm_class_options(MachineClass *mc)
+{
+ SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+
+ spapr_machine_rhel760_class_options(mc);
+ smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_WORKAROUND;
+ smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_WORKAROUND;
+ smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_FIXED_CCD;
+}
+
+DEFINE_SPAPR_MACHINE(rhel760sxxm, "rhel7.6.0-sxxm", false);
static void spapr_machine_register_types(void)
{
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index fcb5dfe792..ab8fb5bf62 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -25,6 +25,7 @@
#include "sysemu/reset.h"
#include "sysemu/hw_accel.h"
#include "qemu/error-report.h"
+#include "cpu-models.h"
static void spapr_reset_vcpu(PowerPCCPU *cpu)
{
@@ -259,6 +260,7 @@ static bool spapr_realize_vcpu(PowerPCCPU *cpu, SpaprMachineState *spapr,
{
CPUPPCState *env = &cpu->env;
CPUState *cs = CPU(cpu);
+ SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
if (!qdev_realize(DEVICE(cpu), NULL, errp)) {
return false;
@@ -270,6 +272,17 @@ static bool spapr_realize_vcpu(PowerPCCPU *cpu, SpaprMachineState *spapr,
/* Set time-base frequency to 512 MHz. vhyp must be set first. */
cpu_ppc_tb_init(env, SPAPR_TIMEBASE_FREQ);
+ if (!smc->has_power9_support &&
+ (((spapr->max_compat_pvr &&
+ ppc_compat_cmp(spapr->max_compat_pvr,
+ CPU_POWERPC_LOGICAL_3_00) >= 0)) ||
+ (!spapr->max_compat_pvr &&
+ ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_3_00, 0, 0)))) {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "POWER9 CPU is not supported by this machine class");
+ return false;
+ }
+
if (spapr_irq_cpu_intc_create(spapr, cpu, errp) < 0) {
qdev_unrealize(DEVICE(cpu));
return false;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index f5c33dcc86..4a68e0a901 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -154,6 +154,7 @@ struct SpaprMachineClass {
bool pre_5_2_numa_associativity;
bool pre_6_2_numa_affinity;
+ bool has_power9_support;
bool (*phb_placement)(SpaprMachineState *spapr, uint32_t index,
uint64_t *buid, hwaddr *pio,
hwaddr *mmio32, hwaddr *mmio64,
@@ -241,6 +242,9 @@ struct SpaprMachineState {
/* Set by -boot */
char *boot_device;
+ /* Secure Guest support via x-svm-allowed */
+ bool svm_allowed;
+
/*< public >*/
char *kvm_type;
char *host_model;
diff --git a/target/ppc/compat.c b/target/ppc/compat.c
index 7949a24f5a..f207a9ba01 100644
--- a/target/ppc/compat.c
+++ b/target/ppc/compat.c
@@ -114,8 +114,19 @@ static const CompatInfo *compat_by_pvr(uint32_t pvr)
return NULL;
}
+long ppc_compat_cmp(uint32_t pvr1, uint32_t pvr2)
+{
+ const CompatInfo *compat1 = compat_by_pvr(pvr1);
+ const CompatInfo *compat2 = compat_by_pvr(pvr2);
+
+ g_assert(compat1);
+ g_assert(compat2);
+
+ return compat1 - compat2;
+}
+
static bool pcc_compat(PowerPCCPUClass *pcc, uint32_t compat_pvr,
- uint32_t min_compat_pvr, uint32_t max_compat_pvr)
+ uint32_t min_compat_pvr, uint32_t max_compat_pvr)
{
const CompatInfo *compat = compat_by_pvr(compat_pvr);
const CompatInfo *min = compat_by_pvr(min_compat_pvr);
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 047b24ba50..79c5ac50b9 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1462,6 +1462,7 @@ static inline int cpu_mmu_index(CPUPPCState *env, bool ifetch)
/* Compatibility modes */
#if defined(TARGET_PPC64)
+long ppc_compat_cmp(uint32_t pvr1, uint32_t pvr2);
bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
uint32_t min_compat_pvr, uint32_t max_compat_pvr);
bool ppc_type_check_compat(const char *cputype, uint32_t compat_pvr,
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index dc93b99189..154888cce5 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -90,6 +90,7 @@ static int cap_ppc_nested_kvm_hv;
static int cap_large_decr;
static int cap_fwnmi;
static int cap_rpt_invalidate;
+static int cap_ppc_secure_guest;
static uint32_t debug_inst_opcode;
@@ -137,6 +138,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
cap_resize_hpt = kvm_vm_check_extension(s, KVM_CAP_SPAPR_RESIZE_HPT);
kvmppc_get_cpu_characteristics(s);
cap_ppc_nested_kvm_hv = kvm_vm_check_extension(s, KVM_CAP_PPC_NESTED_HV);
+ cap_ppc_secure_guest = kvm_vm_check_extension(s, KVM_CAP_PPC_SECURE_GUEST);
cap_large_decr = kvmppc_get_dec_bits();
cap_fwnmi = kvm_vm_check_extension(s, KVM_CAP_PPC_FWNMI);
/*
@@ -2563,6 +2565,16 @@ int kvmppc_has_cap_rpt_invalidate(void)
return cap_rpt_invalidate;
}
+bool kvmppc_has_cap_secure_guest(void)
+{
+ return !!cap_ppc_secure_guest;
+}
+
+int kvmppc_enable_cap_secure_guest(void)
+{
+ return kvm_vm_enable_cap(kvm_state, KVM_CAP_PPC_SECURE_GUEST, 0, 1);
+}
+
PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void)
{
uint32_t host_pvr = mfpvr();
@@ -2959,3 +2971,18 @@ bool kvm_arch_cpu_check_are_resettable(void)
{
return true;
}
+
+void kvmppc_svm_allow(Error **errp)
+{
+ if (!kvm_enabled()) {
+ error_setg(errp, "No PEF support in tcg, try x-svm-allowed=off");
+ return;
+ }
+
+ if (!kvmppc_has_cap_secure_guest()) {
+ error_setg(errp, "KVM implementation does not support secure guests, "
+ "try x-svm-allowed=off");
+ } else if (kvmppc_enable_cap_secure_guest() < 0) {
+ error_setg(errp, "Error enabling x-svm-allowed, try x-svm-allowed=off");
+ }
+}
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index ee9325bf9a..20dbb95989 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -40,6 +40,7 @@ int kvmppc_booke_watchdog_enable(PowerPCCPU *cpu);
target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
bool radix, bool gtse,
uint64_t proc_tbl);
+void kvmppc_svm_allow(Error **errp);
#ifndef CONFIG_USER_ONLY
bool kvmppc_spapr_use_multitce(void);
int kvmppc_spapr_enable_inkernel_multitce(void);
@@ -74,6 +75,8 @@ int kvmppc_get_cap_large_decr(void);
int kvmppc_enable_cap_large_decr(PowerPCCPU *cpu, int enable);
int kvmppc_has_cap_rpt_invalidate(void);
int kvmppc_enable_hwrng(void);
+bool kvmppc_has_cap_secure_guest(void);
+int kvmppc_enable_cap_secure_guest(void);
int kvmppc_put_books_sregs(PowerPCCPU *cpu);
PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void);
void kvmppc_check_papr_resize_hpt(Error **errp);
@@ -393,6 +396,16 @@ static inline int kvmppc_has_cap_rpt_invalidate(void)
return false;
}
+static inline bool kvmppc_has_cap_secure_guest(void)
+{
+ return false;
+}
+
+static inline int kvmppc_enable_cap_secure_guest(void)
+{
+ return -1;
+}
+
static inline int kvmppc_enable_hwrng(void)
{
return -1;
--
2.31.1

View File

@ -1,186 +0,0 @@
From 680f343e58a50a99d17bc7dedd3ee90980912023 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Fri, 19 Oct 2018 13:47:32 +0200
Subject: Add s390x machine types
Adding changes to add RHEL machine types for s390x architecture.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
--
Merged patches (6.1.0):
- 64a9a5c971 hw/s390x: Remove the RHEL7-only machine type
- 395516d62b redhat: s390x: add rhel-8.5.0 compat machine
Merged patches (6.2.0):
- 3bf66f4520 redhat: Add s390x machine type compatibility update for 6.1 rebase
Merged patches (7.0.0):
- e6ff4de4f7 redhat: Add s390x machine type compatibility handling for the rebase to v6.2
- 4b0efa7e21 redhat: Add rhel8.6.0 and rhel9.0.0 machine types for s390x
- dcc64971bf RHEL: mark old machine types as deprecated (partialy)
---
hw/core/machine.c | 6 +++
hw/s390x/s390-virtio-ccw.c | 104 ++++++++++++++++++++++++++++++++++++-
include/hw/boards.h | 2 +
3 files changed, 111 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index ea430d844e..77202a3570 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -37,6 +37,12 @@
#include "hw/virtio/virtio.h"
#include "hw/virtio/virtio-pci.h"
+/*
+ * RHEL only: machine types for previous major releases are deprecated
+ */
+const char *rhel_old_machine_deprecation =
+ "machine types for previous major releases are deprecated";
+
/*
* Mostly the same as hw_compat_6_0 and hw_compat_6_1
*/
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 90480e7cf9..ec4176a1e0 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -767,7 +767,7 @@ bool css_migration_enabled(void)
{ \
MachineClass *mc = MACHINE_CLASS(oc); \
ccw_machine_##suffix##_class_options(mc); \
- mc->desc = "VirtIO-ccw based S390 machine v" verstr; \
+ mc->desc = "VirtIO-ccw based S390 machine " verstr; \
if (latest) { \
mc->alias = "s390-ccw-virtio"; \
mc->is_default = true; \
@@ -791,6 +791,7 @@ bool css_migration_enabled(void)
} \
type_init(ccw_machine_register_##suffix)
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void ccw_machine_7_0_instance_options(MachineState *machine)
{
}
@@ -1115,6 +1116,107 @@ static void ccw_machine_2_4_class_options(MachineClass *mc)
compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
}
DEFINE_CCW_MACHINE(2_4, "2.4", false);
+#endif
+
+static void ccw_machine_rhel900_instance_options(MachineState *machine)
+{
+}
+
+static void ccw_machine_rhel900_class_options(MachineClass *mc)
+{
+}
+DEFINE_CCW_MACHINE(rhel900, "rhel9.0.0", true);
+
+static void ccw_machine_rhel860_instance_options(MachineState *machine)
+{
+ /* Note: The -rhel8.6.0 and -rhel9.0.0 machines are technically identical */
+ ccw_machine_rhel900_instance_options(machine);
+}
+
+static void ccw_machine_rhel860_class_options(MachineClass *mc)
+{
+ ccw_machine_rhel900_class_options(mc);
+
+ /* All RHEL machines for prior major releases are deprecated */
+ mc->deprecation_reason = rhel_old_machine_deprecation;
+}
+DEFINE_CCW_MACHINE(rhel860, "rhel8.6.0", false);
+
+static void ccw_machine_rhel850_instance_options(MachineState *machine)
+{
+ static const S390FeatInit qemu_cpu_feat = { S390_FEAT_LIST_QEMU_V6_0 };
+
+ ccw_machine_rhel860_instance_options(machine);
+
+ s390_set_qemu_cpu_model(0x2964, 13, 2, qemu_cpu_feat);
+
+ s390_cpudef_featoff_greater(16, 1, S390_FEAT_NNPA);
+ s390_cpudef_featoff_greater(16, 1, S390_FEAT_VECTOR_PACKED_DECIMAL_ENH2);
+ s390_cpudef_featoff_greater(16, 1, S390_FEAT_BEAR_ENH);
+ s390_cpudef_featoff_greater(16, 1, S390_FEAT_RDP);
+ s390_cpudef_featoff_greater(16, 1, S390_FEAT_PAI);
+}
+
+static void ccw_machine_rhel850_class_options(MachineClass *mc)
+{
+ ccw_machine_rhel860_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_5, hw_compat_rhel_8_5_len);
+ mc->smp_props.prefer_sockets = true;
+}
+DEFINE_CCW_MACHINE(rhel850, "rhel8.5.0", false);
+
+static void ccw_machine_rhel840_instance_options(MachineState *machine)
+{
+ ccw_machine_rhel850_instance_options(machine);
+}
+
+static void ccw_machine_rhel840_class_options(MachineClass *mc)
+{
+ ccw_machine_rhel850_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_4, hw_compat_rhel_8_4_len);
+}
+DEFINE_CCW_MACHINE(rhel840, "rhel8.4.0", false);
+
+static void ccw_machine_rhel820_instance_options(MachineState *machine)
+{
+ ccw_machine_rhel840_instance_options(machine);
+}
+
+static void ccw_machine_rhel820_class_options(MachineClass *mc)
+{
+ ccw_machine_rhel840_class_options(mc);
+ mc->fixup_ram_size = s390_fixup_ram_size;
+ /* we did not publish a rhel8.3.0 machine */
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_3, hw_compat_rhel_8_3_len);
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_2, hw_compat_rhel_8_2_len);
+}
+DEFINE_CCW_MACHINE(rhel820, "rhel8.2.0", false);
+
+static void ccw_machine_rhel760_instance_options(MachineState *machine)
+{
+ static const S390FeatInit qemu_cpu_feat = { S390_FEAT_LIST_QEMU_V3_1 };
+
+ ccw_machine_rhel820_instance_options(machine);
+
+ s390_set_qemu_cpu_model(0x2827, 12, 2, qemu_cpu_feat);
+
+ /* The multiple-epoch facility was not available with rhel7.6.0 on z14GA1 */
+ s390_cpudef_featoff(14, 1, S390_FEAT_MULTIPLE_EPOCH);
+ s390_cpudef_featoff(14, 1, S390_FEAT_PTFF_QSIE);
+ s390_cpudef_featoff(14, 1, S390_FEAT_PTFF_QTOUE);
+ s390_cpudef_featoff(14, 1, S390_FEAT_PTFF_STOE);
+ s390_cpudef_featoff(14, 1, S390_FEAT_PTFF_STOUE);
+}
+
+static void ccw_machine_rhel760_class_options(MachineClass *mc)
+{
+ ccw_machine_rhel820_class_options(mc);
+ /* We never published the s390x version of RHEL-AV 8.0 and 8.1, so add this here */
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_1, hw_compat_rhel_8_1_len);
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_0, hw_compat_rhel_8_0_len);
+ compat_props_add(mc->compat_props, hw_compat_rhel_7_6, hw_compat_rhel_7_6_len);
+}
+DEFINE_CCW_MACHINE(rhel760, "rhel7.6.0", false);
static void ccw_machine_register_types(void)
{
diff --git a/include/hw/boards.h b/include/hw/boards.h
index c90a19b4d1..bf59275f18 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -470,4 +470,6 @@ extern const size_t hw_compat_rhel_8_0_len;
extern GlobalProperty hw_compat_rhel_7_6[];
extern const size_t hw_compat_rhel_7_6_len;
+extern const char *rhel_old_machine_deprecation;
+
#endif
--
2.31.1

View File

@ -1,714 +0,0 @@
From 427a575ca57966bc72e1ebf218081da530d435d7 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Fri, 19 Oct 2018 13:10:31 +0200
Subject: Add x86_64 machine types
Adding changes to add RHEL machine types for x86_64 architecture.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Rebase notes (6.1.0):
- Update qemu64 cpu spec
Rebase notes (7.0.0):
- Reset alias for all machine-types except latest one
Merged patches (6.1.0):
- 59c284ad3b x86: Add x86 rhel8.5 machine types
- a8868b42fe redhat: x86: Enable 'kvm-asyncpf-int' by default
- a3995e2eff Remove RHEL 7.0.0 machine type (only x86_64 changes)
- ad3190a79b Remove RHEL 7.1.0 machine type (only x86_64 changes)
- 84bbe15d4e Remove RHEL 7.2.0 machine type (only x86_64 changes)
- 0215eb3356 Remove RHEL 7.3.0 machine types (only x86_64 changes)
- af69d1ca6e Remove RHEL 7.4.0 machine types (only x86_64 changes)
- 8f7a74ab78 Remove RHEL 7.5.0 machine types (only x86_64 changes)
Merged patches (7.0.0):
- eae7d8dd3c x86/rhel machine types: Add pc_rhel_8_5_compat
- 6762f56469 x86/rhel machine types: Wire compat into q35 and i440fx
- 5762101438 rhel machine types/x86: set prefer_sockets
- 9ba9ddc632 x86: Add q35 RHEL 8.6.0 machine type
- 6110d865e5 x86: Add q35 RHEL 9.0.0 machine type
- dcc64971bf RHEL: mark old machine types as deprecated (partialy)
- 6b396f182b RHEL: disable "seqpacket" for "vhost-vsock-device" in rhel8.6.0
---
hw/core/machine.c | 10 ++
hw/i386/pc.c | 135 +++++++++++++++++++++-
hw/i386/pc_piix.c | 79 ++++++++++++-
hw/i386/pc_q35.c | 227 ++++++++++++++++++++++++++++++++++++-
hw/s390x/s390-virtio-ccw.c | 1 +
include/hw/boards.h | 5 +
include/hw/i386/pc.h | 24 ++++
target/i386/kvm/kvm-cpu.c | 1 +
target/i386/kvm/kvm.c | 4 +
tests/qtest/pvpanic-test.c | 5 +-
10 files changed, 484 insertions(+), 7 deletions(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 77202a3570..28989b6e7b 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -43,6 +43,16 @@
const char *rhel_old_machine_deprecation =
"machine types for previous major releases are deprecated";
+GlobalProperty hw_compat_rhel_8_6[] = {
+ /* hw_compat_rhel_8_6 bz 2065589 */
+ /*
+ * vhost-vsock device in RHEL 8 kernels doesn't support seqpacket, so
+ * we need do disable it downstream on the latest hw_compat_rhel_8.
+ */
+ { "vhost-vsock-device", "seqpacket", "off" },
+};
+const size_t hw_compat_rhel_8_6_len = G_N_ELEMENTS(hw_compat_rhel_8_6);
+
/*
* Mostly the same as hw_compat_6_0 and hw_compat_6_1
*/
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index fd55fc725c..263d882af6 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -375,6 +375,137 @@ GlobalProperty pc_compat_1_4[] = {
};
const size_t pc_compat_1_4_len = G_N_ELEMENTS(pc_compat_1_4);
+/* This macro is for changes to properties that are RHEL specific,
+ * different to the current upstream and to be applied to the latest
+ * machine type.
+ */
+GlobalProperty pc_rhel_compat[] = {
+ { TYPE_X86_CPU, "host-phys-bits", "on" },
+ { TYPE_X86_CPU, "host-phys-bits-limit", "48" },
+ { TYPE_X86_CPU, "vmx-entry-load-perf-global-ctrl", "off" },
+ { TYPE_X86_CPU, "vmx-exit-load-perf-global-ctrl", "off" },
+ /* bz 1508330 */
+ { "vfio-pci", "x-no-geforce-quirks", "on" },
+ /* bz 1941397 */
+ { TYPE_X86_CPU, "kvm-asyncpf-int", "on" },
+};
+const size_t pc_rhel_compat_len = G_N_ELEMENTS(pc_rhel_compat);
+
+GlobalProperty pc_rhel_8_5_compat[] = {
+ /* pc_rhel_8_5_compat from pc_compat_6_0 */
+ { "qemu64" "-" TYPE_X86_CPU, "family", "6" },
+ /* pc_rhel_8_5_compat from pc_compat_6_0 */
+ { "qemu64" "-" TYPE_X86_CPU, "model", "6" },
+ /* pc_rhel_8_5_compat from pc_compat_6_0 */
+ { "qemu64" "-" TYPE_X86_CPU, "stepping", "3" },
+ /* pc_rhel_8_5_compat from pc_compat_6_0 */
+ { TYPE_X86_CPU, "x-vendor-cpuid-only", "off" },
+ /* pc_rhel_8_5_compat from pc_compat_6_0 */
+ { "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" },
+
+ /* pc_rhel_8_5_compat from pc_compat_6_1 */
+ { TYPE_X86_CPU, "hv-version-id-build", "0x1bbc" },
+ /* pc_rhel_8_5_compat from pc_compat_6_1 */
+ { TYPE_X86_CPU, "hv-version-id-major", "0x0006" },
+ /* pc_rhel_8_5_compat from pc_compat_6_1 */
+ { TYPE_X86_CPU, "hv-version-id-minor", "0x0001" },
+};
+const size_t pc_rhel_8_5_compat_len = G_N_ELEMENTS(pc_rhel_8_5_compat);
+
+GlobalProperty pc_rhel_8_4_compat[] = {
+ /* pc_rhel_8_4_compat from pc_compat_5_2 */
+ { "ICH9-LPC", "x-smi-cpu-hotunplug", "off" },
+ { TYPE_X86_CPU, "kvm-asyncpf-int", "off" },
+};
+const size_t pc_rhel_8_4_compat_len = G_N_ELEMENTS(pc_rhel_8_4_compat);
+
+GlobalProperty pc_rhel_8_3_compat[] = {
+ /* pc_rhel_8_3_compat from pc_compat_5_1 */
+ { "ICH9-LPC", "x-smi-cpu-hotplug", "off" },
+};
+const size_t pc_rhel_8_3_compat_len = G_N_ELEMENTS(pc_rhel_8_3_compat);
+
+GlobalProperty pc_rhel_8_2_compat[] = {
+ /* pc_rhel_8_2_compat from pc_compat_4_2 */
+ { "mch", "smbase-smram", "off" },
+};
+const size_t pc_rhel_8_2_compat_len = G_N_ELEMENTS(pc_rhel_8_2_compat);
+
+/* pc_rhel_8_1_compat is empty since pc_4_1_compat is */
+GlobalProperty pc_rhel_8_1_compat[] = { };
+const size_t pc_rhel_8_1_compat_len = G_N_ELEMENTS(pc_rhel_8_1_compat);
+
+GlobalProperty pc_rhel_8_0_compat[] = {
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "intel-iommu", "dma-drain", "off" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "Opteron_G3" "-" TYPE_X86_CPU, "rdtscp", "off" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "Opteron_G4" "-" TYPE_X86_CPU, "rdtscp", "off" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "Opteron_G4" "-" TYPE_X86_CPU, "npt", "off" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "Opteron_G4" "-" TYPE_X86_CPU, "nrip-save", "off" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "Opteron_G5" "-" TYPE_X86_CPU, "rdtscp", "off" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "Opteron_G5" "-" TYPE_X86_CPU, "npt", "off" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "Opteron_G5" "-" TYPE_X86_CPU, "nrip-save", "off" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "EPYC" "-" TYPE_X86_CPU, "npt", "off" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "EPYC" "-" TYPE_X86_CPU, "nrip-save", "off" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "EPYC-IBPB" "-" TYPE_X86_CPU, "npt", "off" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "EPYC-IBPB" "-" TYPE_X86_CPU, "nrip-save", "off" },
+ /** The mpx=on entries from pc_compat_3_1 are in pc_rhel_7_6_compat **/
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { "Cascadelake-Server" "-" TYPE_X86_CPU, "stepping", "5" },
+ /* pc_rhel_8_0_compat from pc_compat_3_1 */
+ { TYPE_X86_CPU, "x-intel-pt-auto-level", "off" },
+};
+const size_t pc_rhel_8_0_compat_len = G_N_ELEMENTS(pc_rhel_8_0_compat);
+
+/* Similar to PC_COMPAT_3_0 + PC_COMPAT_2_12, but:
+ * all of the 2_12 stuff was already in 7.6 from bz 1481253
+ * x-migrate-smi-count comes from PC_COMPAT_2_11 but
+ * is really tied to kernel version so keep it off on 7.x
+ * machine types irrespective of host.
+ */
+GlobalProperty pc_rhel_7_6_compat[] = {
+ /* pc_rhel_7_6_compat from pc_compat_3_0 */
+ { TYPE_X86_CPU, "x-hv-synic-kvm-only", "on" },
+ /* pc_rhel_7_6_compat from pc_compat_3_0 */
+ { "Skylake-Server" "-" TYPE_X86_CPU, "pku", "off" },
+ /* pc_rhel_7_6_compat from pc_compat_3_0 */
+ { "Skylake-Server-IBRS" "-" TYPE_X86_CPU, "pku", "off" },
+ /* pc_rhel_7_6_compat from pc_compat_2_11 */
+ { TYPE_X86_CPU, "x-migrate-smi-count", "off" },
+ /* pc_rhel_7_6_compat from pc_compat_2_11 */
+ { "Skylake-Client" "-" TYPE_X86_CPU, "mpx", "on" },
+ /* pc_rhel_7_6_compat from pc_compat_2_11 */
+ { "Skylake-Client-IBRS" "-" TYPE_X86_CPU, "mpx", "on" },
+ /* pc_rhel_7_6_compat from pc_compat_2_11 */
+ { "Skylake-Server" "-" TYPE_X86_CPU, "mpx", "on" },
+ /* pc_rhel_7_6_compat from pc_compat_2_11 */
+ { "Skylake-Server-IBRS" "-" TYPE_X86_CPU, "mpx", "on" },
+ /* pc_rhel_7_6_compat from pc_compat_2_11 */
+ { "Cascadelake-Server" "-" TYPE_X86_CPU, "mpx", "on" },
+ /* pc_rhel_7_6_compat from pc_compat_2_11 */
+ { "Icelake-Client" "-" TYPE_X86_CPU, "mpx", "on" },
+ /* pc_rhel_7_6_compat from pc_compat_2_11 */
+ { "Icelake-Server" "-" TYPE_X86_CPU, "mpx", "on" },
+};
+const size_t pc_rhel_7_6_compat_len = G_N_ELEMENTS(pc_rhel_7_6_compat);
+
+/*
+ * The PC_RHEL_*_COMPAT serve the same purpose for RHEL-7 machine
+ * types as the PC_COMPAT_* do for upstream types.
+ * PC_RHEL_7_*_COMPAT apply both to i440fx and q35 types.
+ */
+
GSIState *pc_gsi_create(qemu_irq **irqs, bool pci_enabled)
{
GSIState *s;
@@ -1738,6 +1869,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
pcmc->pvh_enabled = true;
pcmc->kvmclock_create_always = true;
assert(!mc->get_hotplug_handler);
+ mc->async_pf_vmexit_disable = false;
mc->get_hotplug_handler = pc_get_hotplug_handler;
mc->hotplug_allowed = pc_hotplug_allowed;
mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
@@ -1748,7 +1880,8 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
mc->has_hotpluggable_cpus = true;
mc->default_boot_order = "cad";
mc->block_default_type = IF_IDE;
- mc->max_cpus = 255;
+ /* 240: max CPU count for RHEL */
+ mc->max_cpus = 240;
mc->reset = pc_machine_reset;
mc->wakeup = pc_machine_wakeup;
hc->pre_plug = pc_machine_device_pre_plug_cb;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index c797e98312..0cacc0d623 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -50,6 +50,7 @@
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "sysemu/xen.h"
+#include "migration/migration.h"
#ifdef CONFIG_XEN
#include <xen/hvm/hvm_info_table.h>
#include "hw/xen/xen_pt.h"
@@ -174,8 +175,8 @@ static void pc_init1(MachineState *machine,
if (pcmc->smbios_defaults) {
MachineClass *mc = MACHINE_GET_CLASS(machine);
/* These values are guest ABI, do not change */
- smbios_set_defaults("QEMU", "Standard PC (i440FX + PIIX, 1996)",
- mc->name, pcmc->smbios_legacy_mode,
+ smbios_set_defaults("Red Hat", "KVM",
+ mc->desc, pcmc->smbios_legacy_mode,
pcmc->smbios_uuid_encoded,
pcmc->smbios_stream_product,
pcmc->smbios_stream_version,
@@ -314,6 +315,7 @@ static void pc_init1(MachineState *machine,
* hw_compat_*, pc_compat_*, or * pc_*_machine_options().
*/
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void pc_compat_2_3_fn(MachineState *machine)
{
X86MachineState *x86ms = X86_MACHINE(machine);
@@ -967,3 +969,76 @@ static void xenfv_3_1_machine_options(MachineClass *m)
DEFINE_PC_MACHINE(xenfv, "xenfv-3.1", pc_xen_hvm_init,
xenfv_3_1_machine_options);
#endif
+#endif /* Disabled for Red Hat Enterprise Linux */
+
+/* Red Hat Enterprise Linux machine types */
+
+/* Options for the latest rhel7 machine type */
+static void pc_machine_rhel7_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ m->family = "pc_piix_Y";
+ m->default_machine_opts = "firmware=bios-256k.bin,hpet=off";
+ pcmc->default_nic_model = "e1000";
+ pcmc->pci_root_uid = 0;
+ m->default_display = "std";
+ m->no_parallel = 1;
+ m->numa_mem_supported = true;
+ m->auto_enable_numa_with_memdev = false;
+ machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
+ compat_props_add(m->compat_props, pc_rhel_compat, pc_rhel_compat_len);
+ m->alias = "pc";
+ m->is_default = 1;
+ m->smp_props.prefer_sockets = true;
+}
+
+static void pc_init_rhel760(MachineState *machine)
+{
+ pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \
+ TYPE_I440FX_PCI_DEVICE);
+}
+
+static void pc_machine_rhel760_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pc_machine_rhel7_options(m);
+ m->desc = "RHEL 7.6.0 PC (i440FX + PIIX, 1996)";
+ m->async_pf_vmexit_disable = true;
+ m->smbus_no_migration_support = true;
+
+ /* All RHEL machines for prior major releases are deprecated */
+ m->deprecation_reason = rhel_old_machine_deprecation;
+
+ pcmc->pvh_enabled = false;
+ pcmc->default_cpu_version = CPU_VERSION_LEGACY;
+ pcmc->kvmclock_create_always = false;
+ /* From pc_i440fx_5_1_machine_options() */
+ pcmc->pci_root_uid = 1;
+ compat_props_add(m->compat_props, hw_compat_rhel_8_6,
+ hw_compat_rhel_8_6_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_8_5,
+ hw_compat_rhel_8_5_len);
+ compat_props_add(m->compat_props, pc_rhel_8_5_compat,
+ pc_rhel_8_5_compat_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_8_4,
+ hw_compat_rhel_8_4_len);
+ compat_props_add(m->compat_props, pc_rhel_8_4_compat,
+ pc_rhel_8_4_compat_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_8_3,
+ hw_compat_rhel_8_3_len);
+ compat_props_add(m->compat_props, pc_rhel_8_3_compat,
+ pc_rhel_8_3_compat_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_8_2,
+ hw_compat_rhel_8_2_len);
+ compat_props_add(m->compat_props, pc_rhel_8_2_compat,
+ pc_rhel_8_2_compat_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_8_1, hw_compat_rhel_8_1_len);
+ compat_props_add(m->compat_props, pc_rhel_8_1_compat, pc_rhel_8_1_compat_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_8_0, hw_compat_rhel_8_0_len);
+ compat_props_add(m->compat_props, pc_rhel_8_0_compat, pc_rhel_8_0_compat_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_7_6, hw_compat_rhel_7_6_len);
+ compat_props_add(m->compat_props, pc_rhel_7_6_compat, pc_rhel_7_6_compat_len);
+}
+
+DEFINE_PC_MACHINE(rhel760, "pc-i440fx-rhel7.6.0", pc_init_rhel760,
+ pc_machine_rhel760_options);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index b695f88c45..157160e069 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -197,8 +197,8 @@ static void pc_q35_init(MachineState *machine)
if (pcmc->smbios_defaults) {
/* These values are guest ABI, do not change */
- smbios_set_defaults("QEMU", "Standard PC (Q35 + ICH9, 2009)",
- mc->name, pcmc->smbios_legacy_mode,
+ smbios_set_defaults("Red Hat", "KVM",
+ mc->desc, pcmc->smbios_legacy_mode,
pcmc->smbios_uuid_encoded,
pcmc->smbios_stream_product,
pcmc->smbios_stream_version,
@@ -342,6 +342,7 @@ static void pc_q35_init(MachineState *machine)
DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn)
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void pc_q35_machine_options(MachineClass *m)
{
PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
@@ -631,3 +632,225 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
DEFINE_Q35_MACHINE(v2_4, "pc-q35-2.4", NULL,
pc_q35_2_4_machine_options);
+#endif /* Disabled for Red Hat Enterprise Linux */
+
+/* Red Hat Enterprise Linux machine types */
+
+/* Options for the latest rhel q35 machine type */
+static void pc_q35_machine_rhel_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pcmc->default_nic_model = "e1000e";
+ pcmc->pci_root_uid = 0;
+ m->family = "pc_q35_Z";
+ m->units_per_default_bus = 1;
+ m->default_machine_opts = "firmware=bios-256k.bin,hpet=off";
+ m->default_display = "std";
+ m->no_floppy = 1;
+ m->no_parallel = 1;
+ pcmc->default_cpu_version = 1;
+ machine_class_allow_dynamic_sysbus_dev(m, TYPE_AMD_IOMMU_DEVICE);
+ machine_class_allow_dynamic_sysbus_dev(m, TYPE_INTEL_IOMMU_DEVICE);
+ machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
+ m->alias = "q35";
+ m->max_cpus = 710;
+ compat_props_add(m->compat_props, pc_rhel_compat, pc_rhel_compat_len);
+}
+
+static void pc_q35_init_rhel900(MachineState *machine)
+{
+ pc_q35_init(machine);
+}
+
+static void pc_q35_machine_rhel900_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pc_q35_machine_rhel_options(m);
+ m->desc = "RHEL-9.0.0 PC (Q35 + ICH9, 2009)";
+ pcmc->smbios_stream_product = "RHEL";
+ pcmc->smbios_stream_version = "9.0.0";
+}
+
+DEFINE_PC_MACHINE(q35_rhel900, "pc-q35-rhel9.0.0", pc_q35_init_rhel900,
+ pc_q35_machine_rhel900_options);
+
+static void pc_q35_init_rhel860(MachineState *machine)
+{
+ pc_q35_init(machine);
+}
+
+static void pc_q35_machine_rhel860_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pc_q35_machine_rhel900_options(m);
+ m->desc = "RHEL-8.6.0 PC (Q35 + ICH9, 2009)";
+ m->alias = NULL;
+
+ /* All RHEL machines for prior major releases are deprecated */
+ m->deprecation_reason = rhel_old_machine_deprecation;
+
+ pcmc->smbios_stream_product = "RHEL-AV";
+ pcmc->smbios_stream_version = "8.6.0";
+ compat_props_add(m->compat_props, hw_compat_rhel_8_6,
+ hw_compat_rhel_8_6_len);
+}
+
+DEFINE_PC_MACHINE(q35_rhel860, "pc-q35-rhel8.6.0", pc_q35_init_rhel860,
+ pc_q35_machine_rhel860_options);
+
+
+static void pc_q35_init_rhel850(MachineState *machine)
+{
+ pc_q35_init(machine);
+}
+
+static void pc_q35_machine_rhel850_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pc_q35_machine_rhel860_options(m);
+ m->desc = "RHEL-8.5.0 PC (Q35 + ICH9, 2009)";
+ m->alias = NULL;
+ pcmc->smbios_stream_product = "RHEL-AV";
+ pcmc->smbios_stream_version = "8.5.0";
+ compat_props_add(m->compat_props, hw_compat_rhel_8_5,
+ hw_compat_rhel_8_5_len);
+ compat_props_add(m->compat_props, pc_rhel_8_5_compat,
+ pc_rhel_8_5_compat_len);
+ m->smp_props.prefer_sockets = true;
+}
+
+DEFINE_PC_MACHINE(q35_rhel850, "pc-q35-rhel8.5.0", pc_q35_init_rhel850,
+ pc_q35_machine_rhel850_options);
+
+
+static void pc_q35_init_rhel840(MachineState *machine)
+{
+ pc_q35_init(machine);
+}
+
+static void pc_q35_machine_rhel840_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pc_q35_machine_rhel850_options(m);
+ m->desc = "RHEL-8.4.0 PC (Q35 + ICH9, 2009)";
+ m->alias = NULL;
+ pcmc->smbios_stream_product = "RHEL-AV";
+ pcmc->smbios_stream_version = "8.4.0";
+ compat_props_add(m->compat_props, hw_compat_rhel_8_4,
+ hw_compat_rhel_8_4_len);
+ compat_props_add(m->compat_props, pc_rhel_8_4_compat,
+ pc_rhel_8_4_compat_len);
+}
+
+DEFINE_PC_MACHINE(q35_rhel840, "pc-q35-rhel8.4.0", pc_q35_init_rhel840,
+ pc_q35_machine_rhel840_options);
+
+
+static void pc_q35_init_rhel830(MachineState *machine)
+{
+ pc_q35_init(machine);
+}
+
+static void pc_q35_machine_rhel830_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pc_q35_machine_rhel840_options(m);
+ m->desc = "RHEL-8.3.0 PC (Q35 + ICH9, 2009)";
+ m->alias = NULL;
+ pcmc->smbios_stream_product = "RHEL-AV";
+ pcmc->smbios_stream_version = "8.3.0";
+ compat_props_add(m->compat_props, hw_compat_rhel_8_3,
+ hw_compat_rhel_8_3_len);
+ compat_props_add(m->compat_props, pc_rhel_8_3_compat,
+ pc_rhel_8_3_compat_len);
+ /* From pc_q35_5_1_machine_options() */
+ pcmc->kvmclock_create_always = false;
+ /* From pc_q35_5_1_machine_options() */
+ pcmc->pci_root_uid = 1;
+}
+
+DEFINE_PC_MACHINE(q35_rhel830, "pc-q35-rhel8.3.0", pc_q35_init_rhel830,
+ pc_q35_machine_rhel830_options);
+
+static void pc_q35_init_rhel820(MachineState *machine)
+{
+ pc_q35_init(machine);
+}
+
+static void pc_q35_machine_rhel820_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pc_q35_machine_rhel830_options(m);
+ m->desc = "RHEL-8.2.0 PC (Q35 + ICH9, 2009)";
+ m->alias = NULL;
+ m->numa_mem_supported = true;
+ m->auto_enable_numa_with_memdev = false;
+ pcmc->smbios_stream_product = "RHEL-AV";
+ pcmc->smbios_stream_version = "8.2.0";
+ compat_props_add(m->compat_props, hw_compat_rhel_8_2,
+ hw_compat_rhel_8_2_len);
+ compat_props_add(m->compat_props, pc_rhel_8_2_compat,
+ pc_rhel_8_2_compat_len);
+}
+
+DEFINE_PC_MACHINE(q35_rhel820, "pc-q35-rhel8.2.0", pc_q35_init_rhel820,
+ pc_q35_machine_rhel820_options);
+
+static void pc_q35_init_rhel810(MachineState *machine)
+{
+ pc_q35_init(machine);
+}
+
+static void pc_q35_machine_rhel810_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pc_q35_machine_rhel820_options(m);
+ m->desc = "RHEL-8.1.0 PC (Q35 + ICH9, 2009)";
+ m->alias = NULL;
+ pcmc->smbios_stream_product = NULL;
+ pcmc->smbios_stream_version = NULL;
+ compat_props_add(m->compat_props, hw_compat_rhel_8_1, hw_compat_rhel_8_1_len);
+ compat_props_add(m->compat_props, pc_rhel_8_1_compat, pc_rhel_8_1_compat_len);
+}
+
+DEFINE_PC_MACHINE(q35_rhel810, "pc-q35-rhel8.1.0", pc_q35_init_rhel810,
+ pc_q35_machine_rhel810_options);
+
+static void pc_q35_init_rhel800(MachineState *machine)
+{
+ pc_q35_init(machine);
+}
+
+static void pc_q35_machine_rhel800_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pc_q35_machine_rhel810_options(m);
+ m->desc = "RHEL-8.0.0 PC (Q35 + ICH9, 2009)";
+ m->smbus_no_migration_support = true;
+ m->alias = NULL;
+ pcmc->pvh_enabled = false;
+ pcmc->default_cpu_version = CPU_VERSION_LEGACY;
+ compat_props_add(m->compat_props, hw_compat_rhel_8_0, hw_compat_rhel_8_0_len);
+ compat_props_add(m->compat_props, pc_rhel_8_0_compat, pc_rhel_8_0_compat_len);
+}
+
+DEFINE_PC_MACHINE(q35_rhel800, "pc-q35-rhel8.0.0", pc_q35_init_rhel800,
+ pc_q35_machine_rhel800_options);
+
+static void pc_q35_init_rhel760(MachineState *machine)
+{
+ pc_q35_init(machine);
+}
+
+static void pc_q35_machine_rhel760_options(MachineClass *m)
+{
+ pc_q35_machine_rhel800_options(m);
+ m->alias = NULL;
+ m->desc = "RHEL-7.6.0 PC (Q35 + ICH9, 2009)";
+ m->async_pf_vmexit_disable = true;
+ compat_props_add(m->compat_props, hw_compat_rhel_7_6, hw_compat_rhel_7_6_len);
+ compat_props_add(m->compat_props, pc_rhel_7_6_compat, pc_rhel_7_6_compat_len);
+}
+
+DEFINE_PC_MACHINE(q35_rhel760, "pc-q35-rhel7.6.0", pc_q35_init_rhel760,
+ pc_q35_machine_rhel760_options);
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index ec4176a1e0..465a2a09d2 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -1136,6 +1136,7 @@ static void ccw_machine_rhel860_instance_options(MachineState *machine)
static void ccw_machine_rhel860_class_options(MachineClass *mc)
{
ccw_machine_rhel900_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_rhel_8_6, hw_compat_rhel_8_6_len);
/* All RHEL machines for prior major releases are deprecated */
mc->deprecation_reason = rhel_old_machine_deprecation;
diff --git a/include/hw/boards.h b/include/hw/boards.h
index bf59275f18..d1555665df 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -266,6 +266,8 @@ struct MachineClass {
strList *allowed_dynamic_sysbus_devices;
bool auto_enable_numa_with_memhp;
bool auto_enable_numa_with_memdev;
+ /* RHEL only */
+ bool async_pf_vmexit_disable;
bool ignore_boot_device_suffixes;
bool smbus_no_migration_support;
bool nvdimm_supported;
@@ -449,6 +451,9 @@ extern const size_t hw_compat_2_2_len;
extern GlobalProperty hw_compat_2_1[];
extern const size_t hw_compat_2_1_len;
+extern GlobalProperty hw_compat_rhel_8_6[];
+extern const size_t hw_compat_rhel_8_6_len;
+
extern GlobalProperty hw_compat_rhel_8_5[];
extern const size_t hw_compat_rhel_8_5_len;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 91331059d9..419a6ec24b 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -289,6 +289,30 @@ extern const size_t pc_compat_1_5_len;
extern GlobalProperty pc_compat_1_4[];
extern const size_t pc_compat_1_4_len;
+extern GlobalProperty pc_rhel_compat[];
+extern const size_t pc_rhel_compat_len;
+
+extern GlobalProperty pc_rhel_8_5_compat[];
+extern const size_t pc_rhel_8_5_compat_len;
+
+extern GlobalProperty pc_rhel_8_4_compat[];
+extern const size_t pc_rhel_8_4_compat_len;
+
+extern GlobalProperty pc_rhel_8_3_compat[];
+extern const size_t pc_rhel_8_3_compat_len;
+
+extern GlobalProperty pc_rhel_8_2_compat[];
+extern const size_t pc_rhel_8_2_compat_len;
+
+extern GlobalProperty pc_rhel_8_1_compat[];
+extern const size_t pc_rhel_8_1_compat_len;
+
+extern GlobalProperty pc_rhel_8_0_compat[];
+extern const size_t pc_rhel_8_0_compat_len;
+
+extern GlobalProperty pc_rhel_7_6_compat[];
+extern const size_t pc_rhel_7_6_compat_len;
+
/* Helper for setting model-id for CPU models that changed model-id
* depending on QEMU versions up to QEMU 2.4.
*/
diff --git a/target/i386/kvm/kvm-cpu.c b/target/i386/kvm/kvm-cpu.c
index 5eb955ce9a..74c1396a93 100644
--- a/target/i386/kvm/kvm-cpu.c
+++ b/target/i386/kvm/kvm-cpu.c
@@ -137,6 +137,7 @@ static PropValue kvm_default_props[] = {
{ "acpi", "off" },
{ "monitor", "off" },
{ "svm", "off" },
+ { "kvm-pv-unhalt", "on" },
{ NULL, NULL },
};
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 9cf8e03669..6d1e009443 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -3488,6 +3488,7 @@ static int kvm_get_msrs(X86CPU *cpu)
struct kvm_msr_entry *msrs = cpu->kvm_msr_buf->entries;
int ret, i;
uint64_t mtrr_top_bits;
+ MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
kvm_msr_buf_reset(cpu);
@@ -3822,6 +3823,9 @@ static int kvm_get_msrs(X86CPU *cpu)
break;
case MSR_KVM_ASYNC_PF_EN:
env->async_pf_en_msr = msrs[i].data;
+ if (mc->async_pf_vmexit_disable) {
+ env->async_pf_en_msr &= ~(1ULL << 2);
+ }
break;
case MSR_KVM_ASYNC_PF_INT:
env->async_pf_int_msr = msrs[i].data;
diff --git a/tests/qtest/pvpanic-test.c b/tests/qtest/pvpanic-test.c
index 6dcad2db49..580c2c43d2 100644
--- a/tests/qtest/pvpanic-test.c
+++ b/tests/qtest/pvpanic-test.c
@@ -17,7 +17,7 @@ static void test_panic_nopause(void)
QDict *response, *data;
QTestState *qts;
- qts = qtest_init("-device pvpanic -action panic=none");
+ qts = qtest_init("-M q35 -device pvpanic -action panic=none");
val = qtest_inb(qts, 0x505);
g_assert_cmpuint(val, ==, 3);
@@ -40,7 +40,8 @@ static void test_panic(void)
QDict *response, *data;
QTestState *qts;
- qts = qtest_init("-device pvpanic -action panic=pause");
+ /* RHEL: Use q35 */
+ qts = qtest_init("-M q35 -device pvpanic -action panic=pause");
val = qtest_inb(qts, 0x505);
g_assert_cmpuint(val, ==, 3);
--
2.31.1

View File

@ -1,186 +0,0 @@
From 5e419e5e0a721bdbbfa6d9b82c8be5c5b3d26a01 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Wed, 2 Sep 2020 09:39:41 +0200
Subject: Enable make check
Fixing tests after device disabling and machine types changes and enabling
make check run during build.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
---
Rebase changes (6.1.0):
- removed unnecessary test changes
Rebase changes (6.2.0):
- new way of disabling bios-table-test
Rebase changes (7.0.0):
- Disable testing virtio-iommu-pci
- Rename default_bus_bypass_iommu property to default-bus-bypass-iommu
- Disable qtest-bios-table for aarch64
- Removed redhat chunks for boot-serial-test.c, cdrom-test.c and cpu-plug-test.c qtests
- Do not disable boot-order-test, prom-env-test and boot-serial-test qtests
- Use rhel machine type for new intel hda qtest
- Remove unnecessary changes in iotest 051
- Remove changes in bios-tables-test.c and prom-env-test.c qtests
Merged patches (6.1.0):
- 2f129df7d3 redhat: Enable the 'test-block-iothread' test again
---
.distro/qemu-kvm.spec.template | 5 ++---
tests/qtest/fuzz-e1000e-test.c | 2 +-
tests/qtest/fuzz-virtio-scsi-test.c | 2 +-
tests/qtest/intel-hda-test.c | 2 +-
tests/qtest/libqos/meson.build | 2 +-
tests/qtest/lpc-ich9-test.c | 2 +-
tests/qtest/meson.build | 4 ----
tests/qtest/usb-hcd-xhci-test.c | 4 ++++
tests/qtest/virtio-net-failover.c | 1 +
9 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/tests/qtest/fuzz-e1000e-test.c b/tests/qtest/fuzz-e1000e-test.c
index 66229e6096..947fba73b7 100644
--- a/tests/qtest/fuzz-e1000e-test.c
+++ b/tests/qtest/fuzz-e1000e-test.c
@@ -17,7 +17,7 @@ static void test_lp1879531_eth_get_rss_ex_dst_addr(void)
{
QTestState *s;
- s = qtest_init("-nographic -monitor none -serial none -M pc-q35-5.0");
+ s = qtest_init("-nographic -monitor none -serial none -M pc-q35-rhel8.4.0");
qtest_outl(s, 0xcf8, 0x80001010);
qtest_outl(s, 0xcfc, 0xe1020000);
diff --git a/tests/qtest/fuzz-virtio-scsi-test.c b/tests/qtest/fuzz-virtio-scsi-test.c
index aaf6d10e18..43727d62ac 100644
--- a/tests/qtest/fuzz-virtio-scsi-test.c
+++ b/tests/qtest/fuzz-virtio-scsi-test.c
@@ -19,7 +19,7 @@ static void test_mmio_oob_from_memory_region_cache(void)
{
QTestState *s;
- s = qtest_init("-M pc-q35-5.2 -display none -m 512M "
+ s = qtest_init("-M pc-q35-rhel8.4.0 -display none -m 512M "
"-device virtio-scsi,num_queues=8,addr=03.0 ");
qtest_outl(s, 0xcf8, 0x80001811);
diff --git a/tests/qtest/intel-hda-test.c b/tests/qtest/intel-hda-test.c
index a58c98e4d1..c8387e39ce 100644
--- a/tests/qtest/intel-hda-test.c
+++ b/tests/qtest/intel-hda-test.c
@@ -38,7 +38,7 @@ static void test_issue542_ich6(void)
{
QTestState *s;
- s = qtest_init("-nographic -nodefaults -M pc-q35-6.2 "
+ s = qtest_init("-nographic -nodefaults -M pc-q35-rhel9.0.0 "
"-device intel-hda,id=" HDA_ID CODEC_DEVICES);
qtest_outl(s, 0xcf8, 0x80000804);
diff --git a/tests/qtest/libqos/meson.build b/tests/qtest/libqos/meson.build
index e988d15791..46f7dcb81a 100644
--- a/tests/qtest/libqos/meson.build
+++ b/tests/qtest/libqos/meson.build
@@ -41,7 +41,7 @@ libqos_srcs = files('../libqtest.c',
'virtio-rng.c',
'virtio-scsi.c',
'virtio-serial.c',
- 'virtio-iommu.c',
+# 'virtio-iommu.c',
# qgraph machines:
'aarch64-xlnx-zcu102-machine.c',
diff --git a/tests/qtest/lpc-ich9-test.c b/tests/qtest/lpc-ich9-test.c
index fe0bef9980..7a9d51579b 100644
--- a/tests/qtest/lpc-ich9-test.c
+++ b/tests/qtest/lpc-ich9-test.c
@@ -15,7 +15,7 @@ static void test_lp1878642_pci_bus_get_irq_level_assert(void)
{
QTestState *s;
- s = qtest_init("-M pc-q35-5.0 "
+ s = qtest_init("-M pc-q35-rhel8.4.0 "
"-nographic -monitor none -serial none");
qtest_outl(s, 0xcf8, 0x8000f840); /* PMBASE */
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index d25f82bb5a..67cd32def1 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -73,7 +73,6 @@ qtests_i386 = \
config_all_devices.has_key('CONFIG_Q35') and \
config_all_devices.has_key('CONFIG_VIRTIO_PCI') and \
slirp.found() ? ['virtio-net-failover'] : []) + \
- (unpack_edk2_blobs ? ['bios-tables-test'] : []) + \
qtests_pci + \
['fdc-test',
'ide-test',
@@ -86,7 +85,6 @@ qtests_i386 = \
'drive_del-test',
'tco-test',
'cpu-plug-test',
- 'q35-test',
'vmgenid-test',
'migration-test',
'test-x86-cpuid-compat',
@@ -216,7 +214,6 @@ qtests_arm = \
# TODO: once aarch64 TCG is fixed on ARM 32 bit host, make bios-tables-test unconditional
qtests_aarch64 = \
- (cpu != 'arm' and unpack_edk2_blobs ? ['bios-tables-test'] : []) + \
(config_all_devices.has_key('CONFIG_TPM_TIS_SYSBUS') ? ['tpm-tis-device-test'] : []) + \
(config_all_devices.has_key('CONFIG_TPM_TIS_SYSBUS') ? ['tpm-tis-device-swtpm-test'] : []) + \
(config_all_devices.has_key('CONFIG_XLNX_ZYNQMP_ARM') ? ['xlnx-can-test', 'fuzz-xlnx-dp-test'] : []) + \
@@ -231,7 +228,6 @@ qtests_s390x = \
(config_host.has_key('CONFIG_POSIX') ? ['test-filter-redirector'] : []) + \
['boot-serial-test',
'drive_del-test',
- 'device-plug-test',
'virtio-ccw-test',
'cpu-plug-test',
'migration-test']
diff --git a/tests/qtest/usb-hcd-xhci-test.c b/tests/qtest/usb-hcd-xhci-test.c
index 10ef9d2a91..3855873050 100644
--- a/tests/qtest/usb-hcd-xhci-test.c
+++ b/tests/qtest/usb-hcd-xhci-test.c
@@ -21,6 +21,7 @@ static void test_xhci_hotplug(void)
usb_test_hotplug(global_qtest, "xhci", "1", NULL);
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void test_usb_uas_hotplug(void)
{
QTestState *qts = global_qtest;
@@ -36,6 +37,7 @@ static void test_usb_uas_hotplug(void)
qtest_qmp_device_del(qts, "scsihd");
qtest_qmp_device_del(qts, "uas");
}
+#endif
static void test_usb_ccid_hotplug(void)
{
@@ -56,7 +58,9 @@ int main(int argc, char **argv)
qtest_add_func("/xhci/pci/init", test_xhci_init);
qtest_add_func("/xhci/pci/hotplug", test_xhci_hotplug);
+#if 0 /* Disabled for Red Hat Enterprise Linux */
qtest_add_func("/xhci/pci/hotplug/usb-uas", test_usb_uas_hotplug);
+#endif
qtest_add_func("/xhci/pci/hotplug/usb-ccid", test_usb_ccid_hotplug);
qtest_start("-device nec-usb-xhci,id=xhci"
diff --git a/tests/qtest/virtio-net-failover.c b/tests/qtest/virtio-net-failover.c
index 78811f1c92..44de8af00c 100644
--- a/tests/qtest/virtio-net-failover.c
+++ b/tests/qtest/virtio-net-failover.c
@@ -25,6 +25,7 @@
#define PCI_SEL_BASE 0x0010
#define BASE_MACHINE "-M q35 -nodefaults " \
+ "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=on " \
"-device pcie-root-port,id=root0,addr=0x1,bus=pcie.0,chassis=1 " \
"-device pcie-root-port,id=root1,addr=0x2,bus=pcie.0,chassis=2 "
--
2.31.1

View File

@ -1,104 +0,0 @@
From c358fd4c224a9c3f64b4a8fff34cc6b1dc201fa0 Mon Sep 17 00:00:00 2001
From: Bandan Das <bsd@redhat.com>
Date: Tue, 3 Dec 2013 20:05:13 +0100
Subject: vfio: cap number of devices that can be assigned
RH-Author: Bandan Das <bsd@redhat.com>
Message-id: <1386101113-31560-3-git-send-email-bsd@redhat.com>
Patchwork-id: 55984
O-Subject: [PATCH RHEL7 qemu-kvm v2 2/2] vfio: cap number of devices that can be assigned
Bugzilla: 678368
RH-Acked-by: Alex Williamson <alex.williamson@redhat.com>
RH-Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
RH-Acked-by: Michael S. Tsirkin <mst@redhat.com>
Go through all groups to get count of total number of devices
active to enforce limit
Reasoning from Alex for the limit(32) - Assuming 3 slots per
device, with 125 slots (number of memory slots for RHEL 7),
we can support almost 40 devices and still have few slots left
for other uses. Stepping down a bit, the number 32 arbitrarily
matches the number of slots on a PCI bus and is also a nice power
of two.
Count of slots increased to 509 later so we could increase limit
to 64 as some usecases require more than 32 devices.
Signed-off-by: Bandan Das <bsd@redhat.com>
---
hw/vfio/pci.c | 29 ++++++++++++++++++++++++++++-
hw/vfio/pci.h | 1 +
2 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 67a183f17b..1e20f9fd59 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -45,6 +45,9 @@
#define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug"
+/* RHEL only: Set once for the first assigned dev */
+static uint16_t device_limit;
+
static void vfio_disable_interrupts(VFIOPCIDevice *vdev);
static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
@@ -2810,9 +2813,30 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
ssize_t len;
struct stat st;
int groupid;
- int i, ret;
+ int ret, i = 0;
bool is_mdev;
+ if (device_limit && device_limit != vdev->assigned_device_limit) {
+ error_setg(errp, "Assigned device limit has been redefined. "
+ "Old:%d, New:%d",
+ device_limit, vdev->assigned_device_limit);
+ return;
+ } else {
+ device_limit = vdev->assigned_device_limit;
+ }
+
+ QLIST_FOREACH(group, &vfio_group_list, next) {
+ QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
+ i++;
+ }
+ }
+
+ if (i >= vdev->assigned_device_limit) {
+ error_setg(errp, "Maximum supported vfio devices (%d) "
+ "already attached", vdev->assigned_device_limit);
+ return;
+ }
+
if (!vdev->vbasedev.sysfsdev) {
if (!(~vdev->host.domain || ~vdev->host.bus ||
~vdev->host.slot || ~vdev->host.function)) {
@@ -3249,6 +3273,9 @@ static Property vfio_pci_dev_properties[] = {
DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),
DEFINE_PROP_BOOL("x-no-geforce-quirks", VFIOPCIDevice,
no_geforce_quirks, false),
+ /* RHEL only */
+ DEFINE_PROP_UINT16("x-assigned-device-limit", VFIOPCIDevice,
+ assigned_device_limit, 64),
DEFINE_PROP_BOOL("x-no-kvm-ioeventfd", VFIOPCIDevice, no_kvm_ioeventfd,
false),
DEFINE_PROP_BOOL("x-no-vfio-ioeventfd", VFIOPCIDevice, no_vfio_ioeventfd,
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 64777516d1..e0fe6ca97e 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -139,6 +139,7 @@ struct VFIOPCIDevice {
EventNotifier err_notifier;
EventNotifier req_notifier;
int (*resetfn)(struct VFIOPCIDevice *);
+ uint16_t assigned_device_limit;
uint32_t vendor_id;
uint32_t device_id;
uint32_t sub_vendor_id;
--
2.31.1

View File

@ -1,55 +0,0 @@
From ba0c7a5f6b9a1f75666db6b3b795ddf03695dc26 Mon Sep 17 00:00:00 2001
From: Eduardo Habkost <ehabkost@redhat.com>
Date: Wed, 4 Dec 2013 18:53:17 +0100
Subject: Add support statement to -help output
RH-Author: Eduardo Habkost <ehabkost@redhat.com>
Message-id: <1386183197-27761-1-git-send-email-ehabkost@redhat.com>
Patchwork-id: 55994
O-Subject: [qemu-kvm RHEL7 PATCH] Add support statement to -help output
Bugzilla: 972773
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: knoel@redhat.com
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Add support statement to -help output, reporting direct qemu-kvm usage
as unsupported by Red Hat, and advising users to use libvirt instead.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
softmmu/vl.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 6f646531a0..9d5dab43d2 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -831,9 +831,17 @@ static void version(void)
QEMU_COPYRIGHT "\n");
}
+static void print_rh_warning(void)
+{
+ printf("\nWARNING: Direct use of qemu-kvm from the command line is not supported by Red Hat.\n"
+ "WARNING: Use libvirt as the stable management interface.\n"
+ "WARNING: Some command line options listed here may not be available in future releases.\n\n");
+}
+
static void help(int exitcode)
{
version();
+ print_rh_warning();
printf("usage: %s [options] [disk_image]\n\n"
"'disk_image' is a raw hard disk image for IDE hard disk 0\n\n",
g_get_prgname());
@@ -859,6 +867,7 @@ static void help(int exitcode)
"\n"
QEMU_HELP_BOTTOM "\n");
+ print_rh_warning();
exit(exitcode);
}
--
2.31.1

View File

@ -1,45 +0,0 @@
From 9ebfd2f6cfa8e79c92e58fd169f90cc768fb865a Mon Sep 17 00:00:00 2001
From: Andrew Jones <drjones@redhat.com>
Date: Tue, 21 Jan 2014 10:46:52 +0100
Subject: globally limit the maximum number of CPUs
We now globally limit the number of VCPUs.
Especially, there is no way one can specify more than
max_cpus VCPUs for a VM.
This allows us the restore the ppc max_cpus limitation to the upstream
default and minimize the ppc hack in kvm-all.c.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Danilo Cesar Lemes de Paula <ddepaula@redhat.com>
---
accel/kvm/kvm-all.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 5f1377ca04..fdf0e4d429 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2430,6 +2430,18 @@ static int kvm_init(MachineState *ms)
soft_vcpus_limit = kvm_recommended_vcpus(s);
hard_vcpus_limit = kvm_max_vcpus(s);
+#ifdef HOST_PPC64
+ /*
+ * On POWER, the kernel advertises a soft limit based on the
+ * number of CPU threads on the host. We want to allow exceeding
+ * this for testing purposes, so we don't want to set hard limit
+ * to soft limit as on x86.
+ */
+#else
+ /* RHEL doesn't support nr_vcpus > soft_vcpus_limit */
+ hard_vcpus_limit = soft_vcpus_limit;
+#endif
+
while (nc->name) {
if (nc->num > soft_vcpus_limit) {
warn_report("Number of %s cpus requested (%d) exceeds "
--
2.31.1

View File

@ -1,61 +0,0 @@
From 4b6c8cdc52fdf94d4098d278defb3833dce1d189 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Wed, 8 Jul 2020 08:35:50 +0200
Subject: Use qemu-kvm in documentation instead of qemu-system-<arch>
Patchwork-id: 62380
O-Subject: [RHEV-7.1 qemu-kvm-rhev PATCHv4] Use qemu-kvm in documentation instead of qemu-system-i386
Bugzilla: 1140620
RH-Acked-by: Laszlo Ersek <lersek@redhat.com>
RH-Acked-by: Markus Armbruster <armbru@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
From: Miroslav Rezanina <mrezanin@redhat.com>
We change the name and location of qemu-kvm binaries. Update documentation
to reflect this change. Only architectures available in RHEL are updated.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
---
docs/defs.rst.inc | 4 ++--
qemu-options.hx | 10 +++++-----
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/docs/defs.rst.inc b/docs/defs.rst.inc
index 52d6454b93..d74dbdeca9 100644
--- a/docs/defs.rst.inc
+++ b/docs/defs.rst.inc
@@ -9,7 +9,7 @@
but the manpages will end up misrendered with following normal text
incorrectly in boldface.
-.. |qemu_system| replace:: qemu-system-x86_64
-.. |qemu_system_x86| replace:: qemu-system-x86_64
+.. |qemu_system| replace:: qemu-kvm
+.. |qemu_system_x86| replace:: qemu-kvm
.. |I2C| replace:: I\ :sup:`2`\ C
.. |I2S| replace:: I\ :sup:`2`\ S
diff --git a/qemu-options.hx b/qemu-options.hx
index 34e9b32a5c..924f61ab6d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3233,11 +3233,11 @@ SRST
::
- qemu -m 512 -object memory-backend-file,id=mem,size=512M,mem-path=/hugetlbfs,share=on \
- -numa node,memdev=mem \
- -chardev socket,id=chr0,path=/path/to/socket \
- -netdev type=vhost-user,id=net0,chardev=chr0 \
- -device virtio-net-pci,netdev=net0
+ qemu-kvm -m 512 -object memory-backend-file,id=mem,size=512M,mem-path=/hugetlbfs,share=on \
+ -numa node,memdev=mem \
+ -chardev socket,id=chr0,path=/path/to/socket \
+ -netdev type=vhost-user,id=net0,chardev=chr0 \
+ -device virtio-net-pci,netdev=net0
``-netdev vhost-vdpa,vhostdev=/path/to/dev``
Establish a vhost-vdpa netdev.
--
2.31.1

View File

@ -1,66 +0,0 @@
From b72e04cb7e417d9e1c973223747ab3a27abda8b4 Mon Sep 17 00:00:00 2001
From: Fam Zheng <famz@redhat.com>
Date: Wed, 14 Jun 2017 15:37:01 +0200
Subject: virtio-scsi: Reject scsi-cd if data plane enabled [RHEL only]
RH-Author: Fam Zheng <famz@redhat.com>
Message-id: <20170614153701.14757-1-famz@redhat.com>
Patchwork-id: 75613
O-Subject: [RHV-7.4 qemu-kvm-rhev PATCH v3] virtio-scsi: Reject scsi-cd if data plane enabled [RHEL only]
Bugzilla: 1378816
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Max Reitz <mreitz@redhat.com>
We need a fix for RHEL 7.4 and 7.3.z, but unfortunately upstream isn't
ready. If it were, the changes will be too invasive. To have an idea:
https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg05400.html
is an incomplete attempt to fix part of the issue, and the remaining
work unfortunately involve even more complex changes.
As a band-aid, this partially reverts the effect of ef8875b
(virtio-scsi: Remove op blocker for dataplane, since v2.7). We cannot
simply revert that commit as a whole because we already shipped it in
qemu-kvm-rhev 7.3, since when, block jobs has been possible. We should
only block what has been broken. Also, faithfully reverting the above
commit means adding back the removed op blocker, but that is not enough,
because it still crashes when inserting media into an initially empty
scsi-cd.
All in all, scsi-cd on virtio-scsi-dataplane has basically been unusable
unless the scsi-cd never enters an empty state, so, disable it
altogether. Otherwise it would be much more difficult to avoid
crashing.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
---
hw/scsi/virtio-scsi.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 34a968ecfb..7f6da33a8a 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -896,6 +896,15 @@ static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
AioContext *old_context;
int ret;
+ /* XXX: Remove this check once block backend is capable of handling
+ * AioContext change upon eject/insert.
+ * s->ctx is NULL if ioeventfd is off, s->ctx is qemu_get_aio_context() if
+ * data plane is not used, both cases are safe for scsi-cd. */
+ if (s->ctx && s->ctx != qemu_get_aio_context() &&
+ object_dynamic_cast(OBJECT(dev), "scsi-cd")) {
+ error_setg(errp, "scsi-cd is not supported by data plane");
+ return;
+ }
if (s->ctx && !s->dataplane_fenced) {
if (blk_op_is_blocked(sd->conf.blk, BLOCK_OP_TYPE_DATAPLANE, errp)) {
return;
--
2.31.1

View File

@ -1,60 +0,0 @@
From 64a06662cdea0ff62efb122be4eab506b2a842d9 Mon Sep 17 00:00:00 2001
From: David Gibson <dgibson@redhat.com>
Date: Wed, 6 Feb 2019 03:58:56 +0000
Subject: BZ1653590: Require at least 64kiB pages for downstream guests & hosts
RH-Author: David Gibson <dgibson@redhat.com>
Message-id: <20190206035856.19058-1-dgibson@redhat.com>
Patchwork-id: 84246
O-Subject: [RHELAV-8.0/rhel qemu-kvm PATCH] BZ1653590: Require at least 64kiB pages for downstream guests & hosts
Bugzilla: 1653590
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
RH-Acked-by: Serhii Popovych <spopovyc@redhat.com>
RH-Acked-by: Thomas Huth <thuth@redhat.com>
Most current POWER guests require 64kiB page support, so that's the default
for the cap-hpt-max-pagesize option in qemu which limits available guest
page sizes. We warn if the value is set smaller than that, but don't
outright fail upstream, because we need to allow for the possibility of
guest (and/or host) kernels configured for 4kiB page sizes.
Downstream, however, we simply don't support 4kiB pagesize configured
kernels in guest or host, so we can have qemu simply error out in this
situation.
Testing: Attempted to start a guest with cap-hpt-max-page-size=4k and verified
it failed immediately with a qemu error
Signed-off-by: David Gibson <dgibson@redhat.com>
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
---
hw/ppc/spapr_caps.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 655ab856a0..6aa7f93df9 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -329,12 +329,19 @@ bool spapr_check_pagesize(SpaprMachineState *spapr, hwaddr pagesize,
static void cap_hpt_maxpagesize_apply(SpaprMachineState *spapr,
uint8_t val, Error **errp)
{
+#if 0 /* disabled for RHEL */
if (val < 12) {
error_setg(errp, "Require at least 4kiB hpt-max-page-size");
return;
} else if (val < 16) {
warn_report("Many guests require at least 64kiB hpt-max-page-size");
}
+#else /* Only page sizes >=64kiB supported for RHEL */
+ if (val < 16) {
+ error_setg(errp, "Require at least 64kiB hpt-max-page-size");
+ return;
+ }
+#endif
spapr_check_pagesize(spapr, qemu_minrampagesize(), errp);
}
--
2.31.1

View File

@ -1,77 +0,0 @@
From 54f9157a918e1404f2f17ce89a9c8b9088c1bc06 Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Fri, 20 Aug 2021 18:25:12 +0200
Subject: qcow2: Deprecation warning when opening v2 images rw
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 37: qcow2: Deprecation warning when opening v2 images rw
RH-Commit: [1/1] f450d0ae32d35063b28c72c4f2d2ebb9e6d8db3e (kmwolf/centos-qemu-kvm)
RH-Bugzilla: 1951814
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Hanna Reitz <hreitz@redhat.com>
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Philippe Mathieu-Daudé <philmd@redhat.com>
qcow2 v3 has been around for a long time (since QEMU 1.1/RHEL 7), so
there is no real reason any more to use it. People still using it might
do so unintentionally. Warn about it and suggest upgrading during the
RHEL 9 timeframe so that the code can possibly be disabled in RHEL 10.
The warning is restricted to read-write mode and the system emulator.
The primary motivation for not having it in qemu-img is that 'qemu-img
amend' for upgrades would warn otherwise. It also avoids having to make
too many changes to the test suite.
bdrv_uses_whitelist() is used as a proxy for deciding whether we are
running in a tool or the system emulator. This is not entirely clean,
but it's what is available and the same function qcow2_do_open() already
uses it this way for another warning.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
patch_name: kvm-qcow2-Deprecation-warning-when-opening-v2-images-rw.patch
present_in_specfile: true
location_in_specfile: 116
---
Rebase notes (6.1.0):
- Replace bs->read_only with bdrv_is_read_only
---
block/qcow2.c | 6 ++++++
tests/qemu-iotests/common.filter | 1 +
2 files changed, 7 insertions(+)
diff --git a/block/qcow2.c b/block/qcow2.c
index b5c47931ef..a795e457ac 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1337,6 +1337,12 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
ret = -ENOTSUP;
goto fail;
}
+ if (header.version < 3 && !bdrv_is_read_only(bs) && bdrv_uses_whitelist()) {
+ warn_report_once("qcow2 v2 images are deprecated and may not be "
+ "supported in future versions. Please consider "
+ "upgrading the image with 'qemu-img amend "
+ "-o compat=v3'.");
+ }
s->qcow_version = header.version;
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index cc9f1a5891..6a13757177 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -83,6 +83,7 @@ _filter_qemu()
{
gsed -e "s#\\(^\\|(qemu) \\)$(basename $QEMU_PROG):#\1QEMU_PROG:#" \
-e 's#^QEMU [0-9]\+\.[0-9]\+\.[0-9]\+ monitor#QEMU X.Y.Z monitor#' \
+ -e "/qcow2 v2 images are deprecated/d" \
-e $'s#\r##' # QEMU monitor uses \r\n line endings
}
--
2.31.1

View File

@ -1,135 +0,0 @@
From 1d6439527aa6ccabb58208c94417778ccc19de39 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Wed, 9 Feb 2022 04:16:25 -0500
Subject: WRB: Introduce RHEL 9.0.0 hw compat structure
General compatibility structure for post RHEL 9.0.0 rebase.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
---
hw/core/machine.c | 9 +++++++++
hw/i386/pc.c | 6 ++++++
hw/i386/pc_piix.c | 4 ++++
hw/i386/pc_q35.c | 4 ++++
hw/s390x/s390-virtio-ccw.c | 2 ++
include/hw/boards.h | 3 +++
include/hw/i386/pc.h | 3 +++
7 files changed, 31 insertions(+)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 28989b6e7b..dffc3ef4ab 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -53,6 +53,15 @@ GlobalProperty hw_compat_rhel_8_6[] = {
};
const size_t hw_compat_rhel_8_6_len = G_N_ELEMENTS(hw_compat_rhel_8_6);
+/*
+ * Mostly the same as hw_compat_6_2
+ */
+GlobalProperty hw_compat_rhel_9_0[] = {
+ /* hw_compat_rhel_9_0 from hw_compat_6_2 */
+ { "PIIX4_PM", "x-not-migrate-acpi-index", "on"},
+};
+const size_t hw_compat_rhel_9_0_len = G_N_ELEMENTS(hw_compat_rhel_9_0);
+
/*
* Mostly the same as hw_compat_6_0 and hw_compat_6_1
*/
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 263d882af6..0886cfe3fe 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -391,6 +391,12 @@ GlobalProperty pc_rhel_compat[] = {
};
const size_t pc_rhel_compat_len = G_N_ELEMENTS(pc_rhel_compat);
+GlobalProperty pc_rhel_9_0_compat[] = {
+ /* pc_rhel_9_0_compat from pc_compat_6_2 */
+ { "virtio-mem", "unplugged-inaccessible", "off" },
+};
+const size_t pc_rhel_9_0_compat_len = G_N_ELEMENTS(pc_rhel_9_0_compat);
+
GlobalProperty pc_rhel_8_5_compat[] = {
/* pc_rhel_8_5_compat from pc_compat_6_0 */
{ "qemu64" "-" TYPE_X86_CPU, "family", "6" },
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 0cacc0d623..dc987fe93b 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -1014,6 +1014,10 @@ static void pc_machine_rhel760_options(MachineClass *m)
pcmc->kvmclock_create_always = false;
/* From pc_i440fx_5_1_machine_options() */
pcmc->pci_root_uid = 1;
+ compat_props_add(m->compat_props, hw_compat_rhel_9_0,
+ hw_compat_rhel_9_0_len);
+ compat_props_add(m->compat_props, pc_rhel_9_0_compat,
+ pc_rhel_9_0_compat_len);
compat_props_add(m->compat_props, hw_compat_rhel_8_6,
hw_compat_rhel_8_6_len);
compat_props_add(m->compat_props, hw_compat_rhel_8_5,
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 157160e069..52c253c570 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -669,6 +669,10 @@ static void pc_q35_machine_rhel900_options(MachineClass *m)
m->desc = "RHEL-9.0.0 PC (Q35 + ICH9, 2009)";
pcmc->smbios_stream_product = "RHEL";
pcmc->smbios_stream_version = "9.0.0";
+ compat_props_add(m->compat_props, hw_compat_rhel_9_0,
+ hw_compat_rhel_9_0_len);
+ compat_props_add(m->compat_props, pc_rhel_9_0_compat,
+ pc_rhel_9_0_compat_len);
}
DEFINE_PC_MACHINE(q35_rhel900, "pc-q35-rhel9.0.0", pc_q35_init_rhel900,
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 465a2a09d2..08e0f6a79b 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -1118,12 +1118,14 @@ static void ccw_machine_2_4_class_options(MachineClass *mc)
DEFINE_CCW_MACHINE(2_4, "2.4", false);
#endif
+
static void ccw_machine_rhel900_instance_options(MachineState *machine)
{
}
static void ccw_machine_rhel900_class_options(MachineClass *mc)
{
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_0, hw_compat_rhel_9_0_len);
}
DEFINE_CCW_MACHINE(rhel900, "rhel9.0.0", true);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index d1555665df..635e45dd71 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -451,6 +451,9 @@ extern const size_t hw_compat_2_2_len;
extern GlobalProperty hw_compat_2_1[];
extern const size_t hw_compat_2_1_len;
+extern GlobalProperty hw_compat_rhel_9_0[];
+extern const size_t hw_compat_rhel_9_0_len;
+
extern GlobalProperty hw_compat_rhel_8_6[];
extern const size_t hw_compat_rhel_8_6_len;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 419a6ec24b..a492c420b5 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -292,6 +292,9 @@ extern const size_t pc_compat_1_4_len;
extern GlobalProperty pc_rhel_compat[];
extern const size_t pc_rhel_compat_len;
+extern GlobalProperty pc_rhel_9_0_compat[];
+extern const size_t pc_rhel_9_0_compat_len;
+
extern GlobalProperty pc_rhel_8_5_compat[];
extern const size_t pc_rhel_8_5_compat_len;
--
2.31.1

View File

@ -1,38 +0,0 @@
From c8ad21ca31892f8798cf82508c2b2c61bf3b9895 Mon Sep 17 00:00:00 2001
From: Thomas Huth <thuth@redhat.com>
Date: Mon, 4 Apr 2022 12:15:50 +0200
Subject: redhat: Update s390x machine type compatibility for rebase to QEMU
7.0.0
RH-Author: Thomas Huth <thuth@redhat.com>
RH-MergeRequest: 143: Update machine type compatibility for QEMU 7.0.0 update [s390x]
RH-Commit: [23/23] 0ecf97d7bdddc50565b5779c64744b353f715cbd
RH-Bugzilla: 2064782
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
No s390x-specific machine class property updates required this time,
only an update to the default qemu cpu model.
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
hw/s390x/s390-virtio-ccw.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 08e0f6a79b..4a491d4988 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -1121,6 +1121,9 @@ DEFINE_CCW_MACHINE(2_4, "2.4", false);
static void ccw_machine_rhel900_instance_options(MachineState *machine)
{
+ static const S390FeatInit qemu_cpu_feat = { S390_FEAT_LIST_QEMU_V6_2 };
+
+ s390_set_qemu_cpu_model(0x3906, 14, 2, qemu_cpu_feat);
}
static void ccw_machine_rhel900_class_options(MachineClass *mc)
--
2.31.1

View File

@ -1,70 +0,0 @@
From 38b89dc24551258b630f09d1c654b6c72b265c79 Mon Sep 17 00:00:00 2001
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Date: Thu, 14 Apr 2022 14:58:43 +0100
Subject: pc: Move s3/s4 suspend disabling to compat
RH-Author: Dr. David Alan Gilbert <dgilbert@redhat.com>
RH-MergeRequest: 155: 7.0 machine type fixes (x86)
RH-Commit: [26/26] 7d666032d5f5dab1444ebba085f92f2de4e86699
RH-Bugzilla: 2064771
Our downstream patches currently have tweaks in the C code to disable
s3/s4; Thomas pointed out we can just set the property.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
hw/acpi/ich9.c | 4 ++--
hw/acpi/piix4.c | 4 ++--
hw/i386/pc.c | 6 ++++++
3 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index de1e401cdf..bd9bbade70 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -435,8 +435,8 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm)
static const uint32_t gpe0_len = ICH9_PMIO_GPE0_LEN;
pm->acpi_memory_hotplug.is_enabled = true;
pm->cpu_hotplug_legacy = true;
- pm->disable_s3 = 1;
- pm->disable_s4 = 1;
+ pm->disable_s3 = 0;
+ pm->disable_s4 = 0;
pm->s4_val = 2;
pm->use_acpi_hotplug_bridge = true;
pm->keep_pci_slot_hpc = true;
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 28544e78c3..2fb2b43248 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -653,8 +653,8 @@ static void piix4_send_gpe(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
static Property piix4_pm_properties[] = {
DEFINE_PROP_UINT32("smb_io_base", PIIX4PMState, smb_io_base, 0),
- DEFINE_PROP_UINT8(ACPI_PM_PROP_S3_DISABLED, PIIX4PMState, disable_s3, 1),
- DEFINE_PROP_UINT8(ACPI_PM_PROP_S4_DISABLED, PIIX4PMState, disable_s4, 1),
+ DEFINE_PROP_UINT8(ACPI_PM_PROP_S3_DISABLED, PIIX4PMState, disable_s3, 0),
+ DEFINE_PROP_UINT8(ACPI_PM_PROP_S4_DISABLED, PIIX4PMState, disable_s4, 0),
DEFINE_PROP_UINT8(ACPI_PM_PROP_S4_VAL, PIIX4PMState, s4_val, 2),
DEFINE_PROP_BOOL(ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, PIIX4PMState,
use_acpi_hotplug_bridge, true),
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0886cfe3fe..f98f842f80 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -380,6 +380,12 @@ const size_t pc_compat_1_4_len = G_N_ELEMENTS(pc_compat_1_4);
* machine type.
*/
GlobalProperty pc_rhel_compat[] = {
+ /* we don't support s3/s4 suspend */
+ { "PIIX4_PM", "disable_s3", "1" },
+ { "PIIX4_PM", "disable_s4", "1" },
+ { "ICH9-LPC", "disable_s3", "1" },
+ { "ICH9-LPC", "disable_s4", "1" },
+
{ TYPE_X86_CPU, "host-phys-bits", "on" },
{ TYPE_X86_CPU, "host-phys-bits-limit", "48" },
{ TYPE_X86_CPU, "vmx-entry-load-perf-global-ctrl", "off" },
--
2.31.1

View File

@ -1,19 +0,0 @@
===================
qemu-kvm development
===================
qemu-kvm is maintained in a `source tree`_ rather than directly in dist-git.
This provides way to develope using regular source code structure and provides
way to generate SRPM and build using koji service. In addition, local build using
CentOS 9 Stream specific configuration.
Developers deliver all changes to source-git using merge request. Only maintainers
will be pushing changes sent to source-git to dist-git.
Each release in dist-git is tagged in the source repository so you can easily
check out the source tree for a build. The tags are in the format
name-version-release, but note release doesn't contain the dist tag since the
source can be built in different build roots (Fedora, CentOS, etc.)
.. _source tree: https://gitlab.com/redhat/centos-stream/src/qemu-kvm

View File

@ -1,9 +0,0 @@
# recipients: kvmqe-ci, yfu
--- !Policy
product_versions:
- rhel-9
decision_context: osci_compose_gate
subject_type: brew-build
rules:
- !PassingTestCaseRule {test_case_name: kvm-ci.qemu-kvm.x86_64-intel.brew-build.gating.tier1.functional}
- !PassingTestCaseRule {test_case_name: kvm-ci.qemu-kvm.x86_64-amd.brew-build.gating.tier1.functional}

View File

@ -1,41 +0,0 @@
From 85781b8745fa1581a66f64011d61a4f0c4e103dc Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Fri, 6 May 2022 17:03:11 +0200
Subject: [PATCH 3/5] Enable virtio-iommu-pci on aarch64
RH-Author: Eric Auger <eric.auger@redhat.com>
RH-MergeRequest: 83: Enable virtio-iommu-pci on aarch64
RH-Commit: [1/1] 23e5c0832e52c66adf5fd6daccdc3edddc7ecb8b (eauger1/centos-qemu-kvm)
RH-Bugzilla: 1477099
RH-Acked-by: Gavin Shan <gshan@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1477099
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=45128798
Upstream Status: RHEL-only
Tested: With virtio-net-pci and virtio-block-pci
let's enable the virtio-iommu-pci device on aarch64 by
turning CONFIG_VIRTIO_IOMMU on.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
configs/devices/aarch64-softmmu/aarch64-rh-devices.mak | 1 +
1 file changed, 1 insertion(+)
diff --git a/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak b/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
index 187938573f..1618d31b89 100644
--- a/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
+++ b/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
@@ -23,6 +23,7 @@ CONFIG_VFIO_PCI=y
CONFIG_VIRTIO_MMIO=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_MEM=y
+CONFIG_VIRTIO_IOMMU=y
CONFIG_XIO3130=y
CONFIG_NVDIMM=y
CONFIG_ACPI_APEI=y
--
2.31.1

View File

@ -1,41 +0,0 @@
From c531a39171201f8a1d063e6af752e5d629c1b4bf Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Thu, 9 Jun 2022 11:35:18 +0200
Subject: [PATCH 4/6] Enable virtio-iommu-pci on x86_64
RH-Author: Eric Auger <eric.auger@redhat.com>
RH-MergeRequest: 100: Enable virtio-iommu-pci on x86_64
RH-Commit: [1/1] a164af477efc7cb9d3d76a0e644f198f7c9fb2b5 (eauger1/centos-qemu-kvm)
RH-Bugzilla: 2094252
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: MST <mst@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094252
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=45871185
Upstream Status: RHEL-only
Tested: With virtio-net-pci and virtio-block-pci
let's enable the virtio-iommu-pci device on x86_64 by
turning CONFIG_VIRTIO_IOMMU on.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
configs/devices/x86_64-softmmu/x86_64-rh-devices.mak | 1 +
1 file changed, 1 insertion(+)
diff --git a/configs/devices/x86_64-softmmu/x86_64-rh-devices.mak b/configs/devices/x86_64-softmmu/x86_64-rh-devices.mak
index d0c9e66641..3850b9de72 100644
--- a/configs/devices/x86_64-softmmu/x86_64-rh-devices.mak
+++ b/configs/devices/x86_64-softmmu/x86_64-rh-devices.mak
@@ -90,6 +90,7 @@ CONFIG_VHOST_USER_BLK=y
CONFIG_VIRTIO_MEM=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_VGA=y
+CONFIG_VIRTIO_IOMMU=y
CONFIG_VMMOUSE=y
CONFIG_VMPORT=y
CONFIG_VTD=y
--
2.31.1

View File

@ -1,503 +0,0 @@
From 1163da281c178359dd7e1cf1ced5c98caa600f8e Mon Sep 17 00:00:00 2001
From: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Date: Mon, 25 Apr 2022 09:57:21 +0200
Subject: [PATCH 01/16] Introduce event-loop-base abstract class
RH-Author: Nicolas Saenz Julienne <nsaenzju@redhat.com>
RH-MergeRequest: 93: util/thread-pool: Expose minimum and maximum size
RH-Commit: [1/3] 5817205d8f56cc4aa98bd5963ecac54a59bad990
RH-Bugzilla: 2031024
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduce the 'event-loop-base' abstract class, it'll hold the
properties common to all event loops and provide the necessary hooks for
their creation and maintenance. Then have iothread inherit from it.
EventLoopBaseClass is defined as user creatable and provides a hook for
its children to attach themselves to the user creatable class 'complete'
function. It also provides an update_params() callback to propagate
property changes onto its children.
The new 'event-loop-base' class will live in the root directory. It is
built on its own using the 'link_whole' option (there are no direct
function dependencies between the class and its children, it all happens
trough 'constructor' magic). And also imposes new compilation
dependencies:
qom <- event-loop-base <- blockdev (iothread.c)
And in subsequent patches:
qom <- event-loop-base <- qemuutil (util/main-loop.c)
All this forced some amount of reordering in meson.build:
- Moved qom build definition before qemuutil. Doing it the other way
around (i.e. moving qemuutil after qom) isn't possible as a lot of
core libraries that live in between the two depend on it.
- Process the 'hw' subdir earlier, as it introduces files into the
'qom' source set.
No functional changes intended.
Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20220425075723.20019-2-nsaenzju@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7d5983e3c8c40b1d0668faba31d79905c4fadd7d)
---
event-loop-base.c | 104 +++++++++++++++++++++++++++++++
include/sysemu/event-loop-base.h | 36 +++++++++++
include/sysemu/iothread.h | 6 +-
iothread.c | 65 ++++++-------------
meson.build | 23 ++++---
qapi/qom.json | 22 +++++--
6 files changed, 192 insertions(+), 64 deletions(-)
create mode 100644 event-loop-base.c
create mode 100644 include/sysemu/event-loop-base.h
diff --git a/event-loop-base.c b/event-loop-base.c
new file mode 100644
index 0000000000..a924c73a7c
--- /dev/null
+++ b/event-loop-base.c
@@ -0,0 +1,104 @@
+/*
+ * QEMU event-loop base
+ *
+ * Copyright (C) 2022 Red Hat Inc
+ *
+ * Authors:
+ * Stefan Hajnoczi <stefanha@redhat.com>
+ * Nicolas Saenz Julienne <nsaenzju@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qom/object_interfaces.h"
+#include "qapi/error.h"
+#include "sysemu/event-loop-base.h"
+
+typedef struct {
+ const char *name;
+ ptrdiff_t offset; /* field's byte offset in EventLoopBase struct */
+} EventLoopBaseParamInfo;
+
+static EventLoopBaseParamInfo aio_max_batch_info = {
+ "aio-max-batch", offsetof(EventLoopBase, aio_max_batch),
+};
+
+static void event_loop_base_get_param(Object *obj, Visitor *v,
+ const char *name, void *opaque, Error **errp)
+{
+ EventLoopBase *event_loop_base = EVENT_LOOP_BASE(obj);
+ EventLoopBaseParamInfo *info = opaque;
+ int64_t *field = (void *)event_loop_base + info->offset;
+
+ visit_type_int64(v, name, field, errp);
+}
+
+static void event_loop_base_set_param(Object *obj, Visitor *v,
+ const char *name, void *opaque, Error **errp)
+{
+ EventLoopBaseClass *bc = EVENT_LOOP_BASE_GET_CLASS(obj);
+ EventLoopBase *base = EVENT_LOOP_BASE(obj);
+ EventLoopBaseParamInfo *info = opaque;
+ int64_t *field = (void *)base + info->offset;
+ int64_t value;
+
+ if (!visit_type_int64(v, name, &value, errp)) {
+ return;
+ }
+
+ if (value < 0) {
+ error_setg(errp, "%s value must be in range [0, %" PRId64 "]",
+ info->name, INT64_MAX);
+ return;
+ }
+
+ *field = value;
+
+ if (bc->update_params) {
+ bc->update_params(base, errp);
+ }
+
+ return;
+}
+
+static void event_loop_base_complete(UserCreatable *uc, Error **errp)
+{
+ EventLoopBaseClass *bc = EVENT_LOOP_BASE_GET_CLASS(uc);
+ EventLoopBase *base = EVENT_LOOP_BASE(uc);
+
+ if (bc->init) {
+ bc->init(base, errp);
+ }
+}
+
+static void event_loop_base_class_init(ObjectClass *klass, void *class_data)
+{
+ UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
+ ucc->complete = event_loop_base_complete;
+
+ object_class_property_add(klass, "aio-max-batch", "int",
+ event_loop_base_get_param,
+ event_loop_base_set_param,
+ NULL, &aio_max_batch_info);
+}
+
+static const TypeInfo event_loop_base_info = {
+ .name = TYPE_EVENT_LOOP_BASE,
+ .parent = TYPE_OBJECT,
+ .instance_size = sizeof(EventLoopBase),
+ .class_size = sizeof(EventLoopBaseClass),
+ .class_init = event_loop_base_class_init,
+ .abstract = true,
+ .interfaces = (InterfaceInfo[]) {
+ { TYPE_USER_CREATABLE },
+ { }
+ }
+};
+
+static void register_types(void)
+{
+ type_register_static(&event_loop_base_info);
+}
+type_init(register_types);
diff --git a/include/sysemu/event-loop-base.h b/include/sysemu/event-loop-base.h
new file mode 100644
index 0000000000..8e77d8b69f
--- /dev/null
+++ b/include/sysemu/event-loop-base.h
@@ -0,0 +1,36 @@
+/*
+ * QEMU event-loop backend
+ *
+ * Copyright (C) 2022 Red Hat Inc
+ *
+ * Authors:
+ * Nicolas Saenz Julienne <nsaenzju@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#ifndef QEMU_EVENT_LOOP_BASE_H
+#define QEMU_EVENT_LOOP_BASE_H
+
+#include "qom/object.h"
+#include "block/aio.h"
+#include "qemu/typedefs.h"
+
+#define TYPE_EVENT_LOOP_BASE "event-loop-base"
+OBJECT_DECLARE_TYPE(EventLoopBase, EventLoopBaseClass,
+ EVENT_LOOP_BASE)
+
+struct EventLoopBaseClass {
+ ObjectClass parent_class;
+
+ void (*init)(EventLoopBase *base, Error **errp);
+ void (*update_params)(EventLoopBase *base, Error **errp);
+};
+
+struct EventLoopBase {
+ Object parent;
+
+ /* AioContext AIO engine parameters */
+ int64_t aio_max_batch;
+};
+#endif
diff --git a/include/sysemu/iothread.h b/include/sysemu/iothread.h
index 7f714bd136..8f8601d6ab 100644
--- a/include/sysemu/iothread.h
+++ b/include/sysemu/iothread.h
@@ -17,11 +17,12 @@
#include "block/aio.h"
#include "qemu/thread.h"
#include "qom/object.h"
+#include "sysemu/event-loop-base.h"
#define TYPE_IOTHREAD "iothread"
struct IOThread {
- Object parent_obj;
+ EventLoopBase parent_obj;
QemuThread thread;
AioContext *ctx;
@@ -37,9 +38,6 @@ struct IOThread {
int64_t poll_max_ns;
int64_t poll_grow;
int64_t poll_shrink;
-
- /* AioContext AIO engine parameters */
- int64_t aio_max_batch;
};
typedef struct IOThread IOThread;
diff --git a/iothread.c b/iothread.c
index 0f98af0f2a..8fa2f3bfb8 100644
--- a/iothread.c
+++ b/iothread.c
@@ -17,6 +17,7 @@
#include "qemu/module.h"
#include "block/aio.h"
#include "block/block.h"
+#include "sysemu/event-loop-base.h"
#include "sysemu/iothread.h"
#include "qapi/error.h"
#include "qapi/qapi-commands-misc.h"
@@ -152,10 +153,15 @@ static void iothread_init_gcontext(IOThread *iothread)
iothread->main_loop = g_main_loop_new(iothread->worker_context, TRUE);
}
-static void iothread_set_aio_context_params(IOThread *iothread, Error **errp)
+static void iothread_set_aio_context_params(EventLoopBase *base, Error **errp)
{
+ IOThread *iothread = IOTHREAD(base);
ERRP_GUARD();
+ if (!iothread->ctx) {
+ return;
+ }
+
aio_context_set_poll_params(iothread->ctx,
iothread->poll_max_ns,
iothread->poll_grow,
@@ -166,14 +172,15 @@ static void iothread_set_aio_context_params(IOThread *iothread, Error **errp)
}
aio_context_set_aio_params(iothread->ctx,
- iothread->aio_max_batch,
+ iothread->parent_obj.aio_max_batch,
errp);
}
-static void iothread_complete(UserCreatable *obj, Error **errp)
+
+static void iothread_init(EventLoopBase *base, Error **errp)
{
Error *local_error = NULL;
- IOThread *iothread = IOTHREAD(obj);
+ IOThread *iothread = IOTHREAD(base);
char *thread_name;
iothread->stopping = false;
@@ -189,7 +196,7 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
*/
iothread_init_gcontext(iothread);
- iothread_set_aio_context_params(iothread, &local_error);
+ iothread_set_aio_context_params(base, &local_error);
if (local_error) {
error_propagate(errp, local_error);
aio_context_unref(iothread->ctx);
@@ -201,7 +208,7 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
* to inherit.
*/
thread_name = g_strdup_printf("IO %s",
- object_get_canonical_path_component(OBJECT(obj)));
+ object_get_canonical_path_component(OBJECT(base)));
qemu_thread_create(&iothread->thread, thread_name, iothread_run,
iothread, QEMU_THREAD_JOINABLE);
g_free(thread_name);
@@ -226,9 +233,6 @@ static IOThreadParamInfo poll_grow_info = {
static IOThreadParamInfo poll_shrink_info = {
"poll-shrink", offsetof(IOThread, poll_shrink),
};
-static IOThreadParamInfo aio_max_batch_info = {
- "aio-max-batch", offsetof(IOThread, aio_max_batch),
-};
static void iothread_get_param(Object *obj, Visitor *v,
const char *name, IOThreadParamInfo *info, Error **errp)
@@ -288,35 +292,12 @@ static void iothread_set_poll_param(Object *obj, Visitor *v,
}
}
-static void iothread_get_aio_param(Object *obj, Visitor *v,
- const char *name, void *opaque, Error **errp)
-{
- IOThreadParamInfo *info = opaque;
-
- iothread_get_param(obj, v, name, info, errp);
-}
-
-static void iothread_set_aio_param(Object *obj, Visitor *v,
- const char *name, void *opaque, Error **errp)
-{
- IOThread *iothread = IOTHREAD(obj);
- IOThreadParamInfo *info = opaque;
-
- if (!iothread_set_param(obj, v, name, info, errp)) {
- return;
- }
-
- if (iothread->ctx) {
- aio_context_set_aio_params(iothread->ctx,
- iothread->aio_max_batch,
- errp);
- }
-}
-
static void iothread_class_init(ObjectClass *klass, void *class_data)
{
- UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
- ucc->complete = iothread_complete;
+ EventLoopBaseClass *bc = EVENT_LOOP_BASE_CLASS(klass);
+
+ bc->init = iothread_init;
+ bc->update_params = iothread_set_aio_context_params;
object_class_property_add(klass, "poll-max-ns", "int",
iothread_get_poll_param,
@@ -330,23 +311,15 @@ static void iothread_class_init(ObjectClass *klass, void *class_data)
iothread_get_poll_param,
iothread_set_poll_param,
NULL, &poll_shrink_info);
- object_class_property_add(klass, "aio-max-batch", "int",
- iothread_get_aio_param,
- iothread_set_aio_param,
- NULL, &aio_max_batch_info);
}
static const TypeInfo iothread_info = {
.name = TYPE_IOTHREAD,
- .parent = TYPE_OBJECT,
+ .parent = TYPE_EVENT_LOOP_BASE,
.class_init = iothread_class_init,
.instance_size = sizeof(IOThread),
.instance_init = iothread_instance_init,
.instance_finalize = iothread_instance_finalize,
- .interfaces = (InterfaceInfo[]) {
- {TYPE_USER_CREATABLE},
- {}
- },
};
static void iothread_register_types(void)
@@ -383,7 +356,7 @@ static int query_one_iothread(Object *object, void *opaque)
info->poll_max_ns = iothread->poll_max_ns;
info->poll_grow = iothread->poll_grow;
info->poll_shrink = iothread->poll_shrink;
- info->aio_max_batch = iothread->aio_max_batch;
+ info->aio_max_batch = iothread->parent_obj.aio_max_batch;
QAPI_LIST_APPEND(*tail, info);
return 0;
diff --git a/meson.build b/meson.build
index 6f7e430f0f..b9c919a55e 100644
--- a/meson.build
+++ b/meson.build
@@ -2804,6 +2804,7 @@ subdir('qom')
subdir('authz')
subdir('crypto')
subdir('ui')
+subdir('hw')
if enable_modules
@@ -2811,6 +2812,18 @@ if enable_modules
modulecommon = declare_dependency(link_whole: libmodulecommon, compile_args: '-DBUILD_DSO')
endif
+qom_ss = qom_ss.apply(config_host, strict: false)
+libqom = static_library('qom', qom_ss.sources() + genh,
+ dependencies: [qom_ss.dependencies()],
+ name_suffix: 'fa')
+qom = declare_dependency(link_whole: libqom)
+
+event_loop_base = files('event-loop-base.c')
+event_loop_base = static_library('event-loop-base', sources: event_loop_base + genh,
+ build_by_default: true)
+event_loop_base = declare_dependency(link_whole: event_loop_base,
+ dependencies: [qom])
+
stub_ss = stub_ss.apply(config_all, strict: false)
util_ss.add_all(trace_ss)
@@ -2897,7 +2910,6 @@ subdir('monitor')
subdir('net')
subdir('replay')
subdir('semihosting')
-subdir('hw')
subdir('tcg')
subdir('fpu')
subdir('accel')
@@ -3022,13 +3034,6 @@ qemu_syms = custom_target('qemu.syms', output: 'qemu.syms',
capture: true,
command: [undefsym, nm, '@INPUT@'])
-qom_ss = qom_ss.apply(config_host, strict: false)
-libqom = static_library('qom', qom_ss.sources() + genh,
- dependencies: [qom_ss.dependencies()],
- name_suffix: 'fa')
-
-qom = declare_dependency(link_whole: libqom)
-
authz_ss = authz_ss.apply(config_host, strict: false)
libauthz = static_library('authz', authz_ss.sources() + genh,
dependencies: [authz_ss.dependencies()],
@@ -3081,7 +3086,7 @@ libblockdev = static_library('blockdev', blockdev_ss.sources() + genh,
build_by_default: false)
blockdev = declare_dependency(link_whole: [libblockdev],
- dependencies: [block])
+ dependencies: [block, event_loop_base])
qmp_ss = qmp_ss.apply(config_host, strict: false)
libqmp = static_library('qmp', qmp_ss.sources() + genh,
diff --git a/qapi/qom.json b/qapi/qom.json
index eeb5395ff3..a2439533c5 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -499,6 +499,20 @@
'*repeat': 'bool',
'*grab-toggle': 'GrabToggleKeys' } }
+##
+# @EventLoopBaseProperties:
+#
+# Common properties for event loops
+#
+# @aio-max-batch: maximum number of requests in a batch for the AIO engine,
+# 0 means that the engine will use its default.
+# (default: 0)
+#
+# Since: 7.1
+##
+{ 'struct': 'EventLoopBaseProperties',
+ 'data': { '*aio-max-batch': 'int' } }
+
##
# @IothreadProperties:
#
@@ -516,17 +530,15 @@
# algorithm detects it is spending too long polling without
# encountering events. 0 selects a default behaviour (default: 0)
#
-# @aio-max-batch: maximum number of requests in a batch for the AIO engine,
-# 0 means that the engine will use its default
-# (default:0, since 6.1)
+# The @aio-max-batch option is available since 6.1.
#
# Since: 2.0
##
{ 'struct': 'IothreadProperties',
+ 'base': 'EventLoopBaseProperties',
'data': { '*poll-max-ns': 'int',
'*poll-grow': 'int',
- '*poll-shrink': 'int',
- '*aio-max-batch': 'int' } }
+ '*poll-shrink': 'int' } }
##
# @MemoryBackendProperties:
--
2.31.1

View File

@ -0,0 +1,82 @@
From 9bacf8c4104ff3cff2e0e2c2179ec4fda633167f Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Mon, 16 Jan 2023 07:51:08 -0500
Subject: [PATCH 05/11] KVM: keep track of running ioctls
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 247: accel: introduce accelerator blocker API
RH-Bugzilla: 2161188
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [2/3] 357508389e2a0fd996206b406e9e235e50b5f0b6
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2161188
commit a27dd2de68f37ba96fe164a42121daa5f0750afc
Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Fri Nov 11 10:47:57 2022 -0500
KVM: keep track of running ioctls
Using the new accel-blocker API, mark where ioctls are being called
in KVM. Next, we will implement the critical section that will take
care of performing memslots modifications atomically, therefore
preventing any new ioctl from running and allowing the running ones
to finish.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20221111154758.1372674-3-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
accel/kvm/kvm-all.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 8f2a53438f..221aadfda7 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2337,6 +2337,7 @@ static int kvm_init(MachineState *ms)
assert(TARGET_PAGE_SIZE <= qemu_real_host_page_size);
s->sigmask_len = 8;
+ accel_blocker_init();
#ifdef KVM_CAP_SET_GUEST_DEBUG
QTAILQ_INIT(&s->kvm_sw_breakpoints);
@@ -3018,7 +3019,9 @@ int kvm_vm_ioctl(KVMState *s, int type, ...)
va_end(ap);
trace_kvm_vm_ioctl(type, arg);
+ accel_ioctl_begin();
ret = ioctl(s->vmfd, type, arg);
+ accel_ioctl_end();
if (ret == -1) {
ret = -errno;
}
@@ -3036,7 +3039,9 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...)
va_end(ap);
trace_kvm_vcpu_ioctl(cpu->cpu_index, type, arg);
+ accel_cpu_ioctl_begin(cpu);
ret = ioctl(cpu->kvm_fd, type, arg);
+ accel_cpu_ioctl_end(cpu);
if (ret == -1) {
ret = -errno;
}
@@ -3054,7 +3059,9 @@ int kvm_device_ioctl(int fd, int type, ...)
va_end(ap);
trace_kvm_device_ioctl(fd, type, arg);
+ accel_ioctl_begin();
ret = ioctl(fd, type, arg);
+ accel_ioctl_end();
if (ret == -1) {
ret = -errno;
}
--
2.37.3

View File

@ -1,237 +0,0 @@
From 055edf068196622a3e1868c9e4c991d410272a6d Mon Sep 17 00:00:00 2001
From: Andrew Jones <drjones@redhat.com>
Date: Wed, 15 Jun 2022 15:28:27 +0200
Subject: [PATCH 03/18] RHEL-only: AArch64: Drop unsupported CPU types
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Daniel P. Berrangé <berrange@redhat.com>
RH-MergeRequest: 94: i386, aarch64, s390x: deprecate many named CPU models
RH-Commit: [3/6] 21f54c86dc87e5e75a64459b5a385686bc09640c (berrange/centos-src-qemu)
RH-Bugzilla: 2060839
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2066824
Upstream Status: RHEL only
We only need to support AArch64 cpu types and we only need three
types:
1) A base type to use with TCG, i.e. a cpu type with only base
features. 'cortex-a57' serves this role and is currently used
by libguestfs.
2) The 'max' type, which is for both KVM and TCG and is good for
tests that just specify 'max' but run under both. 'max' with
TCG also provides the VM with all the CPU features TCG
supports, which is good for VMs that need features not
provided by the basic cortex-a57.
3) The host type which is used with KVM.
Signed-off-by: Andrew Jones <drjones@redhat.com>
---
hw/arm/virt.c | 4 ++++
target/arm/cpu64.c | 6 ++++++
target/arm/cpu_tcg.c | 12 ++----------
tests/qtest/arm-cpu-features.c | 6 ++++++
4 files changed, 18 insertions(+), 10 deletions(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 95d012d6eb..74119976d3 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -239,12 +239,16 @@ static const int a15irqmap[] = {
};
static const char *valid_cpus[] = {
+#if 0 /* Disabled for Red Hat Enterprise Linux */
ARM_CPU_TYPE_NAME("cortex-a7"),
ARM_CPU_TYPE_NAME("cortex-a15"),
ARM_CPU_TYPE_NAME("cortex-a53"),
+#endif /* disabled for RHEL */
ARM_CPU_TYPE_NAME("cortex-a57"),
+#if 0 /* Disabled for Red Hat Enterprise Linux */
ARM_CPU_TYPE_NAME("cortex-a72"),
ARM_CPU_TYPE_NAME("a64fx"),
+#endif /* disabled for RHEL */
ARM_CPU_TYPE_NAME("host"),
ARM_CPU_TYPE_NAME("max"),
};
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index eb44c05822..e80b831073 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -146,6 +146,7 @@ static void aarch64_a57_initfn(Object *obj)
define_arm_cp_regs(cpu, cortex_a72_a57_a53_cp_reginfo);
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void aarch64_a53_initfn(Object *obj)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -249,6 +250,7 @@ static void aarch64_a72_initfn(Object *obj)
cpu->gic_vprebits = 5;
define_arm_cp_regs(cpu, cortex_a72_a57_a53_cp_reginfo);
}
+#endif /* disabled for RHEL */
void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
{
@@ -923,6 +925,7 @@ static void aarch64_max_initfn(Object *obj)
qdev_property_add_static(DEVICE(obj), &arm_cpu_lpa2_property);
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void aarch64_a64fx_initfn(Object *obj)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -969,12 +972,15 @@ static void aarch64_a64fx_initfn(Object *obj)
/* TODO: Add A64FX specific HPC extension registers */
}
+#endif /* disabled for RHEL */
static const ARMCPUInfo aarch64_cpus[] = {
{ .name = "cortex-a57", .initfn = aarch64_a57_initfn },
+#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "cortex-a53", .initfn = aarch64_a53_initfn },
{ .name = "cortex-a72", .initfn = aarch64_a72_initfn },
{ .name = "a64fx", .initfn = aarch64_a64fx_initfn },
+#endif /* disabled for RHEL */
{ .name = "max", .initfn = aarch64_max_initfn },
#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
{ .name = "host", .initfn = aarch64_host_initfn },
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index 3826fa5122..74727fc92c 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -19,10 +19,10 @@
#include "hw/boards.h"
#endif
+#if 0 /* Disabled for Red Hat Enterprise Linux */
/* CPU models. These are not needed for the AArch64 linux-user build. */
#if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
-#if 0 /* Disabled for Red Hat Enterprise Linux */
#if !defined(CONFIG_USER_ONLY) && defined(CONFIG_TCG)
static bool arm_v7m_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
{
@@ -376,7 +376,6 @@ static void cortex_a9_initfn(Object *obj)
cpu->ccsidr[1] = 0x200fe019; /* 16k L1 icache. */
define_arm_cp_regs(cpu, cortexa9_cp_reginfo);
}
-#endif /* disabled for RHEL */
#ifndef CONFIG_USER_ONLY
static uint64_t a15_l2ctlr_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -402,7 +401,6 @@ static const ARMCPRegInfo cortexa15_cp_reginfo[] = {
REGINFO_SENTINEL
};
-#if 0 /* Disabled for Red Hat Enterprise Linux */
static void cortex_a7_initfn(Object *obj)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -448,7 +446,6 @@ static void cortex_a7_initfn(Object *obj)
cpu->ccsidr[2] = 0x711fe07a; /* 4096K L2 unified cache */
define_arm_cp_regs(cpu, cortexa15_cp_reginfo); /* Same as A15 */
}
-#endif /* disabled for RHEL */
static void cortex_a15_initfn(Object *obj)
{
@@ -492,7 +489,6 @@ static void cortex_a15_initfn(Object *obj)
define_arm_cp_regs(cpu, cortexa15_cp_reginfo);
}
-#if 0 /* Disabled for Red Hat Enterprise Linux */
static void cortex_m0_initfn(Object *obj)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -933,7 +929,6 @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
cc->gdb_core_xml_file = "arm-m-profile.xml";
}
-#endif /* disabled for RHEL */
#ifndef TARGET_AARCH64
/*
@@ -1013,7 +1008,6 @@ static void arm_max_initfn(Object *obj)
#endif /* !TARGET_AARCH64 */
static const ARMCPUInfo arm_tcg_cpus[] = {
-#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "arm926", .initfn = arm926_initfn },
{ .name = "arm946", .initfn = arm946_initfn },
{ .name = "arm1026", .initfn = arm1026_initfn },
@@ -1029,9 +1023,7 @@ static const ARMCPUInfo arm_tcg_cpus[] = {
{ .name = "cortex-a7", .initfn = cortex_a7_initfn },
{ .name = "cortex-a8", .initfn = cortex_a8_initfn },
{ .name = "cortex-a9", .initfn = cortex_a9_initfn },
-#endif /* disabled for RHEL */
{ .name = "cortex-a15", .initfn = cortex_a15_initfn },
-#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "cortex-m0", .initfn = cortex_m0_initfn,
.class_init = arm_v7m_class_init },
{ .name = "cortex-m3", .initfn = cortex_m3_initfn,
@@ -1062,7 +1054,6 @@ static const ARMCPUInfo arm_tcg_cpus[] = {
{ .name = "pxa270-b1", .initfn = pxa270b1_initfn },
{ .name = "pxa270-c0", .initfn = pxa270c0_initfn },
{ .name = "pxa270-c5", .initfn = pxa270c5_initfn },
-#endif /* disabled for RHEL */
#ifndef TARGET_AARCH64
{ .name = "max", .initfn = arm_max_initfn },
#endif
@@ -1090,3 +1081,4 @@ static void arm_tcg_cpu_register_types(void)
type_init(arm_tcg_cpu_register_types)
#endif /* !CONFIG_USER_ONLY || !TARGET_AARCH64 */
+#endif /* disabled for RHEL */
diff --git a/tests/qtest/arm-cpu-features.c b/tests/qtest/arm-cpu-features.c
index f76652143a..fe2a0a070d 100644
--- a/tests/qtest/arm-cpu-features.c
+++ b/tests/qtest/arm-cpu-features.c
@@ -440,8 +440,10 @@ static void test_query_cpu_model_expansion(const void *data)
assert_error(qts, "host", "The CPU type 'host' requires KVM", NULL);
/* Test expected feature presence/absence for some cpu types */
+#if 0 /* Disabled for Red Hat Enterprise Linux */
assert_has_feature_enabled(qts, "cortex-a15", "pmu");
assert_has_not_feature(qts, "cortex-a15", "aarch64");
+#endif /* disabled for RHEL */
/* Enabling and disabling pmu should always work. */
assert_has_feature_enabled(qts, "max", "pmu");
@@ -458,6 +460,7 @@ static void test_query_cpu_model_expansion(const void *data)
assert_has_feature_enabled(qts, "cortex-a57", "pmu");
assert_has_feature_enabled(qts, "cortex-a57", "aarch64");
+#if 0 /* Disabled for Red Hat Enterprise Linux */
assert_has_feature_enabled(qts, "a64fx", "pmu");
assert_has_feature_enabled(qts, "a64fx", "aarch64");
/*
@@ -470,6 +473,7 @@ static void test_query_cpu_model_expansion(const void *data)
"{ 'sve384': true }");
assert_error(qts, "a64fx", "cannot enable sve640",
"{ 'sve640': true }");
+#endif /* disabled for RHEL */
sve_tests_default(qts, "max");
pauth_tests_default(qts, "max");
@@ -505,9 +509,11 @@ static void test_query_cpu_model_expansion_kvm(const void *data)
QDict *resp;
char *error;
+#if 0 /* Disabled for Red Hat Enterprise Linux */
assert_error(qts, "cortex-a15",
"We cannot guarantee the CPU type 'cortex-a15' works "
"with KVM on this host", NULL);
+#endif /* disabled for RHEL */
assert_has_feature_enabled(qts, "host", "aarch64");
--
2.35.3

View File

@ -1,95 +0,0 @@
From d710394f68eb0b6116dd8ac76f619c192e0d5972 Mon Sep 17 00:00:00 2001
From: Andrew Jones <drjones@redhat.com>
Date: Wed, 15 Jun 2022 15:28:27 +0200
Subject: [PATCH 02/18] RHEL-only: tests/avocado: Switch aarch64 tests from a53
to a57
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Daniel P. Berrangé <berrange@redhat.com>
RH-MergeRequest: 94: i386, aarch64, s390x: deprecate many named CPU models
RH-Commit: [2/6] e85ef69b42c411a6997e4da10ba05176368769b3 (berrange/centos-src-qemu)
RH-Bugzilla: 2060839
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2066824
Upstream Status: RHEL only
We plan to remove the cortex-a53 from the supported cpu types. Switch
all avocado tests that use it to the cortex-a57, which will work the
same and we intend to keep. We don't want to try and upstream this
change since the better upstream change would be to switch from the
a53 to 'max', but the upstream tests also need to use later guest
kernels to use 'max' (see qemu upstream commit 0942820408dc
("hw/arm/virt: Disable LPA2 for -machine virt-6.2")
Signed-off-by: Andrew Jones <drjones@redhat.com>
---
tests/avocado/replay_kernel.py | 2 +-
tests/avocado/reverse_debugging.py | 2 +-
tests/avocado/tcg_plugins.py | 6 +++---
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/tests/avocado/replay_kernel.py b/tests/avocado/replay_kernel.py
index 0b2b0dc692..3a7b5f0748 100644
--- a/tests/avocado/replay_kernel.py
+++ b/tests/avocado/replay_kernel.py
@@ -147,7 +147,7 @@ def test_aarch64_virt(self):
"""
:avocado: tags=arch:aarch64
:avocado: tags=machine:virt
- :avocado: tags=cpu:cortex-a53
+ :avocado: tags=cpu:cortex-a57
"""
kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
'/linux/releases/29/Everything/aarch64/os/images/pxeboot'
diff --git a/tests/avocado/reverse_debugging.py b/tests/avocado/reverse_debugging.py
index d2921e70c3..66d185ed42 100644
--- a/tests/avocado/reverse_debugging.py
+++ b/tests/avocado/reverse_debugging.py
@@ -198,7 +198,7 @@ def test_aarch64_virt(self):
"""
:avocado: tags=arch:aarch64
:avocado: tags=machine:virt
- :avocado: tags=cpu:cortex-a53
+ :avocado: tags=cpu:cortex-a57
"""
kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
'/linux/releases/29/Everything/aarch64/os/images/pxeboot'
diff --git a/tests/avocado/tcg_plugins.py b/tests/avocado/tcg_plugins.py
index 642d2e49e3..93b3afd823 100644
--- a/tests/avocado/tcg_plugins.py
+++ b/tests/avocado/tcg_plugins.py
@@ -68,7 +68,7 @@ def test_aarch64_virt_insn(self):
:avocado: tags=accel:tcg
:avocado: tags=arch:aarch64
:avocado: tags=machine:virt
- :avocado: tags=cpu:cortex-a53
+ :avocado: tags=cpu:cortex-a57
"""
kernel_path = self._grab_aarch64_kernel()
kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
@@ -94,7 +94,7 @@ def test_aarch64_virt_insn_icount(self):
:avocado: tags=accel:tcg
:avocado: tags=arch:aarch64
:avocado: tags=machine:virt
- :avocado: tags=cpu:cortex-a53
+ :avocado: tags=cpu:cortex-a57
"""
kernel_path = self._grab_aarch64_kernel()
kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
@@ -120,7 +120,7 @@ def test_aarch64_virt_mem_icount(self):
:avocado: tags=accel:tcg
:avocado: tags=arch:aarch64
:avocado: tags=machine:virt
- :avocado: tags=cpu:cortex-a53
+ :avocado: tags=cpu:cortex-a57
"""
kernel_path = self._grab_aarch64_kernel()
kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
--
2.35.3

View File

@ -1,58 +0,0 @@
From 5ab8613582fd56b847fe75750acb5b7255900b35 Mon Sep 17 00:00:00 2001
From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Thu, 9 Jun 2022 11:55:15 +0200
Subject: [PATCH 15/16] Revert "globally limit the maximum number of CPUs"
RH-Author: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-MergeRequest: 99: Revert "globally limit the maximum number of CPUs"
RH-Commit: [1/1] 13100d4a2209b2190a3654c1f9cf4ebade1e8d24 (vkuznets/qemu-kvm-c9s)
RH-Bugzilla: 2094270
RH-Acked-by: Andrew Jones <drjones@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Thomas Huth <thuth@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094270
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=45871149
Upstream Status: RHEL-only
Tested: with upstream kernel
Downstream QEMU carries a patch that sets the hard limit of possible vCPUs
to the value that the KVM code of the kernel recommends as soft limit.
Upstream KVM code has been changed recently to not use an arbitrary soft
limit anymore, but to cap the value on the amount of available physical
CPUs of the host. This defeats the purpose of the downstream change in
QEMU completely. Drop the downstream-only patch to allow CPU overcommit.
This reverts commit 6669f6fa677d43144f39d6ad59725b7ba622f1c2.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
accel/kvm/kvm-all.c | 12 ------------
1 file changed, 12 deletions(-)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index fdf0e4d429..5f1377ca04 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2430,18 +2430,6 @@ static int kvm_init(MachineState *ms)
soft_vcpus_limit = kvm_recommended_vcpus(s);
hard_vcpus_limit = kvm_max_vcpus(s);
-#ifdef HOST_PPC64
- /*
- * On POWER, the kernel advertises a soft limit based on the
- * number of CPU threads on the host. We want to allow exceeding
- * this for testing purposes, so we don't want to set hard limit
- * to soft limit as on x86.
- */
-#else
- /* RHEL doesn't support nr_vcpus > soft_vcpus_limit */
- hard_vcpus_limit = soft_vcpus_limit;
-#endif
-
while (nc->name) {
if (nc->num > soft_vcpus_limit) {
warn_report("Number of %s cpus requested (%d) exceeds "
--
2.31.1

View File

@ -1,134 +0,0 @@
From 5ea59b17866add54e5ae8c76d3cb472c67e1fa91 Mon Sep 17 00:00:00 2001
From: Thomas Huth <thuth@redhat.com>
Date: Tue, 2 Aug 2022 08:19:49 +0200
Subject: [PATCH 32/32] Revert "migration: Simplify unqueue_page()"
RH-Author: Thomas Huth <thuth@redhat.com>
RH-MergeRequest: 112: Fix postcopy migration on s390x
RH-Commit: [2/2] 3913c9ed3f27f4b66245913da29d0c46db0c6567 (thuth/qemu-kvm-cs9)
RH-Bugzilla: 2099934
RH-Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Peter Xu <peterx@redhat.com>
This reverts commit cfd66f30fb0f735df06ff4220e5000290a43dad3.
The simplification of unqueue_page() introduced a bug that sometimes
breaks migration on s390x hosts.
The problem is not fully understood yet, but since we are already in
the freeze for QEMU 7.1 and we need something working there, let's
revert this patch for the upcoming release. The optimization can be
redone later again in a proper way if necessary.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2099934
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20220802061949.331576-1-thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 777f53c75983dd10756f5dbfc8af50fe11da81c1)
Conflicts:
migration/trace-events
(trivial contextual conflict)
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
migration/ram.c | 37 ++++++++++++++++++++++++++-----------
migration/trace-events | 3 ++-
2 files changed, 28 insertions(+), 12 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index fb6db54642..ee40e4a718 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1548,7 +1548,6 @@ static RAMBlock *unqueue_page(RAMState *rs, ram_addr_t *offset)
{
struct RAMSrcPageRequest *entry;
RAMBlock *block = NULL;
- size_t page_size;
if (!postcopy_has_request(rs)) {
return NULL;
@@ -1565,13 +1564,10 @@ static RAMBlock *unqueue_page(RAMState *rs, ram_addr_t *offset)
entry = QSIMPLEQ_FIRST(&rs->src_page_requests);
block = entry->rb;
*offset = entry->offset;
- page_size = qemu_ram_pagesize(block);
- /* Each page request should only be multiple page size of the ramblock */
- assert((entry->len % page_size) == 0);
- if (entry->len > page_size) {
- entry->len -= page_size;
- entry->offset += page_size;
+ if (entry->len > TARGET_PAGE_SIZE) {
+ entry->len -= TARGET_PAGE_SIZE;
+ entry->offset += TARGET_PAGE_SIZE;
} else {
memory_region_unref(block->mr);
QSIMPLEQ_REMOVE_HEAD(&rs->src_page_requests, next_req);
@@ -1579,9 +1575,6 @@ static RAMBlock *unqueue_page(RAMState *rs, ram_addr_t *offset)
migration_consume_urgent_request();
}
- trace_unqueue_page(block->idstr, *offset,
- test_bit((*offset >> TARGET_PAGE_BITS), block->bmap));
-
return block;
}
@@ -1956,8 +1949,30 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
{
RAMBlock *block;
ram_addr_t offset;
+ bool dirty;
+
+ do {
+ block = unqueue_page(rs, &offset);
+ /*
+ * We're sending this page, and since it's postcopy nothing else
+ * will dirty it, and we must make sure it doesn't get sent again
+ * even if this queue request was received after the background
+ * search already sent it.
+ */
+ if (block) {
+ unsigned long page;
+
+ page = offset >> TARGET_PAGE_BITS;
+ dirty = test_bit(page, block->bmap);
+ if (!dirty) {
+ trace_get_queued_page_not_dirty(block->idstr, (uint64_t)offset,
+ page);
+ } else {
+ trace_get_queued_page(block->idstr, (uint64_t)offset, page);
+ }
+ }
- block = unqueue_page(rs, &offset);
+ } while (block && !dirty);
if (!block) {
/*
diff --git a/migration/trace-events b/migration/trace-events
index 1aec580e92..09d61ed1f4 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -85,6 +85,8 @@ put_qlist_end(const char *field_name, const char *vmsd_name) "%s(%s)"
qemu_file_fclose(void) ""
# ram.c
+get_queued_page(const char *block_name, uint64_t tmp_offset, unsigned long page_abs) "%s/0x%" PRIx64 " page_abs=0x%lx"
+get_queued_page_not_dirty(const char *block_name, uint64_t tmp_offset, unsigned long page_abs) "%s/0x%" PRIx64 " page_abs=0x%lx"
migration_bitmap_sync_start(void) ""
migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64
migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx"
@@ -110,7 +112,6 @@ ram_save_iterate_big_wait(uint64_t milliconds, int iterations) "big wait: %" PRI
ram_load_complete(int ret, uint64_t seq_iter) "exit_code %d seq iteration %" PRIu64
ram_write_tracking_ramblock_start(const char *block_id, size_t page_size, void *addr, size_t length) "%s: page_size: %zu addr: %p length: %zu"
ram_write_tracking_ramblock_stop(const char *block_id, size_t page_size, void *addr, size_t length) "%s: page_size: %zu addr: %p length: %zu"
-unqueue_page(char *block, uint64_t offset, bool dirty) "ramblock '%s' offset 0x%"PRIx64" dirty %d"
# multifd.c
multifd_new_send_channel_async(uint8_t id) "channel %u"
--
2.31.1

View File

@ -1,51 +0,0 @@
From 733acef2caea0758edd74fb634b095ce09bf5914 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Mon, 9 May 2022 03:46:23 -0400
Subject: [PATCH 15/16] Revert "virtio-scsi: Reject scsi-cd if data plane
enabled [RHEL only]"
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 91: Revert "virtio-scsi: Reject scsi-cd if data plane enabled [RHEL only]"
RH-Commit: [1/1] 1af55d792bc9166e5c86272afe8093c76ab41bb4 (eesposit/qemu-kvm)
RH-Bugzilla: 1995710
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
This reverts commit 4e17b1126e.
Over time AioContext usage and coverage has increased, and now block
backend is capable of handling AioContext change upon eject and insert.
Therefore the above downstream-only commit is not necessary anymore,
and can be safely reverted.
X-downstream-only: true
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
hw/scsi/virtio-scsi.c | 9 ---------
1 file changed, 9 deletions(-)
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 2450c9438c..db54d104be 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -937,15 +937,6 @@ static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
AioContext *old_context;
int ret;
- /* XXX: Remove this check once block backend is capable of handling
- * AioContext change upon eject/insert.
- * s->ctx is NULL if ioeventfd is off, s->ctx is qemu_get_aio_context() if
- * data plane is not used, both cases are safe for scsi-cd. */
- if (s->ctx && s->ctx != qemu_get_aio_context() &&
- object_dynamic_cast(OBJECT(dev), "scsi-cd")) {
- error_setg(errp, "scsi-cd is not supported by data plane");
- return;
- }
if (s->ctx && !s->dataplane_fenced) {
if (blk_op_is_blocked(sd->conf.blk, BLOCK_OP_TYPE_DATAPLANE, errp)) {
return;
--
2.31.1

View File

@ -0,0 +1,349 @@
From a5e7bb1f7a88efb5574266a76e80fd7604d19921 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Mon, 16 Jan 2023 07:49:59 -0500
Subject: [PATCH 04/11] accel: introduce accelerator blocker API
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 247: accel: introduce accelerator blocker API
RH-Bugzilla: 2161188
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/3] 9d3d7f9554974a79042c915763288cce07aef135
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2161188
commit bd688fc93120fb3e28aa70e3dfdf567ccc1e0bc1
Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Fri Nov 11 10:47:56 2022 -0500
accel: introduce accelerator blocker API
This API allows the accelerators to prevent vcpus from issuing
new ioctls while execting a critical section marked with the
accel_ioctl_inhibit_begin/end functions.
Note that all functions submitting ioctls must mark where the
ioctl is being called with accel_{cpu_}ioctl_begin/end().
This API requires the caller to always hold the BQL.
API documentation is in sysemu/accel-blocker.h
Internally, it uses a QemuLockCnt together with a per-CPU QemuLockCnt
(to minimize cache line bouncing) to keep avoid that new ioctls
run when the critical section starts, and a QemuEvent to wait
that all running ioctls finish.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221111154758.1372674-2-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Conflicts:
util/meson.build: files are missing in rhel 8.8.0
namely int128.c, memalign.c and interval-tree.c
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
accel/accel-blocker.c | 154 +++++++++++++++++++++++++++++++++
accel/meson.build | 2 +-
hw/core/cpu-common.c | 2 +
include/hw/core/cpu.h | 3 +
include/sysemu/accel-blocker.h | 56 ++++++++++++
util/meson.build | 2 +-
6 files changed, 217 insertions(+), 2 deletions(-)
create mode 100644 accel/accel-blocker.c
create mode 100644 include/sysemu/accel-blocker.h
diff --git a/accel/accel-blocker.c b/accel/accel-blocker.c
new file mode 100644
index 0000000000..1e7f423462
--- /dev/null
+++ b/accel/accel-blocker.c
@@ -0,0 +1,154 @@
+/*
+ * Lock to inhibit accelerator ioctls
+ *
+ * Copyright (c) 2022 Red Hat Inc.
+ *
+ * Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/thread.h"
+#include "qemu/main-loop.h"
+#include "hw/core/cpu.h"
+#include "sysemu/accel-blocker.h"
+
+static QemuLockCnt accel_in_ioctl_lock;
+static QemuEvent accel_in_ioctl_event;
+
+void accel_blocker_init(void)
+{
+ qemu_lockcnt_init(&accel_in_ioctl_lock);
+ qemu_event_init(&accel_in_ioctl_event, false);
+}
+
+void accel_ioctl_begin(void)
+{
+ if (likely(qemu_mutex_iothread_locked())) {
+ return;
+ }
+
+ /* block if lock is taken in kvm_ioctl_inhibit_begin() */
+ qemu_lockcnt_inc(&accel_in_ioctl_lock);
+}
+
+void accel_ioctl_end(void)
+{
+ if (likely(qemu_mutex_iothread_locked())) {
+ return;
+ }
+
+ qemu_lockcnt_dec(&accel_in_ioctl_lock);
+ /* change event to SET. If event was BUSY, wake up all waiters */
+ qemu_event_set(&accel_in_ioctl_event);
+}
+
+void accel_cpu_ioctl_begin(CPUState *cpu)
+{
+ if (unlikely(qemu_mutex_iothread_locked())) {
+ return;
+ }
+
+ /* block if lock is taken in kvm_ioctl_inhibit_begin() */
+ qemu_lockcnt_inc(&cpu->in_ioctl_lock);
+}
+
+void accel_cpu_ioctl_end(CPUState *cpu)
+{
+ if (unlikely(qemu_mutex_iothread_locked())) {
+ return;
+ }
+
+ qemu_lockcnt_dec(&cpu->in_ioctl_lock);
+ /* change event to SET. If event was BUSY, wake up all waiters */
+ qemu_event_set(&accel_in_ioctl_event);
+}
+
+static bool accel_has_to_wait(void)
+{
+ CPUState *cpu;
+ bool needs_to_wait = false;
+
+ CPU_FOREACH(cpu) {
+ if (qemu_lockcnt_count(&cpu->in_ioctl_lock)) {
+ /* exit the ioctl, if vcpu is running it */
+ qemu_cpu_kick(cpu);
+ needs_to_wait = true;
+ }
+ }
+
+ return needs_to_wait || qemu_lockcnt_count(&accel_in_ioctl_lock);
+}
+
+void accel_ioctl_inhibit_begin(void)
+{
+ CPUState *cpu;
+
+ /*
+ * We allow to inhibit only when holding the BQL, so we can identify
+ * when an inhibitor wants to issue an ioctl easily.
+ */
+ g_assert(qemu_mutex_iothread_locked());
+
+ /* Block further invocations of the ioctls outside the BQL. */
+ CPU_FOREACH(cpu) {
+ qemu_lockcnt_lock(&cpu->in_ioctl_lock);
+ }
+ qemu_lockcnt_lock(&accel_in_ioctl_lock);
+
+ /* Keep waiting until there are running ioctls */
+ while (true) {
+
+ /* Reset event to FREE. */
+ qemu_event_reset(&accel_in_ioctl_event);
+
+ if (accel_has_to_wait()) {
+ /*
+ * If event is still FREE, and there are ioctls still in progress,
+ * wait.
+ *
+ * If an ioctl finishes before qemu_event_wait(), it will change
+ * the event state to SET. This will prevent qemu_event_wait() from
+ * blocking, but it's not a problem because if other ioctls are
+ * still running the loop will iterate once more and reset the event
+ * status to FREE so that it can wait properly.
+ *
+ * If an ioctls finishes while qemu_event_wait() is blocking, then
+ * it will be waken up, but also here the while loop makes sure
+ * to re-enter the wait if there are other running ioctls.
+ */
+ qemu_event_wait(&accel_in_ioctl_event);
+ } else {
+ /* No ioctl is running */
+ return;
+ }
+ }
+}
+
+void accel_ioctl_inhibit_end(void)
+{
+ CPUState *cpu;
+
+ qemu_lockcnt_unlock(&accel_in_ioctl_lock);
+ CPU_FOREACH(cpu) {
+ qemu_lockcnt_unlock(&cpu->in_ioctl_lock);
+ }
+}
+
diff --git a/accel/meson.build b/accel/meson.build
index dfd808d2c8..801b4d44e8 100644
--- a/accel/meson.build
+++ b/accel/meson.build
@@ -1,4 +1,4 @@
-specific_ss.add(files('accel-common.c'))
+specific_ss.add(files('accel-common.c', 'accel-blocker.c'))
softmmu_ss.add(files('accel-softmmu.c'))
user_ss.add(files('accel-user.c'))
diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index 9e3241b430..b6e83acf0a 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -238,6 +238,7 @@ static void cpu_common_initfn(Object *obj)
cpu->nr_threads = 1;
qemu_mutex_init(&cpu->work_mutex);
+ qemu_lockcnt_init(&cpu->in_ioctl_lock);
QSIMPLEQ_INIT(&cpu->work_list);
QTAILQ_INIT(&cpu->breakpoints);
QTAILQ_INIT(&cpu->watchpoints);
@@ -249,6 +250,7 @@ static void cpu_common_finalize(Object *obj)
{
CPUState *cpu = CPU(obj);
+ qemu_lockcnt_destroy(&cpu->in_ioctl_lock);
qemu_mutex_destroy(&cpu->work_mutex);
}
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index e948e81f1a..49d9c73f97 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -383,6 +383,9 @@ struct CPUState {
uint32_t kvm_fetch_index;
uint64_t dirty_pages;
+ /* Use by accel-block: CPU is executing an ioctl() */
+ QemuLockCnt in_ioctl_lock;
+
/* Used for events with 'vcpu' and *without* the 'disabled' properties */
DECLARE_BITMAP(trace_dstate_delayed, CPU_TRACE_DSTATE_MAX_EVENTS);
DECLARE_BITMAP(trace_dstate, CPU_TRACE_DSTATE_MAX_EVENTS);
diff --git a/include/sysemu/accel-blocker.h b/include/sysemu/accel-blocker.h
new file mode 100644
index 0000000000..72020529ef
--- /dev/null
+++ b/include/sysemu/accel-blocker.h
@@ -0,0 +1,56 @@
+/*
+ * Accelerator blocking API, to prevent new ioctls from starting and wait the
+ * running ones finish.
+ * This mechanism differs from pause/resume_all_vcpus() in that it does not
+ * release the BQL.
+ *
+ * Copyright (c) 2022 Red Hat Inc.
+ *
+ * Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#ifndef ACCEL_BLOCKER_H
+#define ACCEL_BLOCKER_H
+
+#include "qemu/osdep.h"
+#include "sysemu/cpus.h"
+
+extern void accel_blocker_init(void);
+
+/*
+ * accel_{cpu_}ioctl_begin/end:
+ * Mark when ioctl is about to run or just finished.
+ *
+ * accel_{cpu_}ioctl_begin will block after accel_ioctl_inhibit_begin() is
+ * called, preventing new ioctls to run. They will continue only after
+ * accel_ioctl_inibith_end().
+ */
+extern void accel_ioctl_begin(void);
+extern void accel_ioctl_end(void);
+extern void accel_cpu_ioctl_begin(CPUState *cpu);
+extern void accel_cpu_ioctl_end(CPUState *cpu);
+
+/*
+ * accel_ioctl_inhibit_begin: start critical section
+ *
+ * This function makes sure that:
+ * 1) incoming accel_{cpu_}ioctl_begin() calls block
+ * 2) wait that all ioctls that were already running reach
+ * accel_{cpu_}ioctl_end(), kicking vcpus if necessary.
+ *
+ * This allows the caller to access shared data or perform operations without
+ * worrying of concurrent vcpus accesses.
+ */
+extern void accel_ioctl_inhibit_begin(void);
+
+/*
+ * accel_ioctl_inhibit_end: end critical section started by
+ * accel_ioctl_inhibit_begin()
+ *
+ * This function allows blocked accel_{cpu_}ioctl_begin() to continue.
+ */
+extern void accel_ioctl_inhibit_end(void);
+
+#endif /* ACCEL_BLOCKER_H */
diff --git a/util/meson.build b/util/meson.build
index 05b593055a..b5f153b0e8 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -48,6 +48,7 @@ util_ss.add(files('transactions.c'))
util_ss.add(when: 'CONFIG_POSIX', if_true: files('drm.c'))
util_ss.add(files('guest-random.c'))
util_ss.add(files('yank.c'))
+util_ss.add(files('lockcnt.c'))
if have_user
util_ss.add(files('selfmap.c'))
@@ -69,7 +70,6 @@ if have_block
util_ss.add(files('hexdump.c'))
util_ss.add(files('iova-tree.c'))
util_ss.add(files('iov.c', 'qemu-sockets.c', 'uri.c'))
- util_ss.add(files('lockcnt.c'))
util_ss.add(files('main-loop.c'))
util_ss.add(files('nvdimm-utils.c'))
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
--
2.37.3

View File

@ -0,0 +1,50 @@
From 953c5c0982b61b0a3f8f03452844b5487eb22fc7 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:13:17 -0500
Subject: [PATCH 06/13] aio-wait: switch to smp_mb__after_rmw()
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [6/10] 9f30f97754139ffd18d36b2350f9ed4e59ac496e
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit b532526a07ef3b903ead2e055fe6cc87b41057a3
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Fri Mar 3 11:03:52 2023 +0100
aio-wait: switch to smp_mb__after_rmw()
The barrier comes after an atomic increment, so it is enough to use
smp_mb__after_rmw(); this avoids a double barrier on x86 systems.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
include/block/aio-wait.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h
index 54840f8622..03b6394c78 100644
--- a/include/block/aio-wait.h
+++ b/include/block/aio-wait.h
@@ -82,7 +82,7 @@ extern AioWait global_aio_wait;
/* Increment wait_->num_waiters before evaluating cond. */ \
qatomic_inc(&wait_->num_waiters); \
/* Paired with smp_mb in aio_wait_kick(). */ \
- smp_mb(); \
+ smp_mb__after_rmw(); \
if (ctx_ && in_aio_context_home_thread(ctx_)) { \
while ((cond)) { \
aio_poll(ctx_, true); \
--
2.37.3

View File

@ -0,0 +1,86 @@
From d7eae0ff4c7f7f7bf10f10272adf7c6971c0db9b Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 09:26:35 -0500
Subject: [PATCH 01/13] aio_wait_kick: add missing memory barrier
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [1/10] eb774aee79864052e14e706d931e52e7bd1162c8
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 7455ff1aa01564cc175db5b2373e610503ad4411
Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Tue May 24 13:30:54 2022 -0400
aio_wait_kick: add missing memory barrier
It seems that aio_wait_kick always required a memory barrier
or atomic operation in the caller, but nobody actually
took care of doing it.
Let's put the barrier in the function instead, and pair it
with another one in AIO_WAIT_WHILE. Read aio_wait_kick()
comment for further explanation.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220524173054.12651-1-eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
include/block/aio-wait.h | 2 ++
util/aio-wait.c | 16 +++++++++++++++-
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h
index b39eefb38d..54840f8622 100644
--- a/include/block/aio-wait.h
+++ b/include/block/aio-wait.h
@@ -81,6 +81,8 @@ extern AioWait global_aio_wait;
AioContext *ctx_ = (ctx); \
/* Increment wait_->num_waiters before evaluating cond. */ \
qatomic_inc(&wait_->num_waiters); \
+ /* Paired with smp_mb in aio_wait_kick(). */ \
+ smp_mb(); \
if (ctx_ && in_aio_context_home_thread(ctx_)) { \
while ((cond)) { \
aio_poll(ctx_, true); \
diff --git a/util/aio-wait.c b/util/aio-wait.c
index bdb3d3af22..98c5accd29 100644
--- a/util/aio-wait.c
+++ b/util/aio-wait.c
@@ -35,7 +35,21 @@ static void dummy_bh_cb(void *opaque)
void aio_wait_kick(void)
{
- /* The barrier (or an atomic op) is in the caller. */
+ /*
+ * Paired with smp_mb in AIO_WAIT_WHILE. Here we have:
+ * write(condition);
+ * aio_wait_kick() {
+ * smp_mb();
+ * read(num_waiters);
+ * }
+ *
+ * And in AIO_WAIT_WHILE:
+ * write(num_waiters);
+ * smp_mb();
+ * read(condition);
+ */
+ smp_mb();
+
if (qatomic_read(&global_aio_wait.num_waiters)) {
aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
}
--
2.37.3

View File

@ -0,0 +1,66 @@
From 187eb7a418af93375e42298d06e231e2bec3cf00 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:15:42 -0500
Subject: [PATCH 10/13] async: clarify usage of barriers in the polling case
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [10/10] 3be07ccc6137a0336becfe63a818d9cbadb38e9c
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 6229438cca037d42f44a96d38feb15cb102a444f
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Mon Mar 6 10:43:52 2023 +0100
async: clarify usage of barriers in the polling case
Explain that aio_context_notifier_poll() relies on
aio_notify_accept() to catch all the memory writes that were
done before ctx->notified was set to true.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
util/async.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/util/async.c b/util/async.c
index 795fe699b6..2a63bf90f2 100644
--- a/util/async.c
+++ b/util/async.c
@@ -463,8 +463,9 @@ void aio_notify_accept(AioContext *ctx)
qatomic_set(&ctx->notified, false);
/*
- * Write ctx->notified before reading e.g. bh->flags. Pairs with smp_wmb
- * in aio_notify.
+ * Order reads of ctx->notified (in aio_context_notifier_poll()) and the
+ * above clearing of ctx->notified before reads of e.g. bh->flags. Pairs
+ * with smp_wmb() in aio_notify.
*/
smp_mb();
}
@@ -487,6 +488,11 @@ static bool aio_context_notifier_poll(void *opaque)
EventNotifier *e = opaque;
AioContext *ctx = container_of(e, AioContext, notifier);
+ /*
+ * No need for load-acquire because we just want to kick the
+ * event loop. aio_notify_accept() takes care of synchronizing
+ * the event loop with the producers.
+ */
return qatomic_read(&ctx->notified);
}
--
2.37.3

View File

@ -0,0 +1,111 @@
From ea3856bb545d19499602830cdc3076d83a981e7a Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:15:36 -0500
Subject: [PATCH 09/13] async: update documentation of the memory barriers
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [9/10] d471da2acf7a107cf75f3327c5e8d7456307160e
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 8dd48650b43dfde4ebea34191ac267e474bcc29e
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Mon Mar 6 10:15:06 2023 +0100
async: update documentation of the memory barriers
Ever since commit 8c6b0356b539 ("util/async: make bh_aio_poll() O(1)",
2020-02-22), synchronization between qemu_bh_schedule() and aio_bh_poll()
is happening when the bottom half is enqueued in the bh_list; not
when the flags are set. Update the documentation to match.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
util/async.c | 33 +++++++++++++++++++--------------
1 file changed, 19 insertions(+), 14 deletions(-)
diff --git a/util/async.c b/util/async.c
index 6f6717a34b..795fe699b6 100644
--- a/util/async.c
+++ b/util/async.c
@@ -71,14 +71,21 @@ static void aio_bh_enqueue(QEMUBH *bh, unsigned new_flags)
unsigned old_flags;
/*
- * The memory barrier implicit in qatomic_fetch_or makes sure that:
- * 1. idle & any writes needed by the callback are done before the
- * locations are read in the aio_bh_poll.
- * 2. ctx is loaded before the callback has a chance to execute and bh
- * could be freed.
+ * Synchronizes with atomic_fetch_and() in aio_bh_dequeue(), ensuring that
+ * insertion starts after BH_PENDING is set.
*/
old_flags = qatomic_fetch_or(&bh->flags, BH_PENDING | new_flags);
+
if (!(old_flags & BH_PENDING)) {
+ /*
+ * At this point the bottom half becomes visible to aio_bh_poll().
+ * This insertion thus synchronizes with QSLIST_MOVE_ATOMIC in
+ * aio_bh_poll(), ensuring that:
+ * 1. any writes needed by the callback are visible from the callback
+ * after aio_bh_dequeue() returns bh.
+ * 2. ctx is loaded before the callback has a chance to execute and bh
+ * could be freed.
+ */
QSLIST_INSERT_HEAD_ATOMIC(&ctx->bh_list, bh, next);
}
@@ -97,11 +104,8 @@ static QEMUBH *aio_bh_dequeue(BHList *head, unsigned *flags)
QSLIST_REMOVE_HEAD(head, next);
/*
- * The qatomic_and is paired with aio_bh_enqueue(). The implicit memory
- * barrier ensures that the callback sees all writes done by the scheduling
- * thread. It also ensures that the scheduling thread sees the cleared
- * flag before bh->cb has run, and thus will call aio_notify again if
- * necessary.
+ * Synchronizes with qatomic_fetch_or() in aio_bh_enqueue(), ensuring that
+ * the removal finishes before BH_PENDING is reset.
*/
*flags = qatomic_fetch_and(&bh->flags,
~(BH_PENDING | BH_SCHEDULED | BH_IDLE));
@@ -148,6 +152,7 @@ int aio_bh_poll(AioContext *ctx)
BHListSlice *s;
int ret = 0;
+ /* Synchronizes with QSLIST_INSERT_HEAD_ATOMIC in aio_bh_enqueue(). */
QSLIST_MOVE_ATOMIC(&slice.bh_list, &ctx->bh_list);
QSIMPLEQ_INSERT_TAIL(&ctx->bh_slice_list, &slice, next);
@@ -437,15 +442,15 @@ LuringState *aio_get_linux_io_uring(AioContext *ctx)
void aio_notify(AioContext *ctx)
{
/*
- * Write e.g. bh->flags before writing ctx->notified. Pairs with smp_mb in
- * aio_notify_accept.
+ * Write e.g. ctx->bh_list before writing ctx->notified. Pairs with
+ * smp_mb() in aio_notify_accept().
*/
smp_wmb();
qatomic_set(&ctx->notified, true);
/*
- * Write ctx->notified before reading ctx->notify_me. Pairs
- * with smp_mb in aio_ctx_prepare or aio_poll.
+ * Write ctx->notified (and also ctx->bh_list) before reading ctx->notify_me.
+ * Pairs with smp_mb() in aio_ctx_prepare or aio_poll.
*/
smp_mb();
if (qatomic_read(&ctx->notify_me)) {
--
2.37.3

View File

@ -0,0 +1,153 @@
From 192f956f2b0761f270070555f8feb1f0544e5558 Mon Sep 17 00:00:00 2001
From: Hanna Reitz <hreitz@redhat.com>
Date: Wed, 9 Nov 2022 17:54:48 +0100
Subject: [PATCH 01/11] block/mirror: Do not wait for active writes
RH-Author: Hanna Czenczek <hreitz@redhat.com>
RH-MergeRequest: 246: block/mirror: Make active mirror progress even under full load
RH-Bugzilla: 2125119
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [1/3] 652d1e55b954f13eaec2c86f58735d4942837e16
Waiting for all active writes to settle before daring to create a
background copying operation means that we will never do background
operations while the guest does anything (in write-blocking mode), and
therefore cannot converge. Yes, we also will not diverge, but actually
converging would be even nicer.
It is unclear why we did decide to wait for all active writes to settle
before creating a background operation, but it just does not seem
necessary. Active writes will put themselves into the in_flight bitmap
and thus properly block actually conflicting background requests.
It is important for active requests to wait on overlapping background
requests, which we do in active_write_prepare(). However, so far it was
not documented why it is important. Add such documentation now, and
also to the other call of mirror_wait_on_conflicts(), so that it becomes
more clear why and when requests need to actively wait for other
requests to settle.
Another thing to note is that of course we need to ensure that there are
no active requests when the job completes, but that is done by virtue of
the BDS being drained anyway, so there cannot be any active requests at
that point.
With this change, we will need to explicitly keep track of how many
bytes are in flight in active requests so that
job_progress_set_remaining() in mirror_run() can set the correct number
of remaining bytes.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2123297
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221109165452.67927-2-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d69a879bdf1aed586478eaa161ee064fe1b92f1a)
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
block/mirror.c | 37 ++++++++++++++++++++++++++++++-------
1 file changed, 30 insertions(+), 7 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index efec2c7674..282f428cb7 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -81,6 +81,7 @@ typedef struct MirrorBlockJob {
int max_iov;
bool initial_zeroing_ongoing;
int in_active_write_counter;
+ int64_t active_write_bytes_in_flight;
bool prepared;
bool in_drain;
} MirrorBlockJob;
@@ -493,6 +494,13 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
}
bdrv_dirty_bitmap_unlock(s->dirty_bitmap);
+ /*
+ * Wait for concurrent requests to @offset. The next loop will limit the
+ * copied area based on in_flight_bitmap so we only copy an area that does
+ * not overlap with concurrent in-flight requests. Still, we would like to
+ * copy something, so wait until there are at least no more requests to the
+ * very beginning of the area.
+ */
mirror_wait_on_conflicts(NULL, s, offset, 1);
job_pause_point(&s->common.job);
@@ -993,12 +1001,6 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
int64_t cnt, delta;
bool should_complete;
- /* Do not start passive operations while there are active
- * writes in progress */
- while (s->in_active_write_counter) {
- mirror_wait_for_any_operation(s, true);
- }
-
if (s->ret < 0) {
ret = s->ret;
goto immediate_exit;
@@ -1015,7 +1017,9 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
/* cnt is the number of dirty bytes remaining and s->bytes_in_flight is
* the number of bytes currently being processed; together those are
* the current remaining operation length */
- job_progress_set_remaining(&s->common.job, s->bytes_in_flight + cnt);
+ job_progress_set_remaining(&s->common.job,
+ s->bytes_in_flight + cnt +
+ s->active_write_bytes_in_flight);
/* Note that even when no rate limit is applied we need to yield
* periodically with no pending I/O so that bdrv_drain_all() returns.
@@ -1073,6 +1077,10 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
s->in_drain = true;
bdrv_drained_begin(bs);
+
+ /* Must be zero because we are drained */
+ assert(s->in_active_write_counter == 0);
+
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
if (cnt > 0 || mirror_flush(s) < 0) {
bdrv_drained_end(bs);
@@ -1306,6 +1314,7 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
}
job_progress_increase_remaining(&job->common.job, bytes);
+ job->active_write_bytes_in_flight += bytes;
switch (method) {
case MIRROR_METHOD_COPY:
@@ -1327,6 +1336,7 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
abort();
}
+ job->active_write_bytes_in_flight -= bytes;
if (ret >= 0) {
job_progress_update(&job->common.job, bytes);
} else {
@@ -1375,6 +1385,19 @@ static MirrorOp *coroutine_fn active_write_prepare(MirrorBlockJob *s,
s->in_active_write_counter++;
+ /*
+ * Wait for concurrent requests affecting the area. If there are already
+ * running requests that are copying off now-to-be stale data in the area,
+ * we must wait for them to finish before we begin writing fresh data to the
+ * target so that the write operations appear in the correct order.
+ * Note that background requests (see mirror_iteration()) in contrast only
+ * wait for conflicting requests at the start of the dirty area, and then
+ * (based on the in_flight_bitmap) truncate the area to copy so it will not
+ * conflict with any requests beyond that. For active writes, however, we
+ * cannot truncate that area. The request from our parent must be blocked
+ * until the area is copied in full. Therefore, we must wait for the whole
+ * area to become free of concurrent requests.
+ */
mirror_wait_on_conflicts(op, s, offset, bytes);
bitmap_set(s->in_flight_bitmap, start_chunk, end_chunk - start_chunk);
--
2.37.3

View File

@ -0,0 +1,76 @@
From 57c79ed20cb73aa9aa4dd7487379b85ea3f936f6 Mon Sep 17 00:00:00 2001
From: Hanna Reitz <hreitz@redhat.com>
Date: Wed, 9 Nov 2022 17:54:49 +0100
Subject: [PATCH 02/11] block/mirror: Drop mirror_wait_for_any_operation()
RH-Author: Hanna Czenczek <hreitz@redhat.com>
RH-MergeRequest: 246: block/mirror: Make active mirror progress even under full load
RH-Bugzilla: 2125119
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [2/3] dec37883bcc491441ae08d9592d1ec26a47765c0
mirror_wait_for_free_in_flight_slot() is the only remaining user of
mirror_wait_for_any_operation(), so inline the latter into the former.
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221109165452.67927-3-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit eb994912993077f178ccb43b20e422ecf9ae4ac7)
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
block/mirror.c | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 282f428cb7..6b02555ad7 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -304,19 +304,21 @@ static int mirror_cow_align(MirrorBlockJob *s, int64_t *offset,
}
static inline void coroutine_fn
-mirror_wait_for_any_operation(MirrorBlockJob *s, bool active)
+mirror_wait_for_free_in_flight_slot(MirrorBlockJob *s)
{
MirrorOp *op;
QTAILQ_FOREACH(op, &s->ops_in_flight, next) {
- /* Do not wait on pseudo ops, because it may in turn wait on
+ /*
+ * Do not wait on pseudo ops, because it may in turn wait on
* some other operation to start, which may in fact be the
* caller of this function. Since there is only one pseudo op
* at any given time, we will always find some real operation
- * to wait on. */
- if (!op->is_pseudo_op && op->is_in_flight &&
- op->is_active_write == active)
- {
+ * to wait on.
+ * Also, do not wait on active operations, because they do not
+ * use up in-flight slots.
+ */
+ if (!op->is_pseudo_op && op->is_in_flight && !op->is_active_write) {
qemu_co_queue_wait(&op->waiting_requests, NULL);
return;
}
@@ -324,13 +326,6 @@ mirror_wait_for_any_operation(MirrorBlockJob *s, bool active)
abort();
}
-static inline void coroutine_fn
-mirror_wait_for_free_in_flight_slot(MirrorBlockJob *s)
-{
- /* Only non-active operations use up in-flight slots */
- mirror_wait_for_any_operation(s, false);
-}
-
/* Perform a mirror copy operation.
*
* *op->bytes_handled is set to the number of bytes copied after and
--
2.37.3

View File

@ -0,0 +1,75 @@
From b1f5aa5a342a25dc558ee9d435fed0643fe5155f Mon Sep 17 00:00:00 2001
From: Hanna Reitz <hreitz@redhat.com>
Date: Wed, 9 Nov 2022 17:54:50 +0100
Subject: [PATCH 03/11] block/mirror: Fix NULL s->job in active writes
RH-Author: Hanna Czenczek <hreitz@redhat.com>
RH-MergeRequest: 246: block/mirror: Make active mirror progress even under full load
RH-Bugzilla: 2125119
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [3/3] 49d7ebd15667151a6e14228a8260cfdd0aa27a78
There is a small gap in mirror_start_job() before putting the mirror
filter node into the block graph (bdrv_append() call) and the actual job
being created. Before the job is created, MirrorBDSOpaque.job is NULL.
It is possible that requests come in when bdrv_drained_end() is called,
and those requests would see MirrorBDSOpaque.job == NULL. Have our
filter node handle that case gracefully.
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221109165452.67927-4-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit da93d5c84e56e6b4e84aa8e98b6b984c9b6bb528)
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
block/mirror.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 6b02555ad7..50289fca49 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1438,11 +1438,13 @@ static int coroutine_fn bdrv_mirror_top_do_write(BlockDriverState *bs,
MirrorOp *op = NULL;
MirrorBDSOpaque *s = bs->opaque;
int ret = 0;
- bool copy_to_target;
+ bool copy_to_target = false;
- copy_to_target = s->job->ret >= 0 &&
- !job_is_cancelled(&s->job->common.job) &&
- s->job->copy_mode == MIRROR_COPY_MODE_WRITE_BLOCKING;
+ if (s->job) {
+ copy_to_target = s->job->ret >= 0 &&
+ !job_is_cancelled(&s->job->common.job) &&
+ s->job->copy_mode == MIRROR_COPY_MODE_WRITE_BLOCKING;
+ }
if (copy_to_target) {
op = active_write_prepare(s->job, offset, bytes);
@@ -1487,11 +1489,13 @@ static int coroutine_fn bdrv_mirror_top_pwritev(BlockDriverState *bs,
QEMUIOVector bounce_qiov;
void *bounce_buf;
int ret = 0;
- bool copy_to_target;
+ bool copy_to_target = false;
- copy_to_target = s->job->ret >= 0 &&
- !job_is_cancelled(&s->job->common.job) &&
- s->job->copy_mode == MIRROR_COPY_MODE_WRITE_BLOCKING;
+ if (s->job) {
+ copy_to_target = s->job->ret >= 0 &&
+ !job_is_cancelled(&s->job->common.job) &&
+ s->job->copy_mode == MIRROR_COPY_MODE_WRITE_BLOCKING;
+ }
if (copy_to_target) {
/* The guest might concurrently modify the data to write; but
--
2.37.3

View File

@ -1,41 +0,0 @@
From 3a0e9bb88e82cc76ca5efc0595ce94b5dc34749e Mon Sep 17 00:00:00 2001
From: Gavin Shan <gshan@redhat.com>
Date: Mon, 25 Apr 2022 13:42:46 +0800
Subject: [PATCH 1/2] configs/devices/aarch64-softmmu: Enable CONFIG_VIRTIO_MEM
RH-Author: Gavin Shan <gshan@redhat.com>
RH-MergeRequest: 80: Enable virtio-mem for aarch64
RH-Commit: [1/1] 1afbd08da6d7c860da8d617a0a932d3660514878 (gwshan/qemu-rhel-9)
RH-Bugzilla: 2044162
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2044162
This enables virtio-mem device on aarch64 since all needed commits
are ready.
b1b87327a9 hw/arm/virt: Support for virtio-mem-pci
1263615efe virtio-mem: Correct default THP size for ARM64
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
configs/devices/aarch64-softmmu/aarch64-rh-devices.mak | 1 +
1 file changed, 1 insertion(+)
diff --git a/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak b/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
index 5f6ee1de5b..187938573f 100644
--- a/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
+++ b/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
@@ -22,6 +22,7 @@ CONFIG_VFIO=y
CONFIG_VFIO_PCI=y
CONFIG_VIRTIO_MMIO=y
CONFIG_VIRTIO_PCI=y
+CONFIG_VIRTIO_MEM=y
CONFIG_XIO3130=y
CONFIG_NVDIMM=y
CONFIG_ACPI_APEI=y
--
2.35.1

View File

@ -1,101 +0,0 @@
From e3cb8849862a9f0dd20f2913d540336a037d43c7 Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Tue, 10 May 2022 17:10:19 +0200
Subject: [PATCH 07/16] coroutine: Rename qemu_coroutine_inc/dec_pool_size()
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 87: coroutine: Fix crashes due to too large pool batch size
RH-Commit: [1/2] 6389b11f70225f221784c270d9b90c1ea43ca8fb (kmwolf/centos-qemu-kvm)
RH-Bugzilla: 2079938
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
It's true that these functions currently affect the batch size in which
coroutines are reused (i.e. moved from the global release pool to the
allocation pool of a specific thread), but this is a bug and will be
fixed in a separate patch.
In fact, the comment in the header file already just promises that it
influences the pool size, so reflect this in the name of the functions.
As a nice side effect, the shorter function name makes some line
wrapping unnecessary.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220510151020.105528-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 98e3ab35054b946f7c2aba5408822532b0920b53)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
hw/block/virtio-blk.c | 6 ++----
include/qemu/coroutine.h | 6 +++---
util/qemu-coroutine.c | 4 ++--
3 files changed, 7 insertions(+), 9 deletions(-)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 540c38f829..6a1cc41877 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1215,8 +1215,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
for (i = 0; i < conf->num_queues; i++) {
virtio_add_queue(vdev, conf->queue_size, virtio_blk_handle_output);
}
- qemu_coroutine_increase_pool_batch_size(conf->num_queues * conf->queue_size
- / 2);
+ qemu_coroutine_inc_pool_size(conf->num_queues * conf->queue_size / 2);
virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err);
if (err != NULL) {
error_propagate(errp, err);
@@ -1253,8 +1252,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev)
for (i = 0; i < conf->num_queues; i++) {
virtio_del_queue(vdev, i);
}
- qemu_coroutine_decrease_pool_batch_size(conf->num_queues * conf->queue_size
- / 2);
+ qemu_coroutine_dec_pool_size(conf->num_queues * conf->queue_size / 2);
qemu_del_vm_change_state_handler(s->change);
blockdev_mark_auto_del(s->blk);
virtio_cleanup(vdev);
diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index c828a95ee0..5b621d1295 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -334,12 +334,12 @@ void coroutine_fn yield_until_fd_readable(int fd);
/**
* Increase coroutine pool size
*/
-void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size);
+void qemu_coroutine_inc_pool_size(unsigned int additional_pool_size);
/**
- * Devcrease coroutine pool size
+ * Decrease coroutine pool size
*/
-void qemu_coroutine_decrease_pool_batch_size(unsigned int additional_pool_size);
+void qemu_coroutine_dec_pool_size(unsigned int additional_pool_size);
#include "qemu/lockable.h"
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index c03b2422ff..faca0ca97c 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -205,12 +205,12 @@ AioContext *coroutine_fn qemu_coroutine_get_aio_context(Coroutine *co)
return co->ctx;
}
-void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size)
+void qemu_coroutine_inc_pool_size(unsigned int additional_pool_size)
{
qatomic_add(&pool_batch_size, additional_pool_size);
}
-void qemu_coroutine_decrease_pool_batch_size(unsigned int removing_pool_size)
+void qemu_coroutine_dec_pool_size(unsigned int removing_pool_size)
{
qatomic_sub(&pool_batch_size, removing_pool_size);
}
--
2.31.1

View File

@ -1,138 +0,0 @@
From 345107bfd5537b51f34aaeb97d6161858bb6feee Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Tue, 10 May 2022 17:10:20 +0200
Subject: [PATCH 08/16] coroutine: Revert to constant batch size
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 87: coroutine: Fix crashes due to too large pool batch size
RH-Commit: [2/2] 8a8a39af873854cdc8333d1a70f3479a97c3ec7a (kmwolf/centos-qemu-kvm)
RH-Bugzilla: 2079938
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Commit 4c41c69e changed the way the coroutine pool is sized because for
virtio-blk devices with a large queue size and heavy I/O, it was just
too small and caused coroutines to be deleted and reallocated soon
afterwards. The change made the size dynamic based on the number of
queues and the queue size of virtio-blk devices.
There are two important numbers here: Slightly simplified, when a
coroutine terminates, it is generally stored in the global release pool
up to a certain pool size, and if the pool is full, it is freed.
Conversely, when allocating a new coroutine, the coroutines in the
release pool are reused if the pool already has reached a certain
minimum size (the batch size), otherwise we allocate new coroutines.
The problem after commit 4c41c69e is that it not only increases the
maximum pool size (which is the intended effect), but also the batch
size for reusing coroutines (which is a bug). It means that in cases
with many devices and/or a large queue size (which defaults to the
number of vcpus for virtio-blk-pci), many thousand coroutines could be
sitting in the release pool without being reused.
This is not only a waste of memory and allocations, but it actually
makes the QEMU process likely to hit the vm.max_map_count limit on Linux
because each coroutine requires two mappings (its stack and the guard
page for the stack), causing it to abort() in qemu_alloc_stack() because
when the limit is hit, mprotect() starts to fail with ENOMEM.
In order to fix the problem, change the batch size back to 64 to avoid
uselessly accumulating coroutines in the release pool, but keep the
dynamic maximum pool size so that coroutines aren't freed too early
in heavy I/O scenarios.
Note that this fix doesn't strictly make it impossible to hit the limit,
but this would only happen if most of the coroutines are actually in use
at the same time, not just sitting in a pool. This is the same behaviour
as we already had before commit 4c41c69e. Fully preventing this would
require allowing qemu_coroutine_create() to return an error, but it
doesn't seem to be a scenario that people hit in practice.
Cc: qemu-stable@nongnu.org
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2079938
Fixes: 4c41c69e05fe28c0f95f8abd2ebf407e95a4f04b
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220510151020.105528-3-kwolf@redhat.com>
Tested-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9ec7a59b5aad4b736871c378d30f5ef5ec51cb52)
Conflicts:
util/qemu-coroutine.c
Trivial merge conflict because we don't have commit ac387a08 downstream.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
util/qemu-coroutine.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index faca0ca97c..804f672e0a 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -20,14 +20,20 @@
#include "qemu/coroutine_int.h"
#include "block/aio.h"
-/** Initial batch size is 64, and is increased on demand */
+/**
+ * The minimal batch size is always 64, coroutines from the release_pool are
+ * reused as soon as there are 64 coroutines in it. The maximum pool size starts
+ * with 64 and is increased on demand so that coroutines are not deleted even if
+ * they are not immediately reused.
+ */
enum {
- POOL_INITIAL_BATCH_SIZE = 64,
+ POOL_MIN_BATCH_SIZE = 64,
+ POOL_INITIAL_MAX_SIZE = 64,
};
/** Free list to speed up creation */
static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool);
-static unsigned int pool_batch_size = POOL_INITIAL_BATCH_SIZE;
+static unsigned int pool_max_size = POOL_INITIAL_MAX_SIZE;
static unsigned int release_pool_size;
static __thread QSLIST_HEAD(, Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool);
static __thread unsigned int alloc_pool_size;
@@ -51,7 +57,7 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque)
if (CONFIG_COROUTINE_POOL) {
co = QSLIST_FIRST(&alloc_pool);
if (!co) {
- if (release_pool_size > qatomic_read(&pool_batch_size)) {
+ if (release_pool_size > POOL_MIN_BATCH_SIZE) {
/* Slow path; a good place to register the destructor, too. */
if (!coroutine_pool_cleanup_notifier.notify) {
coroutine_pool_cleanup_notifier.notify = coroutine_pool_cleanup;
@@ -88,12 +94,12 @@ static void coroutine_delete(Coroutine *co)
co->caller = NULL;
if (CONFIG_COROUTINE_POOL) {
- if (release_pool_size < qatomic_read(&pool_batch_size) * 2) {
+ if (release_pool_size < qatomic_read(&pool_max_size) * 2) {
QSLIST_INSERT_HEAD_ATOMIC(&release_pool, co, pool_next);
qatomic_inc(&release_pool_size);
return;
}
- if (alloc_pool_size < qatomic_read(&pool_batch_size)) {
+ if (alloc_pool_size < qatomic_read(&pool_max_size)) {
QSLIST_INSERT_HEAD(&alloc_pool, co, pool_next);
alloc_pool_size++;
return;
@@ -207,10 +213,10 @@ AioContext *coroutine_fn qemu_coroutine_get_aio_context(Coroutine *co)
void qemu_coroutine_inc_pool_size(unsigned int additional_pool_size)
{
- qatomic_add(&pool_batch_size, additional_pool_size);
+ qatomic_add(&pool_max_size, additional_pool_size);
}
void qemu_coroutine_dec_pool_size(unsigned int removing_pool_size)
{
- qatomic_sub(&pool_batch_size, removing_pool_size);
+ qatomic_sub(&pool_max_size, removing_pool_size);
}
--
2.31.1

View File

@ -1,132 +0,0 @@
From ffbd90e5f4eba620c7cd631b04f0ed31beb22ffa Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Tue, 17 May 2022 12:07:56 +0100
Subject: [PATCH 1/6] coroutine-ucontext: use QEMU_DEFINE_STATIC_CO_TLS()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Stefan Hajnoczi <stefanha@redhat.com>
RH-MergeRequest: 89: coroutine: use coroutine TLS macros to protect thread-local variables
RH-Commit: [1/3] a9782fe8e919c4bd317b7e8744c7ff57d898add3 (stefanha/centos-stream-qemu-kvm)
RH-Bugzilla: 1952483
RH-Acked-by: Hanna Reitz <hreitz@redhat.com>
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.
Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220307153853.602859-2-stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 34145a307d849d0b6734d0222a7aa0bb9eef7407)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
util/coroutine-ucontext.c | 38 ++++++++++++++++++++++++--------------
1 file changed, 24 insertions(+), 14 deletions(-)
diff --git a/util/coroutine-ucontext.c b/util/coroutine-ucontext.c
index 904b375192..127d5a13c8 100644
--- a/util/coroutine-ucontext.c
+++ b/util/coroutine-ucontext.c
@@ -25,6 +25,7 @@
#include "qemu/osdep.h"
#include <ucontext.h>
#include "qemu/coroutine_int.h"
+#include "qemu/coroutine-tls.h"
#ifdef CONFIG_VALGRIND_H
#include <valgrind/valgrind.h>
@@ -66,8 +67,8 @@ typedef struct {
/**
* Per-thread coroutine bookkeeping
*/
-static __thread CoroutineUContext leader;
-static __thread Coroutine *current;
+QEMU_DEFINE_STATIC_CO_TLS(Coroutine *, current);
+QEMU_DEFINE_STATIC_CO_TLS(CoroutineUContext, leader);
/*
* va_args to makecontext() must be type 'int', so passing
@@ -97,14 +98,15 @@ static inline __attribute__((always_inline))
void finish_switch_fiber(void *fake_stack_save)
{
#ifdef CONFIG_ASAN
+ CoroutineUContext *leaderp = get_ptr_leader();
const void *bottom_old;
size_t size_old;
__sanitizer_finish_switch_fiber(fake_stack_save, &bottom_old, &size_old);
- if (!leader.stack) {
- leader.stack = (void *)bottom_old;
- leader.stack_size = size_old;
+ if (!leaderp->stack) {
+ leaderp->stack = (void *)bottom_old;
+ leaderp->stack_size = size_old;
}
#endif
#ifdef CONFIG_TSAN
@@ -161,8 +163,10 @@ static void coroutine_trampoline(int i0, int i1)
/* Initialize longjmp environment and switch back the caller */
if (!sigsetjmp(self->env, 0)) {
- start_switch_fiber_asan(COROUTINE_YIELD, &fake_stack_save, leader.stack,
- leader.stack_size);
+ CoroutineUContext *leaderp = get_ptr_leader();
+
+ start_switch_fiber_asan(COROUTINE_YIELD, &fake_stack_save,
+ leaderp->stack, leaderp->stack_size);
start_switch_fiber_tsan(&fake_stack_save, self, true); /* true=caller */
siglongjmp(*(sigjmp_buf *)co->entry_arg, 1);
}
@@ -297,7 +301,7 @@ qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
int ret;
void *fake_stack_save = NULL;
- current = to_;
+ set_current(to_);
ret = sigsetjmp(from->env, 0);
if (ret == 0) {
@@ -315,18 +319,24 @@ qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
Coroutine *qemu_coroutine_self(void)
{
- if (!current) {
- current = &leader.base;
+ Coroutine *self = get_current();
+ CoroutineUContext *leaderp = get_ptr_leader();
+
+ if (!self) {
+ self = &leaderp->base;
+ set_current(self);
}
#ifdef CONFIG_TSAN
- if (!leader.tsan_co_fiber) {
- leader.tsan_co_fiber = __tsan_get_current_fiber();
+ if (!leaderp->tsan_co_fiber) {
+ leaderp->tsan_co_fiber = __tsan_get_current_fiber();
}
#endif
- return current;
+ return self;
}
bool qemu_in_coroutine(void)
{
- return current && current->caller;
+ Coroutine *self = get_current();
+
+ return self && self->caller;
}
--
2.31.1

View File

@ -1,139 +0,0 @@
From 9c2e55d25fec6ffb21e344513b7dbeed7e21f641 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Tue, 17 May 2022 12:08:04 +0100
Subject: [PATCH 2/6] coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Stefan Hajnoczi <stefanha@redhat.com>
RH-MergeRequest: 89: coroutine: use coroutine TLS macros to protect thread-local variables
RH-Commit: [2/3] 68a8847e406e2eace6ddc31b0c5676a60600d606 (stefanha/centos-stream-qemu-kvm)
RH-Bugzilla: 1952483
RH-Acked-by: Hanna Reitz <hreitz@redhat.com>
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.
Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.
The alloc_pool QSLIST needs a typedef so the return value of
get_ptr_alloc_pool() can be stored in a local variable.
One example of why this code is necessary: a coroutine that yields
before calling qemu_coroutine_create() to create another coroutine is
affected by the TLS issue.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220307153853.602859-3-stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit ac387a08a9c9f6b36757da912f0339c25f421f90)
Conflicts:
- Context conflicts due to commit 5411171c3ef4 ("coroutine: Revert to
constant batch size").
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
util/qemu-coroutine.c | 41 ++++++++++++++++++++++++-----------------
1 file changed, 24 insertions(+), 17 deletions(-)
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index 804f672e0a..4a8bd63ef0 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -18,6 +18,7 @@
#include "qemu/atomic.h"
#include "qemu/coroutine.h"
#include "qemu/coroutine_int.h"
+#include "qemu/coroutine-tls.h"
#include "block/aio.h"
/**
@@ -35,17 +36,20 @@ enum {
static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool);
static unsigned int pool_max_size = POOL_INITIAL_MAX_SIZE;
static unsigned int release_pool_size;
-static __thread QSLIST_HEAD(, Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool);
-static __thread unsigned int alloc_pool_size;
-static __thread Notifier coroutine_pool_cleanup_notifier;
+
+typedef QSLIST_HEAD(, Coroutine) CoroutineQSList;
+QEMU_DEFINE_STATIC_CO_TLS(CoroutineQSList, alloc_pool);
+QEMU_DEFINE_STATIC_CO_TLS(unsigned int, alloc_pool_size);
+QEMU_DEFINE_STATIC_CO_TLS(Notifier, coroutine_pool_cleanup_notifier);
static void coroutine_pool_cleanup(Notifier *n, void *value)
{
Coroutine *co;
Coroutine *tmp;
+ CoroutineQSList *alloc_pool = get_ptr_alloc_pool();
- QSLIST_FOREACH_SAFE(co, &alloc_pool, pool_next, tmp) {
- QSLIST_REMOVE_HEAD(&alloc_pool, pool_next);
+ QSLIST_FOREACH_SAFE(co, alloc_pool, pool_next, tmp) {
+ QSLIST_REMOVE_HEAD(alloc_pool, pool_next);
qemu_coroutine_delete(co);
}
}
@@ -55,27 +59,30 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque)
Coroutine *co = NULL;
if (CONFIG_COROUTINE_POOL) {
- co = QSLIST_FIRST(&alloc_pool);
+ CoroutineQSList *alloc_pool = get_ptr_alloc_pool();
+
+ co = QSLIST_FIRST(alloc_pool);
if (!co) {
if (release_pool_size > POOL_MIN_BATCH_SIZE) {
/* Slow path; a good place to register the destructor, too. */
- if (!coroutine_pool_cleanup_notifier.notify) {
- coroutine_pool_cleanup_notifier.notify = coroutine_pool_cleanup;
- qemu_thread_atexit_add(&coroutine_pool_cleanup_notifier);
+ Notifier *notifier = get_ptr_coroutine_pool_cleanup_notifier();
+ if (!notifier->notify) {
+ notifier->notify = coroutine_pool_cleanup;
+ qemu_thread_atexit_add(notifier);
}
/* This is not exact; there could be a little skew between
* release_pool_size and the actual size of release_pool. But
* it is just a heuristic, it does not need to be perfect.
*/
- alloc_pool_size = qatomic_xchg(&release_pool_size, 0);
- QSLIST_MOVE_ATOMIC(&alloc_pool, &release_pool);
- co = QSLIST_FIRST(&alloc_pool);
+ set_alloc_pool_size(qatomic_xchg(&release_pool_size, 0));
+ QSLIST_MOVE_ATOMIC(alloc_pool, &release_pool);
+ co = QSLIST_FIRST(alloc_pool);
}
}
if (co) {
- QSLIST_REMOVE_HEAD(&alloc_pool, pool_next);
- alloc_pool_size--;
+ QSLIST_REMOVE_HEAD(alloc_pool, pool_next);
+ set_alloc_pool_size(get_alloc_pool_size() - 1);
}
}
@@ -99,9 +106,9 @@ static void coroutine_delete(Coroutine *co)
qatomic_inc(&release_pool_size);
return;
}
- if (alloc_pool_size < qatomic_read(&pool_max_size)) {
- QSLIST_INSERT_HEAD(&alloc_pool, co, pool_next);
- alloc_pool_size++;
+ if (get_alloc_pool_size() < qatomic_read(&pool_max_size)) {
+ QSLIST_INSERT_HEAD(get_ptr_alloc_pool(), co, pool_next);
+ set_alloc_pool_size(get_alloc_pool_size() + 1);
return;
}
}
--
2.31.1

View File

@ -1,99 +0,0 @@
From 336581e6e9ace3f1ddd24ad0a258db9785f9b0ed Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Tue, 17 May 2022 12:08:12 +0100
Subject: [PATCH 3/6] coroutine-win32: use QEMU_DEFINE_STATIC_CO_TLS()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Stefan Hajnoczi <stefanha@redhat.com>
RH-MergeRequest: 89: coroutine: use coroutine TLS macros to protect thread-local variables
RH-Commit: [3/3] 55b35dfdae1bc7d6f614ac9f81a92f5c6431f713 (stefanha/centos-stream-qemu-kvm)
RH-Bugzilla: 1952483
RH-Acked-by: Hanna Reitz <hreitz@redhat.com>
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.
Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.
I think coroutine-win32.c could get away with __thread because the
variables are only used in situations where either the stale value is
correct (current) or outside coroutine context (loading leader when
current is NULL). Due to the difficulty of being sure that this is
really safe in all scenarios it seems worth converting it anyway.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220307153853.602859-4-stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c1fe694357a328c807ae3cc6961c19e923448fcc)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
util/coroutine-win32.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/util/coroutine-win32.c b/util/coroutine-win32.c
index de6bd4fd3e..c02a62c896 100644
--- a/util/coroutine-win32.c
+++ b/util/coroutine-win32.c
@@ -25,6 +25,7 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/coroutine_int.h"
+#include "qemu/coroutine-tls.h"
typedef struct
{
@@ -34,8 +35,8 @@ typedef struct
CoroutineAction action;
} CoroutineWin32;
-static __thread CoroutineWin32 leader;
-static __thread Coroutine *current;
+QEMU_DEFINE_STATIC_CO_TLS(CoroutineWin32, leader);
+QEMU_DEFINE_STATIC_CO_TLS(Coroutine *, current);
/* This function is marked noinline to prevent GCC from inlining it
* into coroutine_trampoline(). If we allow it to do that then it
@@ -52,7 +53,7 @@ qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_);
CoroutineWin32 *to = DO_UPCAST(CoroutineWin32, base, to_);
- current = to_;
+ set_current(to_);
to->action = action;
SwitchToFiber(to->fiber);
@@ -89,14 +90,21 @@ void qemu_coroutine_delete(Coroutine *co_)
Coroutine *qemu_coroutine_self(void)
{
+ Coroutine *current = get_current();
+
if (!current) {
- current = &leader.base;
- leader.fiber = ConvertThreadToFiber(NULL);
+ CoroutineWin32 *leader = get_ptr_leader();
+
+ current = &leader->base;
+ set_current(current);
+ leader->fiber = ConvertThreadToFiber(NULL);
}
return current;
}
bool qemu_in_coroutine(void)
{
+ Coroutine *current = get_current();
+
return current && current->caller;
}
--
2.31.1

View File

@ -0,0 +1,127 @@
From 103608465b8bd2edf7f9aaef5c3c93309ccf9ec2 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Tue, 21 Feb 2023 16:22:17 -0500
Subject: [PATCH 12/13] dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel()
race
RH-Author: Stefan Hajnoczi <stefanha@redhat.com>
RH-MergeRequest: 264: scsi: protect req->aiocb with AioContext lock
RH-Bugzilla: 2090990
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [2/3] 14f5835093ba8c5111f3ada2fe87730371aca733
dma_blk_cb() only takes the AioContext lock around ->io_func(). That
means the rest of dma_blk_cb() is not protected. In particular, the
DMAAIOCB field accesses happen outside the lock.
There is a race when the main loop thread holds the AioContext lock and
invokes scsi_device_purge_requests() -> bdrv_aio_cancel() ->
dma_aio_cancel() while an IOThread executes dma_blk_cb(). The dbs->acb
field determines how cancellation proceeds. If dma_aio_cancel() sees
dbs->acb == NULL while dma_blk_cb() is still running, the request can be
completed twice (-ECANCELED and the actual return value).
The following assertion can occur with virtio-scsi when an IOThread is
used:
../hw/scsi/scsi-disk.c:368: scsi_dma_complete: Assertion `r->req.aiocb != NULL' failed.
Fix the race by holding the AioContext across dma_blk_cb(). Now
dma_aio_cancel() under the AioContext lock will not see
inconsistent/intermediate states.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230221212218.1378734-3-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit abfcd2760b3e70727bbc0792221b8b98a733dc32)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
hw/scsi/scsi-disk.c | 4 +---
softmmu/dma-helpers.c | 12 +++++++-----
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 179ce22c4a..c8109a673e 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -351,13 +351,12 @@ done:
scsi_req_unref(&r->req);
}
+/* Called with AioContext lock held */
static void scsi_dma_complete(void *opaque, int ret)
{
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
-
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
@@ -367,7 +366,6 @@ static void scsi_dma_complete(void *opaque, int ret)
block_acct_done(blk_get_stats(s->qdev.conf.blk), &r->acct);
}
scsi_dma_complete_noio(r, ret);
- aio_context_release(blk_get_aio_context(s->qdev.conf.blk));
}
static void scsi_read_complete_noio(SCSIDiskReq *r, int ret)
diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
index 7d766a5e89..42af18719a 100644
--- a/softmmu/dma-helpers.c
+++ b/softmmu/dma-helpers.c
@@ -127,17 +127,19 @@ static void dma_complete(DMAAIOCB *dbs, int ret)
static void dma_blk_cb(void *opaque, int ret)
{
DMAAIOCB *dbs = (DMAAIOCB *)opaque;
+ AioContext *ctx = dbs->ctx;
dma_addr_t cur_addr, cur_len;
void *mem;
trace_dma_blk_cb(dbs, ret);
+ aio_context_acquire(ctx);
dbs->acb = NULL;
dbs->offset += dbs->iov.size;
if (dbs->sg_cur_index == dbs->sg->nsg || ret < 0) {
dma_complete(dbs, ret);
- return;
+ goto out;
}
dma_blk_unmap(dbs);
@@ -177,9 +179,9 @@ static void dma_blk_cb(void *opaque, int ret)
if (dbs->iov.size == 0) {
trace_dma_map_wait(dbs);
- dbs->bh = aio_bh_new(dbs->ctx, reschedule_dma, dbs);
+ dbs->bh = aio_bh_new(ctx, reschedule_dma, dbs);
cpu_register_map_client(dbs->bh);
- return;
+ goto out;
}
if (!QEMU_IS_ALIGNED(dbs->iov.size, dbs->align)) {
@@ -187,11 +189,11 @@ static void dma_blk_cb(void *opaque, int ret)
QEMU_ALIGN_DOWN(dbs->iov.size, dbs->align));
}
- aio_context_acquire(dbs->ctx);
dbs->acb = dbs->io_func(dbs->offset, &dbs->iov,
dma_blk_cb, dbs, dbs->io_func_opaque);
- aio_context_release(dbs->ctx);
assert(dbs->acb);
+out:
+ aio_context_release(ctx);
}
static void dma_aio_cancel(BlockAIOCB *acb)
--
2.37.3

View File

@ -0,0 +1,61 @@
From 7693449b235bbab6d32a1b87fa1d0e101c786f3b Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:11:14 -0500
Subject: [PATCH 05/13] edu: add smp_mb__after_rmw()
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [5/10] 300901290e08b253b1278eedc39cd07c1e202b96
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 2482aeea4195ad84cf3d4e5b15b28ec5b420ed5a
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu Mar 2 11:16:13 2023 +0100
edu: add smp_mb__after_rmw()
Ensure ordering between clearing the COMPUTING flag and checking
IRQFACT, and between setting the IRQFACT flag and checking
COMPUTING. This ensures that no wakeups are lost.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
hw/misc/edu.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/hw/misc/edu.c b/hw/misc/edu.c
index e935c418d4..a1f8bc77e7 100644
--- a/hw/misc/edu.c
+++ b/hw/misc/edu.c
@@ -267,6 +267,8 @@ static void edu_mmio_write(void *opaque, hwaddr addr, uint64_t val,
case 0x20:
if (val & EDU_STATUS_IRQFACT) {
qatomic_or(&edu->status, EDU_STATUS_IRQFACT);
+ /* Order check of the COMPUTING flag after setting IRQFACT. */
+ smp_mb__after_rmw();
} else {
qatomic_and(&edu->status, ~EDU_STATUS_IRQFACT);
}
@@ -349,6 +351,9 @@ static void *edu_fact_thread(void *opaque)
qemu_mutex_unlock(&edu->thr_mutex);
qatomic_and(&edu->status, ~EDU_STATUS_COMPUTING);
+ /* Clear COMPUTING flag before checking IRQFACT. */
+ smp_mb__after_rmw();
+
if (qatomic_read(&edu->status) & EDU_STATUS_IRQFACT) {
qemu_mutex_lock_iothread();
edu_raise_irq(edu, FACT_IRQ);
--
2.37.3

View File

@ -1,179 +0,0 @@
From 8a12049e97149056f61f7748d9869606d282d16e Mon Sep 17 00:00:00 2001
From: Gavin Shan <gshan@redhat.com>
Date: Wed, 11 May 2022 18:01:35 +0800
Subject: [PATCH 06/16] hw/acpi/aml-build: Use existing CPU topology to build
PPTT table
RH-Author: Gavin Shan <gshan@redhat.com>
RH-MergeRequest: 86: hw/arm/virt: Fix the default CPU topology
RH-Commit: [6/6] 53fa376531c204cf706cc1a7a0499019756106cb (gwshan/qemu-rhel-9)
RH-Bugzilla: 2041823
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2041823
When the PPTT table is built, the CPU topology is re-calculated, but
it's unecessary because the CPU topology has been populated in
virt_possible_cpu_arch_ids() on arm/virt machine.
This reworks build_pptt() to avoid by reusing the existing IDs in
ms->possible_cpus. Currently, the only user of build_pptt() is
arm/virt machine.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20220503140304.855514-7-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit ae9141d4a3265553503bf07d3574b40f84615a34)
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
hw/acpi/aml-build.c | 111 +++++++++++++++++++-------------------------
1 file changed, 48 insertions(+), 63 deletions(-)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 4086879ebf..e6bfac95c7 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2002,86 +2002,71 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
const char *oem_id, const char *oem_table_id)
{
MachineClass *mc = MACHINE_GET_CLASS(ms);
- GQueue *list = g_queue_new();
- guint pptt_start = table_data->len;
- guint parent_offset;
- guint length, i;
- int uid = 0;
- int socket;
+ CPUArchIdList *cpus = ms->possible_cpus;
+ int64_t socket_id = -1, cluster_id = -1, core_id = -1;
+ uint32_t socket_offset = 0, cluster_offset = 0, core_offset = 0;
+ uint32_t pptt_start = table_data->len;
+ int n;
AcpiTable table = { .sig = "PPTT", .rev = 2,
.oem_id = oem_id, .oem_table_id = oem_table_id };
acpi_table_begin(&table, table_data);
- for (socket = 0; socket < ms->smp.sockets; socket++) {
- g_queue_push_tail(list,
- GUINT_TO_POINTER(table_data->len - pptt_start));
- build_processor_hierarchy_node(
- table_data,
- /*
- * Physical package - represents the boundary
- * of a physical package
- */
- (1 << 0),
- 0, socket, NULL, 0);
- }
-
- if (mc->smp_props.clusters_supported) {
- length = g_queue_get_length(list);
- for (i = 0; i < length; i++) {
- int cluster;
-
- parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
- for (cluster = 0; cluster < ms->smp.clusters; cluster++) {
- g_queue_push_tail(list,
- GUINT_TO_POINTER(table_data->len - pptt_start));
- build_processor_hierarchy_node(
- table_data,
- (0 << 0), /* not a physical package */
- parent_offset, cluster, NULL, 0);
- }
+ /*
+ * This works with the assumption that cpus[n].props.*_id has been
+ * sorted from top to down levels in mc->possible_cpu_arch_ids().
+ * Otherwise, the unexpected and duplicated containers will be
+ * created.
+ */
+ for (n = 0; n < cpus->len; n++) {
+ if (cpus->cpus[n].props.socket_id != socket_id) {
+ assert(cpus->cpus[n].props.socket_id > socket_id);
+ socket_id = cpus->cpus[n].props.socket_id;
+ cluster_id = -1;
+ core_id = -1;
+ socket_offset = table_data->len - pptt_start;
+ build_processor_hierarchy_node(table_data,
+ (1 << 0), /* Physical package */
+ 0, socket_id, NULL, 0);
}
- }
- length = g_queue_get_length(list);
- for (i = 0; i < length; i++) {
- int core;
-
- parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
- for (core = 0; core < ms->smp.cores; core++) {
- if (ms->smp.threads > 1) {
- g_queue_push_tail(list,
- GUINT_TO_POINTER(table_data->len - pptt_start));
- build_processor_hierarchy_node(
- table_data,
- (0 << 0), /* not a physical package */
- parent_offset, core, NULL, 0);
- } else {
- build_processor_hierarchy_node(
- table_data,
- (1 << 1) | /* ACPI Processor ID valid */
- (1 << 3), /* Node is a Leaf */
- parent_offset, uid++, NULL, 0);
+ if (mc->smp_props.clusters_supported) {
+ if (cpus->cpus[n].props.cluster_id != cluster_id) {
+ assert(cpus->cpus[n].props.cluster_id > cluster_id);
+ cluster_id = cpus->cpus[n].props.cluster_id;
+ core_id = -1;
+ cluster_offset = table_data->len - pptt_start;
+ build_processor_hierarchy_node(table_data,
+ (0 << 0), /* Not a physical package */
+ socket_offset, cluster_id, NULL, 0);
}
+ } else {
+ cluster_offset = socket_offset;
}
- }
- length = g_queue_get_length(list);
- for (i = 0; i < length; i++) {
- int thread;
+ if (ms->smp.threads == 1) {
+ build_processor_hierarchy_node(table_data,
+ (1 << 1) | /* ACPI Processor ID valid */
+ (1 << 3), /* Node is a Leaf */
+ cluster_offset, n, NULL, 0);
+ } else {
+ if (cpus->cpus[n].props.core_id != core_id) {
+ assert(cpus->cpus[n].props.core_id > core_id);
+ core_id = cpus->cpus[n].props.core_id;
+ core_offset = table_data->len - pptt_start;
+ build_processor_hierarchy_node(table_data,
+ (0 << 0), /* Not a physical package */
+ cluster_offset, core_id, NULL, 0);
+ }
- parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
- for (thread = 0; thread < ms->smp.threads; thread++) {
- build_processor_hierarchy_node(
- table_data,
+ build_processor_hierarchy_node(table_data,
(1 << 1) | /* ACPI Processor ID valid */
(1 << 2) | /* Processor is a Thread */
(1 << 3), /* Node is a Leaf */
- parent_offset, uid++, NULL, 0);
+ core_offset, n, NULL, 0);
}
}
- g_queue_free(list);
acpi_table_end(linker, &table);
}
--
2.31.1

View File

@ -1,74 +0,0 @@
From 3b05d3464945295112b5d02d142422f524a52054 Mon Sep 17 00:00:00 2001
From: Gavin Shan <gshan@redhat.com>
Date: Wed, 11 May 2022 18:01:35 +0800
Subject: [PATCH 03/16] hw/arm/virt: Consider SMP configuration in CPU topology
RH-Author: Gavin Shan <gshan@redhat.com>
RH-MergeRequest: 86: hw/arm/virt: Fix the default CPU topology
RH-Commit: [3/6] 7125b41f038c2b1cb33377d0ef1222f1ea42b648 (gwshan/qemu-rhel-9)
RH-Bugzilla: 2041823
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2041823
Currently, the SMP configuration isn't considered when the CPU
topology is populated. In this case, it's impossible to provide
the default CPU-to-NUMA mapping or association based on the socket
ID of the given CPU.
This takes account of SMP configuration when the CPU topology
is populated. The die ID for the given CPU isn't assigned since
it's not supported on arm/virt machine. Besides, the used SMP
configuration in qtest/numa-test/aarch64_numa_cpu() is corrcted
to avoid testing failure
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20220503140304.855514-4-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit c9ec4cb5e4936f980889e717524e73896b0200ed)
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
hw/arm/virt.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 8be12e121d..a87c8d396a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2553,6 +2553,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
int n;
unsigned int max_cpus = ms->smp.max_cpus;
VirtMachineState *vms = VIRT_MACHINE(ms);
+ MachineClass *mc = MACHINE_GET_CLASS(vms);
if (ms->possible_cpus) {
assert(ms->possible_cpus->len == max_cpus);
@@ -2566,8 +2567,20 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
ms->possible_cpus->cpus[n].type = ms->cpu_type;
ms->possible_cpus->cpus[n].arch_id =
virt_cpu_mp_affinity(vms, n);
+
+ assert(!mc->smp_props.dies_supported);
+ ms->possible_cpus->cpus[n].props.has_socket_id = true;
+ ms->possible_cpus->cpus[n].props.socket_id =
+ n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
+ ms->possible_cpus->cpus[n].props.has_cluster_id = true;
+ ms->possible_cpus->cpus[n].props.cluster_id =
+ (n / (ms->smp.cores * ms->smp.threads)) % ms->smp.clusters;
+ ms->possible_cpus->cpus[n].props.has_core_id = true;
+ ms->possible_cpus->cpus[n].props.core_id =
+ (n / ms->smp.threads) % ms->smp.cores;
ms->possible_cpus->cpus[n].props.has_thread_id = true;
- ms->possible_cpus->cpus[n].props.thread_id = n;
+ ms->possible_cpus->cpus[n].props.thread_id =
+ n % ms->smp.threads;
}
return ms->possible_cpus;
}
--
2.31.1

View File

@ -1,88 +0,0 @@
From 14e49ad3b98f01c1ad6fe456469d40a96a43dc3c Mon Sep 17 00:00:00 2001
From: Gavin Shan <gshan@redhat.com>
Date: Wed, 11 May 2022 18:01:35 +0800
Subject: [PATCH 05/16] hw/arm/virt: Fix CPU's default NUMA node ID
RH-Author: Gavin Shan <gshan@redhat.com>
RH-MergeRequest: 86: hw/arm/virt: Fix the default CPU topology
RH-Commit: [5/6] 5336f62bc0c53c0417db1d71ef89544907bc28c0 (gwshan/qemu-rhel-9)
RH-Bugzilla: 2041823
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2041823
When CPU-to-NUMA association isn't explicitly provided by users,
the default one is given by mc->get_default_cpu_node_id(). However,
the CPU topology isn't fully considered in the default association
and this causes CPU topology broken warnings on booting Linux guest.
For example, the following warning messages are observed when the
Linux guest is booted with the following command lines.
/home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
-accel kvm -machine virt,gic-version=host \
-cpu host \
-smp 6,sockets=2,cores=3,threads=1 \
-m 1024M,slots=16,maxmem=64G \
-object memory-backend-ram,id=mem0,size=128M \
-object memory-backend-ram,id=mem1,size=128M \
-object memory-backend-ram,id=mem2,size=128M \
-object memory-backend-ram,id=mem3,size=128M \
-object memory-backend-ram,id=mem4,size=128M \
-object memory-backend-ram,id=mem4,size=384M \
-numa node,nodeid=0,memdev=mem0 \
-numa node,nodeid=1,memdev=mem1 \
-numa node,nodeid=2,memdev=mem2 \
-numa node,nodeid=3,memdev=mem3 \
-numa node,nodeid=4,memdev=mem4 \
-numa node,nodeid=5,memdev=mem5
:
alternatives: patching kernel code
BUG: arch topology borken
the CLS domain not a subset of the MC domain
<the above error log repeats>
BUG: arch topology borken
the DIE domain not a subset of the NODE domain
With current implementation of mc->get_default_cpu_node_id(),
CPU#0 to CPU#5 are associated with NODE#0 to NODE#5 separately.
That's incorrect because CPU#0/1/2 should be associated with same
NUMA node because they're seated in same socket.
This fixes the issue by considering the socket ID when the default
CPU-to-NUMA association is provided in virt_possible_cpu_arch_ids().
With this applied, no more CPU topology broken warnings are seen
from the Linux guest. The 6 CPUs are associated with NODE#0/1, but
there are no CPUs associated with NODE#2/3/4/5.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Message-id: 20220503140304.855514-6-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 4c18bc192386dfbca530e7f550e0992df657818a)
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
hw/arm/virt.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a87c8d396a..95d012d6eb 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2545,7 +2545,9 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
{
- return idx % ms->numa_state->num_nodes;
+ int64_t socket_id = ms->possible_cpus->cpus[idx].props.socket_id;
+
+ return socket_id % ms->numa_state->num_nodes;
}
static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
--
2.31.1

View File

@ -1,56 +0,0 @@
From e25c40735d2f022c07481b548d20476222006657 Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Wed, 4 May 2022 11:11:54 +0200
Subject: [PATCH 2/5] hw/arm/virt: Fix missing initialization in
instance/class_init()
RH-Author: Eric Auger <eric.auger@redhat.com>
RH-MergeRequest: 82: hw/arm/virt: Remove the dtb-kaslr-seed machine option
RH-Commit: [2/2] 22cbbfc30cf57a09b8acfb25d8a4dff2754c630c (eauger1/centos-qemu-kvm)
RH-Bugzilla: 2046029
RH-Acked-by: Gavin Shan <gshan@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2046029
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=45133161
Upstream Status: RHEL-only
Tested: Boot RHEL guest and check migration from 8.6 to 9.1
(with custom additions)
During the 7.0 rebase, the initialization of highmem_mmio and
highmem_redists was forgotten in rhel_virt_instance_init().
Fix it to match virt_instance_init() code.
Also mc->smp_props.clusters_supported was missing in
rhel_machine_class_init().
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/arm/virt.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index bde4f77994..8be12e121d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3286,6 +3286,7 @@ static void rhel_machine_class_init(ObjectClass *oc, void *data)
hc->unplug_request = virt_machine_device_unplug_request_cb;
hc->unplug = virt_machine_device_unplug_cb;
mc->nvdimm_supported = true;
+ mc->smp_props.clusters_supported = true;
mc->auto_enable_numa_with_memhp = true;
mc->auto_enable_numa_with_memdev = true;
mc->default_ram_id = "mach-virt.ram";
@@ -3366,6 +3367,8 @@ static void rhel_virt_instance_init(Object *obj)
vms->gic_version = VIRT_GIC_VERSION_NOSEL;
vms->highmem_ecam = !vmc->no_highmem_ecam;
+ vms->highmem_mmio = true;
+ vms->highmem_redists = true;
if (vmc->no_its) {
vms->its = false;
--
2.31.1

View File

@ -1,76 +0,0 @@
From 69f771c3dc641431f3e98497cbd3832edb69284f Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Tue, 3 May 2022 08:56:52 +0200
Subject: [PATCH 1/5] hw/arm/virt: Remove the dtb-kaslr-seed machine option
RH-Author: Eric Auger <eric.auger@redhat.com>
RH-MergeRequest: 82: hw/arm/virt: Remove the dtb-kaslr-seed machine option
RH-Commit: [1/2] a89dcd7f22e04ae39de99795d3f34cdd0b831bc0 (eauger1/centos-qemu-kvm)
RH-Bugzilla: 2046029
RH-Acked-by: Gavin Shan <gshan@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2046029
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=45133161
Upstream Status: RHEL-only
Tested: Boot RHEL guest and check the option is not available
In RHEL we do not want to expose the dtb-kaslr-seed virt machine
option. Indeed the default 'on' value matches our need as
random data in the DTB does not cause any boot failure and we
want to support KASLR for the guest.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
---
hw/arm/virt.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index e06862d22a..bde4f77994 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2350,6 +2350,7 @@ static void virt_set_its(Object *obj, bool value, Error **errp)
vms->its = value;
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static bool virt_get_dtb_kaslr_seed(Object *obj, Error **errp)
{
VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -2363,6 +2364,7 @@ static void virt_set_dtb_kaslr_seed(Object *obj, bool value, Error **errp)
vms->dtb_kaslr_seed = value;
}
+#endif /* disabled for RHEL */
static char *virt_get_oem_id(Object *obj, Error **errp)
{
@@ -3346,13 +3348,6 @@ static void rhel_machine_class_init(ObjectClass *oc, void *data)
"Override the default value of field OEM Table ID "
"in ACPI table header."
"The string may be up to 8 bytes in size");
-
- object_class_property_add_bool(oc, "dtb-kaslr-seed",
- virt_get_dtb_kaslr_seed,
- virt_set_dtb_kaslr_seed);
- object_class_property_set_description(oc, "dtb-kaslr-seed",
- "Set off to disable passing of kaslr-seed "
- "dtb node to guest");
}
static void rhel_virt_instance_init(Object *obj)
@@ -3397,7 +3392,7 @@ static void rhel_virt_instance_init(Object *obj)
/* MTE is disabled by default and non-configurable for RHEL */
vms->mte = false;
- /* Supply a kaslr-seed by default */
+ /* Supply a kaslr-seed by default and non-configurable for RHEL */
vms->dtb_kaslr_seed = true;
vms->irqmap = a15irqmap;
--
2.31.1

View File

@ -1,95 +0,0 @@
From 4dad0e9abbc843fba4e5fee6e7aa1b0db13f5898 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <eperezma@redhat.com>
Date: Thu, 21 Jul 2022 15:27:35 +0200
Subject: [PATCH 03/32] hw/virtio: Replace g_memdup() by g_memdup2()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eugenio Pérez <eperezma@redhat.com>
RH-MergeRequest: 108: Net Control Virtqueue shadow Support
RH-Commit: [3/27] ae196903eb1a7aebbf999100e997cf82e5024cb6 (eperezmartin/qemu-kvm)
RH-Bugzilla: 1939363
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Cindy Lu <lulu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
Bugzilla: https://bugzilla.redhat.com/1939363
Upstream Status: git://git.qemu.org/qemu.git
commit d792199de55ca5cb5334016884039c740290b5c7
Author: Philippe Mathieu-Daudé <f4bug@amsat.org>
Date: Thu May 12 19:57:46 2022 +0200
hw/virtio: Replace g_memdup() by g_memdup2()
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20220512175747.142058-6-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
hw/net/virtio-net.c | 3 ++-
hw/virtio/virtio-crypto.c | 6 +++---
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 099e65036d..633de61513 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1458,7 +1458,8 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
}
iov_cnt = elem->out_num;
- iov2 = iov = g_memdup(elem->out_sg, sizeof(struct iovec) * elem->out_num);
+ iov2 = iov = g_memdup2(elem->out_sg,
+ sizeof(struct iovec) * elem->out_num);
s = iov_to_buf(iov, iov_cnt, 0, &ctrl, sizeof(ctrl));
iov_discard_front(&iov, &iov_cnt, sizeof(ctrl));
if (s != sizeof(ctrl)) {
diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index dcd80b904d..0e31e3cc04 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -242,7 +242,7 @@ static void virtio_crypto_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
}
out_num = elem->out_num;
- out_iov_copy = g_memdup(elem->out_sg, sizeof(out_iov[0]) * out_num);
+ out_iov_copy = g_memdup2(elem->out_sg, sizeof(out_iov[0]) * out_num);
out_iov = out_iov_copy;
in_num = elem->in_num;
@@ -605,11 +605,11 @@ virtio_crypto_handle_request(VirtIOCryptoReq *request)
}
out_num = elem->out_num;
- out_iov_copy = g_memdup(elem->out_sg, sizeof(out_iov[0]) * out_num);
+ out_iov_copy = g_memdup2(elem->out_sg, sizeof(out_iov[0]) * out_num);
out_iov = out_iov_copy;
in_num = elem->in_num;
- in_iov_copy = g_memdup(elem->in_sg, sizeof(in_iov[0]) * in_num);
+ in_iov_copy = g_memdup2(elem->in_sg, sizeof(in_iov[0]) * in_num);
in_iov = in_iov_copy;
if (unlikely(iov_to_buf(out_iov, out_num, 0, &req, sizeof(req))
--
2.31.1

View File

@ -0,0 +1,367 @@
From 88b5e059462a72ca758d84c0d4d0895a03baac50 Mon Sep 17 00:00:00 2001
From: "manish.mishra" <manish.mishra@nutanix.com>
Date: Tue, 20 Dec 2022 18:44:17 +0000
Subject: [PATCH 1/3] io: Add support for MSG_PEEK for socket channel
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Peter Xu <peterx@redhat.com>
RH-MergeRequest: 258: migration: Fix multifd crash due to channel disorder
RH-Bugzilla: 2137740
RH-Acked-by: quintela1 <quintela@redhat.com>
RH-Acked-by: Leonardo Brás <leobras@redhat.com>
RH-Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
RH-Commit: [1/2] 04fc6fae358599b8509f5355469d2e8720f01903
Conflicts:
io/channel-null.c
migration/channel-block.c
Because these two files do not exist in rhel8.8 tree, dropping the
changes.
MSG_PEEK peeks at the channel, The data is treated as unread and
the next read shall still return this data. This support is
currently added only for socket class. Extra parameter 'flags'
is added to io_readv calls to pass extra read flags like MSG_PEEK.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: manish.mishra <manish.mishra@nutanix.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 84615a19ddf2bfb38d7b3a0d487d2397ee55e4f3)
Signed-off-by: Peter Xu <peterx@redhat.com>
---
chardev/char-socket.c | 4 ++--
include/io/channel.h | 6 ++++++
io/channel-buffer.c | 1 +
io/channel-command.c | 1 +
io/channel-file.c | 1 +
io/channel-socket.c | 19 ++++++++++++++++++-
io/channel-tls.c | 1 +
io/channel-websock.c | 1 +
io/channel.c | 16 ++++++++++++----
migration/rdma.c | 1 +
scsi/qemu-pr-helper.c | 2 +-
tests/qtest/tpm-emu.c | 2 +-
tests/unit/test-io-channel-socket.c | 1 +
util/vhost-user-server.c | 2 +-
14 files changed, 48 insertions(+), 10 deletions(-)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 836cfa0bc2..4cdf79e0c2 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -339,11 +339,11 @@ static ssize_t tcp_chr_recv(Chardev *chr, char *buf, size_t len)
if (qio_channel_has_feature(s->ioc, QIO_CHANNEL_FEATURE_FD_PASS)) {
ret = qio_channel_readv_full(s->ioc, &iov, 1,
&msgfds, &msgfds_num,
- NULL);
+ 0, NULL);
} else {
ret = qio_channel_readv_full(s->ioc, &iov, 1,
NULL, NULL,
- NULL);
+ 0, NULL);
}
if (ret == QIO_CHANNEL_ERR_BLOCK) {
diff --git a/include/io/channel.h b/include/io/channel.h
index c680ee7480..716235d496 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -34,6 +34,8 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass,
#define QIO_CHANNEL_WRITE_FLAG_ZERO_COPY 0x1
+#define QIO_CHANNEL_READ_FLAG_MSG_PEEK 0x1
+
typedef enum QIOChannelFeature QIOChannelFeature;
enum QIOChannelFeature {
@@ -41,6 +43,7 @@ enum QIOChannelFeature {
QIO_CHANNEL_FEATURE_SHUTDOWN,
QIO_CHANNEL_FEATURE_LISTEN,
QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY,
+ QIO_CHANNEL_FEATURE_READ_MSG_PEEK,
};
@@ -114,6 +117,7 @@ struct QIOChannelClass {
size_t niov,
int **fds,
size_t *nfds,
+ int flags,
Error **errp);
int (*io_close)(QIOChannel *ioc,
Error **errp);
@@ -188,6 +192,7 @@ void qio_channel_set_name(QIOChannel *ioc,
* @niov: the length of the @iov array
* @fds: pointer to an array that will received file handles
* @nfds: pointer filled with number of elements in @fds on return
+ * @flags: read flags (QIO_CHANNEL_READ_FLAG_*)
* @errp: pointer to a NULL-initialized error object
*
* Read data from the IO channel, storing it in the
@@ -224,6 +229,7 @@ ssize_t qio_channel_readv_full(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+ int flags,
Error **errp);
diff --git a/io/channel-buffer.c b/io/channel-buffer.c
index bf52011be2..8096180f85 100644
--- a/io/channel-buffer.c
+++ b/io/channel-buffer.c
@@ -54,6 +54,7 @@ static ssize_t qio_channel_buffer_readv(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+ int flags,
Error **errp)
{
QIOChannelBuffer *bioc = QIO_CHANNEL_BUFFER(ioc);
diff --git a/io/channel-command.c b/io/channel-command.c
index 5ff1691bad..2834413b3a 100644
--- a/io/channel-command.c
+++ b/io/channel-command.c
@@ -230,6 +230,7 @@ static ssize_t qio_channel_command_readv(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+ int flags,
Error **errp)
{
QIOChannelCommand *cioc = QIO_CHANNEL_COMMAND(ioc);
diff --git a/io/channel-file.c b/io/channel-file.c
index 348a48545e..490f0e5d84 100644
--- a/io/channel-file.c
+++ b/io/channel-file.c
@@ -86,6 +86,7 @@ static ssize_t qio_channel_file_readv(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+ int flags,
Error **errp)
{
QIOChannelFile *fioc = QIO_CHANNEL_FILE(ioc);
diff --git a/io/channel-socket.c b/io/channel-socket.c
index 6010ad7017..ca8b180b69 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -174,6 +174,9 @@ int qio_channel_socket_connect_sync(QIOChannelSocket *ioc,
}
#endif
+ qio_channel_set_feature(QIO_CHANNEL(ioc),
+ QIO_CHANNEL_FEATURE_READ_MSG_PEEK);
+
return 0;
}
@@ -407,6 +410,9 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
}
#endif /* WIN32 */
+ qio_channel_set_feature(QIO_CHANNEL(cioc),
+ QIO_CHANNEL_FEATURE_READ_MSG_PEEK);
+
trace_qio_channel_socket_accept_complete(ioc, cioc, cioc->fd);
return cioc;
@@ -497,6 +503,7 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+ int flags,
Error **errp)
{
QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
@@ -518,6 +525,10 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc,
}
+ if (flags & QIO_CHANNEL_READ_FLAG_MSG_PEEK) {
+ sflags |= MSG_PEEK;
+ }
+
retry:
ret = recvmsg(sioc->fd, &msg, sflags);
if (ret < 0) {
@@ -625,11 +636,17 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+ int flags,
Error **errp)
{
QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
ssize_t done = 0;
ssize_t i;
+ int sflags = 0;
+
+ if (flags & QIO_CHANNEL_READ_FLAG_MSG_PEEK) {
+ sflags |= MSG_PEEK;
+ }
for (i = 0; i < niov; i++) {
ssize_t ret;
@@ -637,7 +654,7 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc,
ret = recv(sioc->fd,
iov[i].iov_base,
iov[i].iov_len,
- 0);
+ sflags);
if (ret < 0) {
if (errno == EAGAIN) {
if (done) {
diff --git a/io/channel-tls.c b/io/channel-tls.c
index 4ce890a538..c730cb8ec5 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -260,6 +260,7 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+ int flags,
Error **errp)
{
QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc);
diff --git a/io/channel-websock.c b/io/channel-websock.c
index 035dd6075b..13c94f2afe 100644
--- a/io/channel-websock.c
+++ b/io/channel-websock.c
@@ -1081,6 +1081,7 @@ static ssize_t qio_channel_websock_readv(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+ int flags,
Error **errp)
{
QIOChannelWebsock *wioc = QIO_CHANNEL_WEBSOCK(ioc);
diff --git a/io/channel.c b/io/channel.c
index 0640941ac5..a8c7f11649 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -52,6 +52,7 @@ ssize_t qio_channel_readv_full(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+ int flags,
Error **errp)
{
QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
@@ -63,7 +64,14 @@ ssize_t qio_channel_readv_full(QIOChannel *ioc,
return -1;
}
- return klass->io_readv(ioc, iov, niov, fds, nfds, errp);
+ if ((flags & QIO_CHANNEL_READ_FLAG_MSG_PEEK) &&
+ !qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
+ error_setg_errno(errp, EINVAL,
+ "Channel does not support peek read");
+ return -1;
+ }
+
+ return klass->io_readv(ioc, iov, niov, fds, nfds, flags, errp);
}
@@ -146,7 +154,7 @@ int qio_channel_readv_full_all_eof(QIOChannel *ioc,
while ((nlocal_iov > 0) || local_fds) {
ssize_t len;
len = qio_channel_readv_full(ioc, local_iov, nlocal_iov, local_fds,
- local_nfds, errp);
+ local_nfds, 0, errp);
if (len == QIO_CHANNEL_ERR_BLOCK) {
if (qemu_in_coroutine()) {
qio_channel_yield(ioc, G_IO_IN);
@@ -284,7 +292,7 @@ ssize_t qio_channel_readv(QIOChannel *ioc,
size_t niov,
Error **errp)
{
- return qio_channel_readv_full(ioc, iov, niov, NULL, NULL, errp);
+ return qio_channel_readv_full(ioc, iov, niov, NULL, NULL, 0, errp);
}
@@ -303,7 +311,7 @@ ssize_t qio_channel_read(QIOChannel *ioc,
Error **errp)
{
struct iovec iov = { .iov_base = buf, .iov_len = buflen };
- return qio_channel_readv_full(ioc, &iov, 1, NULL, NULL, errp);
+ return qio_channel_readv_full(ioc, &iov, 1, NULL, NULL, 0, errp);
}
diff --git a/migration/rdma.c b/migration/rdma.c
index 54acd2000e..dcf98bd7f8 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2917,6 +2917,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+ int flags,
Error **errp)
{
QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
diff --git a/scsi/qemu-pr-helper.c b/scsi/qemu-pr-helper.c
index f281daeced..12ec8e9368 100644
--- a/scsi/qemu-pr-helper.c
+++ b/scsi/qemu-pr-helper.c
@@ -612,7 +612,7 @@ static int coroutine_fn prh_read(PRHelperClient *client, void *buf, int sz,
iov.iov_base = buf;
iov.iov_len = sz;
n_read = qio_channel_readv_full(QIO_CHANNEL(client->ioc), &iov, 1,
- &fds, &nfds, errp);
+ &fds, &nfds, 0, errp);
if (n_read == QIO_CHANNEL_ERR_BLOCK) {
qio_channel_yield(QIO_CHANNEL(client->ioc), G_IO_IN);
diff --git a/tests/qtest/tpm-emu.c b/tests/qtest/tpm-emu.c
index 2994d1cf42..3cf1acaf7d 100644
--- a/tests/qtest/tpm-emu.c
+++ b/tests/qtest/tpm-emu.c
@@ -106,7 +106,7 @@ void *tpm_emu_ctrl_thread(void *data)
int *pfd = NULL;
size_t nfd = 0;
- qio_channel_readv_full(ioc, &iov, 1, &pfd, &nfd, &error_abort);
+ qio_channel_readv_full(ioc, &iov, 1, &pfd, &nfd, 0, &error_abort);
cmd = be32_to_cpu(cmd);
g_assert_cmpint(cmd, ==, CMD_SET_DATAFD);
g_assert_cmpint(nfd, ==, 1);
diff --git a/tests/unit/test-io-channel-socket.c b/tests/unit/test-io-channel-socket.c
index 6713886d02..de2930f203 100644
--- a/tests/unit/test-io-channel-socket.c
+++ b/tests/unit/test-io-channel-socket.c
@@ -452,6 +452,7 @@ static void test_io_channel_unix_fd_pass(void)
G_N_ELEMENTS(iorecv),
&fdrecv,
&nfdrecv,
+ 0,
&error_abort);
g_assert(nfdrecv == G_N_ELEMENTS(fdsend));
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 783d847a6d..e6a9ef72b7 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -102,7 +102,7 @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
* qio_channel_readv_full may have short reads, keeping calling it
* until getting VHOST_USER_HDR_SIZE or 0 bytes in total
*/
- rc = qio_channel_readv_full(ioc, &iov, 1, &fds, &nfds, &local_err);
+ rc = qio_channel_readv_full(ioc, &iov, 1, &fds, &nfds, 0, &local_err);
if (rc < 0) {
if (rc == QIO_CHANNEL_ERR_BLOCK) {
assert(local_err == NULL);
--
2.37.3

View File

@ -0,0 +1,290 @@
From 93ec857c46911b95ed8e3abc6a9d432ae847c084 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Mon, 16 Jan 2023 07:51:56 -0500
Subject: [PATCH 06/11] kvm: Atomic memslot updates
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 247: accel: introduce accelerator blocker API
RH-Bugzilla: 2161188
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [3/3] 520e41c0f58066a7381a5f6b32b81bc01cce51c0
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2161188
commit f39b7d2b96e3e73c01bb678cd096f7baf0b9ab39
Author: David Hildenbrand <david@redhat.com>
Date: Fri Nov 11 10:47:58 2022 -0500
kvm: Atomic memslot updates
If we update an existing memslot (e.g., resize, split), we temporarily
remove the memslot to re-add it immediately afterwards. These updates
are not atomic, especially not for KVM VCPU threads, such that we can
get spurious faults.
Let's inhibit most KVM ioctls while performing relevant updates, such
that we can perform the update just as if it would happen atomically
without additional kernel support.
We capture the add/del changes and apply them in the notifier commit
stage instead. There, we can check for overlaps and perform the ioctl
inhibiting only if really required (-> overlap).
To keep things simple we don't perform additional checks that wouldn't
actually result in an overlap -- such as !RAM memory regions in some
cases (see kvm_set_phys_mem()).
To minimize cache-line bouncing, use a separate indicator
(in_ioctl_lock) per CPU. Also, make sure to hold the kvm_slots_lock
while performing both actions (removing+re-adding).
We have to wait until all IOCTLs were exited and block new ones from
getting executed.
This approach cannot result in a deadlock as long as the inhibitor does
not hold any locks that might hinder an IOCTL from getting finished and
exited - something fairly unusual. The inhibitor will always hold the BQL.
AFAIKs, one possible candidate would be userfaultfd. If a page cannot be
placed (e.g., during postcopy), because we're waiting for a lock, or if the
userfaultfd thread cannot process a fault, because it is waiting for a
lock, there could be a deadlock. However, the BQL is not applicable here,
because any other guest memory access while holding the BQL would already
result in a deadlock.
Nothing else in the kernel should block forever and wait for userspace
intervention.
Note: pause_all_vcpus()/resume_all_vcpus() or
start_exclusive()/end_exclusive() cannot be used, as they either drop
the BQL or require to be called without the BQL - something inhibitors
cannot handle. We need a low-level locking mechanism that is
deadlock-free even when not releasing the BQL.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Tested-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20221111154758.1372674-4-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Conflicts:
accel/kvm/kvm-all.c: include "sysemu/dirtylimit.h" is missing in
rhel 8.8.0
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
accel/kvm/kvm-all.c | 101 ++++++++++++++++++++++++++++++++++-----
include/sysemu/kvm_int.h | 8 ++++
2 files changed, 98 insertions(+), 11 deletions(-)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 221aadfda7..3b7bc39823 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -31,6 +31,7 @@
#include "sysemu/kvm_int.h"
#include "sysemu/runstate.h"
#include "sysemu/cpus.h"
+#include "sysemu/accel-blocker.h"
#include "qemu/bswap.h"
#include "exec/memory.h"
#include "exec/ram_addr.h"
@@ -45,6 +46,7 @@
#include "qemu/guest-random.h"
#include "sysemu/hw_accel.h"
#include "kvm-cpus.h"
+#include "qemu/range.h"
#include "hw/boards.h"
@@ -1334,6 +1336,7 @@ void kvm_set_max_memslot_size(hwaddr max_slot_size)
kvm_max_slot_size = max_slot_size;
}
+/* Called with KVMMemoryListener.slots_lock held */
static void kvm_set_phys_mem(KVMMemoryListener *kml,
MemoryRegionSection *section, bool add)
{
@@ -1368,14 +1371,12 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
ram = memory_region_get_ram_ptr(mr) + mr_offset;
ram_start_offset = memory_region_get_ram_addr(mr) + mr_offset;
- kvm_slots_lock();
-
if (!add) {
do {
slot_size = MIN(kvm_max_slot_size, size);
mem = kvm_lookup_matching_slot(kml, start_addr, slot_size);
if (!mem) {
- goto out;
+ return;
}
if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) {
/*
@@ -1413,7 +1414,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
start_addr += slot_size;
size -= slot_size;
} while (size);
- goto out;
+ return;
}
/* register the new slot */
@@ -1438,9 +1439,6 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
ram += slot_size;
size -= slot_size;
} while (size);
-
-out:
- kvm_slots_unlock();
}
static void *kvm_dirty_ring_reaper_thread(void *data)
@@ -1492,18 +1490,95 @@ static void kvm_region_add(MemoryListener *listener,
MemoryRegionSection *section)
{
KVMMemoryListener *kml = container_of(listener, KVMMemoryListener, listener);
+ KVMMemoryUpdate *update;
+
+ update = g_new0(KVMMemoryUpdate, 1);
+ update->section = *section;
- memory_region_ref(section->mr);
- kvm_set_phys_mem(kml, section, true);
+ QSIMPLEQ_INSERT_TAIL(&kml->transaction_add, update, next);
}
static void kvm_region_del(MemoryListener *listener,
MemoryRegionSection *section)
{
KVMMemoryListener *kml = container_of(listener, KVMMemoryListener, listener);
+ KVMMemoryUpdate *update;
+
+ update = g_new0(KVMMemoryUpdate, 1);
+ update->section = *section;
+
+ QSIMPLEQ_INSERT_TAIL(&kml->transaction_del, update, next);
+}
+
+static void kvm_region_commit(MemoryListener *listener)
+{
+ KVMMemoryListener *kml = container_of(listener, KVMMemoryListener,
+ listener);
+ KVMMemoryUpdate *u1, *u2;
+ bool need_inhibit = false;
+
+ if (QSIMPLEQ_EMPTY(&kml->transaction_add) &&
+ QSIMPLEQ_EMPTY(&kml->transaction_del)) {
+ return;
+ }
+
+ /*
+ * We have to be careful when regions to add overlap with ranges to remove.
+ * We have to simulate atomic KVM memslot updates by making sure no ioctl()
+ * is currently active.
+ *
+ * The lists are order by addresses, so it's easy to find overlaps.
+ */
+ u1 = QSIMPLEQ_FIRST(&kml->transaction_del);
+ u2 = QSIMPLEQ_FIRST(&kml->transaction_add);
+ while (u1 && u2) {
+ Range r1, r2;
+
+ range_init_nofail(&r1, u1->section.offset_within_address_space,
+ int128_get64(u1->section.size));
+ range_init_nofail(&r2, u2->section.offset_within_address_space,
+ int128_get64(u2->section.size));
+
+ if (range_overlaps_range(&r1, &r2)) {
+ need_inhibit = true;
+ break;
+ }
+ if (range_lob(&r1) < range_lob(&r2)) {
+ u1 = QSIMPLEQ_NEXT(u1, next);
+ } else {
+ u2 = QSIMPLEQ_NEXT(u2, next);
+ }
+ }
+
+ kvm_slots_lock();
+ if (need_inhibit) {
+ accel_ioctl_inhibit_begin();
+ }
+
+ /* Remove all memslots before adding the new ones. */
+ while (!QSIMPLEQ_EMPTY(&kml->transaction_del)) {
+ u1 = QSIMPLEQ_FIRST(&kml->transaction_del);
+ QSIMPLEQ_REMOVE_HEAD(&kml->transaction_del, next);
- kvm_set_phys_mem(kml, section, false);
- memory_region_unref(section->mr);
+ kvm_set_phys_mem(kml, &u1->section, false);
+ memory_region_unref(u1->section.mr);
+
+ g_free(u1);
+ }
+ while (!QSIMPLEQ_EMPTY(&kml->transaction_add)) {
+ u1 = QSIMPLEQ_FIRST(&kml->transaction_add);
+ QSIMPLEQ_REMOVE_HEAD(&kml->transaction_add, next);
+
+ memory_region_ref(u1->section.mr);
+ kvm_set_phys_mem(kml, &u1->section, true);
+
+ g_free(u1);
+ }
+
+ if (need_inhibit) {
+ accel_ioctl_inhibit_end();
+ }
+ kvm_slots_unlock();
}
static void kvm_log_sync(MemoryListener *listener,
@@ -1647,8 +1722,12 @@ void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
kml->slots[i].slot = i;
}
+ QSIMPLEQ_INIT(&kml->transaction_add);
+ QSIMPLEQ_INIT(&kml->transaction_del);
+
kml->listener.region_add = kvm_region_add;
kml->listener.region_del = kvm_region_del;
+ kml->listener.commit = kvm_region_commit;
kml->listener.log_start = kvm_log_start;
kml->listener.log_stop = kvm_log_stop;
kml->listener.priority = 10;
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 1f5487d9b7..7e18c0a3c0 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -11,6 +11,7 @@
#include "exec/memory.h"
#include "qemu/accel.h"
+#include "qemu/queue.h"
#include "sysemu/kvm.h"
typedef struct KVMSlot
@@ -30,10 +31,17 @@ typedef struct KVMSlot
ram_addr_t ram_start_offset;
} KVMSlot;
+typedef struct KVMMemoryUpdate {
+ QSIMPLEQ_ENTRY(KVMMemoryUpdate) next;
+ MemoryRegionSection section;
+} KVMMemoryUpdate;
+
typedef struct KVMMemoryListener {
MemoryListener listener;
KVMSlot *slots;
int as_id;
+ QSIMPLEQ_HEAD(, KVMMemoryUpdate) transaction_add;
+ QSIMPLEQ_HEAD(, KVMMemoryUpdate) transaction_del;
} KVMMemoryListener;
void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
--
2.37.3

View File

@ -1,62 +0,0 @@
From 9ddefaedf423ec03eadaf17496c14e0d7b2381c8 Mon Sep 17 00:00:00 2001
From: Cornelia Huck <cohuck@redhat.com>
Date: Thu, 28 Jul 2022 16:24:46 +0200
Subject: [PATCH 30/32] kvm: don't use perror() without useful errno
RH-Author: Cornelia Huck <cohuck@redhat.com>
RH-MergeRequest: 110: kvm: don't use perror() without useful errno
RH-Commit: [1/1] 20e51aac6767c1f89f74c7d692d1fb7689eff5f0 (cohuck/qemu-kvm-c9s)
RH-Bugzilla: 2095608
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Gavin Shan <gshan@redhat.com>
perror() is designed to append the decoded errno value to a
string. This, however, only makes sense if we called something that
actually sets errno prior to that.
For the callers that check for split irqchip support that is not the
case, and we end up with confusing error messages that end in
"success". Use error_report() instead.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20220728142446.438177-1-cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=2095608
(cherry picked from commit 47c182fe8b03c0c40059fb95840923e65c9bdb4f)
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
accel/kvm/kvm-all.c | 2 +-
target/arm/kvm.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 5f1377ca04..e9c7947640 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2254,7 +2254,7 @@ static void kvm_irqchip_create(KVMState *s)
ret = kvm_arch_irqchip_create(s);
if (ret == 0) {
if (s->kernel_irqchip_split == ON_OFF_AUTO_ON) {
- perror("Split IRQ chip mode not supported.");
+ error_report("Split IRQ chip mode not supported.");
exit(1);
} else {
ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP);
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index bbf1ce7ba3..0a2ba1f8e3 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -960,7 +960,7 @@ void kvm_arch_init_irq_routing(KVMState *s)
int kvm_arch_irqchip_create(KVMState *s)
{
if (kvm_kernel_irqchip_split()) {
- perror("-machine kernel_irqchip=split is not supported on ARM.");
+ error_report("-machine kernel_irqchip=split is not supported on ARM.");
exit(1);
}
--
2.31.1

View File

@ -1,154 +0,0 @@
From 51c310097832724bafac26aed81399da40128400 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <eperezma@redhat.com>
Date: Thu, 21 Jul 2022 15:50:43 +0200
Subject: [PATCH 05/32] meson: create have_vhost_* variables
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eugenio Pérez <eperezma@redhat.com>
RH-MergeRequest: 108: Net Control Virtqueue shadow Support
RH-Commit: [5/27] 3b30f89e6d639923dc9d9a92a4261bb4509e5c83 (eperezmartin/qemu-kvm)
RH-Bugzilla: 1939363
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Cindy Lu <lulu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
Bugzilla: https://bugzilla.redhat.com/1939363
Upstream Status: git://git.qemu.org/qemu.git
commit 2a3129a37652e5e81d12f6e16dd3c447f09831f9
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Wed Apr 20 17:34:05 2022 +0200
meson: create have_vhost_* variables
When using Meson options rather than config-host.h, the "when" clauses
have to be changed to if statements (which is not necessarily great,
though at least it highlights which parts of the build are per-target
and which are not).
Do that before moving vhost logic to meson.build, though for now
the variables are just based on config-host.mak data.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
meson.build | 30 ++++++++++++++++++++----------
tests/meson.build | 2 +-
tools/meson.build | 2 +-
3 files changed, 22 insertions(+), 12 deletions(-)
diff --git a/meson.build b/meson.build
index 13e3323380..735f538497 100644
--- a/meson.build
+++ b/meson.build
@@ -298,6 +298,15 @@ have_tpm = get_option('tpm') \
.require(targetos != 'windows', error_message: 'TPM emulation only available on POSIX systems') \
.allowed()
+# vhost
+have_vhost_user = 'CONFIG_VHOST_USER' in config_host
+have_vhost_vdpa = 'CONFIG_VHOST_VDPA' in config_host
+have_vhost_kernel = 'CONFIG_VHOST_KERNEL' in config_host
+have_vhost_net_user = 'CONFIG_VHOST_NET_USER' in config_host
+have_vhost_net_vdpa = 'CONFIG_VHOST_NET_VDPA' in config_host
+have_vhost_net = 'CONFIG_VHOST_NET' in config_host
+have_vhost_user_crypto = 'CONFIG_VHOST_CRYPTO' in config_host
+
# Target-specific libraries and flags
libm = cc.find_library('m', required: false)
threads = dependency('threads')
@@ -1335,7 +1344,7 @@ has_statx_mnt_id = cc.links(statx_mnt_id_test)
have_vhost_user_blk_server = get_option('vhost_user_blk_server') \
.require(targetos == 'linux',
error_message: 'vhost_user_blk_server requires linux') \
- .require('CONFIG_VHOST_USER' in config_host,
+ .require(have_vhost_user,
error_message: 'vhost_user_blk_server requires vhost-user support') \
.disable_auto_if(not have_system) \
.allowed()
@@ -2116,9 +2125,9 @@ host_kconfig = \
(have_ivshmem ? ['CONFIG_IVSHMEM=y'] : []) + \
('CONFIG_OPENGL' in config_host ? ['CONFIG_OPENGL=y'] : []) + \
(x11.found() ? ['CONFIG_X11=y'] : []) + \
- ('CONFIG_VHOST_USER' in config_host ? ['CONFIG_VHOST_USER=y'] : []) + \
- ('CONFIG_VHOST_VDPA' in config_host ? ['CONFIG_VHOST_VDPA=y'] : []) + \
- ('CONFIG_VHOST_KERNEL' in config_host ? ['CONFIG_VHOST_KERNEL=y'] : []) + \
+ (have_vhost_user ? ['CONFIG_VHOST_USER=y'] : []) + \
+ (have_vhost_vdpa ? ['CONFIG_VHOST_VDPA=y'] : []) + \
+ (have_vhost_kernel ? ['CONFIG_VHOST_KERNEL=y'] : []) + \
(have_virtfs ? ['CONFIG_VIRTFS=y'] : []) + \
('CONFIG_LINUX' in config_host ? ['CONFIG_LINUX=y'] : []) + \
('CONFIG_PVRDMA' in config_host ? ['CONFIG_PVRDMA=y'] : []) + \
@@ -2799,7 +2808,7 @@ if have_system or have_user
endif
vhost_user = not_found
-if targetos == 'linux' and 'CONFIG_VHOST_USER' in config_host
+if targetos == 'linux' and have_vhost_user
libvhost_user = subproject('libvhost-user')
vhost_user = libvhost_user.get_variable('vhost_user_dep')
endif
@@ -3386,7 +3395,7 @@ if have_tools
dependencies: qemuutil,
install: true)
- if 'CONFIG_VHOST_USER' in config_host
+ if have_vhost_user
subdir('contrib/vhost-user-blk')
subdir('contrib/vhost-user-gpu')
subdir('contrib/vhost-user-input')
@@ -3516,15 +3525,16 @@ if 'simple' in get_option('trace_backends')
endif
summary_info += {'D-Bus display': dbus_display}
summary_info += {'QOM debugging': get_option('qom_cast_debug')}
-summary_info += {'vhost-kernel support': config_host.has_key('CONFIG_VHOST_KERNEL')}
-summary_info += {'vhost-net support': config_host.has_key('CONFIG_VHOST_NET')}
-summary_info += {'vhost-crypto support': config_host.has_key('CONFIG_VHOST_CRYPTO')}
+summary_info += {'vhost-kernel support': have_vhost_kernel}
+summary_info += {'vhost-net support': have_vhost_net}
+summary_info += {'vhost-user support': have_vhost_user}
+summary_info += {'vhost-user-crypto support': have_vhost_user_crypto}
summary_info += {'vhost-scsi support': config_host.has_key('CONFIG_VHOST_SCSI')}
summary_info += {'vhost-vsock support': config_host.has_key('CONFIG_VHOST_VSOCK')}
-summary_info += {'vhost-user support': config_host.has_key('CONFIG_VHOST_USER')}
summary_info += {'vhost-user-blk server support': have_vhost_user_blk_server}
summary_info += {'vhost-user-fs support': config_host.has_key('CONFIG_VHOST_USER_FS')}
summary_info += {'vhost-vdpa support': config_host.has_key('CONFIG_VHOST_VDPA')}
+summary_info += {'vhost-vdpa support': have_vhost_vdpa}
summary_info += {'build guest agent': have_ga}
summary(summary_info, bool_yn: true, section: 'Configurable features')
diff --git a/tests/meson.build b/tests/meson.build
index 1d05109eb4..bbe41c8559 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -70,7 +70,7 @@ test_deps = {
'test-qht-par': qht_bench,
}
-if have_tools and 'CONFIG_VHOST_USER' in config_host and 'CONFIG_LINUX' in config_host
+if have_tools and have_vhost_user and 'CONFIG_LINUX' in config_host
executable('vhost-user-bridge',
sources: files('vhost-user-bridge.c'),
dependencies: [qemuutil, vhost_user])
diff --git a/tools/meson.build b/tools/meson.build
index 46977af84f..10eb3a043f 100644
--- a/tools/meson.build
+++ b/tools/meson.build
@@ -3,7 +3,7 @@ have_virtiofsd = get_option('virtiofsd') \
error_message: 'virtiofsd requires Linux') \
.require(seccomp.found() and libcap_ng.found(),
error_message: 'virtiofsd requires libcap-ng-devel and seccomp-devel') \
- .require('CONFIG_VHOST_USER' in config_host,
+ .require(have_vhost_user,
error_message: 'virtiofsd needs vhost-user-support') \
.disable_auto_if(not have_tools and not have_system) \
.allowed()
--
2.31.1

View File

@ -1,213 +0,0 @@
From a7d57a09e33275d5e6649273b5c9da1bc3c92491 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <eperezma@redhat.com>
Date: Thu, 21 Jul 2022 15:51:53 +0200
Subject: [PATCH 06/32] meson: use have_vhost_* variables to pick sources
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eugenio Pérez <eperezma@redhat.com>
RH-MergeRequest: 108: Net Control Virtqueue shadow Support
RH-Commit: [6/27] bc3db1efb759c0bc97fde2f4fbb3d6dc404c8d3d (eperezmartin/qemu-kvm)
RH-Bugzilla: 1939363
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Cindy Lu <lulu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
Bugzilla: https://bugzilla.redhat.com/1939363
Upstream Status: git://git.qemu.org/qemu.git
commit 43b6d7ee1fbc5b5fb7c85d8131fdac1863214ad6
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Wed Apr 20 17:34:06 2022 +0200
meson: use have_vhost_* variables to pick sources
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
Kconfig.host | 3 ---
backends/meson.build | 8 ++++++--
hw/net/meson.build | 8 ++++++--
hw/virtio/Kconfig | 3 ---
hw/virtio/meson.build | 25 ++++++++++++++++---------
meson.build | 1 +
net/meson.build | 12 +++++++-----
tests/qtest/meson.build | 4 +++-
8 files changed, 39 insertions(+), 25 deletions(-)
diff --git a/Kconfig.host b/Kconfig.host
index 60b9c07b5e..1165c4eacd 100644
--- a/Kconfig.host
+++ b/Kconfig.host
@@ -22,15 +22,12 @@ config TPM
config VHOST_USER
bool
- select VHOST
config VHOST_VDPA
bool
- select VHOST
config VHOST_KERNEL
bool
- select VHOST
config VIRTFS
bool
diff --git a/backends/meson.build b/backends/meson.build
index 6e68945528..cb92f639ca 100644
--- a/backends/meson.build
+++ b/backends/meson.build
@@ -12,9 +12,13 @@ softmmu_ss.add([files(
softmmu_ss.add(when: 'CONFIG_POSIX', if_true: files('rng-random.c'))
softmmu_ss.add(when: 'CONFIG_POSIX', if_true: files('hostmem-file.c'))
softmmu_ss.add(when: 'CONFIG_LINUX', if_true: files('hostmem-memfd.c'))
-softmmu_ss.add(when: ['CONFIG_VHOST_USER', 'CONFIG_VIRTIO'], if_true: files('vhost-user.c'))
+if have_vhost_user
+ softmmu_ss.add(when: 'CONFIG_VIRTIO', if_true: files('vhost-user.c'))
+endif
softmmu_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost.c'))
-softmmu_ss.add(when: ['CONFIG_VIRTIO_CRYPTO', 'CONFIG_VHOST_CRYPTO'], if_true: files('cryptodev-vhost-user.c'))
+if have_vhost_user_crypto
+ softmmu_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost-user.c'))
+endif
softmmu_ss.add(when: 'CONFIG_GIO', if_true: [files('dbus-vmstate.c'), gio])
softmmu_ss.add(when: 'CONFIG_SGX', if_true: files('hostmem-epc.c'))
diff --git a/hw/net/meson.build b/hw/net/meson.build
index 685b75badb..ebac261542 100644
--- a/hw/net/meson.build
+++ b/hw/net/meson.build
@@ -46,8 +46,12 @@ specific_ss.add(when: 'CONFIG_XILINX_ETHLITE', if_true: files('xilinx_ethlite.c'
softmmu_ss.add(when: 'CONFIG_VIRTIO_NET', if_true: files('net_rx_pkt.c'))
specific_ss.add(when: 'CONFIG_VIRTIO_NET', if_true: files('virtio-net.c'))
-softmmu_ss.add(when: ['CONFIG_VIRTIO_NET', 'CONFIG_VHOST_NET'], if_true: files('vhost_net.c'), if_false: files('vhost_net-stub.c'))
-softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost_net-stub.c'))
+if have_vhost_net
+ softmmu_ss.add(when: 'CONFIG_VIRTIO_NET', if_true: files('vhost_net.c'), if_false: files('vhost_net-stub.c'))
+ softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost_net-stub.c'))
+else
+ softmmu_ss.add(files('vhost_net-stub.c'))
+endif
softmmu_ss.add(when: 'CONFIG_ETSEC', if_true: files(
'fsl_etsec/etsec.c',
diff --git a/hw/virtio/Kconfig b/hw/virtio/Kconfig
index c144d42f9b..8ca7b3d9d6 100644
--- a/hw/virtio/Kconfig
+++ b/hw/virtio/Kconfig
@@ -1,6 +1,3 @@
-config VHOST
- bool
-
config VIRTIO
bool
diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index 67dc77e00f..30a832eb4a 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -2,18 +2,22 @@ softmmu_virtio_ss = ss.source_set()
softmmu_virtio_ss.add(files('virtio-bus.c'))
softmmu_virtio_ss.add(when: 'CONFIG_VIRTIO_PCI', if_true: files('virtio-pci.c'))
softmmu_virtio_ss.add(when: 'CONFIG_VIRTIO_MMIO', if_true: files('virtio-mmio.c'))
-softmmu_virtio_ss.add(when: 'CONFIG_VHOST', if_false: files('vhost-stub.c'))
-
-softmmu_ss.add_all(when: 'CONFIG_VIRTIO', if_true: softmmu_virtio_ss)
-softmmu_ss.add(when: 'CONFIG_VIRTIO', if_false: files('vhost-stub.c'))
-
-softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost-stub.c'))
virtio_ss = ss.source_set()
virtio_ss.add(files('virtio.c'))
-virtio_ss.add(when: 'CONFIG_VHOST', if_true: files('vhost.c', 'vhost-backend.c', 'vhost-iova-tree.c'))
-virtio_ss.add(when: 'CONFIG_VHOST_USER', if_true: files('vhost-user.c'))
-virtio_ss.add(when: 'CONFIG_VHOST_VDPA', if_true: files('vhost-shadow-virtqueue.c', 'vhost-vdpa.c'))
+
+if have_vhost
+ virtio_ss.add(files('vhost.c', 'vhost-backend.c', 'vhost-iova-tree.c'))
+ if have_vhost_user
+ virtio_ss.add(files('vhost-user.c'))
+ endif
+ if have_vhost_vdpa
+ virtio_ss.add(files('vhost-vdpa.c', 'vhost-shadow-virtqueue.c'))
+ endif
+else
+ softmmu_virtio_ss.add(files('vhost-stub.c'))
+endif
+
virtio_ss.add(when: 'CONFIG_VIRTIO_BALLOON', if_true: files('virtio-balloon.c'))
virtio_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('virtio-crypto.c'))
virtio_ss.add(when: ['CONFIG_VIRTIO_CRYPTO', 'CONFIG_VIRTIO_PCI'], if_true: files('virtio-crypto-pci.c'))
@@ -53,3 +57,6 @@ virtio_pci_ss.add(when: 'CONFIG_VIRTIO_MEM', if_true: files('virtio-mem-pci.c'))
virtio_ss.add_all(when: 'CONFIG_VIRTIO_PCI', if_true: virtio_pci_ss)
specific_ss.add_all(when: 'CONFIG_VIRTIO', if_true: virtio_ss)
+softmmu_ss.add_all(when: 'CONFIG_VIRTIO', if_true: softmmu_virtio_ss)
+softmmu_ss.add(when: 'CONFIG_VIRTIO', if_false: files('vhost-stub.c'))
+softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost-stub.c'))
diff --git a/meson.build b/meson.build
index 735f538497..9ba675f098 100644
--- a/meson.build
+++ b/meson.build
@@ -305,6 +305,7 @@ have_vhost_kernel = 'CONFIG_VHOST_KERNEL' in config_host
have_vhost_net_user = 'CONFIG_VHOST_NET_USER' in config_host
have_vhost_net_vdpa = 'CONFIG_VHOST_NET_VDPA' in config_host
have_vhost_net = 'CONFIG_VHOST_NET' in config_host
+have_vhost = have_vhost_user or have_vhost_vdpa or have_vhost_kernel
have_vhost_user_crypto = 'CONFIG_VHOST_CRYPTO' in config_host
# Target-specific libraries and flags
diff --git a/net/meson.build b/net/meson.build
index 847bc2ac85..c965e83b26 100644
--- a/net/meson.build
+++ b/net/meson.build
@@ -26,10 +26,10 @@ softmmu_ss.add(when: vde, if_true: files('vde.c'))
if have_netmap
softmmu_ss.add(files('netmap.c'))
endif
-vhost_user_ss = ss.source_set()
-vhost_user_ss.add(when: 'CONFIG_VIRTIO_NET', if_true: files('vhost-user.c'), if_false: files('vhost-user-stub.c'))
-softmmu_ss.add_all(when: 'CONFIG_VHOST_NET_USER', if_true: vhost_user_ss)
-softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost-user-stub.c'))
+if have_vhost_net_user
+ softmmu_ss.add(when: 'CONFIG_VIRTIO_NET', if_true: files('vhost-user.c'), if_false: files('vhost-user-stub.c'))
+ softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost-user-stub.c'))
+endif
softmmu_ss.add(when: 'CONFIG_LINUX', if_true: files('tap-linux.c'))
softmmu_ss.add(when: 'CONFIG_BSD', if_true: files('tap-bsd.c'))
@@ -40,6 +40,8 @@ if not config_host.has_key('CONFIG_LINUX') and not config_host.has_key('CONFIG_B
endif
softmmu_ss.add(when: 'CONFIG_POSIX', if_true: files(tap_posix))
softmmu_ss.add(when: 'CONFIG_WIN32', if_true: files('tap-win32.c'))
-softmmu_ss.add(when: 'CONFIG_VHOST_NET_VDPA', if_true: files('vhost-vdpa.c'))
+if have_vhost_net_vdpa
+ softmmu_ss.add(files('vhost-vdpa.c'))
+endif
subdir('can')
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 67cd32def1..9f550df900 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -269,7 +269,9 @@ qos_test_ss.add(
if have_virtfs
qos_test_ss.add(files('virtio-9p-test.c'))
endif
-qos_test_ss.add(when: 'CONFIG_VHOST_USER', if_true: files('vhost-user-test.c'))
+if have_vhost_user
+ qos_test_ss.add(files('vhost-user-test.c'))
+endif
if have_tools and have_vhost_user_blk_server
qos_test_ss.add(files('vhost-user-blk-test.c'))
endif
--
2.31.1

View File

@ -1,87 +0,0 @@
From 7c489b54b0bb33445113fbf16e88feb23be68013 Mon Sep 17 00:00:00 2001
From: Leonardo Bras <leobras@redhat.com>
Date: Fri, 13 May 2022 03:28:30 -0300
Subject: [PATCH 07/18] meson.build: Fix docker-test-build@alpine when
including linux/errqueue.h
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Leonardo Brás <leobras@redhat.com>
RH-MergeRequest: 95: MSG_ZEROCOPY + Multifd
RH-Commit: [1/11] f058eb846fcf611d527a1dd3b0cc399cdc17e3ee (LeoBras/centos-qemu-kvm)
RH-Bugzilla: 1968509
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
RH-Acked-by: Peter Xu <peterx@redhat.com>
A build error happens in alpine CI when linux/errqueue.h is included
in io/channel-socket.c, due to redefining of 'struct __kernel_timespec':
===
ninja: job failed: [...]
In file included from /usr/include/linux/errqueue.h:6,
from ../io/channel-socket.c:29:
/usr/include/linux/time_types.h:7:8: error: redefinition of 'struct __kernel_timespec'
7 | struct __kernel_timespec {
| ^~~~~~~~~~~~~~~~~
In file included from /usr/include/liburing.h:19,
from /builds/user/qemu/include/block/aio.h:18,
from /builds/user/qemu/include/io/channel.h:26,
from /builds/user/qemu/include/io/channel-socket.h:24,
from ../io/channel-socket.c:24:
/usr/include/liburing/compat.h:9:8: note: originally defined here
9 | struct __kernel_timespec {
| ^~~~~~~~~~~~~~~~~
ninja: subcommand failed
===
As above error message suggests, 'struct __kernel_timespec' was already
defined by liburing/compat.h.
Fix alpine CI by adding test to disable liburing in configure step if a
redefinition happens between linux/errqueue.h and liburing/compat.h.
[dgilbert: This has been fixed in Alpine issue 13813 and liburing]
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20220513062836.965425-2-leobras@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 354081d43de44ebd3497fe08f7f0121a5517d528)
Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
meson.build | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/meson.build b/meson.build
index 5a7c10e639..13e3323380 100644
--- a/meson.build
+++ b/meson.build
@@ -471,12 +471,23 @@ if not get_option('linux_aio').auto() or have_block
required: get_option('linux_aio'),
kwargs: static_kwargs)
endif
+
+linux_io_uring_test = '''
+ #include <liburing.h>
+ #include <linux/errqueue.h>
+
+ int main(void) { return 0; }'''
+
linux_io_uring = not_found
if not get_option('linux_io_uring').auto() or have_block
linux_io_uring = dependency('liburing', version: '>=0.3',
required: get_option('linux_io_uring'),
method: 'pkg-config', kwargs: static_kwargs)
+ if not cc.links(linux_io_uring_test)
+ linux_io_uring = not_found
+ endif
endif
+
libnfs = not_found
if not get_option('libnfs').auto() or have_block
libnfs = dependency('libnfs', version: '>=1.9.3',
--
2.35.3

View File

@ -1,47 +0,0 @@
From 4bd48e784ae0c38c89f1a944b06c997fd28c4d37 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Thu, 19 May 2022 04:15:33 -0400
Subject: [PATCH 16/16] migration: Fix operator type
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Miroslav Rezanina <mrezanin@redhat.com>
RH-MergeRequest: 92: Fix build using clang 14
RH-Commit: [1/1] ad9980e64cf2e39085d68f1ff601444bf2afe228 (mrezanin/centos-src-qemu-kvm)
RH-Bugzilla: 2064530
RH-Acked-by: Daniel P. Berrangé <berrange@redhat.com>
RH-Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Clang spotted an & that should have been an &&; fix it.
Reported by: David Binderman / https://gitlab.com/dcb
Fixes: 65dacaa04fa ("migration: introduce save_normal_page()")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/963
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20220406102515.96320-1-dgilbert@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit f912ec5b2d65644116ff496b58d7c9145c19e4c0)
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
---
migration/ram.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/migration/ram.c b/migration/ram.c
index 3532f64ecb..0ef4bd63eb 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1289,7 +1289,7 @@ static int save_normal_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
offset | RAM_SAVE_FLAG_PAGE));
if (async) {
qemu_put_buffer_async(rs->f, buf, TARGET_PAGE_SIZE,
- migrate_release_ram() &
+ migrate_release_ram() &&
migration_in_postcopy());
} else {
qemu_put_buffer(rs->f, buf, TARGET_PAGE_SIZE);
--
2.31.1

View File

@ -0,0 +1,76 @@
From 34eae2d7ef928a7e0e10cc30fe76839c005998eb Mon Sep 17 00:00:00 2001
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Date: Wed, 13 Apr 2022 12:33:29 +0100
Subject: [PATCH 07/11] migration: Read state once
RH-Author: Dr. David Alan Gilbert <dgilbert@redhat.com>
RH-MergeRequest: 249: migration: Read state once
RH-Bugzilla: 2074205
RH-Acked-by: Peter Xu <peterx@redhat.com>
RH-Acked-by: Laszlo Ersek <lersek@redhat.com>
RH-Acked-by: Jon Maloy <jmaloy@redhat.com>
RH-Acked-by: quintela1 <quintela@redhat.com>
RH-Commit: [1/1] 9aa47b492a646fce4e66ebd9b7d7a85286d16051
The 'status' field for the migration is updated normally using
an atomic operation from the migration thread.
Most readers of it aren't that careful, and in most cases it doesn't
matter.
In query_migrate->fill_source_migration_info the 'state'
is read twice; the first time to decide which state fields to fill in,
and then secondly to copy the state to the status field; that can end up
with a status that's inconsistent; e.g. setting up the fields
for 'setup' and then having an 'active' status. In that case
libvirt gets upset by the lack of ram info.
The symptom is:
libvirt.libvirtError: internal error: migration was active, but no RAM info was set
Read the state exactly once in fill_source_migration_info.
This is a possible fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=2074205
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20220413113329.103696-1-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 552de79bfdd5e9e53847eb3c6d6e4cd898a4370e)
---
migration/migration.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 51e6726dac..d8b24a2c91 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1071,6 +1071,7 @@ static void populate_disk_info(MigrationInfo *info)
static void fill_source_migration_info(MigrationInfo *info)
{
MigrationState *s = migrate_get_current();
+ int state = qatomic_read(&s->state);
GSList *cur_blocker = migration_blockers;
info->blocked_reasons = NULL;
@@ -1090,7 +1091,7 @@ static void fill_source_migration_info(MigrationInfo *info)
}
info->has_blocked_reasons = info->blocked_reasons != NULL;
- switch (s->state) {
+ switch (state) {
case MIGRATION_STATUS_NONE:
/* no migration has happened ever */
/* do not overwrite destination migration status */
@@ -1135,7 +1136,7 @@ static void fill_source_migration_info(MigrationInfo *info)
info->has_status = true;
break;
}
- info->status = s->state;
+ info->status = state;
}
typedef enum WriteTrackingSupport {
--
2.37.3

View File

@ -0,0 +1,296 @@
From f21a343af4b4d0c6e5181ae0abd0f6280dc8296c Mon Sep 17 00:00:00 2001
From: "manish.mishra" <manish.mishra@nutanix.com>
Date: Tue, 20 Dec 2022 18:44:18 +0000
Subject: [PATCH 2/3] migration: check magic value for deciding the mapping of
channels
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Peter Xu <peterx@redhat.com>
RH-MergeRequest: 258: migration: Fix multifd crash due to channel disorder
RH-Bugzilla: 2137740
RH-Acked-by: quintela1 <quintela@redhat.com>
RH-Acked-by: Leonardo Brás <leobras@redhat.com>
RH-Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
RH-Commit: [2/2] f97bebef3d3e372cfd660e5ddb6cffba791840d2
Conflicts:
migration/migration.c
migration/multifd.c
migration/postcopy-ram.c
migration/postcopy-ram.h
There're a bunch of conflicts due to missing upstream patches on
e.g. on qemufile reworks, postcopy preempt. We don't plan to have
preempt in rhel8 at all, probably the same as the rest.
Current logic assumes that channel connections on the destination side are
always established in the same order as the source and the first one will
always be the main channel followed by the multifid or post-copy
preemption channel. This may not be always true, as even if a channel has a
connection established on the source side it can be in the pending state on
the destination side and a newer connection can be established first.
Basically causing out of order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number, this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy(multifd) live migration, as mentioned, the
post-copy preempt channel does not send any magic number. Also, tls live
migrations already does tls handshake before creating other channels, so
this issue is not possible with tls, hence this logic is avoided for tls
live migrations. This patch uses read peek to check the magic number of
channels so that current data/control stream management remains
un-effected.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: manish.mishra <manish.mishra@nutanix.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 6720c2b32725e6ac404f22851a0ecd0a71d0cbe2)
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/channel.c | 45 ++++++++++++++++++++++++++++++++++++++
migration/channel.h | 5 +++++
migration/migration.c | 51 +++++++++++++++++++++++++++++++------------
migration/multifd.c | 19 ++++++++--------
migration/multifd.h | 2 +-
5 files changed, 98 insertions(+), 24 deletions(-)
diff --git a/migration/channel.c b/migration/channel.c
index 086b5c0d8b..ee308fef23 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -98,3 +98,48 @@ void migration_channel_connect(MigrationState *s,
g_free(s->hostname);
error_free(error);
}
+
+
+/**
+ * @migration_channel_read_peek - Peek at migration channel, without
+ * actually removing it from channel buffer.
+ *
+ * @ioc: the channel object
+ * @buf: the memory region to read data into
+ * @buflen: the number of bytes to read in @buf
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Returns 0 if successful, returns -1 and sets @errp if fails.
+ */
+int migration_channel_read_peek(QIOChannel *ioc,
+ const char *buf,
+ const size_t buflen,
+ Error **errp)
+{
+ ssize_t len = 0;
+ struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen };
+
+ while (true) {
+ len = qio_channel_readv_full(ioc, &iov, 1, NULL, NULL,
+ QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp);
+
+ if (len <= 0 && len != QIO_CHANNEL_ERR_BLOCK) {
+ error_setg(errp,
+ "Failed to peek at channel");
+ return -1;
+ }
+
+ if (len == buflen) {
+ break;
+ }
+
+ /* 1ms sleep. */
+ if (qemu_in_coroutine()) {
+ qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 1000000);
+ } else {
+ g_usleep(1000);
+ }
+ }
+
+ return 0;
+}
diff --git a/migration/channel.h b/migration/channel.h
index 67a461c28a..5bdb8208a7 100644
--- a/migration/channel.h
+++ b/migration/channel.h
@@ -24,4 +24,9 @@ void migration_channel_connect(MigrationState *s,
QIOChannel *ioc,
const char *hostname,
Error *error_in);
+
+int migration_channel_read_peek(QIOChannel *ioc,
+ const char *buf,
+ const size_t buflen,
+ Error **errp);
#endif
diff --git a/migration/migration.c b/migration/migration.c
index d8b24a2c91..0885549de0 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -32,6 +32,7 @@
#include "savevm.h"
#include "qemu-file-channel.h"
#include "qemu-file.h"
+#include "channel.h"
#include "migration/vmstate.h"
#include "block/block.h"
#include "qapi/error.h"
@@ -637,10 +638,6 @@ static bool migration_incoming_setup(QEMUFile *f, Error **errp)
{
MigrationIncomingState *mis = migration_incoming_get_current();
- if (multifd_load_setup(errp) != 0) {
- return false;
- }
-
if (!mis->from_src_file) {
mis->from_src_file = f;
}
@@ -701,10 +698,42 @@ void migration_fd_process_incoming(QEMUFile *f, Error **errp)
void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
{
MigrationIncomingState *mis = migration_incoming_get_current();
+ bool default_channel = true;
+ uint32_t channel_magic = 0;
Error *local_err = NULL;
- bool start_migration;
+ int ret = 0;
- if (!mis->from_src_file) {
+ if (migrate_use_multifd() && !migrate_postcopy_ram() &&
+ qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
+ /*
+ * With multiple channels, it is possible that we receive channels
+ * out of order on destination side, causing incorrect mapping of
+ * source channels on destination side. Check channel MAGIC to
+ * decide type of channel. Please note this is best effort, postcopy
+ * preempt channel does not send any magic number so avoid it for
+ * postcopy live migration. Also tls live migration already does
+ * tls handshake while initializing main channel so with tls this
+ * issue is not possible.
+ */
+ ret = migration_channel_read_peek(ioc, (void *)&channel_magic,
+ sizeof(channel_magic), &local_err);
+
+ if (ret != 0) {
+ error_propagate(errp, local_err);
+ return;
+ }
+
+ default_channel = (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC));
+ } else {
+ default_channel = !mis->from_src_file;
+ }
+
+ if (multifd_load_setup(errp) != 0) {
+ error_setg(errp, "Failed to setup multifd channels");
+ return;
+ }
+
+ if (default_channel) {
/* The first connection (multifd may have multiple) */
QEMUFile *f = qemu_fopen_channel_input(ioc);
@@ -716,23 +745,17 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
if (!migration_incoming_setup(f, errp)) {
return;
}
-
- /*
- * Common migration only needs one channel, so we can start
- * right now. Multifd needs more than one channel, we wait.
- */
- start_migration = !migrate_use_multifd();
} else {
/* Multiple connections */
assert(migrate_use_multifd());
- start_migration = multifd_recv_new_channel(ioc, &local_err);
+ multifd_recv_new_channel(ioc, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
}
- if (start_migration) {
+ if (migration_has_all_channels()) {
migration_incoming_process();
}
}
diff --git a/migration/multifd.c b/migration/multifd.c
index 7c16523e6b..75ac052d2f 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1183,9 +1183,14 @@ int multifd_load_setup(Error **errp)
uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size();
uint8_t i;
- if (!migrate_use_multifd()) {
+ /*
+ * Return successfully if multiFD recv state is already initialised
+ * or multiFD is not enabled.
+ */
+ if (multifd_recv_state || !migrate_use_multifd()) {
return 0;
}
+
if (!migrate_multifd_is_allowed()) {
error_setg(errp, "multifd is not supported by current protocol");
return -1;
@@ -1244,11 +1249,9 @@ bool multifd_recv_all_channels_created(void)
/*
* Try to receive all multifd channels to get ready for the migration.
- * - Return true and do not set @errp when correctly receiving all channels;
- * - Return false and do not set @errp when correctly receiving the current one;
- * - Return false and set @errp when failing to receive the current channel.
+ * Sets @errp when failing to receive the current channel.
*/
-bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
+void multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
{
MultiFDRecvParams *p;
Error *local_err = NULL;
@@ -1261,7 +1264,7 @@ bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
"failed to receive packet"
" via multifd channel %d: ",
qatomic_read(&multifd_recv_state->count));
- return false;
+ return;
}
trace_multifd_recv_new_channel(id);
@@ -1271,7 +1274,7 @@ bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
id);
multifd_recv_terminate_threads(local_err);
error_propagate(errp, local_err);
- return false;
+ return;
}
p->c = ioc;
object_ref(OBJECT(ioc));
@@ -1282,6 +1285,4 @@ bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
QEMU_THREAD_JOINABLE);
qatomic_inc(&multifd_recv_state->count);
- return qatomic_read(&multifd_recv_state->count) ==
- migrate_multifd_channels();
}
diff --git a/migration/multifd.h b/migration/multifd.h
index 11d5e273e6..9c0a2a0701 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -20,7 +20,7 @@ void multifd_save_cleanup(void);
int multifd_load_setup(Error **errp);
int multifd_load_cleanup(Error **errp);
bool multifd_recv_all_channels_created(void);
-bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
+void multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
void multifd_recv_sync_main(void);
int multifd_send_sync_main(QEMUFile *f);
int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset);
--
2.37.3

View File

@ -1,142 +0,0 @@
From 1d280070748b604c60a7be4d4c3c3a28e3964f37 Mon Sep 17 00:00:00 2001
From: Thomas Huth <thuth@redhat.com>
Date: Tue, 2 Aug 2022 10:11:21 +0200
Subject: [PATCH 31/32] multifd: Copy pages before compressing them with zlib
RH-Author: Thomas Huth <thuth@redhat.com>
RH-MergeRequest: 112: Fix postcopy migration on s390x
RH-Commit: [1/2] fd5a0221e22b4563bd1cb7f8a8b95f0bfe8f5fc9 (thuth/qemu-kvm-cs9)
RH-Bugzilla: 2099934
RH-Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Peter Xu <peterx@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2099934
zlib_send_prepare() compresses pages of a running VM. zlib does not
make any thread-safety guarantees with respect to changing deflate()
input concurrently with deflate() [1].
One can observe problems due to this with the IBM zEnterprise Data
Compression accelerator capable zlib [2]. When the hardware
acceleration is enabled, migration/multifd/tcp/plain/zlib test fails
intermittently [3] due to sliding window corruption. The accelerator's
architecture explicitly discourages concurrent accesses [4]:
Page 26-57, "Other Conditions":
As observed by this CPU, other CPUs, and channel
programs, references to the parameter block, first,
second, and third operands may be multiple-access
references, accesses to these storage locations are
not necessarily block-concurrent, and the sequence
of these accesses or references is undefined.
Mark Adler pointed out that vanilla zlib performs double fetches under
certain circumstances as well [5], therefore we need to copy data
before passing it to deflate().
[1] https://zlib.net/manual.html
[2] https://github.com/madler/zlib/pull/410
[3] https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
[4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
[5] https://lists.gnu.org/archive/html/qemu-devel/2022-07/msg00889.html
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20220705203559.2960949-1-iii@linux.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 007e179ef0e97eafda4c9ff2a9d665a1947c7c6d)
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
migration/multifd-zlib.c | 38 ++++++++++++++++++++++++++++++--------
1 file changed, 30 insertions(+), 8 deletions(-)
diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
index 3a7ae44485..18213a9513 100644
--- a/migration/multifd-zlib.c
+++ b/migration/multifd-zlib.c
@@ -27,6 +27,8 @@ struct zlib_data {
uint8_t *zbuff;
/* size of compressed buffer */
uint32_t zbuff_len;
+ /* uncompressed buffer of size qemu_target_page_size() */
+ uint8_t *buf;
};
/* Multifd zlib compression */
@@ -45,26 +47,38 @@ static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
{
struct zlib_data *z = g_new0(struct zlib_data, 1);
z_stream *zs = &z->zs;
+ const char *err_msg;
zs->zalloc = Z_NULL;
zs->zfree = Z_NULL;
zs->opaque = Z_NULL;
if (deflateInit(zs, migrate_multifd_zlib_level()) != Z_OK) {
- g_free(z);
- error_setg(errp, "multifd %u: deflate init failed", p->id);
- return -1;
+ err_msg = "deflate init failed";
+ goto err_free_z;
}
/* This is the maxium size of the compressed buffer */
z->zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
z->zbuff = g_try_malloc(z->zbuff_len);
if (!z->zbuff) {
- deflateEnd(&z->zs);
- g_free(z);
- error_setg(errp, "multifd %u: out of memory for zbuff", p->id);
- return -1;
+ err_msg = "out of memory for zbuff";
+ goto err_deflate_end;
+ }
+ z->buf = g_try_malloc(qemu_target_page_size());
+ if (!z->buf) {
+ err_msg = "out of memory for buf";
+ goto err_free_zbuff;
}
p->data = z;
return 0;
+
+err_free_zbuff:
+ g_free(z->zbuff);
+err_deflate_end:
+ deflateEnd(&z->zs);
+err_free_z:
+ g_free(z);
+ error_setg(errp, "multifd %u: %s", p->id, err_msg);
+ return -1;
}
/**
@@ -82,6 +96,8 @@ static void zlib_send_cleanup(MultiFDSendParams *p, Error **errp)
deflateEnd(&z->zs);
g_free(z->zbuff);
z->zbuff = NULL;
+ g_free(z->buf);
+ z->buf = NULL;
g_free(p->data);
p->data = NULL;
}
@@ -114,8 +130,14 @@ static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
flush = Z_SYNC_FLUSH;
}
+ /*
+ * Since the VM might be running, the page may be changing concurrently
+ * with compression. zlib does not guarantee that this is safe,
+ * therefore copy the page before calling deflate().
+ */
+ memcpy(z->buf, p->pages->block->host + p->normal[i], page_size);
zs->avail_in = page_size;
- zs->next_in = p->pages->block->host + p->normal[i];
+ zs->next_in = z->buf;
zs->avail_out = available;
zs->next_out = z->zbuff + out_size;
--
2.31.1

View File

@ -1,381 +0,0 @@
From 4a9ddf42788d3f924bdad7746f7aca615f03d7c1 Mon Sep 17 00:00:00 2001
From: Eric Blake <eblake@redhat.com>
Date: Wed, 11 May 2022 19:49:24 -0500
Subject: [PATCH 2/2] nbd/server: Allow MULTI_CONN for shared writable exports
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eric Blake <eblake@redhat.com>
RH-MergeRequest: 90: Advertise MULTI_CONN on writeable NBD servers
RH-Commit: [2/2] 53f0e885a5ed7f6e4bb14e74fe8e7957e6afe90f (ebblake/centos-qemu-kvm)
RH-Bugzilla: 1708300
RH-Acked-by: Nir Soffer <nsoffer@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Acked-by: Daniel P. Berrangé <berrange@redhat.com>
According to the NBD spec, a server that advertises
NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
not see any cache inconsistencies: when properly separated by a single
flush, actions performed by one client will be visible to another
client, regardless of which client did the flush.
We always satisfy these conditions in qemu - even when we support
multiple clients, ALL clients go through a single point of reference
into the block layer, with no local caching. The effect of one client
is instantly visible to the next client. Even if our backend were a
network device, we argue that any multi-path caching effects that
would cause inconsistencies in back-to-back actions not seeing the
effect of previous actions would be a bug in that backend, and not the
fault of caching in qemu. As such, it is safe to unconditionally
advertise CAN_MULTI_CONN for any qemu NBD server situation that
supports parallel clients.
Note, however, that we don't want to advertise CAN_MULTI_CONN when we
know that a second client cannot connect (for historical reasons,
qemu-nbd defaults to a single connection while nbd-server-add and QMP
commands default to unlimited connections; but we already have
existing means to let either style of NBD server creation alter those
defaults). This is visible by no longer advertising MULTI_CONN for
'qemu-nbd -r' without -e, as in the iotest nbd-qemu-allocation.
The harder part of this patch is setting up an iotest to demonstrate
behavior of multiple NBD clients to a single server. It might be
possible with parallel qemu-io processes, but I found it easier to do
in python with the help of libnbd, and help from Nir and Vladimir in
writing the test.
Signed-off-by: Eric Blake <eblake@redhat.com>
Suggested-by: Nir Soffer <nsoffer@redhat.com>
Suggested-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
Message-Id: <20220512004924.417153-3-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 58a6fdcc9efb2a7c1ef4893dca4aa5e8020ca3dc)
Conflicts:
nbd/server.c - context, e5fb29d5 not backported
Signed-off-by: Eric Blake <eblake@redhat.com>
---
MAINTAINERS | 1 +
blockdev-nbd.c | 5 +
docs/interop/nbd.txt | 1 +
docs/tools/qemu-nbd.rst | 3 +-
include/block/nbd.h | 3 +-
nbd/server.c | 10 +-
qapi/block-export.json | 8 +-
tests/qemu-iotests/tests/nbd-multiconn | 145 ++++++++++++++++++
tests/qemu-iotests/tests/nbd-multiconn.out | 5 +
.../tests/nbd-qemu-allocation.out | 2 +-
10 files changed, 172 insertions(+), 11 deletions(-)
create mode 100755 tests/qemu-iotests/tests/nbd-multiconn
create mode 100644 tests/qemu-iotests/tests/nbd-multiconn.out
diff --git a/MAINTAINERS b/MAINTAINERS
index 4ad2451e03..2fe20a49ab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3370,6 +3370,7 @@ F: qemu-nbd.*
F: blockdev-nbd.c
F: docs/interop/nbd.txt
F: docs/tools/qemu-nbd.rst
+F: tests/qemu-iotests/tests/*nbd*
T: git https://repo.or.cz/qemu/ericb.git nbd
T: git https://src.openvz.org/scm/~vsementsov/qemu.git nbd
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index add41a23af..c6d9b0324c 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -44,6 +44,11 @@ bool nbd_server_is_running(void)
return nbd_server || qemu_nbd_connections >= 0;
}
+int nbd_server_max_connections(void)
+{
+ return nbd_server ? nbd_server->max_connections : qemu_nbd_connections;
+}
+
static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
{
nbd_client_put(client);
diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
index bdb0f2a41a..f5ca25174a 100644
--- a/docs/interop/nbd.txt
+++ b/docs/interop/nbd.txt
@@ -68,3 +68,4 @@ NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
* 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports,
NBD_CMD_FLAG_FAST_ZERO
* 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
+* 7.1: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports
diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
index 4c950f6199..8e08a29e89 100644
--- a/docs/tools/qemu-nbd.rst
+++ b/docs/tools/qemu-nbd.rst
@@ -139,8 +139,7 @@ driver options if :option:`--image-opts` is specified.
.. option:: -e, --shared=NUM
Allow up to *NUM* clients to share the device (default
- ``1``), 0 for unlimited. Safe for readers, but for now,
- consistency is not guaranteed between multiple writers.
+ ``1``), 0 for unlimited.
.. option:: -t, --persistent
diff --git a/include/block/nbd.h b/include/block/nbd.h
index c5a29ce1c6..c74b7a9d2e 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2016-2020 Red Hat, Inc.
+ * Copyright (C) 2016-2022 Red Hat, Inc.
* Copyright (C) 2005 Anthony Liguori <anthony@codemonkey.ws>
*
* Network Block Device
@@ -346,6 +346,7 @@ void nbd_client_put(NBDClient *client);
void nbd_server_is_qemu_nbd(int max_connections);
bool nbd_server_is_running(void);
+int nbd_server_max_connections(void);
void nbd_server_start(SocketAddress *addr, const char *tls_creds,
const char *tls_authz, uint32_t max_connections,
Error **errp);
diff --git a/nbd/server.c b/nbd/server.c
index c5644fd3f6..6e2157acfa 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2016-2021 Red Hat, Inc.
+ * Copyright (C) 2016-2022 Red Hat, Inc.
* Copyright (C) 2005 Anthony Liguori <anthony@codemonkey.ws>
*
* Network Block Device Server Side
@@ -1642,7 +1642,6 @@ static int nbd_export_create(BlockExport *blk_exp, BlockExportOptions *exp_args,
int64_t size;
uint64_t perm, shared_perm;
bool readonly = !exp_args->writable;
- bool shared = !exp_args->writable;
strList *bitmaps;
size_t i;
int ret;
@@ -1693,11 +1692,12 @@ static int nbd_export_create(BlockExport *blk_exp, BlockExportOptions *exp_args,
exp->description = g_strdup(arg->description);
exp->nbdflags = (NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_FLUSH |
NBD_FLAG_SEND_FUA | NBD_FLAG_SEND_CACHE);
+
+ if (nbd_server_max_connections() != 1) {
+ exp->nbdflags |= NBD_FLAG_CAN_MULTI_CONN;
+ }
if (readonly) {
exp->nbdflags |= NBD_FLAG_READ_ONLY;
- if (shared) {
- exp->nbdflags |= NBD_FLAG_CAN_MULTI_CONN;
- }
} else {
exp->nbdflags |= (NBD_FLAG_SEND_TRIM | NBD_FLAG_SEND_WRITE_ZEROES |
NBD_FLAG_SEND_FAST_ZERO);
diff --git a/qapi/block-export.json b/qapi/block-export.json
index 1e34927f85..755ccc89b1 100644
--- a/qapi/block-export.json
+++ b/qapi/block-export.json
@@ -21,7 +21,9 @@
# recreated on the fly while the NBD server is active.
# If missing, it will default to denying access (since 4.0).
# @max-connections: The maximum number of connections to allow at the same
-# time, 0 for unlimited. (since 5.2; default: 0)
+# time, 0 for unlimited. Setting this to 1 also stops
+# the server from advertising multiple client support
+# (since 5.2; default: 0)
#
# Since: 4.2
##
@@ -50,7 +52,9 @@
# recreated on the fly while the NBD server is active.
# If missing, it will default to denying access (since 4.0).
# @max-connections: The maximum number of connections to allow at the same
-# time, 0 for unlimited. (since 5.2; default: 0)
+# time, 0 for unlimited. Setting this to 1 also stops
+# the server from advertising multiple client support
+# (since 5.2; default: 0).
#
# Returns: error if the server is already running.
#
diff --git a/tests/qemu-iotests/tests/nbd-multiconn b/tests/qemu-iotests/tests/nbd-multiconn
new file mode 100755
index 0000000000..b121f2e363
--- /dev/null
+++ b/tests/qemu-iotests/tests/nbd-multiconn
@@ -0,0 +1,145 @@
+#!/usr/bin/env python3
+# group: rw auto quick
+#
+# Test cases for NBD multi-conn advertisement
+#
+# Copyright (C) 2022 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+import os
+from contextlib import contextmanager
+import iotests
+from iotests import qemu_img_create, qemu_io
+
+
+disk = os.path.join(iotests.test_dir, 'disk')
+size = '4M'
+nbd_sock = os.path.join(iotests.sock_dir, 'nbd_sock')
+nbd_uri = 'nbd+unix:///{}?socket=' + nbd_sock
+
+
+@contextmanager
+def open_nbd(export_name):
+ h = nbd.NBD()
+ try:
+ h.connect_uri(nbd_uri.format(export_name))
+ yield h
+ finally:
+ h.shutdown()
+
+class TestNbdMulticonn(iotests.QMPTestCase):
+ def setUp(self):
+ qemu_img_create('-f', iotests.imgfmt, disk, size)
+ qemu_io('-c', 'w -P 1 0 2M', '-c', 'w -P 2 2M 2M', disk)
+
+ self.vm = iotests.VM()
+ self.vm.launch()
+ result = self.vm.qmp('blockdev-add', {
+ 'driver': 'qcow2',
+ 'node-name': 'n',
+ 'file': {'driver': 'file', 'filename': disk}
+ })
+ self.assert_qmp(result, 'return', {})
+
+ def tearDown(self):
+ self.vm.shutdown()
+ os.remove(disk)
+ try:
+ os.remove(nbd_sock)
+ except OSError:
+ pass
+
+ @contextmanager
+ def run_server(self, max_connections=None):
+ args = {
+ 'addr': {
+ 'type': 'unix',
+ 'data': {'path': nbd_sock}
+ }
+ }
+ if max_connections is not None:
+ args['max-connections'] = max_connections
+
+ result = self.vm.qmp('nbd-server-start', args)
+ self.assert_qmp(result, 'return', {})
+ yield
+
+ result = self.vm.qmp('nbd-server-stop')
+ self.assert_qmp(result, 'return', {})
+
+ def add_export(self, name, writable=None):
+ args = {
+ 'type': 'nbd',
+ 'id': name,
+ 'node-name': 'n',
+ 'name': name,
+ }
+ if writable is not None:
+ args['writable'] = writable
+
+ result = self.vm.qmp('block-export-add', args)
+ self.assert_qmp(result, 'return', {})
+
+ def test_default_settings(self):
+ with self.run_server():
+ self.add_export('r')
+ self.add_export('w', writable=True)
+ with open_nbd('r') as h:
+ self.assertTrue(h.can_multi_conn())
+ with open_nbd('w') as h:
+ self.assertTrue(h.can_multi_conn())
+
+ def test_limited_connections(self):
+ with self.run_server(max_connections=1):
+ self.add_export('r')
+ self.add_export('w', writable=True)
+ with open_nbd('r') as h:
+ self.assertFalse(h.can_multi_conn())
+ with open_nbd('w') as h:
+ self.assertFalse(h.can_multi_conn())
+
+ def test_parallel_writes(self):
+ with self.run_server():
+ self.add_export('w', writable=True)
+
+ clients = [nbd.NBD() for _ in range(3)]
+ for c in clients:
+ c.connect_uri(nbd_uri.format('w'))
+ self.assertTrue(c.can_multi_conn())
+
+ initial_data = clients[0].pread(1024 * 1024, 0)
+ self.assertEqual(initial_data, b'\x01' * 1024 * 1024)
+
+ updated_data = b'\x03' * 1024 * 1024
+ clients[1].pwrite(updated_data, 0)
+ clients[2].flush()
+ current_data = clients[0].pread(1024 * 1024, 0)
+
+ self.assertEqual(updated_data, current_data)
+
+ for i in range(3):
+ clients[i].shutdown()
+
+
+if __name__ == '__main__':
+ try:
+ # Easier to use libnbd than to try and set up parallel
+ # 'qemu-nbd --list' or 'qemu-io' processes, but not all systems
+ # have libnbd installed.
+ import nbd # type: ignore
+
+ iotests.main(supported_fmts=['qcow2'])
+ except ImportError:
+ iotests.notrun('libnbd not installed')
diff --git a/tests/qemu-iotests/tests/nbd-multiconn.out b/tests/qemu-iotests/tests/nbd-multiconn.out
new file mode 100644
index 0000000000..8d7e996700
--- /dev/null
+++ b/tests/qemu-iotests/tests/nbd-multiconn.out
@@ -0,0 +1,5 @@
+...
+----------------------------------------------------------------------
+Ran 3 tests
+
+OK
diff --git a/tests/qemu-iotests/tests/nbd-qemu-allocation.out b/tests/qemu-iotests/tests/nbd-qemu-allocation.out
index 0bf1abb063..9d938db24e 100644
--- a/tests/qemu-iotests/tests/nbd-qemu-allocation.out
+++ b/tests/qemu-iotests/tests/nbd-qemu-allocation.out
@@ -17,7 +17,7 @@ wrote 2097152/2097152 bytes at offset 1048576
exports available: 1
export: ''
size: 4194304
- flags: 0x58f ( readonly flush fua df multi cache )
+ flags: 0x48f ( readonly flush fua df cache )
min block: 1
opt block: 4096
max block: 33554432
--
2.31.1

View File

@ -1,78 +0,0 @@
From 56674ee1f25f12978a6a8a1390e11b55b3e0fabe Mon Sep 17 00:00:00 2001
From: Thomas Huth <thuth@redhat.com>
Date: Fri, 8 Jul 2022 20:49:01 +0200
Subject: [PATCH 15/17] pc-bios/s390-ccw/netboot.mak: Ignore Clang's warnings
about GNU extensions
RH-Author: Thomas Huth <thuth@redhat.com>
RH-MergeRequest: 106: pc-bios/s390-ccw: Fix boot from disks with 4k sectors that do not have the typical DASD geometry
RH-Commit: [10/10] 037dab4df23ebb2b42871bca8c842a53a7204b50 (thuth/qemu-kvm-cs9)
RH-Bugzilla: 2098077
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
Bugzilla: http://bugzilla.redhat.com/2098077
commit e2269220acb03e6c6a460c3090d804835e202239
Author: Thomas Huth <thuth@redhat.com>
Date: Mon Jul 4 13:19:03 2022 +0200
pc-bios/s390-ccw/netboot.mak: Ignore Clang's warnings about GNU extensions
When compiling the s390-ccw bios with Clang (v14.0), there is currently
an unuseful warning like this:
CC pc-bios/s390-ccw/ipv6.o
../../roms/SLOF/lib/libnet/ipv6.c:447:18: warning: variable length array
folded to constant array as an extension [-Wgnu-folding-constant]
unsigned short raw[ip6size];
^
SLOF is currently GCC-only and cannot be compiled with Clang yet, so
it is expected that such extensions sneak in there - and as long as
we don't want to compile the code with a compiler that is neither GCC
or Clang, it is also not necessary to avoid such extensions.
Thus these GNU-extension related warnings are completely useless in
the s390-ccw bios, especially in the code that is coming from SLOF,
so we should simply disable the related warnings here now.
Message-Id: <20220704111903.62400-13-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
pc-bios/s390-ccw/netboot.mak | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/pc-bios/s390-ccw/netboot.mak b/pc-bios/s390-ccw/netboot.mak
index 68b4d7edcb..ad41898cb6 100644
--- a/pc-bios/s390-ccw/netboot.mak
+++ b/pc-bios/s390-ccw/netboot.mak
@@ -16,9 +16,12 @@ s390-netboot.elf: $(NETOBJS) libnet.a libc.a
s390-netboot.img: s390-netboot.elf
$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"STRIP","$(TARGET_DIR)$@")
+# SLOF is GCC-only, so ignore warnings about GNU extensions with Clang here
+NO_GNU_WARN := $(call cc-option,-Werror $(QEMU_CFLAGS),-Wno-gnu)
+
# libc files:
-LIBC_CFLAGS = $(QEMU_CFLAGS) $(CFLAGS) $(LIBC_INC) $(LIBNET_INC) \
+LIBC_CFLAGS = $(QEMU_CFLAGS) $(CFLAGS) $(NO_GNU_WARN) $(LIBC_INC) $(LIBNET_INC) \
-MMD -MP -MT $@ -MF $(@:%.o=%.d)
CTYPE_OBJS = isdigit.o isxdigit.o toupper.o
@@ -52,7 +55,7 @@ libc.a: $(LIBCOBJS)
LIBNETOBJS := args.o dhcp.o dns.o icmpv6.o ipv6.o tcp.o udp.o bootp.o \
dhcpv6.o ethernet.o ipv4.o ndp.o tftp.o pxelinux.o
-LIBNETCFLAGS = $(QEMU_CFLAGS) $(CFLAGS) $(LIBC_INC) $(LIBNET_INC) \
+LIBNETCFLAGS = $(QEMU_CFLAGS) $(CFLAGS) $(NO_GNU_WARN) $(LIBC_INC) $(LIBNET_INC) \
-DDHCPARCH=0x1F -MMD -MP -MT $@ -MF $(@:%.o=%.d)
%.o : $(SLOF_DIR)/lib/libnet/%.c
--
2.31.1

View File

@ -0,0 +1,55 @@
From 01c09f31978154f0d2fd699621ae958a8c3ea2a5 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:15:24 -0500
Subject: [PATCH 08/13] physmem: add missing memory barrier
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [8/10] f6a9659f7cf40b78de6e85e4a7c06842273aa770
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 33828ca11da08436e1b32f3e79dabce3061a0427
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Fri Mar 3 14:36:32 2023 +0100
physmem: add missing memory barrier
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
softmmu/physmem.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 4d0ef5f92f..2b96fad302 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -3087,6 +3087,8 @@ void cpu_register_map_client(QEMUBH *bh)
qemu_mutex_lock(&map_client_list_lock);
client->bh = bh;
QLIST_INSERT_HEAD(&map_client_list, client, link);
+ /* Write map_client_list before reading in_use. */
+ smp_mb();
if (!qatomic_read(&bounce.in_use)) {
cpu_notify_map_clients_locked();
}
@@ -3279,6 +3281,7 @@ void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
qemu_vfree(bounce.buffer);
bounce.buffer = NULL;
memory_region_unref(bounce.mr);
+ /* Clear in_use before reading map_client_list. */
qatomic_mb_set(&bounce.in_use, false);
cpu_notify_map_clients();
}
--
2.37.3

View File

@ -1,126 +0,0 @@
From e97c563f7146098119839aa146a6f25070eb7148 Mon Sep 17 00:00:00 2001
From: Gavin Shan <gshan@redhat.com>
Date: Wed, 11 May 2022 18:01:02 +0800
Subject: [PATCH 01/16] qapi/machine.json: Add cluster-id
RH-Author: Gavin Shan <gshan@redhat.com>
RH-MergeRequest: 86: hw/arm/virt: Fix the default CPU topology
RH-Commit: [1/6] 44d7d83008c6d28485ae44f7cced792f4987b919 (gwshan/qemu-rhel-9)
RH-Bugzilla: 2041823
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2041823
This adds cluster-id in CPU instance properties, which will be used
by arm/virt machine. Besides, the cluster-id is also verified or
dumped in various spots:
* hw/core/machine.c::machine_set_cpu_numa_node() to associate
CPU with its NUMA node.
* hw/core/machine.c::machine_numa_finish_cpu_init() to record
CPU slots with no NUMA mapping set.
* hw/core/machine-hmp-cmds.c::hmp_hotpluggable_cpus() to dump
cluster-id.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20220503140304.855514-2-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 1dcf7001d4bae651129d46d5628b29e93a411d0b)
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
hw/core/machine-hmp-cmds.c | 4 ++++
hw/core/machine.c | 16 ++++++++++++++++
qapi/machine.json | 6 ++++--
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/hw/core/machine-hmp-cmds.c b/hw/core/machine-hmp-cmds.c
index 4e2f319aeb..5cb5eecbfc 100644
--- a/hw/core/machine-hmp-cmds.c
+++ b/hw/core/machine-hmp-cmds.c
@@ -77,6 +77,10 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict)
if (c->has_die_id) {
monitor_printf(mon, " die-id: \"%" PRIu64 "\"\n", c->die_id);
}
+ if (c->has_cluster_id) {
+ monitor_printf(mon, " cluster-id: \"%" PRIu64 "\"\n",
+ c->cluster_id);
+ }
if (c->has_core_id) {
monitor_printf(mon, " core-id: \"%" PRIu64 "\"\n", c->core_id);
}
diff --git a/hw/core/machine.c b/hw/core/machine.c
index dffc3ef4ab..168f4de910 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -890,6 +890,11 @@ void machine_set_cpu_numa_node(MachineState *machine,
return;
}
+ if (props->has_cluster_id && !slot->props.has_cluster_id) {
+ error_setg(errp, "cluster-id is not supported");
+ return;
+ }
+
if (props->has_socket_id && !slot->props.has_socket_id) {
error_setg(errp, "socket-id is not supported");
return;
@@ -909,6 +914,11 @@ void machine_set_cpu_numa_node(MachineState *machine,
continue;
}
+ if (props->has_cluster_id &&
+ props->cluster_id != slot->props.cluster_id) {
+ continue;
+ }
+
if (props->has_die_id && props->die_id != slot->props.die_id) {
continue;
}
@@ -1203,6 +1213,12 @@ static char *cpu_slot_to_string(const CPUArchId *cpu)
}
g_string_append_printf(s, "die-id: %"PRId64, cpu->props.die_id);
}
+ if (cpu->props.has_cluster_id) {
+ if (s->len) {
+ g_string_append_printf(s, ", ");
+ }
+ g_string_append_printf(s, "cluster-id: %"PRId64, cpu->props.cluster_id);
+ }
if (cpu->props.has_core_id) {
if (s->len) {
g_string_append_printf(s, ", ");
diff --git a/qapi/machine.json b/qapi/machine.json
index d25a481ce4..4c417e32a5 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -868,10 +868,11 @@
# @node-id: NUMA node ID the CPU belongs to
# @socket-id: socket number within node/board the CPU belongs to
# @die-id: die number within socket the CPU belongs to (since 4.1)
-# @core-id: core number within die the CPU belongs to
+# @cluster-id: cluster number within die the CPU belongs to (since 7.1)
+# @core-id: core number within cluster the CPU belongs to
# @thread-id: thread number within core the CPU belongs to
#
-# Note: currently there are 5 properties that could be present
+# Note: currently there are 6 properties that could be present
# but management should be prepared to pass through other
# properties with device_add command to allow for future
# interface extension. This also requires the filed names to be kept in
@@ -883,6 +884,7 @@
'data': { '*node-id': 'int',
'*socket-id': 'int',
'*die-id': 'int',
+ '*cluster-id': 'int',
'*core-id': 'int',
'*thread-id': 'int'
}
--
2.31.1

View File

@ -0,0 +1,177 @@
From e7d0e29d1962092af58d0445439671a6e1d91f71 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:10:33 -0500
Subject: [PATCH 02/13] qatomic: add smp_mb__before/after_rmw()
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [2/10] 1f87eb3157abcf23f020881cedce42f76497f348
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit ff00bed1897c3d27adc5b0cec6f6eeb5a7d13176
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu Mar 2 11:10:56 2023 +0100
qatomic: add smp_mb__before/after_rmw()
On ARM, seqcst loads and stores (which QEMU does not use) are compiled
respectively as LDAR and STLR instructions. Even though LDAR is
also used for load-acquire operations, it also waits for all STLRs to
leave the store buffer. Thus, LDAR and STLR alone are load-acquire
and store-release operations, but LDAR also provides store-against-load
ordering as long as the previous store is a STLR.
Compare this to ARMv7, where store-release is DMB+STR and load-acquire
is LDR+DMB, but an additional DMB is needed between store-seqcst and
load-seqcst (e.g. DMB+STR+DMB+LDR+DMB); or with x86, where MOV provides
load-acquire and store-release semantics and the two can be reordered.
Likewise, on ARM sequentially consistent read-modify-write operations only
need to use LDAXR and STLXR respectively for the load and the store, while
on x86 they need to use the stronger LOCK prefix.
In a strange twist of events, however, the _stronger_ semantics
of the ARM instructions can end up causing bugs on ARM, not on x86.
The problems occur when seqcst atomics are mixed with relaxed atomics.
QEMU's atomics try to bridge the Linux API (that most of the developers
are familiar with) and the C11 API, and the two have a substantial
difference:
- in Linux, strongly-ordered atomics such as atomic_add_return() affect
the global ordering of _all_ memory operations, including for example
READ_ONCE()/WRITE_ONCE()
- in C11, sequentially consistent atomics (except for seq-cst fences)
only affect the ordering of sequentially consistent operations.
In particular, since relaxed loads are done with LDR on ARM, they are
not ordered against seqcst stores (which are done with STLR).
QEMU implements high-level synchronization primitives with the idea that
the primitives contain the necessary memory barriers, and the callers can
use relaxed atomics (qatomic_read/qatomic_set) or even regular accesses.
This is very much incompatible with the C11 view that seqcst accesses
are only ordered against other seqcst accesses, and requires using seqcst
fences as in the following example:
qatomic_set(&y, 1); qatomic_set(&x, 1);
smp_mb(); smp_mb();
... qatomic_read(&x) ... ... qatomic_read(&y) ...
When a qatomic_*() read-modify write operation is used instead of one
or both stores, developers that are more familiar with the Linux API may
be tempted to omit the smp_mb(), which will work on x86 but not on ARM.
This nasty difference between Linux and C11 read-modify-write operations
has already caused issues in util/async.c and more are being found.
Provide something similar to Linux smp_mb__before/after_atomic(); this
has the double function of documenting clearly why there is a memory
barrier, and avoiding a double barrier on x86 and s390x systems.
The new macro can already be put to use in qatomic_mb_set().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
docs/devel/atomics.rst | 26 +++++++++++++++++++++-----
include/qemu/atomic.h | 17 ++++++++++++++++-
2 files changed, 37 insertions(+), 6 deletions(-)
diff --git a/docs/devel/atomics.rst b/docs/devel/atomics.rst
index 52baa0736d..10fbfc58bb 100644
--- a/docs/devel/atomics.rst
+++ b/docs/devel/atomics.rst
@@ -25,7 +25,8 @@ provides macros that fall in three camps:
- weak atomic access and manual memory barriers: ``qatomic_read()``,
``qatomic_set()``, ``smp_rmb()``, ``smp_wmb()``, ``smp_mb()``,
- ``smp_mb_acquire()``, ``smp_mb_release()``, ``smp_read_barrier_depends()``;
+ ``smp_mb_acquire()``, ``smp_mb_release()``, ``smp_read_barrier_depends()``,
+ ``smp_mb__before_rmw()``, ``smp_mb__after_rmw()``;
- sequentially consistent atomic access: everything else.
@@ -470,7 +471,7 @@ and memory barriers, and the equivalents in QEMU:
sequential consistency.
- in QEMU, ``qatomic_read()`` and ``qatomic_set()`` do not participate in
- the total ordering enforced by sequentially-consistent operations.
+ the ordering enforced by read-modify-write operations.
This is because QEMU uses the C11 memory model. The following example
is correct in Linux but not in QEMU:
@@ -486,9 +487,24 @@ and memory barriers, and the equivalents in QEMU:
because the read of ``y`` can be moved (by either the processor or the
compiler) before the write of ``x``.
- Fixing this requires an ``smp_mb()`` memory barrier between the write
- of ``x`` and the read of ``y``. In the common case where only one thread
- writes ``x``, it is also possible to write it like this:
+ Fixing this requires a full memory barrier between the write of ``x`` and
+ the read of ``y``. QEMU provides ``smp_mb__before_rmw()`` and
+ ``smp_mb__after_rmw()``; they act both as an optimization,
+ avoiding the memory barrier on processors where it is unnecessary,
+ and as a clarification of this corner case of the C11 memory model:
+
+ +--------------------------------+
+ | QEMU (correct) |
+ +================================+
+ | :: |
+ | |
+ | a = qatomic_fetch_add(&x, 2);|
+ | smp_mb__after_rmw(); |
+ | b = qatomic_read(&y); |
+ +--------------------------------+
+
+ In the common case where only one thread writes ``x``, it is also possible
+ to write it like this:
+--------------------------------+
| QEMU (correct) |
diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
index 112a29910b..7855443cab 100644
--- a/include/qemu/atomic.h
+++ b/include/qemu/atomic.h
@@ -243,6 +243,20 @@
#define smp_wmb() smp_mb_release()
#define smp_rmb() smp_mb_acquire()
+/*
+ * SEQ_CST is weaker than the older __sync_* builtins and Linux
+ * kernel read-modify-write atomics. Provide a macro to obtain
+ * the same semantics.
+ */
+#if !defined(QEMU_SANITIZE_THREAD) && \
+ (defined(__i386__) || defined(__x86_64__) || defined(__s390x__))
+# define smp_mb__before_rmw() signal_barrier()
+# define smp_mb__after_rmw() signal_barrier()
+#else
+# define smp_mb__before_rmw() smp_mb()
+# define smp_mb__after_rmw() smp_mb()
+#endif
+
/* qatomic_mb_read/set semantics map Java volatile variables. They are
* less expensive on some platforms (notably POWER) than fully
* sequentially consistent operations.
@@ -257,7 +271,8 @@
#if !defined(__SANITIZE_THREAD__) && \
(defined(__i386__) || defined(__x86_64__) || defined(__s390x__))
/* This is more efficient than a store plus a fence. */
-# define qatomic_mb_set(ptr, i) ((void)qatomic_xchg(ptr, i))
+# define qatomic_mb_set(ptr, i) \
+ ({ (void)qatomic_xchg(ptr, i); smp_mb__after_rmw(); })
#else
# define qatomic_mb_set(ptr, i) \
({ qatomic_store_release(ptr, i); smp_mb(); })
--
2.37.3

View File

@ -0,0 +1,67 @@
From 06c73c4b57dd1f47f819d719a63eb39fbe799304 Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Thu, 12 Jan 2023 20:14:51 +0100
Subject: [PATCH 1/4] qcow2: Fix theoretical corruption in store_bitmap() error
path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 251: qemu-img: Fix exit code for errors closing the image
RH-Bugzilla: 2147617
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Commit: [1/4] d0a26bed7b16db41e7baee1f8f2b3ae54e52dd52
In order to write the bitmap table to the image file, it is converted to
big endian. If the write fails, it is passed to clear_bitmap_table() to
free all of the clusters it had allocated before. However, if we don't
convert it back to native endianness first, we'll free things at a wrong
offset.
In practical terms, the offsets will be so high that we won't actually
free any allocated clusters, but just run into an error, but in theory
this can cause image corruption.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230112191454.169353-2-kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b03dd9613bcf8fe948581b2b3585510cb525c382)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2-bitmap.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 8fb4731551..869069415c 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -115,7 +115,7 @@ static int update_header_sync(BlockDriverState *bs)
return bdrv_flush(bs->file->bs);
}
-static inline void bitmap_table_to_be(uint64_t *bitmap_table, size_t size)
+static inline void bitmap_table_bswap_be(uint64_t *bitmap_table, size_t size)
{
size_t i;
@@ -1401,9 +1401,10 @@ static int store_bitmap(BlockDriverState *bs, Qcow2Bitmap *bm, Error **errp)
goto fail;
}
- bitmap_table_to_be(tb, tb_size);
+ bitmap_table_bswap_be(tb, tb_size);
ret = bdrv_pwrite(bs->file, tb_offset, tb, tb_size * sizeof(tb[0]));
if (ret < 0) {
+ bitmap_table_bswap_be(tb, tb_size);
error_setg_errno(errp, -ret, "Failed to write bitmap '%s' to file",
bm_name);
goto fail;
--
2.37.3

View File

@ -0,0 +1,75 @@
From 2f03293910f3ac559f37d45c95325ae29638003a Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:15:14 -0500
Subject: [PATCH 07/13] qemu-coroutine-lock: add smp_mb__after_rmw()
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [7/10] 9cf1b6d3b0dd154489e75ad54a3000ea58983960
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit e3a3b6ec8169eab2feb241b4982585001512cd55
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Fri Mar 3 10:52:59 2023 +0100
qemu-coroutine-lock: add smp_mb__after_rmw()
mutex->from_push and mutex->handoff in qemu-coroutine-lock implement
the familiar pattern:
write a write b
smp_mb() smp_mb()
read b read a
The memory barrier is required by the C memory model even after a
SEQ_CST read-modify-write operation such as QSLIST_INSERT_HEAD_ATOMIC.
Add it and avoid the unclear qatomic_mb_read() operation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
util/qemu-coroutine-lock.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 2669403839..a03ed0e664 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -206,10 +206,16 @@ static void coroutine_fn qemu_co_mutex_lock_slowpath(AioContext *ctx,
trace_qemu_co_mutex_lock_entry(mutex, self);
push_waiter(mutex, &w);
+ /*
+ * Add waiter before reading mutex->handoff. Pairs with qatomic_mb_set
+ * in qemu_co_mutex_unlock.
+ */
+ smp_mb__after_rmw();
+
/* This is the "Responsibility Hand-Off" protocol; a lock() picks from
* a concurrent unlock() the responsibility of waking somebody up.
*/
- old_handoff = qatomic_mb_read(&mutex->handoff);
+ old_handoff = qatomic_read(&mutex->handoff);
if (old_handoff &&
has_waiters(mutex) &&
qatomic_cmpxchg(&mutex->handoff, old_handoff, 0) == old_handoff) {
@@ -308,6 +314,7 @@ void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex)
}
our_handoff = mutex->sequence;
+ /* Set handoff before checking for waiters. */
qatomic_mb_set(&mutex->handoff, our_handoff);
if (!has_waiters(mutex)) {
/* The concurrent lock has not added itself yet, so it
--
2.37.3

View File

@ -0,0 +1,70 @@
From 648193b48d8aeaded90fd657e3610d8040f505fc Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Thu, 12 Jan 2023 20:14:53 +0100
Subject: [PATCH 3/4] qemu-img bitmap: Report errors while closing the image
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 251: qemu-img: Fix exit code for errors closing the image
RH-Bugzilla: 2147617
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Commit: [3/4] 8e13e09564718a0badd03af84f036246a46a0eba
blk_unref() can't report any errors that happen while closing the image.
For example, if qcow2 hits an -ENOSPC error while writing out dirty
bitmaps when it's closed, it prints error messages to stderr, but
'qemu-img bitmap' won't see any error return value and will therefore
look successful with exit code 0.
In order to fix this, manually inactivate the image first before calling
blk_unref(). This already performs the operations that would be most
likely to fail while closing the image, but it can still return errors.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1330
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230112191454.169353-4-kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c5e477110dcb8ef4642dce399777c3dee68fa96c)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
qemu-img.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/qemu-img.c b/qemu-img.c
index 18833f7d69..7d035c0c7f 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4622,6 +4622,7 @@ static int img_bitmap(int argc, char **argv)
QSIMPLEQ_HEAD(, ImgBitmapAction) actions;
ImgBitmapAction *act, *act_next;
const char *op;
+ int inactivate_ret;
QSIMPLEQ_INIT(&actions);
@@ -4806,6 +4807,16 @@ static int img_bitmap(int argc, char **argv)
ret = 0;
out:
+ /*
+ * Manually inactivate the images first because this way we can know whether
+ * an error occurred. blk_unref() doesn't tell us about failures.
+ */
+ inactivate_ret = bdrv_inactivate_all();
+ if (inactivate_ret < 0) {
+ error_report("Error while closing the image: %s", strerror(-inactivate_ret));
+ ret = 1;
+ }
+
blk_unref(src);
blk_unref(blk);
qemu_opts_del(opts);
--
2.37.3

View File

@ -0,0 +1,67 @@
From 2396df7fe527567e8e78761ef24ea1057ef6fa48 Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Thu, 12 Jan 2023 20:14:52 +0100
Subject: [PATCH 2/4] qemu-img commit: Report errors while closing the image
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 251: qemu-img: Fix exit code for errors closing the image
RH-Bugzilla: 2147617
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Commit: [2/4] 28f95bf76d1d63e2b0bed0c2ba5206bd3e5ea4f8
blk_unref() can't report any errors that happen while closing the image.
For example, if qcow2 hits an -ENOSPC error while writing out dirty
bitmaps when it's closed, it prints error messages to stderr, but
'qemu-img commit' won't see any error return value and will therefore
look successful with exit code 0.
In order to fix this, manually inactivate the image first before calling
blk_unref(). This already performs the operations that would be most
likely to fail while closing the image, but it can still return errors.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230112191454.169353-3-kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 44efba2d713aca076c411594d0c1a2b99155eeb3)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
qemu-img.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/qemu-img.c b/qemu-img.c
index f036a1d428..18833f7d69 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -443,6 +443,11 @@ static BlockBackend *img_open(bool image_opts,
blk = img_open_file(filename, NULL, fmt, flags, writethrough, quiet,
force_share);
}
+
+ if (blk) {
+ blk_set_force_allow_inactivate(blk);
+ }
+
return blk;
}
@@ -1110,6 +1115,14 @@ unref_backing:
done:
qemu_progress_end();
+ /*
+ * Manually inactivate the image first because this way we can know whether
+ * an error occurred. blk_unref() doesn't tell us about failures.
+ */
+ ret = bdrv_inactivate_all();
+ if (ret < 0 && !local_err) {
+ error_setg_errno(&local_err, -ret, "Error while closing the image");
+ }
blk_unref(blk);
if (local_err) {
--
2.37.3

View File

@ -0,0 +1,166 @@
From 7c6faae20638f58681df223e0ca44e0a6cb60d2d Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Thu, 12 Jan 2023 20:14:54 +0100
Subject: [PATCH 4/4] qemu-iotests: Test qemu-img bitmap/commit exit code on
error
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 251: qemu-img: Fix exit code for errors closing the image
RH-Bugzilla: 2147617
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Commit: [4/4] fb2f9de98ddd2ee1d745119e4f15272ef44e0aae
This tests that when an error happens while writing back bitmaps to the
image file in qcow2_inactivate(), 'qemu-img bitmap/commit' actually
return an error value in their exit code instead of making the operation
look successful to scripts.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230112191454.169353-5-kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 07a4e1f8e5418f36424cd57d5d061b090a238c65)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
.../qemu-iotests/tests/qemu-img-close-errors | 96 +++++++++++++++++++
.../tests/qemu-img-close-errors.out | 23 +++++
2 files changed, 119 insertions(+)
create mode 100755 tests/qemu-iotests/tests/qemu-img-close-errors
create mode 100644 tests/qemu-iotests/tests/qemu-img-close-errors.out
diff --git a/tests/qemu-iotests/tests/qemu-img-close-errors b/tests/qemu-iotests/tests/qemu-img-close-errors
new file mode 100755
index 0000000000..50bfb6cfa2
--- /dev/null
+++ b/tests/qemu-iotests/tests/qemu-img-close-errors
@@ -0,0 +1,96 @@
+#!/usr/bin/env bash
+# group: rw auto quick
+#
+# Check that errors while closing the image, in particular writing back dirty
+# bitmaps, is correctly reported with a failing qemu-img exit code.
+#
+# Copyright (C) 2023 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=kwolf@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+status=1 # failure is the default!
+
+_cleanup()
+{
+ _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+cd ..
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+size=1G
+
+# The error we are going to use is ENOSPC. Depending on how many bitmaps we
+# create in the backing file (and therefore increase the used up space), we get
+# failures in different places. With a low number, only merging the bitmap
+# fails, whereas with a higher number, already 'qemu-img commit' fails.
+for max_bitmap in 6 7; do
+ echo
+ echo "=== Test with $max_bitmap bitmaps ==="
+
+ TEST_IMG="$TEST_IMG.base" _make_test_img -q $size
+ for i in $(seq 1 $max_bitmap); do
+ $QEMU_IMG bitmap --add "$TEST_IMG.base" "stale-bitmap-$i"
+ done
+
+ # Simulate a block device of 128 MB by resizing the image file accordingly
+ # and then enforcing the size with the raw driver
+ $QEMU_IO -f raw -c "truncate 128M" "$TEST_IMG.base"
+ BASE_JSON='json:{
+ "driver": "qcow2",
+ "file": {
+ "driver": "raw",
+ "size": 134217728,
+ "file": {
+ "driver": "file",
+ "filename":"'"$TEST_IMG.base"'"
+ }
+ }
+ }'
+
+ _make_test_img -q -b "$BASE_JSON" -F $IMGFMT
+ $QEMU_IMG bitmap --add "$TEST_IMG" "good-bitmap"
+
+ $QEMU_IO -c 'write 0 126m' "$TEST_IMG" | _filter_qemu_io
+
+ $QEMU_IMG commit -d "$TEST_IMG" 2>&1 | _filter_generated_node_ids
+ echo "qemu-img commit exit code: ${PIPESTATUS[0]}"
+
+ $QEMU_IMG bitmap --add "$BASE_JSON" "good-bitmap"
+ echo "qemu-img bitmap --add exit code: $?"
+
+ $QEMU_IMG bitmap --merge "good-bitmap" -b "$TEST_IMG" "$BASE_JSON" \
+ "good-bitmap" 2>&1 | _filter_generated_node_ids
+ echo "qemu-img bitmap --merge exit code: ${PIPESTATUS[0]}"
+done
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
+
diff --git a/tests/qemu-iotests/tests/qemu-img-close-errors.out b/tests/qemu-iotests/tests/qemu-img-close-errors.out
new file mode 100644
index 0000000000..1bfe88f176
--- /dev/null
+++ b/tests/qemu-iotests/tests/qemu-img-close-errors.out
@@ -0,0 +1,23 @@
+QA output created by qemu-img-close-errors
+
+=== Test with 6 bitmaps ===
+wrote 132120576/132120576 bytes at offset 0
+126 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Image committed.
+qemu-img commit exit code: 0
+qemu-img bitmap --add exit code: 0
+qemu-img: Lost persistent bitmaps during inactivation of node 'NODE_NAME': Failed to write bitmap 'good-bitmap' to file: No space left on device
+qemu-img: Error while closing the image: Invalid argument
+qemu-img: Lost persistent bitmaps during inactivation of node 'NODE_NAME': Failed to write bitmap 'good-bitmap' to file: No space left on device
+qemu-img bitmap --merge exit code: 1
+
+=== Test with 7 bitmaps ===
+wrote 132120576/132120576 bytes at offset 0
+126 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qemu-img: Lost persistent bitmaps during inactivation of node 'NODE_NAME': Failed to write bitmap 'stale-bitmap-7' to file: No space left on device
+qemu-img: Lost persistent bitmaps during inactivation of node 'NODE_NAME': Failed to write bitmap 'stale-bitmap-7' to file: No space left on device
+qemu-img: Error while closing the image: Invalid argument
+qemu-img commit exit code: 1
+qemu-img bitmap --add exit code: 0
+qemu-img bitmap --merge exit code: 0
+*** done
--
2.37.3

View File

@ -1,92 +0,0 @@
From e6aae1d0368a152924c38775e517f4e83c1d898b Mon Sep 17 00:00:00 2001
From: Eric Blake <eblake@redhat.com>
Date: Wed, 11 May 2022 19:49:23 -0500
Subject: [PATCH 1/2] qemu-nbd: Pass max connections to blockdev layer
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eric Blake <eblake@redhat.com>
RH-MergeRequest: 90: Advertise MULTI_CONN on writeable NBD servers
RH-Commit: [1/2] b0e33fd125bf3523b8b9a4dead3c8bb2342bfd4e (ebblake/centos-qemu-kvm)
RH-Bugzilla: 1708300
RH-Acked-by: Nir Soffer <None>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Acked-by: Daniel P. Berrangé <berrange@redhat.com>
The next patch wants to adjust whether the NBD server code advertises
MULTI_CONN based on whether it is known if the server limits to
exactly one client. For a server started by QMP, this information is
obtained through nbd_server_start (which can support more than one
export); but for qemu-nbd (which supports exactly one export), it is
controlled only by the command-line option -e/--shared. Since we
already have a hook function used by qemu-nbd, it's easiest to just
alter its signature to fit our needs.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20220512004924.417153-2-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a5fced40212ed73c715ca298a2929dd4d99c9999)
Signed-off-by: Eric Blake <eblake@redhat.com>
---
blockdev-nbd.c | 8 ++++----
include/block/nbd.h | 2 +-
qemu-nbd.c | 2 +-
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 9840d25a82..add41a23af 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -30,18 +30,18 @@ typedef struct NBDServerData {
} NBDServerData;
static NBDServerData *nbd_server;
-static bool is_qemu_nbd;
+static int qemu_nbd_connections = -1; /* Non-negative if this is qemu-nbd */
static void nbd_update_server_watch(NBDServerData *s);
-void nbd_server_is_qemu_nbd(bool value)
+void nbd_server_is_qemu_nbd(int max_connections)
{
- is_qemu_nbd = value;
+ qemu_nbd_connections = max_connections;
}
bool nbd_server_is_running(void)
{
- return nbd_server || is_qemu_nbd;
+ return nbd_server || qemu_nbd_connections >= 0;
}
static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
diff --git a/include/block/nbd.h b/include/block/nbd.h
index a98eb665da..c5a29ce1c6 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -344,7 +344,7 @@ void nbd_client_new(QIOChannelSocket *sioc,
void nbd_client_get(NBDClient *client);
void nbd_client_put(NBDClient *client);
-void nbd_server_is_qemu_nbd(bool value);
+void nbd_server_is_qemu_nbd(int max_connections);
bool nbd_server_is_running(void);
void nbd_server_start(SocketAddress *addr, const char *tls_creds,
const char *tls_authz, uint32_t max_connections,
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 713e7557a9..8c25ae93df 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -1087,7 +1087,7 @@ int main(int argc, char **argv)
bs->detect_zeroes = detect_zeroes;
- nbd_server_is_qemu_nbd(true);
+ nbd_server_is_qemu_nbd(shared);
export_opts = g_new(BlockExportOptions, 1);
*export_opts = (BlockExportOptions) {
--
2.31.1

View File

@ -0,0 +1,146 @@
From d46ca52c3f42add549bd3790a41d06594821334e Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:10:57 -0500
Subject: [PATCH 03/13] qemu-thread-posix: cleanup, fix, document QemuEvent
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [3/10] 746070c4d78c7f0a9ac4456d9aee69475acb8964
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 9586a1329f5dce6c1d7f4de53cf0536644d7e593
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu Mar 2 11:19:52 2023 +0100
qemu-thread-posix: cleanup, fix, document QemuEvent
QemuEvent is currently broken on ARM due to missing memory barriers
after qatomic_*(). Apart from adding the memory barrier, a closer look
reveals some unpaired memory barriers too. Document more clearly what
is going on.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
util/qemu-thread-posix.c | 69 ++++++++++++++++++++++++++++------------
1 file changed, 49 insertions(+), 20 deletions(-)
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index e1225b63bd..dd3b6d4670 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -430,13 +430,21 @@ void qemu_event_destroy(QemuEvent *ev)
void qemu_event_set(QemuEvent *ev)
{
- /* qemu_event_set has release semantics, but because it *loads*
+ assert(ev->initialized);
+
+ /*
+ * Pairs with both qemu_event_reset() and qemu_event_wait().
+ *
+ * qemu_event_set has release semantics, but because it *loads*
* ev->value we need a full memory barrier here.
*/
- assert(ev->initialized);
smp_mb();
if (qatomic_read(&ev->value) != EV_SET) {
- if (qatomic_xchg(&ev->value, EV_SET) == EV_BUSY) {
+ int old = qatomic_xchg(&ev->value, EV_SET);
+
+ /* Pairs with memory barrier in kernel futex_wait system call. */
+ smp_mb__after_rmw();
+ if (old == EV_BUSY) {
/* There were waiters, wake them up. */
qemu_futex_wake(ev, INT_MAX);
}
@@ -445,18 +453,19 @@ void qemu_event_set(QemuEvent *ev)
void qemu_event_reset(QemuEvent *ev)
{
- unsigned value;
-
assert(ev->initialized);
- value = qatomic_read(&ev->value);
- smp_mb_acquire();
- if (value == EV_SET) {
- /*
- * If there was a concurrent reset (or even reset+wait),
- * do nothing. Otherwise change EV_SET->EV_FREE.
- */
- qatomic_or(&ev->value, EV_FREE);
- }
+
+ /*
+ * If there was a concurrent reset (or even reset+wait),
+ * do nothing. Otherwise change EV_SET->EV_FREE.
+ */
+ qatomic_or(&ev->value, EV_FREE);
+
+ /*
+ * Order reset before checking the condition in the caller.
+ * Pairs with the first memory barrier in qemu_event_set().
+ */
+ smp_mb__after_rmw();
}
void qemu_event_wait(QemuEvent *ev)
@@ -464,20 +473,40 @@ void qemu_event_wait(QemuEvent *ev)
unsigned value;
assert(ev->initialized);
- value = qatomic_read(&ev->value);
- smp_mb_acquire();
+
+ /*
+ * qemu_event_wait must synchronize with qemu_event_set even if it does
+ * not go down the slow path, so this load-acquire is needed that
+ * synchronizes with the first memory barrier in qemu_event_set().
+ *
+ * If we do go down the slow path, there is no requirement at all: we
+ * might miss a qemu_event_set() here but ultimately the memory barrier in
+ * qemu_futex_wait() will ensure the check is done correctly.
+ */
+ value = qatomic_load_acquire(&ev->value);
if (value != EV_SET) {
if (value == EV_FREE) {
/*
- * Leave the event reset and tell qemu_event_set that there
- * are waiters. No need to retry, because there cannot be
- * a concurrent busy->free transition. After the CAS, the
- * event will be either set or busy.
+ * Leave the event reset and tell qemu_event_set that there are
+ * waiters. No need to retry, because there cannot be a concurrent
+ * busy->free transition. After the CAS, the event will be either
+ * set or busy.
+ *
+ * This cmpxchg doesn't have particular ordering requirements if it
+ * succeeds (moving the store earlier can only cause qemu_event_set()
+ * to issue _more_ wakeups), the failing case needs acquire semantics
+ * like the load above.
*/
if (qatomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY) == EV_SET) {
return;
}
}
+
+ /*
+ * This is the final check for a concurrent set, so it does need
+ * a smp_mb() pairing with the second barrier of qemu_event_set().
+ * The barrier is inside the FUTEX_WAIT system call.
+ */
qemu_futex_wait(ev, EV_BUSY);
}
}
--
2.37.3

View File

@ -0,0 +1,162 @@
From fa730378c42567e77eaf3e70983108f31f9001b9 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:11:05 -0500
Subject: [PATCH 04/13] qemu-thread-win32: cleanup, fix, document QemuEvent
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [4/10] 43d5bd903b460d4c3c5793a456820e8c5c8521d9
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 6c5df4b48f0c52a61342ecb307a43f4c2a3565c4
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu Mar 2 11:22:50 2023 +0100
qemu-thread-win32: cleanup, fix, document QemuEvent
QemuEvent is currently broken on ARM due to missing memory barriers
after qatomic_*(). Apart from adding the memory barrier, a closer look
reveals some unpaired memory barriers that are not really needed and
complicated the functions unnecessarily. Also, it is relying on
a memory barrier in ResetEvent(); the barrier _ought_ to be there
but there is really no documentation about it, so make it explicit.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
util/qemu-thread-win32.c | 82 +++++++++++++++++++++++++++-------------
1 file changed, 56 insertions(+), 26 deletions(-)
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 52eb19f351..c10249bc2e 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -246,12 +246,20 @@ void qemu_event_destroy(QemuEvent *ev)
void qemu_event_set(QemuEvent *ev)
{
assert(ev->initialized);
- /* qemu_event_set has release semantics, but because it *loads*
+
+ /*
+ * Pairs with both qemu_event_reset() and qemu_event_wait().
+ *
+ * qemu_event_set has release semantics, but because it *loads*
* ev->value we need a full memory barrier here.
*/
smp_mb();
if (qatomic_read(&ev->value) != EV_SET) {
- if (qatomic_xchg(&ev->value, EV_SET) == EV_BUSY) {
+ int old = qatomic_xchg(&ev->value, EV_SET);
+
+ /* Pairs with memory barrier after ResetEvent. */
+ smp_mb__after_rmw();
+ if (old == EV_BUSY) {
/* There were waiters, wake them up. */
SetEvent(ev->event);
}
@@ -260,17 +268,19 @@ void qemu_event_set(QemuEvent *ev)
void qemu_event_reset(QemuEvent *ev)
{
- unsigned value;
-
assert(ev->initialized);
- value = qatomic_read(&ev->value);
- smp_mb_acquire();
- if (value == EV_SET) {
- /* If there was a concurrent reset (or even reset+wait),
- * do nothing. Otherwise change EV_SET->EV_FREE.
- */
- qatomic_or(&ev->value, EV_FREE);
- }
+
+ /*
+ * If there was a concurrent reset (or even reset+wait),
+ * do nothing. Otherwise change EV_SET->EV_FREE.
+ */
+ qatomic_or(&ev->value, EV_FREE);
+
+ /*
+ * Order reset before checking the condition in the caller.
+ * Pairs with the first memory barrier in qemu_event_set().
+ */
+ smp_mb__after_rmw();
}
void qemu_event_wait(QemuEvent *ev)
@@ -278,29 +288,49 @@ void qemu_event_wait(QemuEvent *ev)
unsigned value;
assert(ev->initialized);
- value = qatomic_read(&ev->value);
- smp_mb_acquire();
+
+ /*
+ * qemu_event_wait must synchronize with qemu_event_set even if it does
+ * not go down the slow path, so this load-acquire is needed that
+ * synchronizes with the first memory barrier in qemu_event_set().
+ *
+ * If we do go down the slow path, there is no requirement at all: we
+ * might miss a qemu_event_set() here but ultimately the memory barrier in
+ * qemu_futex_wait() will ensure the check is done correctly.
+ */
+ value = qatomic_load_acquire(&ev->value);
if (value != EV_SET) {
if (value == EV_FREE) {
- /* qemu_event_set is not yet going to call SetEvent, but we are
- * going to do another check for EV_SET below when setting EV_BUSY.
- * At that point it is safe to call WaitForSingleObject.
+ /*
+ * Here the underlying kernel event is reset, but qemu_event_set is
+ * not yet going to call SetEvent. However, there will be another
+ * check for EV_SET below when setting EV_BUSY. At that point it
+ * is safe to call WaitForSingleObject.
*/
ResetEvent(ev->event);
- /* Tell qemu_event_set that there are waiters. No need to retry
- * because there cannot be a concurrent busy->free transition.
- * After the CAS, the event will be either set or busy.
+ /*
+ * It is not clear whether ResetEvent provides this barrier; kernel
+ * APIs (KeResetEvent/KeClearEvent) do not. Better safe than sorry!
+ */
+ smp_mb();
+
+ /*
+ * Leave the event reset and tell qemu_event_set that there are
+ * waiters. No need to retry, because there cannot be a concurrent
+ * busy->free transition. After the CAS, the event will be either
+ * set or busy.
*/
if (qatomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY) == EV_SET) {
- value = EV_SET;
- } else {
- value = EV_BUSY;
+ return;
}
}
- if (value == EV_BUSY) {
- WaitForSingleObject(ev->event, INFINITE);
- }
+
+ /*
+ * ev->value is now EV_BUSY. Since we didn't observe EV_SET,
+ * qemu_event_set() must observe EV_BUSY and call SetEvent().
+ */
+ WaitForSingleObject(ev->event, INFINITE);
}
}
--
2.37.3

View File

@ -1,100 +0,0 @@
From a039ed652e6d2f5edcef9d5d1d3baec17ce7f929 Mon Sep 17 00:00:00 2001
From: Gavin Shan <gshan@redhat.com>
Date: Wed, 11 May 2022 18:01:35 +0800
Subject: [PATCH 04/16] qtest/numa-test: Correct CPU and NUMA association in
aarch64_numa_cpu()
RH-Author: Gavin Shan <gshan@redhat.com>
RH-MergeRequest: 86: hw/arm/virt: Fix the default CPU topology
RH-Commit: [4/6] 64e9908a179eb4fb586d662f70f275a81808e50c (gwshan/qemu-rhel-9)
RH-Bugzilla: 2041823
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2041823
In aarch64_numa_cpu(), the CPU and NUMA association is something
like below. Two threads in the same core/cluster/socket are
associated with two individual NUMA nodes, which is unreal as
Igor Mammedov mentioned. We don't expect the association to break
NUMA-to-socket boundary, which matches with the real world.
NUMA-node socket cluster core thread
------------------------------------------
0 0 0 0 0
1 0 0 0 1
This corrects the topology for CPUs and their association with
NUMA nodes. After this patch is applied, the CPU and NUMA
association becomes something like below, which looks real.
Besides, socket/cluster/core/thread IDs are all checked when
the NUMA node IDs are verified. It helps to check if the CPU
topology is properly populated or not.
NUMA-node socket cluster core thread
------------------------------------------
0 1 0 0 0
1 0 0 0 0
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20220503140304.855514-5-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit e280ecb39bc1629f74ea5479d464fd1608dc8f76)
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
tests/qtest/numa-test.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/tests/qtest/numa-test.c b/tests/qtest/numa-test.c
index aeda8c774c..32e35daaae 100644
--- a/tests/qtest/numa-test.c
+++ b/tests/qtest/numa-test.c
@@ -224,17 +224,17 @@ static void aarch64_numa_cpu(const void *data)
g_autofree char *cli = NULL;
cli = make_cli(data, "-machine "
- "smp.cpus=2,smp.sockets=1,smp.clusters=1,smp.cores=1,smp.threads=2 "
+ "smp.cpus=2,smp.sockets=2,smp.clusters=1,smp.cores=1,smp.threads=1 "
"-numa node,nodeid=0,memdev=ram -numa node,nodeid=1 "
- "-numa cpu,node-id=1,thread-id=0 "
- "-numa cpu,node-id=0,thread-id=1");
+ "-numa cpu,node-id=0,socket-id=1,cluster-id=0,core-id=0,thread-id=0 "
+ "-numa cpu,node-id=1,socket-id=0,cluster-id=0,core-id=0,thread-id=0");
qts = qtest_init(cli);
cpus = get_cpus(qts, &resp);
g_assert(cpus);
while ((e = qlist_pop(cpus))) {
QDict *cpu, *props;
- int64_t thread, node;
+ int64_t socket, cluster, core, thread, node;
cpu = qobject_to(QDict, e);
g_assert(qdict_haskey(cpu, "props"));
@@ -242,12 +242,18 @@ static void aarch64_numa_cpu(const void *data)
g_assert(qdict_haskey(props, "node-id"));
node = qdict_get_int(props, "node-id");
+ g_assert(qdict_haskey(props, "socket-id"));
+ socket = qdict_get_int(props, "socket-id");
+ g_assert(qdict_haskey(props, "cluster-id"));
+ cluster = qdict_get_int(props, "cluster-id");
+ g_assert(qdict_haskey(props, "core-id"));
+ core = qdict_get_int(props, "core-id");
g_assert(qdict_haskey(props, "thread-id"));
thread = qdict_get_int(props, "thread-id");
- if (thread == 0) {
+ if (socket == 0 && cluster == 0 && core == 0 && thread == 0) {
g_assert_cmpint(node, ==, 1);
- } else if (thread == 1) {
+ } else if (socket == 1 && cluster == 0 && core == 0 && thread == 0) {
g_assert_cmpint(node, ==, 0);
} else {
g_assert(false);
--
2.31.1

View File

@ -1,68 +0,0 @@
From 66f3928b40991d8467a3da086688f73d061886c8 Mon Sep 17 00:00:00 2001
From: Gavin Shan <gshan@redhat.com>
Date: Wed, 11 May 2022 18:01:35 +0800
Subject: [PATCH 02/16] qtest/numa-test: Specify CPU topology in
aarch64_numa_cpu()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Gavin Shan <gshan@redhat.com>
RH-MergeRequest: 86: hw/arm/virt: Fix the default CPU topology
RH-Commit: [2/6] b851e7ad59e057825392ddf75e9040cc102a0385 (gwshan/qemu-rhel-9)
RH-Bugzilla: 2041823
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2041823
The CPU topology isn't enabled on arm/virt machine yet, but we're
going to do it in next patch. After the CPU topology is enabled by
next patch, "thread-id=1" becomes invalid because the CPU core is
preferred on arm/virt machine. It means these two CPUs have 0/1
as their core IDs, but their thread IDs are all 0. It will trigger
test failure as the following message indicates:
[14/21 qemu:qtest+qtest-aarch64 / qtest-aarch64/numa-test ERROR
1.48s killed by signal 6 SIGABRT
>>> G_TEST_DBUS_DAEMON=/home/gavin/sandbox/qemu.main/tests/dbus-vmstate-daemon.sh \
QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon \
QTEST_QEMU_BINARY=./qemu-system-aarch64 \
QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=83 \
/home/gavin/sandbox/qemu.main/build/tests/qtest/numa-test --tap -k
――――――――――――――――――――――――――――――――――――――――――――――
stderr:
qemu-system-aarch64: -numa cpu,node-id=0,thread-id=1: no match found
This fixes the issue by providing comprehensive SMP configurations
in aarch64_numa_cpu(). The SMP configurations aren't used before
the CPU topology is enabled in next patch.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Message-id: 20220503140304.855514-3-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit ac7199a2523ce2ccf8e685087a5d177eeca89b09)
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
tests/qtest/numa-test.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tests/qtest/numa-test.c b/tests/qtest/numa-test.c
index 90bf68a5b3..aeda8c774c 100644
--- a/tests/qtest/numa-test.c
+++ b/tests/qtest/numa-test.c
@@ -223,7 +223,8 @@ static void aarch64_numa_cpu(const void *data)
QTestState *qts;
g_autofree char *cli = NULL;
- cli = make_cli(data, "-machine smp.cpus=2 "
+ cli = make_cli(data, "-machine "
+ "smp.cpus=2,smp.sockets=1,smp.clusters=1,smp.cores=1,smp.threads=2 "
"-numa node,nodeid=0,memdev=ram -numa node,nodeid=1 "
"-numa cpu,node-id=1,thread-id=0 "
"-numa cpu,node-id=0,thread-id=1");
--
2.31.1

View File

@ -0,0 +1,114 @@
From 2f0febd6813c4ad7f52e43afb3ecce7aef3557e6 Mon Sep 17 00:00:00 2001
From: Matthew Rosato <mjrosato@linux.ibm.com>
Date: Fri, 28 Oct 2022 15:47:56 -0400
Subject: [PATCH 08/11] s390x/pci: RPCIT second pass when mappings exhausted
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 250: s390x/pci: reset ISM passthrough devices on shutdown and system reset
RH-Bugzilla: 2163713
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/4] 0b4500b9247725b1ef0b290bb85392300a618cac
If we encounter a new mapping while the number of available DMA entries
in vfio is 0, we are currently skipping that mapping which is a problem
if we manage to free up DMA space after that within the same RPCIT --
we will return to the guest with CC0 and have not mapped everything
within the specified range. This issue was uncovered while testing
changes to the s390 linux kernel iommu/dma code, where a different
usage pattern was employed (new mappings start at the end of the
aperture and work back towards the front, making us far more likely
to encounter new mappings before invalidated mappings during a
global refresh).
Fix this by tracking whether any mappings were skipped due to vfio
DMA limit hitting 0; when this occurs, we still continue the range
and unmap/map anything we can - then we must re-run the range again
to pickup anything that was missed. This must occur in a loop until
all requests are satisfied (success) or we detect that we are still
unable to complete all mappings (return ZPCI_RPCIT_ST_INSUFF_RES).
Link: https://lore.kernel.org/linux-s390/20221019144435.369902-1-schnelle@linux.ibm.com/
Fixes: 37fa32de70 ("s390x/pci: Honor DMA limits set by vfio")
Reported-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-Id: <20221028194758.204007-2-mjrosato@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 4a8d21ba50fc8625c3bd51dab903872952f95718)
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/s390x/s390-pci-inst.c | 29 ++++++++++++++++++++++-------
1 file changed, 22 insertions(+), 7 deletions(-)
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 20a9bcc7af..7cc4bcf850 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -677,8 +677,9 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
S390PCIBusDevice *pbdev;
S390PCIIOMMU *iommu;
S390IOTLBEntry entry;
- hwaddr start, end;
+ hwaddr start, end, sstart;
uint32_t dma_avail;
+ bool again;
if (env->psw.mask & PSW_MASK_PSTATE) {
s390_program_interrupt(env, PGM_PRIVILEGED, ra);
@@ -691,7 +692,7 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
}
fh = env->regs[r1] >> 32;
- start = env->regs[r2];
+ sstart = start = env->regs[r2];
end = start + env->regs[r2 + 1];
pbdev = s390_pci_find_dev_by_fh(s390_get_phb(), fh);
@@ -732,6 +733,9 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
goto err;
}
+ retry:
+ start = sstart;
+ again = false;
while (start < end) {
error = s390_guest_io_table_walk(iommu->g_iota, start, &entry);
if (error) {
@@ -739,13 +743,24 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
}
start += entry.len;
- while (entry.iova < start && entry.iova < end &&
- (dma_avail > 0 || entry.perm == IOMMU_NONE)) {
- dma_avail = s390_pci_update_iotlb(iommu, &entry);
- entry.iova += TARGET_PAGE_SIZE;
- entry.translated_addr += TARGET_PAGE_SIZE;
+ while (entry.iova < start && entry.iova < end) {
+ if (dma_avail > 0 || entry.perm == IOMMU_NONE) {
+ dma_avail = s390_pci_update_iotlb(iommu, &entry);
+ entry.iova += TARGET_PAGE_SIZE;
+ entry.translated_addr += TARGET_PAGE_SIZE;
+ } else {
+ /*
+ * We are unable to make a new mapping at this time, continue
+ * on and hopefully free up more space. Then attempt another
+ * pass.
+ */
+ again = true;
+ break;
+ }
}
}
+ if (again && dma_avail > 0)
+ goto retry;
err:
if (error) {
pbdev->state = ZPCI_FS_ERROR;
--
2.37.3

View File

@ -0,0 +1,125 @@
From b972c5a2763a91024725c147cf1691ed8e180c7c Mon Sep 17 00:00:00 2001
From: Matthew Rosato <mjrosato@linux.ibm.com>
Date: Fri, 28 Oct 2022 15:47:57 -0400
Subject: [PATCH 09/11] s390x/pci: coalesce unmap operations
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 250: s390x/pci: reset ISM passthrough devices on shutdown and system reset
RH-Bugzilla: 2163713
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [2/4] 7b5ee38eca565f5a7cbede4b9883ba3a508fb46c
Currently, each unmapped page is handled as an individual iommu
region notification. Attempt to group contiguous unmap operations
into fewer notifications to reduce overhead.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-Id: <20221028194758.204007-3-mjrosato@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit ef536007c3301bbd6a787e4c2210ea289adaa6f0)
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/s390x/s390-pci-inst.c | 51 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 51 insertions(+)
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 7cc4bcf850..66e764f901 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -640,6 +640,8 @@ static uint32_t s390_pci_update_iotlb(S390PCIIOMMU *iommu,
}
g_hash_table_remove(iommu->iotlb, &entry->iova);
inc_dma_avail(iommu);
+ /* Don't notify the iommu yet, maybe we can bundle contiguous unmaps */
+ goto out;
} else {
if (cache) {
if (cache->perm == entry->perm &&
@@ -663,15 +665,44 @@ static uint32_t s390_pci_update_iotlb(S390PCIIOMMU *iommu,
dec_dma_avail(iommu);
}
+ /*
+ * All associated iotlb entries have already been cleared, trigger the
+ * unmaps.
+ */
memory_region_notify_iommu(&iommu->iommu_mr, 0, event);
out:
return iommu->dma_limit ? iommu->dma_limit->avail : 1;
}
+static void s390_pci_batch_unmap(S390PCIIOMMU *iommu, uint64_t iova,
+ uint64_t len)
+{
+ uint64_t remain = len, start = iova, end = start + len - 1, mask, size;
+ IOMMUTLBEvent event = {
+ .type = IOMMU_NOTIFIER_UNMAP,
+ .entry = {
+ .target_as = &address_space_memory,
+ .translated_addr = 0,
+ .perm = IOMMU_NONE,
+ },
+ };
+
+ while (remain >= TARGET_PAGE_SIZE) {
+ mask = dma_aligned_pow2_mask(start, end, 64);
+ size = mask + 1;
+ event.entry.iova = start;
+ event.entry.addr_mask = mask;
+ memory_region_notify_iommu(&iommu->iommu_mr, 0, event);
+ start += size;
+ remain -= size;
+ }
+}
+
int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
{
CPUS390XState *env = &cpu->env;
+ uint64_t iova, coalesce = 0;
uint32_t fh;
uint16_t error = 0;
S390PCIBusDevice *pbdev;
@@ -742,6 +773,21 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
break;
}
+ /*
+ * If this is an unmap of a PTE, let's try to coalesce multiple unmaps
+ * into as few notifier events as possible.
+ */
+ if (entry.perm == IOMMU_NONE && entry.len == TARGET_PAGE_SIZE) {
+ if (coalesce == 0) {
+ iova = entry.iova;
+ }
+ coalesce += entry.len;
+ } else if (coalesce > 0) {
+ /* Unleash the coalesced unmap before processing a new map */
+ s390_pci_batch_unmap(iommu, iova, coalesce);
+ coalesce = 0;
+ }
+
start += entry.len;
while (entry.iova < start && entry.iova < end) {
if (dma_avail > 0 || entry.perm == IOMMU_NONE) {
@@ -759,6 +805,11 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
}
}
}
+ if (coalesce) {
+ /* Unleash the coalesced unmap before finishing rpcit */
+ s390_pci_batch_unmap(iommu, iova, coalesce);
+ coalesce = 0;
+ }
if (again && dma_avail > 0)
goto retry;
err:
--
2.37.3

View File

@ -0,0 +1,147 @@
From 9ec96a236be84e34b16681e658d3910fc3877a44 Mon Sep 17 00:00:00 2001
From: Matthew Rosato <mjrosato@linux.ibm.com>
Date: Fri, 9 Dec 2022 14:57:00 -0500
Subject: [PATCH 11/11] s390x/pci: reset ISM passthrough devices on shutdown
and system reset
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 250: s390x/pci: reset ISM passthrough devices on shutdown and system reset
RH-Bugzilla: 2163713
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [4/4] c857d022c7c2f43cdeb66c4f6acfd9272c925b35
ISM device firmware stores unique state information that can
can cause a wholesale unmap of the associated IOMMU (e.g. when
we get a termination signal for QEMU) to trigger firmware errors
because firmware believes we are attempting to invalidate entries
that are still in-use by the guest OS (when in fact that guest is
in the process of being terminated or rebooted).
To alleviate this, register both a shutdown notifier (for unexpected
termination cases e.g. virsh destroy) as well as a reset callback
(for cases like guest OS reboot). For each of these scenarios, trigger
PCI device reset; this is enough to indicate to firmware that the IOMMU
is no longer in-use by the guest OS, making it safe to invalidate any
associated IOMMU entries.
Fixes: 15d0e7942d3b ("s390x/pci: don't fence interpreted devices without MSI-X")
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-Id: <20221209195700.263824-1-mjrosato@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
[thuth: Adjusted the hunk in s390-pci-vfio.c due to different context]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 03451953c79e6b31f7860ee0c35b28e181d573c1)
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/s390x/s390-pci-bus.c | 28 ++++++++++++++++++++++++++++
hw/s390x/s390-pci-vfio.c | 2 ++
include/hw/s390x/s390-pci-bus.h | 5 +++++
3 files changed, 35 insertions(+)
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index d8b1e44a02..2d92848b0f 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -24,6 +24,8 @@
#include "hw/pci/msi.h"
#include "qemu/error-report.h"
#include "qemu/module.h"
+#include "sysemu/reset.h"
+#include "sysemu/runstate.h"
#ifndef DEBUG_S390PCI_BUS
#define DEBUG_S390PCI_BUS 0
@@ -150,10 +152,30 @@ out:
psccb->header.response_code = cpu_to_be16(rc);
}
+static void s390_pci_shutdown_notifier(Notifier *n, void *opaque)
+{
+ S390PCIBusDevice *pbdev = container_of(n, S390PCIBusDevice,
+ shutdown_notifier);
+
+ pci_device_reset(pbdev->pdev);
+}
+
+static void s390_pci_reset_cb(void *opaque)
+{
+ S390PCIBusDevice *pbdev = opaque;
+
+ pci_device_reset(pbdev->pdev);
+}
+
static void s390_pci_perform_unplug(S390PCIBusDevice *pbdev)
{
HotplugHandler *hotplug_ctrl;
+ if (pbdev->pft == ZPCI_PFT_ISM) {
+ notifier_remove(&pbdev->shutdown_notifier);
+ qemu_unregister_reset(s390_pci_reset_cb, pbdev);
+ }
+
/* Unplug the PCI device */
if (pbdev->pdev) {
DeviceState *pdev = DEVICE(pbdev->pdev);
@@ -1111,6 +1133,12 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
pbdev->fh |= FH_SHM_VFIO;
pbdev->forwarding_assist = false;
}
+ /* Register shutdown notifier and reset callback for ISM devices */
+ if (pbdev->pft == ZPCI_PFT_ISM) {
+ pbdev->shutdown_notifier.notify = s390_pci_shutdown_notifier;
+ qemu_register_shutdown_notifier(&pbdev->shutdown_notifier);
+ qemu_register_reset(s390_pci_reset_cb, pbdev);
+ }
} else {
pbdev->fh |= FH_SHM_EMUL;
/* Always intercept emulated devices */
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 99806e2a84..69af35f4fe 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -124,6 +124,8 @@ static void s390_pci_read_base(S390PCIBusDevice *pbdev,
/* The following values remain 0 until we support other FMB formats */
pbdev->zpci_fn.fmbl = 0;
pbdev->zpci_fn.pft = 0;
+ /* Store function type separately for type-specific behavior */
+ pbdev->pft = cap->pft;
/*
* If appropriate, reduce the size of the supported DMA aperture reported
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index 1c46e3a269..e0a9f9385b 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -39,6 +39,9 @@
#define UID_CHECKING_ENABLED 0x01
#define ZPCI_DTSM 0x40
+/* zPCI Function Types */
+#define ZPCI_PFT_ISM 5
+
OBJECT_DECLARE_SIMPLE_TYPE(S390pciState, S390_PCI_HOST_BRIDGE)
OBJECT_DECLARE_SIMPLE_TYPE(S390PCIBus, S390_PCI_BUS)
OBJECT_DECLARE_SIMPLE_TYPE(S390PCIBusDevice, S390_PCI_DEVICE)
@@ -344,6 +347,7 @@ struct S390PCIBusDevice {
uint16_t noi;
uint16_t maxstbl;
uint8_t sum;
+ uint8_t pft;
S390PCIGroup *pci_group;
ClpRspQueryPci zpci_fn;
S390MsixInfo msix;
@@ -352,6 +356,7 @@ struct S390PCIBusDevice {
MemoryRegion msix_notify_mr;
IndAddr *summary_ind;
IndAddr *indicator;
+ Notifier shutdown_notifier;
bool pci_unplug_request_processed;
bool unplug_requested;
bool interp;
--
2.37.3

View File

@ -0,0 +1,91 @@
From a0b6c21b555566eb6bc38643269d14c82dfd0226 Mon Sep 17 00:00:00 2001
From: Matthew Rosato <mjrosato@linux.ibm.com>
Date: Fri, 28 Oct 2022 15:47:58 -0400
Subject: [PATCH 10/11] s390x/pci: shrink DMA aperture to be bound by vfio DMA
limit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 250: s390x/pci: reset ISM passthrough devices on shutdown and system reset
RH-Bugzilla: 2163713
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [3/4] aa241dd250ad5e696b67c87dddc31ee5aaee9c0e
Currently, s390x-pci performs accounting against the vfio DMA
limit and triggers the guest to clean up mappings when the limit
is reached. Let's go a step further and also limit the size of
the supported DMA aperture reported to the guest based upon the
initial vfio DMA limit reported for the container (if less than
than the size reported by the firmware/host zPCI layer). This
avoids processing sections of the guest DMA table during global
refresh that, for common use cases, will never be used anway, and
makes exhausting the vfio DMA limit due to mismatch between guest
aperture size and host limit far less likely and more indicitive
of an error.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-Id: <20221028194758.204007-4-mjrosato@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit df202e3ff3fccb49868e08f20d0bda86cb953fbe)
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/s390x/s390-pci-vfio.c | 11 +++++++++++
include/hw/s390x/s390-pci-bus.h | 1 +
2 files changed, 12 insertions(+)
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 2aefa508a0..99806e2a84 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -84,6 +84,7 @@ S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
cnt->users = 1;
cnt->avail = avail;
QTAILQ_INSERT_TAIL(&s->zpci_dma_limit, cnt, link);
+ pbdev->iommu->max_dma_limit = avail;
return cnt;
}
@@ -103,6 +104,7 @@ static void s390_pci_read_base(S390PCIBusDevice *pbdev,
struct vfio_info_cap_header *hdr;
struct vfio_device_info_cap_zpci_base *cap;
VFIOPCIDevice *vpci = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+ uint64_t vfio_size;
hdr = vfio_get_device_info_cap(info, VFIO_DEVICE_INFO_CAP_ZPCI_BASE);
@@ -122,6 +124,15 @@ static void s390_pci_read_base(S390PCIBusDevice *pbdev,
/* The following values remain 0 until we support other FMB formats */
pbdev->zpci_fn.fmbl = 0;
pbdev->zpci_fn.pft = 0;
+
+ /*
+ * If appropriate, reduce the size of the supported DMA aperture reported
+ * to the guest based upon the vfio DMA limit.
+ */
+ vfio_size = pbdev->iommu->max_dma_limit << TARGET_PAGE_BITS;
+ if (vfio_size < (cap->end_dma - cap->start_dma + 1)) {
+ pbdev->zpci_fn.edma = cap->start_dma + vfio_size - 1;
+ }
}
static bool get_host_fh(S390PCIBusDevice *pbdev, struct vfio_device_info *info,
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index 0605fcea24..1c46e3a269 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -278,6 +278,7 @@ struct S390PCIIOMMU {
uint64_t g_iota;
uint64_t pba;
uint64_t pal;
+ uint64_t max_dma_limit;
GHashTable *iotlb;
S390PCIDMACount *dma_limit;
};
--
2.37.3

View File

@ -0,0 +1,176 @@
From df836ee4b4e2a69cca5042a3a9daf2c41dc2aa58 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Tue, 21 Feb 2023 16:22:16 -0500
Subject: [PATCH 11/13] scsi: protect req->aiocb with AioContext lock
RH-Author: Stefan Hajnoczi <stefanha@redhat.com>
RH-MergeRequest: 264: scsi: protect req->aiocb with AioContext lock
RH-Bugzilla: 2090990
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [1/3] e6a6d4109713e0fd6d6c515535c66196fea98688
If requests are being processed in the IOThread when a SCSIDevice is
unplugged, scsi_device_purge_requests() -> scsi_req_cancel_async() races
with I/O completion callbacks. Both threads load and store req->aiocb.
This can lead to assert(r->req.aiocb == NULL) failures and undefined
behavior.
Protect r->req.aiocb with the AioContext lock to prevent the race.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230221212218.1378734-2-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7b7fc3d0102dafe8eb44802493036a526e921a71)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
hw/scsi/scsi-disk.c | 23 ++++++++++++++++-------
hw/scsi/scsi-generic.c | 11 ++++++-----
2 files changed, 22 insertions(+), 12 deletions(-)
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index d4914178ea..179ce22c4a 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -270,9 +270,11 @@ static void scsi_aio_complete(void *opaque, int ret)
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
if (scsi_disk_req_check_error(r, ret, true)) {
goto done;
}
@@ -354,10 +356,11 @@ static void scsi_dma_complete(void *opaque, int ret)
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
if (ret < 0) {
block_acct_failed(blk_get_stats(s->qdev.conf.blk), &r->acct);
} else {
@@ -390,10 +393,11 @@ static void scsi_read_complete(void *opaque, int ret)
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
if (ret < 0) {
block_acct_failed(blk_get_stats(s->qdev.conf.blk), &r->acct);
} else {
@@ -443,10 +447,11 @@ static void scsi_do_read_cb(void *opaque, int ret)
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert (r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
if (ret < 0) {
block_acct_failed(blk_get_stats(s->qdev.conf.blk), &r->acct);
} else {
@@ -527,10 +532,11 @@ static void scsi_write_complete(void * opaque, int ret)
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert (r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
if (ret < 0) {
block_acct_failed(blk_get_stats(s->qdev.conf.blk), &r->acct);
} else {
@@ -1659,10 +1665,11 @@ static void scsi_unmap_complete(void *opaque, int ret)
SCSIDiskReq *r = data->r;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
if (scsi_disk_req_check_error(r, ret, true)) {
scsi_req_unref(&r->req);
g_free(data);
@@ -1738,9 +1745,11 @@ static void scsi_write_same_complete(void *opaque, int ret)
SCSIDiskReq *r = data->r;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
if (scsi_disk_req_check_error(r, ret, true)) {
goto done;
}
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 3742899839..a1a40df64b 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -111,10 +111,11 @@ static void scsi_command_complete(void *opaque, int ret)
SCSIGenericReq *r = (SCSIGenericReq *)opaque;
SCSIDevice *s = r->req.dev;
+ aio_context_acquire(blk_get_aio_context(s->conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->conf.blk));
scsi_command_complete_noio(r, ret);
aio_context_release(blk_get_aio_context(s->conf.blk));
}
@@ -269,11 +270,11 @@ static void scsi_read_complete(void * opaque, int ret)
SCSIDevice *s = r->req.dev;
int len;
+ aio_context_acquire(blk_get_aio_context(s->conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->conf.blk));
-
if (ret || r->req.io_canceled) {
scsi_command_complete_noio(r, ret);
goto done;
@@ -387,11 +388,11 @@ static void scsi_write_complete(void * opaque, int ret)
trace_scsi_generic_write_complete(ret);
+ aio_context_acquire(blk_get_aio_context(s->conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->conf.blk));
-
if (ret || r->req.io_canceled) {
scsi_command_complete_noio(r, ret);
goto done;
--
2.37.3

View File

@ -1,54 +0,0 @@
From 74b3e92dcb9e343e135a681259514b4fd28086ea Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Fri, 6 May 2022 15:25:09 +0200
Subject: [PATCH 4/5] sysemu: tpm: Add a stub function for TPM_IS_CRB
RH-Author: Eric Auger <eric.auger@redhat.com>
RH-MergeRequest: 84: vfio/common: Remove spurious tpm-crb-cmd misalignment warning
RH-Commit: [1/2] 0ab55ca1aa12a3a7cbdef5a378928f75e030e536 (eauger1/centos-qemu-kvm)
RH-Bugzilla: 2037612
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037612
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=45166961
Upstream Status: YES
Tested: With TPM-CRB and VFIO
In a subsequent patch, VFIO will need to recognize if
a memory region owner is a TPM CRB device. Hence VFIO
needs to use TPM_IS_CRB() even if CONFIG_TPM is unset. So
let's add a stub function.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linnux.ibm.com>
Link: https://lore.kernel.org/r/20220506132510.1847942-2-eric.auger@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit 4168cdad398843ed53d650a27651868b4d3e21c9)
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
include/sysemu/tpm.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/sysemu/tpm.h b/include/sysemu/tpm.h
index 68b2206463..fb40e30ff6 100644
--- a/include/sysemu/tpm.h
+++ b/include/sysemu/tpm.h
@@ -80,6 +80,12 @@ static inline TPMVersion tpm_get_version(TPMIf *ti)
#define tpm_init() (0)
#define tpm_cleanup()
+/* needed for an alignment check in non-tpm code */
+static inline Object *TPM_IS_CRB(Object *obj)
+{
+ return NULL;
+}
+
#endif /* CONFIG_TPM */
#endif /* QEMU_TPM_H */
--
2.31.1

View File

@ -1,129 +0,0 @@
From 1f8528b71d96c01dd6106f11681f4a4e2776ef5f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Daniel=20P=2E=20Berrang=C3=A9?= <berrange@redhat.com>
Date: Mon, 21 Mar 2022 12:05:42 +0000
Subject: [PATCH 06/18] target/arm: deprecate named CPU models
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Daniel P. Berrangé <berrange@redhat.com>
RH-MergeRequest: 94: i386, aarch64, s390x: deprecate many named CPU models
RH-Commit: [6/6] afddeb9e898206fd04499f01c48caf7dc1a8b8ef (berrange/centos-src-qemu)
RH-Bugzilla: 2060839
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
KVM requires use of the 'host' CPU model, so named CPU models are only
needed for TCG. Since we don't consider TCG to be supported we can
deprecate all the named CPU models. TCG users can rely on 'max' model.
Note: this has the effect of deprecating the default built-in CPU
model 'cortex-a57'. Applications using QEMU are expected to make an
explicit choice about which CPU model they want, since no builtin
default can suit all purposes.
https://bugzilla.redhat.com/show_bug.cgi?id=2060839
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
target/arm/cpu-qom.h | 1 +
target/arm/cpu.c | 5 +++++
target/arm/cpu.h | 2 ++
target/arm/cpu64.c | 8 +++++++-
target/arm/helper.c | 2 ++
5 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/target/arm/cpu-qom.h b/target/arm/cpu-qom.h
index 64c44cef2d..82e97249bc 100644
--- a/target/arm/cpu-qom.h
+++ b/target/arm/cpu-qom.h
@@ -35,6 +35,7 @@ typedef struct ARMCPUInfo {
const char *name;
void (*initfn)(Object *obj);
void (*class_init)(ObjectClass *oc, void *data);
+ const char *deprecation_note;
} ARMCPUInfo;
void arm_cpu_register(const ARMCPUInfo *info);
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 5d4ca7a227..c74b0fb462 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2105,8 +2105,13 @@ static void arm_cpu_instance_init(Object *obj)
static void cpu_register_class_init(ObjectClass *oc, void *data)
{
ARMCPUClass *acc = ARM_CPU_CLASS(oc);
+ CPUClass *cc = CPU_CLASS(oc);
acc->info = data;
+
+ if (acc->info->deprecation_note) {
+ cc->deprecation_note = acc->info->deprecation_note;
+ }
}
void arm_cpu_register(const ARMCPUInfo *info)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 23879de5fa..c0c9f680e5 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -33,6 +33,8 @@
#define KVM_HAVE_MCE_INJECTION 1
#endif
+#define RHEL_CPU_DEPRECATION "use 'host' / 'max'"
+
#define EXCP_UDEF 1 /* undefined instruction */
#define EXCP_SWI 2 /* software interrupt */
#define EXCP_PREFETCH_ABORT 3
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index e80b831073..c8f152891c 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -975,7 +975,8 @@ static void aarch64_a64fx_initfn(Object *obj)
#endif /* disabled for RHEL */
static const ARMCPUInfo aarch64_cpus[] = {
- { .name = "cortex-a57", .initfn = aarch64_a57_initfn },
+ { .name = "cortex-a57", .initfn = aarch64_a57_initfn,
+ .deprecation_note = RHEL_CPU_DEPRECATION },
#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "cortex-a53", .initfn = aarch64_a53_initfn },
{ .name = "cortex-a72", .initfn = aarch64_a72_initfn },
@@ -1052,8 +1053,13 @@ static void aarch64_cpu_instance_init(Object *obj)
static void cpu_register_class_init(ObjectClass *oc, void *data)
{
ARMCPUClass *acc = ARM_CPU_CLASS(oc);
+ CPUClass *cc = CPU_CLASS(oc);
acc->info = data;
+
+ if (acc->info->deprecation_note) {
+ cc->deprecation_note = acc->info->deprecation_note;
+ }
}
void aarch64_cpu_register(const ARMCPUInfo *info)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 7d14650615..3d34f63e49 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8560,6 +8560,7 @@ void arm_cpu_list(void)
static void arm_cpu_add_definition(gpointer data, gpointer user_data)
{
ObjectClass *oc = data;
+ CPUClass *cc = CPU_CLASS(oc);
CpuDefinitionInfoList **cpu_list = user_data;
CpuDefinitionInfo *info;
const char *typename;
@@ -8569,6 +8570,7 @@ static void arm_cpu_add_definition(gpointer data, gpointer user_data)
info->name = g_strndup(typename,
strlen(typename) - strlen("-" TYPE_ARM_CPU));
info->q_typename = g_strdup(typename);
+ info->deprecated = !!cc->deprecation_note;
QAPI_LIST_PREPEND(*cpu_list, info);
}
--
2.35.3

View File

@ -1,273 +0,0 @@
From 577b04770e47aed0f88acb4a415ed04ddbe087f1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Daniel=20P=2E=20Berrang=C3=A9?= <berrange@redhat.com>
Date: Thu, 17 Mar 2022 17:59:22 +0000
Subject: [PATCH 04/18] target/i386: deprecate CPUs older than x86_64-v2 ABI
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Daniel P. Berrangé <berrange@redhat.com>
RH-MergeRequest: 94: i386, aarch64, s390x: deprecate many named CPU models
RH-Commit: [4/6] 71f6043f11b31ffa841a2e14d24972e571c18a9e (berrange/centos-src-qemu)
RH-Bugzilla: 2060839
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RHEL-9 is compiled with the x86_64-v2 ABI. We use this as a baseline to
select which CPUs we want to support, such that there is at least one
supported guest CPU that can be launched for every physical machine
capable of running RHEL-9 KVM.
Supported CPUs:
* QEMU models
base (QEMU internal)
host (host passthrough)
max (host passthrough for KVM,
all emulated features for TCG)
* Intel models
Icelake-Server
Icelake-Server-noTSX
Cascadelake-Server (2019)
Cascadelake-Server-noTSX (2019)
Skylake-Server (2016)
Skylake-Server-IBRS (2016)
Skylake-Server-noTSX-IBRS (2016)
Skylake-Client (2015)
Skylake-Client-IBRS (2015)
Skylake-Client-noTSX-IBRS (2015)
Broadwell (2014)
Broadwell-IBRS (2014)
Broadwell-noTSX (2014)
Broadwell-noTSX-IBRS (2014)
Haswell (2013)
Haswell-IBRS (2013)
Haswell-noTSX (2013)
Haswell-noTSX-IBRS (2013)
IvyBridge (2012)
IvyBridge-IBRS (2012)
SandyBridge (2011)
SandyBridge-IBRS (2011)
Westmere (2010)
Westmere-IBRS (2010)
Nehalem (2008)
Nehalem-IBRS (2008)
Cooperlake (2020)
Snowridge (2019)
KnightsMill (2017)
Denverton (2016)
* AMD models
EPYC-Milan (2021)
EPYC-Rome (2019)
EPYC (2017)
EPYC-IBPB (2017)
Opteron_G5 (2012)
Opteron_G4 (2011)
* Other
Dhyana (2018)
(I've omitted the many -vNNN versions for brevity)
Deprecated CPUs:
486
athlon
Conroe
core2duo
coreduo
Icelake-Client (already deprecated upstream)
Icelake-Client-noTSX (already deprecated upstream)
kvm32
kvm64
n270
Opteron_G1
Opteron_G2
Opteron_G3
Penryn
pentium2
pentium3
pentium
phenom
qemu32
qemu64
The deprecated CPU models are subject to removal in a future
major version of RHEL.
Note: this has the effect of deprecating the default built-in CPU
model 'qemu64'. Applications using QEMU are expected to make an
explicit choice about which CPU model they want, since no builtin
default can suit all purposes.
https://bugzilla.redhat.com/show_bug.cgi?id=2060839
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
target/i386/cpu.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index cb6b5467d0..87cb641b5f 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1780,9 +1780,13 @@ static const CPUCaches epyc_milan_cache_info = {
* PT in VMX operation
*/
+#define RHEL_CPU_DEPRECATION \
+ "use at least 'Nehalem' / 'Opteron_G4', or 'host' / 'max'"
+
static const X86CPUDefinition builtin_x86_defs[] = {
{
.name = "qemu64",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 0xd,
.vendor = CPUID_VENDOR_AMD,
.family = 15,
@@ -1803,6 +1807,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "phenom",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 5,
.vendor = CPUID_VENDOR_AMD,
.family = 16,
@@ -1835,6 +1840,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "core2duo",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 10,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -1877,6 +1883,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "kvm64",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 0xd,
.vendor = CPUID_VENDOR_INTEL,
.family = 15,
@@ -1918,6 +1925,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "qemu32",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 4,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -1932,6 +1940,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "kvm32",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 5,
.vendor = CPUID_VENDOR_INTEL,
.family = 15,
@@ -1962,6 +1971,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "coreduo",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 10,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -1995,6 +2005,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "486",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 1,
.vendor = CPUID_VENDOR_INTEL,
.family = 4,
@@ -2007,6 +2018,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "pentium",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 1,
.vendor = CPUID_VENDOR_INTEL,
.family = 5,
@@ -2019,6 +2031,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "pentium2",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 2,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -2031,6 +2044,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "pentium3",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 3,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -2043,6 +2057,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "athlon",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 2,
.vendor = CPUID_VENDOR_AMD,
.family = 6,
@@ -2058,6 +2073,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "n270",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 10,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -2083,6 +2099,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "Conroe",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 10,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -2123,6 +2140,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "Penryn",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 10,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -3832,6 +3850,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "Opteron_G1",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 5,
.vendor = CPUID_VENDOR_AMD,
.family = 15,
@@ -3852,6 +3871,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "Opteron_G2",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 5,
.vendor = CPUID_VENDOR_AMD,
.family = 15,
@@ -3874,6 +3894,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "Opteron_G3",
+ .deprecation_note = RHEL_CPU_DEPRECATION,
.level = 5,
.vendor = CPUID_VENDOR_AMD,
.family = 16,
--
2.35.3

View File

@ -1,48 +0,0 @@
From 39642d0d37e2ef61ce7fde0bc284d37a365e4482 Mon Sep 17 00:00:00 2001
From: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Date: Mon, 2 May 2022 17:59:11 -0300
Subject: [PATCH 2/2] target/ppc/cpu-models: Fix ppc_cpu_aliases list for RHEL
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Murilo Opsfelder Araújo <muriloo@linux.ibm.com>
RH-MergeRequest: 81: target/ppc/cpu-models: remove extraneous "#endif"
RH-Commit: [1/1] 5fff003ad3deb84c6a8e69ab90552a31edb3b058 (mopsfelder/centos-stream-src-qemu-kvm)
RH-Bugzilla: 2081022
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
The commit b9d28ecdedaf ("Enable/disable devices for RHEL") removed the
"#if 0" from the beginning of the ppc_cpu_aliases list, which broke the
build on ppc64le:
../target/ppc/cpu-models.c:904:2: error: #endif without #if
#endif
^
1 error generated.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2081022
Fixes: b9d28ecdedaf (Enable/disable devices for RHEL)
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
---
target/ppc/cpu-models.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/target/ppc/cpu-models.c b/target/ppc/cpu-models.c
index dd78883410..528467eac1 100644
--- a/target/ppc/cpu-models.c
+++ b/target/ppc/cpu-models.c
@@ -746,6 +746,7 @@
/* PowerPC CPU aliases */
PowerPCCPUAlias ppc_cpu_aliases[] = {
+#if 0 /* Disabled for Red Hat Enterprise Linux */
{ "405", "405d4" },
{ "405cr", "405crc" },
{ "405gp", "405gpd" },
--
2.35.1

View File

@ -0,0 +1,50 @@
From e1870dec813fa6f8482f4f27b7a9bef8c1584b6b Mon Sep 17 00:00:00 2001
From: Thomas Huth <thuth@redhat.com>
Date: Tue, 14 Feb 2023 14:48:37 +0100
Subject: [PATCH 3/3] target/s390x/arch_dump: Fix memory corruption in
s390x_write_elf64_notes()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Thomas Huth <thuth@redhat.com>
RH-MergeRequest: 260: target/s390x/arch_dump: Fix memory corruption in s390x_write_elf64_notes()
RH-Bugzilla: 2168187
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cédric Le Goater <clg@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [1/1] 67b71ed720a1f03d5bda9119969ea95fc4a6106d
Bugzilla: https://bugzilla.redhat.com/2168187
Upstream-Status: Posted (and reviewed, but not merged yet)
"note_size" can be smaller than sizeof(note), so unconditionally calling
memset(notep, 0, sizeof(note)) could cause a memory corruption here in
case notep has been allocated dynamically, thus let's use note_size as
length argument for memset() instead.
Fixes: 113d8f4e95 ("s390x: pv: Add dump support")
Message-Id: <20230214141056.680969-1-thuth@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
target/s390x/arch_dump.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/s390x/arch_dump.c b/target/s390x/arch_dump.c
index a2329141e8..a7c44ba49d 100644
--- a/target/s390x/arch_dump.c
+++ b/target/s390x/arch_dump.c
@@ -248,7 +248,7 @@ static int s390x_write_elf64_notes(const char *note_name,
notep = g_malloc(note_size);
}
- memset(notep, 0, sizeof(note));
+ memset(notep, 0, note_size);
/* Setup note header data */
notep->hdr.n_descsz = cpu_to_be32(content_size);
--
2.37.3

View File

@ -1,194 +0,0 @@
From 8459c305914e2a7a19dcd1662d54a89def7acfa6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Daniel=20P=2E=20Berrang=C3=A9?= <berrange@redhat.com>
Date: Thu, 17 Mar 2022 17:59:22 +0000
Subject: [PATCH 05/18] target/s390x: deprecate CPUs older than z14
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Daniel P. Berrangé <berrange@redhat.com>
RH-MergeRequest: 94: i386, aarch64, s390x: deprecate many named CPU models
RH-Commit: [5/6] 2da9e06cf452287673f94f880a7eb8b2b37b7278 (berrange/centos-src-qemu)
RH-Bugzilla: 2060839
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RHEL-9 is compiled with the z14 ABI. We use this as a baseline to
select which CPUs we want to support, such that there is at least one
supported guest CPU that can be launched for every physical
machine capable of running RHEL-9 KVM.
Supported CPUs:
gen15a-base
gen15a
gen15b-base
gen15b
gen16a-base
gen16a
gen16b-base
gen16b
max
qemu
z14.2-base
z14.2
z14-base
z14
z14ZR1-base
z14ZR1
Deprecated CPUs:
z10BC.2-base
z10BC.2
z10BC-base
z10BC
z10EC.2-base
z10EC.2
z10EC.3-base
z10EC.3
z10EC-base
z10EC
z114-base
z114
z13.2-base
z13.2
z13-base
z13s-base
z13s
z13
z196.2-base
z196.2
z196-base
z196
z800-base
z800
z890.2-base
z890.2
z890.3-base
z890.3
z890-base
z890
z900.2-base
z900.2
z900.3-base
z900.3
z900-base
z900
z990.2-base
z990.2
z990.3-base
z990.3
z990.4-base
z990.4
z990.5-base
z990.5
z990-base
z990
z9BC.2-base
z9BC.2
z9BC-base
z9BC
z9EC.2-base
z9EC.2
z9EC.3-base
z9EC.3
z9EC-base
z9EC
zBC12-base
zBC12
zEC12.2-base
zEC12.2
zEC12-base
zEC12
https://bugzilla.redhat.com/show_bug.cgi?id=2060839
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
target/s390x/cpu_models.c | 11 +++++++++++
target/s390x/cpu_models.h | 2 ++
target/s390x/cpu_models_sysemu.c | 2 ++
3 files changed, 15 insertions(+)
diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 6d71428056..9b9fc41676 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -45,6 +45,9 @@
* of a following release have been a superset of the previous release. With
* generation 15 one base feature and one optional feature have been deprecated.
*/
+
+#define RHEL_CPU_DEPRECATION "use at least 'z14', or 'host' / 'qemu' / 'max'"
+
static S390CPUDef s390_cpu_defs[] = {
CPUDEF_INIT(0x2064, 7, 1, 38, 0x00000000U, "z900", "IBM zSeries 900 GA1"),
CPUDEF_INIT(0x2064, 7, 2, 38, 0x00000000U, "z900.2", "IBM zSeries 900 GA2"),
@@ -852,22 +855,30 @@ static void s390_host_cpu_model_class_init(ObjectClass *oc, void *data)
static void s390_base_cpu_model_class_init(ObjectClass *oc, void *data)
{
S390CPUClass *xcc = S390_CPU_CLASS(oc);
+ CPUClass *cc = CPU_CLASS(oc);
/* all base models are migration safe */
xcc->cpu_def = (const S390CPUDef *) data;
xcc->is_migration_safe = true;
xcc->is_static = true;
xcc->desc = xcc->cpu_def->desc;
+ if (xcc->cpu_def->gen < 14) {
+ cc->deprecation_note = RHEL_CPU_DEPRECATION;
+ }
}
static void s390_cpu_model_class_init(ObjectClass *oc, void *data)
{
S390CPUClass *xcc = S390_CPU_CLASS(oc);
+ CPUClass *cc = CPU_CLASS(oc);
/* model that can change between QEMU versions */
xcc->cpu_def = (const S390CPUDef *) data;
xcc->is_migration_safe = true;
xcc->desc = xcc->cpu_def->desc;
+ if (xcc->cpu_def->gen < 14) {
+ cc->deprecation_note = RHEL_CPU_DEPRECATION;
+ }
}
static void s390_qemu_cpu_model_class_init(ObjectClass *oc, void *data)
diff --git a/target/s390x/cpu_models.h b/target/s390x/cpu_models.h
index 74d1f87e4f..372160bcd7 100644
--- a/target/s390x/cpu_models.h
+++ b/target/s390x/cpu_models.h
@@ -38,6 +38,8 @@ struct S390CPUDef {
S390FeatBitmap full_feat;
/* used to init full_feat from generated data */
S390FeatInit full_init;
+ /* if deprecated, provides a suggestion */
+ const char *deprecation_note;
};
/* CPU model based on a CPU definition */
diff --git a/target/s390x/cpu_models_sysemu.c b/target/s390x/cpu_models_sysemu.c
index 6a04ccab1b..f3b7c304ec 100644
--- a/target/s390x/cpu_models_sysemu.c
+++ b/target/s390x/cpu_models_sysemu.c
@@ -61,6 +61,7 @@ static void create_cpu_model_list(ObjectClass *klass, void *opaque)
CpuDefinitionInfo *info;
char *name = g_strdup(object_class_get_name(klass));
S390CPUClass *scc = S390_CPU_CLASS(klass);
+ CPUClass *cc = CPU_CLASS(klass);
/* strip off the -s390x-cpu */
g_strrstr(name, "-" TYPE_S390_CPU)[0] = 0;
@@ -70,6 +71,7 @@ static void create_cpu_model_list(ObjectClass *klass, void *opaque)
info->migration_safe = scc->is_migration_safe;
info->q_static = scc->is_static;
info->q_typename = g_strdup(object_class_get_name(klass));
+ info->deprecated = !!cc->deprecation_note;
/* check for unavailable features */
if (cpu_list_data->model) {
Object *obj;
--
2.35.3

View File

@ -1,157 +0,0 @@
From f52aa60217634c96fef59ce76b803a94610bf5c8 Mon Sep 17 00:00:00 2001
From: Andrew Jones <drjones@redhat.com>
Date: Wed, 15 Jun 2022 15:28:27 +0200
Subject: [PATCH 01/18] tests/avocado: update aarch64_virt test to exercise
-cpu max
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Daniel P. Berrangé <berrange@redhat.com>
RH-MergeRequest: 94: i386, aarch64, s390x: deprecate many named CPU models
RH-Commit: [1/6] df6839e567180a4c32afd98852f68b2279e00f7c (berrange/centos-src-qemu)
RH-Bugzilla: 2060839
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2066824
commit 11593544df6f8febb3ce87015c22b429bf43c4c7
Author: Alex Bennée <alex.bennee@linaro.org>
Date: Tue Apr 19 10:09:56 2022 +0100
tests/avocado: update aarch64_virt test to exercise -cpu max
The Fedora 29 kernel is quite old and importantly fails when running
in LPA2 scenarios. As it's not really exercising much of the CPU space
replace it with a custom 5.16.12 kernel with all the architecture
options turned on. There is a minimal buildroot initramfs included in
the kernel which has a few tools for stress testing the memory
subsystem. The userspace also targets the Neoverse N1 processor so
would fail with a v8.0 cpu like cortex-a53.
While we are at it move the test into its own file so it can have an
assigned maintainer.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220419091020.3008144-2-alex.bennee@linaro.org>
Signed-off-by: Andrew Jones <drjones@redhat.com>
---
MAINTAINERS | 1 +
tests/avocado/boot_linux_console.py | 25 -------------
tests/avocado/machine_aarch64_virt.py | 51 +++++++++++++++++++++++++++
3 files changed, 52 insertions(+), 25 deletions(-)
create mode 100644 tests/avocado/machine_aarch64_virt.py
diff --git a/MAINTAINERS b/MAINTAINERS
index 2fe20a49ab..bfe8806f60 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -942,6 +942,7 @@ S: Maintained
F: hw/arm/virt*
F: include/hw/arm/virt.h
F: docs/system/arm/virt.rst
+F: tests/avocado/machine_aarch64_virt.py
Xilinx Zynq
M: Edgar E. Iglesias <edgar.iglesias@gmail.com>
diff --git a/tests/avocado/boot_linux_console.py b/tests/avocado/boot_linux_console.py
index b40a3abc81..45a2ceda22 100644
--- a/tests/avocado/boot_linux_console.py
+++ b/tests/avocado/boot_linux_console.py
@@ -325,31 +325,6 @@ def test_mips_malta32el_nanomips_64k_dbg(self):
kernel_hash = '18d1c68f2e23429e266ca39ba5349ccd0aeb7180'
self.do_test_mips_malta32el_nanomips(kernel_url, kernel_hash)
- def test_aarch64_virt(self):
- """
- :avocado: tags=arch:aarch64
- :avocado: tags=machine:virt
- :avocado: tags=accel:tcg
- :avocado: tags=cpu:cortex-a53
- """
- kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
- '/linux/releases/29/Everything/aarch64/os/images/pxeboot'
- '/vmlinuz')
- kernel_hash = '8c73e469fc6ea06a58dc83a628fc695b693b8493'
- kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
-
- self.vm.set_console()
- kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
- 'console=ttyAMA0')
- self.require_accelerator("tcg")
- self.vm.add_args('-cpu', 'cortex-a53',
- '-accel', 'tcg',
- '-kernel', kernel_path,
- '-append', kernel_command_line)
- self.vm.launch()
- console_pattern = 'Kernel command line: %s' % kernel_command_line
- self.wait_for_console_pattern(console_pattern)
-
def test_aarch64_xlnx_versal_virt(self):
"""
:avocado: tags=arch:aarch64
diff --git a/tests/avocado/machine_aarch64_virt.py b/tests/avocado/machine_aarch64_virt.py
new file mode 100644
index 0000000000..21848cba70
--- /dev/null
+++ b/tests/avocado/machine_aarch64_virt.py
@@ -0,0 +1,51 @@
+# Functional test that boots a Linux kernel and checks the console
+#
+# Copyright (c) 2022 Linaro Ltd.
+#
+# Author:
+# Alex Bennée <alex.bennee@linaro.org>
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+import time
+
+from avocado_qemu import QemuSystemTest
+from avocado_qemu import wait_for_console_pattern
+from avocado_qemu import exec_command
+
+class Aarch64VirtMachine(QemuSystemTest):
+ KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 '
+
+ def wait_for_console_pattern(self, success_message, vm=None):
+ wait_for_console_pattern(self, success_message,
+ failure_message='Kernel panic - not syncing',
+ vm=vm)
+
+ def test_aarch64_virt(self):
+ """
+ :avocado: tags=arch:aarch64
+ :avocado: tags=machine:virt
+ :avocado: tags=accel:tcg
+ :avocado: tags=cpu:max
+ """
+ kernel_url = ('https://fileserver.linaro.org/s/'
+ 'z6B2ARM7DQT3HWN/download')
+
+ kernel_hash = 'ed11daab50c151dde0e1e9c9cb8b2d9bd3215347'
+ kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
+
+ self.vm.set_console()
+ kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
+ 'console=ttyAMA0')
+ self.require_accelerator("tcg")
+ self.vm.add_args('-cpu', 'max,pauth-impdef=on',
+ '-accel', 'tcg',
+ '-kernel', kernel_path,
+ '-append', kernel_command_line)
+ self.vm.launch()
+ self.wait_for_console_pattern('Welcome to Buildroot')
+ time.sleep(0.1)
+ exec_command(self, 'root')
+ time.sleep(0.1)
+ exec_command(self, 'cat /proc/self/maps')
+ time.sleep(0.1)
--
2.35.3

View File

@ -1,385 +0,0 @@
From 7a6fa42d4a4263c94b9bf18290f9e7680ea9e7f4 Mon Sep 17 00:00:00 2001
From: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Date: Mon, 25 Apr 2022 09:57:23 +0200
Subject: [PATCH 03/16] util/event-loop-base: Introduce options to set the
thread pool size
RH-Author: Nicolas Saenz Julienne <nsaenzju@redhat.com>
RH-MergeRequest: 93: util/thread-pool: Expose minimum and maximum size
RH-Commit: [3/3] af78a88ff3c69701cbb5f9e980c3d6ebbd13ff98
RH-Bugzilla: 2031024
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
The thread pool regulates itself: when idle, it kills threads until
empty, when in demand, it creates new threads until full. This behaviour
doesn't play well with latency sensitive workloads where the price of
creating a new thread is too high. For example, when paired with qemu's
'-mlock', or using safety features like SafeStack, creating a new thread
has been measured take multiple milliseconds.
In order to mitigate this let's introduce a new 'EventLoopBase'
property to set the thread pool size. The threads will be created during
the pool's initialization or upon updating the property's value, remain
available during its lifetime regardless of demand, and destroyed upon
freeing it. A properly characterized workload will then be able to
configure the pool to avoid any latency spikes.
Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20220425075723.20019-4-nsaenzju@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 71ad4713cc1d7fca24388b828ef31ae6cb38a31c)
---
event-loop-base.c | 23 +++++++++++++
include/block/aio.h | 10 ++++++
include/block/thread-pool.h | 3 ++
include/sysemu/event-loop-base.h | 4 +++
iothread.c | 3 ++
qapi/qom.json | 10 +++++-
util/aio-posix.c | 1 +
util/async.c | 20 ++++++++++++
util/main-loop.c | 9 ++++++
util/thread-pool.c | 55 +++++++++++++++++++++++++++++---
10 files changed, 133 insertions(+), 5 deletions(-)
diff --git a/event-loop-base.c b/event-loop-base.c
index e7f99a6ec8..d5be4dc6fc 100644
--- a/event-loop-base.c
+++ b/event-loop-base.c
@@ -14,6 +14,7 @@
#include "qemu/osdep.h"
#include "qom/object_interfaces.h"
#include "qapi/error.h"
+#include "block/thread-pool.h"
#include "sysemu/event-loop-base.h"
typedef struct {
@@ -21,9 +22,22 @@ typedef struct {
ptrdiff_t offset; /* field's byte offset in EventLoopBase struct */
} EventLoopBaseParamInfo;
+static void event_loop_base_instance_init(Object *obj)
+{
+ EventLoopBase *base = EVENT_LOOP_BASE(obj);
+
+ base->thread_pool_max = THREAD_POOL_MAX_THREADS_DEFAULT;
+}
+
static EventLoopBaseParamInfo aio_max_batch_info = {
"aio-max-batch", offsetof(EventLoopBase, aio_max_batch),
};
+static EventLoopBaseParamInfo thread_pool_min_info = {
+ "thread-pool-min", offsetof(EventLoopBase, thread_pool_min),
+};
+static EventLoopBaseParamInfo thread_pool_max_info = {
+ "thread-pool-max", offsetof(EventLoopBase, thread_pool_max),
+};
static void event_loop_base_get_param(Object *obj, Visitor *v,
const char *name, void *opaque, Error **errp)
@@ -95,12 +109,21 @@ static void event_loop_base_class_init(ObjectClass *klass, void *class_data)
event_loop_base_get_param,
event_loop_base_set_param,
NULL, &aio_max_batch_info);
+ object_class_property_add(klass, "thread-pool-min", "int",
+ event_loop_base_get_param,
+ event_loop_base_set_param,
+ NULL, &thread_pool_min_info);
+ object_class_property_add(klass, "thread-pool-max", "int",
+ event_loop_base_get_param,
+ event_loop_base_set_param,
+ NULL, &thread_pool_max_info);
}
static const TypeInfo event_loop_base_info = {
.name = TYPE_EVENT_LOOP_BASE,
.parent = TYPE_OBJECT,
.instance_size = sizeof(EventLoopBase),
+ .instance_init = event_loop_base_instance_init,
.class_size = sizeof(EventLoopBaseClass),
.class_init = event_loop_base_class_init,
.abstract = true,
diff --git a/include/block/aio.h b/include/block/aio.h
index 5634173b12..d128558f1d 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -192,6 +192,8 @@ struct AioContext {
QSLIST_HEAD(, Coroutine) scheduled_coroutines;
QEMUBH *co_schedule_bh;
+ int thread_pool_min;
+ int thread_pool_max;
/* Thread pool for performing work and receiving completion callbacks.
* Has its own locking.
*/
@@ -769,4 +771,12 @@ void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
void aio_context_set_aio_params(AioContext *ctx, int64_t max_batch,
Error **errp);
+/**
+ * aio_context_set_thread_pool_params:
+ * @ctx: the aio context
+ * @min: min number of threads to have readily available in the thread pool
+ * @min: max number of threads the thread pool can contain
+ */
+void aio_context_set_thread_pool_params(AioContext *ctx, int64_t min,
+ int64_t max, Error **errp);
#endif
diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h
index 7dd7d730a0..2020bcc92d 100644
--- a/include/block/thread-pool.h
+++ b/include/block/thread-pool.h
@@ -20,6 +20,8 @@
#include "block/block.h"
+#define THREAD_POOL_MAX_THREADS_DEFAULT 64
+
typedef int ThreadPoolFunc(void *opaque);
typedef struct ThreadPool ThreadPool;
@@ -33,5 +35,6 @@ BlockAIOCB *thread_pool_submit_aio(ThreadPool *pool,
int coroutine_fn thread_pool_submit_co(ThreadPool *pool,
ThreadPoolFunc *func, void *arg);
void thread_pool_submit(ThreadPool *pool, ThreadPoolFunc *func, void *arg);
+void thread_pool_update_params(ThreadPool *pool, struct AioContext *ctx);
#endif
diff --git a/include/sysemu/event-loop-base.h b/include/sysemu/event-loop-base.h
index fced4c9fea..2748bf6ae1 100644
--- a/include/sysemu/event-loop-base.h
+++ b/include/sysemu/event-loop-base.h
@@ -33,5 +33,9 @@ struct EventLoopBase {
/* AioContext AIO engine parameters */
int64_t aio_max_batch;
+
+ /* AioContext thread pool parameters */
+ int64_t thread_pool_min;
+ int64_t thread_pool_max;
};
#endif
diff --git a/iothread.c b/iothread.c
index 8fa2f3bfb8..529194a566 100644
--- a/iothread.c
+++ b/iothread.c
@@ -174,6 +174,9 @@ static void iothread_set_aio_context_params(EventLoopBase *base, Error **errp)
aio_context_set_aio_params(iothread->ctx,
iothread->parent_obj.aio_max_batch,
errp);
+
+ aio_context_set_thread_pool_params(iothread->ctx, base->thread_pool_min,
+ base->thread_pool_max, errp);
}
diff --git a/qapi/qom.json b/qapi/qom.json
index 7d4a2ac1b9..6a653c6636 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -508,10 +508,18 @@
# 0 means that the engine will use its default.
# (default: 0)
#
+# @thread-pool-min: minimum number of threads reserved in the thread pool
+# (default:0)
+#
+# @thread-pool-max: maximum number of threads the thread pool can contain
+# (default:64)
+#
# Since: 7.1
##
{ 'struct': 'EventLoopBaseProperties',
- 'data': { '*aio-max-batch': 'int' } }
+ 'data': { '*aio-max-batch': 'int',
+ '*thread-pool-min': 'int',
+ '*thread-pool-max': 'int' } }
##
# @IothreadProperties:
diff --git a/util/aio-posix.c b/util/aio-posix.c
index be0182a3c6..731f3826c0 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -15,6 +15,7 @@
#include "qemu/osdep.h"
#include "block/block.h"
+#include "block/thread-pool.h"
#include "qemu/main-loop.h"
#include "qemu/rcu.h"
#include "qemu/rcu_queue.h"
diff --git a/util/async.c b/util/async.c
index 2ea1172f3e..554ba70cca 100644
--- a/util/async.c
+++ b/util/async.c
@@ -563,6 +563,9 @@ AioContext *aio_context_new(Error **errp)
ctx->aio_max_batch = 0;
+ ctx->thread_pool_min = 0;
+ ctx->thread_pool_max = THREAD_POOL_MAX_THREADS_DEFAULT;
+
return ctx;
fail:
g_source_destroy(&ctx->source);
@@ -696,3 +699,20 @@ void qemu_set_current_aio_context(AioContext *ctx)
assert(!get_my_aiocontext());
set_my_aiocontext(ctx);
}
+
+void aio_context_set_thread_pool_params(AioContext *ctx, int64_t min,
+ int64_t max, Error **errp)
+{
+
+ if (min > max || !max || min > INT_MAX || max > INT_MAX) {
+ error_setg(errp, "bad thread-pool-min/thread-pool-max values");
+ return;
+ }
+
+ ctx->thread_pool_min = min;
+ ctx->thread_pool_max = max;
+
+ if (ctx->thread_pool) {
+ thread_pool_update_params(ctx->thread_pool, ctx);
+ }
+}
diff --git a/util/main-loop.c b/util/main-loop.c
index 5b13f456fa..a0f48186ab 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -30,6 +30,7 @@
#include "sysemu/replay.h"
#include "qemu/main-loop.h"
#include "block/aio.h"
+#include "block/thread-pool.h"
#include "qemu/error-report.h"
#include "qemu/queue.h"
#include "qemu/compiler.h"
@@ -187,12 +188,20 @@ int qemu_init_main_loop(Error **errp)
static void main_loop_update_params(EventLoopBase *base, Error **errp)
{
+ ERRP_GUARD();
+
if (!qemu_aio_context) {
error_setg(errp, "qemu aio context not ready");
return;
}
aio_context_set_aio_params(qemu_aio_context, base->aio_max_batch, errp);
+ if (*errp) {
+ return;
+ }
+
+ aio_context_set_thread_pool_params(qemu_aio_context, base->thread_pool_min,
+ base->thread_pool_max, errp);
}
MainLoop *mloop;
diff --git a/util/thread-pool.c b/util/thread-pool.c
index d763cea505..196835b4d3 100644
--- a/util/thread-pool.c
+++ b/util/thread-pool.c
@@ -58,7 +58,6 @@ struct ThreadPool {
QemuMutex lock;
QemuCond worker_stopped;
QemuSemaphore sem;
- int max_threads;
QEMUBH *new_thread_bh;
/* The following variables are only accessed from one AioContext. */
@@ -71,8 +70,27 @@ struct ThreadPool {
int new_threads; /* backlog of threads we need to create */
int pending_threads; /* threads created but not running yet */
bool stopping;
+ int min_threads;
+ int max_threads;
};
+static inline bool back_to_sleep(ThreadPool *pool, int ret)
+{
+ /*
+ * The semaphore timed out, we should exit the loop except when:
+ * - There is work to do, we raced with the signal.
+ * - The max threads threshold just changed, we raced with the signal.
+ * - The thread pool forces a minimum number of readily available threads.
+ */
+ if (ret == -1 && (!QTAILQ_EMPTY(&pool->request_list) ||
+ pool->cur_threads > pool->max_threads ||
+ pool->cur_threads <= pool->min_threads)) {
+ return true;
+ }
+
+ return false;
+}
+
static void *worker_thread(void *opaque)
{
ThreadPool *pool = opaque;
@@ -91,8 +109,9 @@ static void *worker_thread(void *opaque)
ret = qemu_sem_timedwait(&pool->sem, 10000);
qemu_mutex_lock(&pool->lock);
pool->idle_threads--;
- } while (ret == -1 && !QTAILQ_EMPTY(&pool->request_list));
- if (ret == -1 || pool->stopping) {
+ } while (back_to_sleep(pool, ret));
+ if (ret == -1 || pool->stopping ||
+ pool->cur_threads > pool->max_threads) {
break;
}
@@ -294,6 +313,33 @@ void thread_pool_submit(ThreadPool *pool, ThreadPoolFunc *func, void *arg)
thread_pool_submit_aio(pool, func, arg, NULL, NULL);
}
+void thread_pool_update_params(ThreadPool *pool, AioContext *ctx)
+{
+ qemu_mutex_lock(&pool->lock);
+
+ pool->min_threads = ctx->thread_pool_min;
+ pool->max_threads = ctx->thread_pool_max;
+
+ /*
+ * We either have to:
+ * - Increase the number available of threads until over the min_threads
+ * threshold.
+ * - Decrease the number of available threads until under the max_threads
+ * threshold.
+ * - Do nothing. The current number of threads fall in between the min and
+ * max thresholds. We'll let the pool manage itself.
+ */
+ for (int i = pool->cur_threads; i < pool->min_threads; i++) {
+ spawn_thread(pool);
+ }
+
+ for (int i = pool->cur_threads; i > pool->max_threads; i--) {
+ qemu_sem_post(&pool->sem);
+ }
+
+ qemu_mutex_unlock(&pool->lock);
+}
+
static void thread_pool_init_one(ThreadPool *pool, AioContext *ctx)
{
if (!ctx) {
@@ -306,11 +352,12 @@ static void thread_pool_init_one(ThreadPool *pool, AioContext *ctx)
qemu_mutex_init(&pool->lock);
qemu_cond_init(&pool->worker_stopped);
qemu_sem_init(&pool->sem, 0);
- pool->max_threads = 64;
pool->new_thread_bh = aio_bh_new(ctx, spawn_thread_bh_fn, pool);
QLIST_INIT(&pool->head);
QTAILQ_INIT(&pool->request_list);
+
+ thread_pool_update_params(pool, ctx);
}
ThreadPool *thread_pool_new(AioContext *ctx)
--
2.31.1

View File

@ -1,233 +0,0 @@
From b4969662de01848f887a3918e97e516efc213f71 Mon Sep 17 00:00:00 2001
From: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Date: Mon, 25 Apr 2022 09:57:22 +0200
Subject: [PATCH 02/16] util/main-loop: Introduce the main loop into QOM
RH-Author: Nicolas Saenz Julienne <nsaenzju@redhat.com>
RH-MergeRequest: 93: util/thread-pool: Expose minimum and maximum size
RH-Commit: [2/3] a481b77e25ad50d13dcbe26b36c551b18c89bddd
RH-Bugzilla: 2031024
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
'event-loop-base' provides basic property handling for all 'AioContext'
based event loops. So let's define a new 'MainLoopClass' that inherits
from it. This will permit tweaking the main loop's properties through
qapi as well as through the command line using the '-object' keyword[1].
Only one instance of 'MainLoopClass' might be created at any time.
'EventLoopBaseClass' learns a new callback, 'can_be_deleted()' so as to
mark 'MainLoop' as non-deletable.
[1] For example:
-object main-loop,id=main-loop,aio-max-batch=<value>
Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20220425075723.20019-3-nsaenzju@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 70ac26b9e5ca8374bb3ef3f30b871726673c9f27)
---
event-loop-base.c | 13 ++++++++
include/qemu/main-loop.h | 10 ++++++
include/sysemu/event-loop-base.h | 1 +
meson.build | 3 +-
qapi/qom.json | 13 ++++++++
util/main-loop.c | 56 ++++++++++++++++++++++++++++++++
6 files changed, 95 insertions(+), 1 deletion(-)
diff --git a/event-loop-base.c b/event-loop-base.c
index a924c73a7c..e7f99a6ec8 100644
--- a/event-loop-base.c
+++ b/event-loop-base.c
@@ -73,10 +73,23 @@ static void event_loop_base_complete(UserCreatable *uc, Error **errp)
}
}
+static bool event_loop_base_can_be_deleted(UserCreatable *uc)
+{
+ EventLoopBaseClass *bc = EVENT_LOOP_BASE_GET_CLASS(uc);
+ EventLoopBase *backend = EVENT_LOOP_BASE(uc);
+
+ if (bc->can_be_deleted) {
+ return bc->can_be_deleted(backend);
+ }
+
+ return true;
+}
+
static void event_loop_base_class_init(ObjectClass *klass, void *class_data)
{
UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
ucc->complete = event_loop_base_complete;
+ ucc->can_be_deleted = event_loop_base_can_be_deleted;
object_class_property_add(klass, "aio-max-batch", "int",
event_loop_base_get_param,
diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index d3750c8e76..20c9387654 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -26,9 +26,19 @@
#define QEMU_MAIN_LOOP_H
#include "block/aio.h"
+#include "qom/object.h"
+#include "sysemu/event-loop-base.h"
#define SIG_IPI SIGUSR1
+#define TYPE_MAIN_LOOP "main-loop"
+OBJECT_DECLARE_TYPE(MainLoop, MainLoopClass, MAIN_LOOP)
+
+struct MainLoop {
+ EventLoopBase parent_obj;
+};
+typedef struct MainLoop MainLoop;
+
/**
* qemu_init_main_loop: Set up the process so that it can run the main loop.
*
diff --git a/include/sysemu/event-loop-base.h b/include/sysemu/event-loop-base.h
index 8e77d8b69f..fced4c9fea 100644
--- a/include/sysemu/event-loop-base.h
+++ b/include/sysemu/event-loop-base.h
@@ -25,6 +25,7 @@ struct EventLoopBaseClass {
void (*init)(EventLoopBase *base, Error **errp);
void (*update_params)(EventLoopBase *base, Error **errp);
+ bool (*can_be_deleted)(EventLoopBase *base);
};
struct EventLoopBase {
diff --git a/meson.build b/meson.build
index b9c919a55e..5a7c10e639 100644
--- a/meson.build
+++ b/meson.build
@@ -2832,7 +2832,8 @@ libqemuutil = static_library('qemuutil',
sources: util_ss.sources() + stub_ss.sources() + genh,
dependencies: [util_ss.dependencies(), libm, threads, glib, socket, malloc, pixman])
qemuutil = declare_dependency(link_with: libqemuutil,
- sources: genh + version_res)
+ sources: genh + version_res,
+ dependencies: [event_loop_base])
if have_system or have_user
decodetree = generator(find_program('scripts/decodetree.py'),
diff --git a/qapi/qom.json b/qapi/qom.json
index a2439533c5..7d4a2ac1b9 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -540,6 +540,17 @@
'*poll-grow': 'int',
'*poll-shrink': 'int' } }
+##
+# @MainLoopProperties:
+#
+# Properties for the main-loop object.
+#
+# Since: 7.1
+##
+{ 'struct': 'MainLoopProperties',
+ 'base': 'EventLoopBaseProperties',
+ 'data': {} }
+
##
# @MemoryBackendProperties:
#
@@ -830,6 +841,7 @@
{ 'name': 'input-linux',
'if': 'CONFIG_LINUX' },
'iothread',
+ 'main-loop',
{ 'name': 'memory-backend-epc',
'if': 'CONFIG_LINUX' },
'memory-backend-file',
@@ -895,6 +907,7 @@
'input-linux': { 'type': 'InputLinuxProperties',
'if': 'CONFIG_LINUX' },
'iothread': 'IothreadProperties',
+ 'main-loop': 'MainLoopProperties',
'memory-backend-epc': { 'type': 'MemoryBackendEpcProperties',
'if': 'CONFIG_LINUX' },
'memory-backend-file': 'MemoryBackendFileProperties',
diff --git a/util/main-loop.c b/util/main-loop.c
index b7b0ce4ca0..5b13f456fa 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -33,6 +33,7 @@
#include "qemu/error-report.h"
#include "qemu/queue.h"
#include "qemu/compiler.h"
+#include "qom/object.h"
#ifndef _WIN32
#include <sys/wait.h>
@@ -184,6 +185,61 @@ int qemu_init_main_loop(Error **errp)
return 0;
}
+static void main_loop_update_params(EventLoopBase *base, Error **errp)
+{
+ if (!qemu_aio_context) {
+ error_setg(errp, "qemu aio context not ready");
+ return;
+ }
+
+ aio_context_set_aio_params(qemu_aio_context, base->aio_max_batch, errp);
+}
+
+MainLoop *mloop;
+
+static void main_loop_init(EventLoopBase *base, Error **errp)
+{
+ MainLoop *m = MAIN_LOOP(base);
+
+ if (mloop) {
+ error_setg(errp, "only one main-loop instance allowed");
+ return;
+ }
+
+ main_loop_update_params(base, errp);
+
+ mloop = m;
+ return;
+}
+
+static bool main_loop_can_be_deleted(EventLoopBase *base)
+{
+ return false;
+}
+
+static void main_loop_class_init(ObjectClass *oc, void *class_data)
+{
+ EventLoopBaseClass *bc = EVENT_LOOP_BASE_CLASS(oc);
+
+ bc->init = main_loop_init;
+ bc->update_params = main_loop_update_params;
+ bc->can_be_deleted = main_loop_can_be_deleted;
+}
+
+static const TypeInfo main_loop_info = {
+ .name = TYPE_MAIN_LOOP,
+ .parent = TYPE_EVENT_LOOP_BASE,
+ .class_init = main_loop_class_init,
+ .instance_size = sizeof(MainLoop),
+};
+
+static void main_loop_register_types(void)
+{
+ type_register_static(&main_loop_info);
+}
+
+type_init(main_loop_register_types)
+
static int max_priority;
#ifndef _WIN32
--
2.31.1

View File

@ -1,106 +0,0 @@
From 8e0fdce814af4cfc84dce5e5920da989b1f1a86d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <eperezma@redhat.com>
Date: Thu, 21 Jul 2022 16:06:05 +0200
Subject: [PATCH 26/32] vdpa: Add device migration blocker
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eugenio Pérez <eperezma@redhat.com>
RH-MergeRequest: 108: Net Control Virtqueue shadow Support
RH-Commit: [26/27] 53d94d45b5e5e88f12b95f9b0f243696cfcbd7ce (eperezmartin/qemu-kvm)
RH-Bugzilla: 1939363
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Cindy Lu <lulu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
Bugzilla: https://bugzilla.redhat.com/1939363
Upstream Status: git://git.qemu.org/qemu.git
commit c156d5bf2b142dcc06808ccee06882144f230aec
Author: Eugenio Pérez <eperezma@redhat.com>
Date: Wed Jul 20 08:59:45 2022 +0200
vdpa: Add device migration blocker
Since the vhost-vdpa device is exposing _F_LOG, adding a migration blocker if
it uses CVQ.
However, qemu is able to migrate simple devices with no CVQ as long as
they use SVQ. To allow it, add a placeholder error to vhost_vdpa, and
only add to vhost_dev when used. vhost_dev machinery place the migration
blocker if needed.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
hw/virtio/vhost-vdpa.c | 15 +++++++++++++++
include/hw/virtio/vhost-vdpa.h | 1 +
2 files changed, 16 insertions(+)
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 49effe5462..e3e5bce4bb 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -20,6 +20,7 @@
#include "hw/virtio/vhost-shadow-virtqueue.h"
#include "hw/virtio/vhost-vdpa.h"
#include "exec/address-spaces.h"
+#include "migration/blocker.h"
#include "qemu/main-loop.h"
#include "cpu.h"
#include "trace.h"
@@ -1020,6 +1021,13 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
return true;
}
+ if (v->migration_blocker) {
+ int r = migrate_add_blocker(v->migration_blocker, &err);
+ if (unlikely(r < 0)) {
+ return false;
+ }
+ }
+
for (i = 0; i < v->shadow_vqs->len; ++i) {
VirtQueue *vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
@@ -1062,6 +1070,10 @@ err:
vhost_svq_stop(svq);
}
+ if (v->migration_blocker) {
+ migrate_del_blocker(v->migration_blocker);
+ }
+
return false;
}
@@ -1081,6 +1093,9 @@ static bool vhost_vdpa_svqs_stop(struct vhost_dev *dev)
}
}
+ if (v->migration_blocker) {
+ migrate_del_blocker(v->migration_blocker);
+ }
return true;
}
diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 1111d85643..d10a89303e 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -35,6 +35,7 @@ typedef struct vhost_vdpa {
bool shadow_vqs_enabled;
/* IOVA mapping used by the Shadow Virtqueue */
VhostIOVATree *iova_tree;
+ Error *migration_blocker;
GPtrArray *shadow_vqs;
const VhostShadowVirtqueueOps *shadow_vq_ops;
void *shadow_vq_ops_opaque;
--
2.31.1

View File

@ -1,223 +0,0 @@
From 0b27781f9984c67625c49a516c3e38fbf5fa1b1b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <eperezma@redhat.com>
Date: Thu, 21 Jul 2022 16:06:16 +0200
Subject: [PATCH 27/32] vdpa: Add x-svq to NetdevVhostVDPAOptions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eugenio Pérez <eperezma@redhat.com>
RH-MergeRequest: 108: Net Control Virtqueue shadow Support
RH-Commit: [27/27] bd85496c2a8c1ebf34f908fca2be2ab9852fd0e9 (eperezmartin/qemu-kvm)
RH-Bugzilla: 1939363
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Cindy Lu <lulu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
Bugzilla: https://bugzilla.redhat.com/1939363
Upstream Status: git://git.qemu.org/qemu.git
commit 1576dbb5bbc49344c606e969ec749be70c0fd94e
Author: Eugenio Pérez <eperezma@redhat.com>
Date: Wed Jul 20 08:59:46 2022 +0200
vdpa: Add x-svq to NetdevVhostVDPAOptions
Finally offering the possibility to enable SVQ from the command line.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
net/vhost-vdpa.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++--
qapi/net.json | 9 +++++-
2 files changed, 77 insertions(+), 4 deletions(-)
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 8b76dac966..50672bcd66 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -75,6 +75,28 @@ const int vdpa_feature_bits[] = {
VHOST_INVALID_FEATURE_BIT
};
+/** Supported device specific feature bits with SVQ */
+static const uint64_t vdpa_svq_device_features =
+ BIT_ULL(VIRTIO_NET_F_CSUM) |
+ BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) |
+ BIT_ULL(VIRTIO_NET_F_MTU) |
+ BIT_ULL(VIRTIO_NET_F_MAC) |
+ BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) |
+ BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) |
+ BIT_ULL(VIRTIO_NET_F_GUEST_ECN) |
+ BIT_ULL(VIRTIO_NET_F_GUEST_UFO) |
+ BIT_ULL(VIRTIO_NET_F_HOST_TSO4) |
+ BIT_ULL(VIRTIO_NET_F_HOST_TSO6) |
+ BIT_ULL(VIRTIO_NET_F_HOST_ECN) |
+ BIT_ULL(VIRTIO_NET_F_HOST_UFO) |
+ BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) |
+ BIT_ULL(VIRTIO_NET_F_STATUS) |
+ BIT_ULL(VIRTIO_NET_F_CTRL_VQ) |
+ BIT_ULL(VIRTIO_F_ANY_LAYOUT) |
+ BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
+ BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
+ BIT_ULL(VIRTIO_NET_F_STANDBY);
+
VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
{
VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
@@ -133,9 +155,13 @@ err_init:
static void vhost_vdpa_cleanup(NetClientState *nc)
{
VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+ struct vhost_dev *dev = &s->vhost_net->dev;
qemu_vfree(s->cvq_cmd_out_buffer);
qemu_vfree(s->cvq_cmd_in_buffer);
+ if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
+ g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
+ }
if (s->vhost_net) {
vhost_net_cleanup(s->vhost_net);
g_free(s->vhost_net);
@@ -437,7 +463,9 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
int vdpa_device_fd,
int queue_pair_index,
int nvqs,
- bool is_datapath)
+ bool is_datapath,
+ bool svq,
+ VhostIOVATree *iova_tree)
{
NetClientState *nc = NULL;
VhostVDPAState *s;
@@ -455,6 +483,8 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
s->vhost_vdpa.device_fd = vdpa_device_fd;
s->vhost_vdpa.index = queue_pair_index;
+ s->vhost_vdpa.shadow_vqs_enabled = svq;
+ s->vhost_vdpa.iova_tree = iova_tree;
if (!is_datapath) {
s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size,
vhost_vdpa_net_cvq_cmd_page_len());
@@ -465,6 +495,8 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
s->vhost_vdpa.shadow_vq_ops_opaque = s;
+ error_setg(&s->vhost_vdpa.migration_blocker,
+ "Migration disabled: vhost-vdpa uses CVQ.");
}
ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
if (ret) {
@@ -474,6 +506,14 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
return nc;
}
+static int vhost_vdpa_get_iova_range(int fd,
+ struct vhost_vdpa_iova_range *iova_range)
+{
+ int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
+
+ return ret < 0 ? -errno : 0;
+}
+
static int vhost_vdpa_get_features(int fd, uint64_t *features, Error **errp)
{
int ret = ioctl(fd, VHOST_GET_FEATURES, features);
@@ -524,6 +564,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
uint64_t features;
int vdpa_device_fd;
g_autofree NetClientState **ncs = NULL;
+ g_autoptr(VhostIOVATree) iova_tree = NULL;
NetClientState *nc;
int queue_pairs, r, i, has_cvq = 0;
@@ -551,22 +592,45 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
return queue_pairs;
}
+ if (opts->x_svq) {
+ struct vhost_vdpa_iova_range iova_range;
+
+ uint64_t invalid_dev_features =
+ features & ~vdpa_svq_device_features &
+ /* Transport are all accepted at this point */
+ ~MAKE_64BIT_MASK(VIRTIO_TRANSPORT_F_START,
+ VIRTIO_TRANSPORT_F_END - VIRTIO_TRANSPORT_F_START);
+
+ if (invalid_dev_features) {
+ error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
+ invalid_dev_features);
+ goto err_svq;
+ }
+
+ vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
+ iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
+ }
+
ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
for (i = 0; i < queue_pairs; i++) {
ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
- vdpa_device_fd, i, 2, true);
+ vdpa_device_fd, i, 2, true, opts->x_svq,
+ iova_tree);
if (!ncs[i])
goto err;
}
if (has_cvq) {
nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
- vdpa_device_fd, i, 1, false);
+ vdpa_device_fd, i, 1, false,
+ opts->x_svq, iova_tree);
if (!nc)
goto err;
}
+ /* iova_tree ownership belongs to last NetClientState */
+ g_steal_pointer(&iova_tree);
return 0;
err:
@@ -575,6 +639,8 @@ err:
qemu_del_net_client(ncs[i]);
}
}
+
+err_svq:
qemu_close(vdpa_device_fd);
return -1;
diff --git a/qapi/net.json b/qapi/net.json
index b92f3f5fb4..92848e4362 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -445,12 +445,19 @@
# @queues: number of queues to be created for multiqueue vhost-vdpa
# (default: 1)
#
+# @x-svq: Start device with (experimental) shadow virtqueue. (Since 7.1)
+# (default: false)
+#
+# Features:
+# @unstable: Member @x-svq is experimental.
+#
# Since: 5.1
##
{ 'struct': 'NetdevVhostVDPAOptions',
'data': {
'*vhostdev': 'str',
- '*queues': 'int' } }
+ '*queues': 'int',
+ '*x-svq': {'type': 'bool', 'features' : [ 'unstable'] } } }
##
# @NetClientDriver:
--
2.31.1

View File

@ -1,65 +0,0 @@
From df06ce560ddfefde98bef822ec2020382059921f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <eperezma@redhat.com>
Date: Thu, 21 Jul 2022 15:38:55 +0200
Subject: [PATCH 10/32] vdpa: Avoid compiler to squash reads to used idx
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eugenio Pérez <eperezma@redhat.com>
RH-MergeRequest: 108: Net Control Virtqueue shadow Support
RH-Commit: [10/27] b28789302d4f64749da26f413763f918161d9b70 (eperezmartin/qemu-kvm)
RH-Bugzilla: 1939363
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Cindy Lu <lulu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
Bugzilla: https://bugzilla.redhat.com/1939363
Upstream Status: git://git.qemu.org/qemu.git
commit c381abc37f0aba42ed2e3b41cdace8f8438829e4
Author: Eugenio Pérez <eperezma@redhat.com>
Date: Wed Jul 20 08:59:29 2022 +0200
vdpa: Avoid compiler to squash reads to used idx
In the next patch we will allow busypolling of this value. The compiler
have a running path where shadow_used_idx, last_used_idx, and vring used
idx are not modified within the same thread busypolling.
This was not an issue before since we always cleared device event
notifier before checking it, and that could act as memory barrier.
However, the busypoll needs something similar to kernel READ_ONCE.
Let's add it here, sepparated from the polling.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
hw/virtio/vhost-shadow-virtqueue.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 3fbda1e3d4..9c46c3a8fa 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -327,11 +327,12 @@ static void vhost_handle_guest_kick_notifier(EventNotifier *n)
static bool vhost_svq_more_used(VhostShadowVirtqueue *svq)
{
+ uint16_t *used_idx = &svq->vring.used->idx;
if (svq->last_used_idx != svq->shadow_used_idx) {
return true;
}
- svq->shadow_used_idx = cpu_to_le16(svq->vring.used->idx);
+ svq->shadow_used_idx = cpu_to_le16(*(volatile uint16_t *)used_idx);
return svq->last_used_idx != svq->shadow_used_idx;
}
--
2.31.1

View File

@ -1,323 +0,0 @@
From 881945094c0e4d33614d40959bfc20e395f5a478 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <eperezma@redhat.com>
Date: Thu, 21 Jul 2022 16:05:40 +0200
Subject: [PATCH 24/32] vdpa: Buffer CVQ support on shadow virtqueue
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eugenio Pérez <eperezma@redhat.com>
RH-MergeRequest: 108: Net Control Virtqueue shadow Support
RH-Commit: [24/27] 5486f80141a3ad968a32e782bdcdead32f417352 (eperezmartin/qemu-kvm)
RH-Bugzilla: 1939363
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Cindy Lu <lulu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
Bugzilla: https://bugzilla.redhat.com/1939363
Upstream Status: git://git.qemu.org/qemu.git
commit 2df4dd31e194c94da7d28c02e92449f4a989fca9
Author: Eugenio Pérez <eperezma@redhat.com>
Date: Wed Jul 20 08:59:43 2022 +0200
vdpa: Buffer CVQ support on shadow virtqueue
Introduce the control virtqueue support for vDPA shadow virtqueue. This
is needed for advanced networking features like rx filtering.
Virtio-net control VQ copies the descriptors to qemu's VA, so we avoid
TOCTOU with the guest's or device's memory every time there is a device
model change. Otherwise, the guest could change the memory content in
the time between qemu and the device read it.
To demonstrate command handling, VIRTIO_NET_F_CTRL_MACADDR is
implemented. If the virtio-net driver changes MAC the virtio-net device
model will be updated with the new one, and a rx filtering change event
will be raised.
More cvq commands could be added here straightforwardly but they have
not been tested.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
net/vhost-vdpa.c | 213 +++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 205 insertions(+), 8 deletions(-)
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 2e3b6b10d8..df42822463 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -33,6 +33,9 @@ typedef struct VhostVDPAState {
NetClientState nc;
struct vhost_vdpa vhost_vdpa;
VHostNetState *vhost_net;
+
+ /* Control commands shadow buffers */
+ void *cvq_cmd_out_buffer, *cvq_cmd_in_buffer;
bool started;
} VhostVDPAState;
@@ -131,6 +134,8 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
{
VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+ qemu_vfree(s->cvq_cmd_out_buffer);
+ qemu_vfree(s->cvq_cmd_in_buffer);
if (s->vhost_net) {
vhost_net_cleanup(s->vhost_net);
g_free(s->vhost_net);
@@ -190,24 +195,191 @@ static NetClientInfo net_vhost_vdpa_info = {
.check_peer_type = vhost_vdpa_check_peer_type,
};
+static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
+{
+ VhostIOVATree *tree = v->iova_tree;
+ DMAMap needle = {
+ /*
+ * No need to specify size or to look for more translations since
+ * this contiguous chunk was allocated by us.
+ */
+ .translated_addr = (hwaddr)(uintptr_t)addr,
+ };
+ const DMAMap *map = vhost_iova_tree_find_iova(tree, &needle);
+ int r;
+
+ if (unlikely(!map)) {
+ error_report("Cannot locate expected map");
+ return;
+ }
+
+ r = vhost_vdpa_dma_unmap(v, map->iova, map->size + 1);
+ if (unlikely(r != 0)) {
+ error_report("Device cannot unmap: %s(%d)", g_strerror(r), r);
+ }
+
+ vhost_iova_tree_remove(tree, map);
+}
+
+static size_t vhost_vdpa_net_cvq_cmd_len(void)
+{
+ /*
+ * MAC_TABLE_SET is the ctrl command that produces the longer out buffer.
+ * In buffer is always 1 byte, so it should fit here
+ */
+ return sizeof(struct virtio_net_ctrl_hdr) +
+ 2 * sizeof(struct virtio_net_ctrl_mac) +
+ MAC_TABLE_ENTRIES * ETH_ALEN;
+}
+
+static size_t vhost_vdpa_net_cvq_cmd_page_len(void)
+{
+ return ROUND_UP(vhost_vdpa_net_cvq_cmd_len(), qemu_real_host_page_size);
+}
+
+/** Copy and map a guest buffer. */
+static bool vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v,
+ const struct iovec *out_data,
+ size_t out_num, size_t data_len, void *buf,
+ size_t *written, bool write)
+{
+ DMAMap map = {};
+ int r;
+
+ if (unlikely(!data_len)) {
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid legnth of %s buffer\n",
+ __func__, write ? "in" : "out");
+ return false;
+ }
+
+ *written = iov_to_buf(out_data, out_num, 0, buf, data_len);
+ map.translated_addr = (hwaddr)(uintptr_t)buf;
+ map.size = vhost_vdpa_net_cvq_cmd_page_len() - 1;
+ map.perm = write ? IOMMU_RW : IOMMU_RO,
+ r = vhost_iova_tree_map_alloc(v->iova_tree, &map);
+ if (unlikely(r != IOVA_OK)) {
+ error_report("Cannot map injected element");
+ return false;
+ }
+
+ r = vhost_vdpa_dma_map(v, map.iova, vhost_vdpa_net_cvq_cmd_page_len(), buf,
+ !write);
+ if (unlikely(r < 0)) {
+ goto dma_map_err;
+ }
+
+ return true;
+
+dma_map_err:
+ vhost_iova_tree_remove(v->iova_tree, &map);
+ return false;
+}
+
/**
- * Forward buffer for the moment.
+ * Copy the guest element into a dedicated buffer suitable to be sent to NIC
+ *
+ * @iov: [0] is the out buffer, [1] is the in one
+ */
+static bool vhost_vdpa_net_cvq_map_elem(VhostVDPAState *s,
+ VirtQueueElement *elem,
+ struct iovec *iov)
+{
+ size_t in_copied;
+ bool ok;
+
+ iov[0].iov_base = s->cvq_cmd_out_buffer;
+ ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, elem->out_sg, elem->out_num,
+ vhost_vdpa_net_cvq_cmd_len(), iov[0].iov_base,
+ &iov[0].iov_len, false);
+ if (unlikely(!ok)) {
+ return false;
+ }
+
+ iov[1].iov_base = s->cvq_cmd_in_buffer;
+ ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, NULL, 0,
+ sizeof(virtio_net_ctrl_ack), iov[1].iov_base,
+ &in_copied, true);
+ if (unlikely(!ok)) {
+ vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
+ return false;
+ }
+
+ iov[1].iov_len = sizeof(virtio_net_ctrl_ack);
+ return true;
+}
+
+/**
+ * Do not forward commands not supported by SVQ. Otherwise, the device could
+ * accept it and qemu would not know how to update the device model.
+ */
+static bool vhost_vdpa_net_cvq_validate_cmd(const struct iovec *out,
+ size_t out_num)
+{
+ struct virtio_net_ctrl_hdr ctrl;
+ size_t n;
+
+ n = iov_to_buf(out, out_num, 0, &ctrl, sizeof(ctrl));
+ if (unlikely(n < sizeof(ctrl))) {
+ qemu_log_mask(LOG_GUEST_ERROR,
+ "%s: invalid legnth of out buffer %zu\n", __func__, n);
+ return false;
+ }
+
+ switch (ctrl.class) {
+ case VIRTIO_NET_CTRL_MAC:
+ switch (ctrl.cmd) {
+ case VIRTIO_NET_CTRL_MAC_ADDR_SET:
+ return true;
+ default:
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid mac cmd %u\n",
+ __func__, ctrl.cmd);
+ };
+ break;
+ default:
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid control class %u\n",
+ __func__, ctrl.class);
+ };
+
+ return false;
+}
+
+/**
+ * Validate and copy control virtqueue commands.
+ *
+ * Following QEMU guidelines, we offer a copy of the buffers to the device to
+ * prevent TOCTOU bugs.
*/
static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
VirtQueueElement *elem,
void *opaque)
{
- unsigned int n = elem->out_num + elem->in_num;
- g_autofree struct iovec *dev_buffers = g_new(struct iovec, n);
+ VhostVDPAState *s = opaque;
size_t in_len, dev_written;
virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
- int r;
+ /* out and in buffers sent to the device */
+ struct iovec dev_buffers[2] = {
+ { .iov_base = s->cvq_cmd_out_buffer },
+ { .iov_base = s->cvq_cmd_in_buffer },
+ };
+ /* in buffer used for device model */
+ const struct iovec in = {
+ .iov_base = &status,
+ .iov_len = sizeof(status),
+ };
+ int r = -EINVAL;
+ bool ok;
+
+ ok = vhost_vdpa_net_cvq_map_elem(s, elem, dev_buffers);
+ if (unlikely(!ok)) {
+ goto out;
+ }
- memcpy(dev_buffers, elem->out_sg, elem->out_num);
- memcpy(dev_buffers + elem->out_num, elem->in_sg, elem->in_num);
+ ok = vhost_vdpa_net_cvq_validate_cmd(&dev_buffers[0], 1);
+ if (unlikely(!ok)) {
+ goto out;
+ }
- r = vhost_svq_add(svq, &dev_buffers[0], elem->out_num, &dev_buffers[1],
- elem->in_num, elem);
+ r = vhost_svq_add(svq, &dev_buffers[0], 1, &dev_buffers[1], 1, elem);
if (unlikely(r != 0)) {
if (unlikely(r == -ENOSPC)) {
qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
@@ -224,6 +396,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
dev_written = vhost_svq_poll(svq);
if (unlikely(dev_written < sizeof(status))) {
error_report("Insufficient written data (%zu)", dev_written);
+ goto out;
+ }
+
+ memcpy(&status, dev_buffers[1].iov_base, sizeof(status));
+ if (status != VIRTIO_NET_OK) {
+ goto out;
+ }
+
+ status = VIRTIO_NET_ERR;
+ virtio_net_handle_ctrl_iov(svq->vdev, &in, 1, dev_buffers, 1);
+ if (status != VIRTIO_NET_OK) {
+ error_report("Bad CVQ processing in model");
}
out:
@@ -234,6 +418,12 @@ out:
}
vhost_svq_push_elem(svq, elem, MIN(in_len, sizeof(status)));
g_free(elem);
+ if (dev_buffers[0].iov_base) {
+ vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, dev_buffers[0].iov_base);
+ }
+ if (dev_buffers[1].iov_base) {
+ vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, dev_buffers[1].iov_base);
+ }
return r;
}
@@ -266,6 +456,13 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
s->vhost_vdpa.device_fd = vdpa_device_fd;
s->vhost_vdpa.index = queue_pair_index;
if (!is_datapath) {
+ s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size,
+ vhost_vdpa_net_cvq_cmd_page_len());
+ memset(s->cvq_cmd_out_buffer, 0, vhost_vdpa_net_cvq_cmd_page_len());
+ s->cvq_cmd_in_buffer = qemu_memalign(qemu_real_host_page_size,
+ vhost_vdpa_net_cvq_cmd_page_len());
+ memset(s->cvq_cmd_in_buffer, 0, vhost_vdpa_net_cvq_cmd_page_len());
+
s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
s->vhost_vdpa.shadow_vq_ops_opaque = s;
}
--
2.31.1

View File

@ -1,84 +0,0 @@
From 3a5d325fcb2958318262efac31d5fd25fb062523 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <eperezma@redhat.com>
Date: Thu, 21 Jul 2022 15:38:55 +0200
Subject: [PATCH 21/32] vdpa: Export vhost_vdpa_dma_map and unmap calls
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eugenio Pérez <eperezma@redhat.com>
RH-MergeRequest: 108: Net Control Virtqueue shadow Support
RH-Commit: [21/27] 97e7a583bbd3c12a0786d53132812ec41702c190 (eperezmartin/qemu-kvm)
RH-Bugzilla: 1939363
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Cindy Lu <lulu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
Bugzilla: https://bugzilla.redhat.com/1939363
Upstream Status: git://git.qemu.org/qemu.git
commit 463ba1e3b8cf080812895c5f26d95d8d7db2e692
Author: Eugenio Pérez <eperezma@redhat.com>
Date: Wed Jul 20 08:59:40 2022 +0200
vdpa: Export vhost_vdpa_dma_map and unmap calls
Shadow CVQ will copy buffers on qemu VA, so we avoid TOCTOU attacks from
the guest that could set a different state in qemu device model and vdpa
device.
To do so, it needs to be able to map these new buffers to the device.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
hw/virtio/vhost-vdpa.c | 7 +++----
include/hw/virtio/vhost-vdpa.h | 4 ++++
2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 28df57b12e..14b02fe079 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -71,8 +71,8 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
return false;
}
-static int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
- void *vaddr, bool readonly)
+int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
+ void *vaddr, bool readonly)
{
struct vhost_msg_v2 msg = {};
int fd = v->device_fd;
@@ -97,8 +97,7 @@ static int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
return ret;
}
-static int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova,
- hwaddr size)
+int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size)
{
struct vhost_msg_v2 msg = {};
int fd = v->device_fd;
diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index a29dbb3f53..7214eb47dc 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -39,4 +39,8 @@ typedef struct vhost_vdpa {
VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
} VhostVDPA;
+int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
+ void *vaddr, bool readonly);
+int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
+
#endif
--
2.31.1

View File

@ -1,108 +0,0 @@
From 9a290bd74f983f3a65aa9ec5df2da9aa94bfdecd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <eperezma@redhat.com>
Date: Thu, 21 Jul 2022 16:05:42 +0200
Subject: [PATCH 25/32] vdpa: Extract get features part from
vhost_vdpa_get_max_queue_pairs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eugenio Pérez <eperezma@redhat.com>
RH-MergeRequest: 108: Net Control Virtqueue shadow Support
RH-Commit: [25/27] 654ad68e10a4df84cced923c64e72d500721ad67 (eperezmartin/qemu-kvm)
RH-Bugzilla: 1939363
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Cindy Lu <lulu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
Bugzilla: https://bugzilla.redhat.com/1939363
Upstream Status: git://git.qemu.org/qemu.git
commit 8170ab3f43989680491d00f1017f60b25d346114
Author: Eugenio Pérez <eperezma@redhat.com>
Date: Wed Jul 20 08:59:44 2022 +0200
vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs
To know the device features is needed for CVQ SVQ, so SVQ knows if it
can handle all commands or not. Extract from
vhost_vdpa_get_max_queue_pairs so we can reuse it.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
net/vhost-vdpa.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index df42822463..8b76dac966 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -474,20 +474,24 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
return nc;
}
-static int vhost_vdpa_get_max_queue_pairs(int fd, int *has_cvq, Error **errp)
+static int vhost_vdpa_get_features(int fd, uint64_t *features, Error **errp)
+{
+ int ret = ioctl(fd, VHOST_GET_FEATURES, features);
+ if (unlikely(ret < 0)) {
+ error_setg_errno(errp, errno,
+ "Fail to query features from vhost-vDPA device");
+ }
+ return ret;
+}
+
+static int vhost_vdpa_get_max_queue_pairs(int fd, uint64_t features,
+ int *has_cvq, Error **errp)
{
unsigned long config_size = offsetof(struct vhost_vdpa_config, buf);
g_autofree struct vhost_vdpa_config *config = NULL;
__virtio16 *max_queue_pairs;
- uint64_t features;
int ret;
- ret = ioctl(fd, VHOST_GET_FEATURES, &features);
- if (ret) {
- error_setg(errp, "Fail to query features from vhost-vDPA device");
- return ret;
- }
-
if (features & (1 << VIRTIO_NET_F_CTRL_VQ)) {
*has_cvq = 1;
} else {
@@ -517,10 +521,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
NetClientState *peer, Error **errp)
{
const NetdevVhostVDPAOptions *opts;
+ uint64_t features;
int vdpa_device_fd;
g_autofree NetClientState **ncs = NULL;
NetClientState *nc;
- int queue_pairs, i, has_cvq = 0;
+ int queue_pairs, r, i, has_cvq = 0;
assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
opts = &netdev->u.vhost_vdpa;
@@ -534,7 +539,12 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
return -errno;
}
- queue_pairs = vhost_vdpa_get_max_queue_pairs(vdpa_device_fd,
+ r = vhost_vdpa_get_features(vdpa_device_fd, &features, errp);
+ if (unlikely(r < 0)) {
+ return r;
+ }
+
+ queue_pairs = vhost_vdpa_get_max_queue_pairs(vdpa_device_fd, features,
&has_cvq, errp);
if (queue_pairs < 0) {
qemu_close(vdpa_device_fd);
--
2.31.1

View File

@ -1,166 +0,0 @@
From c33bc0b7f2b5cfa330a6d89d60ee94de129c65c1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <eperezma@redhat.com>
Date: Thu, 21 Jul 2022 16:05:38 +0200
Subject: [PATCH 23/32] vdpa: manual forward CVQ buffers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eugenio Pérez <eperezma@redhat.com>
RH-MergeRequest: 108: Net Control Virtqueue shadow Support
RH-Commit: [23/27] ce128d5152be7eebf87e186eb8b58c2ed95aff6d (eperezmartin/qemu-kvm)
RH-Bugzilla: 1939363
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Cindy Lu <lulu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
Bugzilla: https://bugzilla.redhat.com/1939363
Upstream Status: git://git.qemu.org/qemu.git
commit bd907ae4b00ebedad5e586af05ea3d6490318d45
Author: Eugenio Pérez <eperezma@redhat.com>
Date: Wed Jul 20 08:59:42 2022 +0200
vdpa: manual forward CVQ buffers
Do a simple forwarding of CVQ buffers, the same work SVQ could do but
through callbacks. No functional change intended.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
hw/virtio/vhost-vdpa.c | 3 +-
include/hw/virtio/vhost-vdpa.h | 3 ++
net/vhost-vdpa.c | 58 ++++++++++++++++++++++++++++++++++
3 files changed, 63 insertions(+), 1 deletion(-)
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 14b02fe079..49effe5462 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -417,7 +417,8 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
for (unsigned n = 0; n < hdev->nvqs; ++n) {
g_autoptr(VhostShadowVirtqueue) svq;
- svq = vhost_svq_new(v->iova_tree, NULL, NULL);
+ svq = vhost_svq_new(v->iova_tree, v->shadow_vq_ops,
+ v->shadow_vq_ops_opaque);
if (unlikely(!svq)) {
error_setg(errp, "Cannot create svq %u", n);
return -1;
diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 7214eb47dc..1111d85643 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -15,6 +15,7 @@
#include <gmodule.h>
#include "hw/virtio/vhost-iova-tree.h"
+#include "hw/virtio/vhost-shadow-virtqueue.h"
#include "hw/virtio/virtio.h"
#include "standard-headers/linux/vhost_types.h"
@@ -35,6 +36,8 @@ typedef struct vhost_vdpa {
/* IOVA mapping used by the Shadow Virtqueue */
VhostIOVATree *iova_tree;
GPtrArray *shadow_vqs;
+ const VhostShadowVirtqueueOps *shadow_vq_ops;
+ void *shadow_vq_ops_opaque;
struct vhost_dev *dev;
VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
} VhostVDPA;
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index df1e69ee72..2e3b6b10d8 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -11,11 +11,14 @@
#include "qemu/osdep.h"
#include "clients.h"
+#include "hw/virtio/virtio-net.h"
#include "net/vhost_net.h"
#include "net/vhost-vdpa.h"
#include "hw/virtio/vhost-vdpa.h"
#include "qemu/config-file.h"
#include "qemu/error-report.h"
+#include "qemu/log.h"
+#include "qemu/memalign.h"
#include "qemu/option.h"
#include "qapi/error.h"
#include <linux/vhost.h>
@@ -187,6 +190,57 @@ static NetClientInfo net_vhost_vdpa_info = {
.check_peer_type = vhost_vdpa_check_peer_type,
};
+/**
+ * Forward buffer for the moment.
+ */
+static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
+ VirtQueueElement *elem,
+ void *opaque)
+{
+ unsigned int n = elem->out_num + elem->in_num;
+ g_autofree struct iovec *dev_buffers = g_new(struct iovec, n);
+ size_t in_len, dev_written;
+ virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
+ int r;
+
+ memcpy(dev_buffers, elem->out_sg, elem->out_num);
+ memcpy(dev_buffers + elem->out_num, elem->in_sg, elem->in_num);
+
+ r = vhost_svq_add(svq, &dev_buffers[0], elem->out_num, &dev_buffers[1],
+ elem->in_num, elem);
+ if (unlikely(r != 0)) {
+ if (unlikely(r == -ENOSPC)) {
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
+ __func__);
+ }
+ goto out;
+ }
+
+ /*
+ * We can poll here since we've had BQL from the time we sent the
+ * descriptor. Also, we need to take the answer before SVQ pulls by itself,
+ * when BQL is released
+ */
+ dev_written = vhost_svq_poll(svq);
+ if (unlikely(dev_written < sizeof(status))) {
+ error_report("Insufficient written data (%zu)", dev_written);
+ }
+
+out:
+ in_len = iov_from_buf(elem->in_sg, elem->in_num, 0, &status,
+ sizeof(status));
+ if (unlikely(in_len < sizeof(status))) {
+ error_report("Bad device CVQ written length");
+ }
+ vhost_svq_push_elem(svq, elem, MIN(in_len, sizeof(status)));
+ g_free(elem);
+ return r;
+}
+
+static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
+ .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
+};
+
static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
const char *device,
const char *name,
@@ -211,6 +265,10 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
s->vhost_vdpa.device_fd = vdpa_device_fd;
s->vhost_vdpa.index = queue_pair_index;
+ if (!is_datapath) {
+ s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
+ s->vhost_vdpa.shadow_vq_ops_opaque = s;
+ }
ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
if (ret) {
qemu_del_net_client(nc);
--
2.31.1

View File

@ -1,114 +0,0 @@
From b90a5878355bd549200ed1eff52ea084325bfc8a Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Fri, 6 May 2022 15:25:10 +0200
Subject: [PATCH 5/5] vfio/common: remove spurious tpm-crb-cmd misalignment
warning
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eric Auger <eric.auger@redhat.com>
RH-MergeRequest: 84: vfio/common: Remove spurious tpm-crb-cmd misalignment warning
RH-Commit: [2/2] 9b73a9aec59cb50d5e3468cc553464bf4a73d0a1 (eauger1/centos-qemu-kvm)
RH-Bugzilla: 2037612
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037612
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=45166961
Upstream Status: YES
Tested: With TPM-CRB and VFIO
The CRB command buffer currently is a RAM MemoryRegion and given
its base address alignment, it causes an error report on
vfio_listener_region_add(). This region could have been a RAM device
region, easing the detection of such safe situation but this option
was not well received. So let's add a helper function that uses the
memory region owner type to detect the situation is safe wrt
the assignment. Other device types can be checked here if such kind
of problem occurs again.
Conflicts in hw/vfio/common.c
We don't have 8e3b0cbb721 ("Replace qemu_real_host_page variables with inlined functions")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/20220506132510.1847942-3-eric.auger@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit 851d6d1a0ff29a87ec588205842edf6b86d99b5c)
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/vfio/common.c | 27 ++++++++++++++++++++++++++-
hw/vfio/trace-events | 1 +
2 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 080046e3f5..0fbe0d47af 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -40,6 +40,7 @@
#include "trace.h"
#include "qapi/error.h"
#include "migration/migration.h"
+#include "sysemu/tpm.h"
VFIOGroupList vfio_group_list =
QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -861,6 +862,22 @@ static void vfio_unregister_ram_discard_listener(VFIOContainer *container,
g_free(vrdl);
}
+static bool vfio_known_safe_misalignment(MemoryRegionSection *section)
+{
+ MemoryRegion *mr = section->mr;
+
+ if (!TPM_IS_CRB(mr->owner)) {
+ return false;
+ }
+
+ /* this is a known safe misaligned region, just trace for debug purpose */
+ trace_vfio_known_safe_misalignment(memory_region_name(mr),
+ section->offset_within_address_space,
+ section->offset_within_region,
+ qemu_real_host_page_size);
+ return true;
+}
+
static void vfio_listener_region_add(MemoryListener *listener,
MemoryRegionSection *section)
{
@@ -884,7 +901,15 @@ static void vfio_listener_region_add(MemoryListener *listener,
if (unlikely((section->offset_within_address_space &
~qemu_real_host_page_mask) !=
(section->offset_within_region & ~qemu_real_host_page_mask))) {
- error_report("%s received unaligned region", __func__);
+ if (!vfio_known_safe_misalignment(section)) {
+ error_report("%s received unaligned region %s iova=0x%"PRIx64
+ " offset_within_region=0x%"PRIx64
+ " qemu_real_host_page_size=0x%"PRIxPTR,
+ __func__, memory_region_name(section->mr),
+ section->offset_within_address_space,
+ section->offset_within_region,
+ qemu_real_host_page_size);
+ }
return;
}
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 0ef1b5f4a6..582882db91 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -100,6 +100,7 @@ vfio_listener_region_add_skip(uint64_t start, uint64_t end) "SKIPPING region_add
vfio_spapr_group_attach(int groupfd, int tablefd) "Attached groupfd %d to liobn fd %d"
vfio_listener_region_add_iommu(uint64_t start, uint64_t end) "region_add [iommu] 0x%"PRIx64" - 0x%"PRIx64
vfio_listener_region_add_ram(uint64_t iova_start, uint64_t iova_end, void *vaddr) "region_add [ram] 0x%"PRIx64" - 0x%"PRIx64" [%p]"
+vfio_known_safe_misalignment(const char *name, uint64_t iova, uint64_t offset_within_region, uintptr_t page_size) "Region \"%s\" iova=0x%"PRIx64" offset_within_region=0x%"PRIx64" qemu_real_host_page_size=0x%"PRIxPTR ": cannot be mapped for DMA"
vfio_listener_region_add_no_dma_map(const char *name, uint64_t iova, uint64_t size, uint64_t page_size) "Region \"%s\" 0x%"PRIx64" size=0x%"PRIx64" is not aligned to 0x%"PRIx64" and cannot be mapped for DMA"
vfio_listener_region_del_skip(uint64_t start, uint64_t end) "SKIPPING region_del 0x%"PRIx64" - 0x%"PRIx64
vfio_listener_region_del(uint64_t start, uint64_t end) "region_del 0x%"PRIx64" - 0x%"PRIx64
--
2.31.1

View File

@ -1,78 +0,0 @@
From 3de8fb9f3dba18d04efa10b70bcec641035effc5 Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Tue, 24 May 2022 05:14:05 -0400
Subject: [PATCH 16/16] vfio/common: remove spurious warning on
vfio_listener_region_del
RH-Author: Eric Auger <eric.auger@redhat.com>
RH-MergeRequest: 101: vfio/common: remove spurious warning on vfio_listener_region_del
RH-Commit: [1/1] dac688b8a981ebb964fea79ea198c329b9cdb551 (eauger1/centos-qemu-kvm)
RH-Bugzilla: 2086262
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Alex Williamson <None>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2086262
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=45876133
Upstream Status: YES
Tested: With TPM-CRB and VFIO
851d6d1a0f ("vfio/common: remove spurious tpm-crb-cmd misalignment
warning") removed the warning on vfio_listener_region_add() path.
However the same warning also hits on region_del path. Let's remove
it and reword the dynamic trace as this can be called on both
map and unmap path.
Contextual Conflict in hw/vfio/common.c
We don't have 8e3b0cbb721 ("Replace qemu_real_host_page variables with inlined functions")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/20220524091405.416256-1-eric.auger@redhat.com
Fixes: 851d6d1a0ff2 ("vfio/common: remove spurious tpm-crb-cmd misalignment warning")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit ec6600be0dc16982181c7ad80d94c143c0807dd2)
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/vfio/common.c | 10 +++++++++-
hw/vfio/trace-events | 2 +-
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 0fbe0d47af..637981f9a1 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1145,7 +1145,15 @@ static void vfio_listener_region_del(MemoryListener *listener,
if (unlikely((section->offset_within_address_space &
~qemu_real_host_page_mask) !=
(section->offset_within_region & ~qemu_real_host_page_mask))) {
- error_report("%s received unaligned region", __func__);
+ if (!vfio_known_safe_misalignment(section)) {
+ error_report("%s received unaligned region %s iova=0x%"PRIx64
+ " offset_within_region=0x%"PRIx64
+ " qemu_real_host_page_size=0x%"PRIxPTR,
+ __func__, memory_region_name(section->mr),
+ section->offset_within_address_space,
+ section->offset_within_region,
+ qemu_real_host_page_size);
+ }
return;
}
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 582882db91..73dffe9e00 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -100,7 +100,7 @@ vfio_listener_region_add_skip(uint64_t start, uint64_t end) "SKIPPING region_add
vfio_spapr_group_attach(int groupfd, int tablefd) "Attached groupfd %d to liobn fd %d"
vfio_listener_region_add_iommu(uint64_t start, uint64_t end) "region_add [iommu] 0x%"PRIx64" - 0x%"PRIx64
vfio_listener_region_add_ram(uint64_t iova_start, uint64_t iova_end, void *vaddr) "region_add [ram] 0x%"PRIx64" - 0x%"PRIx64" [%p]"
-vfio_known_safe_misalignment(const char *name, uint64_t iova, uint64_t offset_within_region, uintptr_t page_size) "Region \"%s\" iova=0x%"PRIx64" offset_within_region=0x%"PRIx64" qemu_real_host_page_size=0x%"PRIxPTR ": cannot be mapped for DMA"
+vfio_known_safe_misalignment(const char *name, uint64_t iova, uint64_t offset_within_region, uintptr_t page_size) "Region \"%s\" iova=0x%"PRIx64" offset_within_region=0x%"PRIx64" qemu_real_host_page_size=0x%"PRIxPTR
vfio_listener_region_add_no_dma_map(const char *name, uint64_t iova, uint64_t size, uint64_t page_size) "Region \"%s\" 0x%"PRIx64" size=0x%"PRIx64" is not aligned to 0x%"PRIx64" and cannot be mapped for DMA"
vfio_listener_region_del_skip(uint64_t start, uint64_t end) "SKIPPING region_del 0x%"PRIx64" - 0x%"PRIx64
vfio_listener_region_del(uint64_t start, uint64_t end) "region_del 0x%"PRIx64" - 0x%"PRIx64
--
2.31.1

Some files were not shown because too many files have changed in this diff Show More