import CS qemu-kvm-6.2.0-39.module_el8+669+76cc32af

This commit is contained in:
eabdullin 2023-10-16 09:56:10 +00:00
parent f4f2c7fce8
commit 5cc2edd609
74 changed files with 7691 additions and 1 deletions

4
.gitignore vendored
View File

@ -1 +1,5 @@
SOURCES/qemu-6.2.0.tar.xz
SOURCES/tests_data_acpi_pc_SSDT.dimmpxm
SOURCES/tests_data_acpi_q35_FACP.slic
SOURCES/tests_data_acpi_q35_SSDT.dimmpxm
SOURCES/tests_data_acpi_virt_SSDT.memhp

View File

@ -1 +1,5 @@
68cd61a466170115b88817e2d52db2cd7a92f43a SOURCES/qemu-6.2.0.tar.xz
c4b34092bc5af1ba7febfca1477320fb024e8acd SOURCES/tests_data_acpi_pc_SSDT.dimmpxm
19349e3517143bd1af56a5444e927ba37a111f72 SOURCES/tests_data_acpi_q35_FACP.slic
4632d10ae8cedad4d5d760ed211f83f0dc81005d SOURCES/tests_data_acpi_q35_SSDT.dimmpxm
ef12eed43cc357fb134db6fa3c7ffc83e222a97d SOURCES/tests_data_acpi_virt_SSDT.memhp

View File

@ -0,0 +1,50 @@
From 953c5c0982b61b0a3f8f03452844b5487eb22fc7 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:13:17 -0500
Subject: [PATCH 06/13] aio-wait: switch to smp_mb__after_rmw()
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [6/10] 9f30f97754139ffd18d36b2350f9ed4e59ac496e
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit b532526a07ef3b903ead2e055fe6cc87b41057a3
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Fri Mar 3 11:03:52 2023 +0100
aio-wait: switch to smp_mb__after_rmw()
The barrier comes after an atomic increment, so it is enough to use
smp_mb__after_rmw(); this avoids a double barrier on x86 systems.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
include/block/aio-wait.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h
index 54840f8622..03b6394c78 100644
--- a/include/block/aio-wait.h
+++ b/include/block/aio-wait.h
@@ -82,7 +82,7 @@ extern AioWait global_aio_wait;
/* Increment wait_->num_waiters before evaluating cond. */ \
qatomic_inc(&wait_->num_waiters); \
/* Paired with smp_mb in aio_wait_kick(). */ \
- smp_mb(); \
+ smp_mb__after_rmw(); \
if (ctx_ && in_aio_context_home_thread(ctx_)) { \
while ((cond)) { \
aio_poll(ctx_, true); \
--
2.37.3

View File

@ -0,0 +1,86 @@
From d7eae0ff4c7f7f7bf10f10272adf7c6971c0db9b Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 09:26:35 -0500
Subject: [PATCH 01/13] aio_wait_kick: add missing memory barrier
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [1/10] eb774aee79864052e14e706d931e52e7bd1162c8
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 7455ff1aa01564cc175db5b2373e610503ad4411
Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Tue May 24 13:30:54 2022 -0400
aio_wait_kick: add missing memory barrier
It seems that aio_wait_kick always required a memory barrier
or atomic operation in the caller, but nobody actually
took care of doing it.
Let's put the barrier in the function instead, and pair it
with another one in AIO_WAIT_WHILE. Read aio_wait_kick()
comment for further explanation.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220524173054.12651-1-eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
include/block/aio-wait.h | 2 ++
util/aio-wait.c | 16 +++++++++++++++-
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h
index b39eefb38d..54840f8622 100644
--- a/include/block/aio-wait.h
+++ b/include/block/aio-wait.h
@@ -81,6 +81,8 @@ extern AioWait global_aio_wait;
AioContext *ctx_ = (ctx); \
/* Increment wait_->num_waiters before evaluating cond. */ \
qatomic_inc(&wait_->num_waiters); \
+ /* Paired with smp_mb in aio_wait_kick(). */ \
+ smp_mb(); \
if (ctx_ && in_aio_context_home_thread(ctx_)) { \
while ((cond)) { \
aio_poll(ctx_, true); \
diff --git a/util/aio-wait.c b/util/aio-wait.c
index bdb3d3af22..98c5accd29 100644
--- a/util/aio-wait.c
+++ b/util/aio-wait.c
@@ -35,7 +35,21 @@ static void dummy_bh_cb(void *opaque)
void aio_wait_kick(void)
{
- /* The barrier (or an atomic op) is in the caller. */
+ /*
+ * Paired with smp_mb in AIO_WAIT_WHILE. Here we have:
+ * write(condition);
+ * aio_wait_kick() {
+ * smp_mb();
+ * read(num_waiters);
+ * }
+ *
+ * And in AIO_WAIT_WHILE:
+ * write(num_waiters);
+ * smp_mb();
+ * read(condition);
+ */
+ smp_mb();
+
if (qatomic_read(&global_aio_wait.num_waiters)) {
aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
}
--
2.37.3

View File

@ -0,0 +1,56 @@
From 47d027147694fde94dd73305ee53b6a136cbeced Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 08/15] apic: disable reentrancy detection for apic-msi
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [8/12] 25c3cf99b00cd9adc10d6e7afa9c3e3b7da08de2 (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
commit 50795ee051a342c681a9b45671c552fbd6274db8
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:13 2023 -0400
apic: disable reentrancy detection for apic-msi
As the code is designed for re-entrant calls to apic-msi, mark apic-msi
as reentrancy-safe.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-9-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
hw/intc/apic.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index 3df11c34d6..a7c2b301a8 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -883,6 +883,13 @@ static void apic_realize(DeviceState *dev, Error **errp)
memory_region_init_io(&s->io_memory, OBJECT(s), &apic_io_ops, s, "apic-msi",
APIC_SPACE_SIZE);
+ /*
+ * apic-msi's apic_mem_write can call into ioapic_eoi_broadcast, which can
+ * write back to apic-msi. As such mark the apic-msi region re-entrancy
+ * safe.
+ */
+ s->io_memory.disable_reentrancy_guard = true;
+
s->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, apic_timer, s);
local_apics[s->id] = s;
--
2.37.3

View File

@ -0,0 +1,235 @@
From 8996ac4369de7e0cb6f911db6f47c3e4ae88c8aa Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 02/15] async: Add an optional reentrancy guard to the BH API
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [2/12] b03f247e242a6cdb3eebec36477234ac77dcd20c (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
Conflict: The file block/graph-lock.h, inluded from include/block/aio.h,
doesn't exist in this code version. The code compiles without
issues if this include is just omitted, so we do that.
commit 9c86c97f12c060bf7484dd931f38634e166a81f0
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:07 2023 -0400
async: Add an optional reentrancy guard to the BH API
Devices can pass their MemoryReentrancyGuard (from their DeviceState),
when creating new BHes. Then, the async API will toggle the guard
before/after calling the BH call-back. This prevents bh->mmio reentrancy
issues.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-3-alxndr@bu.edu>
[thuth: Fix "line over 90 characters" checkpatch.pl error]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
docs/devel/multiple-iothreads.txt | 7 +++++++
include/block/aio.h | 18 ++++++++++++++++--
include/qemu/main-loop.h | 7 +++++--
tests/unit/ptimer-test-stubs.c | 3 ++-
util/async.c | 18 +++++++++++++++++-
util/main-loop.c | 6 ++++--
util/trace-events | 1 +
7 files changed, 52 insertions(+), 8 deletions(-)
diff --git a/docs/devel/multiple-iothreads.txt b/docs/devel/multiple-iothreads.txt
index aeb997bed5..a11576bc74 100644
--- a/docs/devel/multiple-iothreads.txt
+++ b/docs/devel/multiple-iothreads.txt
@@ -61,6 +61,7 @@ There are several old APIs that use the main loop AioContext:
* LEGACY qemu_aio_set_event_notifier() - monitor an event notifier
* LEGACY timer_new_ms() - create a timer
* LEGACY qemu_bh_new() - create a BH
+ * LEGACY qemu_bh_new_guarded() - create a BH with a device re-entrancy guard
* LEGACY qemu_aio_wait() - run an event loop iteration
Since they implicitly work on the main loop they cannot be used in code that
@@ -72,8 +73,14 @@ Instead, use the AioContext functions directly (see include/block/aio.h):
* aio_set_event_notifier() - monitor an event notifier
* aio_timer_new() - create a timer
* aio_bh_new() - create a BH
+ * aio_bh_new_guarded() - create a BH with a device re-entrancy guard
* aio_poll() - run an event loop iteration
+The qemu_bh_new_guarded/aio_bh_new_guarded APIs accept a "MemReentrancyGuard"
+argument, which is used to check for and prevent re-entrancy problems. For
+BHs associated with devices, the reentrancy-guard is contained in the
+corresponding DeviceState and named "mem_reentrancy_guard".
+
The AioContext can be obtained from the IOThread using
iothread_get_aio_context() or for the main loop using qemu_get_aio_context().
Code that takes an AioContext argument works both in IOThreads or the main
diff --git a/include/block/aio.h b/include/block/aio.h
index 47fbe9d81f..c7da152985 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -22,6 +22,8 @@
#include "qemu/event_notifier.h"
#include "qemu/thread.h"
#include "qemu/timer.h"
+#include "hw/qdev-core.h"
+
typedef struct BlockAIOCB BlockAIOCB;
typedef void BlockCompletionFunc(void *opaque, int ret);
@@ -321,9 +323,11 @@ void aio_bh_schedule_oneshot_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
* is opaque and must be allocated prior to its use.
*
* @name: A human-readable identifier for debugging purposes.
+ * @reentrancy_guard: A guard set when entering a cb to prevent
+ * device-reentrancy issues
*/
QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
- const char *name);
+ const char *name, MemReentrancyGuard *reentrancy_guard);
/**
* aio_bh_new: Allocate a new bottom half structure
@@ -332,7 +336,17 @@ QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
* string.
*/
#define aio_bh_new(ctx, cb, opaque) \
- aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb)))
+ aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb)), NULL)
+
+/**
+ * aio_bh_new_guarded: Allocate a new bottom half structure with a
+ * reentrancy_guard
+ *
+ * A convenience wrapper for aio_bh_new_full() that uses the cb as the name
+ * string.
+ */
+#define aio_bh_new_guarded(ctx, cb, opaque, guard) \
+ aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb)), guard)
/**
* aio_notify: Force processing of pending events.
diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index 8dbc6fcb89..85dd5ada9e 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -294,9 +294,12 @@ void qemu_cond_timedwait_iothread(QemuCond *cond, int ms);
void qemu_fd_register(int fd);
+#define qemu_bh_new_guarded(cb, opaque, guard) \
+ qemu_bh_new_full((cb), (opaque), (stringify(cb)), guard)
#define qemu_bh_new(cb, opaque) \
- qemu_bh_new_full((cb), (opaque), (stringify(cb)))
-QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name);
+ qemu_bh_new_full((cb), (opaque), (stringify(cb)), NULL)
+QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name,
+ MemReentrancyGuard *reentrancy_guard);
void qemu_bh_schedule_idle(QEMUBH *bh);
enum {
diff --git a/tests/unit/ptimer-test-stubs.c b/tests/unit/ptimer-test-stubs.c
index 2a3ef58799..a7a2d08e7e 100644
--- a/tests/unit/ptimer-test-stubs.c
+++ b/tests/unit/ptimer-test-stubs.c
@@ -108,7 +108,8 @@ int64_t qemu_clock_deadline_ns_all(QEMUClockType type, int attr_mask)
return deadline;
}
-QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name)
+QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name,
+ MemReentrancyGuard *reentrancy_guard)
{
QEMUBH *bh = g_new(QEMUBH, 1);
diff --git a/util/async.c b/util/async.c
index 2a63bf90f2..1fff02e7fc 100644
--- a/util/async.c
+++ b/util/async.c
@@ -62,6 +62,7 @@ struct QEMUBH {
void *opaque;
QSLIST_ENTRY(QEMUBH) next;
unsigned flags;
+ MemReentrancyGuard *reentrancy_guard;
};
/* Called concurrently from any thread */
@@ -127,7 +128,7 @@ void aio_bh_schedule_oneshot_full(AioContext *ctx, QEMUBHFunc *cb,
}
QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
- const char *name)
+ const char *name, MemReentrancyGuard *reentrancy_guard)
{
QEMUBH *bh;
bh = g_new(QEMUBH, 1);
@@ -136,13 +137,28 @@ QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
.cb = cb,
.opaque = opaque,
.name = name,
+ .reentrancy_guard = reentrancy_guard,
};
return bh;
}
void aio_bh_call(QEMUBH *bh)
{
+ bool last_engaged_in_io = false;
+
+ if (bh->reentrancy_guard) {
+ last_engaged_in_io = bh->reentrancy_guard->engaged_in_io;
+ if (bh->reentrancy_guard->engaged_in_io) {
+ trace_reentrant_aio(bh->ctx, bh->name);
+ }
+ bh->reentrancy_guard->engaged_in_io = true;
+ }
+
bh->cb(bh->opaque);
+
+ if (bh->reentrancy_guard) {
+ bh->reentrancy_guard->engaged_in_io = last_engaged_in_io;
+ }
}
/* Multiple occurrences of aio_bh_poll cannot be called concurrently. */
diff --git a/util/main-loop.c b/util/main-loop.c
index 06b18b195c..1eacf04691 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -544,9 +544,11 @@ void main_loop_wait(int nonblocking)
/* Functions to operate on the main QEMU AioContext. */
-QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name)
+QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name,
+ MemReentrancyGuard *reentrancy_guard)
{
- return aio_bh_new_full(qemu_aio_context, cb, opaque, name);
+ return aio_bh_new_full(qemu_aio_context, cb, opaque, name,
+ reentrancy_guard);
}
/*
diff --git a/util/trace-events b/util/trace-events
index c8f53d7d9f..dc3b1eb3bf 100644
--- a/util/trace-events
+++ b/util/trace-events
@@ -11,6 +11,7 @@ poll_remove(void *ctx, void *node, int fd) "ctx %p node %p fd %d"
# async.c
aio_co_schedule(void *ctx, void *co) "ctx %p co %p"
aio_co_schedule_bh_cb(void *ctx, void *co) "ctx %p co %p"
+reentrant_aio(void *ctx, const char *name) "ctx %p name %s"
# thread-pool.c
thread_pool_submit(void *pool, void *req, void *opaque) "pool %p req %p opaque %p"
--
2.37.3

View File

@ -0,0 +1,71 @@
From d754050d260e2ad890cecd975df6e163c531b40e Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 09/15] async: avoid use-after-free on re-entrancy guard
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [9/12] d357650e581c3921bbfe3e2fde5e3f55853b5fab (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
commit 7915bd06f25e1803778081161bf6fa10c42dc7cd
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Mon May 1 10:19:56 2023 -0400
async: avoid use-after-free on re-entrancy guard
A BH callback can free the BH, causing a use-after-free in aio_bh_call.
Fix that by keeping a local copy of the re-entrancy guard pointer.
Buglink: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58513
Fixes: 9c86c97f12 ("async: Add an optional reentrancy guard to the BH API")
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20230501141956.3444868-1-alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
util/async.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/util/async.c b/util/async.c
index 1fff02e7fc..ffe0541c3b 100644
--- a/util/async.c
+++ b/util/async.c
@@ -146,18 +146,20 @@ void aio_bh_call(QEMUBH *bh)
{
bool last_engaged_in_io = false;
- if (bh->reentrancy_guard) {
- last_engaged_in_io = bh->reentrancy_guard->engaged_in_io;
- if (bh->reentrancy_guard->engaged_in_io) {
+ /* Make a copy of the guard-pointer as cb may free the bh */
+ MemReentrancyGuard *reentrancy_guard = bh->reentrancy_guard;
+ if (reentrancy_guard) {
+ last_engaged_in_io = reentrancy_guard->engaged_in_io;
+ if (reentrancy_guard->engaged_in_io) {
trace_reentrant_aio(bh->ctx, bh->name);
}
- bh->reentrancy_guard->engaged_in_io = true;
+ reentrancy_guard->engaged_in_io = true;
}
bh->cb(bh->opaque);
- if (bh->reentrancy_guard) {
- bh->reentrancy_guard->engaged_in_io = last_engaged_in_io;
+ if (reentrancy_guard) {
+ reentrancy_guard->engaged_in_io = last_engaged_in_io;
}
}
--
2.37.3

View File

@ -0,0 +1,66 @@
From 187eb7a418af93375e42298d06e231e2bec3cf00 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:15:42 -0500
Subject: [PATCH 10/13] async: clarify usage of barriers in the polling case
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [10/10] 3be07ccc6137a0336becfe63a818d9cbadb38e9c
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 6229438cca037d42f44a96d38feb15cb102a444f
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Mon Mar 6 10:43:52 2023 +0100
async: clarify usage of barriers in the polling case
Explain that aio_context_notifier_poll() relies on
aio_notify_accept() to catch all the memory writes that were
done before ctx->notified was set to true.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
util/async.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/util/async.c b/util/async.c
index 795fe699b6..2a63bf90f2 100644
--- a/util/async.c
+++ b/util/async.c
@@ -463,8 +463,9 @@ void aio_notify_accept(AioContext *ctx)
qatomic_set(&ctx->notified, false);
/*
- * Write ctx->notified before reading e.g. bh->flags. Pairs with smp_wmb
- * in aio_notify.
+ * Order reads of ctx->notified (in aio_context_notifier_poll()) and the
+ * above clearing of ctx->notified before reads of e.g. bh->flags. Pairs
+ * with smp_wmb() in aio_notify.
*/
smp_mb();
}
@@ -487,6 +488,11 @@ static bool aio_context_notifier_poll(void *opaque)
EventNotifier *e = opaque;
AioContext *ctx = container_of(e, AioContext, notifier);
+ /*
+ * No need for load-acquire because we just want to kick the
+ * event loop. aio_notify_accept() takes care of synchronizing
+ * the event loop with the producers.
+ */
return qatomic_read(&ctx->notified);
}
--
2.37.3

View File

@ -0,0 +1,111 @@
From ea3856bb545d19499602830cdc3076d83a981e7a Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:15:36 -0500
Subject: [PATCH 09/13] async: update documentation of the memory barriers
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [9/10] d471da2acf7a107cf75f3327c5e8d7456307160e
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 8dd48650b43dfde4ebea34191ac267e474bcc29e
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Mon Mar 6 10:15:06 2023 +0100
async: update documentation of the memory barriers
Ever since commit 8c6b0356b539 ("util/async: make bh_aio_poll() O(1)",
2020-02-22), synchronization between qemu_bh_schedule() and aio_bh_poll()
is happening when the bottom half is enqueued in the bh_list; not
when the flags are set. Update the documentation to match.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
util/async.c | 33 +++++++++++++++++++--------------
1 file changed, 19 insertions(+), 14 deletions(-)
diff --git a/util/async.c b/util/async.c
index 6f6717a34b..795fe699b6 100644
--- a/util/async.c
+++ b/util/async.c
@@ -71,14 +71,21 @@ static void aio_bh_enqueue(QEMUBH *bh, unsigned new_flags)
unsigned old_flags;
/*
- * The memory barrier implicit in qatomic_fetch_or makes sure that:
- * 1. idle & any writes needed by the callback are done before the
- * locations are read in the aio_bh_poll.
- * 2. ctx is loaded before the callback has a chance to execute and bh
- * could be freed.
+ * Synchronizes with atomic_fetch_and() in aio_bh_dequeue(), ensuring that
+ * insertion starts after BH_PENDING is set.
*/
old_flags = qatomic_fetch_or(&bh->flags, BH_PENDING | new_flags);
+
if (!(old_flags & BH_PENDING)) {
+ /*
+ * At this point the bottom half becomes visible to aio_bh_poll().
+ * This insertion thus synchronizes with QSLIST_MOVE_ATOMIC in
+ * aio_bh_poll(), ensuring that:
+ * 1. any writes needed by the callback are visible from the callback
+ * after aio_bh_dequeue() returns bh.
+ * 2. ctx is loaded before the callback has a chance to execute and bh
+ * could be freed.
+ */
QSLIST_INSERT_HEAD_ATOMIC(&ctx->bh_list, bh, next);
}
@@ -97,11 +104,8 @@ static QEMUBH *aio_bh_dequeue(BHList *head, unsigned *flags)
QSLIST_REMOVE_HEAD(head, next);
/*
- * The qatomic_and is paired with aio_bh_enqueue(). The implicit memory
- * barrier ensures that the callback sees all writes done by the scheduling
- * thread. It also ensures that the scheduling thread sees the cleared
- * flag before bh->cb has run, and thus will call aio_notify again if
- * necessary.
+ * Synchronizes with qatomic_fetch_or() in aio_bh_enqueue(), ensuring that
+ * the removal finishes before BH_PENDING is reset.
*/
*flags = qatomic_fetch_and(&bh->flags,
~(BH_PENDING | BH_SCHEDULED | BH_IDLE));
@@ -148,6 +152,7 @@ int aio_bh_poll(AioContext *ctx)
BHListSlice *s;
int ret = 0;
+ /* Synchronizes with QSLIST_INSERT_HEAD_ATOMIC in aio_bh_enqueue(). */
QSLIST_MOVE_ATOMIC(&slice.bh_list, &ctx->bh_list);
QSIMPLEQ_INSERT_TAIL(&ctx->bh_slice_list, &slice, next);
@@ -437,15 +442,15 @@ LuringState *aio_get_linux_io_uring(AioContext *ctx)
void aio_notify(AioContext *ctx)
{
/*
- * Write e.g. bh->flags before writing ctx->notified. Pairs with smp_mb in
- * aio_notify_accept.
+ * Write e.g. ctx->bh_list before writing ctx->notified. Pairs with
+ * smp_mb() in aio_notify_accept().
*/
smp_wmb();
qatomic_set(&ctx->notified, true);
/*
- * Write ctx->notified before reading ctx->notify_me. Pairs
- * with smp_mb in aio_ctx_prepare or aio_poll.
+ * Write ctx->notified (and also ctx->bh_list) before reading ctx->notify_me.
+ * Pairs with smp_mb() in aio_ctx_prepare or aio_poll.
*/
smp_mb();
if (qatomic_read(&ctx->notify_me)) {
--
2.37.3

View File

@ -0,0 +1,58 @@
From 7715635d018351e0a5c4c25aec2c71a2fe3b9e69 Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 06/15] bcm2835_property: disable reentrancy detection for
iomem
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [6/12] 4d6187430ca1c4309a36824c0c6815d2a763db1a (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
commit 985c4a4e547afb9573b6bd6843d20eb2c3d1d1cd
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:11 2023 -0400
bcm2835_property: disable reentrancy detection for iomem
As the code is designed for re-entrant calls from bcm2835_property to
bcm2835_mbox and back into bcm2835_property, mark iomem as
reentrancy-safe.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-7-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
hw/misc/bcm2835_property.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/hw/misc/bcm2835_property.c b/hw/misc/bcm2835_property.c
index 73941bdae9..022b5a849c 100644
--- a/hw/misc/bcm2835_property.c
+++ b/hw/misc/bcm2835_property.c
@@ -377,6 +377,13 @@ static void bcm2835_property_init(Object *obj)
memory_region_init_io(&s->iomem, OBJECT(s), &bcm2835_property_ops, s,
TYPE_BCM2835_PROPERTY, 0x10);
+
+ /*
+ * bcm2835_property_ops call into bcm2835_mbox, which in-turn reads from
+ * iomem. As such, mark iomem as re-entracy safe.
+ */
+ s->iomem.disable_reentrancy_guard = true;
+
sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->iomem);
sysbus_init_irq(SYS_BUS_DEVICE(s), &s->mbox_irq);
}
--
2.37.3

View File

@ -0,0 +1,359 @@
From 1f7520baa6f0bf02ccba2ebfe7d1d5bf6520f95a Mon Sep 17 00:00:00 2001
From: Hanna Czenczek <hreitz@redhat.com>
Date: Tue, 11 Apr 2023 19:34:16 +0200
Subject: [PATCH 2/5] block: Collapse padded I/O vecs exceeding IOV_MAX
RH-Author: Hanna Czenczek <hreitz@redhat.com>
RH-MergeRequest: 291: block: Split padded I/O vectors exceeding IOV_MAX
RH-Bugzilla: 2141964
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Commit: [2/5] 1d86ce8398e4ab66e308a686f9855c963e52b0a9
When processing vectored guest requests that are not aligned to the
storage request alignment, we pad them by adding head and/or tail
buffers for a read-modify-write cycle.
The guest can submit I/O vectors up to IOV_MAX (1024) in length, but
with this padding, the vector can exceed that limit. As of
4c002cef0e9abe7135d7916c51abce47f7fc1ee2 ("util/iov: make
qemu_iovec_init_extended() honest"), we refuse to pad vectors beyond the
limit, instead returning an error to the guest.
To the guest, this appears as a random I/O error. We should not return
an I/O error to the guest when it issued a perfectly valid request.
Before 4c002cef0e9abe7135d7916c51abce47f7fc1ee2, we just made the vector
longer than IOV_MAX, which generally seems to work (because the guest
assumes a smaller alignment than we really have, file-posix's
raw_co_prw() will generally see bdrv_qiov_is_aligned() return false, and
so emulate the request, so that the IOV_MAX does not matter). However,
that does not seem exactly great.
I see two ways to fix this problem:
1. We split such long requests into two requests.
2. We join some elements of the vector into new buffers to make it
shorter.
I am wary of (1), because it seems like it may have unintended side
effects.
(2) on the other hand seems relatively simple to implement, with
hopefully few side effects, so this patch does that.
To do this, the use of qemu_iovec_init_extended() in bdrv_pad_request()
is effectively replaced by the new function bdrv_create_padded_qiov(),
which not only wraps the request IOV with padding head/tail, but also
ensures that the resulting vector will not have more than IOV_MAX
elements. Putting that functionality into qemu_iovec_init_extended() is
infeasible because it requires allocating a bounce buffer; doing so
would require many more parameters (buffer alignment, how to initialize
the buffer, and out parameters like the buffer, its length, and the
original elements), which is not reasonable.
Conversely, it is not difficult to move qemu_iovec_init_extended()'s
functionality into bdrv_create_padded_qiov() by using public
qemu_iovec_* functions, so that is what this patch does.
Because bdrv_pad_request() was the only "serious" user of
qemu_iovec_init_extended(), the next patch will remove the latter
function, so the functionality is not implemented twice.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2141964
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230411173418.19549-3-hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
(cherry picked from commit 18743311b829cafc1737a5f20bc3248d5f91ee2a)
Conflicts:
block/io.c: Downstream bdrv_pad_request() has no @flags
parameter.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
block/io.c | 166 ++++++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 151 insertions(+), 15 deletions(-)
diff --git a/block/io.c b/block/io.c
index c3e7301613..0fe8f0dd40 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1624,6 +1624,14 @@ out:
* @merge_reads is true for small requests,
* if @buf_len == @head + bytes + @tail. In this case it is possible that both
* head and tail exist but @buf_len == align and @tail_buf == @buf.
+ *
+ * @write is true for write requests, false for read requests.
+ *
+ * If padding makes the vector too long (exceeding IOV_MAX), then we need to
+ * merge existing vector elements into a single one. @collapse_bounce_buf acts
+ * as the bounce buffer in such cases. @pre_collapse_qiov has the pre-collapse
+ * I/O vector elements so for read requests, the data can be copied back after
+ * the read is done.
*/
typedef struct BdrvRequestPadding {
uint8_t *buf;
@@ -1632,11 +1640,17 @@ typedef struct BdrvRequestPadding {
size_t head;
size_t tail;
bool merge_reads;
+ bool write;
QEMUIOVector local_qiov;
+
+ uint8_t *collapse_bounce_buf;
+ size_t collapse_len;
+ QEMUIOVector pre_collapse_qiov;
} BdrvRequestPadding;
static bool bdrv_init_padding(BlockDriverState *bs,
int64_t offset, int64_t bytes,
+ bool write,
BdrvRequestPadding *pad)
{
int64_t align = bs->bl.request_alignment;
@@ -1668,6 +1682,8 @@ static bool bdrv_init_padding(BlockDriverState *bs,
pad->tail_buf = pad->buf + pad->buf_len - align;
}
+ pad->write = write;
+
return true;
}
@@ -1733,8 +1749,23 @@ zero_mem:
return 0;
}
-static void bdrv_padding_destroy(BdrvRequestPadding *pad)
+/**
+ * Free *pad's associated buffers, and perform any necessary finalization steps.
+ */
+static void bdrv_padding_finalize(BdrvRequestPadding *pad)
{
+ if (pad->collapse_bounce_buf) {
+ if (!pad->write) {
+ /*
+ * If padding required elements in the vector to be collapsed into a
+ * bounce buffer, copy the bounce buffer content back
+ */
+ qemu_iovec_from_buf(&pad->pre_collapse_qiov, 0,
+ pad->collapse_bounce_buf, pad->collapse_len);
+ }
+ qemu_vfree(pad->collapse_bounce_buf);
+ qemu_iovec_destroy(&pad->pre_collapse_qiov);
+ }
if (pad->buf) {
qemu_vfree(pad->buf);
qemu_iovec_destroy(&pad->local_qiov);
@@ -1742,6 +1773,101 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad)
memset(pad, 0, sizeof(*pad));
}
+/*
+ * Create pad->local_qiov by wrapping @iov in the padding head and tail, while
+ * ensuring that the resulting vector will not exceed IOV_MAX elements.
+ *
+ * To ensure this, when necessary, the first two or three elements of @iov are
+ * merged into pad->collapse_bounce_buf and replaced by a reference to that
+ * bounce buffer in pad->local_qiov.
+ *
+ * After performing a read request, the data from the bounce buffer must be
+ * copied back into pad->pre_collapse_qiov (e.g. by bdrv_padding_finalize()).
+ */
+static int bdrv_create_padded_qiov(BlockDriverState *bs,
+ BdrvRequestPadding *pad,
+ struct iovec *iov, int niov,
+ size_t iov_offset, size_t bytes)
+{
+ int padded_niov, surplus_count, collapse_count;
+
+ /* Assert this invariant */
+ assert(niov <= IOV_MAX);
+
+ /*
+ * Cannot pad if resulting length would exceed SIZE_MAX. Returning an error
+ * to the guest is not ideal, but there is little else we can do. At least
+ * this will practically never happen on 64-bit systems.
+ */
+ if (SIZE_MAX - pad->head < bytes ||
+ SIZE_MAX - pad->head - bytes < pad->tail)
+ {
+ return -EINVAL;
+ }
+
+ /* Length of the resulting IOV if we just concatenated everything */
+ padded_niov = !!pad->head + niov + !!pad->tail;
+
+ qemu_iovec_init(&pad->local_qiov, MIN(padded_niov, IOV_MAX));
+
+ if (pad->head) {
+ qemu_iovec_add(&pad->local_qiov, pad->buf, pad->head);
+ }
+
+ /*
+ * If padded_niov > IOV_MAX, we cannot just concatenate everything.
+ * Instead, merge the first two or three elements of @iov to reduce the
+ * number of vector elements as necessary.
+ */
+ if (padded_niov > IOV_MAX) {
+ /*
+ * Only head and tail can have lead to the number of entries exceeding
+ * IOV_MAX, so we can exceed it by the head and tail at most. We need
+ * to reduce the number of elements by `surplus_count`, so we merge that
+ * many elements plus one into one element.
+ */
+ surplus_count = padded_niov - IOV_MAX;
+ assert(surplus_count <= !!pad->head + !!pad->tail);
+ collapse_count = surplus_count + 1;
+
+ /*
+ * Move the elements to collapse into `pad->pre_collapse_qiov`, then
+ * advance `iov` (and associated variables) by those elements.
+ */
+ qemu_iovec_init(&pad->pre_collapse_qiov, collapse_count);
+ qemu_iovec_concat_iov(&pad->pre_collapse_qiov, iov,
+ collapse_count, iov_offset, SIZE_MAX);
+ iov += collapse_count;
+ iov_offset = 0;
+ niov -= collapse_count;
+ bytes -= pad->pre_collapse_qiov.size;
+
+ /*
+ * Construct the bounce buffer to match the length of the to-collapse
+ * vector elements, and for write requests, initialize it with the data
+ * from those elements. Then add it to `pad->local_qiov`.
+ */
+ pad->collapse_len = pad->pre_collapse_qiov.size;
+ pad->collapse_bounce_buf = qemu_blockalign(bs, pad->collapse_len);
+ if (pad->write) {
+ qemu_iovec_to_buf(&pad->pre_collapse_qiov, 0,
+ pad->collapse_bounce_buf, pad->collapse_len);
+ }
+ qemu_iovec_add(&pad->local_qiov,
+ pad->collapse_bounce_buf, pad->collapse_len);
+ }
+
+ qemu_iovec_concat_iov(&pad->local_qiov, iov, niov, iov_offset, bytes);
+
+ if (pad->tail) {
+ qemu_iovec_add(&pad->local_qiov,
+ pad->buf + pad->buf_len - pad->tail, pad->tail);
+ }
+
+ assert(pad->local_qiov.niov == MIN(padded_niov, IOV_MAX));
+ return 0;
+}
+
/*
* bdrv_pad_request
*
@@ -1749,6 +1875,8 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad)
* read of padding, bdrv_padding_rmw_read() should be called separately if
* needed.
*
+ * @write is true for write requests, false for read requests.
+ *
* Request parameters (@qiov, &qiov_offset, &offset, &bytes) are in-out:
* - on function start they represent original request
* - on failure or when padding is not needed they are unchanged
@@ -1757,25 +1885,33 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad)
static int bdrv_pad_request(BlockDriverState *bs,
QEMUIOVector **qiov, size_t *qiov_offset,
int64_t *offset, int64_t *bytes,
+ bool write,
BdrvRequestPadding *pad, bool *padded)
{
int ret;
+ struct iovec *sliced_iov;
+ int sliced_niov;
+ size_t sliced_head, sliced_tail;
bdrv_check_qiov_request(*offset, *bytes, *qiov, *qiov_offset, &error_abort);
- if (!bdrv_init_padding(bs, *offset, *bytes, pad)) {
+ if (!bdrv_init_padding(bs, *offset, *bytes, write, pad)) {
if (padded) {
*padded = false;
}
return 0;
}
- ret = qemu_iovec_init_extended(&pad->local_qiov, pad->buf, pad->head,
- *qiov, *qiov_offset, *bytes,
- pad->buf + pad->buf_len - pad->tail,
- pad->tail);
+ sliced_iov = qemu_iovec_slice(*qiov, *qiov_offset, *bytes,
+ &sliced_head, &sliced_tail,
+ &sliced_niov);
+
+ /* Guaranteed by bdrv_check_qiov_request() */
+ assert(*bytes <= SIZE_MAX);
+ ret = bdrv_create_padded_qiov(bs, pad, sliced_iov, sliced_niov,
+ sliced_head, *bytes);
if (ret < 0) {
- bdrv_padding_destroy(pad);
+ bdrv_padding_finalize(pad);
return ret;
}
*bytes += pad->head + pad->tail;
@@ -1836,8 +1972,8 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
flags |= BDRV_REQ_COPY_ON_READ;
}
- ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
- NULL);
+ ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, false,
+ &pad, NULL);
if (ret < 0) {
goto fail;
}
@@ -1847,7 +1983,7 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
bs->bl.request_alignment,
qiov, qiov_offset, flags);
tracked_request_end(&req);
- bdrv_padding_destroy(&pad);
+ bdrv_padding_finalize(&pad);
fail:
bdrv_dec_in_flight(bs);
@@ -2167,7 +2303,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BdrvChild *child,
bool padding;
BdrvRequestPadding pad;
- padding = bdrv_init_padding(bs, offset, bytes, &pad);
+ padding = bdrv_init_padding(bs, offset, bytes, true, &pad);
if (padding) {
bdrv_make_request_serialising(req, align);
@@ -2214,7 +2350,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BdrvChild *child,
}
out:
- bdrv_padding_destroy(&pad);
+ bdrv_padding_finalize(&pad);
return ret;
}
@@ -2280,8 +2416,8 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
* bdrv_co_do_zero_pwritev() does aligning by itself, so, we do
* alignment only if there is no ZERO flag.
*/
- ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
- &padded);
+ ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, true,
+ &pad, &padded);
if (ret < 0) {
return ret;
}
@@ -2310,7 +2446,7 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
ret = bdrv_aligned_pwritev(child, &req, offset, bytes, align,
qiov, qiov_offset, flags);
- bdrv_padding_destroy(&pad);
+ bdrv_padding_finalize(&pad);
out:
tracked_request_end(&req);
--
2.39.3

View File

@ -0,0 +1,75 @@
From b9866279996ee065cb524bf30bc70e22efbab303 Mon Sep 17 00:00:00 2001
From: Hanna Czenczek <hreitz@redhat.com>
Date: Fri, 14 Jul 2023 10:59:38 +0200
Subject: [PATCH 5/5] block: Fix pad_request's request restriction
RH-Author: Hanna Czenczek <hreitz@redhat.com>
RH-MergeRequest: 291: block: Split padded I/O vectors exceeding IOV_MAX
RH-Bugzilla: 2141964
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Commit: [5/5] f9188bd089d6c67185ea1accde20d491a2ed3193
bdrv_pad_request() relies on requests' lengths not to exceed SIZE_MAX,
which bdrv_check_qiov_request() does not guarantee.
bdrv_check_request32() however will guarantee this, and both of
bdrv_pad_request()'s callers (bdrv_co_preadv_part() and
bdrv_co_pwritev_part()) already run it before calling
bdrv_pad_request(). Therefore, bdrv_pad_request() can safely call
bdrv_check_request32() without expecting error, too.
In effect, this patch will not change guest-visible behavior. It is a
clean-up to tighten a condition to match what is guaranteed by our
callers, and which exists purely to show clearly why the subsequent
assertion (`assert(*bytes <= SIZE_MAX)`) is always true.
Note there is a difference between the interfaces of
bdrv_check_qiov_request() and bdrv_check_request32(): The former takes
an errp, the latter does not, so we can no longer just pass
&error_abort. Instead, we need to check the returned value. While we
do expect success (because the callers have already run this function),
an assert(ret == 0) is not much simpler than just to return an error if
it occurs, so let us handle errors by returning them up the stack now.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-id: 20230714085938.202730-1-hreitz@redhat.com
Fixes: 18743311b829cafc1737a5f20bc3248d5f91ee2a
("block: Collapse padded I/O vecs exceeding IOV_MAX")
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
block/io.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/block/io.c b/block/io.c
index 0fe8f0dd40..8ae57728a6 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1893,7 +1893,11 @@ static int bdrv_pad_request(BlockDriverState *bs,
int sliced_niov;
size_t sliced_head, sliced_tail;
- bdrv_check_qiov_request(*offset, *bytes, *qiov, *qiov_offset, &error_abort);
+ /* Should have been checked by the caller already */
+ ret = bdrv_check_request32(*offset, *bytes, *qiov, *qiov_offset);
+ if (ret < 0) {
+ return ret;
+ }
if (!bdrv_init_padding(bs, *offset, *bytes, write, pad)) {
if (padded) {
@@ -1906,7 +1910,7 @@ static int bdrv_pad_request(BlockDriverState *bs,
&sliced_head, &sliced_tail,
&sliced_niov);
- /* Guaranteed by bdrv_check_qiov_request() */
+ /* Guaranteed by bdrv_check_request32() */
assert(*bytes <= SIZE_MAX);
ret = bdrv_create_padded_qiov(bs, pad, sliced_iov, sliced_niov,
sliced_head, *bytes);
--
2.39.3

View File

@ -0,0 +1,56 @@
From 866a3b56f6a2d43f3cf7b3313fb41808bc5e6e1f Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 03/15] checkpatch: add qemu_bh_new/aio_bh_new checks
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [3/12] 620b480b0878c18223f3cc103450bc16aa6d7e21 (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
commit ef56ffbdd6b0605dc1e305611287b948c970e236
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:08 2023 -0400
checkpatch: add qemu_bh_new/aio_bh_new checks
Advise authors to use the _guarded versions of the APIs, instead.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-4-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
scripts/checkpatch.pl | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index cb8eff233e..b2428e80cc 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2858,6 +2858,14 @@ sub process {
if ($line =~ /\bsignal\s*\(/ && !($line =~ /SIG_(?:IGN|DFL)/)) {
ERROR("use sigaction to establish signal handlers; signal is not portable\n" . $herecurr);
}
+# recommend qemu_bh_new_guarded instead of qemu_bh_new
+ if ($realfile =~ /.*\/hw\/.*/ && $line =~ /\bqemu_bh_new\s*\(/) {
+ ERROR("use qemu_bh_new_guarded() instead of qemu_bh_new() to avoid reentrancy problems\n" . $herecurr);
+ }
+# recommend aio_bh_new_guarded instead of aio_bh_new
+ if ($realfile =~ /.*\/hw\/.*/ && $line =~ /\baio_bh_new\s*\(/) {
+ ERROR("use aio_bh_new_guarded() instead of aio_bh_new() to avoid reentrancy problems\n" . $herecurr);
+ }
# check for module_init(), use category-specific init macros explicitly please
if ($line =~ /^module_init\s*\(/) {
ERROR("please use block_init(), type_init() etc. instead of module_init()\n" . $herecurr);
--
2.37.3

View File

@ -0,0 +1,127 @@
From 103608465b8bd2edf7f9aaef5c3c93309ccf9ec2 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Tue, 21 Feb 2023 16:22:17 -0500
Subject: [PATCH 12/13] dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel()
race
RH-Author: Stefan Hajnoczi <stefanha@redhat.com>
RH-MergeRequest: 264: scsi: protect req->aiocb with AioContext lock
RH-Bugzilla: 2090990
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [2/3] 14f5835093ba8c5111f3ada2fe87730371aca733
dma_blk_cb() only takes the AioContext lock around ->io_func(). That
means the rest of dma_blk_cb() is not protected. In particular, the
DMAAIOCB field accesses happen outside the lock.
There is a race when the main loop thread holds the AioContext lock and
invokes scsi_device_purge_requests() -> bdrv_aio_cancel() ->
dma_aio_cancel() while an IOThread executes dma_blk_cb(). The dbs->acb
field determines how cancellation proceeds. If dma_aio_cancel() sees
dbs->acb == NULL while dma_blk_cb() is still running, the request can be
completed twice (-ECANCELED and the actual return value).
The following assertion can occur with virtio-scsi when an IOThread is
used:
../hw/scsi/scsi-disk.c:368: scsi_dma_complete: Assertion `r->req.aiocb != NULL' failed.
Fix the race by holding the AioContext across dma_blk_cb(). Now
dma_aio_cancel() under the AioContext lock will not see
inconsistent/intermediate states.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230221212218.1378734-3-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit abfcd2760b3e70727bbc0792221b8b98a733dc32)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
hw/scsi/scsi-disk.c | 4 +---
softmmu/dma-helpers.c | 12 +++++++-----
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 179ce22c4a..c8109a673e 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -351,13 +351,12 @@ done:
scsi_req_unref(&r->req);
}
+/* Called with AioContext lock held */
static void scsi_dma_complete(void *opaque, int ret)
{
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
-
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
@@ -367,7 +366,6 @@ static void scsi_dma_complete(void *opaque, int ret)
block_acct_done(blk_get_stats(s->qdev.conf.blk), &r->acct);
}
scsi_dma_complete_noio(r, ret);
- aio_context_release(blk_get_aio_context(s->qdev.conf.blk));
}
static void scsi_read_complete_noio(SCSIDiskReq *r, int ret)
diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
index 7d766a5e89..42af18719a 100644
--- a/softmmu/dma-helpers.c
+++ b/softmmu/dma-helpers.c
@@ -127,17 +127,19 @@ static void dma_complete(DMAAIOCB *dbs, int ret)
static void dma_blk_cb(void *opaque, int ret)
{
DMAAIOCB *dbs = (DMAAIOCB *)opaque;
+ AioContext *ctx = dbs->ctx;
dma_addr_t cur_addr, cur_len;
void *mem;
trace_dma_blk_cb(dbs, ret);
+ aio_context_acquire(ctx);
dbs->acb = NULL;
dbs->offset += dbs->iov.size;
if (dbs->sg_cur_index == dbs->sg->nsg || ret < 0) {
dma_complete(dbs, ret);
- return;
+ goto out;
}
dma_blk_unmap(dbs);
@@ -177,9 +179,9 @@ static void dma_blk_cb(void *opaque, int ret)
if (dbs->iov.size == 0) {
trace_dma_map_wait(dbs);
- dbs->bh = aio_bh_new(dbs->ctx, reschedule_dma, dbs);
+ dbs->bh = aio_bh_new(ctx, reschedule_dma, dbs);
cpu_register_map_client(dbs->bh);
- return;
+ goto out;
}
if (!QEMU_IS_ALIGNED(dbs->iov.size, dbs->align)) {
@@ -187,11 +189,11 @@ static void dma_blk_cb(void *opaque, int ret)
QEMU_ALIGN_DOWN(dbs->iov.size, dbs->align));
}
- aio_context_acquire(dbs->ctx);
dbs->acb = dbs->io_func(dbs->offset, &dbs->iov,
dma_blk_cb, dbs, dbs->io_func_opaque);
- aio_context_release(dbs->ctx);
assert(dbs->acb);
+out:
+ aio_context_release(ctx);
}
static void dma_aio_cancel(BlockAIOCB *acb)
--
2.37.3

View File

@ -0,0 +1,61 @@
From 7693449b235bbab6d32a1b87fa1d0e101c786f3b Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:11:14 -0500
Subject: [PATCH 05/13] edu: add smp_mb__after_rmw()
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [5/10] 300901290e08b253b1278eedc39cd07c1e202b96
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 2482aeea4195ad84cf3d4e5b15b28ec5b420ed5a
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu Mar 2 11:16:13 2023 +0100
edu: add smp_mb__after_rmw()
Ensure ordering between clearing the COMPUTING flag and checking
IRQFACT, and between setting the IRQFACT flag and checking
COMPUTING. This ensures that no wakeups are lost.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
hw/misc/edu.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/hw/misc/edu.c b/hw/misc/edu.c
index e935c418d4..a1f8bc77e7 100644
--- a/hw/misc/edu.c
+++ b/hw/misc/edu.c
@@ -267,6 +267,8 @@ static void edu_mmio_write(void *opaque, hwaddr addr, uint64_t val,
case 0x20:
if (val & EDU_STATUS_IRQFACT) {
qatomic_or(&edu->status, EDU_STATUS_IRQFACT);
+ /* Order check of the COMPUTING flag after setting IRQFACT. */
+ smp_mb__after_rmw();
} else {
qatomic_and(&edu->status, ~EDU_STATUS_IRQFACT);
}
@@ -349,6 +351,9 @@ static void *edu_fact_thread(void *opaque)
qemu_mutex_unlock(&edu->thr_mutex);
qatomic_and(&edu->status, ~EDU_STATUS_COMPUTING);
+ /* Clear COMPUTING flag before checking IRQFACT. */
+ smp_mb__after_rmw();
+
if (qatomic_read(&edu->status) & EDU_STATUS_IRQFACT) {
qemu_mutex_lock_iothread();
edu_raise_irq(edu, FACT_IRQ);
--
2.37.3

View File

@ -0,0 +1,449 @@
From 146cfb23b76b898f08690ffc14aab16d22a41404 Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 04/15] hw: replace most qemu_bh_new calls with
qemu_bh_new_guarded
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [4/12] 00c51d30246b3aa529f6043e35ee471660aa1fce (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
Conflicts: In hw/nvme/ctrl.c there are no calls to qemu_bh_new() at the two locations
the replacement is done in the upstream commit. Instead, timer_new_ns() is
used. We leave these functions unaltered.
commit f63192b0544af5d3e4d5edfd85ab520fcf671377
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:09 2023 -0400
hw: replace most qemu_bh_new calls with qemu_bh_new_guarded
This protects devices from bh->mmio reentrancy issues.
Thanks: Thomas Huth <thuth@redhat.com> for diagnosing OS X test failure.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-5-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
hw/9pfs/xen-9p-backend.c | 5 ++++-
hw/block/dataplane/virtio-blk.c | 3 ++-
hw/block/dataplane/xen-block.c | 5 +++--
hw/char/virtio-serial-bus.c | 3 ++-
hw/display/qxl.c | 9 ++++++---
hw/display/virtio-gpu.c | 6 ++++--
hw/ide/ahci.c | 3 ++-
hw/ide/ahci_internal.h | 1 +
hw/ide/core.c | 4 +++-
hw/misc/imx_rngc.c | 6 ++++--
hw/misc/macio/mac_dbdma.c | 2 +-
hw/net/virtio-net.c | 3 ++-
hw/scsi/mptsas.c | 3 ++-
hw/scsi/scsi-bus.c | 3 ++-
hw/scsi/vmw_pvscsi.c | 3 ++-
hw/usb/dev-uas.c | 3 ++-
hw/usb/hcd-dwc2.c | 3 ++-
hw/usb/hcd-ehci.c | 3 ++-
hw/usb/hcd-uhci.c | 2 +-
hw/usb/host-libusb.c | 6 ++++--
hw/usb/redirect.c | 6 ++++--
hw/usb/xen-usb.c | 3 ++-
hw/virtio/virtio-balloon.c | 5 +++--
hw/virtio/virtio-crypto.c | 3 ++-
24 files changed, 62 insertions(+), 31 deletions(-)
diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
index 65c4979c3c..09f7c13588 100644
--- a/hw/9pfs/xen-9p-backend.c
+++ b/hw/9pfs/xen-9p-backend.c
@@ -60,6 +60,7 @@ typedef struct Xen9pfsDev {
int num_rings;
Xen9pfsRing *rings;
+ MemReentrancyGuard mem_reentrancy_guard;
} Xen9pfsDev;
static void xen_9pfs_disconnect(struct XenLegacyDevice *xendev);
@@ -441,7 +442,9 @@ static int xen_9pfs_connect(struct XenLegacyDevice *xendev)
xen_9pdev->rings[i].ring.out = xen_9pdev->rings[i].data +
XEN_FLEX_RING_SIZE(ring_order);
- xen_9pdev->rings[i].bh = qemu_bh_new(xen_9pfs_bh, &xen_9pdev->rings[i]);
+ xen_9pdev->rings[i].bh = qemu_bh_new_guarded(xen_9pfs_bh,
+ &xen_9pdev->rings[i],
+ &xen_9pdev->mem_reentrancy_guard);
xen_9pdev->rings[i].out_cons = 0;
xen_9pdev->rings[i].out_size = 0;
xen_9pdev->rings[i].inprogress = false;
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index ee5a5352dc..5f0de7da1e 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -127,7 +127,8 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf,
} else {
s->ctx = qemu_get_aio_context();
}
- s->bh = aio_bh_new(s->ctx, notify_guest_bh, s);
+ s->bh = aio_bh_new_guarded(s->ctx, notify_guest_bh, s,
+ &DEVICE(vdev)->mem_reentrancy_guard);
s->batch_notify_vqs = bitmap_new(conf->num_queues);
*dataplane = s;
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 860787580a..07855feea6 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -631,8 +631,9 @@ XenBlockDataPlane *xen_block_dataplane_create(XenDevice *xendev,
} else {
dataplane->ctx = qemu_get_aio_context();
}
- dataplane->bh = aio_bh_new(dataplane->ctx, xen_block_dataplane_bh,
- dataplane);
+ dataplane->bh = aio_bh_new_guarded(dataplane->ctx, xen_block_dataplane_bh,
+ dataplane,
+ &DEVICE(xendev)->mem_reentrancy_guard);
return dataplane;
}
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index f01ec2137c..f18124b155 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -985,7 +985,8 @@ static void virtser_port_device_realize(DeviceState *dev, Error **errp)
return;
}
- port->bh = qemu_bh_new(flush_queued_data_bh, port);
+ port->bh = qemu_bh_new_guarded(flush_queued_data_bh, port,
+ &dev->mem_reentrancy_guard);
port->elem = NULL;
}
diff --git a/hw/display/qxl.c b/hw/display/qxl.c
index bcd9e8716a..0f663b9912 100644
--- a/hw/display/qxl.c
+++ b/hw/display/qxl.c
@@ -2206,11 +2206,14 @@ static void qxl_realize_common(PCIQXLDevice *qxl, Error **errp)
qemu_add_vm_change_state_handler(qxl_vm_change_state_handler, qxl);
- qxl->update_irq = qemu_bh_new(qxl_update_irq_bh, qxl);
+ qxl->update_irq = qemu_bh_new_guarded(qxl_update_irq_bh, qxl,
+ &DEVICE(qxl)->mem_reentrancy_guard);
qxl_reset_state(qxl);
- qxl->update_area_bh = qemu_bh_new(qxl_render_update_area_bh, qxl);
- qxl->ssd.cursor_bh = qemu_bh_new(qemu_spice_cursor_refresh_bh, &qxl->ssd);
+ qxl->update_area_bh = qemu_bh_new_guarded(qxl_render_update_area_bh, qxl,
+ &DEVICE(qxl)->mem_reentrancy_guard);
+ qxl->ssd.cursor_bh = qemu_bh_new_guarded(qemu_spice_cursor_refresh_bh, &qxl->ssd,
+ &DEVICE(qxl)->mem_reentrancy_guard);
}
static void qxl_realize_primary(PCIDevice *dev, Error **errp)
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index d78b9700c7..ecf9079145 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1332,8 +1332,10 @@ void virtio_gpu_device_realize(DeviceState *qdev, Error **errp)
g->ctrl_vq = virtio_get_queue(vdev, 0);
g->cursor_vq = virtio_get_queue(vdev, 1);
- g->ctrl_bh = qemu_bh_new(virtio_gpu_ctrl_bh, g);
- g->cursor_bh = qemu_bh_new(virtio_gpu_cursor_bh, g);
+ g->ctrl_bh = qemu_bh_new_guarded(virtio_gpu_ctrl_bh, g,
+ &qdev->mem_reentrancy_guard);
+ g->cursor_bh = qemu_bh_new_guarded(virtio_gpu_cursor_bh, g,
+ &qdev->mem_reentrancy_guard);
QTAILQ_INIT(&g->reslist);
QTAILQ_INIT(&g->cmdq);
QTAILQ_INIT(&g->fenceq);
diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index a94c6e26fb..7488b28065 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -1504,7 +1504,8 @@ static void ahci_cmd_done(const IDEDMA *dma)
ahci_write_fis_d2h(ad);
if (ad->port_regs.cmd_issue && !ad->check_bh) {
- ad->check_bh = qemu_bh_new(ahci_check_cmd_bh, ad);
+ ad->check_bh = qemu_bh_new_guarded(ahci_check_cmd_bh, ad,
+ &ad->mem_reentrancy_guard);
qemu_bh_schedule(ad->check_bh);
}
}
diff --git a/hw/ide/ahci_internal.h b/hw/ide/ahci_internal.h
index 109de9e2d1..a7768dd69e 100644
--- a/hw/ide/ahci_internal.h
+++ b/hw/ide/ahci_internal.h
@@ -321,6 +321,7 @@ struct AHCIDevice {
bool init_d2h_sent;
AHCICmdHdr *cur_cmd;
NCQTransferState ncq_tfs[AHCI_MAX_CMDS];
+ MemReentrancyGuard mem_reentrancy_guard;
};
struct AHCIPCIState {
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 15138225be..05a32d0a99 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -510,6 +510,7 @@ BlockAIOCB *ide_issue_trim(
BlockCompletionFunc *cb, void *cb_opaque, void *opaque)
{
IDEState *s = opaque;
+ IDEDevice *dev = s->unit ? s->bus->slave : s->bus->master;
TrimAIOCB *iocb;
/* Paired with a decrement in ide_trim_bh_cb() */
@@ -517,7 +518,8 @@ BlockAIOCB *ide_issue_trim(
iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque);
iocb->s = s;
- iocb->bh = qemu_bh_new(ide_trim_bh_cb, iocb);
+ iocb->bh = qemu_bh_new_guarded(ide_trim_bh_cb, iocb,
+ &DEVICE(dev)->mem_reentrancy_guard);
iocb->ret = 0;
iocb->qiov = qiov;
iocb->i = -1;
diff --git a/hw/misc/imx_rngc.c b/hw/misc/imx_rngc.c
index 632c03779c..082c6980ad 100644
--- a/hw/misc/imx_rngc.c
+++ b/hw/misc/imx_rngc.c
@@ -228,8 +228,10 @@ static void imx_rngc_realize(DeviceState *dev, Error **errp)
sysbus_init_mmio(sbd, &s->iomem);
sysbus_init_irq(sbd, &s->irq);
- s->self_test_bh = qemu_bh_new(imx_rngc_self_test, s);
- s->seed_bh = qemu_bh_new(imx_rngc_seed, s);
+ s->self_test_bh = qemu_bh_new_guarded(imx_rngc_self_test, s,
+ &dev->mem_reentrancy_guard);
+ s->seed_bh = qemu_bh_new_guarded(imx_rngc_seed, s,
+ &dev->mem_reentrancy_guard);
}
static void imx_rngc_reset(DeviceState *dev)
diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index e220f1a927..f6a9e76fe7 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -912,7 +912,7 @@ static void mac_dbdma_realize(DeviceState *dev, Error **errp)
{
DBDMAState *s = MAC_DBDMA(dev);
- s->bh = qemu_bh_new(DBDMA_run_bh, s);
+ s->bh = qemu_bh_new_guarded(DBDMA_run_bh, s, &dev->mem_reentrancy_guard);
}
static void mac_dbdma_class_init(ObjectClass *oc, void *data)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 7e172ef829..ddaa8fa122 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2753,7 +2753,8 @@ static void virtio_net_add_queue(VirtIONet *n, int index)
n->vqs[index].tx_vq =
virtio_add_queue(vdev, n->net_conf.tx_queue_size,
virtio_net_handle_tx_bh);
- n->vqs[index].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[index]);
+ n->vqs[index].tx_bh = qemu_bh_new_guarded(virtio_net_tx_bh, &n->vqs[index],
+ &DEVICE(vdev)->mem_reentrancy_guard);
}
n->vqs[index].tx_waiting = 0;
diff --git a/hw/scsi/mptsas.c b/hw/scsi/mptsas.c
index f6c7765544..ab8aaca85d 100644
--- a/hw/scsi/mptsas.c
+++ b/hw/scsi/mptsas.c
@@ -1313,7 +1313,8 @@ static void mptsas_scsi_realize(PCIDevice *dev, Error **errp)
}
s->max_devices = MPTSAS_NUM_PORTS;
- s->request_bh = qemu_bh_new(mptsas_fetch_requests, s);
+ s->request_bh = qemu_bh_new_guarded(mptsas_fetch_requests, s,
+ &DEVICE(dev)->mem_reentrancy_guard);
scsi_bus_init(&s->bus, sizeof(s->bus), &dev->qdev, &mptsas_scsi_info);
}
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 77325d8cc7..b506ab7d04 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -192,7 +192,8 @@ static void scsi_dma_restart_cb(void *opaque, bool running, RunState state)
AioContext *ctx = blk_get_aio_context(s->conf.blk);
/* The reference is dropped in scsi_dma_restart_bh.*/
object_ref(OBJECT(s));
- s->bh = aio_bh_new(ctx, scsi_dma_restart_bh, s);
+ s->bh = aio_bh_new_guarded(ctx, scsi_dma_restart_bh, s,
+ &DEVICE(s)->mem_reentrancy_guard);
qemu_bh_schedule(s->bh);
}
}
diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
index cd76bd67ab..4c36febbc0 100644
--- a/hw/scsi/vmw_pvscsi.c
+++ b/hw/scsi/vmw_pvscsi.c
@@ -1178,7 +1178,8 @@ pvscsi_realizefn(PCIDevice *pci_dev, Error **errp)
pcie_endpoint_cap_init(pci_dev, PVSCSI_EXP_EP_OFFSET);
}
- s->completion_worker = qemu_bh_new(pvscsi_process_completion_queue, s);
+ s->completion_worker = qemu_bh_new_guarded(pvscsi_process_completion_queue, s,
+ &DEVICE(pci_dev)->mem_reentrancy_guard);
scsi_bus_init(&s->bus, sizeof(s->bus), DEVICE(pci_dev), &pvscsi_scsi_info);
/* override default SCSI bus hotplug-handler, with pvscsi's one */
diff --git a/hw/usb/dev-uas.c b/hw/usb/dev-uas.c
index 599d6b52a0..a36a7c3013 100644
--- a/hw/usb/dev-uas.c
+++ b/hw/usb/dev-uas.c
@@ -935,7 +935,8 @@ static void usb_uas_realize(USBDevice *dev, Error **errp)
QTAILQ_INIT(&uas->results);
QTAILQ_INIT(&uas->requests);
- uas->status_bh = qemu_bh_new(usb_uas_send_status_bh, uas);
+ uas->status_bh = qemu_bh_new_guarded(usb_uas_send_status_bh, uas,
+ &d->mem_reentrancy_guard);
dev->flags |= (1 << USB_DEV_FLAG_IS_SCSI_STORAGE);
scsi_bus_init(&uas->bus, sizeof(uas->bus), DEVICE(dev), &usb_uas_scsi_info);
diff --git a/hw/usb/hcd-dwc2.c b/hw/usb/hcd-dwc2.c
index e1d96acf7e..0e238f8422 100644
--- a/hw/usb/hcd-dwc2.c
+++ b/hw/usb/hcd-dwc2.c
@@ -1364,7 +1364,8 @@ static void dwc2_realize(DeviceState *dev, Error **errp)
s->fi = USB_FRMINTVL - 1;
s->eof_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, dwc2_frame_boundary, s);
s->frame_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, dwc2_work_timer, s);
- s->async_bh = qemu_bh_new(dwc2_work_bh, s);
+ s->async_bh = qemu_bh_new_guarded(dwc2_work_bh, s,
+ &dev->mem_reentrancy_guard);
sysbus_init_irq(sbd, &s->irq);
}
diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index 6caa7ac6c2..df4ff6f2c1 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -2528,7 +2528,8 @@ void usb_ehci_realize(EHCIState *s, DeviceState *dev, Error **errp)
}
s->frame_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, ehci_work_timer, s);
- s->async_bh = qemu_bh_new(ehci_work_bh, s);
+ s->async_bh = qemu_bh_new_guarded(ehci_work_bh, s,
+ &dev->mem_reentrancy_guard);
s->device = dev;
s->vmstate = qemu_add_vm_change_state_handler(usb_ehci_vm_state_change, s);
diff --git a/hw/usb/hcd-uhci.c b/hw/usb/hcd-uhci.c
index 7930b868fa..469c5e57e9 100644
--- a/hw/usb/hcd-uhci.c
+++ b/hw/usb/hcd-uhci.c
@@ -1195,7 +1195,7 @@ void usb_uhci_common_realize(PCIDevice *dev, Error **errp)
USB_SPEED_MASK_LOW | USB_SPEED_MASK_FULL);
}
}
- s->bh = qemu_bh_new(uhci_bh, s);
+ s->bh = qemu_bh_new_guarded(uhci_bh, s, &DEVICE(dev)->mem_reentrancy_guard);
s->frame_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, uhci_frame_timer, s);
s->num_ports_vmstate = NB_PORTS;
QTAILQ_INIT(&s->queues);
diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index d0d46dd0a4..09b961116b 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -1141,7 +1141,8 @@ static void usb_host_nodev_bh(void *opaque)
static void usb_host_nodev(USBHostDevice *s)
{
if (!s->bh_nodev) {
- s->bh_nodev = qemu_bh_new(usb_host_nodev_bh, s);
+ s->bh_nodev = qemu_bh_new_guarded(usb_host_nodev_bh, s,
+ &DEVICE(s)->mem_reentrancy_guard);
}
qemu_bh_schedule(s->bh_nodev);
}
@@ -1739,7 +1740,8 @@ static int usb_host_post_load(void *opaque, int version_id)
USBHostDevice *dev = opaque;
if (!dev->bh_postld) {
- dev->bh_postld = qemu_bh_new(usb_host_post_load_bh, dev);
+ dev->bh_postld = qemu_bh_new_guarded(usb_host_post_load_bh, dev,
+ &DEVICE(dev)->mem_reentrancy_guard);
}
qemu_bh_schedule(dev->bh_postld);
dev->bh_postld_pending = true;
diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index 5f0ef9cb3b..59cd3cd7c4 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -1437,8 +1437,10 @@ static void usbredir_realize(USBDevice *udev, Error **errp)
}
}
- dev->chardev_close_bh = qemu_bh_new(usbredir_chardev_close_bh, dev);
- dev->device_reject_bh = qemu_bh_new(usbredir_device_reject_bh, dev);
+ dev->chardev_close_bh = qemu_bh_new_guarded(usbredir_chardev_close_bh, dev,
+ &DEVICE(dev)->mem_reentrancy_guard);
+ dev->device_reject_bh = qemu_bh_new_guarded(usbredir_device_reject_bh, dev,
+ &DEVICE(dev)->mem_reentrancy_guard);
dev->attach_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, usbredir_do_attach, dev);
packet_id_queue_init(&dev->cancelled, dev, "cancelled");
diff --git a/hw/usb/xen-usb.c b/hw/usb/xen-usb.c
index 0f7369e7ed..dec91294ad 100644
--- a/hw/usb/xen-usb.c
+++ b/hw/usb/xen-usb.c
@@ -1021,7 +1021,8 @@ static void usbback_alloc(struct XenLegacyDevice *xendev)
QTAILQ_INIT(&usbif->req_free_q);
QSIMPLEQ_INIT(&usbif->hotplug_q);
- usbif->bh = qemu_bh_new(usbback_bh, usbif);
+ usbif->bh = qemu_bh_new_guarded(usbback_bh, usbif,
+ &DEVICE(xendev)->mem_reentrancy_guard);
}
static int usbback_free(struct XenLegacyDevice *xendev)
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 9a4f491b54..f503572e27 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -917,8 +917,9 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
precopy_add_notifier(&s->free_page_hint_notify);
object_ref(OBJECT(s->iothread));
- s->free_page_bh = aio_bh_new(iothread_get_aio_context(s->iothread),
- virtio_ballloon_get_free_page_hints, s);
+ s->free_page_bh = aio_bh_new_guarded(iothread_get_aio_context(s->iothread),
+ virtio_ballloon_get_free_page_hints, s,
+ &dev->mem_reentrancy_guard);
}
if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_REPORTING)) {
diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index 54f9bbb789..1be7bb543c 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -817,7 +817,8 @@ static void virtio_crypto_device_realize(DeviceState *dev, Error **errp)
vcrypto->vqs[i].dataq =
virtio_add_queue(vdev, 1024, virtio_crypto_handle_dataq_bh);
vcrypto->vqs[i].dataq_bh =
- qemu_bh_new(virtio_crypto_dataq_bh, &vcrypto->vqs[i]);
+ qemu_bh_new_guarded(virtio_crypto_dataq_bh, &vcrypto->vqs[i],
+ &dev->mem_reentrancy_guard);
vcrypto->vqs[i].vcrypto = vcrypto;
}
--
2.37.3

View File

@ -0,0 +1,260 @@
From 57a26ba1c4053cdc426653f921e66f7a8efd3ce7 Mon Sep 17 00:00:00 2001
From: Thomas Huth <thuth@redhat.com>
Date: Mon, 22 May 2023 11:10:11 +0200
Subject: [PATCH 12/15] hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI
controller (CVE-2023-0330)
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [12/12] 28f5e04344109d8514869c50468bef481437201d (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
commit b987718bbb1d0eabf95499b976212dd5f0120d75
Author: Thomas Huth <thuth@redhat.com>
Date: Mon May 22 11:10:11 2023 +0200
hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)
We cannot use the generic reentrancy guard in the LSI code, so
we have to manually prevent endless reentrancy here. The problematic
lsi_execute_script() function has already a way to detect whether
too many instructions have been executed - we just have to slightly
change the logic here that it also takes into account if the function
has been called too often in a reentrant way.
The code in fuzz-lsi53c895a-test.c has been taken from an earlier
patch by Mauro Matteo Cascella.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563
Message-Id: <20230522091011.1082574-1-thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
hw/scsi/lsi53c895a.c | 23 +++--
tests/qtest/fuzz-lsi53c895a-test.c | 161 +++++++++++++++++++++++++++++
2 files changed, 178 insertions(+), 6 deletions(-)
create mode 100644 tests/qtest/fuzz-lsi53c895a-test.c
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 2b9cb2ac5d..b60786fd56 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -1133,15 +1133,24 @@ static void lsi_execute_script(LSIState *s)
uint32_t addr, addr_high;
int opcode;
int insn_processed = 0;
+ static int reentrancy_level;
+
+ reentrancy_level++;
s->istat1 |= LSI_ISTAT1_SRUN;
again:
- if (++insn_processed > LSI_MAX_INSN) {
- /* Some windows drivers make the device spin waiting for a memory
- location to change. If we have been executed a lot of code then
- assume this is the case and force an unexpected device disconnect.
- This is apparently sufficient to beat the drivers into submission.
- */
+ /*
+ * Some windows drivers make the device spin waiting for a memory location
+ * to change. If we have executed more than LSI_MAX_INSN instructions then
+ * assume this is the case and force an unexpected device disconnect. This
+ * is apparently sufficient to beat the drivers into submission.
+ *
+ * Another issue (CVE-2023-0330) can occur if the script is programmed to
+ * trigger itself again and again. Avoid this problem by stopping after
+ * being called multiple times in a reentrant way (8 is an arbitrary value
+ * which should be enough for all valid use cases).
+ */
+ if (++insn_processed > LSI_MAX_INSN || reentrancy_level > 8) {
if (!(s->sien0 & LSI_SIST0_UDC)) {
qemu_log_mask(LOG_GUEST_ERROR,
"lsi_scsi: inf. loop with UDC masked");
@@ -1595,6 +1604,8 @@ again:
}
}
trace_lsi_execute_script_stop();
+
+ reentrancy_level--;
}
static uint8_t lsi_reg_readb(LSIState *s, int offset)
diff --git a/tests/qtest/fuzz-lsi53c895a-test.c b/tests/qtest/fuzz-lsi53c895a-test.c
new file mode 100644
index 0000000000..1b55928b9f
--- /dev/null
+++ b/tests/qtest/fuzz-lsi53c895a-test.c
@@ -0,0 +1,161 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * QTest fuzzer-generated testcase for LSI53C895A device
+ *
+ * Copyright (c) Red Hat
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest.h"
+
+/*
+ * This used to trigger a DMA reentrancy issue
+ * leading to memory corruption bugs like stack
+ * overflow or use-after-free
+ * https://gitlab.com/qemu-project/qemu/-/issues/1563
+ */
+static void test_lsi_dma_reentrancy(void)
+{
+ QTestState *s;
+
+ s = qtest_init("-M q35 -m 512M -nodefaults "
+ "-blockdev driver=null-co,node-name=null0 "
+ "-device lsi53c810 -device scsi-cd,drive=null0");
+
+ qtest_outl(s, 0xcf8, 0x80000804); /* PCI Command Register */
+ qtest_outw(s, 0xcfc, 0x7); /* Enables accesses */
+ qtest_outl(s, 0xcf8, 0x80000814); /* Memory Bar 1 */
+ qtest_outl(s, 0xcfc, 0xff100000); /* Set MMIO Address*/
+ qtest_outl(s, 0xcf8, 0x80000818); /* Memory Bar 2 */
+ qtest_outl(s, 0xcfc, 0xff000000); /* Set RAM Address*/
+ qtest_writel(s, 0xff000000, 0xc0000024);
+ qtest_writel(s, 0xff000114, 0x00000080);
+ qtest_writel(s, 0xff00012c, 0xff000000);
+ qtest_writel(s, 0xff000004, 0xff000114);
+ qtest_writel(s, 0xff000008, 0xff100014);
+ qtest_writel(s, 0xff10002f, 0x000000ff);
+
+ qtest_quit(s);
+}
+
+/*
+ * This used to trigger a UAF in lsi_do_msgout()
+ * https://gitlab.com/qemu-project/qemu/-/issues/972
+ */
+static void test_lsi_do_msgout_cancel_req(void)
+{
+ QTestState *s;
+
+ if (sizeof(void *) == 4) {
+ g_test_skip("memory size too big for 32-bit build");
+ return;
+ }
+
+ s = qtest_init("-M q35 -m 2G -nodefaults "
+ "-device lsi53c895a,id=scsi "
+ "-device scsi-hd,drive=disk0 "
+ "-drive file=null-co://,id=disk0,if=none,format=raw");
+
+ qtest_outl(s, 0xcf8, 0x80000810);
+ qtest_outl(s, 0xcf8, 0xc000);
+ qtest_outl(s, 0xcf8, 0x80000810);
+ qtest_outw(s, 0xcfc, 0x7);
+ qtest_outl(s, 0xcf8, 0x80000810);
+ qtest_outl(s, 0xcfc, 0xc000);
+ qtest_outl(s, 0xcf8, 0x80000804);
+ qtest_outw(s, 0xcfc, 0x05);
+ qtest_writeb(s, 0x69736c10, 0x08);
+ qtest_writeb(s, 0x69736c13, 0x58);
+ qtest_writeb(s, 0x69736c1a, 0x01);
+ qtest_writeb(s, 0x69736c1b, 0x06);
+ qtest_writeb(s, 0x69736c22, 0x01);
+ qtest_writeb(s, 0x69736c23, 0x07);
+ qtest_writeb(s, 0x69736c2b, 0x02);
+ qtest_writeb(s, 0x69736c48, 0x08);
+ qtest_writeb(s, 0x69736c4b, 0x58);
+ qtest_writeb(s, 0x69736c52, 0x04);
+ qtest_writeb(s, 0x69736c53, 0x06);
+ qtest_writeb(s, 0x69736c5b, 0x02);
+ qtest_outl(s, 0xc02d, 0x697300);
+ qtest_writeb(s, 0x5a554662, 0x01);
+ qtest_writeb(s, 0x5a554663, 0x07);
+ qtest_writeb(s, 0x5a55466a, 0x10);
+ qtest_writeb(s, 0x5a55466b, 0x22);
+ qtest_writeb(s, 0x5a55466c, 0x5a);
+ qtest_writeb(s, 0x5a55466d, 0x5a);
+ qtest_writeb(s, 0x5a55466e, 0x34);
+ qtest_writeb(s, 0x5a55466f, 0x5a);
+ qtest_writeb(s, 0x5a345a5a, 0x77);
+ qtest_writeb(s, 0x5a345a5b, 0x55);
+ qtest_writeb(s, 0x5a345a5c, 0x51);
+ qtest_writeb(s, 0x5a345a5d, 0x27);
+ qtest_writeb(s, 0x27515577, 0x41);
+ qtest_outl(s, 0xc02d, 0x5a5500);
+ qtest_writeb(s, 0x364001d0, 0x08);
+ qtest_writeb(s, 0x364001d3, 0x58);
+ qtest_writeb(s, 0x364001da, 0x01);
+ qtest_writeb(s, 0x364001db, 0x26);
+ qtest_writeb(s, 0x364001dc, 0x0d);
+ qtest_writeb(s, 0x364001dd, 0xae);
+ qtest_writeb(s, 0x364001de, 0x41);
+ qtest_writeb(s, 0x364001df, 0x5a);
+ qtest_writeb(s, 0x5a41ae0d, 0xf8);
+ qtest_writeb(s, 0x5a41ae0e, 0x36);
+ qtest_writeb(s, 0x5a41ae0f, 0xd7);
+ qtest_writeb(s, 0x5a41ae10, 0x36);
+ qtest_writeb(s, 0x36d736f8, 0x0c);
+ qtest_writeb(s, 0x36d736f9, 0x80);
+ qtest_writeb(s, 0x36d736fa, 0x0d);
+ qtest_outl(s, 0xc02d, 0x364000);
+
+ qtest_quit(s);
+}
+
+/*
+ * This used to trigger the assert in lsi_do_dma()
+ * https://bugs.launchpad.net/qemu/+bug/697510
+ * https://bugs.launchpad.net/qemu/+bug/1905521
+ * https://bugs.launchpad.net/qemu/+bug/1908515
+ */
+static void test_lsi_do_dma_empty_queue(void)
+{
+ QTestState *s;
+
+ s = qtest_init("-M q35 -nographic -monitor none -serial none "
+ "-drive if=none,id=drive0,"
+ "file=null-co://,file.read-zeroes=on,format=raw "
+ "-device lsi53c895a,id=scsi0 "
+ "-device scsi-hd,drive=drive0,"
+ "bus=scsi0.0,channel=0,scsi-id=0,lun=0");
+ qtest_outl(s, 0xcf8, 0x80001814);
+ qtest_outl(s, 0xcfc, 0xe1068000);
+ qtest_outl(s, 0xcf8, 0x80001818);
+ qtest_outl(s, 0xcf8, 0x80001804);
+ qtest_outw(s, 0xcfc, 0x7);
+ qtest_outl(s, 0xcf8, 0x80002010);
+
+ qtest_writeb(s, 0xe106802e, 0xff); /* Fill DSP bits 16-23 */
+ qtest_writeb(s, 0xe106802f, 0xff); /* Fill DSP bits 24-31: trigger SCRIPT */
+
+ qtest_quit(s);
+}
+
+int main(int argc, char **argv)
+{
+ g_test_init(&argc, &argv, NULL);
+
+ if (!qtest_has_device("lsi53c895a")) {
+ return 0;
+ }
+
+ qtest_add_func("fuzz/lsi53c895a/lsi_do_dma_empty_queue",
+ test_lsi_do_dma_empty_queue);
+
+ qtest_add_func("fuzz/lsi53c895a/lsi_do_msgout_cancel_req",
+ test_lsi_do_msgout_cancel_req);
+
+ qtest_add_func("fuzz/lsi53c895a/lsi_dma_reentrancy",
+ test_lsi_dma_reentrancy);
+
+ return g_test_run();
+}
--
2.37.3

View File

@ -0,0 +1,53 @@
From 18ac13c7d64266238bd44b2188e0d044af3c3377 Mon Sep 17 00:00:00 2001
From: Bandan Das <bsd@redhat.com>
Date: Thu, 3 Aug 2023 15:14:14 -0400
Subject: [PATCH 4/5] i386/cpu: Update how the EBX register of CPUID 0x8000001F
is set
RH-Author: Bandan Das <None>
RH-MergeRequest: 296: Updates to SEV reduced-phys-bits parameter
RH-Bugzilla: 2214840
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [4/4] 8b236fd9bc4c177bfacf6220a429e711b5bf062e
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2214840
commit fb6bbafc0f19385fb257ee073ed13dcaf613f2f8
Author: Tom Lendacky <thomas.lendacky@amd.com>
Date: Fri Sep 30 10:14:30 2022 -0500
i386/cpu: Update how the EBX register of CPUID 0x8000001F is set
Update the setting of CPUID 0x8000001F EBX to clearly document the ranges
associated with fields being set.
Fixes: 6cb8f2a663 ("cpu/i386: populate CPUID 0x8000_001F when SEV is active")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <5822fd7d02b575121380e1f493a8f6d9eba2b11a.1664550870.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
---
target/i386/cpu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 9d3dcdcc0d..265f0aadfc 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5836,8 +5836,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
if (sev_enabled()) {
*eax = 0x2;
*eax |= sev_es_enabled() ? 0x8 : 0;
- *ebx = sev_get_cbit_position();
- *ebx |= sev_get_reduced_phys_bits() << 6;
+ *ebx = sev_get_cbit_position() & 0x3f; /* EBX[5:0] */
+ *ebx |= (sev_get_reduced_phys_bits() & 0x3f) << 6; /* EBX[11:6] */
}
break;
default:
--
2.37.3

View File

@ -0,0 +1,78 @@
From 19504ea76b6341c11213316402bb5194487e1f01 Mon Sep 17 00:00:00 2001
From: Bandan Das <bsd@redhat.com>
Date: Thu, 3 Aug 2023 15:13:19 -0400
Subject: [PATCH 3/5] i386/sev: Update checks and information related to
reduced-phys-bits
RH-Author: Bandan Das <None>
RH-MergeRequest: 296: Updates to SEV reduced-phys-bits parameter
RH-Bugzilla: 2214840
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [3/4] b617173d2b15fa39cdc02b5c1ac4d52e9b0dfede
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2214840
commit 8168fed9f84e3128f7628969ae78af49433d5ce7
Author: Tom Lendacky <thomas.lendacky@amd.com>
Date: Fri Sep 30 10:14:29 2022 -0500
i386/sev: Update checks and information related to reduced-phys-bits
The value of the reduced-phys-bits parameter is propogated to the CPUID
information exposed to the guest. Update the current validation check to
account for the size of the CPUID field (6-bits), ensuring the value is
in the range of 1 to 63.
Maintain backward compatibility, to an extent, by allowing a value greater
than 1 (so that the previously documented value of 5 still works), but not
allowing anything over 63.
Fixes: d8575c6c02 ("sev/i386: add command to initialize the memory encryption context")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <cca5341a95ac73f904e6300f10b04f9c62e4e8ff.1664550870.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
---
target/i386/sev.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 025ff7a6f8..ba6a65e90c 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -892,15 +892,26 @@ int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
host_cpuid(0x8000001F, 0, NULL, &ebx, NULL, NULL);
host_cbitpos = ebx & 0x3f;
+ /*
+ * The cbitpos value will be placed in bit positions 5:0 of the EBX
+ * register of CPUID 0x8000001F. No need to verify the range as the
+ * comparison against the host value accomplishes that.
+ */
if (host_cbitpos != sev->cbitpos) {
error_setg(errp, "%s: cbitpos check failed, host '%d' requested '%d'",
__func__, host_cbitpos, sev->cbitpos);
goto err;
}
- if (sev->reduced_phys_bits < 1) {
- error_setg(errp, "%s: reduced_phys_bits check failed, it should be >=1,"
- " requested '%d'", __func__, sev->reduced_phys_bits);
+ /*
+ * The reduced-phys-bits value will be placed in bit positions 11:6 of
+ * the EBX register of CPUID 0x8000001F, so verify the supplied value
+ * is in the range of 1 to 63.
+ */
+ if (sev->reduced_phys_bits < 1 || sev->reduced_phys_bits > 63) {
+ error_setg(errp, "%s: reduced_phys_bits check failed,"
+ " it should be in the range of 1 to 63, requested '%d'",
+ __func__, sev->reduced_phys_bits);
goto err;
}
--
2.37.3

View File

@ -0,0 +1,187 @@
From 084e211448f40c3e9d9b1907f6c98dca9f998bc3 Mon Sep 17 00:00:00 2001
From: Hanna Czenczek <hreitz@redhat.com>
Date: Tue, 11 Apr 2023 19:34:18 +0200
Subject: [PATCH 4/5] iotests/iov-padding: New test
RH-Author: Hanna Czenczek <hreitz@redhat.com>
RH-MergeRequest: 291: block: Split padded I/O vectors exceeding IOV_MAX
RH-Bugzilla: 2141964
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Commit: [4/5] a80be9c26ebd5503745989cd6823cb4814264258
Test that even vectored IO requests with 1024 vector elements that are
not aligned to the device's request alignment will succeed.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230411173418.19549-5-hreitz@redhat.com>
(cherry picked from commit d7e1905e3f54ff9512db4c7a946a8603b62b108d)
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
tests/qemu-iotests/tests/iov-padding | 85 ++++++++++++++++++++++++
tests/qemu-iotests/tests/iov-padding.out | 59 ++++++++++++++++
2 files changed, 144 insertions(+)
create mode 100755 tests/qemu-iotests/tests/iov-padding
create mode 100644 tests/qemu-iotests/tests/iov-padding.out
diff --git a/tests/qemu-iotests/tests/iov-padding b/tests/qemu-iotests/tests/iov-padding
new file mode 100755
index 0000000000..b9604900c7
--- /dev/null
+++ b/tests/qemu-iotests/tests/iov-padding
@@ -0,0 +1,85 @@
+#!/usr/bin/env bash
+# group: rw quick
+#
+# Check the interaction of request padding (to fit alignment restrictions) with
+# vectored I/O from the guest
+#
+# Copyright Red Hat
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+seq=$(basename $0)
+echo "QA output created by $seq"
+
+status=1 # failure is the default!
+
+_cleanup()
+{
+ _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+cd ..
+. ./common.rc
+. ./common.filter
+
+_supported_fmt raw
+_supported_proto file
+
+_make_test_img 1M
+
+IMGSPEC="driver=blkdebug,align=4096,image.driver=file,image.filename=$TEST_IMG"
+
+# Four combinations:
+# - Offset 4096, length 1023 * 512 + 512: Fully aligned to 4k
+# - Offset 4096, length 1023 * 512 + 4096: Head is aligned, tail is not
+# - Offset 512, length 1023 * 512 + 512: Neither head nor tail are aligned
+# - Offset 512, length 1023 * 512 + 4096: Tail is aligned, head is not
+for start_offset in 4096 512; do
+ for last_element_length in 512 4096; do
+ length=$((1023 * 512 + $last_element_length))
+
+ echo
+ echo "== performing 1024-element vectored requests to image (offset: $start_offset; length: $length) =="
+
+ # Fill with data for testing
+ $QEMU_IO -c 'write -P 1 0 1M' "$TEST_IMG" | _filter_qemu_io
+
+ # 1023 512-byte buffers, and then one with length $last_element_length
+ cmd_params="-P 2 $start_offset $(yes 512 | head -n 1023 | tr '\n' ' ') $last_element_length"
+ QEMU_IO_OPTIONS="$QEMU_IO_OPTIONS_NO_FMT" $QEMU_IO \
+ -c "writev $cmd_params" \
+ --image-opts \
+ "$IMGSPEC" \
+ | _filter_qemu_io
+
+ # Read all patterns -- read the part we just wrote with writev twice,
+ # once "normally", and once with a readv, so we see that that works, too
+ QEMU_IO_OPTIONS="$QEMU_IO_OPTIONS_NO_FMT" $QEMU_IO \
+ -c "read -P 1 0 $start_offset" \
+ -c "read -P 2 $start_offset $length" \
+ -c "readv $cmd_params" \
+ -c "read -P 1 $((start_offset + length)) $((1024 * 1024 - length - start_offset))" \
+ --image-opts \
+ "$IMGSPEC" \
+ | _filter_qemu_io
+ done
+done
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/tests/iov-padding.out b/tests/qemu-iotests/tests/iov-padding.out
new file mode 100644
index 0000000000..e07a91fac7
--- /dev/null
+++ b/tests/qemu-iotests/tests/iov-padding.out
@@ -0,0 +1,59 @@
+QA output created by iov-padding
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+
+== performing 1024-element vectored requests to image (offset: 4096; length: 524288) ==
+wrote 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 524288/524288 bytes at offset 4096
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 4096/4096 bytes at offset 0
+4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 524288/524288 bytes at offset 4096
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 524288/524288 bytes at offset 4096
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 520192/520192 bytes at offset 528384
+508 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== performing 1024-element vectored requests to image (offset: 4096; length: 527872) ==
+wrote 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 527872/527872 bytes at offset 4096
+515.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 4096/4096 bytes at offset 0
+4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 527872/527872 bytes at offset 4096
+515.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 527872/527872 bytes at offset 4096
+515.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 516608/516608 bytes at offset 531968
+504.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== performing 1024-element vectored requests to image (offset: 512; length: 524288) ==
+wrote 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 524288/524288 bytes at offset 512
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 0
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 524288/524288 bytes at offset 512
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 524288/524288 bytes at offset 512
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 523776/523776 bytes at offset 524800
+511.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== performing 1024-element vectored requests to image (offset: 512; length: 527872) ==
+wrote 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 527872/527872 bytes at offset 512
+515.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 0
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 527872/527872 bytes at offset 512
+515.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 527872/527872 bytes at offset 512
+515.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 520192/520192 bytes at offset 528384
+508 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+*** done
--
2.39.3

View File

@ -0,0 +1,71 @@
From 8f19df61a101c1e57a1bce8adddb57a4a7123a77 Mon Sep 17 00:00:00 2001
From: Thomas Huth <thuth@redhat.com>
Date: Tue, 16 May 2023 11:05:56 +0200
Subject: [PATCH 11/15] lsi53c895a: disable reentrancy detection for MMIO
region, too
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [11/12] 8016c86f8432f5ea06c831d1181e87e6d45a6a50 (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
commit d139fe9ad8a27bcc50b4ead77d2f97d191a0e95e
Author: Thomas Huth <thuth@redhat.com>
Date: Tue May 16 11:05:56 2023 +0200
lsi53c895a: disable reentrancy detection for MMIO region, too
While trying to use a SCSI disk on the LSI controller with an
older version of Fedora (25), I'm getting:
qemu: warning: Blocked re-entrant IO on MemoryRegion: lsi-mmio at addr: 0x34
and the SCSI controller is not usable. Seems like we have to
disable the reentrancy checker for the MMIO region, too, to
get this working again.
The problem could be reproduced it like this:
./qemu-system-x86_64 -accel kvm -m 2G -machine q35 \
-device lsi53c810,id=lsi1 -device scsi-hd,drive=d0 \
-drive if=none,id=d0,file=.../somedisk.qcow2 \
-cdrom Fedora-Everything-netinst-i386-25-1.3.iso
Where somedisk.qcow2 is an image that contains already some partitions
and file systems.
In the boot menu of Fedora, go to
"Troubleshooting" -> "Rescue a Fedora system" -> "3) Skip to shell"
Then check "dmesg | grep -i 53c" for failure messages, and try to mount
a partition from somedisk.qcow2.
Message-Id: <20230516090556.553813-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
hw/scsi/lsi53c895a.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 1e15e13fbf..2b9cb2ac5d 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -2306,6 +2306,7 @@ static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
* re-entrancy guard.
*/
s->ram_io.disable_reentrancy_guard = true;
+ s->mmio_io.disable_reentrancy_guard = true;
address_space_init(&s->pci_io_as, pci_address_space_io(dev), "lsi-pci-io");
qdev_init_gpio_out(d, &s->ext_irq, 1);
--
2.37.3

View File

@ -0,0 +1,59 @@
From 3cffdbf3224ac21016dbee69cb2382c322d4bfbb Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 05/15] lsi53c895a: disable reentrancy detection for script RAM
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [5/12] b5334c3a34b38ed1dccf0030d5704e51e00fdce3 (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
commit bfd6e7ae6a72b84e2eb9574f56e6ec037f05182c
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:10 2023 -0400
lsi53c895a: disable reentrancy detection for script RAM
As the code is designed to use the memory APIs to access the script ram,
disable reentrancy checks for the pseudo-RAM ram_io MemoryRegion.
In the future, ram_io may be converted from an IO to a proper RAM MemoryRegion.
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-6-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
hw/scsi/lsi53c895a.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 85e907a785..1e15e13fbf 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -2301,6 +2301,12 @@ static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
memory_region_init_io(&s->io_io, OBJECT(s), &lsi_io_ops, s,
"lsi-io", 256);
+ /*
+ * Since we use the address-space API to interact with ram_io, disable the
+ * re-entrancy guard.
+ */
+ s->ram_io.disable_reentrancy_guard = true;
+
address_space_init(&s->pci_io_as, pci_address_space_io(dev), "lsi-pci-io");
qdev_init_gpio_out(d, &s->ext_irq, 1);
--
2.37.3

View File

@ -0,0 +1,151 @@
From e0c811c2d13f995fe1b095f48637316be5978b0e Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 01/15] memory: prevent dma-reentracy issues
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/12] 8fced41b4b2105343e8f0250286b771bcb43c81f (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
CVE: CVE-2023-0330
commit a2e1753b8054344f32cf94f31c6399a58794a380
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:06 2023 -0400
memory: prevent dma-reentracy issues
Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA.
This flag is set/checked prior to calling a device's MemoryRegion
handlers, and set when device code initiates DMA. The purpose of this
flag is to prevent two types of DMA-based reentrancy issues:
1.) mmio -> dma -> mmio case
2.) bh -> dma write -> mmio case
These issues have led to problems such as stack-exhaustion and
use-after-frees.
Summary of the problem from Peter Maydell:
https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/62
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/540
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/541
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/556
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/557
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/827
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1282
Resolves: CVE-2023-0330
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-2-alxndr@bu.edu>
[thuth: Replace warn_report() with warn_report_once()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
include/exec/memory.h | 5 +++++
include/hw/qdev-core.h | 7 +++++++
softmmu/memory.c | 16 ++++++++++++++++
3 files changed, 28 insertions(+)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 20f1b27377..e089f90f9b 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -734,6 +734,8 @@ struct MemoryRegion {
bool is_iommu;
RAMBlock *ram_block;
Object *owner;
+ /* owner as TYPE_DEVICE. Used for re-entrancy checks in MR access hotpath */
+ DeviceState *dev;
const MemoryRegionOps *ops;
void *opaque;
@@ -757,6 +759,9 @@ struct MemoryRegion {
unsigned ioeventfd_nb;
MemoryRegionIoeventfd *ioeventfds;
RamDiscardManager *rdm; /* Only for RAM */
+
+ /* For devices designed to perform re-entrant IO into their own IO MRs */
+ bool disable_reentrancy_guard;
};
struct IOMMUMemoryRegion {
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 20d3066595..14226f860d 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -162,6 +162,10 @@ struct NamedClockList {
QLIST_ENTRY(NamedClockList) node;
};
+typedef struct {
+ bool engaged_in_io;
+} MemReentrancyGuard;
+
/**
* DeviceState:
* @realized: Indicates whether the device has been fully constructed.
@@ -193,6 +197,9 @@ struct DeviceState {
int instance_id_alias;
int alias_required_for_version;
ResettableState reset;
+
+ /* Is the device currently in mmio/pio/dma? Used to prevent re-entrancy */
+ MemReentrancyGuard mem_reentrancy_guard;
};
struct DeviceListener {
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 7340e19ff5..102f0a4248 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -541,6 +541,18 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
access_size_max = 4;
}
+ /* Do not allow more than one simultaneous access to a device's IO Regions */
+ if (mr->dev && !mr->disable_reentrancy_guard &&
+ !mr->ram_device && !mr->ram && !mr->rom_device && !mr->readonly) {
+ if (mr->dev->mem_reentrancy_guard.engaged_in_io) {
+ warn_report_once("Blocked re-entrant IO on MemoryRegion: "
+ "%s at addr: 0x%" HWADDR_PRIX,
+ memory_region_name(mr), addr);
+ return MEMTX_ACCESS_ERROR;
+ }
+ mr->dev->mem_reentrancy_guard.engaged_in_io = true;
+ }
+
/* FIXME: support unaligned access? */
access_size = MAX(MIN(size, access_size_max), access_size_min);
access_mask = MAKE_64BIT_MASK(0, access_size * 8);
@@ -555,6 +567,9 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
access_mask, attrs);
}
}
+ if (mr->dev) {
+ mr->dev->mem_reentrancy_guard.engaged_in_io = false;
+ }
return r;
}
@@ -1169,6 +1184,7 @@ static void memory_region_do_init(MemoryRegion *mr,
}
mr->name = g_strdup(name);
mr->owner = owner;
+ mr->dev = (DeviceState *) object_dynamic_cast(mr->owner, TYPE_DEVICE);
mr->ram_block = NULL;
if (name) {
--
2.37.3

View File

@ -0,0 +1,68 @@
From c24e38eb508b3fb42ce3ea62fe8de0be6a95a6a8 Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Wed, 7 Jun 2023 11:45:09 -0400
Subject: [PATCH 10/15] memory: stricter checks prior to unsetting
engaged_in_io
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [10/12] 773b62a84b2bd4f5ee7fb8e1cfb3bb91c3a01de1 (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
commit 3884bf6468ac6bbb58c2b3feaa74e87f821b52f3
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Tue May 16 04:40:02 2023 -0400
memory: stricter checks prior to unsetting engaged_in_io
engaged_in_io could be unset by an MR with re-entrancy checks disabled.
Ensure that only MRs that can set the engaged_in_io flag can unset it.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20230516084002.3813836-1-alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
softmmu/memory.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 102f0a4248..6b98615357 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -533,6 +533,7 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
unsigned access_size;
unsigned i;
MemTxResult r = MEMTX_OK;
+ bool reentrancy_guard_applied = false;
if (!access_size_min) {
access_size_min = 1;
@@ -551,6 +552,7 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
return MEMTX_ACCESS_ERROR;
}
mr->dev->mem_reentrancy_guard.engaged_in_io = true;
+ reentrancy_guard_applied = true;
}
/* FIXME: support unaligned access? */
@@ -567,7 +569,7 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
access_mask, attrs);
}
}
- if (mr->dev) {
+ if (mr->dev && reentrancy_guard_applied) {
mr->dev->mem_reentrancy_guard.engaged_in_io = false;
}
return r;
--
2.37.3

View File

@ -0,0 +1,111 @@
From a1f2a51d1a789c46e806adb332236ca16d538bf9 Mon Sep 17 00:00:00 2001
From: Eric Blake <eblake@redhat.com>
Date: Tue, 2 May 2023 15:52:12 -0500
Subject: [PATCH 3/5] migration: Attempt disk reactivation in more failure
scenarios
RH-Author: Eric Blake <eblake@redhat.com>
RH-MergeRequest: 273: migration: prevent source core dump if NFS dies mid-migration
RH-Bugzilla: 2177957
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: quintela1 <quintela@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [3/3] e84bf1e7233c0273ca3136ecaa6b2cfc9c0efacb (ebblake/qemu-kvm)
Commit fe904ea824 added a fail_inactivate label, which tries to
reactivate disks on the source after a failure while s->state ==
MIGRATION_STATUS_ACTIVE, but didn't actually use the label if
qemu_savevm_state_complete_precopy() failed. This failure to
reactivate is also present in commit 6039dd5b1c (also covering the new
s->state == MIGRATION_STATUS_DEVICE state) and 403d18ae (ensuring
s->block_inactive is set more reliably).
Consolidate the two labels back into one - no matter HOW migration is
failed, if there is any chance we can reach vm_start() after having
attempted inactivation, it is essential that we have tried to restart
disks before then. This also makes the cleanup more like
migrate_fd_cancel().
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20230502205212.134680-1-eblake@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6dab4c93ecfae48e2e67b984d1032c1e988d3005)
[eblake: downstream migrate_colo() => migrate_colo_enabled()]
Signed-off-by: Eric Blake <eblake@redhat.com>
---
migration/migration.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 6ba8eb0fdf..817170d52d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3255,6 +3255,11 @@ static void migration_completion(MigrationState *s)
MIGRATION_STATUS_DEVICE);
}
if (ret >= 0) {
+ /*
+ * Inactivate disks except in COLO, and track that we
+ * have done so in order to remember to reactivate
+ * them if migration fails or is cancelled.
+ */
s->block_inactive = !migrate_colo_enabled();
qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
@@ -3290,13 +3295,13 @@ static void migration_completion(MigrationState *s)
rp_error = await_return_path_close_on_source(s);
trace_migration_return_path_end_after(rp_error);
if (rp_error) {
- goto fail_invalidate;
+ goto fail;
}
}
if (qemu_file_get_error(s->to_dst_file)) {
trace_migration_completion_file_err();
- goto fail_invalidate;
+ goto fail;
}
if (!migrate_colo_enabled()) {
@@ -3306,26 +3311,25 @@ static void migration_completion(MigrationState *s)
return;
-fail_invalidate:
- /* If not doing postcopy, vm_start() will be called: let's regain
- * control on images.
- */
- if (s->state == MIGRATION_STATUS_ACTIVE ||
- s->state == MIGRATION_STATUS_DEVICE) {
+fail:
+ if (s->block_inactive && (s->state == MIGRATION_STATUS_ACTIVE ||
+ s->state == MIGRATION_STATUS_DEVICE)) {
+ /*
+ * If not doing postcopy, vm_start() will be called: let's
+ * regain control on images.
+ */
Error *local_err = NULL;
qemu_mutex_lock_iothread();
bdrv_invalidate_cache_all(&local_err);
if (local_err) {
error_report_err(local_err);
- s->block_inactive = true;
} else {
s->block_inactive = false;
}
qemu_mutex_unlock_iothread();
}
-fail:
migrate_set_state(&s->state, current_active_state,
MIGRATION_STATUS_FAILED);
}
--
2.39.1

View File

@ -0,0 +1,59 @@
From dd6d0eace90285c017ae40cba0ffa95ccd963ebd Mon Sep 17 00:00:00 2001
From: Leonardo Bras <leobras@redhat.com>
Date: Tue, 20 Jun 2023 14:51:03 -0300
Subject: [PATCH 15/15] migration: Disable postcopy + multifd migration
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Leonardo Brás <leobras@redhat.com>
RH-MergeRequest: 287: migration: Disable postcopy + multifd migration
RH-Bugzilla: 2169733
RH-Acked-by: Peter Xu <peterx@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/1] 07d26fbac35b7586fe790304f03d316ed26a4ef2
Since the introduction of multifd, it's possible to perform a multifd
migration and finish it using postcopy.
A bug introduced by yank (fixed on cfc3bcf373) was previously preventing
a successful use of this migration scenario, and now thing should be
working on most scenarios.
But since there is not enough testing/support nor any reported users for
this scenario, we should disable this combination before it may cause any
problems for users.
Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit b405dfff1ea3cf0530b628895b5a7a50dc8c6996)
[leobras: moves logic from options.c -> migration.c and use cap_list
instead of new_caps for backward compatibility]
Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
migration/migration.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/migration/migration.c b/migration/migration.c
index 817170d52d..1ad82e63f0 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1246,6 +1246,11 @@ static bool migrate_caps_check(bool *cap_list,
error_setg(errp, "Postcopy is not compatible with ignore-shared");
return false;
}
+
+ if (cap_list[MIGRATION_CAPABILITY_MULTIFD]) {
+ error_setg(errp, "Postcopy is not yet compatible with multifd");
+ return false;
+ }
}
if (cap_list[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) {
--
2.37.3

View File

@ -0,0 +1,117 @@
From 1b07c7663b6a5c19c9303088d63c39dba7e3bb36 Mon Sep 17 00:00:00 2001
From: Eric Blake <eblake@redhat.com>
Date: Fri, 14 Apr 2023 10:33:58 -0500
Subject: [PATCH 1/5] migration: Handle block device inactivation failures
better
RH-Author: Eric Blake <eblake@redhat.com>
RH-MergeRequest: 273: migration: prevent source core dump if NFS dies mid-migration
RH-Bugzilla: 2177957
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: quintela1 <quintela@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [1/3] 5892c17ca0a21d824d176e7398d12f7cf991651d (ebblake/qemu-kvm)
Consider what happens when performing a migration between two host
machines connected to an NFS server serving multiple block devices to
the guest, when the NFS server becomes unavailable. The migration
attempts to inactivate all block devices on the source (a necessary
step before the destination can take over); but if the NFS server is
non-responsive, the attempt to inactivate can itself fail. When that
happens, the destination fails to get the migrated guest (good,
because the source wasn't able to flush everything properly):
(qemu) qemu-kvm: load of migration failed: Input/output error
at which point, our only hope for the guest is for the source to take
back control. With the current code base, the host outputs a message, but then appears to resume:
(qemu) qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1)
(src qemu)info status
VM status: running
but a second migration attempt now asserts:
(src qemu) qemu-kvm: ../block.c:6738: int bdrv_inactivate_recurse(BlockDriverState *): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
Whether the guest is recoverable on the source after the first failure
is debatable, but what we do not want is to have qemu itself fail due
to an assertion. It looks like the problem is as follows:
In migration.c:migration_completion(), the source sets 'inactivate' to
true (since COLO is not enabled), then tries
savevm.c:qemu_savevm_state_complete_precopy() with a request to
inactivate block devices. In turn, this calls
block.c:bdrv_inactivate_all(), which fails when flushing runs up
against the non-responsive NFS server. With savevm failing, we are
now left in a state where some, but not all, of the block devices have
been inactivated; but migration_completion() then jumps to 'fail'
rather than 'fail_invalidate' and skips an attempt to reclaim those
those disks by calling bdrv_activate_all(). Even if we do attempt to
reclaim disks, we aren't taking note of failure there, either.
Thus, we have reached a state where the migration engine has forgotten
all state about whether a block device is inactive, because we did not
set s->block_inactive in enough places; so migration allows the source
to reach vm_start() and resume execution, violating the block layer
invariant that the guest CPUs should not be restarted while a device
is inactive. Note that the code in migration.c:migrate_fd_cancel()
will also try to reactivate all block devices if s->block_inactive was
set, but because we failed to set that flag after the first failure,
the source assumes it has reclaimed all devices, even though it still
has remaining inactivated devices and does not try again. Normally,
qmp_cont() will also try to reactivate all disks (or correctly fail if
the disks are not reclaimable because NFS is not yet back up), but the
auto-resumption of the source after a migration failure does not go
through qmp_cont(). And because we have left the block layer in an
inconsistent state with devices still inactivated, the later migration
attempt is hitting the assertion failure.
Since it is important to not resume the source with inactive disks,
this patch marks s->block_inactive before attempting inactivation,
rather than after succeeding, in order to prevent any vm_start() until
it has successfully reactivated all devices.
See also https://bugzilla.redhat.com/show_bug.cgi?id=2058982
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 403d18ae384239876764bbfa111d6cc5dcb673d1)
---
migration/migration.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 0885549de0..08e5e8f013 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3256,13 +3256,11 @@ static void migration_completion(MigrationState *s)
MIGRATION_STATUS_DEVICE);
}
if (ret >= 0) {
+ s->block_inactive = inactivate;
qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
inactivate);
}
- if (inactivate && ret >= 0) {
- s->block_inactive = true;
- }
}
qemu_mutex_unlock_iothread();
@@ -3321,6 +3319,7 @@ fail_invalidate:
bdrv_invalidate_cache_all(&local_err);
if (local_err) {
error_report_err(local_err);
+ s->block_inactive = true;
} else {
s->block_inactive = false;
}
--
2.39.1

View File

@ -0,0 +1,53 @@
From e79d0506184e861350d2a3e62dd986aa03d30aa8 Mon Sep 17 00:00:00 2001
From: Eric Blake <eblake@redhat.com>
Date: Thu, 20 Apr 2023 09:35:51 -0500
Subject: [PATCH 2/5] migration: Minor control flow simplification
RH-Author: Eric Blake <eblake@redhat.com>
RH-MergeRequest: 273: migration: prevent source core dump if NFS dies mid-migration
RH-Bugzilla: 2177957
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: quintela1 <quintela@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [2/3] f00b21b6ebd377af79af93ac18f103f8dc0309d6 (ebblake/qemu-kvm)
No need to declare a temporary variable.
Suggested-by: Juan Quintela <quintela@redhat.com>
Fixes: 1df36e8c6289 ("migration: Handle block device inactivation failures better")
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5d39f44d7ac5c63f53d4d0900ceba9521bc27e49)
---
migration/migration.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 08e5e8f013..6ba8eb0fdf 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3248,7 +3248,6 @@ static void migration_completion(MigrationState *s)
ret = global_state_store();
if (!ret) {
- bool inactivate = !migrate_colo_enabled();
ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
trace_migration_completion_vm_stop(ret);
if (ret >= 0) {
@@ -3256,10 +3255,10 @@ static void migration_completion(MigrationState *s)
MIGRATION_STATUS_DEVICE);
}
if (ret >= 0) {
- s->block_inactive = inactivate;
+ s->block_inactive = !migrate_colo_enabled();
qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
- inactivate);
+ s->block_inactive);
}
}
qemu_mutex_unlock_iothread();
--
2.39.1

View File

@ -0,0 +1,55 @@
From 17c5524ada3f2ca9a9c645f540bedc5575302059 Mon Sep 17 00:00:00 2001
From: Eric Blake <eblake@redhat.com>
Date: Mon, 3 Apr 2023 19:40:47 -0500
Subject: [PATCH 5/5] nbd/server: Request TCP_NODELAY
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eric Blake <eblake@redhat.com>
RH-MergeRequest: 274: nbd: improve TLS performance of NBD server
RH-Bugzilla: 2035712
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Commit: [2/2] 092145077756cda2a4f849c5911031b0fc4a2134 (ebblake/qemu-kvm)
Nagle's algorithm adds latency in order to reduce network packet
overhead on small packets. But when we are already using corking to
merge smaller packets into transactional requests, the extra delay
from TCP defaults just gets in the way (see recent commit bd2cd4a4).
For reference, qemu as an NBD client already requests TCP_NODELAY (see
nbd_connect() in nbd/client-connection.c); as does libnbd as a client
[1], and nbdkit as a server [2]. Furthermore, the NBD spec recommends
the use of TCP_NODELAY [3].
[1] https://gitlab.com/nbdkit/libnbd/-/blob/a48a1142/generator/states-connect.c#L39
[2] https://gitlab.com/nbdkit/nbdkit/-/blob/45b72f5b/server/sockets.c#L430
[3] https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md#protocol-phases
CC: Florian Westphal <fw@strlen.de>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20230404004047.142086-1-eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit f1426881a827a6d3f31b65616c4a8db1e9e7c45e)
Signed-off-by: Eric Blake <eblake@redhat.com>
---
nbd/server.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/nbd/server.c b/nbd/server.c
index a5edc7f681..6db124cf53 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2738,6 +2738,7 @@ void nbd_client_new(QIOChannelSocket *sioc,
}
client->tlsauthz = g_strdup(tlsauthz);
client->sioc = sioc;
+ qio_channel_set_delay(QIO_CHANNEL(sioc), false);
object_ref(OBJECT(client->sioc));
client->ioc = QIO_CHANNEL(sioc);
object_ref(OBJECT(client->ioc));
--
2.39.1

View File

@ -0,0 +1,72 @@
From 170872370c6f3c916e741eb32d80431995d7a870 Mon Sep 17 00:00:00 2001
From: Florian Westphal <fw@strlen.de>
Date: Fri, 24 Mar 2023 11:47:20 +0100
Subject: [PATCH 4/5] nbd/server: push pending frames after sending reply
RH-Author: Eric Blake <eblake@redhat.com>
RH-MergeRequest: 274: nbd: improve TLS performance of NBD server
RH-Bugzilla: 2035712
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Commit: [1/2] ab92c06c48810aa40380de0433dcac4c6e4be9a5 (ebblake/qemu-kvm)
qemu-nbd doesn't set TCP_NODELAY on the tcp socket.
Kernel waits for more data and avoids transmission of small packets.
Without TLS this is barely noticeable, but with TLS this really shows.
Booting a VM via qemu-nbd on localhost (with tls) takes more than
2 minutes on my system. tcpdump shows frequent wait periods, where no
packets get sent for a 40ms period.
Add explicit (un)corking when processing (and responding to) requests.
"TCP_CORK, &zero" after earlier "CORK, &one" will flush pending data.
VM Boot time:
main: no tls: 23s, with tls: 2m45s
patched: no tls: 14s, with tls: 15s
VM Boot time, qemu-nbd via network (same lan):
main: no tls: 18s, with tls: 1m50s
patched: no tls: 17s, with tls: 18s
Future optimization: if we could detect if there is another pending
request we could defer the uncork operation because more data would be
appended.
Signed-off-by: Florian Westphal <fw@strlen.de>
Message-Id: <20230324104720.2498-1-fw@strlen.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit bd2cd4a441ded163b62371790876f28a9b834317)
Signed-off-by: Eric Blake <eblake@redhat.com>
---
nbd/server.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/nbd/server.c b/nbd/server.c
index 4630dd7322..a5edc7f681 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2647,6 +2647,8 @@ static coroutine_fn void nbd_trip(void *opaque)
goto disconnect;
}
+ qio_channel_set_cork(client->ioc, true);
+
if (ret < 0) {
/* It wans't -EIO, so, according to nbd_co_receive_request()
* semantics, we should return the error to the client. */
@@ -2672,6 +2674,7 @@ static coroutine_fn void nbd_trip(void *opaque)
goto disconnect;
}
+ qio_channel_set_cork(client->ioc, false);
done:
nbd_request_put(req);
nbd_client_put(client);
--
2.39.1

View File

@ -0,0 +1,376 @@
From e11cffc152d9af9194139a37f86e357cb36298e8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Thu, 25 May 2023 12:50:19 +0200
Subject: [PATCH 22/22] pc-bios: Add support for List-Directed IPL from ECKD
DASD
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [21/21] cab945af05566d892459a7c8ea3f114310d6bb67
Bugzilla: https://bugzilla.redhat.com/2209605
commit 8af5d141713f5d20c4bc1719eb746ef8b1746bd6
Author: Jared Rossi <jrossi@linux.ibm.com>
Date: Tue Feb 21 12:45:48 2023 -0500
pc-bios: Add support for List-Directed IPL from ECKD DASD
Check for a List Directed IPL Boot Record, which would supersede the CCW type
entries. If the record is valid, proceed to use the new style pointers
and perform LD-IPL. Each block pointer is interpreted as either an LD-IPL
pointer or a legacy CCW pointer depending on the type of IPL initiated.
In either case CCW- or LD-IPL is transparent to the user and will boot the same
image regardless of which set of pointers is used. Because the interactive boot
menu is only written with the old style pointers, the menu will be disabled for
List Directed IPL from ECKD DASD.
If the LD-IPL fails, retry the IPL using the CCW type pointers.
If no LD-IPL boot record is found, simply perform CCW type IPL as usual.
Signed-off-by: Jared Rossi <jrossi@linux.ibm.com>
Message-Id: <20230221174548.1866861-2-jrossi@linux.ibm.com>
[thuth: Drop some superfluous parantheses]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
pc-bios/s390-ccw/bootmap.c | 157 ++++++++++++++++++++++++++++---------
pc-bios/s390-ccw/bootmap.h | 30 ++++++-
2 files changed, 148 insertions(+), 39 deletions(-)
diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index 994e59c0b0..a2137449dc 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -72,42 +72,74 @@ static inline void verify_boot_info(BootInfo *bip)
"Bad block size in zIPL section of the 1st record.");
}
-static block_number_t eckd_block_num(EckdCHS *chs)
+static void eckd_format_chs(ExtEckdBlockPtr *ptr, bool ldipl,
+ uint64_t *c,
+ uint64_t *h,
+ uint64_t *s)
+{
+ if (ldipl) {
+ *c = ptr->ldptr.chs.cylinder;
+ *h = ptr->ldptr.chs.head;
+ *s = ptr->ldptr.chs.sector;
+ } else {
+ *c = ptr->bptr.chs.cylinder;
+ *h = ptr->bptr.chs.head;
+ *s = ptr->bptr.chs.sector;
+ }
+}
+
+static block_number_t eckd_chs_to_block(uint64_t c, uint64_t h, uint64_t s)
{
const uint64_t sectors = virtio_get_sectors();
const uint64_t heads = virtio_get_heads();
- const uint64_t cylinder = chs->cylinder
- + ((chs->head & 0xfff0) << 12);
- const uint64_t head = chs->head & 0x000f;
+ const uint64_t cylinder = c + ((h & 0xfff0) << 12);
+ const uint64_t head = h & 0x000f;
const block_number_t block = sectors * heads * cylinder
+ sectors * head
- + chs->sector
- - 1; /* block nr starts with zero */
+ + s - 1; /* block nr starts with zero */
return block;
}
-static bool eckd_valid_address(BootMapPointer *p)
+static block_number_t eckd_block_num(EckdCHS *chs)
{
- const uint64_t head = p->eckd.chs.head & 0x000f;
+ return eckd_chs_to_block(chs->cylinder, chs->head, chs->sector);
+}
+
+static block_number_t gen_eckd_block_num(ExtEckdBlockPtr *ptr, bool ldipl)
+{
+ uint64_t cyl, head, sec;
+ eckd_format_chs(ptr, ldipl, &cyl, &head, &sec);
+ return eckd_chs_to_block(cyl, head, sec);
+}
+static bool eckd_valid_chs(uint64_t cyl, uint64_t head, uint64_t sector)
+{
if (head >= virtio_get_heads()
- || p->eckd.chs.sector > virtio_get_sectors()
- || p->eckd.chs.sector <= 0) {
+ || sector > virtio_get_sectors()
+ || sector <= 0) {
return false;
}
if (!virtio_guessed_disk_nature() &&
- eckd_block_num(&p->eckd.chs) >= virtio_get_blocks()) {
+ eckd_chs_to_block(cyl, head, sector) >= virtio_get_blocks()) {
return false;
}
return true;
}
-static block_number_t load_eckd_segments(block_number_t blk, uint64_t *address)
+static bool eckd_valid_address(ExtEckdBlockPtr *ptr, bool ldipl)
+{
+ uint64_t cyl, head, sec;
+ eckd_format_chs(ptr, ldipl, &cyl, &head, &sec);
+ return eckd_valid_chs(cyl, head, sec);
+}
+
+static block_number_t load_eckd_segments(block_number_t blk, bool ldipl,
+ uint64_t *address)
{
block_number_t block_nr;
- int j, rc;
+ int j, rc, count;
BootMapPointer *bprs = (void *)_bprs;
bool more_data;
@@ -117,7 +149,7 @@ static block_number_t load_eckd_segments(block_number_t blk, uint64_t *address)
do {
more_data = false;
for (j = 0;; j++) {
- block_nr = eckd_block_num(&bprs[j].xeckd.bptr.chs);
+ block_nr = gen_eckd_block_num(&bprs[j].xeckd, ldipl);
if (is_null_block_number(block_nr)) { /* end of chunk */
break;
}
@@ -129,11 +161,26 @@ static block_number_t load_eckd_segments(block_number_t blk, uint64_t *address)
break;
}
- IPL_assert(block_size_ok(bprs[j].xeckd.bptr.size),
+ /* List directed pointer does not store block size */
+ IPL_assert(ldipl || block_size_ok(bprs[j].xeckd.bptr.size),
"bad chunk block size");
- IPL_assert(eckd_valid_address(&bprs[j]), "bad chunk ECKD addr");
- if ((bprs[j].xeckd.bptr.count == 0) && unused_space(&(bprs[j+1]),
+ if (!eckd_valid_address(&bprs[j].xeckd, ldipl)) {
+ /*
+ * If an invalid address is found during LD-IPL then break and
+ * retry as CCW
+ */
+ IPL_assert(ldipl, "bad chunk ECKD addr");
+ break;
+ }
+
+ if (ldipl) {
+ count = bprs[j].xeckd.ldptr.count;
+ } else {
+ count = bprs[j].xeckd.bptr.count;
+ }
+
+ if (count == 0 && unused_space(&bprs[j + 1],
sizeof(EckdBlockPtr))) {
/* This is a "continue" pointer.
* This ptr should be the last one in the current
@@ -149,11 +196,10 @@ static block_number_t load_eckd_segments(block_number_t blk, uint64_t *address)
/* Load (count+1) blocks of code at (block_nr)
* to memory (address).
*/
- rc = virtio_read_many(block_nr, (void *)(*address),
- bprs[j].xeckd.bptr.count+1);
+ rc = virtio_read_many(block_nr, (void *)(*address), count + 1);
IPL_assert(rc == 0, "code chunk read failed");
- *address += (bprs[j].xeckd.bptr.count+1) * virtio_get_block_size();
+ *address += (count + 1) * virtio_get_block_size();
}
} while (more_data);
return block_nr;
@@ -237,8 +283,10 @@ static void run_eckd_boot_script(block_number_t bmt_block_nr,
uint64_t address;
BootMapTable *bmt = (void *)sec;
BootMapScript *bms = (void *)sec;
+ /* The S1B block number is NULL_BLOCK_NR if and only if it's an LD-IPL */
+ bool ldipl = (s1b_block_nr == NULL_BLOCK_NR);
- if (menu_is_enabled_zipl()) {
+ if (menu_is_enabled_zipl() && !ldipl) {
loadparm = eckd_get_boot_menu_index(s1b_block_nr);
}
@@ -249,7 +297,7 @@ static void run_eckd_boot_script(block_number_t bmt_block_nr,
memset(sec, FREE_SPACE_FILLER, sizeof(sec));
read_block(bmt_block_nr, sec, "Cannot read Boot Map Table");
- block_nr = eckd_block_num(&bmt->entry[loadparm].xeckd.bptr.chs);
+ block_nr = gen_eckd_block_num(&bmt->entry[loadparm].xeckd, ldipl);
IPL_assert(block_nr != -1, "Cannot find Boot Map Table Entry");
memset(sec, FREE_SPACE_FILLER, sizeof(sec));
@@ -264,13 +312,18 @@ static void run_eckd_boot_script(block_number_t bmt_block_nr,
}
address = bms->entry[i].address.load_address;
- block_nr = eckd_block_num(&bms->entry[i].blkptr.xeckd.bptr.chs);
+ block_nr = gen_eckd_block_num(&bms->entry[i].blkptr.xeckd, ldipl);
do {
- block_nr = load_eckd_segments(block_nr, &address);
+ block_nr = load_eckd_segments(block_nr, ldipl, &address);
} while (block_nr != -1);
}
+ if (ldipl && bms->entry[i].type != BOOT_SCRIPT_EXEC) {
+ /* Abort LD-IPL and retry as CCW-IPL */
+ return;
+ }
+
IPL_assert(bms->entry[i].type == BOOT_SCRIPT_EXEC,
"Unknown script entry type");
write_reset_psw(bms->entry[i].address.load_address); /* no return */
@@ -380,6 +433,23 @@ static void ipl_eckd_ldl(ECKD_IPL_mode_t mode)
/* no return */
}
+static block_number_t eckd_find_bmt(ExtEckdBlockPtr *ptr)
+{
+ block_number_t blockno;
+ uint8_t tmp_sec[MAX_SECTOR_SIZE];
+ BootRecord *br;
+
+ blockno = gen_eckd_block_num(ptr, 0);
+ read_block(blockno, tmp_sec, "Cannot read boot record");
+ br = (BootRecord *)tmp_sec;
+ if (!magic_match(br->magic, ZIPL_MAGIC)) {
+ /* If the boot record is invalid, return and try CCW-IPL instead */
+ return NULL_BLOCK_NR;
+ }
+
+ return gen_eckd_block_num(&br->pgt.xeckd, 1);
+}
+
static void print_eckd_msg(void)
{
char msg[] = "Using ECKD scheme (block size *****), ";
@@ -401,28 +471,43 @@ static void print_eckd_msg(void)
static void ipl_eckd(void)
{
- XEckdMbr *mbr = (void *)sec;
- LDL_VTOC *vlbl = (void *)sec;
+ IplVolumeLabel *vlbl = (void *)sec;
+ LDL_VTOC *vtoc = (void *)sec;
+ block_number_t ldipl_bmt; /* Boot Map Table for List-Directed IPL */
print_eckd_msg();
- /* Grab the MBR again */
+ /* Block 2 can contain either the CDL VOL1 label or the LDL VTOC */
memset(sec, FREE_SPACE_FILLER, sizeof(sec));
- read_block(0, mbr, "Cannot read block 0 on DASD");
+ read_block(2, vlbl, "Cannot read block 2");
- if (magic_match(mbr->magic, IPL1_MAGIC)) {
- ipl_eckd_cdl(); /* only returns in case of error */
- return;
+ /*
+ * First check for a list-directed-format pointer which would
+ * supersede the CCW pointer.
+ */
+ if (eckd_valid_address((ExtEckdBlockPtr *)&vlbl->f.br, 0)) {
+ ldipl_bmt = eckd_find_bmt((ExtEckdBlockPtr *)&vlbl->f.br);
+ if (ldipl_bmt) {
+ sclp_print("List-Directed\n");
+ /* LD-IPL does not use the S1B bock, just make it NULL */
+ run_eckd_boot_script(ldipl_bmt, NULL_BLOCK_NR);
+ /* Only return in error, retry as CCW-IPL */
+ sclp_print("Retrying IPL ");
+ print_eckd_msg();
+ }
+ memset(sec, FREE_SPACE_FILLER, sizeof(sec));
+ read_block(2, vtoc, "Cannot read block 2");
}
- /* LDL/CMS? */
- memset(sec, FREE_SPACE_FILLER, sizeof(sec));
- read_block(2, vlbl, "Cannot read block 2");
+ /* Not list-directed */
+ if (magic_match(vtoc->magic, VOL1_MAGIC)) {
+ ipl_eckd_cdl(); /* may return in error */
+ }
- if (magic_match(vlbl->magic, CMS1_MAGIC)) {
+ if (magic_match(vtoc->magic, CMS1_MAGIC)) {
ipl_eckd_ldl(ECKD_CMS); /* no return */
}
- if (magic_match(vlbl->magic, LNX1_MAGIC)) {
+ if (magic_match(vtoc->magic, LNX1_MAGIC)) {
ipl_eckd_ldl(ECKD_LDL); /* no return */
}
diff --git a/pc-bios/s390-ccw/bootmap.h b/pc-bios/s390-ccw/bootmap.h
index 3946aa3f8d..d4690a88c2 100644
--- a/pc-bios/s390-ccw/bootmap.h
+++ b/pc-bios/s390-ccw/bootmap.h
@@ -45,9 +45,23 @@ typedef struct EckdBlockPtr {
* it's 0 for TablePtr, ScriptPtr, and SectionPtr */
} __attribute__ ((packed)) EckdBlockPtr;
-typedef struct ExtEckdBlockPtr {
+typedef struct LdEckdCHS {
+ uint32_t cylinder;
+ uint8_t head;
+ uint8_t sector;
+} __attribute__ ((packed)) LdEckdCHS;
+
+typedef struct LdEckdBlockPtr {
+ LdEckdCHS chs; /* cylinder/head/sector is an address of the block */
+ uint8_t reserved[4];
+ uint16_t count;
+ uint32_t pad;
+} __attribute__ ((packed)) LdEckdBlockPtr;
+
+/* bptr is used for CCW type IPL, while ldptr is for list-directed IPL */
+typedef union ExtEckdBlockPtr {
EckdBlockPtr bptr;
- uint8_t reserved[8];
+ LdEckdBlockPtr ldptr;
} __attribute__ ((packed)) ExtEckdBlockPtr;
typedef union BootMapPointer {
@@ -57,6 +71,15 @@ typedef union BootMapPointer {
ExtEckdBlockPtr xeckd;
} __attribute__ ((packed)) BootMapPointer;
+typedef struct BootRecord {
+ uint8_t magic[4];
+ uint32_t version;
+ uint64_t res1;
+ BootMapPointer pgt;
+ uint8_t reserved[510 - 32];
+ uint16_t os_id;
+} __attribute__ ((packed)) BootRecord;
+
/* aka Program Table */
typedef struct BootMapTable {
uint8_t magic[4];
@@ -292,7 +315,8 @@ typedef struct IplVolumeLabel {
struct {
unsigned char key[4]; /* == "VOL1" */
unsigned char volser[6];
- unsigned char reserved[6];
+ unsigned char reserved[64];
+ EckdCHS br; /* Location of Boot Record for list-directed IPL */
} f;
};
} __attribute__((packed)) IplVolumeLabel;
--
2.37.3

View File

@ -0,0 +1,55 @@
From 01c09f31978154f0d2fd699621ae958a8c3ea2a5 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:15:24 -0500
Subject: [PATCH 08/13] physmem: add missing memory barrier
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [8/10] f6a9659f7cf40b78de6e85e4a7c06842273aa770
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 33828ca11da08436e1b32f3e79dabce3061a0427
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Fri Mar 3 14:36:32 2023 +0100
physmem: add missing memory barrier
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
softmmu/physmem.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 4d0ef5f92f..2b96fad302 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -3087,6 +3087,8 @@ void cpu_register_map_client(QEMUBH *bh)
qemu_mutex_lock(&map_client_list_lock);
client->bh = bh;
QLIST_INSERT_HEAD(&map_client_list, client, link);
+ /* Write map_client_list before reading in_use. */
+ smp_mb();
if (!qatomic_read(&bounce.in_use)) {
cpu_notify_map_clients_locked();
}
@@ -3279,6 +3281,7 @@ void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
qemu_vfree(bounce.buffer);
bounce.buffer = NULL;
memory_region_unref(bounce.mr);
+ /* Clear in_use before reading map_client_list. */
qatomic_mb_set(&bounce.in_use, false);
cpu_notify_map_clients();
}
--
2.37.3

View File

@ -0,0 +1,55 @@
From 57ee29fbb08f7b89ee1b7c75b749392c08af3b03 Mon Sep 17 00:00:00 2001
From: Bandan Das <bsd@redhat.com>
Date: Thu, 3 Aug 2023 15:23:54 -0400
Subject: [PATCH 1/5] qapi, i386/sev: Change the reduced-phys-bits value from 5
to 1
RH-Author: Bandan Das <None>
RH-MergeRequest: 296: Updates to SEV reduced-phys-bits parameter
RH-Bugzilla: 2214840
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [1/4] 4137cb3b57cbb175078bc908fb2301ea2b97fd17
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2214840
commit 798a818f50a9bfc01e8b5943090de458863b897b
Author: Tom Lendacky <thomas.lendacky@amd.com>
Date: Fri Sep 30 10:14:27 2022 -0500
qapi, i386/sev: Change the reduced-phys-bits value from 5 to 1
A guest only ever experiences, at most, 1 bit of reduced physical
addressing. Change the query-sev-capabilities json comment to use 1.
Fixes: 31dd67f684 ("sev/i386: qmp: add query-sev-capabilities command")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <cb96d8e09154533af4b4e6988469bc0b32390b65.1664550870.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
RHEL Notes:
Conflicts: Context differences, since commit 811b4ec7f8eb<qapi, target/i386/sev: Add cpu0-id to query-sev-capabilities>
is missing
Signed-off-by: Bandan Das <bsd@redhat.com>
---
qapi/misc-target.json | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/qapi/misc-target.json b/qapi/misc-target.json
index 4bc45d2474..ede9052440 100644
--- a/qapi/misc-target.json
+++ b/qapi/misc-target.json
@@ -205,7 +205,7 @@
#
# -> { "execute": "query-sev-capabilities" }
# <- { "return": { "pdh": "8CCDD8DDD", "cert-chain": "888CCCDDDEE",
-# "cbitpos": 47, "reduced-phys-bits": 5}}
+# "cbitpos": 47, "reduced-phys-bits": 1}}
#
##
{ 'command': 'query-sev-capabilities', 'returns': 'SevCapability',
--
2.37.3

View File

@ -0,0 +1,177 @@
From e7d0e29d1962092af58d0445439671a6e1d91f71 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:10:33 -0500
Subject: [PATCH 02/13] qatomic: add smp_mb__before/after_rmw()
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [2/10] 1f87eb3157abcf23f020881cedce42f76497f348
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit ff00bed1897c3d27adc5b0cec6f6eeb5a7d13176
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu Mar 2 11:10:56 2023 +0100
qatomic: add smp_mb__before/after_rmw()
On ARM, seqcst loads and stores (which QEMU does not use) are compiled
respectively as LDAR and STLR instructions. Even though LDAR is
also used for load-acquire operations, it also waits for all STLRs to
leave the store buffer. Thus, LDAR and STLR alone are load-acquire
and store-release operations, but LDAR also provides store-against-load
ordering as long as the previous store is a STLR.
Compare this to ARMv7, where store-release is DMB+STR and load-acquire
is LDR+DMB, but an additional DMB is needed between store-seqcst and
load-seqcst (e.g. DMB+STR+DMB+LDR+DMB); or with x86, where MOV provides
load-acquire and store-release semantics and the two can be reordered.
Likewise, on ARM sequentially consistent read-modify-write operations only
need to use LDAXR and STLXR respectively for the load and the store, while
on x86 they need to use the stronger LOCK prefix.
In a strange twist of events, however, the _stronger_ semantics
of the ARM instructions can end up causing bugs on ARM, not on x86.
The problems occur when seqcst atomics are mixed with relaxed atomics.
QEMU's atomics try to bridge the Linux API (that most of the developers
are familiar with) and the C11 API, and the two have a substantial
difference:
- in Linux, strongly-ordered atomics such as atomic_add_return() affect
the global ordering of _all_ memory operations, including for example
READ_ONCE()/WRITE_ONCE()
- in C11, sequentially consistent atomics (except for seq-cst fences)
only affect the ordering of sequentially consistent operations.
In particular, since relaxed loads are done with LDR on ARM, they are
not ordered against seqcst stores (which are done with STLR).
QEMU implements high-level synchronization primitives with the idea that
the primitives contain the necessary memory barriers, and the callers can
use relaxed atomics (qatomic_read/qatomic_set) or even regular accesses.
This is very much incompatible with the C11 view that seqcst accesses
are only ordered against other seqcst accesses, and requires using seqcst
fences as in the following example:
qatomic_set(&y, 1); qatomic_set(&x, 1);
smp_mb(); smp_mb();
... qatomic_read(&x) ... ... qatomic_read(&y) ...
When a qatomic_*() read-modify write operation is used instead of one
or both stores, developers that are more familiar with the Linux API may
be tempted to omit the smp_mb(), which will work on x86 but not on ARM.
This nasty difference between Linux and C11 read-modify-write operations
has already caused issues in util/async.c and more are being found.
Provide something similar to Linux smp_mb__before/after_atomic(); this
has the double function of documenting clearly why there is a memory
barrier, and avoiding a double barrier on x86 and s390x systems.
The new macro can already be put to use in qatomic_mb_set().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
docs/devel/atomics.rst | 26 +++++++++++++++++++++-----
include/qemu/atomic.h | 17 ++++++++++++++++-
2 files changed, 37 insertions(+), 6 deletions(-)
diff --git a/docs/devel/atomics.rst b/docs/devel/atomics.rst
index 52baa0736d..10fbfc58bb 100644
--- a/docs/devel/atomics.rst
+++ b/docs/devel/atomics.rst
@@ -25,7 +25,8 @@ provides macros that fall in three camps:
- weak atomic access and manual memory barriers: ``qatomic_read()``,
``qatomic_set()``, ``smp_rmb()``, ``smp_wmb()``, ``smp_mb()``,
- ``smp_mb_acquire()``, ``smp_mb_release()``, ``smp_read_barrier_depends()``;
+ ``smp_mb_acquire()``, ``smp_mb_release()``, ``smp_read_barrier_depends()``,
+ ``smp_mb__before_rmw()``, ``smp_mb__after_rmw()``;
- sequentially consistent atomic access: everything else.
@@ -470,7 +471,7 @@ and memory barriers, and the equivalents in QEMU:
sequential consistency.
- in QEMU, ``qatomic_read()`` and ``qatomic_set()`` do not participate in
- the total ordering enforced by sequentially-consistent operations.
+ the ordering enforced by read-modify-write operations.
This is because QEMU uses the C11 memory model. The following example
is correct in Linux but not in QEMU:
@@ -486,9 +487,24 @@ and memory barriers, and the equivalents in QEMU:
because the read of ``y`` can be moved (by either the processor or the
compiler) before the write of ``x``.
- Fixing this requires an ``smp_mb()`` memory barrier between the write
- of ``x`` and the read of ``y``. In the common case where only one thread
- writes ``x``, it is also possible to write it like this:
+ Fixing this requires a full memory barrier between the write of ``x`` and
+ the read of ``y``. QEMU provides ``smp_mb__before_rmw()`` and
+ ``smp_mb__after_rmw()``; they act both as an optimization,
+ avoiding the memory barrier on processors where it is unnecessary,
+ and as a clarification of this corner case of the C11 memory model:
+
+ +--------------------------------+
+ | QEMU (correct) |
+ +================================+
+ | :: |
+ | |
+ | a = qatomic_fetch_add(&x, 2);|
+ | smp_mb__after_rmw(); |
+ | b = qatomic_read(&y); |
+ +--------------------------------+
+
+ In the common case where only one thread writes ``x``, it is also possible
+ to write it like this:
+--------------------------------+
| QEMU (correct) |
diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
index 112a29910b..7855443cab 100644
--- a/include/qemu/atomic.h
+++ b/include/qemu/atomic.h
@@ -243,6 +243,20 @@
#define smp_wmb() smp_mb_release()
#define smp_rmb() smp_mb_acquire()
+/*
+ * SEQ_CST is weaker than the older __sync_* builtins and Linux
+ * kernel read-modify-write atomics. Provide a macro to obtain
+ * the same semantics.
+ */
+#if !defined(QEMU_SANITIZE_THREAD) && \
+ (defined(__i386__) || defined(__x86_64__) || defined(__s390x__))
+# define smp_mb__before_rmw() signal_barrier()
+# define smp_mb__after_rmw() signal_barrier()
+#else
+# define smp_mb__before_rmw() smp_mb()
+# define smp_mb__after_rmw() smp_mb()
+#endif
+
/* qatomic_mb_read/set semantics map Java volatile variables. They are
* less expensive on some platforms (notably POWER) than fully
* sequentially consistent operations.
@@ -257,7 +271,8 @@
#if !defined(__SANITIZE_THREAD__) && \
(defined(__i386__) || defined(__x86_64__) || defined(__s390x__))
/* This is more efficient than a store plus a fence. */
-# define qatomic_mb_set(ptr, i) ((void)qatomic_xchg(ptr, i))
+# define qatomic_mb_set(ptr, i) \
+ ({ (void)qatomic_xchg(ptr, i); smp_mb__after_rmw(); })
#else
# define qatomic_mb_set(ptr, i) \
({ qatomic_store_release(ptr, i); smp_mb(); })
--
2.37.3

View File

@ -0,0 +1,75 @@
From 2f03293910f3ac559f37d45c95325ae29638003a Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:15:14 -0500
Subject: [PATCH 07/13] qemu-coroutine-lock: add smp_mb__after_rmw()
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [7/10] 9cf1b6d3b0dd154489e75ad54a3000ea58983960
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit e3a3b6ec8169eab2feb241b4982585001512cd55
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Fri Mar 3 10:52:59 2023 +0100
qemu-coroutine-lock: add smp_mb__after_rmw()
mutex->from_push and mutex->handoff in qemu-coroutine-lock implement
the familiar pattern:
write a write b
smp_mb() smp_mb()
read b read a
The memory barrier is required by the C memory model even after a
SEQ_CST read-modify-write operation such as QSLIST_INSERT_HEAD_ATOMIC.
Add it and avoid the unclear qatomic_mb_read() operation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
util/qemu-coroutine-lock.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 2669403839..a03ed0e664 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -206,10 +206,16 @@ static void coroutine_fn qemu_co_mutex_lock_slowpath(AioContext *ctx,
trace_qemu_co_mutex_lock_entry(mutex, self);
push_waiter(mutex, &w);
+ /*
+ * Add waiter before reading mutex->handoff. Pairs with qatomic_mb_set
+ * in qemu_co_mutex_unlock.
+ */
+ smp_mb__after_rmw();
+
/* This is the "Responsibility Hand-Off" protocol; a lock() picks from
* a concurrent unlock() the responsibility of waking somebody up.
*/
- old_handoff = qatomic_mb_read(&mutex->handoff);
+ old_handoff = qatomic_read(&mutex->handoff);
if (old_handoff &&
has_waiters(mutex) &&
qatomic_cmpxchg(&mutex->handoff, old_handoff, 0) == old_handoff) {
@@ -308,6 +314,7 @@ void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex)
}
our_handoff = mutex->sequence;
+ /* Set handoff before checking for waiters. */
qatomic_mb_set(&mutex->handoff, our_handoff);
if (!has_waiters(mutex)) {
/* The concurrent lock has not added itself yet, so it
--
2.37.3

View File

@ -0,0 +1,61 @@
From 095811c08557b0a2ad1a433d28699ead1e5ef664 Mon Sep 17 00:00:00 2001
From: Bandan Das <bsd@redhat.com>
Date: Thu, 3 Aug 2023 15:12:15 -0400
Subject: [PATCH 2/5] qemu-options.hx: Update the reduced-phys-bits
documentation
RH-Author: Bandan Das <None>
RH-MergeRequest: 296: Updates to SEV reduced-phys-bits parameter
RH-Bugzilla: 2214840
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [2/4] f8e8f5aeff449a34ce90c6e55e2a51873a6e6a87
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2214840
commit 326e3015c4c6f3197157ea0bb00826ae740e2fad
Author: Tom Lendacky <thomas.lendacky@amd.com>
Date: Fri Sep 30 10:14:28 2022 -0500
qemu-options.hx: Update the reduced-phys-bits documentation
A guest only ever experiences, at most, 1 bit of reduced physical
addressing. Update the documentation to reflect this as well as change
the example value on the reduced-phys-bits option.
Fixes: a9b4942f48 ("target/i386: add Secure Encrypted Virtualization (SEV) object")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <13a62ced1808546c1d398e2025cf85f4c94ae123.1664550870.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
---
qemu-options.hx | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/qemu-options.hx b/qemu-options.hx
index 4b7798088b..981248e283 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -5204,7 +5204,7 @@ SRST
physical address space. The ``reduced-phys-bits`` is used to
provide the number of bits we loose in physical address space.
Similar to C-bit, the value is Host family dependent. On EPYC,
- the value should be 5.
+ a guest will lose a maximum of 1 bit, so the value should be 1.
The ``sev-device`` provides the device file to use for
communicating with the SEV firmware running inside AMD Secure
@@ -5239,7 +5239,7 @@ SRST
# |qemu_system_x86| \\
...... \\
- -object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=5 \\
+ -object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1 \\
-machine ...,memory-encryption=sev0 \\
.....
--
2.37.3

View File

@ -0,0 +1,146 @@
From d46ca52c3f42add549bd3790a41d06594821334e Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:10:57 -0500
Subject: [PATCH 03/13] qemu-thread-posix: cleanup, fix, document QemuEvent
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [3/10] 746070c4d78c7f0a9ac4456d9aee69475acb8964
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 9586a1329f5dce6c1d7f4de53cf0536644d7e593
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu Mar 2 11:19:52 2023 +0100
qemu-thread-posix: cleanup, fix, document QemuEvent
QemuEvent is currently broken on ARM due to missing memory barriers
after qatomic_*(). Apart from adding the memory barrier, a closer look
reveals some unpaired memory barriers too. Document more clearly what
is going on.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
util/qemu-thread-posix.c | 69 ++++++++++++++++++++++++++++------------
1 file changed, 49 insertions(+), 20 deletions(-)
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index e1225b63bd..dd3b6d4670 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -430,13 +430,21 @@ void qemu_event_destroy(QemuEvent *ev)
void qemu_event_set(QemuEvent *ev)
{
- /* qemu_event_set has release semantics, but because it *loads*
+ assert(ev->initialized);
+
+ /*
+ * Pairs with both qemu_event_reset() and qemu_event_wait().
+ *
+ * qemu_event_set has release semantics, but because it *loads*
* ev->value we need a full memory barrier here.
*/
- assert(ev->initialized);
smp_mb();
if (qatomic_read(&ev->value) != EV_SET) {
- if (qatomic_xchg(&ev->value, EV_SET) == EV_BUSY) {
+ int old = qatomic_xchg(&ev->value, EV_SET);
+
+ /* Pairs with memory barrier in kernel futex_wait system call. */
+ smp_mb__after_rmw();
+ if (old == EV_BUSY) {
/* There were waiters, wake them up. */
qemu_futex_wake(ev, INT_MAX);
}
@@ -445,18 +453,19 @@ void qemu_event_set(QemuEvent *ev)
void qemu_event_reset(QemuEvent *ev)
{
- unsigned value;
-
assert(ev->initialized);
- value = qatomic_read(&ev->value);
- smp_mb_acquire();
- if (value == EV_SET) {
- /*
- * If there was a concurrent reset (or even reset+wait),
- * do nothing. Otherwise change EV_SET->EV_FREE.
- */
- qatomic_or(&ev->value, EV_FREE);
- }
+
+ /*
+ * If there was a concurrent reset (or even reset+wait),
+ * do nothing. Otherwise change EV_SET->EV_FREE.
+ */
+ qatomic_or(&ev->value, EV_FREE);
+
+ /*
+ * Order reset before checking the condition in the caller.
+ * Pairs with the first memory barrier in qemu_event_set().
+ */
+ smp_mb__after_rmw();
}
void qemu_event_wait(QemuEvent *ev)
@@ -464,20 +473,40 @@ void qemu_event_wait(QemuEvent *ev)
unsigned value;
assert(ev->initialized);
- value = qatomic_read(&ev->value);
- smp_mb_acquire();
+
+ /*
+ * qemu_event_wait must synchronize with qemu_event_set even if it does
+ * not go down the slow path, so this load-acquire is needed that
+ * synchronizes with the first memory barrier in qemu_event_set().
+ *
+ * If we do go down the slow path, there is no requirement at all: we
+ * might miss a qemu_event_set() here but ultimately the memory barrier in
+ * qemu_futex_wait() will ensure the check is done correctly.
+ */
+ value = qatomic_load_acquire(&ev->value);
if (value != EV_SET) {
if (value == EV_FREE) {
/*
- * Leave the event reset and tell qemu_event_set that there
- * are waiters. No need to retry, because there cannot be
- * a concurrent busy->free transition. After the CAS, the
- * event will be either set or busy.
+ * Leave the event reset and tell qemu_event_set that there are
+ * waiters. No need to retry, because there cannot be a concurrent
+ * busy->free transition. After the CAS, the event will be either
+ * set or busy.
+ *
+ * This cmpxchg doesn't have particular ordering requirements if it
+ * succeeds (moving the store earlier can only cause qemu_event_set()
+ * to issue _more_ wakeups), the failing case needs acquire semantics
+ * like the load above.
*/
if (qatomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY) == EV_SET) {
return;
}
}
+
+ /*
+ * This is the final check for a concurrent set, so it does need
+ * a smp_mb() pairing with the second barrier of qemu_event_set().
+ * The barrier is inside the FUTEX_WAIT system call.
+ */
qemu_futex_wait(ev, EV_BUSY);
}
}
--
2.37.3

View File

@ -0,0 +1,162 @@
From fa730378c42567e77eaf3e70983108f31f9001b9 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Thu, 9 Mar 2023 08:11:05 -0500
Subject: [PATCH 04/13] qemu-thread-win32: cleanup, fix, document QemuEvent
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 263: qatomic: add smp_mb__before/after_rmw()
RH-Bugzilla: 2168472
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Commit: [4/10] 43d5bd903b460d4c3c5793a456820e8c5c8521d9
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168472
commit 6c5df4b48f0c52a61342ecb307a43f4c2a3565c4
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu Mar 2 11:22:50 2023 +0100
qemu-thread-win32: cleanup, fix, document QemuEvent
QemuEvent is currently broken on ARM due to missing memory barriers
after qatomic_*(). Apart from adding the memory barrier, a closer look
reveals some unpaired memory barriers that are not really needed and
complicated the functions unnecessarily. Also, it is relying on
a memory barrier in ResetEvent(); the barrier _ought_ to be there
but there is really no documentation about it, so make it explicit.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
util/qemu-thread-win32.c | 82 +++++++++++++++++++++++++++-------------
1 file changed, 56 insertions(+), 26 deletions(-)
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 52eb19f351..c10249bc2e 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -246,12 +246,20 @@ void qemu_event_destroy(QemuEvent *ev)
void qemu_event_set(QemuEvent *ev)
{
assert(ev->initialized);
- /* qemu_event_set has release semantics, but because it *loads*
+
+ /*
+ * Pairs with both qemu_event_reset() and qemu_event_wait().
+ *
+ * qemu_event_set has release semantics, but because it *loads*
* ev->value we need a full memory barrier here.
*/
smp_mb();
if (qatomic_read(&ev->value) != EV_SET) {
- if (qatomic_xchg(&ev->value, EV_SET) == EV_BUSY) {
+ int old = qatomic_xchg(&ev->value, EV_SET);
+
+ /* Pairs with memory barrier after ResetEvent. */
+ smp_mb__after_rmw();
+ if (old == EV_BUSY) {
/* There were waiters, wake them up. */
SetEvent(ev->event);
}
@@ -260,17 +268,19 @@ void qemu_event_set(QemuEvent *ev)
void qemu_event_reset(QemuEvent *ev)
{
- unsigned value;
-
assert(ev->initialized);
- value = qatomic_read(&ev->value);
- smp_mb_acquire();
- if (value == EV_SET) {
- /* If there was a concurrent reset (or even reset+wait),
- * do nothing. Otherwise change EV_SET->EV_FREE.
- */
- qatomic_or(&ev->value, EV_FREE);
- }
+
+ /*
+ * If there was a concurrent reset (or even reset+wait),
+ * do nothing. Otherwise change EV_SET->EV_FREE.
+ */
+ qatomic_or(&ev->value, EV_FREE);
+
+ /*
+ * Order reset before checking the condition in the caller.
+ * Pairs with the first memory barrier in qemu_event_set().
+ */
+ smp_mb__after_rmw();
}
void qemu_event_wait(QemuEvent *ev)
@@ -278,29 +288,49 @@ void qemu_event_wait(QemuEvent *ev)
unsigned value;
assert(ev->initialized);
- value = qatomic_read(&ev->value);
- smp_mb_acquire();
+
+ /*
+ * qemu_event_wait must synchronize with qemu_event_set even if it does
+ * not go down the slow path, so this load-acquire is needed that
+ * synchronizes with the first memory barrier in qemu_event_set().
+ *
+ * If we do go down the slow path, there is no requirement at all: we
+ * might miss a qemu_event_set() here but ultimately the memory barrier in
+ * qemu_futex_wait() will ensure the check is done correctly.
+ */
+ value = qatomic_load_acquire(&ev->value);
if (value != EV_SET) {
if (value == EV_FREE) {
- /* qemu_event_set is not yet going to call SetEvent, but we are
- * going to do another check for EV_SET below when setting EV_BUSY.
- * At that point it is safe to call WaitForSingleObject.
+ /*
+ * Here the underlying kernel event is reset, but qemu_event_set is
+ * not yet going to call SetEvent. However, there will be another
+ * check for EV_SET below when setting EV_BUSY. At that point it
+ * is safe to call WaitForSingleObject.
*/
ResetEvent(ev->event);
- /* Tell qemu_event_set that there are waiters. No need to retry
- * because there cannot be a concurrent busy->free transition.
- * After the CAS, the event will be either set or busy.
+ /*
+ * It is not clear whether ResetEvent provides this barrier; kernel
+ * APIs (KeResetEvent/KeClearEvent) do not. Better safe than sorry!
+ */
+ smp_mb();
+
+ /*
+ * Leave the event reset and tell qemu_event_set that there are
+ * waiters. No need to retry, because there cannot be a concurrent
+ * busy->free transition. After the CAS, the event will be either
+ * set or busy.
*/
if (qatomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY) == EV_SET) {
- value = EV_SET;
- } else {
- value = EV_BUSY;
+ return;
}
}
- if (value == EV_BUSY) {
- WaitForSingleObject(ev->event, INFINITE);
- }
+
+ /*
+ * ev->value is now EV_BUSY. Since we didn't observe EV_SET,
+ * qemu_event_set() must observe EV_BUSY and call SetEvent().
+ */
+ WaitForSingleObject(ev->event, INFINITE);
}
}
--
2.37.3

View File

@ -0,0 +1,55 @@
From c5cb3e97098834f9cf12b6c5260d9b43d68d64eb Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 07/15] raven: disable reentrancy detection for iomem
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 277: memory: prevent dma-reentracy issues
RH-Bugzilla: 1999236
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [7/12] f41983390acba68043d386be090172dd17a5e58c (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999236
Upstream: Merged
CVE: CVE-2021-3750
commit 6dad5a6810d9c60ca320d01276f6133bbcfa1fc7
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:12 2023 -0400
raven: disable reentrancy detection for iomem
As the code is designed for re-entrant calls from raven_io_ops to
pci-conf, mark raven_io_ops as reentrancy-safe.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20230427211013.2994127-8-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
hw/pci-host/raven.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/hw/pci-host/raven.c b/hw/pci-host/raven.c
index 6e514f75eb..245b1653e4 100644
--- a/hw/pci-host/raven.c
+++ b/hw/pci-host/raven.c
@@ -294,6 +294,13 @@ static void raven_pcihost_initfn(Object *obj)
memory_region_init(&s->pci_memory, obj, "pci-memory", 0x3f000000);
address_space_init(&s->pci_io_as, &s->pci_io, "raven-io");
+ /*
+ * Raven's raven_io_ops use the address-space API to access pci-conf-idx
+ * (which is also owned by the raven device). As such, mark the
+ * pci_io_non_contiguous as re-entrancy safe.
+ */
+ s->pci_io_non_contiguous.disable_reentrancy_guard = true;
+
/* CPU address space */
memory_region_add_subregion(address_space_mem, PCI_IO_BASE_ADDR,
&s->pci_io);
--
2.37.3

View File

@ -0,0 +1,88 @@
From 3c7bc4319d4e475c820a63176d18afb7b4b2ed78 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 02/22] s390: kvm: adjust diag318 resets to retain data
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [1/21] 16f2ff166efdd26a3be98d7c97d3b184598d1ca4
Bugzilla: https://bugzilla.redhat.com/2169308
commit c35aff184b2ed5be930da671ea25c857713555af
Author: Collin L. Walling <walling@linux.ibm.com>
Date: Wed Nov 17 10:23:03 2021 -0500
s390: kvm: adjust diag318 resets to retain data
The CPNC portion of the diag318 data is erroneously reset during an
initial CPU reset caused by SIGP. Let's go ahead and relocate the
diag318_info field within the CPUS390XState struct such that it is
only zeroed during a clear reset. This way, the CPNC will be retained
for each VCPU in the configuration after the diag318 instruction
has been invoked.
The s390_machine_reset code already takes care of zeroing the diag318
data on VM resets, which also cover resets caused by diag308.
Fixes: fabdada9357b ("s390: guest support for diagnose 0x318")
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Message-Id: <20211117152303.627969-1-walling@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/cpu.h | 4 ++--
target/s390x/kvm/kvm.c | 4 ++++
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index ca3845d023..a75e559134 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -63,6 +63,8 @@ struct CPUS390XState {
uint64_t etoken; /* etoken */
uint64_t etoken_extension; /* etoken extension */
+ uint64_t diag318_info;
+
/* Fields up to this point are not cleared by initial CPU reset */
struct {} start_initial_reset_fields;
@@ -118,8 +120,6 @@ struct CPUS390XState {
uint16_t external_call_addr;
DECLARE_BITMAP(emergency_signals, S390_MAX_CPUS);
- uint64_t diag318_info;
-
#if !defined(CONFIG_USER_ONLY)
uint64_t tlb_fill_tec; /* translation exception code during tlb_fill */
int tlb_fill_exc; /* exception number seen during tlb_fill */
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index d36b44f32a..8d36c377b5 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -1598,6 +1598,10 @@ void kvm_s390_set_diag318(CPUState *cs, uint64_t diag318_info)
env->diag318_info = diag318_info;
cs->kvm_run->s.regs.diag318 = diag318_info;
cs->kvm_run->kvm_dirty_regs |= KVM_SYNC_DIAG318;
+ /*
+ * diag 318 info is zeroed during a clear reset and
+ * diag 308 IPL subcodes.
+ */
}
}
--
2.37.3

View File

@ -0,0 +1,168 @@
From 4d940934c304a71813dfa4598b20fafe9d2f5625 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 19/22] s390x/css: revert SCSW ctrl/flag bits on error
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [18/21] e4d5797ab93ba4afd9978a1d3e1f9d05da301506
Bugzilla: https://bugzilla.redhat.com/2169308
commit f53b033e4cd2e7706df3cca04f3bf3c5ffc6b08c
Author: Peter Jin <pjin@linux.ibm.com>
Date: Thu Oct 27 23:23:41 2022 +0200
s390x/css: revert SCSW ctrl/flag bits on error
Revert the control and flag bits in the subchannel status word in case
the SSCH operation fails with non-zero CC (ditto for CSCH and HSCH).
According to POPS, the control and flag bits are only changed if SSCH,
CSCH, and HSCH return CC 0, and no other action should be taken otherwise.
In order to simulate that after the fact, the bits need to be reverted on
non-zero CC.
While the do_subchannel_work logic for virtual (virtio) devices will
return condition code 0, passthrough (vfio) devices may encounter
errors from either the host kernel or real hardware that need to be
accounted for after this point. This includes restoring the state of
the Subchannel Status Word to reflect the subchannel, as these bits
would not be set in the event of a non-zero condition code from the
affected instructions.
Experimentation has shown that a failure on a START SUBCHANNEL (SSCH)
to a passthrough device would leave the subchannel with the START
PENDING activity control bit set, thus blocking subsequent SSCH
operations in css_do_ssch() until some form of error recovery was
undertaken since no interrupt would be expected.
Signed-off-by: Peter Jin <pjin@linux.ibm.com>
Message-Id: <20221027212341.2904795-1-pjin@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
[thuth: Updated the commit description to Eric's suggestion]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/s390x/css.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 48 insertions(+), 3 deletions(-)
diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index 7d9523f811..95d1b3a3ce 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -1522,21 +1522,37 @@ IOInstEnding css_do_xsch(SubchDev *sch)
IOInstEnding css_do_csch(SubchDev *sch)
{
SCHIB *schib = &sch->curr_status;
+ uint16_t old_scsw_ctrl;
+ IOInstEnding ccode;
if (~(schib->pmcw.flags) & (PMCW_FLAGS_MASK_DNV | PMCW_FLAGS_MASK_ENA)) {
return IOINST_CC_NOT_OPERATIONAL;
}
+ /*
+ * Save the current scsw.ctrl in case CSCH fails and we need
+ * to revert the scsw to the status quo ante.
+ */
+ old_scsw_ctrl = schib->scsw.ctrl;
+
/* Trigger the clear function. */
schib->scsw.ctrl &= ~(SCSW_CTRL_MASK_FCTL | SCSW_CTRL_MASK_ACTL);
schib->scsw.ctrl |= SCSW_FCTL_CLEAR_FUNC | SCSW_ACTL_CLEAR_PEND;
- return do_subchannel_work(sch);
+ ccode = do_subchannel_work(sch);
+
+ if (ccode != IOINST_CC_EXPECTED) {
+ schib->scsw.ctrl = old_scsw_ctrl;
+ }
+
+ return ccode;
}
IOInstEnding css_do_hsch(SubchDev *sch)
{
SCHIB *schib = &sch->curr_status;
+ uint16_t old_scsw_ctrl;
+ IOInstEnding ccode;
if (~(schib->pmcw.flags) & (PMCW_FLAGS_MASK_DNV | PMCW_FLAGS_MASK_ENA)) {
return IOINST_CC_NOT_OPERATIONAL;
@@ -1553,6 +1569,12 @@ IOInstEnding css_do_hsch(SubchDev *sch)
return IOINST_CC_BUSY;
}
+ /*
+ * Save the current scsw.ctrl in case HSCH fails and we need
+ * to revert the scsw to the status quo ante.
+ */
+ old_scsw_ctrl = schib->scsw.ctrl;
+
/* Trigger the halt function. */
schib->scsw.ctrl |= SCSW_FCTL_HALT_FUNC;
schib->scsw.ctrl &= ~SCSW_FCTL_START_FUNC;
@@ -1564,7 +1586,13 @@ IOInstEnding css_do_hsch(SubchDev *sch)
}
schib->scsw.ctrl |= SCSW_ACTL_HALT_PEND;
- return do_subchannel_work(sch);
+ ccode = do_subchannel_work(sch);
+
+ if (ccode != IOINST_CC_EXPECTED) {
+ schib->scsw.ctrl = old_scsw_ctrl;
+ }
+
+ return ccode;
}
static void css_update_chnmon(SubchDev *sch)
@@ -1605,6 +1633,8 @@ static void css_update_chnmon(SubchDev *sch)
IOInstEnding css_do_ssch(SubchDev *sch, ORB *orb)
{
SCHIB *schib = &sch->curr_status;
+ uint16_t old_scsw_ctrl, old_scsw_flags;
+ IOInstEnding ccode;
if (~(schib->pmcw.flags) & (PMCW_FLAGS_MASK_DNV | PMCW_FLAGS_MASK_ENA)) {
return IOINST_CC_NOT_OPERATIONAL;
@@ -1626,11 +1656,26 @@ IOInstEnding css_do_ssch(SubchDev *sch, ORB *orb)
}
sch->orb = *orb;
sch->channel_prog = orb->cpa;
+
+ /*
+ * Save the current scsw.ctrl and scsw.flags in case SSCH fails and we need
+ * to revert the scsw to the status quo ante.
+ */
+ old_scsw_ctrl = schib->scsw.ctrl;
+ old_scsw_flags = schib->scsw.flags;
+
/* Trigger the start function. */
schib->scsw.ctrl |= (SCSW_FCTL_START_FUNC | SCSW_ACTL_START_PEND);
schib->scsw.flags &= ~SCSW_FLAGS_MASK_PNO;
- return do_subchannel_work(sch);
+ ccode = do_subchannel_work(sch);
+
+ if (ccode != IOINST_CC_EXPECTED) {
+ schib->scsw.ctrl = old_scsw_ctrl;
+ schib->scsw.flags = old_scsw_flags;
+ }
+
+ return ccode;
}
static void copy_irb_to_guest(IRB *dest, const IRB *src, const PMCW *pmcw,
--
2.37.3

View File

@ -0,0 +1,73 @@
From 6c815e78cea7c26e9a3526cbb686f728eac31021 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 12/22] s390x: follow qdev tree to detect SCSI device on a CCW
bus
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [11/21] 97303bc9c356e8828d185868736b395bc0b70214
Bugzilla: https://bugzilla.redhat.com/2169308
commit 7d2eb76d0407fc391b78df16d17f1e616ec3e228
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Mon Mar 28 09:40:00 2022 +0200
s390x: follow qdev tree to detect SCSI device on a CCW bus
Do not make assumptions on the parent type of the SCSIDevice, instead
use object_dynamic_cast all the way up to the CcwDevice. This is cleaner
because there is no guarantee that the bus is on a virtio-scsi device;
that is only the case for the default configuration of QEMU's s390x
target.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/s390x/ipl.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index eb7fc4c4ae..9051d8652d 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -376,14 +376,18 @@ static CcwDevice *s390_get_ccw_device(DeviceState *dev_st, int *devtype)
object_dynamic_cast(OBJECT(dev_st),
TYPE_SCSI_DEVICE);
if (sd) {
- SCSIBus *bus = scsi_bus_from_device(sd);
- VirtIOSCSI *vdev = container_of(bus, VirtIOSCSI, bus);
- VirtIOSCSICcw *scsi_ccw = container_of(vdev, VirtIOSCSICcw,
- vdev);
-
- ccw_dev = (CcwDevice *)object_dynamic_cast(OBJECT(scsi_ccw),
- TYPE_CCW_DEVICE);
- tmp_dt = CCW_DEVTYPE_SCSI;
+ SCSIBus *sbus = scsi_bus_from_device(sd);
+ VirtIODevice *vdev = (VirtIODevice *)
+ object_dynamic_cast(OBJECT(sbus->qbus.parent),
+ TYPE_VIRTIO_DEVICE);
+ if (vdev) {
+ ccw_dev = (CcwDevice *)
+ object_dynamic_cast(OBJECT(qdev_get_parent_bus(DEVICE(vdev))->parent),
+ TYPE_CCW_DEVICE);
+ if (ccw_dev) {
+ tmp_dt = CCW_DEVTYPE_SCSI;
+ }
+ }
}
}
}
--
2.37.3

View File

@ -0,0 +1,109 @@
From 2fc8489b70445a3db0a2e72c1f1edb4d61d404d6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Mon, 16 Jan 2023 18:46:05 +0100
Subject: [PATCH] s390x/pv: Implement a CGS check helper
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 271: Secure guest can't boot with maximal number of vcpus (248)
RH-Bugzilla: 2187159
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Jon Maloy <jmaloy@redhat.com>
RH-Commit: [1/1] c870d525c48ab6d0df964b5abe48efe2528c9883
When a protected VM is started with the maximum number of CPUs (248),
the service call providing information on the CPUs requires more
buffer space than allocated and QEMU disgracefully aborts :
LOADPARM=[........]
Using virtio-blk.
Using SCSI scheme.
...................................................................................
qemu-system-s390x: KVM_S390_MEM_OP failed: Argument list too long
When protected virtualization is initialized, compute the maximum
number of vCPUs supported by the machine and return useful information
to the user before the machine starts in case of error.
Suggested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20230116174607.2459498-2-clg@kaod.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 75d7150c636569f6687f7e70a33be893be43eb5f)
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/s390x/pv.c | 40 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
diff --git a/hw/s390x/pv.c b/hw/s390x/pv.c
index 728ba24547..749e5db1ce 100644
--- a/hw/s390x/pv.c
+++ b/hw/s390x/pv.c
@@ -20,6 +20,7 @@
#include "exec/confidential-guest-support.h"
#include "hw/s390x/ipl.h"
#include "hw/s390x/pv.h"
+#include "hw/s390x/sclp.h"
#include "target/s390x/kvm/kvm_s390x.h"
static bool info_valid;
@@ -249,6 +250,41 @@ struct S390PVGuestClass {
ConfidentialGuestSupportClass parent_class;
};
+/*
+ * If protected virtualization is enabled, the amount of data that the
+ * Read SCP Info Service Call can use is limited to one page. The
+ * available space also depends on the Extended-Length SCCB (ELS)
+ * feature which can take more buffer space to store feature
+ * information. This impacts the maximum number of CPUs supported in
+ * the machine.
+ */
+static uint32_t s390_pv_get_max_cpus(void)
+{
+ int offset_cpu = s390_has_feat(S390_FEAT_EXTENDED_LENGTH_SCCB) ?
+ offsetof(ReadInfo, entries) : SCLP_READ_SCP_INFO_FIXED_CPU_OFFSET;
+
+ return (TARGET_PAGE_SIZE - offset_cpu) / sizeof(CPUEntry);
+}
+
+static bool s390_pv_check_cpus(Error **errp)
+{
+ MachineState *ms = MACHINE(qdev_get_machine());
+ uint32_t pv_max_cpus = s390_pv_get_max_cpus();
+
+ if (ms->smp.max_cpus > pv_max_cpus) {
+ error_setg(errp, "Protected VMs support a maximum of %d CPUs",
+ pv_max_cpus);
+ return false;
+ }
+
+ return true;
+}
+
+static bool s390_pv_guest_check(ConfidentialGuestSupport *cgs, Error **errp)
+{
+ return s390_pv_check_cpus(errp);
+}
+
int s390_pv_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
{
if (!object_dynamic_cast(OBJECT(cgs), TYPE_S390_PV_GUEST)) {
@@ -261,6 +297,10 @@ int s390_pv_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
return -1;
}
+ if (!s390_pv_guest_check(cgs, errp)) {
+ return -1;
+ }
+
cgs->ready = true;
return 0;
--
2.39.1

View File

@ -0,0 +1,77 @@
From 63ffa29eeb0062dd9145fa97e92d87a5374ae807 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 07/22] s390x: sigp: Reorder the SIGP STOP code
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [6/21] 0c957b3f4a2d6abb278375a7080055502fa8e34d
Bugzilla: https://bugzilla.redhat.com/2169308
commit 59b9b5186e44a90088a91ed7a7493b03027e4f1f
Author: Eric Farman <farman@linux.ibm.com>
Date: Mon Dec 13 22:09:19 2021 +0100
s390x: sigp: Reorder the SIGP STOP code
Let's wait to mark the VCPU STOPPED until the possible
STORE STATUS operation is completed, so that we know the
CPU is fully stopped and done doing anything. (When we
also clear the possible sigp_order field for STOP orders.)
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20211213210919.856693-2-farman@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/sigp.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/target/s390x/sigp.c b/target/s390x/sigp.c
index 51c727834c..9dd977349a 100644
--- a/target/s390x/sigp.c
+++ b/target/s390x/sigp.c
@@ -139,7 +139,7 @@ static void sigp_stop_and_store_status(CPUState *cs, run_on_cpu_data arg)
case S390_CPU_STATE_OPERATING:
cpu->env.sigp_order = SIGP_STOP_STORE_STATUS;
cpu_inject_stop(cpu);
- /* store will be performed in do_stop_interrup() */
+ /* store will be performed in do_stop_interrupt() */
break;
case S390_CPU_STATE_STOPPED:
/* already stopped, just store the status */
@@ -479,13 +479,17 @@ void do_stop_interrupt(CPUS390XState *env)
{
S390CPU *cpu = env_archcpu(env);
- if (s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu) == 0) {
- qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
- }
+ /*
+ * Complete the STOP operation before exposing the CPU as
+ * STOPPED to the system.
+ */
if (cpu->env.sigp_order == SIGP_STOP_STORE_STATUS) {
s390_store_status(cpu, S390_STORE_STATUS_DEF_ADDR, true);
}
env->sigp_order = 0;
+ if (s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu) == 0) {
+ qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
+ }
env->pending_int &= ~INTERRUPT_STOP;
}
--
2.37.3

View File

@ -0,0 +1,55 @@
From 85c0b90fe4ce1e191e215a1fb2fccfe7269527e3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 08/22] s390x/tcg: Fix BRASL with a large negative offset
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [7/21] f2eb97bf300afcb440cd5dc6d398ce7ad34f1db9
Bugzilla: https://bugzilla.redhat.com/2169308
commit fc3dd86a290a9c7c3c3273961b03058ae8f1d49f
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date: Mon Mar 14 11:42:30 2022 +0100
s390x/tcg: Fix BRASL with a large negative offset
When RI2 is 0x80000000, qemu enters an infinite loop instead of jumping
backwards. Fix by adding a missing cast, like in in2_ri2().
Fixes: 8ac33cdb8bfb ("Convert BRANCH AND SAVE")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20220314104232.675863-2-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/translate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index b14e6a04a7..8147d952df 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -1567,7 +1567,7 @@ static DisasJumpType op_bal(DisasContext *s, DisasOps *o)
static DisasJumpType op_basi(DisasContext *s, DisasOps *o)
{
pc_to_link_info(o->out, s, s->pc_tmp);
- return help_goto_direct(s, s->base.pc_next + 2 * get_field(s, i2));
+ return help_goto_direct(s, s->base.pc_next + (int64_t)get_field(s, i2) * 2);
}
static DisasJumpType op_bc(DisasContext *s, DisasOps *o)
--
2.37.3

View File

@ -0,0 +1,55 @@
From b7440db8874a62631427d0b822922747bad9771b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 09/22] s390x/tcg: Fix BRCL with a large negative offset
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [8/21] 60abe03ceba239268b72ff79e2945b73822fb72f
Bugzilla: https://bugzilla.redhat.com/2169308
commit 16ed5f14215b20c8dc49b96e2149032ba3238beb
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date: Mon Mar 14 11:42:31 2022 +0100
s390x/tcg: Fix BRCL with a large negative offset
When RI2 is 0x80000000, qemu enters an infinite loop instead of jumping
backwards. Fix by adding a missing cast, like in in2_ri2().
Fixes: 7233f2ed1717 ("target-s390: Convert BRANCH ON CONDITION")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20220314104232.675863-3-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/translate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 8147d952df..7ff7f90e23 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -1201,7 +1201,7 @@ static DisasJumpType help_branch(DisasContext *s, DisasCompare *c,
bool is_imm, int imm, TCGv_i64 cdest)
{
DisasJumpType ret;
- uint64_t dest = s->base.pc_next + 2 * imm;
+ uint64_t dest = s->base.pc_next + (int64_t)imm * 2;
TCGLabel *lab;
/* Take care of the special cases first. */
--
2.37.3

View File

@ -0,0 +1,57 @@
From 5eae4fd33e2101630ccb7aadeb3ba965800f6f32 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 17/22] s390x/tcg: Fix opcode for lzrf
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [16/21] 43af79d2c9cd818bfa7ac1819bd9964c86915d97
Bugzilla: https://bugzilla.redhat.com/2169308
commit 131aafa7eff4aa4d747cb7113726b27394a38866
Author: Christian Borntraeger <borntraeger@linux.ibm.com>
Date: Wed Sep 14 12:57:50 2022 +0200
s390x/tcg: Fix opcode for lzrf
Fix the opcode for Load and Zero Rightmost Byte (32).
Fixes: c2a5c1d718ea ("target/s390x: Implement load-and-zero-rightmost-byte insns")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20220914105750.767697-1-borntraeger@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/insn-data.def | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index 96d4794162..d54673a3ba 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -463,7 +463,7 @@
C(0xe39f, LAT, RXY_a, LAT, 0, m2_32u, r1, 0, lat, 0)
C(0xe385, LGAT, RXY_a, LAT, 0, a2, r1, 0, lgat, 0)
/* LOAD AND ZERO RIGHTMOST BYTE */
- C(0xe3eb, LZRF, RXY_a, LZRB, 0, m2_32u, new, r1_32, lzrb, 0)
+ C(0xe33b, LZRF, RXY_a, LZRB, 0, m2_32u, new, r1_32, lzrb, 0)
C(0xe32a, LZRG, RXY_a, LZRB, 0, m2_64, r1, 0, lzrb, 0)
/* LOAD LOGICAL AND ZERO RIGHTMOST BYTE */
C(0xe33a, LLZRGF, RXY_a, LZRB, 0, m2_32u, r1, 0, lzrb, 0)
--
2.37.3

View File

@ -0,0 +1,176 @@
From df836ee4b4e2a69cca5042a3a9daf2c41dc2aa58 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Tue, 21 Feb 2023 16:22:16 -0500
Subject: [PATCH 11/13] scsi: protect req->aiocb with AioContext lock
RH-Author: Stefan Hajnoczi <stefanha@redhat.com>
RH-MergeRequest: 264: scsi: protect req->aiocb with AioContext lock
RH-Bugzilla: 2090990
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [1/3] e6a6d4109713e0fd6d6c515535c66196fea98688
If requests are being processed in the IOThread when a SCSIDevice is
unplugged, scsi_device_purge_requests() -> scsi_req_cancel_async() races
with I/O completion callbacks. Both threads load and store req->aiocb.
This can lead to assert(r->req.aiocb == NULL) failures and undefined
behavior.
Protect r->req.aiocb with the AioContext lock to prevent the race.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230221212218.1378734-2-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7b7fc3d0102dafe8eb44802493036a526e921a71)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
hw/scsi/scsi-disk.c | 23 ++++++++++++++++-------
hw/scsi/scsi-generic.c | 11 ++++++-----
2 files changed, 22 insertions(+), 12 deletions(-)
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index d4914178ea..179ce22c4a 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -270,9 +270,11 @@ static void scsi_aio_complete(void *opaque, int ret)
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
if (scsi_disk_req_check_error(r, ret, true)) {
goto done;
}
@@ -354,10 +356,11 @@ static void scsi_dma_complete(void *opaque, int ret)
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
if (ret < 0) {
block_acct_failed(blk_get_stats(s->qdev.conf.blk), &r->acct);
} else {
@@ -390,10 +393,11 @@ static void scsi_read_complete(void *opaque, int ret)
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
if (ret < 0) {
block_acct_failed(blk_get_stats(s->qdev.conf.blk), &r->acct);
} else {
@@ -443,10 +447,11 @@ static void scsi_do_read_cb(void *opaque, int ret)
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert (r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
if (ret < 0) {
block_acct_failed(blk_get_stats(s->qdev.conf.blk), &r->acct);
} else {
@@ -527,10 +532,11 @@ static void scsi_write_complete(void * opaque, int ret)
SCSIDiskReq *r = (SCSIDiskReq *)opaque;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert (r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
if (ret < 0) {
block_acct_failed(blk_get_stats(s->qdev.conf.blk), &r->acct);
} else {
@@ -1659,10 +1665,11 @@ static void scsi_unmap_complete(void *opaque, int ret)
SCSIDiskReq *r = data->r;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
if (scsi_disk_req_check_error(r, ret, true)) {
scsi_req_unref(&r->req);
g_free(data);
@@ -1738,9 +1745,11 @@ static void scsi_write_same_complete(void *opaque, int ret)
SCSIDiskReq *r = data->r;
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+ aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
+
if (scsi_disk_req_check_error(r, ret, true)) {
goto done;
}
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 3742899839..a1a40df64b 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -111,10 +111,11 @@ static void scsi_command_complete(void *opaque, int ret)
SCSIGenericReq *r = (SCSIGenericReq *)opaque;
SCSIDevice *s = r->req.dev;
+ aio_context_acquire(blk_get_aio_context(s->conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->conf.blk));
scsi_command_complete_noio(r, ret);
aio_context_release(blk_get_aio_context(s->conf.blk));
}
@@ -269,11 +270,11 @@ static void scsi_read_complete(void * opaque, int ret)
SCSIDevice *s = r->req.dev;
int len;
+ aio_context_acquire(blk_get_aio_context(s->conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->conf.blk));
-
if (ret || r->req.io_canceled) {
scsi_command_complete_noio(r, ret);
goto done;
@@ -387,11 +388,11 @@ static void scsi_write_complete(void * opaque, int ret)
trace_scsi_generic_write_complete(ret);
+ aio_context_acquire(blk_get_aio_context(s->conf.blk));
+
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
- aio_context_acquire(blk_get_aio_context(s->conf.blk));
-
if (ret || r->req.io_canceled) {
scsi_command_complete_noio(r, ret);
goto done;
--
2.37.3

View File

@ -0,0 +1,72 @@
From bf3577c044e51094ca2166e748c8bae360c3f0c2 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Wed, 24 May 2023 07:26:04 -0400
Subject: [PATCH 14/15] target/i386: add support for FB_CLEAR feature
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 281: target/i386: add support for FLUSH_L1D feature
RH-Bugzilla: 2216203
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Jon Maloy <jmaloy@redhat.com>
RH-Commit: [2/2] 8cd4b7366a9898e406ca20c9a28f14ddce855b1e
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2216203
commit 22e1094ca82d5518c1b69aff3e87c550776ae1eb
Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Wed Feb 1 08:57:59 2023 -0500
target/i386: add support for FB_CLEAR feature
As reported by the Intel's doc:
"FB_CLEAR: The processor will overwrite fill buffer values as part of
MD_CLEAR operations with the VERW instruction.
On these processors, L1D_FLUSH does not overwrite fill buffer values."
If this cpu feature is present in host, allow QEMU to choose whether to
show it to the guest too.
One disadvantage of not exposing it is that the guest will report
a non existing vulnerability in
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
because the mitigation is present only when the cpu has
(FLUSH_L1D and MD_CLEAR) or FB_CLEAR
features enabled.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20230201135759.555607-3-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
target/i386/cpu.c | 2 +-
target/i386/cpu.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 47da059df6..9d3dcdcc0d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -981,7 +981,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
"ssb-no", "mds-no", "pschange-mc-no", "tsx-ctrl",
"taa-no", NULL, NULL, NULL,
NULL, NULL, NULL, NULL,
- NULL, NULL, NULL, NULL,
+ NULL, "fb-clear", NULL, NULL,
NULL, NULL, NULL, NULL,
NULL, NULL, NULL, NULL,
NULL, NULL, NULL, NULL,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 7cb7cea8ab..9b7d664ee7 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -950,6 +950,7 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
#define MSR_ARCH_CAP_PSCHANGE_MC_NO (1U << 6)
#define MSR_ARCH_CAP_TSX_CTRL_MSR (1U << 7)
#define MSR_ARCH_CAP_TAA_NO (1U << 8)
+#define MSR_ARCH_CAP_FB_CLEAR (1U << 17)
#define MSR_CORE_CAP_SPLIT_LOCK_DETECT (1U << 5)
--
2.37.3

View File

@ -0,0 +1,71 @@
From 9cfedd3a9880390ddda25a235b999430c3dd5e83 Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Wed, 24 May 2023 07:25:57 -0400
Subject: [PATCH 13/15] target/i386: add support for FLUSH_L1D feature
RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 281: target/i386: add support for FLUSH_L1D feature
RH-Bugzilla: 2216203
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Jon Maloy <jmaloy@redhat.com>
RH-Commit: [1/2] 50c54ca7c734dc2b9303e724a6c5ac1127472271
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2216203
commit 0e7e3bf1a552c178924867fa7c2f30ccc8a179e0
Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Wed Feb 1 08:57:58 2023 -0500
target/i386: add support for FLUSH_L1D feature
As reported by Intel's doc:
"L1D_FLUSH: Writeback and invalidate the L1 data cache"
If this cpu feature is present in host, allow QEMU to choose whether to
show it to the guest too.
One disadvantage of not exposing it is that the guest will report
a non existing vulnerability in
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
because the mitigation is present only when the cpu has
(FLUSH_L1D and MD_CLEAR) or FB_CLEAR
features enabled.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20230201135759.555607-2-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
target/i386/cpu.c | 2 +-
target/i386/cpu.h | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 0543b846ff..47da059df6 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -857,7 +857,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
"tsx-ldtrk", NULL, NULL /* pconfig */, NULL,
NULL, NULL, "amx-bf16", "avx512-fp16",
"amx-tile", "amx-int8", "spec-ctrl", "stibp",
- NULL, "arch-capabilities", "core-capability", "ssbd",
+ "flush-l1d", "arch-capabilities", "core-capability", "ssbd",
},
.cpuid = {
.eax = 7,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 5d2ddd81b9..7cb7cea8ab 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -864,6 +864,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
#define CPUID_7_0_EDX_SPEC_CTRL (1U << 26)
/* Single Thread Indirect Branch Predictors */
#define CPUID_7_0_EDX_STIBP (1U << 27)
+/* Flush L1D cache */
+#define CPUID_7_0_EDX_FLUSH_L1D (1U << 28)
/* Arch Capabilities */
#define CPUID_7_0_EDX_ARCH_CAPABILITIES (1U << 29)
/* Core Capability */
--
2.37.3

View File

@ -0,0 +1,57 @@
From 522ce31b4998b714b03e781f49403b71531ebe5a Mon Sep 17 00:00:00 2001
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
Date: Mon, 23 May 2022 18:26:58 +0200
Subject: [PATCH 5/5] target/i386/kvm: Fix disabling MPX on "-cpu host" with
MPX-capable host
RH-Author: Ani Sinha <None>
RH-MergeRequest: 297: target/i386/kvm: Fix disabling MPX on "-cpu host" with MPX-capable host
RH-Bugzilla: 2223947
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Acked-by: Jon Maloy <jmaloy@redhat.com>
RH-Commit: [1/1] 90098294a873a53b366389606fd0402efcbd70ad
Since KVM commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled")
it is not possible to disable MPX on a "-cpu host" just by adding "-mpx"
there if the host CPU does indeed support MPX.
QEMU will fail to set MSR_IA32_VMX_TRUE_{EXIT,ENTRY}_CTLS MSRs in this case
and so trigger an assertion failure.
Instead, besides "-mpx" one has to explicitly add also
"-vmx-exit-clear-bndcfgs" and "-vmx-entry-load-bndcfgs" to QEMU command
line to make it work, which is a bit convoluted.
Make the MPX-related bits in FEAT_VMX_{EXIT,ENTRY}_CTLS dependent on MPX
being actually enabled so such workarounds are no longer necessary.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <51aa2125c76363204cc23c27165e778097c33f0b.1653323077.git.maciej.szmigiero@oracle.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 267b5e7e378afd260004cb37a66a6fcd641e3b53)
---
target/i386/cpu.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 265f0aadfc..726814ee2e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1326,6 +1326,14 @@ static FeatureDep feature_dependencies[] = {
.from = { FEAT_7_0_EBX, CPUID_7_0_EBX_INVPCID },
.to = { FEAT_VMX_SECONDARY_CTLS, VMX_SECONDARY_EXEC_ENABLE_INVPCID },
},
+ {
+ .from = { FEAT_7_0_EBX, CPUID_7_0_EBX_MPX },
+ .to = { FEAT_VMX_EXIT_CTLS, VMX_VM_EXIT_CLEAR_BNDCFGS },
+ },
+ {
+ .from = { FEAT_7_0_EBX, CPUID_7_0_EBX_MPX },
+ .to = { FEAT_VMX_ENTRY_CTLS, VMX_VM_ENTRY_LOAD_BNDCFGS },
+ },
{
.from = { FEAT_7_0_EBX, CPUID_7_0_EBX_RDSEED },
.to = { FEAT_VMX_SECONDARY_CTLS, VMX_SECONDARY_EXEC_RDSEED_EXITING },
--
2.37.3

View File

@ -0,0 +1,57 @@
From 4744afb2458701351c9a1435770566fbee055079 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 16/22] target/s390x: Fix CLFIT and CLGIT immediate size
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [15/21] 68c0b87490dfe5349797acd7494fd293c3f733ca
Bugzilla: https://bugzilla.redhat.com/2169308
commit d324c21ba0b84b3033baa097e44a7fbbec815fad
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date: Wed Aug 17 18:15:29 2022 +0200
target/s390x: Fix CLFIT and CLGIT immediate size
I2 is 16 bits, not 32.
Found by running valgrind's none/tests/s390x/traps.
Fixes: 1c2687518235 ("target-s390: Implement COMPARE AND TRAP")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20220817161529.597414-1-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/insn-data.def | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index 99f4f5e36e..96d4794162 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -287,8 +287,8 @@
D(0xb961, CLGRT, RRF_c, GIE, r1_o, r2_o, 0, 0, ct, 0, 1)
D(0xeb23, CLT, RSY_b, MIE, r1_32u, m2_32u, 0, 0, ct, 0, 1)
D(0xeb2b, CLGT, RSY_b, MIE, r1_o, m2_64, 0, 0, ct, 0, 1)
- D(0xec73, CLFIT, RIE_a, GIE, r1_32u, i2_32u, 0, 0, ct, 0, 1)
- D(0xec71, CLGIT, RIE_a, GIE, r1_o, i2_32u, 0, 0, ct, 0, 1)
+ D(0xec73, CLFIT, RIE_a, GIE, r1_32u, i2_16u, 0, 0, ct, 0, 1)
+ D(0xec71, CLGIT, RIE_a, GIE, r1_o, i2_16u, 0, 0, ct, 0, 1)
/* CONVERT TO DECIMAL */
C(0x4e00, CVD, RX_a, Z, r1_o, a2, 0, 0, cvd, 0)
--
2.37.3

View File

@ -0,0 +1,55 @@
From 303eabb99283996ed941a341af127cb8502a9da5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 03/22] target/s390x: Fix SLDA sign bit index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [2/21] 8600ece5b20bbe9dfa91e322cf29c5f79000d39c
Bugzilla: https://bugzilla.redhat.com/2169308
commit 521130f267240cb1ed8fd4635496493a153281db
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date: Wed Jan 12 17:50:12 2022 +0100
target/s390x: Fix SLDA sign bit index
SLDA operates on 64-bit values, so its sign bit index should be 63,
not 31.
Fixes: a79ba3398a0a ("target-s390: Convert SHIFT DOUBLE")
Reported-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220112165016.226996-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/insn-data.def | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index 3e5594210c..c92284df5d 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -800,7 +800,7 @@
C(0xebde, SRLK, RSY_a, DO, r3_32u, sh32, new, r1_32, srl, 0)
C(0xeb0c, SRLG, RSY_a, Z, r3_o, sh64, r1, 0, srl, 0)
/* SHIFT LEFT DOUBLE */
- D(0x8f00, SLDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sla, 0, 31)
+ D(0x8f00, SLDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sla, 0, 63)
/* SHIFT LEFT DOUBLE LOGICAL */
C(0x8d00, SLDL, RS_a, Z, r1_D32, sh64, new, r1_D32, sll, 0)
/* SHIFT RIGHT DOUBLE */
--
2.37.3

View File

@ -0,0 +1,62 @@
From 716e77e02fe25d40f09b8f2af1ff68238f7d7058 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 04/22] target/s390x: Fix SRDA CC calculation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [3/21] 95b2ba26003baa51f85f07e8860f875349c72b86
Bugzilla: https://bugzilla.redhat.com/2169308
commit 57556b28afde4b039bb12bfc274bd8df9022d946
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date: Wed Jan 12 17:50:13 2022 +0100
target/s390x: Fix SRDA CC calculation
SRDA uses r1_D32 for binding the first operand and s64 for setting CC.
cout_s64() relies on o->out being the shift result, however,
wout_r1_D32() clobbers it.
Fix by using a temporary.
Fixes: a79ba3398a0a ("target-s390: Convert SHIFT DOUBLE")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220112165016.226996-3-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/translate.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index dcc249a197..c5e59b68af 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -5420,9 +5420,11 @@ static void wout_r1_P32(DisasContext *s, DisasOps *o)
static void wout_r1_D32(DisasContext *s, DisasOps *o)
{
int r1 = get_field(s, r1);
+ TCGv_i64 t = tcg_temp_new_i64();
store_reg32_i64(r1 + 1, o->out);
- tcg_gen_shri_i64(o->out, o->out, 32);
- store_reg32_i64(r1, o->out);
+ tcg_gen_shri_i64(t, o->out, 32);
+ store_reg32_i64(r1, t);
+ tcg_temp_free_i64(t);
}
#define SPEC_wout_r1_D32 SPEC_r1_even
--
2.37.3

View File

@ -0,0 +1,57 @@
From 300a84c83fc6f112bed7e488f0e64eb6c07d47bf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 05/22] target/s390x: Fix cc_calc_sla_64() missing overflows
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [4/21] 2f91de2ac980d6ffa4da0ec41bb30562624a2396
Bugzilla: https://bugzilla.redhat.com/2169308
commit df103c09bc2f549d36ba6313a69c18fc003ef1ee
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date: Wed Jan 12 17:50:14 2022 +0100
target/s390x: Fix cc_calc_sla_64() missing overflows
An overflow occurs for SLAG when at least one shifted bit is not equal
to sign bit. Therefore, we need to check that `shift + 1` bits are
neither all 0s nor all 1s. The current code checks only `shift` bits,
missing some overflows.
Fixes: cbe24bfa91d2 ("target-s390: Convert SHIFT, ROTATE SINGLE")
Co-developed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220112165016.226996-4-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/cc_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/s390x/tcg/cc_helper.c b/target/s390x/tcg/cc_helper.c
index c2c96c3a3c..c9b7b0e8c6 100644
--- a/target/s390x/tcg/cc_helper.c
+++ b/target/s390x/tcg/cc_helper.c
@@ -297,7 +297,7 @@ static uint32_t cc_calc_sla_32(uint32_t src, int shift)
static uint32_t cc_calc_sla_64(uint64_t src, int shift)
{
- uint64_t mask = ((1ULL << shift) - 1ULL) << (64 - shift);
+ uint64_t mask = -1ULL << (63 - shift);
uint64_t sign = 1ULL << 63;
uint64_t match;
int64_t r;
--
2.37.3

View File

@ -0,0 +1,101 @@
From a280a700fb016178776cb599d8cf918185df8697 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 11/22] target/s390x: Fix determination of overflow condition
code after subtraction
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [10/21] 14792faddfca784503f89c292ebaba5be8d3fc96
Bugzilla: https://bugzilla.redhat.com/2169308
commit fc6e0d0f2db5126592bb4066d484fcdfc14ccf36
Author: Bruno Haible <bruno@clisp.org>
Date: Wed Mar 23 17:26:21 2022 +0100
target/s390x: Fix determination of overflow condition code after subtraction
Reported by Paul Eggert in
https://lists.gnu.org/archive/html/bug-gnulib/2021-09/msg00050.html
This program currently prints different results when run with TCG instead
of running on real s390x hardware:
#include <stdio.h>
int overflow_32 (int x, int y)
{
int sum;
return __builtin_sub_overflow (x, y, &sum);
}
int overflow_64 (long long x, long long y)
{
long sum;
return __builtin_sub_overflow (x, y, &sum);
}
int a1 = 0;
int b1 = -2147483648;
long long a2 = 0L;
long long b2 = -9223372036854775808L;
int main ()
{
{
int a = a1;
int b = b1;
printf ("a = 0x%x, b = 0x%x\n", a, b);
printf ("no_overflow = %d\n", ! overflow_32 (a, b));
}
{
long long a = a2;
long long b = b2;
printf ("a = 0x%llx, b = 0x%llx\n", a, b);
printf ("no_overflow = %d\n", ! overflow_64 (a, b));
}
}
Signed-off-by: Bruno Haible <bruno@clisp.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/618
Message-Id: <20220323162621.139313-3-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/cc_helper.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/s390x/tcg/cc_helper.c b/target/s390x/tcg/cc_helper.c
index e11cdb745d..b2e8d3d9f5 100644
--- a/target/s390x/tcg/cc_helper.c
+++ b/target/s390x/tcg/cc_helper.c
@@ -151,7 +151,7 @@ static uint32_t cc_calc_add_64(int64_t a1, int64_t a2, int64_t ar)
static uint32_t cc_calc_sub_64(int64_t a1, int64_t a2, int64_t ar)
{
- if ((a1 > 0 && a2 < 0 && ar < 0) || (a1 < 0 && a2 > 0 && ar > 0)) {
+ if ((a1 >= 0 && a2 < 0 && ar < 0) || (a1 < 0 && a2 > 0 && ar > 0)) {
return 3; /* overflow */
} else {
if (ar < 0) {
@@ -211,7 +211,7 @@ static uint32_t cc_calc_add_32(int32_t a1, int32_t a2, int32_t ar)
static uint32_t cc_calc_sub_32(int32_t a1, int32_t a2, int32_t ar)
{
- if ((a1 > 0 && a2 < 0 && ar < 0) || (a1 < 0 && a2 > 0 && ar > 0)) {
+ if ((a1 >= 0 && a2 < 0 && ar < 0) || (a1 < 0 && a2 > 0 && ar > 0)) {
return 3; /* overflow */
} else {
if (ar < 0) {
--
2.37.3

View File

@ -0,0 +1,98 @@
From 2ddea7186ae50c1f29d790027e8aa98894e51694 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 10/22] target/s390x: Fix determination of overflow condition
code after addition
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [9/21] e8b946ff4e521e0367cb03fcd918a2f8af8bd4d5
Bugzilla: https://bugzilla.redhat.com/2169308
commit 5a2e67a691501bc4dd81c46c81b8f1881c8bd5df
Author: Bruno Haible <bruno@clisp.org>
Date: Wed Mar 23 17:26:20 2022 +0100
target/s390x: Fix determination of overflow condition code after addition
This program currently prints different results when run with TCG instead
of running on real s390x hardware:
#include <stdio.h>
int overflow_32 (int x, int y)
{
int sum;
return ! __builtin_add_overflow (x, y, &sum);
}
int overflow_64 (long long x, long long y)
{
long sum;
return ! __builtin_add_overflow (x, y, &sum);
}
int a1 = -2147483648;
int b1 = -2147483648;
long long a2 = -9223372036854775808L;
long long b2 = -9223372036854775808L;
int main ()
{
{
int a = a1;
int b = b1;
printf ("a = 0x%x, b = 0x%x\n", a, b);
printf ("no_overflow = %d\n", overflow_32 (a, b));
}
{
long long a = a2;
long long b = b2;
printf ("a = 0x%llx, b = 0x%llx\n", a, b);
printf ("no_overflow = %d\n", overflow_64 (a, b));
}
}
Signed-off-by: Bruno Haible <bruno@clisp.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/616
Message-Id: <20220323162621.139313-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/cc_helper.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/s390x/tcg/cc_helper.c b/target/s390x/tcg/cc_helper.c
index 8d04097f78..e11cdb745d 100644
--- a/target/s390x/tcg/cc_helper.c
+++ b/target/s390x/tcg/cc_helper.c
@@ -136,7 +136,7 @@ static uint32_t cc_calc_subu(uint64_t borrow_out, uint64_t result)
static uint32_t cc_calc_add_64(int64_t a1, int64_t a2, int64_t ar)
{
- if ((a1 > 0 && a2 > 0 && ar < 0) || (a1 < 0 && a2 < 0 && ar > 0)) {
+ if ((a1 > 0 && a2 > 0 && ar < 0) || (a1 < 0 && a2 < 0 && ar >= 0)) {
return 3; /* overflow */
} else {
if (ar < 0) {
@@ -196,7 +196,7 @@ static uint32_t cc_calc_comp_64(int64_t dst)
static uint32_t cc_calc_add_32(int32_t a1, int32_t a2, int32_t ar)
{
- if ((a1 > 0 && a2 > 0 && ar < 0) || (a1 < 0 && a2 < 0 && ar > 0)) {
+ if ((a1 > 0 && a2 > 0 && ar < 0) || (a1 < 0 && a2 < 0 && ar >= 0)) {
return 3; /* overflow */
} else {
if (ar < 0) {
--
2.37.3

View File

@ -0,0 +1,55 @@
From 7da1a3d21df30a3e20e0632e90e3ecff8b774b99 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 18/22] target/s390x: Fix emulation of the VISTR instruction
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [17/21] ca521ee65c0bd2b191d6fdddbfe38daf39bd7b07
Bugzilla: https://bugzilla.redhat.com/2169308
commit f7d81a351d6122440f9190adba69da3f81b7b186
Author: Thomas Huth <thuth@redhat.com>
Date: Wed Oct 12 20:27:54 2022 +0200
target/s390x: Fix emulation of the VISTR instruction
The element size is encoded in the M3 field, not in the M4
field.
Fixes: be6324c6b734 ("s390x/tcg: Implement VECTOR ISOLATE STRING")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1248
Message-Id: <20221012182755.1014853-3-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/translate_vx.c.inc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index 28bf5a23b6..6a125694ed 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -2413,7 +2413,7 @@ static DisasJumpType op_vfene(DisasContext *s, DisasOps *o)
static DisasJumpType op_vistr(DisasContext *s, DisasOps *o)
{
- const uint8_t es = get_field(s, m4);
+ const uint8_t es = get_field(s, m3);
const uint8_t m5 = get_field(s, m5);
static gen_helper_gvec_2 * const g[3] = {
gen_helper_gvec_vistr8,
--
2.37.3

View File

@ -0,0 +1,278 @@
From 9157bc045137b63b4304ffabc549b32e6f30d9b4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 06/22] target/s390x: Fix shifting 32-bit values for more than
31 bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [5/21] fba372359f0771ec41f3ad7ee4f1376e545da088
Bugzilla: https://bugzilla.redhat.com/2169308
commit 6da170beda33f3e7f1d9242814acd9f428f0f0fb
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date: Wed Jan 12 17:50:15 2022 +0100
target/s390x: Fix shifting 32-bit values for more than 31 bits
According to PoP, both 32- and 64-bit shifts use lowest 6 address
bits. The current code special-cases 32-bit shifts to use only 5 bits,
which is not correct. For example, shifting by 32 bits currently
preserves the initial value, however, it's supposed zero it out
instead.
Fix by merging sh32 and sh64 and adapting CC calculation to shift
values greater than 31.
Fixes: cbe24bfa91d2 ("target-s390: Convert SHIFT, ROTATE SINGLE")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220112165016.226996-5-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/cpu-dump.c | 3 +--
target/s390x/s390x-internal.h | 3 +--
target/s390x/tcg/cc_helper.c | 36 +++-----------------------
target/s390x/tcg/insn-data.def | 36 +++++++++++++-------------
target/s390x/tcg/translate.c | 47 ++++++++++++++++------------------
5 files changed, 45 insertions(+), 80 deletions(-)
diff --git a/target/s390x/cpu-dump.c b/target/s390x/cpu-dump.c
index 0f5c062994..ffa9e94d84 100644
--- a/target/s390x/cpu-dump.c
+++ b/target/s390x/cpu-dump.c
@@ -121,8 +121,7 @@ const char *cc_name(enum cc_op cc_op)
[CC_OP_NZ_F64] = "CC_OP_NZ_F64",
[CC_OP_NZ_F128] = "CC_OP_NZ_F128",
[CC_OP_ICM] = "CC_OP_ICM",
- [CC_OP_SLA_32] = "CC_OP_SLA_32",
- [CC_OP_SLA_64] = "CC_OP_SLA_64",
+ [CC_OP_SLA] = "CC_OP_SLA",
[CC_OP_FLOGR] = "CC_OP_FLOGR",
[CC_OP_LCBB] = "CC_OP_LCBB",
[CC_OP_VC] = "CC_OP_VC",
diff --git a/target/s390x/s390x-internal.h b/target/s390x/s390x-internal.h
index 02cf6c3f43..c9acb450ba 100644
--- a/target/s390x/s390x-internal.h
+++ b/target/s390x/s390x-internal.h
@@ -193,8 +193,7 @@ enum cc_op {
CC_OP_NZ_F128, /* FP dst != 0 (128bit) */
CC_OP_ICM, /* insert characters under mask */
- CC_OP_SLA_32, /* Calculate shift left signed (32bit) */
- CC_OP_SLA_64, /* Calculate shift left signed (64bit) */
+ CC_OP_SLA, /* Calculate shift left signed */
CC_OP_FLOGR, /* find leftmost one */
CC_OP_LCBB, /* load count to block boundary */
CC_OP_VC, /* vector compare result */
diff --git a/target/s390x/tcg/cc_helper.c b/target/s390x/tcg/cc_helper.c
index c9b7b0e8c6..8d04097f78 100644
--- a/target/s390x/tcg/cc_helper.c
+++ b/target/s390x/tcg/cc_helper.c
@@ -268,34 +268,7 @@ static uint32_t cc_calc_icm(uint64_t mask, uint64_t val)
}
}
-static uint32_t cc_calc_sla_32(uint32_t src, int shift)
-{
- uint32_t mask = ((1U << shift) - 1U) << (32 - shift);
- uint32_t sign = 1U << 31;
- uint32_t match;
- int32_t r;
-
- /* Check if the sign bit stays the same. */
- if (src & sign) {
- match = mask;
- } else {
- match = 0;
- }
- if ((src & mask) != match) {
- /* Overflow. */
- return 3;
- }
-
- r = ((src << shift) & ~sign) | (src & sign);
- if (r == 0) {
- return 0;
- } else if (r < 0) {
- return 1;
- }
- return 2;
-}
-
-static uint32_t cc_calc_sla_64(uint64_t src, int shift)
+static uint32_t cc_calc_sla(uint64_t src, int shift)
{
uint64_t mask = -1ULL << (63 - shift);
uint64_t sign = 1ULL << 63;
@@ -459,11 +432,8 @@ static uint32_t do_calc_cc(CPUS390XState *env, uint32_t cc_op,
case CC_OP_ICM:
r = cc_calc_icm(src, dst);
break;
- case CC_OP_SLA_32:
- r = cc_calc_sla_32(src, dst);
- break;
- case CC_OP_SLA_64:
- r = cc_calc_sla_64(src, dst);
+ case CC_OP_SLA:
+ r = cc_calc_sla(src, dst);
break;
case CC_OP_FLOGR:
r = cc_calc_flogr(dst);
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index c92284df5d..99f4f5e36e 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -747,8 +747,8 @@
C(0xb9e1, POPCNT, RRE, PC, 0, r2_o, r1, 0, popcnt, nz64)
/* ROTATE LEFT SINGLE LOGICAL */
- C(0xeb1d, RLL, RSY_a, Z, r3_o, sh32, new, r1_32, rll32, 0)
- C(0xeb1c, RLLG, RSY_a, Z, r3_o, sh64, r1, 0, rll64, 0)
+ C(0xeb1d, RLL, RSY_a, Z, r3_o, sh, new, r1_32, rll32, 0)
+ C(0xeb1c, RLLG, RSY_a, Z, r3_o, sh, r1, 0, rll64, 0)
/* ROTATE THEN INSERT SELECTED BITS */
C(0xec55, RISBG, RIE_f, GIE, 0, r2, r1, 0, risbg, s64)
@@ -784,29 +784,29 @@
C(0x0400, SPM, RR_a, Z, r1, 0, 0, 0, spm, 0)
/* SHIFT LEFT SINGLE */
- D(0x8b00, SLA, RS_a, Z, r1, sh32, new, r1_32, sla, 0, 31)
- D(0xebdd, SLAK, RSY_a, DO, r3, sh32, new, r1_32, sla, 0, 31)
- D(0xeb0b, SLAG, RSY_a, Z, r3, sh64, r1, 0, sla, 0, 63)
+ D(0x8b00, SLA, RS_a, Z, r1, sh, new, r1_32, sla, 0, 31)
+ D(0xebdd, SLAK, RSY_a, DO, r3, sh, new, r1_32, sla, 0, 31)
+ D(0xeb0b, SLAG, RSY_a, Z, r3, sh, r1, 0, sla, 0, 63)
/* SHIFT LEFT SINGLE LOGICAL */
- C(0x8900, SLL, RS_a, Z, r1_o, sh32, new, r1_32, sll, 0)
- C(0xebdf, SLLK, RSY_a, DO, r3_o, sh32, new, r1_32, sll, 0)
- C(0xeb0d, SLLG, RSY_a, Z, r3_o, sh64, r1, 0, sll, 0)
+ C(0x8900, SLL, RS_a, Z, r1_o, sh, new, r1_32, sll, 0)
+ C(0xebdf, SLLK, RSY_a, DO, r3_o, sh, new, r1_32, sll, 0)
+ C(0xeb0d, SLLG, RSY_a, Z, r3_o, sh, r1, 0, sll, 0)
/* SHIFT RIGHT SINGLE */
- C(0x8a00, SRA, RS_a, Z, r1_32s, sh32, new, r1_32, sra, s32)
- C(0xebdc, SRAK, RSY_a, DO, r3_32s, sh32, new, r1_32, sra, s32)
- C(0xeb0a, SRAG, RSY_a, Z, r3_o, sh64, r1, 0, sra, s64)
+ C(0x8a00, SRA, RS_a, Z, r1_32s, sh, new, r1_32, sra, s32)
+ C(0xebdc, SRAK, RSY_a, DO, r3_32s, sh, new, r1_32, sra, s32)
+ C(0xeb0a, SRAG, RSY_a, Z, r3_o, sh, r1, 0, sra, s64)
/* SHIFT RIGHT SINGLE LOGICAL */
- C(0x8800, SRL, RS_a, Z, r1_32u, sh32, new, r1_32, srl, 0)
- C(0xebde, SRLK, RSY_a, DO, r3_32u, sh32, new, r1_32, srl, 0)
- C(0xeb0c, SRLG, RSY_a, Z, r3_o, sh64, r1, 0, srl, 0)
+ C(0x8800, SRL, RS_a, Z, r1_32u, sh, new, r1_32, srl, 0)
+ C(0xebde, SRLK, RSY_a, DO, r3_32u, sh, new, r1_32, srl, 0)
+ C(0xeb0c, SRLG, RSY_a, Z, r3_o, sh, r1, 0, srl, 0)
/* SHIFT LEFT DOUBLE */
- D(0x8f00, SLDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sla, 0, 63)
+ D(0x8f00, SLDA, RS_a, Z, r1_D32, sh, new, r1_D32, sla, 0, 63)
/* SHIFT LEFT DOUBLE LOGICAL */
- C(0x8d00, SLDL, RS_a, Z, r1_D32, sh64, new, r1_D32, sll, 0)
+ C(0x8d00, SLDL, RS_a, Z, r1_D32, sh, new, r1_D32, sll, 0)
/* SHIFT RIGHT DOUBLE */
- C(0x8e00, SRDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sra, s64)
+ C(0x8e00, SRDA, RS_a, Z, r1_D32, sh, new, r1_D32, sra, s64)
/* SHIFT RIGHT DOUBLE LOGICAL */
- C(0x8c00, SRDL, RS_a, Z, r1_D32, sh64, new, r1_D32, srl, 0)
+ C(0x8c00, SRDL, RS_a, Z, r1_D32, sh, new, r1_D32, srl, 0)
/* SQUARE ROOT */
F(0xb314, SQEBR, RRE, Z, 0, e2, new, e1, sqeb, 0, IF_BFP)
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index c5e59b68af..b14e6a04a7 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -636,8 +636,7 @@ static void gen_op_calc_cc(DisasContext *s)
case CC_OP_LTUGTU_64:
case CC_OP_TM_32:
case CC_OP_TM_64:
- case CC_OP_SLA_32:
- case CC_OP_SLA_64:
+ case CC_OP_SLA:
case CC_OP_SUBU:
case CC_OP_NZ_F128:
case CC_OP_VC:
@@ -1178,19 +1177,6 @@ struct DisasInsn {
/* ====================================================================== */
/* Miscellaneous helpers, used by several operations. */
-static void help_l2_shift(DisasContext *s, DisasOps *o, int mask)
-{
- int b2 = get_field(s, b2);
- int d2 = get_field(s, d2);
-
- if (b2 == 0) {
- o->in2 = tcg_const_i64(d2 & mask);
- } else {
- o->in2 = get_address(s, 0, b2, d2);
- tcg_gen_andi_i64(o->in2, o->in2, mask);
- }
-}
-
static DisasJumpType help_goto_direct(DisasContext *s, uint64_t dest)
{
if (dest == s->pc_tmp) {
@@ -4113,9 +4099,18 @@ static DisasJumpType op_soc(DisasContext *s, DisasOps *o)
static DisasJumpType op_sla(DisasContext *s, DisasOps *o)
{
+ TCGv_i64 t;
uint64_t sign = 1ull << s->insn->data;
- enum cc_op cco = s->insn->data == 31 ? CC_OP_SLA_32 : CC_OP_SLA_64;
- gen_op_update2_cc_i64(s, cco, o->in1, o->in2);
+ if (s->insn->data == 31) {
+ t = tcg_temp_new_i64();
+ tcg_gen_shli_i64(t, o->in1, 32);
+ } else {
+ t = o->in1;
+ }
+ gen_op_update2_cc_i64(s, CC_OP_SLA, t, o->in2);
+ if (s->insn->data == 31) {
+ tcg_temp_free_i64(t);
+ }
tcg_gen_shl_i64(o->out, o->in1, o->in2);
/* The arithmetic left shift is curious in that it does not affect
the sign bit. Copy that over from the source unchanged. */
@@ -5924,17 +5919,19 @@ static void in2_ri2(DisasContext *s, DisasOps *o)
}
#define SPEC_in2_ri2 0
-static void in2_sh32(DisasContext *s, DisasOps *o)
+static void in2_sh(DisasContext *s, DisasOps *o)
{
- help_l2_shift(s, o, 31);
-}
-#define SPEC_in2_sh32 0
+ int b2 = get_field(s, b2);
+ int d2 = get_field(s, d2);
-static void in2_sh64(DisasContext *s, DisasOps *o)
-{
- help_l2_shift(s, o, 63);
+ if (b2 == 0) {
+ o->in2 = tcg_const_i64(d2 & 0x3f);
+ } else {
+ o->in2 = get_address(s, 0, b2, d2);
+ tcg_gen_andi_i64(o->in2, o->in2, 0x3f);
+ }
}
-#define SPEC_in2_sh64 0
+#define SPEC_in2_sh 0
static void in2_m2_8u(DisasContext *s, DisasOps *o)
{
--
2.37.3

View File

@ -0,0 +1,54 @@
From 2bfd1db9c3efcf7b73790565b4f8597bc04762c2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 13/22] target/s390x: Fix the accumulation of ccm in op_icm
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [12/21] ad52141b1d733a34d392b72d9962ea7ac521dc17
Bugzilla: https://bugzilla.redhat.com/2169308
commit 21641ee5a9b31568c990c7fc949eeb9bcd0f6a0f
Author: Richard Henderson <richard.henderson@linaro.org>
Date: Fri Apr 1 13:36:59 2022 -0600
target/s390x: Fix the accumulation of ccm in op_icm
Coverity rightly reports that 0xff << pos can overflow.
This would affect the ICMH instruction.
Fixes: Coverity CID 1487161
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20220401193659.332079-1-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/translate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 7ff7f90e23..75f0418c10 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -2592,7 +2592,7 @@ static DisasJumpType op_icm(DisasContext *s, DisasOps *o)
tcg_gen_qemu_ld8u(tmp, o->in2, get_mem_index(s));
tcg_gen_addi_i64(o->in2, o->in2, 1);
tcg_gen_deposit_i64(o->out, o->out, tmp, pos, 8);
- ccm |= 0xff << pos;
+ ccm |= 0xffull << pos;
}
m3 = (m3 << 1) & 0xf;
pos -= 8;
--
2.37.3

View File

@ -0,0 +1,60 @@
From 95d7d0e24fa51913b41cca7c35cb75460b850ecb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 14/22] target/s390x: Fix writeback to v1 in helper_vstl
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [13/21] 9db50d12afc0a85921e6bfdb69f12ba29f3dce72
Bugzilla: https://bugzilla.redhat.com/2169308
commit db67a6ff480b346b1415b983f9582028cf8e18f0
Author: Richard Henderson <richard.henderson@linaro.org>
Date: Thu Apr 28 11:46:56 2022 +0200
target/s390x: Fix writeback to v1 in helper_vstl
Fixes: 0e0a5b49ad58 ("s390x/tcg: Implement VECTOR STORE WITH LENGTH")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Miller <dmiller423@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220428094708.84835-2-david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/vec_helper.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/target/s390x/tcg/vec_helper.c b/target/s390x/tcg/vec_helper.c
index ededf13cf0..48d86722b2 100644
--- a/target/s390x/tcg/vec_helper.c
+++ b/target/s390x/tcg/vec_helper.c
@@ -200,7 +200,6 @@ void HELPER(vstl)(CPUS390XState *env, const void *v1, uint64_t addr,
addr = wrap_address(env, addr + 8);
cpu_stq_data_ra(env, addr, s390_vec_read_element64(v1, 1), GETPC());
} else {
- S390Vector tmp = {};
int i;
for (i = 0; i < bytes; i++) {
@@ -209,6 +208,5 @@ void HELPER(vstl)(CPUS390XState *env, const void *v1, uint64_t addr,
cpu_stb_data_ra(env, addr, byte, GETPC());
addr = wrap_address(env, addr + 1);
}
- *(S390Vector *)v1 = tmp;
}
}
--
2.37.3

View File

@ -0,0 +1,67 @@
From 1acfca06f0dbbc586f0d86833196a4463dc8b8c2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 15/22] target/s390x: fix handling of zeroes in vfmin/vfmax
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [14/21] 27f66691e08192a5c9f2ecbde3603c0adece4857
Bugzilla: https://bugzilla.redhat.com/2169308
commit 13c59eb09bd6d1fbc13f08b708226421f14a232b
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date: Wed Jul 13 20:26:10 2022 +0200
target/s390x: fix handling of zeroes in vfmin/vfmax
vfmin_res() / vfmax_res() are trying to check whether a and b are both
zeroes, but in reality they check that they are the same kind of zero.
This causes incorrect results when comparing positive and negative
zeroes.
Fixes: da4807527f3b ("s390x/tcg: Implement VECTOR FP (MAXIMUM|MINIMUM)")
Co-developed-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220713182612.3780050-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/vec_fpu_helper.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/s390x/tcg/vec_fpu_helper.c b/target/s390x/tcg/vec_fpu_helper.c
index 1a77993471..d1249706f9 100644
--- a/target/s390x/tcg/vec_fpu_helper.c
+++ b/target/s390x/tcg/vec_fpu_helper.c
@@ -794,7 +794,7 @@ static S390MinMaxRes vfmin_res(uint16_t dcmask_a, uint16_t dcmask_b,
default:
g_assert_not_reached();
}
- } else if (unlikely(dcmask_a & dcmask_b & DCMASK_ZERO)) {
+ } else if (unlikely((dcmask_a & DCMASK_ZERO) && (dcmask_b & DCMASK_ZERO))) {
switch (type) {
case S390_MINMAX_TYPE_JAVA:
return neg_a ? S390_MINMAX_RES_A : S390_MINMAX_RES_B;
@@ -844,7 +844,7 @@ static S390MinMaxRes vfmax_res(uint16_t dcmask_a, uint16_t dcmask_b,
default:
g_assert_not_reached();
}
- } else if (unlikely(dcmask_a & dcmask_b & DCMASK_ZERO)) {
+ } else if (unlikely((dcmask_a & DCMASK_ZERO) && (dcmask_b & DCMASK_ZERO))) {
const bool neg_a = dcmask_a & DCMASK_NEGATIVE;
switch (type) {
--
2.37.3

View File

@ -0,0 +1,90 @@
From b83e60b3a2488e988986f2c7e63cb7eb40d7cf27 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 20/22] target/s390x/tcg: Fix and improve the SACF instruction
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [19/21] 62030baceb0b0d1d651ba9026bb419ed4b2a8149
Bugzilla: https://bugzilla.redhat.com/2169308
commit 21be74a9a59d1e4954ebb59dcbee0fda0b19de00
Author: Thomas Huth <thuth@redhat.com>
Date: Thu Dec 1 19:44:43 2022 +0100
target/s390x/tcg: Fix and improve the SACF instruction
The SET ADDRESS SPACE CONTROL FAST instruction is not privileged, it can be
used from problem space, too. Just the switching to the home address space
is privileged and should still generate a privilege exception. This bug is
e.g. causing programs like Java that use the "getcpu" vdso kernel function
to crash (see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=990417#26 ).
While we're at it, also check if DAT is not enabled. In that case the
instruction is supposed to generate a special operation exception.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/655
Message-Id: <20221201184443.136355-1-thuth@redhat.com>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Conflicts:
file rename target/s390x/tcg/insn-data.h.in -> insn-data.def
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/cc_helper.c | 7 +++++++
target/s390x/tcg/insn-data.def | 2 +-
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/target/s390x/tcg/cc_helper.c b/target/s390x/tcg/cc_helper.c
index b2e8d3d9f5..b36f8cdc8b 100644
--- a/target/s390x/tcg/cc_helper.c
+++ b/target/s390x/tcg/cc_helper.c
@@ -487,6 +487,10 @@ void HELPER(sacf)(CPUS390XState *env, uint64_t a1)
{
HELPER_LOG("%s: %16" PRIx64 "\n", __func__, a1);
+ if (!(env->psw.mask & PSW_MASK_DAT)) {
+ tcg_s390_program_interrupt(env, PGM_SPECIAL_OP, GETPC());
+ }
+
switch (a1 & 0xf00) {
case 0x000:
env->psw.mask &= ~PSW_MASK_ASC;
@@ -497,6 +501,9 @@ void HELPER(sacf)(CPUS390XState *env, uint64_t a1)
env->psw.mask |= PSW_ASC_SECONDARY;
break;
case 0x300:
+ if ((env->psw.mask & PSW_MASK_PSTATE) != 0) {
+ tcg_s390_program_interrupt(env, PGM_PRIVILEGED, GETPC());
+ }
env->psw.mask &= ~PSW_MASK_ASC;
env->psw.mask |= PSW_ASC_HOME;
break;
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index d54673a3ba..548b0eedc2 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -1315,7 +1315,7 @@
/* SERVICE CALL LOGICAL PROCESSOR (PV hypercall) */
F(0xb220, SERVC, RRE, Z, r1_o, r2_o, 0, 0, servc, 0, IF_PRIV | IF_IO)
/* SET ADDRESS SPACE CONTROL FAST */
- F(0xb279, SACF, S, Z, 0, a2, 0, 0, sacf, 0, IF_PRIV)
+ C(0xb279, SACF, S, Z, 0, a2, 0, 0, sacf, 0)
/* SET CLOCK */
F(0xb204, SCK, S, Z, la2, 0, 0, 0, sck, 0, IF_PRIV | IF_IO)
/* SET CLOCK COMPARATOR */
--
2.37.3

View File

@ -0,0 +1,56 @@
From 30ae4c8951df25085e479e0e2e5b43d2175f996a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Tue, 23 May 2023 12:34:33 +0200
Subject: [PATCH 21/22] target/s390x/tcg/mem_helper: Test the right bits in
psw_key_valid()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 279: Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
RH-Bugzilla: 2169308 2209605
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [20/21] 00a243a96953023387bab6f1925b734755c53e6e
Bugzilla: https://bugzilla.redhat.com/2169308
commit 5e275ca6fb32bcb4b56b29e6acfd3cf306c4a180
Author: Thomas Huth <thuth@redhat.com>
Date: Mon Dec 5 15:20:43 2022 +0100
target/s390x/tcg/mem_helper: Test the right bits in psw_key_valid()
The PSW key mask is a 16 bit field, and the psw_key variable is
in the range from 0 to 15, so it does not make sense to use
"0x80 >> psw_key" for testing the bits here. We should use 0x8000
instead.
Message-Id: <20221205142043.95185-1-thuth@redhat.com>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
target/s390x/tcg/mem_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c
index 362a30d99e..bd27c75dfb 100644
--- a/target/s390x/tcg/mem_helper.c
+++ b/target/s390x/tcg/mem_helper.c
@@ -50,7 +50,7 @@ static inline bool psw_key_valid(CPUS390XState *env, uint8_t psw_key)
if (env->psw.mask & PSW_MASK_PSTATE) {
/* PSW key has range 0..15, it is valid if the bit is 1 in the PKM */
- return pkm & (0x80 >> psw_key);
+ return pkm & (0x8000 >> psw_key);
}
return true;
}
--
2.37.3

View File

@ -0,0 +1,98 @@
From 884e6dfecc8b0f155015f0a25888300d8e1707f8 Mon Sep 17 00:00:00 2001
From: Hanna Czenczek <hreitz@redhat.com>
Date: Tue, 11 Apr 2023 19:34:15 +0200
Subject: [PATCH 1/5] util/iov: Make qiov_slice() public
RH-Author: Hanna Czenczek <hreitz@redhat.com>
RH-MergeRequest: 291: block: Split padded I/O vectors exceeding IOV_MAX
RH-Bugzilla: 2141964
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Commit: [1/5] 7f082982e49bacbcc21ca24e471b4399e64321a9
We want to inline qemu_iovec_init_extended() in block/io.c for padding
requests, and having access to qiov_slice() is useful for this. As a
public function, it is renamed to qemu_iovec_slice().
(We will need to count the number of I/O vector elements of a slice
there, and then later process this slice. Without qiov_slice(), we
would need to call qemu_iovec_subvec_niov(), and all further
IOV-processing functions may need to skip prefixing elements to
accomodate for a qiov_offset. Because qemu_iovec_subvec_niov()
internally calls qiov_slice(), we can just have the block/io.c code call
qiov_slice() itself, thus get the number of elements, and also create an
iovec array with the superfluous prefixing elements stripped, so the
following processing functions no longer need to skip them.)
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230411173418.19549-2-hreitz@redhat.com>
(cherry picked from commit 3d06cea8256d54a6b0238934c31012f7f17100f5)
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
include/qemu/iov.h | 3 +++
util/iov.c | 14 +++++++-------
2 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/include/qemu/iov.h b/include/qemu/iov.h
index 9330746680..46fadfb27a 100644
--- a/include/qemu/iov.h
+++ b/include/qemu/iov.h
@@ -229,6 +229,9 @@ int qemu_iovec_init_extended(
void *tail_buf, size_t tail_len);
void qemu_iovec_init_slice(QEMUIOVector *qiov, QEMUIOVector *source,
size_t offset, size_t len);
+struct iovec *qemu_iovec_slice(QEMUIOVector *qiov,
+ size_t offset, size_t len,
+ size_t *head, size_t *tail, int *niov);
int qemu_iovec_subvec_niov(QEMUIOVector *qiov, size_t offset, size_t len);
void qemu_iovec_add(QEMUIOVector *qiov, void *base, size_t len);
void qemu_iovec_concat(QEMUIOVector *dst,
diff --git a/util/iov.c b/util/iov.c
index 58c7b3eeee..3ccb530b16 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -373,15 +373,15 @@ static struct iovec *iov_skip_offset(struct iovec *iov, size_t offset,
}
/*
- * qiov_slice
+ * qemu_iovec_slice
*
* Find subarray of iovec's, containing requested range. @head would
* be offset in first iov (returned by the function), @tail would be
* count of extra bytes in last iovec (returned iov + @niov - 1).
*/
-static struct iovec *qiov_slice(QEMUIOVector *qiov,
- size_t offset, size_t len,
- size_t *head, size_t *tail, int *niov)
+struct iovec *qemu_iovec_slice(QEMUIOVector *qiov,
+ size_t offset, size_t len,
+ size_t *head, size_t *tail, int *niov)
{
struct iovec *iov, *end_iov;
@@ -406,7 +406,7 @@ int qemu_iovec_subvec_niov(QEMUIOVector *qiov, size_t offset, size_t len)
size_t head, tail;
int niov;
- qiov_slice(qiov, offset, len, &head, &tail, &niov);
+ qemu_iovec_slice(qiov, offset, len, &head, &tail, &niov);
return niov;
}
@@ -434,8 +434,8 @@ int qemu_iovec_init_extended(
}
if (mid_len) {
- mid_iov = qiov_slice(mid_qiov, mid_offset, mid_len,
- &mid_head, &mid_tail, &mid_niov);
+ mid_iov = qemu_iovec_slice(mid_qiov, mid_offset, mid_len,
+ &mid_head, &mid_tail, &mid_niov);
}
total_niov = !!head_len + mid_niov + !!tail_len;
--
2.39.3

View File

@ -0,0 +1,157 @@
From cc31f7eb1c362dc308a163b7364c96ed098a793a Mon Sep 17 00:00:00 2001
From: Hanna Czenczek <hreitz@redhat.com>
Date: Tue, 11 Apr 2023 19:34:17 +0200
Subject: [PATCH 3/5] util/iov: Remove qemu_iovec_init_extended()
RH-Author: Hanna Czenczek <hreitz@redhat.com>
RH-MergeRequest: 291: block: Split padded I/O vectors exceeding IOV_MAX
RH-Bugzilla: 2141964
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Commit: [3/5] 19c8307ef1289f1991199d1d1f6ab6c89a4b59ce
bdrv_pad_request() was the main user of qemu_iovec_init_extended().
HEAD^ has removed that use, so we can remove qemu_iovec_init_extended()
now.
The only remaining user is qemu_iovec_init_slice(), which can easily
inline the small part it really needs.
Note that qemu_iovec_init_extended() offered a memcpy() optimization to
initialize the new I/O vector. qemu_iovec_concat_iov(), which is used
to replace its functionality, does not, but calls qemu_iovec_add() for
every single element. If we decide this optimization was important, we
will need to re-implement it in qemu_iovec_concat_iov(), which might
also benefit its pre-existing users.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230411173418.19549-4-hreitz@redhat.com>
(cherry picked from commit cc63f6f6fa1aaa4b6405dd69432c693e9c8d18ca)
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
include/qemu/iov.h | 5 ---
util/iov.c | 79 +++++++---------------------------------------
2 files changed, 11 insertions(+), 73 deletions(-)
diff --git a/include/qemu/iov.h b/include/qemu/iov.h
index 46fadfb27a..63a1c01965 100644
--- a/include/qemu/iov.h
+++ b/include/qemu/iov.h
@@ -222,11 +222,6 @@ static inline void *qemu_iovec_buf(QEMUIOVector *qiov)
void qemu_iovec_init(QEMUIOVector *qiov, int alloc_hint);
void qemu_iovec_init_external(QEMUIOVector *qiov, struct iovec *iov, int niov);
-int qemu_iovec_init_extended(
- QEMUIOVector *qiov,
- void *head_buf, size_t head_len,
- QEMUIOVector *mid_qiov, size_t mid_offset, size_t mid_len,
- void *tail_buf, size_t tail_len);
void qemu_iovec_init_slice(QEMUIOVector *qiov, QEMUIOVector *source,
size_t offset, size_t len);
struct iovec *qemu_iovec_slice(QEMUIOVector *qiov,
diff --git a/util/iov.c b/util/iov.c
index 3ccb530b16..af3ccc2546 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -411,70 +411,6 @@ int qemu_iovec_subvec_niov(QEMUIOVector *qiov, size_t offset, size_t len)
return niov;
}
-/*
- * Compile new iovec, combining @head_buf buffer, sub-qiov of @mid_qiov,
- * and @tail_buf buffer into new qiov.
- */
-int qemu_iovec_init_extended(
- QEMUIOVector *qiov,
- void *head_buf, size_t head_len,
- QEMUIOVector *mid_qiov, size_t mid_offset, size_t mid_len,
- void *tail_buf, size_t tail_len)
-{
- size_t mid_head, mid_tail;
- int total_niov, mid_niov = 0;
- struct iovec *p, *mid_iov = NULL;
-
- assert(mid_qiov->niov <= IOV_MAX);
-
- if (SIZE_MAX - head_len < mid_len ||
- SIZE_MAX - head_len - mid_len < tail_len)
- {
- return -EINVAL;
- }
-
- if (mid_len) {
- mid_iov = qemu_iovec_slice(mid_qiov, mid_offset, mid_len,
- &mid_head, &mid_tail, &mid_niov);
- }
-
- total_niov = !!head_len + mid_niov + !!tail_len;
- if (total_niov > IOV_MAX) {
- return -EINVAL;
- }
-
- if (total_niov == 1) {
- qemu_iovec_init_buf(qiov, NULL, 0);
- p = &qiov->local_iov;
- } else {
- qiov->niov = qiov->nalloc = total_niov;
- qiov->size = head_len + mid_len + tail_len;
- p = qiov->iov = g_new(struct iovec, qiov->niov);
- }
-
- if (head_len) {
- p->iov_base = head_buf;
- p->iov_len = head_len;
- p++;
- }
-
- assert(!mid_niov == !mid_len);
- if (mid_niov) {
- memcpy(p, mid_iov, mid_niov * sizeof(*p));
- p[0].iov_base = (uint8_t *)p[0].iov_base + mid_head;
- p[0].iov_len -= mid_head;
- p[mid_niov - 1].iov_len -= mid_tail;
- p += mid_niov;
- }
-
- if (tail_len) {
- p->iov_base = tail_buf;
- p->iov_len = tail_len;
- }
-
- return 0;
-}
-
/*
* Check if the contents of subrange of qiov data is all zeroes.
*/
@@ -506,14 +442,21 @@ bool qemu_iovec_is_zero(QEMUIOVector *qiov, size_t offset, size_t bytes)
void qemu_iovec_init_slice(QEMUIOVector *qiov, QEMUIOVector *source,
size_t offset, size_t len)
{
- int ret;
+ struct iovec *slice_iov;
+ int slice_niov;
+ size_t slice_head, slice_tail;
assert(source->size >= len);
assert(source->size - len >= offset);
- /* We shrink the request, so we can't overflow neither size_t nor MAX_IOV */
- ret = qemu_iovec_init_extended(qiov, NULL, 0, source, offset, len, NULL, 0);
- assert(ret == 0);
+ slice_iov = qemu_iovec_slice(source, offset, len,
+ &slice_head, &slice_tail, &slice_niov);
+ if (slice_niov == 1) {
+ qemu_iovec_init_buf(qiov, slice_iov[0].iov_base + slice_head, len);
+ } else {
+ qemu_iovec_init(qiov, slice_niov);
+ qemu_iovec_concat_iov(qiov, slice_iov, slice_niov, slice_head, len);
+ }
}
void qemu_iovec_destroy(QEMUIOVector *qiov)
--
2.39.3

View File

@ -0,0 +1,81 @@
From 7b17ef78eee2b30829666f12e87ff1eee3c195b5 Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 15 Aug 2023 19:00:44 -0400
Subject: [PATCH] vhost-vdpa: do not cleanup the vdpa/vhost-net structures if
peer nic is present
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 304: vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present
RH-Bugzilla: 2215786
RH-Acked-by: Ani Sinha <None>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/1] 16aa37efdf129f2619cedf9c030222b88eda9e26 (redhat/rhel/src/qemu-kvm/jons-qemu-kvm-2)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2215786
CVE: CVE-2023-3301
Upstream: Merged
Conflicts: commit babf8b87127a is not present in this release, so the commit does not
apply cleanly. The two adjacent munmap() calls introduced by that commit
don't seem to be needed for the logics of this change.
commit a0d7215e339b61c7d7a7b3fcf754954d80d93eb8
Author: Ani Sinha <anisinha@redhat.com>
Date: Mon Jun 19 12:22:09 2023 +0530
vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present
When a peer nic is still attached to the vdpa backend, it is too early to free
up the vhost-net and vdpa structures. If these structures are freed here, then
QEMU crashes when the guest is being shut down. The following call chain
would result in an assertion failure since the pointer returned from
vhost_vdpa_get_vhost_net() would be NULL:
do_vm_stop() -> vm_state_notify() -> virtio_set_status() ->
virtio_net_vhost_status() -> get_vhost_net().
Therefore, we defer freeing up the structures until at guest shutdown
time when qemu_cleanup() calls net_cleanup() which then calls
qemu_del_net_client() which would eventually call vhost_vdpa_cleanup()
again to free up the structures. This time, the loop in net_cleanup()
ensures that vhost_vdpa_cleanup() will be called one last time when
all the peer nics are detached and freed.
All unit tests pass with this change.
CC: imammedo@redhat.com
CC: jusual@redhat.com
CC: mst@redhat.com
Fixes: CVE-2023-3301
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230619065209.442185-1-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
net/vhost-vdpa.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 814f704687..ac48de9495 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -128,6 +128,14 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
{
VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+ /*
+ * If a peer NIC is attached, do not cleanup anything.
+ * Cleanup will happen as a part of qemu_cleanup() -> net_cleanup()
+ * when the guest is shutting down.
+ */
+ if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_NIC) {
+ return;
+ }
if (s->vhost_net) {
vhost_net_cleanup(s->vhost_net);
g_free(s->vhost_net);
--
2.39.3

View File

@ -0,0 +1,337 @@
From 31e9e3691789469b93a75d0221387bab3e526094 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Tue, 21 Feb 2023 16:22:18 -0500
Subject: [PATCH 13/13] virtio-scsi: reset SCSI devices from main loop thread
RH-Author: Stefan Hajnoczi <stefanha@redhat.com>
RH-MergeRequest: 264: scsi: protect req->aiocb with AioContext lock
RH-Bugzilla: 2090990
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [3/3] 30d7c2bd868efa6694992e75ace22fb48aef161b
When an IOThread is configured, the ctrl virtqueue is processed in the
IOThread. TMFs that reset SCSI devices are currently called directly
from the IOThread and trigger an assertion failure in blk_drain() from
the following call stack:
virtio_scsi_handle_ctrl_req -> virtio_scsi_do_tmf -> device_code_reset
-> scsi_disk_reset -> scsi_device_purge_requests -> blk_drain
../block/block-backend.c:1780: void blk_drain(BlockBackend *): Assertion `qemu_in_main_thread()' failed.
The blk_drain() function is not designed to be called from an IOThread
because it needs the Big QEMU Lock (BQL).
This patch defers TMFs that reset SCSI devices to a Bottom Half (BH)
that runs in the main loop thread under the BQL. This way it's safe to
call blk_drain() and the assertion failure is avoided.
Introduce s->tmf_bh_list for tracking TMF requests that have been
deferred to the BH. When the BH runs it will grab the entire list and
process all requests. Care must be taken to clear the list when the
virtio-scsi device is reset or unrealized. Otherwise deferred TMF
requests could execute later and lead to use-after-free or other
undefined behavior.
The s->resetting counter that's used by TMFs that reset SCSI devices is
accessed from multiple threads. This patch makes that explicit by using
atomic accessor functions. With this patch applied the counter is only
modified by the main loop thread under the BQL but can be read by any
thread.
Reported-by: Qing Wang <qinwang@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230221212218.1378734-4-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit be2c42b97c3a3a395b2f05bad1b6c7de20ecf2a5)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Conflicts:
- hw/scsi/virtio-scsi.c
- VirtIOSCSIReq is defined in include/hw/virtio/virtio-scsi.h
downstream instead of hw/scsi/virtio-scsi.c because commit
3dc584abeef0 ("virtio-scsi: move request-related items from .h to
.c") is missing. Update the struct fields in virtio-scsi.h
downstream.
- Use qbus_reset_all() downstream instead of bus_cold_reset() because
commit 4a5fc890b1d3 ("scsi: Use device_cold_reset() and
bus_cold_reset()") is missing.
- Drop GLOBAL_STATE_CODE() because these macros don't exist
downstream. They are assertions/documentation and can be removed
without affecting the code.
---
hw/scsi/virtio-scsi.c | 155 +++++++++++++++++++++++++-------
include/hw/virtio/virtio-scsi.h | 21 +++--
2 files changed, 139 insertions(+), 37 deletions(-)
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index a35257c35a..ef19a9bcd0 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -256,6 +256,118 @@ static inline void virtio_scsi_ctx_check(VirtIOSCSI *s, SCSIDevice *d)
}
}
+static void virtio_scsi_do_one_tmf_bh(VirtIOSCSIReq *req)
+{
+ VirtIOSCSI *s = req->dev;
+ SCSIDevice *d = virtio_scsi_device_get(s, req->req.tmf.lun);
+ BusChild *kid;
+ int target;
+
+ switch (req->req.tmf.subtype) {
+ case VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET:
+ if (!d) {
+ req->resp.tmf.response = VIRTIO_SCSI_S_BAD_TARGET;
+ goto out;
+ }
+ if (d->lun != virtio_scsi_get_lun(req->req.tmf.lun)) {
+ req->resp.tmf.response = VIRTIO_SCSI_S_INCORRECT_LUN;
+ goto out;
+ }
+ qatomic_inc(&s->resetting);
+ qdev_reset_all(&d->qdev);
+ qatomic_dec(&s->resetting);
+ break;
+
+ case VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET:
+ target = req->req.tmf.lun[1];
+ qatomic_inc(&s->resetting);
+
+ rcu_read_lock();
+ QTAILQ_FOREACH_RCU(kid, &s->bus.qbus.children, sibling) {
+ SCSIDevice *d1 = SCSI_DEVICE(kid->child);
+ if (d1->channel == 0 && d1->id == target) {
+ qdev_reset_all(&d1->qdev);
+ }
+ }
+ rcu_read_unlock();
+
+ qatomic_dec(&s->resetting);
+ break;
+
+ default:
+ g_assert_not_reached();
+ break;
+ }
+
+out:
+ object_unref(OBJECT(d));
+
+ virtio_scsi_acquire(s);
+ virtio_scsi_complete_req(req);
+ virtio_scsi_release(s);
+}
+
+/* Some TMFs must be processed from the main loop thread */
+static void virtio_scsi_do_tmf_bh(void *opaque)
+{
+ VirtIOSCSI *s = opaque;
+ QTAILQ_HEAD(, VirtIOSCSIReq) reqs = QTAILQ_HEAD_INITIALIZER(reqs);
+ VirtIOSCSIReq *req;
+ VirtIOSCSIReq *tmp;
+
+ virtio_scsi_acquire(s);
+
+ QTAILQ_FOREACH_SAFE(req, &s->tmf_bh_list, next, tmp) {
+ QTAILQ_REMOVE(&s->tmf_bh_list, req, next);
+ QTAILQ_INSERT_TAIL(&reqs, req, next);
+ }
+
+ qemu_bh_delete(s->tmf_bh);
+ s->tmf_bh = NULL;
+
+ virtio_scsi_release(s);
+
+ QTAILQ_FOREACH_SAFE(req, &reqs, next, tmp) {
+ QTAILQ_REMOVE(&reqs, req, next);
+ virtio_scsi_do_one_tmf_bh(req);
+ }
+}
+
+static void virtio_scsi_reset_tmf_bh(VirtIOSCSI *s)
+{
+ VirtIOSCSIReq *req;
+ VirtIOSCSIReq *tmp;
+
+ virtio_scsi_acquire(s);
+
+ if (s->tmf_bh) {
+ qemu_bh_delete(s->tmf_bh);
+ s->tmf_bh = NULL;
+ }
+
+ QTAILQ_FOREACH_SAFE(req, &s->tmf_bh_list, next, tmp) {
+ QTAILQ_REMOVE(&s->tmf_bh_list, req, next);
+
+ /* SAM-6 6.3.2 Hard reset */
+ req->resp.tmf.response = VIRTIO_SCSI_S_TARGET_FAILURE;
+ virtio_scsi_complete_req(req);
+ }
+
+ virtio_scsi_release(s);
+}
+
+static void virtio_scsi_defer_tmf_to_bh(VirtIOSCSIReq *req)
+{
+ VirtIOSCSI *s = req->dev;
+
+ QTAILQ_INSERT_TAIL(&s->tmf_bh_list, req, next);
+
+ if (!s->tmf_bh) {
+ s->tmf_bh = qemu_bh_new(virtio_scsi_do_tmf_bh, s);
+ qemu_bh_schedule(s->tmf_bh);
+ }
+}
+
/* Return 0 if the request is ready to be completed and return to guest;
* -EINPROGRESS if the request is submitted and will be completed later, in the
* case of async cancellation. */
@@ -263,8 +375,6 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req)
{
SCSIDevice *d = virtio_scsi_device_get(s, req->req.tmf.lun);
SCSIRequest *r, *next;
- BusChild *kid;
- int target;
int ret = 0;
virtio_scsi_ctx_check(s, d);
@@ -321,15 +431,9 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req)
break;
case VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET:
- if (!d) {
- goto fail;
- }
- if (d->lun != virtio_scsi_get_lun(req->req.tmf.lun)) {
- goto incorrect_lun;
- }
- s->resetting++;
- qdev_reset_all(&d->qdev);
- s->resetting--;
+ case VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET:
+ virtio_scsi_defer_tmf_to_bh(req);
+ ret = -EINPROGRESS;
break;
case VIRTIO_SCSI_T_TMF_ABORT_TASK_SET:
@@ -372,22 +476,6 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req)
}
break;
- case VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET:
- target = req->req.tmf.lun[1];
- s->resetting++;
-
- rcu_read_lock();
- QTAILQ_FOREACH_RCU(kid, &s->bus.qbus.children, sibling) {
- SCSIDevice *d1 = SCSI_DEVICE(kid->child);
- if (d1->channel == 0 && d1->id == target) {
- qdev_reset_all(&d1->qdev);
- }
- }
- rcu_read_unlock();
-
- s->resetting--;
- break;
-
case VIRTIO_SCSI_T_TMF_CLEAR_ACA:
default:
req->resp.tmf.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
@@ -603,7 +691,7 @@ static void virtio_scsi_request_cancelled(SCSIRequest *r)
if (!req) {
return;
}
- if (req->dev->resetting) {
+ if (qatomic_read(&req->dev->resetting)) {
req->resp.cmd.response = VIRTIO_SCSI_S_RESET;
} else {
req->resp.cmd.response = VIRTIO_SCSI_S_ABORTED;
@@ -784,9 +872,12 @@ static void virtio_scsi_reset(VirtIODevice *vdev)
VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(vdev);
assert(!s->dataplane_started);
- s->resetting++;
+
+ virtio_scsi_reset_tmf_bh(s);
+
+ qatomic_inc(&s->resetting);
qbus_reset_all(BUS(&s->bus));
- s->resetting--;
+ qatomic_dec(&s->resetting);
vs->sense_size = VIRTIO_SCSI_SENSE_DEFAULT_SIZE;
vs->cdb_size = VIRTIO_SCSI_CDB_DEFAULT_SIZE;
@@ -1018,6 +1109,8 @@ static void virtio_scsi_device_realize(DeviceState *dev, Error **errp)
VirtIOSCSI *s = VIRTIO_SCSI(dev);
Error *err = NULL;
+ QTAILQ_INIT(&s->tmf_bh_list);
+
virtio_scsi_common_realize(dev,
virtio_scsi_handle_ctrl,
virtio_scsi_handle_event,
@@ -1055,6 +1148,8 @@ static void virtio_scsi_device_unrealize(DeviceState *dev)
{
VirtIOSCSI *s = VIRTIO_SCSI(dev);
+ virtio_scsi_reset_tmf_bh(s);
+
qbus_set_hotplug_handler(BUS(&s->bus), NULL);
virtio_scsi_common_unrealize(dev);
}
diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index 543681bc18..b0e36f25aa 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -77,13 +77,22 @@ struct VirtIOSCSICommon {
VirtQueue **cmd_vqs;
};
+struct VirtIOSCSIReq;
+
struct VirtIOSCSI {
VirtIOSCSICommon parent_obj;
SCSIBus bus;
- int resetting;
+ int resetting; /* written from main loop thread, read from any thread */
bool events_dropped;
+ /*
+ * TMFs deferred to main loop BH. These fields are protected by
+ * virtio_scsi_acquire().
+ */
+ QEMUBH *tmf_bh;
+ QTAILQ_HEAD(, VirtIOSCSIReq) tmf_bh_list;
+
/* Fields for dataplane below */
AioContext *ctx; /* one iothread per virtio-scsi-pci for now */
@@ -106,13 +115,11 @@ typedef struct VirtIOSCSIReq {
QEMUSGList qsgl;
QEMUIOVector resp_iov;
- union {
- /* Used for two-stage request submission */
- QTAILQ_ENTRY(VirtIOSCSIReq) next;
+ /* Used for two-stage request submission and TMFs deferred to BH */
+ QTAILQ_ENTRY(VirtIOSCSIReq) next;
- /* Used for cancellation of request during TMFs */
- int remaining;
- };
+ /* Used for cancellation of request during TMFs */
+ int remaining;
SCSIRequest *sreq;
size_t resp_size;
--
2.37.3

View File

@ -0,0 +1,177 @@
From 93dfffa3c354c87aae712f5d6c86be5b26d975d4 Mon Sep 17 00:00:00 2001
From: Greg Kurz <groug@kaod.org>
Date: Tue, 15 Feb 2022 19:15:29 +0100
Subject: [PATCH 01/22] virtiofsd: Add basic support for FUSE_SYNCFS request
RH-Author: German Maglione <None>
RH-MergeRequest: 278: virtiofsd: Add basic support for FUSE_SYNCFS request
RH-Bugzilla: 2196880
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Jon Maloy <jmaloy@redhat.com>
RH-Commit: [1/1] 7a0cbe70d97f13e74b2116218fccd9f79d335752
Honor the expected behavior of syncfs() to synchronously flush all data
and metadata to disk on linux systems.
If virtiofsd is started with '-o announce_submounts', the client is
expected to send a FUSE_SYNCFS request for each individual submount.
In this case, we just create a new file descriptor on the submount
inode with lo_inode_open(), call syncfs() on it and close it. The
intermediary file is needed because O_PATH descriptors aren't
backed by an actual file and syncfs() would fail with EBADF.
If virtiofsd is started without '-o announce_submounts' or if the
client doesn't have the FUSE_CAP_SUBMOUNTS capability, the client
only sends a single FUSE_SYNCFS request for the root inode. The
server would thus need to track submounts internally and call
syncfs() on each of them. This will be implemented later.
Note that syncfs() might suffer from a time penalty if the submounts
are being hammered by some unrelated workload on the host. The only
solution to prevent that is to avoid shared mounts.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20220215181529.164070-2-groug@kaod.org>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 45b04ef48dbbeb18d93c2631bf5584ac493de749)
Signed-off-by: German Maglione <gmaglione@redhat.com>
---
tools/virtiofsd/fuse_lowlevel.c | 11 +++++++
tools/virtiofsd/fuse_lowlevel.h | 13 ++++++++
tools/virtiofsd/passthrough_ll.c | 44 +++++++++++++++++++++++++++
tools/virtiofsd/passthrough_seccomp.c | 1 +
4 files changed, 69 insertions(+)
diff --git a/tools/virtiofsd/fuse_lowlevel.c b/tools/virtiofsd/fuse_lowlevel.c
index 5d431a7038..57f928463a 100644
--- a/tools/virtiofsd/fuse_lowlevel.c
+++ b/tools/virtiofsd/fuse_lowlevel.c
@@ -1876,6 +1876,16 @@ static void do_lseek(fuse_req_t req, fuse_ino_t nodeid,
}
}
+static void do_syncfs(fuse_req_t req, fuse_ino_t nodeid,
+ struct fuse_mbuf_iter *iter)
+{
+ if (req->se->op.syncfs) {
+ req->se->op.syncfs(req, nodeid);
+ } else {
+ fuse_reply_err(req, ENOSYS);
+ }
+}
+
static void do_init(fuse_req_t req, fuse_ino_t nodeid,
struct fuse_mbuf_iter *iter)
{
@@ -2282,6 +2292,7 @@ static struct {
[FUSE_RENAME2] = { do_rename2, "RENAME2" },
[FUSE_COPY_FILE_RANGE] = { do_copy_file_range, "COPY_FILE_RANGE" },
[FUSE_LSEEK] = { do_lseek, "LSEEK" },
+ [FUSE_SYNCFS] = { do_syncfs, "SYNCFS" },
};
#define FUSE_MAXOP (sizeof(fuse_ll_ops) / sizeof(fuse_ll_ops[0]))
diff --git a/tools/virtiofsd/fuse_lowlevel.h b/tools/virtiofsd/fuse_lowlevel.h
index c55c0ca2fc..b889dae4de 100644
--- a/tools/virtiofsd/fuse_lowlevel.h
+++ b/tools/virtiofsd/fuse_lowlevel.h
@@ -1226,6 +1226,19 @@ struct fuse_lowlevel_ops {
*/
void (*lseek)(fuse_req_t req, fuse_ino_t ino, off_t off, int whence,
struct fuse_file_info *fi);
+
+ /**
+ * Synchronize file system content
+ *
+ * If this request is answered with an error code of ENOSYS,
+ * this is treated as success and future calls to syncfs() will
+ * succeed automatically without being sent to the filesystem
+ * process.
+ *
+ * @param req request handle
+ * @param ino the inode number
+ */
+ void (*syncfs)(fuse_req_t req, fuse_ino_t ino);
};
/**
diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index 523d8fbe1e..00ccb90a72 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -3357,6 +3357,49 @@ static void lo_lseek(fuse_req_t req, fuse_ino_t ino, off_t off, int whence,
}
}
+static int lo_do_syncfs(struct lo_data *lo, struct lo_inode *inode)
+{
+ int fd, ret = 0;
+
+ fuse_log(FUSE_LOG_DEBUG, "lo_do_syncfs(ino=%" PRIu64 ")\n",
+ inode->fuse_ino);
+
+ fd = lo_inode_open(lo, inode, O_RDONLY);
+ if (fd < 0) {
+ return -fd;
+ }
+
+ if (syncfs(fd) < 0) {
+ ret = errno;
+ }
+
+ close(fd);
+ return ret;
+}
+
+static void lo_syncfs(fuse_req_t req, fuse_ino_t ino)
+{
+ struct lo_data *lo = lo_data(req);
+ struct lo_inode *inode = lo_inode(req, ino);
+ int err;
+
+ if (!inode) {
+ fuse_reply_err(req, EBADF);
+ return;
+ }
+
+ err = lo_do_syncfs(lo, inode);
+ lo_inode_put(lo, &inode);
+
+ /*
+ * If submounts aren't announced, the client only sends a request to
+ * sync the root inode. TODO: Track submounts internally and iterate
+ * over them as well.
+ */
+
+ fuse_reply_err(req, err);
+}
+
static void lo_destroy(void *userdata)
{
struct lo_data *lo = (struct lo_data *)userdata;
@@ -3417,6 +3460,7 @@ static struct fuse_lowlevel_ops lo_oper = {
.copy_file_range = lo_copy_file_range,
#endif
.lseek = lo_lseek,
+ .syncfs = lo_syncfs,
.destroy = lo_destroy,
};
diff --git a/tools/virtiofsd/passthrough_seccomp.c b/tools/virtiofsd/passthrough_seccomp.c
index a3ce9f898d..3e9d6181dc 100644
--- a/tools/virtiofsd/passthrough_seccomp.c
+++ b/tools/virtiofsd/passthrough_seccomp.c
@@ -108,6 +108,7 @@ static const int syscall_allowlist[] = {
SCMP_SYS(set_robust_list),
SCMP_SYS(setxattr),
SCMP_SYS(symlinkat),
+ SCMP_SYS(syncfs),
SCMP_SYS(time), /* Rarely needed, except on static builds */
SCMP_SYS(tgkill),
SCMP_SYS(unlinkat),
--
2.37.3

View File

@ -83,7 +83,7 @@ Obsoletes: %1-rhev <= %{epoch}:%{version}-%{release}
Summary: QEMU is a machine emulator and virtualizer
Name: qemu-kvm
Version: 6.2.0
Release: 31%{?rcrel}%{?dist}
Release: 39%{?rcrel}%{?dist}
# Epoch because we pushed a qemu-1.0 package. AIUI this can't ever be dropped
Epoch: 15
License: GPLv2 and GPLv2+ and CC-BY
@ -626,6 +626,161 @@ Patch242: kvm-io-Add-support-for-MSG_PEEK-for-socket-channel.patch
Patch243: kvm-migration-check-magic-value-for-deciding-the-mapping.patch
# For bz#2168187 - [s390x] qemu-kvm coredumps when SE crashes
Patch244: kvm-target-s390x-arch_dump-Fix-memory-corruption-in-s390.patch
# For bz#2168472 - Guest hangs when starting or rebooting
Patch245: kvm-aio_wait_kick-add-missing-memory-barrier.patch
# For bz#2168472 - Guest hangs when starting or rebooting
Patch246: kvm-qatomic-add-smp_mb__before-after_rmw.patch
# For bz#2168472 - Guest hangs when starting or rebooting
Patch247: kvm-qemu-thread-posix-cleanup-fix-document-QemuEvent.patch
# For bz#2168472 - Guest hangs when starting or rebooting
Patch248: kvm-qemu-thread-win32-cleanup-fix-document-QemuEvent.patch
# For bz#2168472 - Guest hangs when starting or rebooting
Patch249: kvm-edu-add-smp_mb__after_rmw.patch
# For bz#2168472 - Guest hangs when starting or rebooting
Patch250: kvm-aio-wait-switch-to-smp_mb__after_rmw.patch
# For bz#2168472 - Guest hangs when starting or rebooting
Patch251: kvm-qemu-coroutine-lock-add-smp_mb__after_rmw.patch
# For bz#2168472 - Guest hangs when starting or rebooting
Patch252: kvm-physmem-add-missing-memory-barrier.patch
# For bz#2168472 - Guest hangs when starting or rebooting
Patch253: kvm-async-update-documentation-of-the-memory-barriers.patch
# For bz#2168472 - Guest hangs when starting or rebooting
Patch254: kvm-async-clarify-usage-of-barriers-in-the-polling-case.patch
# For bz#2090990 - qemu crash with error scsi_req_unref(SCSIRequest *): Assertion `req->refcount > 0' failed or scsi_dma_complete(void *, int): Assertion `r->req.aiocb != NULL' failed [8.7.0]
Patch255: kvm-scsi-protect-req-aiocb-with-AioContext-lock.patch
# For bz#2090990 - qemu crash with error scsi_req_unref(SCSIRequest *): Assertion `req->refcount > 0' failed or scsi_dma_complete(void *, int): Assertion `r->req.aiocb != NULL' failed [8.7.0]
Patch256: kvm-dma-helpers-prevent-dma_blk_cb-vs-dma_aio_cancel-rac.patch
# For bz#2090990 - qemu crash with error scsi_req_unref(SCSIRequest *): Assertion `req->refcount > 0' failed or scsi_dma_complete(void *, int): Assertion `r->req.aiocb != NULL' failed [8.7.0]
Patch257: kvm-virtio-scsi-reset-SCSI-devices-from-main-loop-thread.patch
# For bz#2187159 - RHEL8.8 - KVM - Secure Guest crashed during booting with 248 vcpus
Patch258: kvm-s390x-pv-Implement-a-CGS-check-helper.patch
# For bz#2177957 - Qemu core dump if cut off nfs storage during migration
Patch259: kvm-migration-Handle-block-device-inactivation-failures-.patch
# For bz#2177957 - Qemu core dump if cut off nfs storage during migration
Patch260: kvm-migration-Minor-control-flow-simplification.patch
# For bz#2177957 - Qemu core dump if cut off nfs storage during migration
Patch261: kvm-migration-Attempt-disk-reactivation-in-more-failure-.patch
# For bz#2035712 - [qemu] Booting from Guest Image over NBD with TLS Is Slow
Patch262: kvm-nbd-server-push-pending-frames-after-sending-reply.patch
# For bz#2035712 - [qemu] Booting from Guest Image over NBD with TLS Is Slow
Patch263: kvm-nbd-server-Request-TCP_NODELAY.patch
# For bz#2196880 - [virtiofs] Backport FUSE_SYNCFS support
Patch264: kvm-virtiofsd-Add-basic-support-for-FUSE_SYNCFS-request.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch265: kvm-s390-kvm-adjust-diag318-resets-to-retain-data.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch266: kvm-target-s390x-Fix-SLDA-sign-bit-index.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch267: kvm-target-s390x-Fix-SRDA-CC-calculation.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch268: kvm-target-s390x-Fix-cc_calc_sla_64-missing-overflows.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch269: kvm-target-s390x-Fix-shifting-32-bit-values-for-more-tha.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch270: kvm-s390x-sigp-Reorder-the-SIGP-STOP-code.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch271: kvm-s390x-tcg-Fix-BRASL-with-a-large-negative-offset.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch272: kvm-s390x-tcg-Fix-BRCL-with-a-large-negative-offset.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch273: kvm-target-s390x-Fix-determination-of-overflow-condition.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch274: kvm-target-s390x-Fix-determination-of-overflow-cond.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch275: kvm-s390x-follow-qdev-tree-to-detect-SCSI-device-on-a-CC.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch276: kvm-target-s390x-Fix-the-accumulation-of-ccm-in-op_icm.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch277: kvm-target-s390x-Fix-writeback-to-v1-in-helper_vstl.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch278: kvm-target-s390x-fix-handling-of-zeroes-in-vfmin-vfmax.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch279: kvm-target-s390x-Fix-CLFIT-and-CLGIT-immediate-size.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch280: kvm-s390x-tcg-Fix-opcode-for-lzrf.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch281: kvm-target-s390x-Fix-emulation-of-the-VISTR-instruction.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch282: kvm-s390x-css-revert-SCSW-ctrl-flag-bits-on-error.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch283: kvm-target-s390x-tcg-Fix-and-improve-the-SACF-instructio.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch284: kvm-target-s390x-tcg-mem_helper-Test-the-right-bits-in-p.patch
# For bz#2169308 - Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9
# For bz#2209605 - [IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu)
Patch285: kvm-pc-bios-Add-support-for-List-Directed-IPL-from-ECKD-.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch286: kvm-memory-prevent-dma-reentracy-issues.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch287: kvm-async-Add-an-optional-reentrancy-guard-to-the-BH-API.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch288: kvm-checkpatch-add-qemu_bh_new-aio_bh_new-checks.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch289: kvm-hw-replace-most-qemu_bh_new-calls-with-qemu_bh_new_g.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch290: kvm-lsi53c895a-disable-reentrancy-detection-for-script-R.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch291: kvm-bcm2835_property-disable-reentrancy-detection-for-io.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch292: kvm-raven-disable-reentrancy-detection-for-iomem.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch293: kvm-apic-disable-reentrancy-detection-for-apic-msi.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch294: kvm-async-avoid-use-after-free-on-re-entrancy-guard.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch295: kvm-memory-stricter-checks-prior-to-unsetting-engaged_in.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch296: kvm-lsi53c895a-disable-reentrancy-detection-for-MMIO-reg.patch
# For bz#1999236 - CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8]
Patch297: kvm-hw-scsi-lsi53c895a-Fix-reentrancy-issues-in-the-LSI-.patch
# For bz#2216203 - [qemu-kvm]VM reports vulnerabilty to mmio_stale_data on patched host with microcode
Patch298: kvm-target-i386-add-support-for-FLUSH_L1D-feature.patch
# For bz#2216203 - [qemu-kvm]VM reports vulnerabilty to mmio_stale_data on patched host with microcode
Patch299: kvm-target-i386-add-support-for-FB_CLEAR-feature.patch
# For bz#2169733 - Qemu on destination host crashed if migrate with postcopy and multifd enabled
Patch300: kvm-migration-Disable-postcopy-multifd-migration.patch
# For bz#2141964 - Guest hit EXT4-fs error on host 4K disk when repeatedly hot-plug/unplug running IO disk
Patch301: kvm-util-iov-Make-qiov_slice-public.patch
# For bz#2141964 - Guest hit EXT4-fs error on host 4K disk when repeatedly hot-plug/unplug running IO disk
Patch302: kvm-block-Collapse-padded-I-O-vecs-exceeding-IOV_MAX.patch
# For bz#2141964 - Guest hit EXT4-fs error on host 4K disk when repeatedly hot-plug/unplug running IO disk
Patch303: kvm-util-iov-Remove-qemu_iovec_init_extended.patch
# For bz#2141964 - Guest hit EXT4-fs error on host 4K disk when repeatedly hot-plug/unplug running IO disk
Patch304: kvm-iotests-iov-padding-New-test.patch
# For bz#2141964 - Guest hit EXT4-fs error on host 4K disk when repeatedly hot-plug/unplug running IO disk
Patch305: kvm-block-Fix-pad_request-s-request-restriction.patch
# For bz#2214840 - [AMDSERVER 8.9 Bug] Qemu SEV reduced-phys-bits fixes
Patch306: kvm-qapi-i386-sev-Change-the-reduced-phys-bits-value-fro.patch
# For bz#2214840 - [AMDSERVER 8.9 Bug] Qemu SEV reduced-phys-bits fixes
Patch307: kvm-qemu-options.hx-Update-the-reduced-phys-bits-documen.patch
# For bz#2214840 - [AMDSERVER 8.9 Bug] Qemu SEV reduced-phys-bits fixes
Patch308: kvm-i386-sev-Update-checks-and-information-related-to-re.patch
# For bz#2214840 - [AMDSERVER 8.9 Bug] Qemu SEV reduced-phys-bits fixes
Patch309: kvm-i386-cpu-Update-how-the-EBX-register-of-CPUID-0x8000.patch
# For bz#2223947 - [RHEL8.9] qemu core dump with '-cpu host,mpx=off' on Cascadelake host
Patch310: kvm-target-i386-kvm-Fix-disabling-MPX-on-cpu-host-with-M.patch
# For bz#2215786 - CVE-2023-3301 virt:rhel/qemu-kvm: QEMU: net: triggerable assertion due to race condition in hot-unplug [rhel-8]
Patch311: kvm-vhost-vdpa-do-not-cleanup-the-vdpa-vhost-net-structu.patch
BuildRequires: wget
BuildRequires: rpm-build
@ -1795,6 +1950,119 @@ sh %{_sysconfdir}/sysconfig/modules/kvm.modules &> /dev/null || :
%changelog
* Mon Aug 28 2023 Miroslav Rezanina <mrezanin@redhat.com> - 6.2.0-39
- kvm-vhost-vdpa-do-not-cleanup-the-vdpa-vhost-net-structu.patch [bz#2215786]
- Resolves: bz#2215786
(CVE-2023-3301 virt:rhel/qemu-kvm: QEMU: net: triggerable assertion due to race condition in hot-unplug [rhel-8])
* Wed Aug 09 2023 Jon Maloy <jmaloy@redhat.com> - 6.2.0-38
- kvm-qapi-i386-sev-Change-the-reduced-phys-bits-value-fro.patch [bz#2214840]
- kvm-qemu-options.hx-Update-the-reduced-phys-bits-documen.patch [bz#2214840]
- kvm-i386-sev-Update-checks-and-information-related-to-re.patch [bz#2214840]
- kvm-i386-cpu-Update-how-the-EBX-register-of-CPUID-0x8000.patch [bz#2214840]
- kvm-target-i386-kvm-Fix-disabling-MPX-on-cpu-host-with-M.patch [bz#2223947]
- Resolves: bz#2214840
([AMDSERVER 8.9 Bug] Qemu SEV reduced-phys-bits fixes)
- Resolves: bz#2223947
([RHEL8.9] qemu core dump with '-cpu host,mpx=off' on Cascadelake host)
* Tue Jul 25 2023 Miroslav Rezanina <mrezanin@redhat.com> - 6.2.0-37
- kvm-util-iov-Make-qiov_slice-public.patch [bz#2141964]
- kvm-block-Collapse-padded-I-O-vecs-exceeding-IOV_MAX.patch [bz#2141964]
- kvm-util-iov-Remove-qemu_iovec_init_extended.patch [bz#2141964]
- kvm-iotests-iov-padding-New-test.patch [bz#2141964]
- kvm-block-Fix-pad_request-s-request-restriction.patch [bz#2141964]
- Resolves: bz#2141964
(Guest hit EXT4-fs error on host 4K disk when repeatedly hot-plug/unplug running IO disk)
* Thu Jun 29 2023 Jon Maloy <jmaloy@redhat.com> - 6.2.0-36
- kvm-memory-prevent-dma-reentracy-issues.patch [bz#1999236]
- kvm-async-Add-an-optional-reentrancy-guard-to-the-BH-API.patch [bz#1999236]
- kvm-checkpatch-add-qemu_bh_new-aio_bh_new-checks.patch [bz#1999236]
- kvm-hw-replace-most-qemu_bh_new-calls-with-qemu_bh_new_g.patch [bz#1999236]
- kvm-lsi53c895a-disable-reentrancy-detection-for-script-R.patch [bz#1999236]
- kvm-bcm2835_property-disable-reentrancy-detection-for-io.patch [bz#1999236]
- kvm-raven-disable-reentrancy-detection-for-iomem.patch [bz#1999236]
- kvm-apic-disable-reentrancy-detection-for-apic-msi.patch [bz#1999236]
- kvm-async-avoid-use-after-free-on-re-entrancy-guard.patch [bz#1999236]
- kvm-memory-stricter-checks-prior-to-unsetting-engaged_in.patch [bz#1999236]
- kvm-lsi53c895a-disable-reentrancy-detection-for-MMIO-reg.patch [bz#1999236]
- kvm-hw-scsi-lsi53c895a-Fix-reentrancy-issues-in-the-LSI-.patch [bz#1999236]
- kvm-target-i386-add-support-for-FLUSH_L1D-feature.patch [bz#2216203]
- kvm-target-i386-add-support-for-FB_CLEAR-feature.patch [bz#2216203]
- kvm-migration-Disable-postcopy-multifd-migration.patch [bz#2169733]
- Resolves: bz#1999236
(CVE-2021-3750 virt:rhel/qemu-kvm: QEMU: hcd-ehci: DMA reentrancy issue leads to use-after-free [rhel-8])
- Resolves: bz#2216203
([qemu-kvm]VM reports vulnerabilty to mmio_stale_data on patched host with microcode)
- Resolves: bz#2169733
(Qemu on destination host crashed if migrate with postcopy and multifd enabled)
* Fri Jun 02 2023 Jon Maloy <jmaloy@redhat.com> - 6.2.0-35
- kvm-virtiofsd-Add-basic-support-for-FUSE_SYNCFS-request.patch [bz#2196880]
- kvm-s390-kvm-adjust-diag318-resets-to-retain-data.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-Fix-SLDA-sign-bit-index.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-Fix-SRDA-CC-calculation.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-Fix-cc_calc_sla_64-missing-overflows.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-Fix-shifting-32-bit-values-for-more-tha.patch [bz#2169308 bz#2209605]
- kvm-s390x-sigp-Reorder-the-SIGP-STOP-code.patch [bz#2169308 bz#2209605]
- kvm-s390x-tcg-Fix-BRASL-with-a-large-negative-offset.patch [bz#2169308 bz#2209605]
- kvm-s390x-tcg-Fix-BRCL-with-a-large-negative-offset.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-Fix-determination-of-overflow-condition.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-Fix-determination-of-overflow-cond.patch [bz#2169308 bz#2209605]
- kvm-s390x-follow-qdev-tree-to-detect-SCSI-device-on-a-CC.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-Fix-the-accumulation-of-ccm-in-op_icm.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-Fix-writeback-to-v1-in-helper_vstl.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-fix-handling-of-zeroes-in-vfmin-vfmax.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-Fix-CLFIT-and-CLGIT-immediate-size.patch [bz#2169308 bz#2209605]
- kvm-s390x-tcg-Fix-opcode-for-lzrf.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-Fix-emulation-of-the-VISTR-instruction.patch [bz#2169308 bz#2209605]
- kvm-s390x-css-revert-SCSW-ctrl-flag-bits-on-error.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-tcg-Fix-and-improve-the-SACF-instructio.patch [bz#2169308 bz#2209605]
- kvm-target-s390x-tcg-mem_helper-Test-the-right-bits-in-p.patch [bz#2169308 bz#2209605]
- kvm-pc-bios-Add-support-for-List-Directed-IPL-from-ECKD-.patch [bz#2169308 bz#2209605]
- Resolves: bz#2196880
([virtiofs] Backport FUSE_SYNCFS support)
- Resolves: bz#2169308
(Backport latest s390x-related fixes from upstream QEMU for qemu-kvm in RHEL 8.9)
- Resolves: bz#2209605
([IBM 8.9 FEAT] KVM: ECKD List Directed IPL - virtio (qemu))
* Fri May 19 2023 Miroslav Rezanina <mrezanin@redhat.com> - 6.2.0-34
- kvm-migration-Handle-block-device-inactivation-failures-.patch [bz#2177957]
- kvm-migration-Minor-control-flow-simplification.patch [bz#2177957]
- kvm-migration-Attempt-disk-reactivation-in-more-failure-.patch [bz#2177957]
- kvm-nbd-server-push-pending-frames-after-sending-reply.patch [bz#2035712]
- kvm-nbd-server-Request-TCP_NODELAY.patch [bz#2035712]
- Resolves: bz#2177957
(Qemu core dump if cut off nfs storage during migration)
- Resolves: bz#2035712
([qemu] Booting from Guest Image over NBD with TLS Is Slow)
* Tue Apr 25 2023 Miroslav Rezanina <mrezanin@redhat.com> - 6.2.0-33
- kvm-s390x-pv-Implement-a-CGS-check-helper.patch [bz#2187159]
- Resolves: bz#2187159
(RHEL8.8 - KVM - Secure Guest crashed during booting with 248 vcpus)
* Mon Mar 13 2023 Jon Maloy <jmaloy@redhat.com> - 6.2.0-32.el8_8
- kvm-aio_wait_kick-add-missing-memory-barrier.patch [bz#2168472]
- kvm-qatomic-add-smp_mb__before-after_rmw.patch [bz#2168472]
- kvm-qemu-thread-posix-cleanup-fix-document-QemuEvent.patch [bz#2168472]
- kvm-qemu-thread-win32-cleanup-fix-document-QemuEvent.patch [bz#2168472]
- kvm-edu-add-smp_mb__after_rmw.patch [bz#2168472]
- kvm-aio-wait-switch-to-smp_mb__after_rmw.patch [bz#2168472]
- kvm-qemu-coroutine-lock-add-smp_mb__after_rmw.patch [bz#2168472]
- kvm-physmem-add-missing-memory-barrier.patch [bz#2168472]
- kvm-async-update-documentation-of-the-memory-barriers.patch [bz#2168472]
- kvm-async-clarify-usage-of-barriers-in-the-polling-case.patch [bz#2168472]
- kvm-scsi-protect-req-aiocb-with-AioContext-lock.patch [bz#2090990]
- kvm-dma-helpers-prevent-dma_blk_cb-vs-dma_aio_cancel-rac.patch [bz#2090990]
- kvm-virtio-scsi-reset-SCSI-devices-from-main-loop-thread.patch [bz#2090990]
- Resolves: bz#2168472
(Guest hangs when starting or rebooting)
- Resolves: bz#2090990
(qemu crash with error scsi_req_unref(SCSIRequest *): Assertion `req->refcount > 0' failed or scsi_dma_complete(void *, int): Assertion `r->req.aiocb != NULL' failed [8.7.0])
* Wed Feb 15 2023 Jon Maloy <jmaloy@redhat.com> - 6.2.0-31
- kvm-io-Add-support-for-MSG_PEEK-for-socket-channel.patch [bz#2137740]
- kvm-migration-check-magic-value-for-deciding-the-mapping.patch [bz#2137740]