qemu-kvm/kvm-migration-Fix-possible-races-when-shutting-down-the-.patch
Miroslav Rezanina 5f95659303 * Mon Oct 16 2023 Miroslav Rezanina <mrezanin@redhat.com> - 8.1.0-3
- kvm-migration-Fix-race-that-dest-preempt-thread-close-to.patch [RHEL-11219]
- kvm-migration-Fix-possible-race-when-setting-rp_state.er.patch [RHEL-11219]
- kvm-migration-Fix-possible-races-when-shutting-down-the-.patch [RHEL-11219]
- kvm-migration-Fix-possible-race-when-shutting-down-to_ds.patch [RHEL-11219]
- kvm-migration-Remove-redundant-cleanup-of-postcopy_qemuf.patch [RHEL-11219]
- kvm-migration-Consolidate-return-path-closing-code.patch [RHEL-11219]
- kvm-migration-Replace-the-return-path-retry-logic.patch [RHEL-11219]
- kvm-migration-Move-return-path-cleanup-to-main-migration.patch [RHEL-11219]
- kvm-file-posix-Clear-bs-bl.zoned-on-error.patch [RHEL-7360]
- kvm-file-posix-Check-bs-bl.zoned-for-zone-info.patch [RHEL-7360]
- kvm-file-posix-Fix-zone-update-in-I-O-error-path.patch [RHEL-7360]
- kvm-file-posix-Simplify-raw_co_prw-s-out-zone-code.patch [RHEL-7360]
- kvm-tests-file-io-error-New-test.patch [RHEL-7360]
- Resolves: RHEL-11219
  (migration tests failing for RHEL 9.4 sometimes)
- Resolves: RHEL-7360
  (Qemu Core Dumped When Writing Larger Size Than The Size of A Data Disk)
2023-10-16 03:40:21 -04:00

75 lines
3.0 KiB
Diff

From 1cc26aac30b883244e300d872fa3a19f39afbb66 Mon Sep 17 00:00:00 2001
From: Fabiano Rosas <farosas@suse.de>
Date: Mon, 18 Sep 2023 14:28:17 -0300
Subject: [PATCH 03/13] migration: Fix possible races when shutting down the
return path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Peter Xu <peterx@redhat.com>
RH-MergeRequest: 203: migration: Fix race that dest preempt thread close too early
RH-Jira: RHEL-11219
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Leonardo Brás <leobras@redhat.com>
RH-Commit: [3/8] d734387cf07e956a1f1e817c200e5b36cf0a00b1 (peterx/qemu-kvm)
We cannot call qemu_file_shutdown() on the return path file without
taking the file lock. The return path thread could be running it's
cleanup code and have just cleared the from_dst_file pointer.
Checking ms->to_dst_file for errors could also race with
migrate_fd_cleanup() which clears the to_dst_file pointer.
Protect both accesses by taking the file lock.
This was caught by inspection, it should be rare, but the next patches
will start calling this code from other places, so let's do the
correct thing.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-4-farosas@suse.de>
(cherry picked from commit 639decf529793fc544c8055b82be8abe77fa48fa)
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index b92c6ae436..517b3e04d2 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2053,17 +2053,18 @@ static int open_return_path_on_source(MigrationState *ms,
static int await_return_path_close_on_source(MigrationState *ms)
{
/*
- * If this is a normal exit then the destination will send a SHUT and the
- * rp_thread will exit, however if there's an error we need to cause
- * it to exit.
+ * If this is a normal exit then the destination will send a SHUT
+ * and the rp_thread will exit, however if there's an error we
+ * need to cause it to exit. shutdown(2), if we have it, will
+ * cause it to unblock if it's stuck waiting for the destination.
*/
- if (qemu_file_get_error(ms->to_dst_file) && ms->rp_state.from_dst_file) {
- /*
- * shutdown(2), if we have it, will cause it to unblock if it's stuck
- * waiting for the destination.
- */
- qemu_file_shutdown(ms->rp_state.from_dst_file);
+ WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
+ if (ms->to_dst_file && ms->rp_state.from_dst_file &&
+ qemu_file_get_error(ms->to_dst_file)) {
+ qemu_file_shutdown(ms->rp_state.from_dst_file);
+ }
}
+
trace_await_return_path_close_on_source_joining();
qemu_thread_join(&ms->rp_state.rp_thread);
ms->rp_state.rp_thread_created = false;
--
2.39.3