Enable QXL device build
Enable building for ppc64le Re-added Spice support Don't remove slof.bin for ppc64le
This commit is contained in:
commit
5adf4e6785
180
kvm-migration-Fix-UAF-for-incoming-migration-on-Migratio.patch
Normal file
180
kvm-migration-Fix-UAF-for-incoming-migration-on-Migratio.patch
Normal file
@ -0,0 +1,180 @@
|
||||
From f479e7cd7cc4e48f7383c4ee78609bb7605b70c6 Mon Sep 17 00:00:00 2001
|
||||
From: Peter Xu <peterx@redhat.com>
|
||||
Date: Thu, 20 Feb 2025 08:24:59 -0500
|
||||
Subject: [PATCH 1/3] migration: Fix UAF for incoming migration on
|
||||
MigrationState
|
||||
|
||||
RH-Author: Peter Xu <peterx@redhat.com>
|
||||
RH-MergeRequest: 342: migration: Fix UAF for incoming migration on MigrationState
|
||||
RH-Jira: RHEL-69776
|
||||
RH-Acked-by: Juraj Marcin <None>
|
||||
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
|
||||
RH-Commit: [1/1] 65c90964461cf884167db18bedf62db7dc242573 (peterx/qemu-kvm)
|
||||
|
||||
On the incoming migration side, QEMU uses a coroutine to load all the VM
|
||||
states. Inside, it may reference MigrationState on global states like
|
||||
migration capabilities, parameters, error state, shared mutexes and more.
|
||||
|
||||
However there's nothing yet to make sure MigrationState won't get
|
||||
destroyed (e.g. after migration_shutdown()). Meanwhile there's also no API
|
||||
available to remove the incoming coroutine in migration_shutdown(),
|
||||
avoiding it to access the freed elements.
|
||||
|
||||
There's a bug report showing this can happen and crash dest QEMU when
|
||||
migration is cancelled on source.
|
||||
|
||||
When it happens, the dest main thread is trying to cleanup everything:
|
||||
|
||||
#0 qemu_aio_coroutine_enter
|
||||
#1 aio_dispatch_handler
|
||||
#2 aio_poll
|
||||
#3 monitor_cleanup
|
||||
#4 qemu_cleanup
|
||||
#5 qemu_default_main
|
||||
|
||||
Then it found the migration incoming coroutine, schedule it (even after
|
||||
migration_shutdown()), causing crash:
|
||||
|
||||
#0 __pthread_kill_implementation
|
||||
#1 __pthread_kill_internal
|
||||
#2 __GI_raise
|
||||
#3 __GI_abort
|
||||
#4 __assert_fail_base
|
||||
#5 __assert_fail
|
||||
#6 qemu_mutex_lock_impl
|
||||
#7 qemu_lockable_mutex_lock
|
||||
#8 qemu_lockable_lock
|
||||
#9 qemu_lockable_auto_lock
|
||||
#10 migrate_set_error
|
||||
#11 process_incoming_migration_co
|
||||
#12 coroutine_trampoline
|
||||
|
||||
To fix it, take a refcount after an incoming setup is properly done when
|
||||
qmp_migrate_incoming() succeeded the 1st time. As it's during a QMP
|
||||
handler which needs BQL, it means the main loop is still alive (without
|
||||
going into cleanups, which also needs BQL).
|
||||
|
||||
Releasing the refcount now only until the incoming migration coroutine
|
||||
finished or failed. Hence the refcount is valid for both (1) setup phase
|
||||
of incoming ports, mostly IO watches (e.g. qio_channel_add_watch_full()),
|
||||
and (2) the incoming coroutine itself (process_incoming_migration_co()).
|
||||
|
||||
Note that we can't unref in migration_incoming_state_destroy(), because
|
||||
both qmp_xen_load_devices_state() and load_snapshot() will use it without
|
||||
an incoming migration. Those hold BQL so they're not prone to this issue.
|
||||
|
||||
PS: I suspect nobody uses Xen's command at all, as it didn't register yank,
|
||||
hence AFAIU the command should crash on master when trying to unregister
|
||||
yank in migration_incoming_state_destroy().. but that's another story.
|
||||
|
||||
Also note that in some incoming failure cases we may not always unref the
|
||||
MigrationState refcount, which is a trade-off to keep things simple. We
|
||||
could make it accurate, but it can be an overkill. Some examples:
|
||||
|
||||
- Unlike most of the rest protocols, socket_start_incoming_migration()
|
||||
may create net listener after incoming port setup sucessfully.
|
||||
It means we can't unref in migration_channel_process_incoming() as a
|
||||
generic path because socket protocol might keep using MigrationState.
|
||||
|
||||
- For either socket or file, multiple IO watches might be created, it
|
||||
means logically each IO watch needs to take one refcount for
|
||||
MigrationState so as to be 100% accurate on ownership of refcount taken.
|
||||
|
||||
In general, we at least need per-protocol handling to make it accurate,
|
||||
which can be an overkill if we know incoming failed after all. Add a short
|
||||
comment to explain that when taking the refcount in qmp_migrate_incoming().
|
||||
|
||||
Bugzilla: https://issues.redhat.com/browse/RHEL-69775
|
||||
Tested-by: Yan Fu <yafu@redhat.com>
|
||||
Signed-off-by: Peter Xu <peterx@redhat.com>
|
||||
Reviewed-by: Fabiano Rosas <farosas@suse.de>
|
||||
Message-ID: <20250220132459.512610-1-peterx@redhat.com>
|
||||
Signed-off-by: Fabiano Rosas <farosas@suse.de>
|
||||
(cherry picked from commit d657a14de5d597bbfe7b54e4c4f0646f440e98ad)
|
||||
Signed-off-by: Peter Xu <peterx@redhat.com>
|
||||
---
|
||||
migration/migration.c | 40 ++++++++++++++++++++++++++++++++++++++--
|
||||
1 file changed, 38 insertions(+), 2 deletions(-)
|
||||
|
||||
diff --git a/migration/migration.c b/migration/migration.c
|
||||
index 999d4cac54..aabdc45c16 100644
|
||||
--- a/migration/migration.c
|
||||
+++ b/migration/migration.c
|
||||
@@ -115,6 +115,27 @@ static void migration_downtime_start(MigrationState *s)
|
||||
s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
|
||||
}
|
||||
|
||||
+/*
|
||||
+ * This is unfortunate: incoming migration actually needs the outgoing
|
||||
+ * migration state (MigrationState) to be there too, e.g. to query
|
||||
+ * capabilities, parameters, using locks, setup errors, etc.
|
||||
+ *
|
||||
+ * NOTE: when calling this, making sure current_migration exists and not
|
||||
+ * been freed yet! Otherwise trying to access the refcount is already
|
||||
+ * an use-after-free itself..
|
||||
+ *
|
||||
+ * TODO: Move shared part of incoming / outgoing out into separate object.
|
||||
+ * Then this is not needed.
|
||||
+ */
|
||||
+static void migrate_incoming_ref_outgoing_state(void)
|
||||
+{
|
||||
+ object_ref(migrate_get_current());
|
||||
+}
|
||||
+static void migrate_incoming_unref_outgoing_state(void)
|
||||
+{
|
||||
+ object_unref(migrate_get_current());
|
||||
+}
|
||||
+
|
||||
static void migration_downtime_end(MigrationState *s)
|
||||
{
|
||||
int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
|
||||
@@ -821,7 +842,7 @@ process_incoming_migration_co(void *opaque)
|
||||
* postcopy thread.
|
||||
*/
|
||||
trace_process_incoming_migration_co_postcopy_end_main();
|
||||
- return;
|
||||
+ goto out;
|
||||
}
|
||||
/* Else if something went wrong then just fall out of the normal exit */
|
||||
}
|
||||
@@ -837,7 +858,8 @@ process_incoming_migration_co(void *opaque)
|
||||
}
|
||||
|
||||
migration_bh_schedule(process_incoming_migration_bh, mis);
|
||||
- return;
|
||||
+ goto out;
|
||||
+
|
||||
fail:
|
||||
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
|
||||
MIGRATION_STATUS_FAILED);
|
||||
@@ -854,6 +876,9 @@ fail:
|
||||
|
||||
exit(EXIT_FAILURE);
|
||||
}
|
||||
+out:
|
||||
+ /* Pairs with the refcount taken in qmp_migrate_incoming() */
|
||||
+ migrate_incoming_unref_outgoing_state();
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -1875,6 +1900,17 @@ void qmp_migrate_incoming(const char *uri, bool has_channels,
|
||||
return;
|
||||
}
|
||||
|
||||
+ /*
|
||||
+ * Making sure MigrationState is available until incoming migration
|
||||
+ * completes.
|
||||
+ *
|
||||
+ * NOTE: QEMU _might_ leak this refcount in some failure paths, but
|
||||
+ * that's OK. This is the minimum change we need to at least making
|
||||
+ * sure success case is clean on the refcount. We can try harder to
|
||||
+ * make it accurate for any kind of failures, but it might be an
|
||||
+ * overkill and doesn't bring us much benefit.
|
||||
+ */
|
||||
+ migrate_incoming_ref_outgoing_state();
|
||||
once = false;
|
||||
}
|
||||
|
||||
--
|
||||
2.39.3
|
||||
|
@ -0,0 +1,89 @@
|
||||
From 804d6d4fe3e5fde1fe5e21b5c4a33b6f279fe407 Mon Sep 17 00:00:00 2001
|
||||
From: =?UTF-8?q?Daniel=20P=2E=20Berrang=C3=A9?= <berrange@redhat.com>
|
||||
Date: Fri, 6 Dec 2024 11:45:24 +0000
|
||||
Subject: [PATCH 2/3] scripts: improve error from qemu-trace-stap on missing
|
||||
'stap'
|
||||
MIME-Version: 1.0
|
||||
Content-Type: text/plain; charset=UTF-8
|
||||
Content-Transfer-Encoding: 8bit
|
||||
|
||||
RH-Author: Daniel P. Berrangé <berrange@redhat.com>
|
||||
RH-MergeRequest: 346: scripts: improve error from qemu-trace-stap on missing 'stap'
|
||||
RH-Jira: RHEL-83535
|
||||
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
|
||||
RH-Commit: [1/2] 2b2425b5c8cd8d59e9d8d4b228d2d97156bba510 (berrange/centos-src-qemu)
|
||||
|
||||
If the 'stap' binary is missing in $PATH, a huge trace is thrown
|
||||
|
||||
$ qemu-trace-stap list /usr/bin/qemu-system-x86_64
|
||||
Traceback (most recent call last):
|
||||
File "/usr/bin/qemu-trace-stap", line 169, in <module>
|
||||
main()
|
||||
File "/usr/bin/qemu-trace-stap", line 165, in main
|
||||
args.func(args)
|
||||
File "/usr/bin/qemu-trace-stap", line 83, in cmd_run
|
||||
subprocess.call(stapargs)
|
||||
File "/usr/lib64/python3.12/subprocess.py", line 389, in call
|
||||
with Popen(*popenargs, **kwargs) as p:
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
File "/usr/lib64/python3.12/subprocess.py", line 1026, in {}init{}
|
||||
self._execute_child(args, executable, preexec_fn, close_fds,
|
||||
File "/usr/lib64/python3.12/subprocess.py", line 1955, in _execute_child
|
||||
raise child_exception_type(errno_num, err_msg, err_filename)
|
||||
FileNotFoundError: [Errno 2] No such file or directory: 'stap'
|
||||
|
||||
With this change the user now gets
|
||||
|
||||
$ qemu-trace-stap list /usr/bin/qemu-system-x86_64
|
||||
Unable to find 'stap' in $PATH
|
||||
|
||||
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
|
||||
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
||||
Message-id: 20241206114524.1666664-1-berrange@redhat.com
|
||||
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
||||
(cherry picked from commit 9976be3911a2d0503f026ae37c17077273bf30ee)
|
||||
---
|
||||
scripts/qemu-trace-stap | 6 ++++--
|
||||
1 file changed, 4 insertions(+), 2 deletions(-)
|
||||
|
||||
diff --git a/scripts/qemu-trace-stap b/scripts/qemu-trace-stap
|
||||
index eb6e951ff2..e983460ee7 100755
|
||||
--- a/scripts/qemu-trace-stap
|
||||
+++ b/scripts/qemu-trace-stap
|
||||
@@ -56,6 +56,7 @@ def tapset_dir(binary):
|
||||
|
||||
|
||||
def cmd_run(args):
|
||||
+ stap = which("stap")
|
||||
prefix = probe_prefix(args.binary)
|
||||
tapsets = tapset_dir(args.binary)
|
||||
|
||||
@@ -76,7 +77,7 @@ def cmd_run(args):
|
||||
|
||||
# We request an 8MB buffer, since the stap default 1MB buffer
|
||||
# can be easily overflowed by frequently firing QEMU traces
|
||||
- stapargs = ["stap", "-s", "8", "-I", tapsets ]
|
||||
+ stapargs = [stap, "-s", "8", "-I", tapsets ]
|
||||
if args.pid is not None:
|
||||
stapargs.extend(["-x", args.pid])
|
||||
stapargs.extend(["-e", script])
|
||||
@@ -84,6 +85,7 @@ def cmd_run(args):
|
||||
|
||||
|
||||
def cmd_list(args):
|
||||
+ stap = which("stap")
|
||||
tapsets = tapset_dir(args.binary)
|
||||
|
||||
if args.verbose:
|
||||
@@ -96,7 +98,7 @@ def cmd_list(args):
|
||||
|
||||
if verbose:
|
||||
print("Listing probes with name '%s'" % script)
|
||||
- proc = subprocess.Popen(["stap", "-I", tapsets, "-l", script],
|
||||
+ proc = subprocess.Popen([stap, "-I", tapsets, "-l", script],
|
||||
stdout=subprocess.PIPE,
|
||||
universal_newlines=True)
|
||||
out, err = proc.communicate()
|
||||
--
|
||||
2.39.3
|
||||
|
@ -158,7 +158,7 @@ Obsoletes: %{name}-block-ssh <= %{epoch}:%{version} \
|
||||
Summary: QEMU is a machine emulator and virtualizer
|
||||
Name: qemu-kvm
|
||||
Version: 9.1.0
|
||||
Release: 15%{?rcrel}%{?dist}%{?cc_suffix}.alma.1
|
||||
Release: 16%{?rcrel}%{?dist}%{?cc_suffix}.alma.1
|
||||
# Epoch because we pushed a qemu-1.0 package. AIUI this can't ever be dropped
|
||||
# Epoch 15 used for RHEL 8
|
||||
# Epoch 17 used for RHEL 9 (due to release versioning offset in RHEL 8.5)
|
||||
@ -497,6 +497,10 @@ Patch152: kvm-iotests-Add-filter_qtest.patch
|
||||
Patch153: kvm-iotests-Add-qsd-migrate-case.patch
|
||||
# For RHEL-54670 - Provide QMP command for block device reactivation after migration [rhel-10.0]
|
||||
Patch154: kvm-iotests-Add-NBD-based-tests-for-inactive-nodes.patch
|
||||
# For RHEL-69776 - [rhel10]Guest crashed on the target host when the migration was canceled
|
||||
Patch155: kvm-migration-Fix-UAF-for-incoming-migration-on-Migratio.patch
|
||||
# For RHEL-83535 - [Qemu RHEL-10] qemu-trace-stap should handle lack of stap more gracefully
|
||||
Patch156: kvm-scripts-improve-error-from-qemu-trace-stap-on-missin.patch
|
||||
|
||||
# AlmaLinux Patch
|
||||
Patch2001: 2001-Add-ppc64-support.patch
|
||||
@ -662,6 +666,7 @@ This package provides documentation and auxiliary programs used with %{name}.
|
||||
|
||||
%package tools
|
||||
Summary: %{name} support tools
|
||||
Recommends: systemtap-client
|
||||
%description tools
|
||||
%{name}-tools provides various tools related to %{name} usage.
|
||||
|
||||
@ -1613,12 +1618,21 @@ useradd -r -u 107 -g qemu -G kvm -d / -s /sbin/nologin \
|
||||
%endif
|
||||
|
||||
%changelog
|
||||
* Tue Feb 18 2025 Eduard Abdullin <eabdullin@almalinux.org> - 18:9.1.0-15.alma.1
|
||||
* Thu Mar 27 2025 Eduard Abdullin <eabdullin@almalinux.org> - 18:9.1.0-16.alma.1
|
||||
- Enable QXL device build
|
||||
- Enable building for ppc64le
|
||||
- Re-added Spice support
|
||||
- Don't remove slof.bin for ppc64le
|
||||
|
||||
* Tue Mar 25 2025 Miroslav Rezanina <mrezanin@redhat.com> - 9.1.0-16
|
||||
- kvm-migration-Fix-UAF-for-incoming-migration-on-Migratio.patch [RHEL-69776]
|
||||
- kvm-scripts-improve-error-from-qemu-trace-stap-on-missin.patch [RHEL-83535]
|
||||
- kvm-Recommend-systemtap-client-from-qemu-tools.patch [RHEL-83535]
|
||||
- Resolves: RHEL-69776
|
||||
([rhel10]Guest crashed on the target host when the migration was canceled)
|
||||
- Resolves: RHEL-83535
|
||||
([Qemu RHEL-10] qemu-trace-stap should handle lack of stap more gracefully)
|
||||
|
||||
* Mon Feb 17 2025 Miroslav Rezanina <mrezanin@redhat.com> - 9.1.0-15
|
||||
- kvm-migration-Add-helper-to-get-target-runstate.patch [RHEL-54670]
|
||||
- kvm-qmp-cont-Only-activate-disks-if-migration-completed.patch [RHEL-54670]
|
||||
|
Loading…
Reference in New Issue
Block a user