Enable QXL device build

Enable building for ppc64le

Re-added Spice support

Don't remove slof.bin for ppc64le
This commit is contained in:
Eduard Abdullin 2025-03-27 03:25:00 +00:00 committed by root
commit 5adf4e6785
3 changed files with 285 additions and 2 deletions

View File

@ -0,0 +1,180 @@
From f479e7cd7cc4e48f7383c4ee78609bb7605b70c6 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Thu, 20 Feb 2025 08:24:59 -0500
Subject: [PATCH 1/3] migration: Fix UAF for incoming migration on
MigrationState
RH-Author: Peter Xu <peterx@redhat.com>
RH-MergeRequest: 342: migration: Fix UAF for incoming migration on MigrationState
RH-Jira: RHEL-69776
RH-Acked-by: Juraj Marcin <None>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/1] 65c90964461cf884167db18bedf62db7dc242573 (peterx/qemu-kvm)
On the incoming migration side, QEMU uses a coroutine to load all the VM
states. Inside, it may reference MigrationState on global states like
migration capabilities, parameters, error state, shared mutexes and more.
However there's nothing yet to make sure MigrationState won't get
destroyed (e.g. after migration_shutdown()). Meanwhile there's also no API
available to remove the incoming coroutine in migration_shutdown(),
avoiding it to access the freed elements.
There's a bug report showing this can happen and crash dest QEMU when
migration is cancelled on source.
When it happens, the dest main thread is trying to cleanup everything:
#0 qemu_aio_coroutine_enter
#1 aio_dispatch_handler
#2 aio_poll
#3 monitor_cleanup
#4 qemu_cleanup
#5 qemu_default_main
Then it found the migration incoming coroutine, schedule it (even after
migration_shutdown()), causing crash:
#0 __pthread_kill_implementation
#1 __pthread_kill_internal
#2 __GI_raise
#3 __GI_abort
#4 __assert_fail_base
#5 __assert_fail
#6 qemu_mutex_lock_impl
#7 qemu_lockable_mutex_lock
#8 qemu_lockable_lock
#9 qemu_lockable_auto_lock
#10 migrate_set_error
#11 process_incoming_migration_co
#12 coroutine_trampoline
To fix it, take a refcount after an incoming setup is properly done when
qmp_migrate_incoming() succeeded the 1st time. As it's during a QMP
handler which needs BQL, it means the main loop is still alive (without
going into cleanups, which also needs BQL).
Releasing the refcount now only until the incoming migration coroutine
finished or failed. Hence the refcount is valid for both (1) setup phase
of incoming ports, mostly IO watches (e.g. qio_channel_add_watch_full()),
and (2) the incoming coroutine itself (process_incoming_migration_co()).
Note that we can't unref in migration_incoming_state_destroy(), because
both qmp_xen_load_devices_state() and load_snapshot() will use it without
an incoming migration. Those hold BQL so they're not prone to this issue.
PS: I suspect nobody uses Xen's command at all, as it didn't register yank,
hence AFAIU the command should crash on master when trying to unregister
yank in migration_incoming_state_destroy().. but that's another story.
Also note that in some incoming failure cases we may not always unref the
MigrationState refcount, which is a trade-off to keep things simple. We
could make it accurate, but it can be an overkill. Some examples:
- Unlike most of the rest protocols, socket_start_incoming_migration()
may create net listener after incoming port setup sucessfully.
It means we can't unref in migration_channel_process_incoming() as a
generic path because socket protocol might keep using MigrationState.
- For either socket or file, multiple IO watches might be created, it
means logically each IO watch needs to take one refcount for
MigrationState so as to be 100% accurate on ownership of refcount taken.
In general, we at least need per-protocol handling to make it accurate,
which can be an overkill if we know incoming failed after all. Add a short
comment to explain that when taking the refcount in qmp_migrate_incoming().
Bugzilla: https://issues.redhat.com/browse/RHEL-69775
Tested-by: Yan Fu <yafu@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20250220132459.512610-1-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
(cherry picked from commit d657a14de5d597bbfe7b54e4c4f0646f440e98ad)
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 40 ++++++++++++++++++++++++++++++++++++++--
1 file changed, 38 insertions(+), 2 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 999d4cac54..aabdc45c16 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -115,6 +115,27 @@ static void migration_downtime_start(MigrationState *s)
s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
}
+/*
+ * This is unfortunate: incoming migration actually needs the outgoing
+ * migration state (MigrationState) to be there too, e.g. to query
+ * capabilities, parameters, using locks, setup errors, etc.
+ *
+ * NOTE: when calling this, making sure current_migration exists and not
+ * been freed yet! Otherwise trying to access the refcount is already
+ * an use-after-free itself..
+ *
+ * TODO: Move shared part of incoming / outgoing out into separate object.
+ * Then this is not needed.
+ */
+static void migrate_incoming_ref_outgoing_state(void)
+{
+ object_ref(migrate_get_current());
+}
+static void migrate_incoming_unref_outgoing_state(void)
+{
+ object_unref(migrate_get_current());
+}
+
static void migration_downtime_end(MigrationState *s)
{
int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
@@ -821,7 +842,7 @@ process_incoming_migration_co(void *opaque)
* postcopy thread.
*/
trace_process_incoming_migration_co_postcopy_end_main();
- return;
+ goto out;
}
/* Else if something went wrong then just fall out of the normal exit */
}
@@ -837,7 +858,8 @@ process_incoming_migration_co(void *opaque)
}
migration_bh_schedule(process_incoming_migration_bh, mis);
- return;
+ goto out;
+
fail:
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
@@ -854,6 +876,9 @@ fail:
exit(EXIT_FAILURE);
}
+out:
+ /* Pairs with the refcount taken in qmp_migrate_incoming() */
+ migrate_incoming_unref_outgoing_state();
}
/**
@@ -1875,6 +1900,17 @@ void qmp_migrate_incoming(const char *uri, bool has_channels,
return;
}
+ /*
+ * Making sure MigrationState is available until incoming migration
+ * completes.
+ *
+ * NOTE: QEMU _might_ leak this refcount in some failure paths, but
+ * that's OK. This is the minimum change we need to at least making
+ * sure success case is clean on the refcount. We can try harder to
+ * make it accurate for any kind of failures, but it might be an
+ * overkill and doesn't bring us much benefit.
+ */
+ migrate_incoming_ref_outgoing_state();
once = false;
}
--
2.39.3

View File

@ -0,0 +1,89 @@
From 804d6d4fe3e5fde1fe5e21b5c4a33b6f279fe407 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Daniel=20P=2E=20Berrang=C3=A9?= <berrange@redhat.com>
Date: Fri, 6 Dec 2024 11:45:24 +0000
Subject: [PATCH 2/3] scripts: improve error from qemu-trace-stap on missing
'stap'
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Daniel P. Berrangé <berrange@redhat.com>
RH-MergeRequest: 346: scripts: improve error from qemu-trace-stap on missing 'stap'
RH-Jira: RHEL-83535
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/2] 2b2425b5c8cd8d59e9d8d4b228d2d97156bba510 (berrange/centos-src-qemu)
If the 'stap' binary is missing in $PATH, a huge trace is thrown
$ qemu-trace-stap list /usr/bin/qemu-system-x86_64
Traceback (most recent call last):
File "/usr/bin/qemu-trace-stap", line 169, in <module>
main()
File "/usr/bin/qemu-trace-stap", line 165, in main
args.func(args)
File "/usr/bin/qemu-trace-stap", line 83, in cmd_run
subprocess.call(stapargs)
File "/usr/lib64/python3.12/subprocess.py", line 389, in call
with Popen(*popenargs, **kwargs) as p:
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.12/subprocess.py", line 1026, in {}init{}
self._execute_child(args, executable, preexec_fn, close_fds,
File "/usr/lib64/python3.12/subprocess.py", line 1955, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'stap'
With this change the user now gets
$ qemu-trace-stap list /usr/bin/qemu-system-x86_64
Unable to find 'stap' in $PATH
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20241206114524.1666664-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 9976be3911a2d0503f026ae37c17077273bf30ee)
---
scripts/qemu-trace-stap | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/scripts/qemu-trace-stap b/scripts/qemu-trace-stap
index eb6e951ff2..e983460ee7 100755
--- a/scripts/qemu-trace-stap
+++ b/scripts/qemu-trace-stap
@@ -56,6 +56,7 @@ def tapset_dir(binary):
def cmd_run(args):
+ stap = which("stap")
prefix = probe_prefix(args.binary)
tapsets = tapset_dir(args.binary)
@@ -76,7 +77,7 @@ def cmd_run(args):
# We request an 8MB buffer, since the stap default 1MB buffer
# can be easily overflowed by frequently firing QEMU traces
- stapargs = ["stap", "-s", "8", "-I", tapsets ]
+ stapargs = [stap, "-s", "8", "-I", tapsets ]
if args.pid is not None:
stapargs.extend(["-x", args.pid])
stapargs.extend(["-e", script])
@@ -84,6 +85,7 @@ def cmd_run(args):
def cmd_list(args):
+ stap = which("stap")
tapsets = tapset_dir(args.binary)
if args.verbose:
@@ -96,7 +98,7 @@ def cmd_list(args):
if verbose:
print("Listing probes with name '%s'" % script)
- proc = subprocess.Popen(["stap", "-I", tapsets, "-l", script],
+ proc = subprocess.Popen([stap, "-I", tapsets, "-l", script],
stdout=subprocess.PIPE,
universal_newlines=True)
out, err = proc.communicate()
--
2.39.3

View File

@ -158,7 +158,7 @@ Obsoletes: %{name}-block-ssh <= %{epoch}:%{version} \
Summary: QEMU is a machine emulator and virtualizer
Name: qemu-kvm
Version: 9.1.0
Release: 15%{?rcrel}%{?dist}%{?cc_suffix}.alma.1
Release: 16%{?rcrel}%{?dist}%{?cc_suffix}.alma.1
# Epoch because we pushed a qemu-1.0 package. AIUI this can't ever be dropped
# Epoch 15 used for RHEL 8
# Epoch 17 used for RHEL 9 (due to release versioning offset in RHEL 8.5)
@ -497,6 +497,10 @@ Patch152: kvm-iotests-Add-filter_qtest.patch
Patch153: kvm-iotests-Add-qsd-migrate-case.patch
# For RHEL-54670 - Provide QMP command for block device reactivation after migration [rhel-10.0]
Patch154: kvm-iotests-Add-NBD-based-tests-for-inactive-nodes.patch
# For RHEL-69776 - [rhel10]Guest crashed on the target host when the migration was canceled
Patch155: kvm-migration-Fix-UAF-for-incoming-migration-on-Migratio.patch
# For RHEL-83535 - [Qemu RHEL-10] qemu-trace-stap should handle lack of stap more gracefully
Patch156: kvm-scripts-improve-error-from-qemu-trace-stap-on-missin.patch
# AlmaLinux Patch
Patch2001: 2001-Add-ppc64-support.patch
@ -662,6 +666,7 @@ This package provides documentation and auxiliary programs used with %{name}.
%package tools
Summary: %{name} support tools
Recommends: systemtap-client
%description tools
%{name}-tools provides various tools related to %{name} usage.
@ -1613,12 +1618,21 @@ useradd -r -u 107 -g qemu -G kvm -d / -s /sbin/nologin \
%endif
%changelog
* Tue Feb 18 2025 Eduard Abdullin <eabdullin@almalinux.org> - 18:9.1.0-15.alma.1
* Thu Mar 27 2025 Eduard Abdullin <eabdullin@almalinux.org> - 18:9.1.0-16.alma.1
- Enable QXL device build
- Enable building for ppc64le
- Re-added Spice support
- Don't remove slof.bin for ppc64le
* Tue Mar 25 2025 Miroslav Rezanina <mrezanin@redhat.com> - 9.1.0-16
- kvm-migration-Fix-UAF-for-incoming-migration-on-Migratio.patch [RHEL-69776]
- kvm-scripts-improve-error-from-qemu-trace-stap-on-missin.patch [RHEL-83535]
- kvm-Recommend-systemtap-client-from-qemu-tools.patch [RHEL-83535]
- Resolves: RHEL-69776
([rhel10]Guest crashed on the target host when the migration was canceled)
- Resolves: RHEL-83535
([Qemu RHEL-10] qemu-trace-stap should handle lack of stap more gracefully)
* Mon Feb 17 2025 Miroslav Rezanina <mrezanin@redhat.com> - 9.1.0-15
- kvm-migration-Add-helper-to-get-target-runstate.patch [RHEL-54670]
- kvm-qmp-cont-Only-activate-disks-if-migration-completed.patch [RHEL-54670]