This uses a Python script which implements concurrent downloads
(via asyncio) to download workaround and update packages and
configure the repos. This should speed up the process for large
multi-package updates.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This effectively reverts 97618193 - but had to be done manually
and adjusted to maintain support for testing side tags and for
testing multiple tasks, since those features were added since
the update ISO change.
The 'scheduler injects ISOs of packages into the tests' approach
was intended to speed things up, especially for large updates,
and it did, but it had a few drawbacks. It means restarting
older tests from the web UI doesn't work as the ISOs get garbage
collected (you have to re-schedule in this case). And it has the
rather large problem that you can now only schedule tests from
the openQA server (or at least a machine with the openQA asset
share mounted), because the package download and ISO creation
just happen wherever the scheduler is running and assume that
the openQA asset share that will be used by the tests is at
/var/lib/openqa/share in that filesystem.
That's too big of a drawback to continue with this approach, IMO,
so this reverts back to the old way of doing things, with a bit
of refactoring to clean up the flow a little, and with support
for testing side tags and multiple tasks maintained.
As a follow-up I'm going to see if I can replace
_download_packages with a much more efficient downloader script
to mitigate the time this process takes on each test, especially
for large updates.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
The issue is not marked as fixed, but per the needle cleanup we
have not hit this workaround condition for a long time. I checked
past failures and it doesn't look like we're still hitting the
problem but the needle stopped matching, or anything.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Seems the bug might just be that plymouth got more sensitive to
fast typing, so instead of waiting, let's try slow typing.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
New Plymouth seems to have a bug where it shows the decryption
prompt briefly then shows a spinner and refreshes it, throwing
away any already-typed input. This is breaking our tests quite
often (any time os-autoinst is "lucky" enough to spot the first
brief appearance of the prompt and start typing). To work around
it, after we first see the prompt, wait for the screen to settle
and re-assert the needle before typing. This should reliably
wait out the refresh cycle.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Plasma 6's color chooser seems to have dropped the nice "basic
colors", so choosing black got harder. Let's try using the HTML
color input box thingy instead, and typing #000000, the HTML
color code for black.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This PR tries to respond to issue #294.
On Silverblue, this will try:
* flatpak install
* flatpak remote-add
* flatpak list
* flatpak remotes
* flatpak remove
* flatpak update
and also it tests that a flatpak can be built.
Currently, the installation via WebUI is mostly pushing the Next button
which seems to be ok for the production which is based in the US.
This PR makes openQA to select languages when the G-I-S runs
before Anaconda. The particular language is selected based on
the LANGUAGE variable.
The latest version of g-i-s grew a "Previous" button on the
final page, and hitting tab once now activates that, not the
Try button. shift-tab *should* get us to Try on both old and
new versions.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
webUI has been deferred to F40, so we need to expect the old UI
flow on F39 now. This should cover everything, I hope.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We're reverting webUI for Fedora 39.
https://bodhi.fedoraproject.org/updates/FEDORA-2023-73c4f1a802
is the update that implements this; adapt the tests to handle it
(by expecting the old flow when testing that update, and editing
the kickstart to drop anaconda-webui when building the live
image).
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We now have a fix for the bug on the new webUI flow where the
language and keyboard screens weren't skipped on the first boot
after install, as intended. language is never *really* skipped -
it just turns into welcome - but keyboard is now being skipped,
which messes up the logic here.
For a short time we need to handle both paths, to get the new
anaconda builds through and new composes built. In a couple of
days we can simplify this to just always assume keyboard will
be skipped on the first boot on Workstation live installs on
F39+.
Also drop handling of auth_required in g-i-s - I'm pretty sure
that bug got fixed years ago - and wait_still_screen for three
seconds on each page, to let animated transitions settle.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We just landed the webUI stuff for F39, so now we need these
conditionals to kick in for F39+, not F40+.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This is tailored to the initial deployment of webUI in
Workstation live images only; we may need to tweak flows and
approaches as webUI goes further.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Per https://bodhi.fedoraproject.org/updates/FEDORA-2023-5fd964c1bf#comment-3149533
we kinda need to do this to allow this update through, so long as
we're not going to have dnf obsolete dnf5 or anything like that.
It's a bit unfortunate but I don't see an alternative.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We're seeing a lot of tests fail on 404s when trying to access
the koji-rawhide repo (the repo for the Rawhide build tag, which
we use to get packages tagged since the last compose. nirik is
trying to figure this out from the server end, but for now at
least, let's mark the repo as skip_if_unavailable. This should
mean that if we hit a 404, the test will continue, it just won't
have access to the packages from that repo. Occasionally this
will cause a problem - a false failure or false pass - but this
still seems better than every test that hits it failing. The
false pass case is the most concerning, but I would hope in that
case some other tests from the same update would fail, making it
not an issue.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Per nirik, 'repos/rawhide' is just a symlink to 'repos/fXX-build'
and this could possibly be part of our 404 problem. So let's
try using fXX-build directly instead of the symlink.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This repo gets regenerated a lot, so we should be very aggressive
about metadata expiry. Hopefully this will forestall most of
the cases where we get a 404 trying to access this repo's
metadata files.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We're still getting failures from pagure.io even with the retry
stuff. nirik asked us to add this to help figure out what the
heck is going on there.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This tweaks all pagure.io downloads to be retried a few times,
since we seem to be getting failures quite often. We use curl
for this as it has nice options for it.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This requires a change in the package we use for base_update_cli
because pandoc-common is not in ELN.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We drop the line for the update ISO from /etc/fstab before
uploading the image after the cockpit_default test, but we don't
make sure it's set up again before Cockpit tries to use it, in
the subsequent Cockpit tests. I don't know why this didn't fail
on stg before, but it sure as hell is failing in prod...
Signed-off-by: Adam Williamson <awilliam@redhat.com>
I'm attempting a new approach to the update and workaround repos.
Instead of having each update test recreate them for itself -
which is slow and wastes bandwidth - the dispatcher will create
an ISO at test schedule time and pass it as ISO_2. Then the test
just mounts the ISO. This makes the necessary adjustments on the
test side.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Drop the e2fsprogs scratch build workaround we were using for
https://github.com/fedora-silverblue/issue-tracker/issues/470 -
with the new 'use a custom ref and rebase to the official ref'
thing I implemented for update ostree tests, it shouldn't be
necessary any longer.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
I think this is new behaviour in rpm 4.19, or else we would have
run into this before when testing an s390utils update. But it's
easy enough to handle.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
F36 is now EOL, and from F37 onwards, grub is the bootloader in
any situation where it actually matters to do_bootloader (which
is only when we're editing parameters). We do still use syslinux
in the PXE tests on x86_64 BIOS, but we don't edit the parameters
in that case.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
https://github.com/storaged-project/udisks/issues/1102 - udisks2
seems to have a bug where it leaves filesystems mounted at a
"temporary" mount point after creating them. We need to work
around this when it happens or else we'll frequently get test
failures.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This adds a new script - cleanup-needles.py - to use for cleaning
up old needles. It has to be used in conjunction with a database
query; the comment at the top explains how to do that query and
export the needed information. It produces a git commit with
needles that haven't matched since a certain date (specified in
the sql query) removed, subject to a 'keeplist' of needles we
keep even if they seem to be old.
I also enhanced check-needles.py to check for cases where tests
seem to be trying to match a tag we have no needles for. This
was necessary to find cases where the cleanup script was too
aggressive (i.e. the things that wound up in the 'keeplist'),
but it also turned out to find quite a lot of cases where the
code really *was* looking for a needle that had gone in a
previous cleanup or that never existed; the commits before this
one clean up a lot of those cases.
The code to decide which string literals are needle tags is
pretty dumb and hacky and needs some manual cueing sometimes -
that's what the `# testtag` changes in this commit are for.
Making it smarter would probably require this script to get a
lot more complicated and either incorporate or become a
tokenizer, which I don't really want to do.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
It looks like neither of these has been a problem for some time.
The notification needle has not matched for a year. The akonadi
needle doesn't exist any more - it was cleaned up in the 2021
needle cleanup, meaning it hadn't matched for weeks in 2021. I
checked the last several months of KDE app start/stop tests and
don't see any case where there was a stray notification that we
missed. So I think we can just ditch this whole mechanism for
now; if we have problems with these notifications again in future
we can put it back.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
For several releases now, the 'new user mode' of g-i-s is just
gone: https://gitlab.gnome.org/GNOME/gnome-initial-setup/-/issues/12
so this whole path where we used to be able to set up Japanese
input methods on first boot after install doesn't work any more.
We had to set up a whole different route to set the input method
via control center instead (which lives in _graphical_input).
This block is never reached any more, and the needles for it were
cleaned up in 2021.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Since 022865ab we do not use start_with_launcher on KDE any more.
The needle for it has since got lost in an unused needle
cleanup. Let's just drop this branch for now; we can add it back
if we ever need it again.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
The needle that backs this workaround was dropped in the 2021
needle cleanup, so it's never worked since then. I checked, and
no aarch64 tests seem to be failing on this any more.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Upstream https://github.com/os-autoinst/openQA/pull/4973 requires
us to poke things here a bit. This only works with the newer
os-autoinst and openQA (there may be a way to conditionalize it
to work with both, but I can't be bothered figuring it out, let's
just update).
Signed-off-by: Adam Williamson <awilliam@redhat.com>
There's a fairly longstanding issue in GDM where switching layout
just doesn't work sometimes:
https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/6066#note_1707051
there doesn't seem to be any progress on getting this fixed, and
it's annoying constantly restarting tests that fail on it. So
this just makes us try three times to switch before giving up.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
In install_repository_hd_variation , we have two disks attached,
so when we reach the INSTALLATION DESTINATION screen we expect
we have to select the correct target disk. However, as of the
most recent anaconda build in Rawhide, anaconda realizes we're
using a file from the other disk as our install repo, and
'protects' it - i.e. it does not show it on the INSTALLATION
DESTINATION screen, meaning only one disk is shown there, and
when only one disk is shown, it's pre-selected. So when we click,
we actually *un*select it, and the test fails.
Fixing this is a bit awkward; I wanted to add a new variable,
ANACONDA_PROTECTED_DISKS or something, and subtract that from
the NUMDISKS count; but we can't really do that as the enhanced
protection isn't in F38, and we can't easily set variables
differently on different releases (we'd have to set them in the
scheduler code, not just put them in the templates). So we just
code in a doofy condition for this instead. Maybe when F38 is
EOL we can change to the variable approach.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
An lvm2 issue which breaks the installer (#2180557) and anaconda
renaming the .desktop file for the Workstation live welcome
screen, which caused it not to appear -
https://pagure.io/livesys-scripts/pull-request/12 .
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Reasoning:
1. pandoc is not in critpath so will not itself be tested
2. pandoc is widely used and actively maintained
3. package is noarch
4. package has minimal deps
Hopefully this will work for everything. For some reason, the
"use python3-blivet for pykickstart tests" fails mysteriously
sometimes, see e.g.
https://openqa.stg.fedoraproject.org/tests/2672282
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Somehow, the dummy package being python3-kickstart causes the
graphical update tests (only) to fail for pykickstart updates
(that's the source package of python3-kickstart). The CLI and
Cockpit update tests are fine with this and pass.
To workaround this, use python3-blivet as the dummy package for
the graphical update tests when testing an update that contains
python3-kickstart. I've updated the test repo to contain both
dummy packages.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
So...there's an ffmpeg update:
https://bodhi.fedoraproject.org/updates/FEDORA-2023-a5e10b188a
which went stable. It includes new sonames of all the ffmpeg
libs. It also pulls in a thing called oneVPL, which has a bug
that breaks ostree composes.
There's a big kf5 update:
https://bodhi.fedoraproject.org/updates/FEDORA-2023-b086a98f78
which contains kf5-kfilemetadata, which is built against ffmpeg.
Neal made sure that update's build of it was built against the
new ffmpeg and submitted both for stable at once - but the tests
on the kf5 update failed because they weren't run against the
new ffmpeg as it wasn't yet stable, and the kf5 update was
ejected from the push because of the failed tests.
So now we have the ffmpeg update stable but not the
kf5-kfilemetadata rebuild for it, which will break KDE stuff,
and the oneVPL issue means ostree composes will all fail.
This adds the ffmpeg update as a workaround so we can re-run the
tests for the kf5 update and get them to pass so we can push it
stable. It also adds the oneVPL update as a workaround so ostree
compose tests don't start failing.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Move the xauth disablement and the disabling of studies into
_setup_browser, instead of repeating it in a couple of other
places (but *not* doing it in the zezere test, where we should
be doing it). Drop some explicit package installs that should
no longer be needed as Firefox and/or X.org now depend on those
things. Install the current default fonts (Noto), not the old
ones (DejaVu).
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Just using a scratch build for now as my fix hasn't been reviewed
and may have dumb mistakes in it, but it does seem to fix the
openQA test at least.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
A different way to address the same problem as 56936df7 . Let's
just *remove* the repo management packages after we're done
creating the repos. dnf will automatically remove the unused
dependencies too. This fixes the python-cryptography case at
least - I tested.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
There's this awkward path for the live image install tests on
updates. We run the 'are the correct versions of all the packages
installed' check on these tests to ensure the right versions
actually made it onto the live image. So we don't run
`dnf -y update` at the end of repo_setup_updates on that path,
because if we did that, even if the packages on the live image
were old, we'd update them there and hide the problem.
However, this causes a bit of an ordering issue, because in
order to set up the advisory repo, we need to install a few
packages. What if the update under test includes one of those
packages, or a dependency that wasn't already installed? In
that case, we wind up with the older stable version of the
package (because obviously we can't install the newer version
from the advisory repo *before we've set up the advisory repo*),
don't update it later, and so the 'correct version' check at
the end of the test fails. See:
https://openqa.fedoraproject.org/tests/1778707 for a case of
this happening with a python-cryptography update.
Up till now I was trying to handle this by just updating the
specific packages we install, but that doesn't account for
*dependencies* of them. I looked down the path of trying to
generate a list of all those dependencies and update all of
them but it looks a bit mad. So instead let's try this. On that
specific path, we'll generate the "all installed packages" list
*before* we run repo_setup, so it just doesn't include anything
that gets installed during repo_setup. The implementation is a
bit icky but not too horrible.
We *could* just *always* generate the all installed packages
list earlier, but then that would mean we *wouldn't* catch dep
issues in this kind of package on the other test paths, whereas
currently we do. I don't want to lose that.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This is to handle a temporary condition where the screen isn't
present on the KDE base disk images for F38 or F39 yet, so they
only see it on the second boot on update tests, but don't handle
it because we marked it as already 'done'.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
KDE has a welcome tour now, on F38 and Rawhide at least. Let's
"handle" it with extreme prejudice...
Signed-off-by: Adam Williamson <awilliam@redhat.com>
The F38 update that breaks this hasn't gone stable due to gating
and the F39 update will be pulled in once the tests are done
and it goes stable, but doing this anyway so I can re-run the
tests on the F38 plasma-workspace update and push it stable,
and rerun all the failed Rawhide tests without waiting for all
the tests on this update to finish first.
We still need to handle 43 only requiring one for now, and we
can't just make it release-dependent until 44 is stable for both
38 and Rawhide, so let's use a needle match temporarily. Only
44 has these eye/pencil icons on this screen.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
As we're not getting composes ATM this isn't being pulled into
tests of subsequent updates, but we need it to be or else there
are issues.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
With Rawhide updates, we quite often run into a situation where
a test runs after a *later* version of the package has already
gone stable. This even happens for stable releases too, though
less often. The current shell-based check just always fails on
this case, but it's usually OK, and manually marking every case
like this with an "it's OK!" comment gets tiring. Instead, let's
use a smarter Python script to do the check. We compare the EVR
of all installed update packages with the EVR of the package
from the update. If it's the same, fine. If the installed package
is lower-versioned, that's always an error, and we fail. If the
installed package is higher-versioned, we check whether the
update already went stable. If it did, then we soft fail, because
probably nothing can go wrong at this point (this is the usual
Rawhide case). If the update did not yet go stable, we still
hard fail, because something can go wrong in this case: if the
update *now* goes stable, the older version from the update may
be tagged over the newer version the test got (presumably from
current stable).
If anything goes wrong with the Bodhi check, or the test is
running on a task not an advisory, we treat both cases as fatal.
The script also gives easier-to-understand output than the old
approach, which should be a bonus.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Same conditions as used in main.pm to load the tests in the
normal flow. It makes no sense to do this on non-update tests,
or on the non-matching support server case.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This is safer if the advisory stuff was done on a previous test
run. Hilariously, this exposed a dumb mistake I made years ago
in installedtest.pm and never noticed before: the calls to
advisory_* at the bottom of that file are meant to be in the
post_fail_hook, but they weren't, which meant they got called
by the scheduler. This didn't cause any failures because the
first line caused them to return immediately based on a get_var
call (which it's OK to do in the scheduler), but changing it
to a script_run call (which it's *not* OK to do in scheduling)
caused all the tests to blow up immediately and confused me
*a lot* until I spotted this!
Signed-off-by: Adam Williamson <awilliam@redhat.com>
When I enabled _advisory_post for live and ostree install tests,
the point was to check that updated packages were included in
the install media and used during installation. We shouldn't run
a system update in _repo_setup_updates on this path because it
will hide the problem if the updated packages weren't included.
The INSTALL variable is for this purpose - it was previously
used to skip _advisory_post on the same path. At the same time
let's remove some stray settings of this var on non-update tests
as it serves no purpose there.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Per discussion at https://pagure.io/fedora-ci/general/issue/376
it really feels like this is the right thing to do. There are no
buildroot overrides for Rawhide, so we don't have to worry about
cross-pollution. The buildroot repo only contains builds that
have been tagged stable since the most recent Rawhide compose,
and thus will go into the next one. It makes sense to test later
updates against these. This avoids issues like:
https://openqa.fedoraproject.org/tests/overview?distri=fedora&version=38&build=Update-FEDORA-2022-30a952e331&groupid=2
where the tests of an update failed because it requires another
update which had been submitted and tagged stable previously, but
after the last compose.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Also use get_var("TEST") for installer_build - no point trying
to upload these logs for the other tests in the same flavor,
they won't be there.
Signed-off-by: Adam Williamson <awilliam@redhat.com>