The weird bug turned out to be caused by an internal DNS zone
in the new infra not being signed:
https://pagure.io/fedora-infrastructure/issue/9411
This is now resolved, so we can drop the workaround.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Revert "Regenerate grub.cfg for ppc64le Silverblue to boot (step 2)"
This reverts commit d384cfed30.
Revert "Regenerate grub.cfg for ppc64le Silverblue to boot, brc#1817004"
This reverts commit 8d7be9a227.
Not required anymore for f33 (only for f32)
And bad side effect for f33 (failure not analysed)
eg: https://openqa.stg.fedoraproject.org/tests/949783#step/_do_install_and_reboot/32
Keep correction to avoid warning in autoinst-log when ABRT var not defined.
Signed-off-by: Guy Menanteau <menantea@linux.vnet.ibm.com>
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
We've had this 'exception' for mcelog.service failing in here for
years. Looking into it, it seems to now be fixed:
https://bugzilla.redhat.com/show_bug.cgi?id=1526725
and hasn't happened in our official instances for years (I guess
because they're all Intel boxes). However, we have a similar case
on ppc64le with hcn-init.service failing spuriously:
https://bugzilla.redhat.com/show_bug.cgi?id=1894654
so I'm just converting it into a workaround for that instead. We
could wire this up to be more sophisticated, with some kind of
array or hash of services that are allowed to fail and more
complex checking code, but let's not bother unless/until it's
necessary.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
So, there's a problem with how we figure out the NetworkManager
connection to use in setup_tap_static: it expects the first
connection in the list to be the right one, but this is only
actually true so long as it's *active*. When we're in the tap
case, it's usually not going to actually *work* out of the box
on boot (or else we wouldn't need setup_tap_static at all...),
so some time after boot, NetworkManager gives up on it and marks
it as inactive. And after that, setup_tap_static won't work any
more.
I never noticed this as a problem before because usually we do
setup_tap_static before that point. But it seems in the vnc
client tests, on aarch64, desktop boot and login is slow enough
that by the time we switch to a VT and try to setup the network,
we're very close to that cutoff, and sometimes miss it.
This, I hope, avoids the problem by doing the network setup in
that test before we deal with the desktop login, then doing the
desktop login, then doing the actual VNC bits.
The alternative here would be to figure out a better way to do
setup_tap_static, but I can't.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Seems like this often fails when booting the desktop disk image
on aarch64 if we start typing right away.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
It seems like it can be *really* slow on aarch64, since 218:
https://github.com/cockpit-project/cockpit/issues/14840
this should give it a total of 180 seconds on aarch64 (90 second
still screen timeout plus 30 second assert_screen timeout, with
1.5x scale).
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This sets us up to test the release-blocking aarch64 disk images
(Minimal, Server and Workstation). It also allows for testing
armhfp disk images on aarch64 worker hosts (though my testing of
that isn't going too well so far), and fixes the initial-setup
handling for a change upstream ('use password' is now the default
so we don't need to choose it). We rewire disk image deployment
test loading to work through the generic loader code rather than
using ENTRYPOINT, as it allows us to more gracefully handle
graphical (Workstation) vs. console (Server, Minimal), moving
the code for handling console initial-setup to a helper function
just like the code for gnome-initial-setup and having _console_
wait_login call it when appropriate. We also tweak desktop_vt a
bit because now we need to switch from a console running as test
to a desktop, which breaks the assumption that the highest
numbered session of user test is the desktop...
Signed-off-by: Adam Williamson <awilliam@redhat.com>
I noticed today that if we deploy FreeIPA with dnssec validation
enabled, dnf can't resolve dl.fedoraproject.org afterwards, which
is a problem because it means we wind up falling through to
random mirrors for metadata and package download once the server
is deployed, which can be slow and give old packages. This seems
to be why the server upgrade test on F33 is sometimes failing
because we get an older FreeIPA package on upgrade, even though
the newer one has been stable for a week.
It's difficult to pin down exactly where this bug is and fix it,
I've mailed some folks to try and work it out, but until that's
figured out, let's just disable dnssec validation.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We've been getting failures lately on the first page load, I
think because Firefox is getting even more grindy on startup. So
turn the 'sleep' into a 'wait_still_screen', extend another wait,
and tweak the 'browser' needle so it only matches after the
bookmark bar has loaded rather than as soon as half the chrome
appears. Also make all the wait_still_screens use similarity 45
for consistency (flashing cursor could be there on any of them).
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This check wasn't working, the test passed whatever wait_serial
found. This version suggested by defolos works, I checked.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This is the best option I can come up with to deal with #195.
Update notifications seem to have become transient in KDE lately
(even in F31 and F32, if I'm looking at these screenshots right).
This actually simplifies things a lot to do more or less the
same in the KDE and GNOME paths: open the 'permanent' store of
notifications (in GNOME you get to it by clicking on the clock,
in KDE via the systray) and then look for no notifications (live
path) or only an update notification (post-install path). We
only run this test for composes so we shouldn't need to worry
about anything older than F32, and I believe this should work
for KDE in F32 and F33. I left out click_unwanted_notifications
for now as I'm hoping it should be unnecessary, but we can add
it back in if necessary.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This does some of the things suggested by cheimes in
https://bugzilla.redhat.com/show_bug.cgi?id=1880628#c24 . It
seems to make the replica tests work with resolved, still work
with pre-F33 resolving, and not break anything. Also remove the
workaround to disable resolved if it's running, as we can now
work with it.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We don't need a separate 'welcome' needle because it just matches
on an OK button anyway. So turn that needle into an OK needle
(we don't have any existing 'blue OK button' needle) and simplify
the logic to a single loop for kde_ok and krusader_settings_close.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
On ppc64le it looks like this test is often failing because it
takes a second or two to update the partition list after we
click update settings, but we're not waiting for that, so we
wind up clicking in the wrong place because we match the next
partition needle before the list is refreshed but click after
it's refreshed. Let's hope these waits solve it.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Having systemd-resolved in use seems to cause problems for
FreeIPA servers:
https://bugzilla.redhat.com/show_bug.cgi?id=1880628
until the scripts are enhanced to do this or something, let's
disable it before server/replica deployment.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
ipa-replica-install already changes the DNS config to use the
local bind instance, we don't need to do this and it's actually
wrong (as it bypasses the local BIND we should use and uses
the VM host's DNS servers instead).
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Seems what they had before worked until systemd-resolved became
the default; now we need to make sure we do nmcli mod and then
bring the connection down and up, as we do in tapnet.pm. Writing
to resolv.conf is kinda "wrong" for resolved but I don't think
it really breaks anything so I think I'll just leave those bits
in until F32 goes EOL just in case they're still somehow needed
on F31 or F32.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
https://openqa.fedoraproject.org/tests/667693#step/desktop_browser/8
shows us matching on Save File when the window is in kind of a
borked state; we'd probably wind up clicking on Open with,
because by the time we click the content of the window will have
moved to where it's actually supposed to be...so let's try this
to slow it down a bit.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Getting some odd failures where the downloaded file doesn't show
up in the right place which I think might be due to over rapid
clicking here. Try and slow it down a bit.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This should work even if the ifcfg plugin is not present (hi,
CoreOS) or 'predictable' (har) network names are on.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
There's a complex bug in current Rawhide affecting the database
server test; it boils down to "deployment fails because LANG is
set to a locale for which the corresponding langpack is not
installed". As we know broadly what's going on there now, let's
work around it with a soft failure so we catch any later bugs in
the process.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
The time before the ssh key provision request goes through turns
out to be kinda unpredictable, so instead of just a hardcoded
wait then assuming it should succeed, let's do a loop-y retry
thing instead.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
The FreeIPA UI change that the previous commit adapted to is in
4.8.9. That's stable for Rawhide and F33 already, but still in
testing for F32, and won't go to F31. So we need to make the
change conditional on release number, and we also add the update
to workarounds for F32 so we don't have to do something awkward
while we wait for it to go stable.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
OTP field was moved into the last position in the password change dialog
to prevent issues with OTP code expiring while users enter their
passwords.
Signed-off-by: Alexander Bokovoy <abokovoy@redhat.com>
GNOME now also splits 'Restart...' and 'Power Off...' as KDE
does, so we need to tweak the conditional and add some needles.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
In Fedora 33, we generally no longer include a disk-based swap
partition by default (instead swap-on-ZRAM is used, see
https://fedoraproject.org/wiki/Changes/SwapOnZRAM ). This tweaks
our tests to account for that. In tests that aren't to do with
swap at all, we stop including a swap partition in order to be
closer to the default layout. We replace the old _no_swap blivet
and custom tests with _with_swap tests that, as the name implies,
*explicitly include* a swap partition, and adjust the postinstall
test to check the disk swap partition is there.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
30 seconds doesn't seem to be reliable enough. Let's try 60, if
that's not enough I'll try and think of something smarter.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Just repositioning the mouse appears not to be enough to prevent
the sesssion going idle any more, since the 20200731.n.0 compose.
Not sure what causes this, probably the kernel. Adding a space
keypress seems to help in both KDE and GNOME.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This is a bit complex to automate, because we cannot really use
the production Zezere server (provision.fedoraproject.org) as
the test case shows, as we'd have to solve authentication and
we also don't really want to constantly keep registering new
hosts to it that are going to disappear and never be seen again.
So, instead we'll do it by setting up our *own* Zezere, and
provisioning our IoT system in that. We run two tests. The
'ignition' test is the actual IoT 'device'; all it really does
is boot up, sit around, and wait to be provisioned. The 'server'
test first sets up a Zezere server, then logs into it, adds an
ssh key, claims the IoT device, provisions it, and connects to
it to create a special file which tells the 'ignition' test
everything worked and it can close out.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This is to make the infra folks happy, apparently using 10.0.x.x
and 10.1.x.x is causing conflicts since our actual infra network
uses those ranges too.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
It actually is supposed to be installed by default, so if it's
missing that's a bug. It's been added to comps now so it should
be there from now on.
Signed-off-by: Adam Williamson <awilliam@redhat.com>