That clever-clever 'check the packages from the update were
installed' thing from yesterday breaks on kernel updates, as
they're installonly; after the update, the new version of the
package is installed, but the *old* version is too, and the way
I implemented the check, it treats that as a failure. Let's try
and handle this a somewhat-clever way (if this fails, I'm just
going to grep out lines with 'kernel' in them, as a *dumb* way).
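
This message doesn't spell out the 'somewhat-clever way', so the
following is only a guess at one possible shape for it, sketched
in Python rather than the test suite's own language: treat an
update package as fine so long as *one* of the installed packages
with that name has the update's NEVR. The function names and the
version strings are made up for illustration.

```python
from collections import defaultdict


def name_of(nevr):
    # Package names may contain hyphens, so strip the version and
    # release fields from the right-hand side.
    return nevr.rsplit("-", 2)[0]


def missing_update_packages(update_nevrs, installed_nevrs):
    """Return update NEVRs whose name is installed, but never at that NEVR.

    An installonly package (e.g. kernel) may be installed several times;
    it only counts as missing if none of the copies match the update.
    """
    installed_by_name = defaultdict(set)
    for nevr in installed_nevrs:
        installed_by_name[name_of(nevr)].add(nevr)
    return sorted(
        nevr for nevr in update_nevrs
        if installed_by_name[name_of(nevr)]
        and nevr not in installed_by_name[name_of(nevr)]
    )


# Hypothetical data: old and new kernel installed side by side.
installed = ["kernel-4.16.0-1.fc28", "kernel-4.17.0-1.fc28"]
print(missing_update_packages(["kernel-4.17.0-1.fc28"], installed))
```

With this approach the old kernel sitting alongside the new one
is no longer counted as a failure.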
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This should fix log collection when a French or Japanese test
fails before the test itself would have done this.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
If an update test fails before reaching advisory_post, we don't
generate the 'what update packages were installed' and 'were
any update packages *not* installed when they should have been'
logs, but these may well be useful for diagnosing the failure -
so let's also do the same stuff there. Only let's not do it all
twice.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We hit an interesting case in update testing recently:
https://bodhi.fedoraproject.org/updates/FEDORA-2018-115068f60e
An earlier version of that update failed testing. When we dug
into it a bit, we found that the test was failing because an
earlier version of the `pki-server` package was installed than
the version that was in the update; when asked (as part of
FreeIPA deployment) to install it, dnf had noticed that there
were dependency issues with the version of the package from the
update, but it happened to be able to install the version from
the frozen 'stable' repo...so it just went ahead and did that.
In this case, the 'missed' package resulted in a test failure,
but it'd actually be possible for this to happen and the test
to complete; we really ought to notice when this happens, and
treat it as a test failure.
So what this attempts to do is: at the end of all update tests,
check for all installed packages with the same name as a package
from the update, and compare their full NEVR to the one of the
package from the update. If a package with the same name as one
of the update packages is installed, but does not appear to be
the *same NEVR*, we fail, and upload the lists of packages for
manual investigation as to what the heck's going on.
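
A rough sketch of the comparison described above, in Python rather
than the test suite's own language. `find_mismatches` and the
version strings are invented for illustration; the real test
gathers both lists from the system under test.

```python
def split_nevr(nevr):
    """Split 'name-[epoch:]version-release' into (name, 'version-release').

    Package names may contain hyphens, so split from the right.
    """
    name, version, release = nevr.rsplit("-", 2)
    return name, f"{version}-{release}"


def find_mismatches(update_nevrs, installed_nevrs):
    """Return installed NEVRs that share a name with an update package
    but are not exactly one of the update's NEVRs."""
    update_names = {split_nevr(nevr)[0] for nevr in update_nevrs}
    wanted = set(update_nevrs)
    return sorted(
        nevr for nevr in installed_nevrs
        if split_nevr(nevr)[0] in update_names and nevr not in wanted
    )


# Hypothetical data: the stable pki-server got installed instead of
# the one from the update.
update = ["pki-server-2.0-1.fc28", "pki-base-2.0-1.fc28"]
installed = ["pki-server-1.9-1.fc28", "pki-base-2.0-1.fc28"]
mismatches = find_mismatches(update, installed)
if mismatches:
    print("update packages not installed as expected:", mismatches)
```

A non-empty result is what would trigger the failure and the
upload of the package lists.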
Signed-off-by: Adam Williamson <awilliam@redhat.com>
There's really no point having separate error and error_report
needles. Just match on error_report as well as clicking on it.
Also add a new error_report needle for latest Rawhide fonts.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Sometimes we get a test failing because the SUT isn't connecting
to the network for some reason. In this case we never get any
logs, because `upload_logs` relies on being able to reach at
least the worker host system via the network.
This attempts to detect when we can't ping the worker host, and
in that case, send some info out over the serial line instead.
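
The fallback logic is roughly as below; a minimal Python sketch,
not the actual test code. All the parameter names are hypothetical
stand-ins for 'can we ping the worker host', 'do the normal
upload' and 'write to the serial line'.

```python
def collect_logs(host_reachable, upload, write_serial, info):
    """Upload logs normally, or fall back to the serial line.

    host_reachable() -> bool: does the worker host answer a ping?
    upload(name, text): the normal upload over the network.
    write_serial(text): write text to the serial console.
    info: dict mapping a log name to its text.
    """
    if host_reachable():
        for name, text in info.items():
            upload(name, text)
        return "uploaded"
    # No network: dump what we have where the worker can still see it.
    for name, text in info.items():
        write_serial(f"--- {name} ---\n{text}\n")
    return "serial"
```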
Signed-off-by: Adam Williamson <awilliam@redhat.com>
That whole creaky edifice of conditionals that figured out how
many times to press 'down' was a mess I always hated, and I just
found out that the fix for BLS wasn't complete - I'd assumed in
writing it that systems weren't being migrated to BLS on upgrade
to F30, but actually they are. This makes that design very hard
as we'd have had to find a way to change the number of 'down'
presses part-way through update tests, and all the ways I can
think of to do that would've made this even sillier.
Happily I managed to come up with what looks like a much simpler
approach: just go from the bottom. It seems that in every setup
I can think of to check - all three arches, BLS, no BLS, pre-
install, post-install - the linux line is two lines up from the
bottom of the config stanza (the last line is blank, and the
last line but one is the initramfs line). So we can just press
down 50 times (to make damn sure we're at the bottom) then press
up twice and we should be in the right place, no matter the arch,
the release, or if BLS is in use or not. Whew.
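
To illustrate why counting from the bottom is layout-independent,
here's a small Python simulation. It assumes the editor cursor
just stops at the last line if you keep pressing down; the stanza
contents are placeholders, not real grub configs.

```python
def line_after_keys(lines, downs, ups):
    """Return the line the cursor is on after pressing down, then up.

    The cursor starts on the first line and is clamped to the stanza.
    """
    last = len(lines) - 1
    pos = min(last, downs)
    pos = max(0, pos - ups)
    return lines[pos]


# Placeholder stanzas with different numbers of lines above 'linux'.
long_stanza = ["line a", "line b", "line c", "line d",
               "linux <kernel and params>", "initrd <initramfs>", ""]
short_stanza = ["line a", "linux <kernel and params>",
                "initrd <initramfs>", ""]

for stanza in (long_stanza, short_stanza):
    print(line_after_keys(stanza, downs=50, ups=2))
```

Both stanzas land on the linux line, however many lines come
before it.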
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This bug is breaking all update FreeIPA tests; until the updates
go stable, let's pull them in to update tests so the results
are useful.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
The one we were using before doesn't seem to exist any more in
Rawhide. /etc/os-release should be fine.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Now the BLS stuff is enabled in Rawhide, we need to press 'down'
a different number of times to reach the 'linux' line when
editing the boot params (I really, really wish there was a
better way to do this :<). It gets tricky as there are all sorts
of cases here (support_server tests use a CURRREL disk image,
and then there's upgrade tests)...I think this covers things for
now.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Since a recent sssd update, console login during FreeIPA tests
is taking unusually long. We don't want this to fail all the
tests, so let's extend the timeout, but with a soft fail.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Somehow, recently, FreeIPA tests are running into Firefox not
quitting because it's showing a warning about closing multiple
tabs. (I think we didn't *get* multiple tabs before but now we
do, for some reason). So let's work around this by clicking
"Close tabs" if the warning appears.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
For some reason, in recent tests, switching to a console after
live install completes is taking a long time, and tests are
failing because we 'only' allow 10 seconds for the login prompt
to appear. This seems to indicate some kind of performance bug,
but we don't really want all liveinst tests to fail on it; this
is not primarily a performance testing framework. So let's
tweak the root_console / console_login bits a bit to allow a
configurable timeout for the login prompt to appear, and use
that to wait 30 secs instead of 10 in this case.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
In recent Rawhide, it seems the Workstation live session runs on
tty2 not tty1 for some reason. This throws off anacondatest
root_console, which assumes there'll be a vt on tty2. Handle it
by using tty3 instead if we're in a GNOME live environment.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Looking at this, it's a bit weird: the updated packages are
actually included in the upgrade process, but we still run
_advisory_update, which does basically nothing...then reboots.
That's kinda silly and makes the tests a bit flaky, let's fix
it. I don't think there's actually any problem with doing the
upload of updatepkgs.txt in _repo_setup_updates, because that
already guards against being run more than once, it just bails
very early if it's already been run.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
There seems to be a bug in Rawhide lately where, when our tests
want to install a bare X and run Firefox on it, this takes an
unusually long time to start up, with SELinux in enforcing mode.
With SELinux in permissive mode it starts as fast as usual. This
isn't a hard failure and we don't want it to block all later
tests, so let's handle it and treat it as a soft fail.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
OK, we now need to work around this goddamn grub bug in *three*
places, so let's stop copying the loop around and factor it out
instead. The third place is encrypted installs, as they wait
for the decryption prompt on boot.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Per Neal Gompa, boot will proceed if we just page through the
error(?) messages displayed when #1618928 happens, so let's do
that to let the tests get further and see what else is broken.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
It seems that for some reason the localized layout gets loaded
on the installer VTs by this point in time, so we need to load
'us' again for this complex command to work.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Sometimes on aarch64 clicking the partition scheme drop-down
just doesn't seem to make the menu appear, instead the button
goes active but that's all. It's very unlikely we'll be able
to track down why as this doesn't happen in manual testing on
aarch64 (according to @pwhalen), so instead let's just work
around it.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Upstream is gonna change the default from 30 to 0, it seems:
https://github.com/os-autoinst/os-autoinst/pull/965
so let's go ahead and change these two cases where we have no
explicit timeout to have one.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
The reason we have all this horrible code to use the commented-
out baseurl lines in the repo files instead of the metalinks
that are usually used is a timing issue with the metalink
system. As a protection against stale mirrors, the metalink
system sends the package manager a list of mirrors *and a list
of recent checksums for the repo metadata*. The package manager
goes out and gets the metadata from the first mirror on the
list, then checksums it; if the checksum isn't on the list of
checksums it got from mirrormanager, it assumes that means the
mirror is stale, and tries the next on the list instead.
The problem is that MM's list of checksums is currently only
updated once an hour (by a cron job). So we kept running into
a problem where, when a test ran just after one of the repos
had been regenerated, the infra mirror it's supposed to use
would be rejected because the checksum wasn't on the list - not
because the mirror was stale, but because it was too fresh: it
had got the new packages and metadata, but mirrormanager's list
of checksums hadn't been updated to include the checksum for the
latest metadata.
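
The mirror selection described above boils down to something like
this Python sketch. It is a simplification for illustration, not
the package manager's real code; the mirror names and checksums
are made up.

```python
def pick_mirror(mirrors, known_checksums, fetch_checksum):
    """Return the first mirror whose metadata checksum is on the list.

    mirrors: mirror names in the order the metalink gave them.
    known_checksums: recent metadata checksums from the metalink.
    fetch_checksum(mirror): checksum of that mirror's repo metadata.
    """
    for mirror in mirrors:
        if fetch_checksum(mirror) in known_checksums:
            return mirror
        # Checksum not on the list: assume the mirror is stale, move on.
    return None


# The infra mirror already has the regenerated repo, but the hourly
# job hasn't added the new checksum to the list yet.
on_mirror = {"infra": "sum-new", "public": "sum-old"}
known = {"sum-old", "sum-older"}
print(pick_mirror(["infra", "public"], known, on_mirror.get))
```

Here the infra mirror is skipped even though it is the most
up-to-date one, which is the timing problem.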
All this baseurl munging code was getting ridiculous, though,
what with the tests getting more complicated and errors showing
up in the actual repo files and stuff. It occurred to me that
instead of using the baseurl we can just use the 'mirrorlist'
system instead of 'metalink'. mirrorlist is the dumber, older
system which just provides the package manager a list of mirrors
and nothing else - the whole stale-mirror-detection-checksum
thing does not happen with mirrorlists, the package manager just
tries all the mirrors in order and uses the first that works.
And happily, it's very easy to convert the metalink URLs into
mirrorlist URLs, and it saves all that faffing around trying to
fix up baseurls.
Also, adjust upgrade_boot to do the s/metalink/mirrorlist/
substitution, so upgrade tests don't run into the timing issue
in the steps before the main repo_setup run is done by
upgrade_run, and adjust repo_setup_compose to sub this line out
later.
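
For illustration, the substitution amounts to something like the
Python below (the tests themselves run it on the repo files on the
system under test). The sample repo line is only meant to show the
typical form of a metalink entry, not copied from a specific file.

```python
def use_mirrorlist(repo_text):
    """Rewrite metalink entries in a .repo file's text to mirrorlist.

    This is a plain s/metalink/mirrorlist/ on the whole text, so it
    changes both the option name and the path in the URL.
    """
    return repo_text.replace("metalink", "mirrorlist")


sample = (
    "metalink=https://mirrors.fedoraproject.org/"
    "metalink?repo=fedora-$releasever&arch=$basearch\n"
)
print(use_mirrorlist(sample), end="")
```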
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Now F28 went stable, we're not disabling updates on upgrade any
more, and this bug got exposed: the location of the updates and
updates-testing repos actually changed between F27 and F28, so
the `baseurl` line from fedora-repos in F27 isn't correct for
F28. When doing an upgrade from < 28 to > 27, we need to correct
the URL when we're done installing stuff from the old release
repos but before we start trying to pull stuff from the new
release repos.
This repo munging crap is really getting fragile, it'd be great
if we could get that metadata timing issue resolved so we could
reliably use mirrormanager...
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This adds the FreeIPA server and client upgrade tests to a new
updates-server-upgrade flavor which fedora_openqa will schedule
for updates. This way, we can test whether updates break
FreeIPA upgrades, which is a request the FreeIPA team made to
me. This has been deployed on staging for the last week or so
and appears to work fine.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Since gnome-initial-setup-3.28.0-5.fc28, the g-i-s screens
that are supposed to be suppressed as part of
https://fedoraproject.org/wiki/Changes/ReduceInitialSetupRedundancy
are now suppressed on FAW installs as well as traditional ones.
So adjust the logic accordingly.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We were doing this in a post-install test, but not on failures.
We need it to figure out why Firefox is crashing on aarch64...
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Trying to keep track of what these magic numbers mean is really
getting messy, so let's do it a bit more explicitly, using the
page names g-i-s uses internally, and lots of comments. This
should make it clearer and more maintainable when stuff changes.
Signed-off-by: Adam Williamson <awilliam@redhat.com>