This is a surprisingly large change as we want to go back to
the console we were previously on after doing it. To do that we
need to know what console we were on, and to know *that*, we need
to port everything that currently uses (ctrl-)alt-fX to switch
consoles to use select_console instead.
This is primarily intended to make running setup_repos.py faster
when it has to download a lot of packages (as typing in hundreds
of package names is quite slow). But it actually makes the whole
thing faster, even when only downloading one or two packages.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Same conditions as used in main.pm to load the tests in the
normal flow. It makes no sense to do this on non-update tests,
or on the non-matching support server case.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This is safer if the advisory stuff was done on a previous test
run. Hilariously, this exposed a dumb mistake I made years ago
in installedtest.pm and never noticed before: the calls to
advisory_* at the bottom of that file are meant to be in the
post_fail_hook, but they weren't, which meant they got called
by the scheduler. This didn't cause any failures because the
first line caused them to return immediately based on a get_var
call (which it's OK to do in the scheduler), but changing it
to a script_run call (which it's *not* OK to do in scheduling)
caused all the tests to blow up immediately and confused me
*a lot* until I spotted this!
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Also use get_var("TEST") for installer_build - no point trying
to upload these logs for the other tests in the same flavor,
they won't be there.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
It seems to time out a lot on lab but not on prod, for some
reason. Let's just give it a little longer.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
When we run Firefox directly on X lately, we often hit a bug
where X just suddenly exits in the middle of doing stuff in
Firefox. I'm not sure if this is a bug in X or in Firefox (if
Firefox crashed, X would immediately exit). Let's see if this
helps get any info on what's going on with Firefox.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
On Rawhide update tests we often don't seem to get to the login
prompt in 10 seconds, so tweak the code a bit to let us specify
a timeout in root_console, and use 30 seconds here.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
When we're logging via the serial console when a test fails and
no network is available, we only log the journal from the current
boot. But we might well need to see messages from previous boots.
So let's just log the whole journal.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Remove a bunch of needles that have not been used for some time,
plus a few workarounds that are similarly stale.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Inspired by openQA's 01-compile-check-all.t, this adds a perl
test which checks the syntax of main.pm and all lib and test
files, and hooks it up to CI. Requires os-autoinst and
perl-Test-Strict.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
There is nothing inherently 'root'-y about these so it makes no
sense to prefix their names with 'root-'. And why change from
'console' to 'terminal' compared to the naming used in the
actual qemu command and the log files? It's just confusing.
Let's be consistent (except for using - instead of _ here...
but - is easier to type!)
Signed-off-by: Adam Williamson <awilliam@redhat.com>
The log files are all under the ostree deploy root, the 'real'
system root has nothing useful. Try and find the deploy root
and prepend it to all the relevant commands if we're a CANNED
install.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
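For illustration, a minimal sketch of the "find the deploy root
and prepend it" idea described above. The helper names
`find_deploy_root` and `log_path` are hypothetical, and the
sketch assumes the conventional ostree layout of
`/ostree/deploy/<os>/deploy/<checksum>.<serial>`. It keys off
whether a deployment exists rather than off the CANNED variable,
so it is not necessarily how the test code itself does it.

```shell
# Hypothetical helper: print the first ostree deployment directory
# found under the given system root (default "/"); fail if none.
find_deploy_root() {
    sysroot="${1:-/}"
    # The trailing slash makes the glob match directories only, which
    # skips the "<checksum>.<serial>.origin" files next to them.
    for dir in "${sysroot%/}"/ostree/deploy/*/deploy/*/; do
        if [ -d "$dir" ]; then
            printf '%s\n' "${dir%/}"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: prefix a path with the deploy root when one
# exists, otherwise print the path unchanged.
log_path() {
    if root="$(find_deploy_root "$1")"; then
        printf '%s%s\n' "$root" "$2"
    else
        printf '%s\n' "$2"
    fi
}
```

With this, something like `tar czf /tmp/var_log.tar.gz "$(log_path / /var/log)"`
would pick up the logs from the deployment on an ostree system
and from the plain root elsewhere.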
This was necessary for debugging the FreeIPA 4.8 pre-release
update bug, so let's have it for all runs, just in case.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We're getting an intermittent case where FreeIPA tests fail
because of a web server certificate issue. Click 'Advanced' in
Firefox when this happens so we can get a bit more info on the
problem.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Just like the installer image build test, only...it builds a live
image. This involves reimplementing quite a chunk of the Koji
livemedia task. Ah, well. Also involves rethinking the flavor
names a bit here, these seem...better.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This adds a test which builds a netinst image potentially with
the package(s) from the update, and uploads that image. It also
adds a test which runs a default install using that image. This
is intended to check whether the update breaks the creation or
use of install images; particularly this will let us test
anaconda etc. updates. We also update the minimal disk image
name, as we have to make it bigger to accommodate this test,
and making it bigger changes its name - the actual change to
the disk image itself is in createhdds. We also have to redo a
bunch of installer needles for F28 fonts, after I removed them
a month or so back...
Signed-off-by: Adam Williamson <awilliam@redhat.com>
If a test fails to the dracut shell, we currently don't do
anything useful. This should recognize when that happens, and
upload rdsosreport.txt.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This should fix log collection when a French or Japanese test
fails before the test itself would have done this.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
If an update test fails before reaching advisory_post, we don't
generate the 'what update packages were installed' and 'were
any update packages *not* installed when they should have been'
logs, but these may well be useful for diagnosing the failure -
so let's also do the same stuff there. Only let's not do it all
twice.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Sometimes we get a test failing because the SUT isn't connecting
to the network for some reason. In this case we never get any
logs, because `upload_logs` relies on being able to reach at
least the worker host system via the network.
This attempts to detect when we can't ping the worker host, and
in that case, send some info out over the serial line instead.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
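A rough sketch of the fallback described above, as it might look
from a shell on the SUT. Everything named here is an assumption
for illustration: the helpers `log_channel` and
`send_over_serial`, the `/dev/ttyS0` default, and the Linux
iputils `ping -c 2 -W 5` probe. The real hook is Perl test code
and may differ.

```shell
# Hypothetical helper: print "network" if the worker host answers the
# probe, otherwise "serial". The probe command can be overridden.
log_channel() {
    host="$1"
    probe="${2:-ping -c 2 -W 5}"   # assumed: iputils ping, 5 s timeout
    # $probe is deliberately unquoted so its options split into words
    if $probe "$host" >/dev/null 2>&1; then
        echo network
    else
        echo serial
    fi
}

# Hypothetical helper: copy a file to the serial device between
# marker lines so it can be picked out of the serial log later.
send_over_serial() {
    file="$1"
    dev="${2:-/dev/ttyS0}"         # assumed serial device name
    {
        printf '%s\n' "--- BEGIN $file ---"
        cat "$file"
        printf '%s\n' "--- END $file ---"
    } > "$dev"
}
```

The marker lines are only there so a reader of the serial log
can tell where each file starts and stops.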
We were doing this in a post-install test, but not on failures.
We need it to figure out why Firefox is crashing on aarch64...
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Previous approach wouldn't work for tests that run after the
install test...let's just set a password from a chroot after
install completes. Don't really like this as it changes the
'real' install process a bit, but it's the least invasive short
term fix at least. We can maybe do something more sudo-y later
with a bit more thought.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
It's really INSTALL_NO_USER, not USER_LOGIN='false'. Also, we
need to make root_console work with no root password, sigh.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Committing without review as this is pretty trivial and I've
had it on staging for the last few days without issue. Just gets
us somewhat better info for debugging FreeIPA issues.
Summary:
This is to handle cases like #1414904, where the system boots
to emergency mode. We really need logs to try and debug this.
Test Plan:
Force a test to hit emergency mode somehow (right now
you can just run base_services_start on Rawhide over and over
until you hit #1414904, but there's probably an easier way; I
think systemd has a boot argument for choosing which target to
boot, for example) and check logs get uploaded. Also check this
doesn't break log upload for a 'normal' failure.
Reviewers: garretraziel_but_actually_jsedlak_who_uses_stupid_nicknames
Reviewed By: garretraziel_but_actually_jsedlak_who_uses_stupid_nicknames
Subscribers: tflink
Differential Revision: https://phab.qa.fedoraproject.org/D1103
Summary:
This adds a couple of new exporter modules, renames main_common
to utils (this is a better name: openSUSE's main_common is
functions used in main.pm, utils is what they call their module
full of miscellaneous commonly-used functions), and moves a
bunch of utility functions that were previously needlessly
implemented as instance methods in base classes into the
exporter modules. That means we can get rid of all the annoying
$self-> syntax for calling them.
We get rid of `fedorabase` entirely, as it's no longer useful
for anything. Other base classes keep the 'standard' methods
(like `post_fail_hook`) and methods which actually need to be
methods (like `root_console`, whose behaviour is different in
anacondatest and installedtest).
Test Plan:
Do a full test suite run and check everything lines
up. There should be no functional differences from before at all,
this is just a re-org.
Reviewers: jskladan, garretraziel_but_actually_jsedlak_who_uses_stupid_nicknames
Reviewed By: garretraziel_but_actually_jsedlak_who_uses_stupid_nicknames
Subscribers: tflink
Differential Revision: https://phab.qa.fedoraproject.org/D1080
The README looks pretty ugly on Pagure. So let's unwrap it.
Let's also move the function docs into the source files. We're
much more likely to keep them up to date that way, I think. We
should probably change over to proper perl POD documentation at
some point, but comments in-line are OK for now I think.
This should solve all those annoying "Failed to synchronize
cache for repo 'updates'" failures we've had: there's no need
for the 'updates' repository to be enabled when we've decided
we want the `repo_setup` changes to be made, and having it
enabled causes problems when we run right after the Rawhide
compose completes. We hit the awkward period where the rawhide
repo has been synced but mirrormanager has not been updated
with the new metadata checksums, so mirrormanager rejects the
metadata from dl.fp.o and DNF has to go out and hit other
mirrors until it finds one which hasn't synced yet. Since the
point of `repo_setup` is specifically to hack up the config so
we only use packages from the compose *anyway*, there's no
reason at all to worry about leaving 'updates' enabled and
nerfing it like we do 'fedora' and 'rawhide', we can just turn
it off.
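As a sketch only: "turning it off" for a repo definition amounts
to setting `enabled=0` in its .repo file. The helper name
`disable_repo_file` is hypothetical, and this is not necessarily
the command the test itself types.

```shell
# Hypothetical helper: flip every "enabled=1" line in one .repo file
# to "enabled=0". Keys such as "enabled_metadata" are left alone.
disable_repo_file() {
    repofile="$1"
    tmpfile="$(mktemp)" || return 1
    status=0
    sed 's/^enabled[[:space:]]*=[[:space:]]*1[[:space:]]*$/enabled=0/' \
        "$repofile" > "$tmpfile" || status=$?
    # Write back with cat so the file keeps its owner and permissions
    if [ "$status" -eq 0 ]; then
        cat "$tmpfile" > "$repofile" || status=$?
    fi
    rm -f "$tmpfile"
    return "$status"
}

# Assumed usage on a Fedora system:
#   disable_repo_file /etc/yum.repos.d/fedora-updates.repo
```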
Summary:
The current installedtest post_fail_hook assumes /var/tmp/abrt
exists at all, and dies if it doesn't, leading to no /var/log
upload. We can also avoid using openQA `script_output` - which
is annoyingly indirect and slow - by using this neat `test -n`
trick I found on SO. Let's also use it in the anacondatest
post_fail_hook to avoid uploading /var/tmp when it's empty
(which we currently do). This also drops the 0 arg from a few
more script_run calls, because it's safe to wait for the run
to complete and we should probably do so to avoid later typing
errors if the commands are slow.
Test Plan:
Cause both anaconda and installed tests to fail and
check the hooks work as intended. Maybe twiddle the failures to
ensure directories do and don't exist and/or have contents and
make sure things work OK. I've tested this to some degree and
I'm pretty sure it works right.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D1041
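The summary above doesn't spell out the `test -n` trick. One
common form of it, shown here as an assumption rather than as
the exact command used in the hooks, is to test whether `ls -A`
prints anything. The function name `dir_has_content` is made up
for the example.

```shell
# True only if the directory exists and has at least one entry.
# "ls -A" lists dotfiles too but leaves out "." and "..", and a
# missing directory just produces empty output once stderr is hidden.
dir_has_content() {
    test -n "$(ls -A "$1" 2>/dev/null)"
}

# Assumed usage: only archive the directory when there is something in it
#   dir_has_content /var/tmp/abrt && tar czf /tmp/abrt.tar.gz /var/tmp/abrt
```

Because the check also fails cleanly for a missing directory, it
covers both the "doesn't exist" and the "exists but empty" cases
in one short command.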
It's not always in minimal installs. This is a simple change
and needed to make the post-fail hook work for minimal installs,
so pushing without review.
Summary:
os-autoinst implements `script_run` itself now, we aren't
required to implement it ourselves any more. os-autoinst's
implementation is better than ours, as it allows for verifying
the script actually ran (via the redirect-output-to-serial-
console trick).
So this drops our implementation so we'll just use the upstream
one. Where I judged we don't want to bother with the 'check
the command actually ran' feature I've adjusted our direct
`script_run` calls to pass a wait time of 0, which skips the
'wait for command to run' stuff entirely and just does a simple
'type the string and hit enter'.
Because of how the inheritance works, our `assert_script_run`
calls already used the os-autoinst `script_run`, rather than
the one from our distribution.
This should prevent `prepare_test_packages` sometimes going
wrong right after removing the python3-kickstart package, as
we'll properly wait for that removal to complete now
(previously we didn't: we'd just start typing the next command
while it was still running, which could result in lost
keypresses).
Test Plan:
Check all tests still run OK (I've tried this on
staging and it seems fine).
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D1034
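To illustrate the redirect-output-to-serial-console trick the
summary above refers to: the general idea is to follow the
command with an echo of a marker and the exit status to the
serial device, then wait for that marker to show up. The sketch
below uses a plain file in place of the serial device, and the
helper names and marker format are assumptions, not the upstream
os-autoinst implementation.

```shell
# Hypothetical helper: run a command, then write "<marker>-<status>-"
# to the serial device so a watcher knows it finished and how.
run_with_marker() {
    marker="$1"
    dev="$2"
    shift 2
    rc=0
    "$@" || rc=$?
    echo "${marker}-${rc}-" > "$dev"
}

# Hypothetical helper: poll the device (a file in this sketch) once a
# second until the marker appears or the tries run out.
wait_for_marker() {
    marker="$1"
    dev="$2"
    tries="${3:-10}"
    while [ "$tries" -gt 0 ]; do
        if grep -q "^${marker}-[0-9][0-9]*-\$" "$dev" 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

Passing a wait time of 0, as described above, corresponds to
skipping the marker and the wait entirely and just typing the
command.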
Summary:
Since we can match on multiple needles, we can drop the loop
from console_login and instead do it this way, which is simpler
and should work better on ARM (the timeouts will scale and
allow ARM to be slow here). Also move it to main_common as
there's no logical reason for it to be a class method.
Also remove the `check` arg. `check` was only set to 0 by two
tests, _console_shutdown and anacondatest's _post_fail_hook.
For _console_shutdown, I think I just wanted to give it the
best possible chance of succeeding. But we're really not going
to lose anything significant by checking, the only case where
check=>0 would've helped is if the 'good' needle had stopped
matching, and all sorts of other tests will fail in that case.
anacondatest was only using it to save a screenshot of whatever
was on the tty if it didn't reach a root console, which doesn't
seem that useful, and we'll get screenshots from check_screen
and assert_screen anyway.
Test Plan:
Run all tests, check they behave as expected and
none inappropriately fails on console login.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D1016