as today seems to be required to avoid createhdds to hang
on creation of disk_f25_minimal_2_ppc64le.img.tmp ...[13/22]
Trying to VNC connect, show anaconda menu stuck on
"Performing post-installation setup tasks"
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
Assumption createhdds executed on a PowerPC ppc64le host
to create the PowerPC specific images.
Detect current CPU arch of host machine to create virt-install images
only for supported architectures. (hardcoded lists)
hdds.json specific changes for PowerPC
* no desktop or kde images
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
'stable' means 'all current stable releases', and is used for
desktop and server as we need to ensure we have those images
available for all stable releases (including the '-3' release
while it's not EOL) for update testing. (Currently, F24 update
tests are all failing as the images are missing).
'current' means 'the current stable release' - it's the same as
'-1', but just easier to understand. It's used for support.
It seems like the installer images (in os/images) for the F26
Workstation tree somehow come from the OStree installer compose
rather than the network installer compose; install.img and
boot.iso are far larger than they should be, and match the size
of the OStree installer .iso . So instead of using those images
and bumping up the memory size to 4GiB, use the Everything tree
for F26 Workstation image builds and go back to 2GiB.
The updates.img is broken with current F26, and Radek claims the
bug is fixed. Let's see. Unfortunately I don't remember what I
put in the updates.img, so if I have to recreate it things could
get fun...
This bug is causing havoc with image creation on openqa01, so
use an updates.img for F26 image builds until it's fixed.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This is an attempt to add features desirable for creating
Taskotron base images. It extends the 'release' handling for
virt-install images in several ways to allow requesting of
'branched' and 'rawhide' image creation. It also adds an arg
to request virt-install image creation run in text mode, not
graphical mode. Graphical mode is what we always want for
openQA (so the installed OS doesn't have kernel params intended
for serial console interaction), but for Taskotron purposes,
we want the install run in text mode.
This also adds 'branched' to the default JSON file for minimal
and desktop, as we will want branched versions of these images
for the critpath update testing workflow to work on Branched
after Bodhi activation.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
There's been an annoying bug on the production openQA server
for a while now: createhdds runs to refresh the upgrade base
images don't actually work most of the time, each attempt to
run virt-install fails. I finally got time to debug this a bit
today and it seems to be some kind of network issue: the VM has
no network access so the install doesn't work. I've no idea why
that's happening, but using user-mode networking seems to work
around it for now and shouldn't have any terrible side effects
even for openQA servers not affected by this problem. So let's
do this until we can pin down the real bug.
Summary:
I've come to dislike the approach where we source files that
get included in the disk images remotely; it's unnecessarily
complicated and needlessly exposes us to unexpected changes in
the remote files. So simply keep the files in this git repo
instead, none of them is very big. This also simplifies the
code.
Also, change all the kickstarts to specify --device=link and
--activate on their network lines. These arguments *should*
be included, according to the documentation (the first to
specify which device the config is for, the second to specify
that it should be activated in the installer environment), and
I think the lack of the second is actually now a problem for
the FreeIPA kickstart enrolment test (not sure exactly what
changed to make this a problem where it wasn't before, but
*something* has, and this fixes it).
Test Plan:
check that all tests still run properly. The
FreeIPA enrolment kickstart test should now get somewhat
further. Before this change, for F25 and Rawhide, it fails to
enrol at all during install. Now it should enrol properly -
that's what the kickstart changes fix, it was failing because
it wasn't using the right network config - but it still fails
when trying to log in as test1, due to RHBZ #1366403 .
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D968
Summary:
Encrypted desktop upgrade tests are being added to openqa_fedora,
this generates the necessary disk image for it and handles
result reporting.
Test Plan:
Check the tests run with the generated disk images,
check result reporting generates appropriate ResTups.
Reviewers: garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D923
Summary:
We've kinda been having too much trouble with virt-builder
lately, mainly SELinux related issues due to how it does image
customization. It also produces images that differ in notable
ways from what a 'typical' install would give. virt-install
solves both these problems, and also gives us more flexibility
for storage configuration and post-install customization should
we need them in future.
The change isn't really too drastic, and the design is similar:
instead of virt-builder commands files, each image type now has
a kickstart file where all its customizations can be done.
There's also a single extra image dict key, 'variant', which
specifies which install tree variant to use for running the
install. It defaults to 'Everything' (for F24+) and 'Server'
(for <F24, as Everything wasn't installable until F24) but we
set it to 'Server' for the server images and 'Workstation' for
the desktop images, so those installs will use the correct
variant install class.
We run the installs in VNC. You can do it with a serial console
and log the output, but then anaconda gets clever and changes
several things in the installed system based on the fact that
you did the install over a serial console: it twiddles with
the kernel args and doesn't set graphical.target as the default.
We don't want any of that mess, so we do a VNC install.
The 'size' value is just a number of gigabytes for virt-install
images (as that's how the virt-install 'size' argument works).
This also drops some unused 32-bit images (we don't do 32-bit
KDE or Server upgrade tests, so there's no need to build those
images).
Test Plan:
Re-generate all affected images and re-run all tests
that use them, make sure they work. I am doing this on staging
at present. Note: this would render D911 unnecessary.
Reviewers: garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D917
A couple of bugs showed up with the graphical desktop images
(Workstation and KDE) for Fedora 24. Setting graphical.target
as the default by doing the symlink during image build, in the
virt-builder appliance, leaves it incorrectly labelled; in F24
this seems to stop systemd from reading it, so it falls back
to rescue.target when createhdds boots the image to try and
get the selinux autorelabel done, and relabelling never happens.
So instead, we change the default target with a firstboot
command (which will get run when createhdds boots to do the
relabel, so by the time openQA boots the image, the target will
be changed).
Also, a tricky bug in fedora-release has the ultimate effect
that if you start with a minimal install then install a desktop
environment on top - like we effectively do for these images -
the login manager service for the desktop does not get enabled,
as it should. So we work around that by explicitly enabling the
appropriate service with (again) a firstboot command.
Pushing without review as garretraziel is out this week and we
need these fixes - without them disk image generation in prod
is broken, which breaks quite a lot of tests.
Summary:
For NFS tests and to set up the support server to do DHCP and
DNS. Also add result reporting for the new NFS tests.
Test Plan:
Check the packages show up in the support server,
run the new tests (see openqa_fedora diff), generate ResTups
with the CLI tool, check they look OK.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D887
Summary:
this will be used for tests like iSCSI and NFS which need a
server end, but where (unlike e.g. FreeIPA) we don't want set
up of the server end to be considered a test and run on the
compose being tested, but instead we just consider the server
end a 'test asset' and expect it to be as reliable as possible.
For now we just set it up to provide an iSCSI target, as that's
what I want to write a test for first.
Note: I was initially writing the actual iSCSI target config
file as part of image creation, but that turns out to be a bit
tricky because we can't safely write a relative or absolute path
for the source file into the commands file. We'd have to instead
allow specifying file injections in the JSON, similar to how we
do for guestfs images, then construct an absolute path in the
Python code using SCRIPTDIR, but it was easier to just write the
file in the openQA test instead, so I did that.
Test Plan:
check that creating the support image works (and if
you're being really thorough, boot it and check the package is
there).
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D883
Summary:
Creating the virt-builder images directly with their final file
name means that if we're rebuilding them for age and the build
fails, we'll lose the existing image: it seems better to keep
it than have no image at all, when this happens. It also means
that while the rebuild is in process, the file might exist but
be useless and cause any tests that happen to be running at the
time to fail. So just like the guestfs images, create the file
with a .tmp extension initially and rename it after a successful
build.
Test Plan:
Do some image builds and check they work, check that
temp file is cleaned up on failure and ctrl-c if you can...
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D858
we don't want it (a realistic upgrade scenario would not involve
initial-setup getting in the way) and it seems to mess up boot
sometimes (i-s crashes on start if the system wasn't installed
via anaconda, and that seems to be causing the failed boots
sometimes).
Summary:
This goes with D839. It makes the necessary changes to report
the results for the desktop-terminal tests, and changes the
username in the upgrade test disk images from 'ejohn' to 'test'
to be in line with the name used in all the other tests, and
save us having to change the 'user_logged_in' needles.
Test Plan:
Run with D839, check that all tests work and results
are correctly sent to the wiki.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D840
Summary:
This goes along with the openqa_fedora commit to add the tests.
I didn't update the Docker instructions yet because I don't
quite remember how that goes. It might need a whole different
setup using some other networking...thing...
Test Plan: See https://phab.qadevel.cloud.fedoraproject.org/D831
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D832
Summary:
this goes with D799, which changes the updates image tests to
use a different updates image that works with current anaconda.
One of the updates image tests is for loading the updates image
from a hard disk, so we have a hard disk image that contains
the updates image, so that needs rebuilding with the new update
image. Simples! This is the first time we use 'imgver', I knew
it'd come in handy some time...
Test Plan:
Apply with D799, do `createhdds.py all --clean`, make
sure the new image is built and the old one removed, run the
tests and make sure they work (especially install_updates_img_
local).
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D800
Summary:
With the previous code, the temporary file wasn't closed before
we try to upload its contents to the disk image. This seems to
cause problems when run in Python 2; the file is truncated on
upload. So rejig things a bit so we use tempfile.mkstemp, close
the file after writing to it, then remove it ourselves after
doing the upload. Tested with Python 2 and Python 3 that this
creates a correct image.
If something goes wrong at the wrong time the temp file will
be left around, but hey, it's a temp file - they get cleaned
up on system shutdown. So doesn't seem like a big deal.
Test Plan:
Check building the two images that involve a file
upload in both Python 2 and Python 3, make sure the file is
complete and correct in all cases.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D786
Summary:
createhdds.sh was just too damn simple and understandable, so
I thought I'd make it three times longer, object oriented,
and hard to understand!
OK, OK, that's not great sales. Alright. The main thing was to
make it smarter. This rewrite lets it do these things:
* Only create the images that are missing (not rebuild all)
* Work out the releases to build images for
* Rename images when appropriate
* Rebuild images when they need rebuilding
* Remove old / abandoned images
It can figure out what images ought to be present - including
working out the 'next' release and figuring out from that what
releases it needs images for - and build only the missing ones.
There's a 'version' concept for images; if the existing image
is older than the version given in the data file, it'll be
rebuilt. The data file can list 'rename' pairs, allowing images
to be renamed (like when we move from a single image to multiple
label/filesystem variants). This code uses fedfind's ability
to find the current release version to figure out what releases
we need virtbuilder images for (so you don't have to pass it
in). And it can find image files that aren't in the 'currently
expected' set and wipe them. Images can also have a 'maxage',
triggering a rebuild when they exceed it - this is intended
for the virtbuilder images, so we get a rebuild with the
latest updates every so often (default is two weeks).
The point of all this is to help with unattended deployment/
maintenance, i.e. the ansible deployment we have in infra;
the idea is that we can just set that up to run the 'all'
subcommand every so often, and it'll remove old images, create
new ones, and rebuild ones that are outdated.
I kept the ability to build a single image (or a whole image
'group'), and included the ability to just run a check without
actually doing a rebuild. There's a few little weird things
and holes here as it's not really the focus of the tool.
Test Plan:
Build all images, do a full test run, and see if
it works OK. Test out all the variations of building single
images / image groups, and using the 'check' command.
Reviewers: jskladan, garretraziel
Reviewed By: jskladan, garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D687
Summary:
For F22 we don't need to remove firewalld, as the approach to
flavor configurations was changed and the conflict wasn't
there any more. On the other hand for F21 we still do, and
we need to use yum, not dnf. So make the command run inside
virt-builder conditional on $VERSION. I figure we still might
want to create F21 images if we want to test F21->F23 upgrades.
Test Plan:
I built the new set of HDDs on BOS with this change,
so we can see if the upgrade tests all run as expected. To
check the F21 images I guess we'd have to poke at 'em with
guestfish or something.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D572
Summary:
This converts openqa_trigger into a fedora_openqa_schedule
package which is properly modularized: there's a CLI module,
a schedule module, a report module, and for now conf_test_
suites is its own module (though I think it's kind of ugly
and we should turn it into a JSON file or something).
ISO file download location configuration is now done with
an optional config file, as with the splits it becomes a
mess to try and pass it through from the CLI args. This also
means custom ISO locations will be respected by other things
we write which use the 'schedule' module.
This includes a setup.py so the package and fedora-openqa-
schedule command can be installed systemwide. We could now
extend this to install stuff like the systemd services and
little scripts like run-nightly.sh.
Test Plan:
Check that things work more or less as before. New CLI command
is 'fedora-openqa-schedule'; it has the 'current' and 'compose'
sub-commands, plus a new 'report' sub-command which works like calling
report-job-results.py directly used to. Check that installing systemwide
works properly. Check that ISO download location configuration works as
expected. Running './fedora-openqa-schedule' from within the git checkout
should also work.
Reviewers: garretraziel, jskladan
Reviewed By: garretraziel, jskladan
Maniphest Tasks: T541
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D547
Summary:
Have the backing functions always raise TriggerExceptions if
no jobs could be scheduled, and have the CLI command functions
just quit (with exit code 1) immediately when this happens.
I don't see any value to logging the errors and continuing to
run, nothing useful is going to happen with no jobs.
This allows us to have the 'current' cronjob run the compose
report: we just have it do:
openqa_trigger.py current && check-compose
then it won't generate a compose report every time it runs, but
only when the current compose changed and some jobs ran.
Test Plan:
Well, just make sure things quit or run as intended,
I guess.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D541
Summary:
So the first attempt to use the waiting stuff in production
failed because, at some point, koji_done got a socket.error
from the server. Not sure if that was a Koji outage or some
kind of rate control, but even if it's rate control and we need
to tweak the wait interval, this seems advisable: we shouldn't
die the first time we hit any kind of error while waiting, we
should retry a few times first (with increasingly long delays
between the retry attempts). I know bare except clauses are BAD,
but I think it's OK here as we can't really cover every possible
exception which might get raised in any module during a 'go hit
a server and do a bunch of stuff' operation, and if the error
keeps happening we *are* going to raise it eventually.
Test Plan:
Check I didn't break the 'normal' case, and try causing
an error to appear somehow (e.g. disconnect from the network or
hack up the 'client' instantiation in fedfind to use the wrong URL)
and see if the error handling works as intended.
Reviewers: garretraziel, jskladan
Reviewed By: garretraziel, jskladan
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D532
Summary:
This is an alternative to D516. It drops the 'all' subcommand
entirely and adds some capabilities to 'compose' to make it
suitable for scripting. --ifnotcurrent will check if the compose
is the same as the current validation event, and bail out if it
is. --wait waits for the compose to be available. I've also
enhanced fedfind (in 1.4.1) to allow passing just 'Branched'
as the milestone; it will guess the release and date, now (it
didn't before).
With all of that taken together, we could have three cron jobs,
one for 'current', one for 'branched', and one for 'rawhide'.
That way we don't have to do any clever multiprocess/round-robin
waiting stuff, and the jobs for each release will be run and
reported as fast as possible (and we can run the compose report
right after the trigger script and have that sent out
efficiently too).
Test Plan:
Try all the possibilities with 'compose' - especially
check that it works with just '-m Rawhide' and '-m Branched',
and that the --wait and --ifnotcurrent args work as intended.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D525
Summary:
We can just parse the release out of the 'build' value, here.
All tests still run properly because we use the * wildcard
in the job groups and so on. Passing a proper 'version' value
prevents T581: when you schedule jobs from an ISO, openQA
will obsolete any running or scheduled jobs with the same
DISTRI, FLAVOR, VERSION and ARCH.
Test Plan:
Schedule jobs for two composes at the same time
(e.g. Branched and Rawhide), see that the second set does
not obsolete the first. Make sure setting VERSION has no
unexpected consequences.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D522
Summary:
This goes along with the openqa_fedora diff to add a no_swap
test.
Test Plan:
If the new test runs (see other diff), check its results
can be submitted to the wiki correctly.
Reviewers: garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D504
Summary:
D500 adds 32-bit test definitions to the test templates, so we
can schedule that arch again.
Test Plan: Goes along with D500, same test plan.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D501
Summary:
This way you can just copy/paste the damn list for report_job_
results instead of reformatting it, if you want to feed it in
manually for any reason.
Test Plan:
Run some commands that schedule jobs and make sure they're
printed as '1 2 3' not '[1, 2, 3]'
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D493
Summary:
Cancelled jobs have their own state (at least in the openQA
running on happyassassin, they do). This causes r_j_r to get
stuck forever if any of the jobs are cancelled. Consider a
cancelled job 'done'.
Test Plan:
Try submitting results for a set of jobs that includes one or
more cancelled jobs. It should now work (before it would get
stuck).
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D492