In python 3 f.seek() on text doesn't work like it does in py2/C because
text is now unicode. So change read_tail to use byte mode and take
unicode into account. Also add tests for it.
The VALID_API_STRING function allows for characters that should not be
allowed in blueprint names. VALID_BLUEPRINT_NAME allows us to
specifically check if a blueprint contains a valid name.
dnf seems to have changed the default for skip_if_unavailable. Some
mock repositories are still around in later tests, which then fail
because metadata cannot be synced.
Also expose skip_if_unavailable in dnf_repo_to_file_repo(), so that
tests checking for equality of repo files continue to pass.
This makes sure that required fields are included, and that sections are
not empty. It does not check for all optional fields.
If there are errors it will gather up all of them and then raise a
RecipeError with a string of all the errors.
The new toml library, introduced with abe7df34f, outputs different
whitespace from the old one. Fix the test expectation and strip()
results from toml.dumps(), because it contains superfluous newlines at
the end.
This also includes extensive tests for each of the currently supported
customizations. It should be generic enough to continue working as long
as the list of dicts includes a 'name' or 'user' field in the dict.
Otherwise support for a new dict key will need to be added to the
customizations_diff function.
Instead of setting up the routes inside a function we can now use a
BlueprintSkip class, which allows us to register them at different
routes (eg. /api/v0/ and /api/v1/) and override any routes that will be
replaced by the new API version.
When adding a new API we want to use the old code for any routes that
aren't being overridden.
This modifies the Flask Blueprint class so that a skip_rules list can be
passed to server.register_blueprint()
To maintain consistency with the other options this changes firewall to
combine the existing settings from the image template with the settings
from the blueprint.
Also updated the docs, added a new test for it, and sorted the output
for consistency.
Add support for enabling and disabling systemd services in the
blueprint. It works like this:
[customizations.services]
enabled = ["sshd", "cockpit.socket", "httpd"]
disabled = ["postfix", "telnetd"]
They are *added* to any existing settings in the kickstart templates.
You can now open ports in the firewall, using port numbers or service
names:
[customizations.firewall]
ports = ["22:tcp", "80:tcp", "imap:tcp", "53:tcp", "53:udp"]
Or enable/disable services registered with firewalld:
[customizations.firewall.services]
enabled = ["ftp", "ntp", "dhcp"]
disabled = ["telnet"]
If the template contains firewall --disabled it cannot be overridden,
under the assumption that it is required for the image to boot in the
selected environment.
You can now set the keyboard layout and language. Eg.
[customizations.locale]
languages = ["en_CA.utf8", "en_HK.utf8"]
keyboard = "de (dvorak)"
Existing entries in the kickstart templates are replaced with the new
ones. If there are no entries then it will default to 'keyboard us' and
'lang en_US.UTF-8'
Includes tests, and leaves the existing keyboard and lang entries in the
templates with a note that they can be replaced by the blueprint.
This fixes the customizations list problem earlier than in
add_customizations.
In the recipe it should be [customizations] not [[customizations]]
which creates a list. If it was used that way grab the first element and
replace the list with it.
For example:
[customizations.timezone]
timezone = "US/Samoa"
ntpservers = ["0.pool.ntp.org"]
Also includes tests.
This removes the timezone kickstart command from all of the templates
except for google.ks which needs to set it's own ntp servers and timezone.
If timezone isn't included in the blueprint, and it is not already in a
template, it will be set to 'timezone UTC' by default.
If timezone is set in a template it is left as-is, under the assumption
that the image type requires it to boot correctly.
This is based on the VHD compose type, with the following differences:
* Use the vhdx format instead of vhd
* No WALinuxAgent
* Install hyperv-daemons
The hyperv-daemons are activated through udev rules, so there is no need
to add them to the services line.
This option will create an optionally compressed tarball containing a
disk image. This format is used by Google's Compute Engine.
This also adds a new option, tar_disk_name, to set the name of the disk
image that will be wrapped in the final tarball. opts.image_name
continues to be the final output file name.
If provided, round the disk image size up to a multiple of the value.
This allows for image formats with specific size-alignment requirements
(e.g., disk size must be in GiB).
rpmfluff was including / in the rpm, which conflicts with
filesystem.rpm
The rpm globs are pretty limited, and we don't actually know the file
paths until later, so we have to use a glob or a directory.
So when the destination is / it now uses /* to select all the files and
sub-directories in the archive. The limitation of this is that it cannot
support dotfiles directly under /, they will cause a rpmbuild error.
For destinations other than / it uses the name of the directory, so
dotfiles are fine in that situation.
This allows iso builds to include the extra kernel boot parameters by
passing them to the arch-specific live/*tmpl template.
Also adds tests to make sure it is written to config.toml in the build
metadata.
The shlex splitting can fail, resulting in error messages like:
ERROR livemedia-creator: No closing quotation
without any context in the log files. This logs the line that failed to
be split and expanded.
Sometimes it is necessary to modify the kernel command-line of the
image, this adds support for a [customizations.kernel] section to the
blueprint:
[customizations.kernel]
append = "nosmt=force"
This will be appended to the kickstart's bootloader --append argument.
Includes tests for modifying the bootloader line, the kickstart
template, and examining the final-kickstart.ks created for a compose.
This hooks up creation of the rpm to the build, adds it to the
kickstart, and passes the url to Anaconda. The dnf repo with the rpms is
created under the results directory so it will be included when
downloading the build's results.
This adds support, documentation, and testing for a [[repos.git]]
blueprint section that can be used to install files from a git
repository. It will create an rpm that will be added to the build,
and included in the metadata that can be downloaded. This allows you to
accurately keep track of the source of configuration files and extra
metadata that is added to the build.
The source repo and reference will be listed in the rpm's summary making
it easy to discover on the installed system.
Reading a blueprint wasn't checking to see if it had been deleted so it
was returning the most recent commit before it had been deleted. This
allowed things like starting a compose with a blueprint that technically
doesn't exist.
One exception to this is the /changes/ route, it must be available so
that you can use the commit hash to undo a delete.
This also adds tests for the various operations.
Resolves: rhbz#1682113
In order to support iso creation on multiple arches with the templates
we need to be able to select different packages based on arch.
lorax-composer uses the arch-specific Lorax templates in order to
generate the output iso so this patch:
1. Creates a new template and type to parse it, live-install.tmpl
which contains only installpkg commands and #if clauses for arch
2. Removes bootloader related packages from the live-iso.ks
3. Remove dracut-config-rescue exclusion because it can cause problems
with some blueprints.
4. Switch logo requirement to system-logos which is satisfied by
generic-logos or fedora-logos. This prevents conflicts when a blueprint
installs fedora-release-workstation.
So in the future, if x86.tmpl, etc. need a new package to support
creating the iso it should be added to the correct section in
./share/live/live-install.tmpl
reqpart can be used to make kickstarts more platform agnostic, creating
needed partitions without lmc having to keep track of the arch-specific
needs. eg. ppc64 needs prepboot and /boot
This increases the size of the disk based on whether reqpart or
reqpart --add-boot is in the kickstart.
Note that this is only valid for partitioned disk output types, not
for filesystem images or live iso output.
It is not actually needed. projects_info deduplicates the package list,
placing other builds into the builds list instead of making a new
package entry. So it returns a sorted and deduped list of packages, as
expected.
In livemedia-creator's usage of this it can never pass in None, but if
someone were to import the library and use it, it would crash with
NoneType. So add the extra checks to make sure cancel_func isn't None,
just in case.
When using LMC to virt-install a system to an image, cancel_func is not
provided in run_creator, causing a TypeError (NoneType object is not
callable).
Signed-off-by: Yuval Turgeman <yturgema@redhat.com>
The reason for the 3G minimum was because anaconda had a bug with how it
calculated minimum disk size when using kickstart. The gix for this has
been in Anaconda since 29.19-1, so we can now remove our limit and
create somewhat smaller disk images.
Anaconda can leave child processes and mounts around when it crashes or
is canceled before finishing. It also sometimes unmounts unrelated file
systems (https://github.com/rhinstaller/anaconda/issues/1791).
Run it in a mount and pid namespace to clean up after it.
We need to be root to read the certificates that give access to the
package repos. Right now, the alternative seems to be changing
permissions on the certs themselves, which seems less good. We're
running anaconda as root anyway.
If a repository has `sslcacert`, `sslclientcert`, or `ssclientkey` set,
pass them to anaconda through the kickstart file. This is mostly the
case when using RHEL repositories that are accessed through a
subscription.
In some cases when the host has, for whatever reason, multiple copies of
the same repo listed the build may fail with an error about running out
of space.
So this commit removes duplicate entries after the host's repos have been
loaded. It also adjusts some of the test repos to use different
temporary repo names for the tests.
If systemd's tmpfiles.d timer is executed while lorax is running it will
remove any files and directories older than 30 days. This is what has
been causing the occasional error where /proc/ would seem to vanish
during the install.
Upstream has proposed this solution, https://github.com/systemd/systemd/pull/11482
but until that is released we need a work-around to protect the lorax
files.
This commit does several things:
* Move the default tmpdir from /var/tmp/ to /var/tmp/lorax/
* Add a lorax.conf tmpfiles.d file that prevents systemd-tmpfiles from
removing anything under /var/tmp/lorax/
* Add an exit handler to lorax so that temporary directories are removed on
exit or on a python traceback.
* Use flock to lock access to the tempdir while lorax is running.
* Remove any unlocked tempdirs named /var/tmp/lorax/lorax.* at startup
Note that the exit handler will not remove the tempdir if lorax is
killed with a signal -- those are being caught by dnf and prevent the
exit handler from running.
systemd-tmpfiles cannot clean up the tempdirs at boot time because they
contain files labeled as shadow_t, so we have to remove those when lorax
runs. It uses the flock to prevent removing any directories created by
parallel instances of lorax and only removes ones that are unlocked.
Worst case they will be around until the first run of lorax after a
reboot.
If you want to keep the working directory around for debugging purposes
use --workdir /var/tmp/lorax/my-workdir and it won't be removed by
lorax.
After a novirt disk image install, we run `setfiles` in the
install root to ensure some SELinux contexts are correct. /dev
is currently excluded from this run. However, as reported and
discussed in https://bugzilla.redhat.com/show_bug.cgi?id=1663040
it seems that with a recent systemd change, startup of many
services will fail if /dev itself is incorrectly labelled, and
in current Rawhide live images, it *is* incorrectly labelled.
Including `/dev` in this setfiles command appears to resolve the
problem in my testing.
Resolves: rhbz#1663040
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Drop running pkill. This causes problems if more than one is running on
a system (eg. in parallel using mock). It can kill off other processes
unrelated to this instance of anaconda.
This reverts commit 6b5c4df8b5.
Support both
[customizations]
hostname = "whatever"
and
[[customizations]]
hostname = "whatever"
in the blueprint data. The [[ syntax matches the other customization
directives (user, group, sshkey), and as such it's easy to accidentally
use it for the hostname without even realizing it's specifying something
different.
Add some tests for converting customizations to kickstarts.
Run df on the filesystem image after it has been created.
Output will be in program.log, eg:
Running... df /var/tmp/lorax.imgutils.wm04pg_v
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/loop0 1998672 1619508 362780 82% /var/tmp/lorax.imgutils.wm04pg_v
Return code: 0
It ends up that this isn't as easy as you'd think. Anaconda sets up some
signal handlers to handle cleanly exiting, but they are not being run
when sent a TERM after package installation has started. I think DNF
resets them causing it to get ignored.
When the cancel is sent it can take several minutes for it to have an
effect. In my testing it usually takes around 2 minutes for anaconda to
notice and exit.
This sends a TERM to the process and then waits for it to exit. When it
returns it then removed any device-mapper devices that were setup for
image installations, removes any hanging loop devices.
It then kills off any process with pyanaconda. in the cmdline, and
anaconda-bus.conf (because anaconda starts a bunch of helpers and if it
doesn't shut down cleanly they remain running).
Resolves: rhbz#1656691
In addition to monitoring the logs for errors, call a function (or
functions) that tell it to cancel the anaconda process and cleanup.
Also check for a cancel after creating the squashfs image for live-iso
since that's a long running process.
This required adding a new argument to a number of existing functions,
passing it down to QEMUInstall and novirt_install where the function is
called.
Resolves: rhbz#1656691
When there is no run or new symlink do one last check to make sure no
STATUS file was written. If it is missing, go ahead and remove the
results directory.
Related: rhbz#1656691
If another CANCEL request has already been made just exit from
uuid_cancel. If the build is FINISHED before it times out just exit,
don't remove the finished results.
Related: rhbz#1656691
When the repository has multiple arches, eg. i686 and x86_64, it should
add a new entry to the project's builds list, not create a new project
in the list.
This handles that by adding a modified insort_left function and
examining the packages returned from dnf to make sure they aren't
already listed in the results. It also handles adding them in sorted
order so that no further sorting needs to be done on the results.
Resolves: rhbz#1656642
If the system ran out of space, or was rebooted unexpectedly, the state
of the queue symlinks, or the results STATUS files may be inconsistent.
This checks them and:
* Removes broken symlinks from queue/new and queue/run
* Removes symlinks from run and sets the build to FAILED
* Sets builds w/o a STATUS to FAILED
* Sets builds with STATUS of RUNNING to FAILED
* Creates missing queue/new symlinks to results with STATUS of WAITING
So, any builds that were running during the reboot will be FAILED, and
any that were waiting to be started will be started upon rebooting.
Resolves: rhbz#1647985
Anaconda, Lorax, lorax-composer, and livemedia-creator can all now run
with SELinux in Enforcing mode. It does not need to be disabled and if
there are denials they should be reported as a bug.
Log the current state of SELinux when starting, update the
documentation.
Running lorax-composer --no-system-repos will prevent it from copying
the dnf repositories from /etc/yum.repos.d/ into the lorax-composer repo
directory. It will *only* use repositories setup using the sources api
or written to /var/lib/lorax/composer/repos.d/
If lorax-composer has previously been run without this switch the system
repos will need to be removed from the composer/repos.d/ directory. It
would also be a good idea to remove the cached metadata in
/var/tmp/composer/
Resolves: rhbz#1650363
Make runtime directly into squashfs image. This reduces largely
unreproducible ext4 layer, but requires anaconda's dracut module
modification to properly mount the image.
And in an intermediate version it returns a VectorString object which
isn't serializable by the json or toml modules.
So convert it to a list so that the type is consistent in the sources
code.
By default mkfs.mksdos choose volume id based on current time. If
SOURCE_DATE_EPOCH is set, use that instead.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
The previous method worked, but wasn't exactly idiomatic. This is more
correct, and appears to work the same (templates depsolve, version globs
work, multiple repos work).
Note that this does use a private dnf attribute ._goal, but the word is
that this is going to become a public api soon, so yes it is there on
purpose.
Depending on how lorax-composer is run setting up an empty blueprints
directory can fail. So this moves checking/creation until after the
other directories are created and uses make_owned_dir to make sure
ownership is correct.
It needs to be root in order to set the ownership and permissions on the
directories that are under /var/lib/lorax/composer/
Refactor the directory creation into a utility function, and use a umask
of 0o006 to ensure that the parent directories created do not have o+rw
set on them (makedirs behavior is different between Python 3.6 and 3.7
so umask of 0 doesn't work consistently).
If a package is in multiple repos dnf may return more than 1 of them
when using best...glob so we pick the highest NEVRA one and install
that.
Related: rhbz#1636239
At the end of disk image installs, use fstrim on the generated filesystem to
discard any blocks that were allocated during the install and are now unused.
This will allow tools such as qemu-img to create images that do not include
deleted data.
For raw disk images that do not go through qemu-img, use fallocate --dig-holes
to create sparse holes in place of the unused blocks.
composer-cli uses TOML for 'blueprints save' which was returning an
empty 200 response if the blueprint didn't exist. Change this to return
a standard 400 error response if the blueprint doesn't exist.
composer-cli is already setup to handle receiving json when an error is
returned so just the toml API response for `blueprints/save` needed to
be changed.
`os.path.exists("/run/weldr/api.socket")` returns False for users which have no
access. This leads to composer printing that the file does not exist, which is
misleading.
Since it's no possible to distinguish the two cases, fix this problem by
combining them and showing a single error message.
Anaconda requires the root password to be set or locked, so if there
isn't anything setting it we write out 'rootpw --lock'
Also adds tests for this.
Resolves: rhbz#1626122
See https://bugzilla.redhat.com/show_bug.cgi?id=1595917 and
https://github.com/rpm-software-management/dnf/pull/1200 for
more on this. Briefly, DNF before 3.0 presented this config
value as a list...and mutating it worked. DNF from 3.0 until
3.6 presented it as a list...mutating it didn't work, but also
didn't *fail*, so this has actually not been doing anything on
DNF 3.x but we haven't noticed.
In DNF 3.6 values like this are presented as tuples instead of
lists, to try and catch usages like this, and it worked! We
need to change this one.
There is an additional weirdness here. tsflags is actually, in
libdnf terms, an OptionStringListAppend option: that means that
when something tries to *set* its value, the new value is just
appended to the existing list of values. This is very weird
behaviour when you're interacting with it like this, but
happens to be quite useful, as we can just 'set' the value to
a list like this and it will actually get appended (which is
what we want), and this one syntax happens to work correctly in
DNF 2.x, 3.0 through 3.5.1, and 3.6.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
When the kickstart is handed off to Anaconda for building it will
download its own copy of the metadata and re-run the depsolve. So if the
dnf cache isn't current there will be a mismatch and the build will
fail to find some of the versions in final-kickstart.ks
This adds a new context to DNFLock, .lock_check, that will force a check
of the metadata. It also implements its own timeout and forces a
refresh of the metadata when that expires because the dnf expiration
doesn't always work as expected.
Resolves: rhbz#1631561
Ends up you cannot use the kickstart user command on root, since it
already exists, so we have to translate that into a rootpw command.
So [[customizations.user]] with name = "root" only support key, which
will set the ssh key, and password which will use rootpw to set the
password. plain text or encrypted are supported.
Related: rhbz#1626122
This differs from lmc's --make-ami in that creates a full disk image instead of
an fsimage. Create a raw disk image with a / and /boot partitions, and enable
sshd, chronyd, and cockpit by default.
Remove `except` block which immediately raises the same exception again (it's
not a subclass of another caught exception, so this is safe).
Remove a false positive, because it is not emitted from the code base.
Disable subprocess-popen-preexec-fn in startProgram, which is not used
internally.
lorax uses pyanaconda's SimpleConfigParser in three different
places (twice with a copy that's been dumped into pylorax, once
by importing it), just to do a fairly simple job: read some
values out of /etc/os-release. The only value SimpleConfigParser
is adding over Python's own ConfigParser here is to read a file
with no section headers, and to unquote the values. The cost is
either a dependency on pyanaconda, or needing to copy the whole
of simpleparser plus some other utility bits from pyanaconda
into lorax. This seems like a bad trade-off.
This changes the approach: we copy one very simple utility
function from pyanaconda (`unquote`), and do some very simple
wrapping of ConfigParser to handle reading a file without any
section headers, and returning unquoted values. This way we can
read what we need out of os-release without needing a dep on
pyanaconda or to copy lots of things from it into pylorax.
Resolves: #449Resolves: #450
Signed-off-by: Adam Williamson <awilliam@redhat.com>
In the near-future there may be /lib/modules/ directories for older
kernels with weak dependencies listed. These may not match the installed
kernel(s) so we cannot depend on them to drive generate_module_data.
Instead use the existing findkernels() function to get the list of
installed kernels and iterate those, running depmod on them.
Resolves: rhbz#1622213
blueprints/changes is different, each blueprint has it's own total,
limited by the call's limit. So it needs to find the max total of all
the requested blueprints.
The blueprints/changes API is a bit different from the others, the total
that it includes is for each blueprint, not one total for all of them,
since there will be a different number of commits for each.
The function is passed the dict, and it can be used to select the total
to use for retrieving all of the results. If it isn't included it will
use data["total"] which works fine in most cases.
Add a limit argument to all potentially paginated results, equal to
whatever the composer backend is the total number of results. This still
has the potential to provide truncated data if the number of results
increases between the two HTTP requests.
Resolves: #404
This adds the following optional arguments to the /compose/status route:
- type, matches the compose_type field
- status, matches the queue_status field
- blueprint, matches the blueprint field
A value of 1 is too low for heavy users of the API, such as the weldr-web
interface.
This is also systemd's default for sockets it opens. Using lorax-composer with
socket activation already results in a backlog of SOMAXCONN connections.
Currently we are making MBR disk images for qcow2 and partitioned disk,
so the UEFI packages aren't required at this point.
Move the clearpart command into compose.py so that in the futute it can
use clearpart --disklabel to create a GPT image, and add the required
packages to the package set.
The idea here is to make sure all return points have the same type for
the error cases. There's not really all that many, so they just go in
one patch. Some of these could potentially turn into more specialized
errors later.
(cherry picked from commit fd901c5e3f)
Note the exception string checking around compose_type. I didn't really
want to introduce a new exception type just for this, but also didn't
want to duplicate strings. I'd be open to other suggestions for how to
do this.
(cherry picked from commit b3bb438254)
This adds some fairly redundant code to the beginning of all the
blueprint routes to attempt reading a commit from git for the
blueprint's recipe. If it succeeds, the blueprint exists and the route
can continue. Otherwise, return an error. Hopefully this doesn't slow
things down too much.
(cherry picked from commit a925cc7ddb)
Note that this also changes the return type of uuid_info to return None
when an unknown ID is given. The other uuid_* functions are fine
because they are checked ahead of time.
(cherry picked from commit 6497b4fb65)
Each element in the errors value is now a dict, with a msg field and an
id field. The id field contains a value out of errors.py that can be
used by the front end to key on. The msg field is the same as what's
been there.
The idea is to keep the number of IDs somewhat limited so there's not a
huge number of things for the front end to know.
(cherry picked from commit 9677b012da)
This should make it easier to return more complex error structures. It
also doesn't appear to matter - tests still pass without changes.
(cherry picked from commit 4c3f93e329)
Make sure no UTF8 characters are allowed and return an error if they
are.
Also includes tests to make sure the correct error is returned.
(cherry picked from commit 86d79cd8a6)
Currently the code is not UTF8 safe, so we need to return a clear error
when invalid characters are passed in.
This also adds tests for the routes to confirm that an error is
correctly returned.
(cherry picked from commit 74f5def3d4)
This patch does two things:
1) Add "compose list", which lists compose UUIDs and other basic info,
2) Fix up "blueprints list", "modules list", "sources list", and
"compose types" so their output is just a plain list of identifiers
This handles the case where a route is requested, but without a required
parameter. So, /blueprints/info is requested instead of
/blueprints/info/http-server. It accomplishes this via a decorator, so
a lot of these route-related functions now have quite a few decorators
attached to them.
Typo'd URLs (/blueprints/nfo for instance) will still return a 404. I
think this is a reasonable thing to do.
(cherry picked from commit 5daf2d416a)
Unfortunately, this isn't very useful if /modules/info is provided with
multiple modules. yum doesn't traceback when doPackageLists is given
something that doesn't exist. It just returns an empty list. If
/modules/info is given just one module and yum gives us an empty list,
it's easy to say what happened. If /modules/info is given several
modules and just one does not exist, we will not be able to detect that.
Fixing this would require doing more yum operations, which is likely to
slow things down and isn't the direction I want to be going.
(cherry picked from commit 8e948e4a4d)
Right now, this is when the compose is queued up, when it is started by
anaconda, and when it is finished (whether that's success or not).
(cherry picked from commit 3ba9d53b8b)