It needs to be root in order to set the ownership and permissions on the
directories that are under /var/lib/lorax/composer/
Refactor the directory creation into a utility function, and use a umask
of 0o006 to ensure that the parent directories created do not have o+rw
set on them (makedirs behavior is different between Python 3.6 and 3.7
so umask of 0 doesn't work consistently).
If a package is in multiple repos dnf may return more than 1 of them
when using best...glob so we pick the highest NEVRA one and install
that.
Related: rhbz#1636239
At the end of disk image installs, use fstrim on the generated filesystem to
discard any blocks that were allocated during the install and are now unused.
This will allow tools such as qemu-img to create images that do not include
deleted data.
For raw disk images that do not go through qemu-img, use fallocate --dig-holes
to create sparse holes in place of the unused blocks.
composer-cli uses TOML for 'blueprints save' which was returning an
empty 200 response if the blueprint didn't exist. Change this to return
a standard 400 error response if the blueprint doesn't exist.
composer-cli is already setup to handle receiving json when an error is
returned so just the toml API response for `blueprints/save` needed to
be changed.
`os.path.exists("/run/weldr/api.socket")` returns False for users which have no
access. This leads to composer printing that the file does not exist, which is
misleading.
Since it's no possible to distinguish the two cases, fix this problem by
combining them and showing a single error message.
Anaconda requires the root password to be set or locked, so if there
isn't anything setting it we write out 'rootpw --lock'
Also adds tests for this.
Resolves: rhbz#1626122
See https://bugzilla.redhat.com/show_bug.cgi?id=1595917 and
https://github.com/rpm-software-management/dnf/pull/1200 for
more on this. Briefly, DNF before 3.0 presented this config
value as a list...and mutating it worked. DNF from 3.0 until
3.6 presented it as a list...mutating it didn't work, but also
didn't *fail*, so this has actually not been doing anything on
DNF 3.x but we haven't noticed.
In DNF 3.6 values like this are presented as tuples instead of
lists, to try and catch usages like this, and it worked! We
need to change this one.
There is an additional weirdness here. tsflags is actually, in
libdnf terms, an OptionStringListAppend option: that means that
when something tries to *set* its value, the new value is just
appended to the existing list of values. This is very weird
behaviour when you're interacting with it like this, but
happens to be quite useful, as we can just 'set' the value to
a list like this and it will actually get appended (which is
what we want), and this one syntax happens to work correctly in
DNF 2.x, 3.0 through 3.5.1, and 3.6.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
When the kickstart is handed off to Anaconda for building it will
download its own copy of the metadata and re-run the depsolve. So if the
dnf cache isn't current there will be a mismatch and the build will
fail to find some of the versions in final-kickstart.ks
This adds a new context to DNFLock, .lock_check, that will force a check
of the metadata. It also implements its own timeout and forces a
refresh of the metadata when that expires because the dnf expiration
doesn't always work as expected.
Resolves: rhbz#1631561
Ends up you cannot use the kickstart user command on root, since it
already exists, so we have to translate that into a rootpw command.
So [[customizations.user]] with name = "root" only support key, which
will set the ssh key, and password which will use rootpw to set the
password. plain text or encrypted are supported.
Related: rhbz#1626122
This differs from lmc's --make-ami in that creates a full disk image instead of
an fsimage. Create a raw disk image with a / and /boot partitions, and enable
sshd, chronyd, and cockpit by default.
Remove `except` block which immediately raises the same exception again (it's
not a subclass of another caught exception, so this is safe).
Remove a false positive, because it is not emitted from the code base.
Disable subprocess-popen-preexec-fn in startProgram, which is not used
internally.
lorax uses pyanaconda's SimpleConfigParser in three different
places (twice with a copy that's been dumped into pylorax, once
by importing it), just to do a fairly simple job: read some
values out of /etc/os-release. The only value SimpleConfigParser
is adding over Python's own ConfigParser here is to read a file
with no section headers, and to unquote the values. The cost is
either a dependency on pyanaconda, or needing to copy the whole
of simpleparser plus some other utility bits from pyanaconda
into lorax. This seems like a bad trade-off.
This changes the approach: we copy one very simple utility
function from pyanaconda (`unquote`), and do some very simple
wrapping of ConfigParser to handle reading a file without any
section headers, and returning unquoted values. This way we can
read what we need out of os-release without needing a dep on
pyanaconda or to copy lots of things from it into pylorax.
Resolves: #449Resolves: #450
Signed-off-by: Adam Williamson <awilliam@redhat.com>
In the near-future there may be /lib/modules/ directories for older
kernels with weak dependencies listed. These may not match the installed
kernel(s) so we cannot depend on them to drive generate_module_data.
Instead use the existing findkernels() function to get the list of
installed kernels and iterate those, running depmod on them.
Resolves: rhbz#1622213
blueprints/changes is different, each blueprint has it's own total,
limited by the call's limit. So it needs to find the max total of all
the requested blueprints.
The blueprints/changes API is a bit different from the others, the total
that it includes is for each blueprint, not one total for all of them,
since there will be a different number of commits for each.
The function is passed the dict, and it can be used to select the total
to use for retrieving all of the results. If it isn't included it will
use data["total"] which works fine in most cases.
Add a limit argument to all potentially paginated results, equal to
whatever the composer backend is the total number of results. This still
has the potential to provide truncated data if the number of results
increases between the two HTTP requests.
Resolves: #404
This adds the following optional arguments to the /compose/status route:
- type, matches the compose_type field
- status, matches the queue_status field
- blueprint, matches the blueprint field
A value of 1 is too low for heavy users of the API, such as the weldr-web
interface.
This is also systemd's default for sockets it opens. Using lorax-composer with
socket activation already results in a backlog of SOMAXCONN connections.
Currently we are making MBR disk images for qcow2 and partitioned disk,
so the UEFI packages aren't required at this point.
Move the clearpart command into compose.py so that in the futute it can
use clearpart --disklabel to create a GPT image, and add the required
packages to the package set.
The idea here is to make sure all return points have the same type for
the error cases. There's not really all that many, so they just go in
one patch. Some of these could potentially turn into more specialized
errors later.
(cherry picked from commit fd901c5e3f)
Note the exception string checking around compose_type. I didn't really
want to introduce a new exception type just for this, but also didn't
want to duplicate strings. I'd be open to other suggestions for how to
do this.
(cherry picked from commit b3bb438254)
This adds some fairly redundant code to the beginning of all the
blueprint routes to attempt reading a commit from git for the
blueprint's recipe. If it succeeds, the blueprint exists and the route
can continue. Otherwise, return an error. Hopefully this doesn't slow
things down too much.
(cherry picked from commit a925cc7ddb)
Note that this also changes the return type of uuid_info to return None
when an unknown ID is given. The other uuid_* functions are fine
because they are checked ahead of time.
(cherry picked from commit 6497b4fb65)
Each element in the errors value is now a dict, with a msg field and an
id field. The id field contains a value out of errors.py that can be
used by the front end to key on. The msg field is the same as what's
been there.
The idea is to keep the number of IDs somewhat limited so there's not a
huge number of things for the front end to know.
(cherry picked from commit 9677b012da)
This should make it easier to return more complex error structures. It
also doesn't appear to matter - tests still pass without changes.
(cherry picked from commit 4c3f93e329)
Make sure no UTF8 characters are allowed and return an error if they
are.
Also includes tests to make sure the correct error is returned.
(cherry picked from commit 86d79cd8a6)
Currently the code is not UTF8 safe, so we need to return a clear error
when invalid characters are passed in.
This also adds tests for the routes to confirm that an error is
correctly returned.
(cherry picked from commit 74f5def3d4)
This patch does two things:
1) Add "compose list", which lists compose UUIDs and other basic info,
2) Fix up "blueprints list", "modules list", "sources list", and
"compose types" so their output is just a plain list of identifiers
This handles the case where a route is requested, but without a required
parameter. So, /blueprints/info is requested instead of
/blueprints/info/http-server. It accomplishes this via a decorator, so
a lot of these route-related functions now have quite a few decorators
attached to them.
Typo'd URLs (/blueprints/nfo for instance) will still return a 404. I
think this is a reasonable thing to do.
(cherry picked from commit 5daf2d416a)
Unfortunately, this isn't very useful if /modules/info is provided with
multiple modules. yum doesn't traceback when doPackageLists is given
something that doesn't exist. It just returns an empty list. If
/modules/info is given just one module and yum gives us an empty list,
it's easy to say what happened. If /modules/info is given several
modules and just one does not exist, we will not be able to detect that.
Fixing this would require doing more yum operations, which is likely to
slow things down and isn't the direction I want to be going.
(cherry picked from commit 8e948e4a4d)
Right now, this is when the compose is queued up, when it is started by
anaconda, and when it is finished (whether that's success or not).
(cherry picked from commit 3ba9d53b8b)
If one of the timestamps isn't present (for instance, the finished
timestamp for a job that is still running), null is returned.
(cherry picked from commit 17c40ef271)
This is responsible for writing out a new times.toml file, containing
important timestamps in the life of a compose. This seems a little more
reliable than attempting to infer things from the filesystem, especially
in light of the fact that we can't ever really know when a file was
created.
(cherry picked from commit b59d59b124)
Some results have errors and no status, others have status and errors.
Update the function to return the final rc to exit with, and a bool
indicating whether or not to continue processing the other fields.
Add a bunch of tests for the new function to make sure I have the logic
correct.
(cherry picked from commit 35fa067219)
A bad system repo can cause lorax-composer to fail to start. Instead of
a traceback log the error and exit.
(note that the exit still results in an OSError traceback due to part of
it running as root, this needs to be addressed in another commit).
This adds a new argument to projects_depsolve and
projects_depsolve_with_size that contains the group list, unfortunately.
I would have prefered adding a function that just returns a list of all
the contents of a group and then add that to what was being passed into
projects_depsolve. However, there does not appear to be any good way to
do that in yum aside from a lot of grubbing around in the comps object,
which I am unwilling to do.
(cherry picked from commit 5fe4b47072)
This is the same as the output at the top level, just trimmed down to
only the options for a single subcommand. It's trigged by providing
"help" or "--help" as a subcommand option.
(cherry picked from commit 954f330ace)
This isn't a real subcommand like the others. The option processing
just intercepts it and prints the output. Given that we're subcommand
based, it makes sense to support this in addition to --help.
(cherry picked from commit 3743d6d208)
Depsolve the packages included in the templates and report any errors
using the /api/status 'msgs' field. This should help narrow down
problems with package sources not being setup correctly.
Pass --dnfplugin='*' to enable all of them.
Pass --dnfplugin='plugin-name' to enable one fo them. You can use it
multiple times to enable multiple plugins. Globs work as well.
It appears that sometimes the loop device doesn't get setup properly,
this may be a race with other users of loop devices on the system, or
some other mechanism that isn't understood.
To try and prevent total failure when this happens this patch retries
the loop setup 3 times before giving up. Previously it would wait for
the loop device to appear (checking 5 times), that operation is now
executed 3 times with a new losetup attempt each time.
Resolves: rhbz#1589084
(cherry picked from commit c746e8b0c3)
Use it to override the default dracut arguments (displayed as part of
the --help output). If you want to extend the default arguments they
all need to be passed in on the cmdline as well. eg.
--dracut-arg='--xz' --dracut-arg='--install /.buildstamp' ...
Resolves: rhbz#1452220
Make it possible to manipulate the simple and regexp
tests the LogRequestHandler class uses to check error
messages for potential error states.
This is accomplished by moving the simple and regexp test
strings to class members, where they can be easily
manipulated by users of the pylorax module.
It's also now possible to set the log request handler class
for a LogMonitor.
This functionality can then be used for example like this:
customized_log_request_handler = monitor.LogRequestHandler
customized_log_request_handler.simple_tests.remove("Call Trace:")
log_monitor = monitor.LogMonitor(install_log,
timeout=opts.timeout,
log_request_handler_class = customized_log_request_handler)
This way installation will continue even if there was a call
trace in the logs. In a similar way additional tests and regexps can be
also added.
DNF Repo.dump() function cannot be used as a .repo file for dnf due to
it writing baseurl and gpgkey as a list instead of a string. Add a new
function to write this in the correct format, and limited to the fields
we use.
Add a test for the new function.
Fix /projects/source/info to return an error 400 if a nonexistant TOML
source is requested. If JSON is used the error is part of the standard
response.
Update test_server.py to check for the correct error code.
When adding a source failed it wasn't being removed from the dnf object.
This fixes that, and returns an error when setting up the source fails.
Also adds a test for it.
This also includes detecting rawhide vs. non-rawhide releases and
adjusting the tests accordingly (some of the source names change).
We had only been indirectly pulling in GConf, and anyways
nothing was listening to these keys.
<kalev> I still think it's a fallout from 27a90d973f
Really in general, if we wanted to make changes like this
it'd probably be a lot simpler to do them on boot or so.
https://bugzilla.redhat.com/show_bug.cgi?id=1581838
First is Anaconda uses 6k blocks per file for its estimate, and it
fudges by 10% so adjust for those with an extra 10% of headroom just in
case.
Second is an Anaconda bug that won't allow it to do a kickstart install
to a disk smaller than 3000 MB. There is a PR to fix it upstream, but
for now the minimum size has to be 3000e9
This adds support for the optional blueprint section [customizations].
Use it like this:
[customizations]
hostname = yourhostnamehere
[[customiations.sshkey]]
user = root
key = root user key
Different versions of libgit2 act differently. Using TIME results in
some commits (like a revert) being listed correctly, but the rest being
listed in reverse order. Leaving it at the default works for
libgit2-0.26.3
This moves everything except the cmdline checking into run_creator in
pylorax.creator
It also rearranges some functions to prevent import loops, and adds a
utility function to imgutils (mkfsimage_from_disk for copying a
partition into a filesystem image).
This no longer uses the enabled configuration setting to select repos to
use. It uses everything in the repo_dir, and if system repos have not
been disabled it copies them into the repo_dir at startup, overwriting
the previous copy.
This reduces the amount of code in livemedia-creator to the cmdline
parsing and calling of the installer functions. Moving them into other
modules will allow them to be used by other projects, like the
lorax-composer API server.
filter(provides=...) doesn't work with paths. The release packages
provide system-release so just look for that instead of a file.
Now it finds the release package and selects it along with the
corresponding logos package.
Note, this has been broken since commit 431ca6ce
Commit 8edaefd4d1 added the ability to install specific NVR's of
packages, but it did not adjust the exclude operation to account for
this.
This patch fixes that, applying the exclude only to the name part of the
package NVR, and changes some variable names to pkgnvr/pkgnvrs to make
it more clear that the content has changed to <name>-<version>-<release>
Some lorax users run it from inside mock, which isn't able to detect
whether the host is in Permissive mode. This can lead to confusing
error messages, so this points them in the right direction.
When multiple units are passed to systemctl and one fails it doesn't
finish the others. Change the template command to call systemctl for
each unit individually.
This also removes the lvm2-activation-generator in runtime-cleanup.tmpl
This will allow anaconda to fetch kickstarts using https when installing
with fips=1
Leave vmlinuz and .vmlinuz.hmac in /boot
dracut-fips module needs the vmlinuz.hmac file in order to boot.
It seems that on rare occasions losetup can return before the /dev/loopX
is ready for use, causing problems with mkfs. This tries to make sure
that the loop device really is associated with the backing file before
continuing.
NOTE that using losetup --list -O to return the backing store
associated with the loop device can fail due to losetup truncating
the output filename if sysfs isn't setup. Instead of printing the full
path it will truncate it to 64 characters with a * at the end.
See util-linux lib/loopdev.c for the code that does this.
Use the existing get_loop_name function, which uses losetup -j, to lookup
the loop device associated with the backing store which should work the
same, just in the opposite direction.
For historical reasons, lorax used the 'anaconda' package as a
touchstone to determine the architecture for the build. At some
point, this package became a metapackage that pulls in both the
GUI and headless installers.
In the modular world, it's possible that only the core and TUI bits
may be available for use. The only subpackage of anaconda that is
guaranteed to be on any viable system is anaconda-core, so let's
switch to using that for the touchstone instead of the metapackage.
Signed-off-by: Stephen Gallagher <sgallagh@redhat.com>
Also sort the expanded list of packages so that any failures will
be consistent instead of depending on the randomness of a set().
And add better logging when things fail.
The core issue is that repodata may have packages that match globs, but
they cannot actually be installed (eg. sigrok-firmware). This can cause
*some* of the globbed packages to be installed before hitting the
failure package.
With this change it will log the expanded list of packages if a glob is
used. It will skip any packages that fail to install when using
--optional with the glob, and continue to install the rest.
Related: rhbz#1440417
Previously lorax had no way to use repos with self-signed certificates.
This adds the --noverifyssl cmdline option which will ignore certificate
errors.
Resolves: rhbz#1430483
OSTree is a deduplicating hardlink store using a new file path
`/ostree`, which SELinux policy doesn't know about. However, OSTree
has SELinux support built in, and rpm-ostree (for example) uses this
to ensure the attributes on files stored there are simply always
correct. Relabeling it will corrupt it.
Hence, let's skip it.
Right now we dump all subprocess output to `program.log`. Unfortunately,
The pungi/koji stack doesn't know how to scrape out the lorax logs.
And even when running interactively, it's annoying that *some* fatal
errors show up on stderr, but if it's from a subprocess, I need to go
over and `tail program.log`.
Let's output the subprocess stderr directly, since the user is
going to want it prominently anyways.
anaconda-26.1 changed how package scriptlet failures are handled. They
are now fatal, and anaconda hangs after logging an Installation failure.
ERR packaging: Installation failed: PayloadInstallError('DNF error:
Non-fatal POSTIN scriptlet failure in rpm package mlocate',)
Catch this (the 'packaging: Installation failed' part) and terminate the
image creation.
This controls how big the root filesystem is for the squashfs used in
the boot.iso, the default is 2GiB.
Note that larger rootfs sizes will require more memory and may cause the
build to fail.
I'm working on
https://fedoraproject.org/wiki/Changes/WorkstationOstree and when
using lorax to make an installer ISO with content embedded, I run out
of disk space since the desktop+various apps is large.
Since this ends up being compressed anyways, let's just bump the
currently arbitrary `2` to `10` - the only real cost I can think of is
going to be a few more superblock entries.
If the query filter doesn't return anything it would just ignore the
install request instead of logging and raising an error when
required=True.
This checks for no packages matching, and if required is True raises an
error after all of the requested packages have been processed, instead
of after the first one to fail.
Previous versions of lorax assumed that installpkg was optional, and
would continue on if the PKGGLOB didn't match anything. But the majority
of the packages are required so this allows the boot.iso to be built
with missing packages that are hard to track down.
It makes more sense to make the PKGGLOB required and to flag the
few exceptions to this with --optional.
DNF doesn't want users to access base.logging anymore.
Lorax already takes over the "dnf" logger and directs it to ./dnf.log,
so it wasn't really being used.
This raises the debug level to DNF's custom DDEBUG, and sets it up so
that dnf.librepo.log and hawkey.log are next to dnf.log
Before attempting to cleanup any dangling anaconda mounts copy the
anaconda logs to their final location.
Also, catch failures to cleanup the mounts, log it, and continue trying
the other mountpoints. A cleanup failure will result in an InstallError
instead of a CalledProcessError.
Fedora now has a edk2 package so use the OVMF code from there. This also
adds using a copy of OVMF_VARS for each boot instead of reusing the one
provided by the package.
In some cases the initramfs may not be present in /boot to save space.
Use it if present, otherwise use the kernel version to recreate the name
of it.
This also fixes problems with dracut running out of space when not using
--live-rootfs-keep-size
There's no reason to require the initramfs when we can rebuild it using
the version from the kernel. This adds handling of missing initramfs so
that lmc kickstarts can remove it from the squashfs, saving about 40M on
the iso.
This makes sure the contents of /boot are at the expected locations in
/boot and in sys_root. For partitioned images it mounts the separate
/boot partition on /boot. For both fsimage and partitioned images ir
binf mounts it to sys_root so that the kernel+initrd can be found.
The boot directory isn't always named boot.0, so wildcard it and let the
count check handle failure if there is more than 1.
umount tries to delete a mountpoint if it has lorax.imgutils in the
path. This doesn't work right if you try to umount something mounted
deeper on the path.
This adds a delete option, which is True by default, to skip the delete.
If an anaconda no-virt run crashes it can leave things mounted under
/mnt/sysimage. Previously anaconda-cleanup was used to handle this, but
it will also try to cleanup host mountpoints which isn't desired.
When using the template install command copying the same file to itself
shouldn't crash. Just log the error and continue.
Also copy the s390 configuration files for use with livemedia-creator
Resolves: rhbz#1269213
When an image name hasn't been passed, and the compression type is
something other than xz, the default image name should use the user
specified compression suffix.
Resolves: rhbz#1318958
Some cases of mksquashfs were not using -Xbcj when it is available for
the arch. This adds a function to return the correct args based on the
arch and the cmdline args.
Allow the template to select a different compression type or arguments
for the installimg command.
On 32bit builds running inside a mock xz sees the full amount of system
memory which can result in xz failing with a memory error. This allows
the template to limit the amount of memory it tries to use.
lmc --no-virt was switching selinux to permissive if it was enforcing
and restore it when done. This works fine when it is the only session
running, but would cause problems if it was run in parallel.
It now only checks the state and exits with an error if it isn't already
disabled or in Permissive mode.
Users will need to run setenforce 0 before running lmc.
If there isn't enough space for DNF to download packages it will log:
"Not enough disk space to download the packages."
So add this to the messages in monitor that trigger an error.
This makes package selection a little more roundabout, but it allows for
unused packages (and their dependencies) to be removed from globs during
the install phase.
dnf.subject.Subject is the class used by dnf's Base.install to select
packages, so the behavior of installpkg without --except options is the
same as it was before.
commit 4699c88109 changed how the disk
size is estimated and not all users took into account that the return
value is in MiB.
This would result in qemu based iso installations having a rootfs.img
that was 1024x too large.
Something is causing problems with the ext4 rootfs.img when running with
no-virt inside koji. This results in a failed image that looks good
until you try to boot it.
make_squashfs will now return False if it fails, and make_live_image
will return None (instead of the result path). lmc will exit with a 1
and log an error.
When using no-virt the runtime filesystem size comes from the kickstart.
For virt installs lmc was creating a runtime filesystem that was just
slightly larger than the space used by the files installed by anaconda.
This can run into problems with larger filesystem. It is also
inconsistent behavior between virt and no-virt installations.
With this commit the virt runtime filesystem will also come from the
kickstart.
Switching to using qemu directly allows lmc to be more flexible. It can
now run from inside a mock chroot for creation of all image types,
inculding disk images, and can take advantage of KVM on the host system
if /dev/kvm device is present inside the mock.
It should also be possible to create cross-arch images, but without kvm
available this is likely to be a very slow option.
When running a no-virt installation it was parsing the kickstart url
method and passing it to anaconda using --repo which prevents it from
working with url --mirrorlist method. There is no good reason to do
this, anaconda gets the method directly from the kickstart when it isn't
on the cmdline.
This allows lorax to support multiple templates.
If there is no templates.d under the sharedir (/usr/share/lorax or the
directory passed by --sharedir) then the templates in that directory
will be used as they were previously.
If there are directories under templates.d the first one will be used,
unless --sharedir points to a specific one.
Use 4k blocks for the ext4 filesystem. Run fsck on the filesystem to
make sure deleted blocks are actually zeroed, and pass -Xbcj to
mksquashfs.
4k blocks and -Xbcj decreases the size by 2-6% depending on the
filesystem size. Zeroing the blocks of the ext4 fs improves things
dramatically. The problem is that DNF downloads the rpms before
installing them. In addition to forcing us to use a larger filesystem
than we would like it leaves data that is difficult to compress on the
image. The downloaded files are removed, but need to be zeroed out so
that mksquashfs can compress it.
Instead of reusing --image-name add a new argument to name the iso. This
way the disk image can be given a unique name with --image-name and the
iso can be named something different.