Disable systemd resolved

Fixes flake: 'dial tcp: lookup cdn03.quay.io: no such host'

Okay, doesn't actually fix as in _fix_, just fix as in "sweep it
under the rug". The actual bug is in systemd-resolved, or in the
quay.io/cloudflare.net DNS nameservers, or in the weird specific
setup for cdn03 (it's a CNAME, compared to cdn01/02 which are A).
Maybe a combination of all of the above. I don't care; I just
want the flakes gone. I realize that this makes our testing
environment different from default Fedora, and am okay with
that because I suspect many Fedora users disable systemd-resolved
as SOP.

Signed-off-by: Ed Santiago <santiago@redhat.com>
Ed Santiago 2023-05-08 09:31:11 -06:00
parent fdd5bef7e5
commit a37af540a5
3 changed files with 45 additions and 0 deletions

@@ -0,0 +1,41 @@
#!/bin/bash
#
# Excerpted from https://github.com/containers/automation_images/blob/main/systemd_banish.sh
#
# Early 2023: https://github.com/containers/podman/issues/16973
#
# We see countless instances of "lookup cdn03.quay.io" flakes.
# Disabling the systemd resolver has completely resolved those,
# from multiple flakes per day to zero in a month.
#
# Opinions differ on the merits of systemd-resolved, but the fact is
# it breaks our CI testing. Kill it.
nsswitch=/etc/authselect/nsswitch.conf
if [[ -e $nsswitch ]]; then
    if grep -q -E 'hosts:.*resolve' $nsswitch; then
        echo "Disabling systemd-resolved"
        sed -i -e 's/^\(hosts: *\).*/\1files dns myhostname/' $nsswitch
        systemctl disable --now systemd-resolved
        rm -f /etc/resolv.conf
        # NetworkManager may already be running, or it may not....
        systemctl start NetworkManager
        sleep 1
        systemctl restart NetworkManager
        # ...and it may create resolv.conf upon start/restart, or it
        # may not. Keep restarting until it does. (Yes, I realize
        # this is cargocult thinking. Don't care. Not worth the effort
        # to diagnose and solve properly.)
        retries=10
        while ! test -e /etc/resolv.conf; do
            retries=$((retries - 1))
            if [[ $retries -eq 0 ]]; then
                echo "Timed out waiting for resolv.conf" >&2
                echo "...gonna try continuing. Expect failures." >&2
                # Leave the loop; without this it would retry forever.
                break
            fi
            systemctl restart NetworkManager
            sleep 5
        done
    fi
fi
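
The restart-until-resolv.conf-appears loop above could also be factored into a small bounded-wait helper. What follows is only a sketch under assumptions: the function name wait_for_file and its argument order are hypothetical and not part of this commit or of systemd_banish.sh. It shows one way to give up after a fixed number of attempts and let the caller decide whether to continue.

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): wait for a file to appear,
# running a caller-supplied command between checks, and stop after a
# fixed number of attempts instead of looping forever.
#
# Usage: wait_for_file PATH RETRIES DELAY_SECONDS [COMMAND [ARGS...]]
wait_for_file() {
    local path=$1 retries=$2 delay=$3
    shift 3
    while [[ ! -e $path ]]; do
        if (( retries <= 0 )); then
            echo "Timed out waiting for $path" >&2
            return 1
        fi
        retries=$((retries - 1))
        "$@"              # e.g. systemctl restart NetworkManager
        sleep "$delay"
    done
    return 0
}
```

In the script this might be called as `wait_for_file /etc/resolv.conf 10 5 systemctl restart NetworkManager || echo "...gonna try continuing. Expect failures." >&2`, which keeps the "warn and carry on" behavior the messages describe.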

@@ -0,0 +1,3 @@
---
- name: disable systemd resolved
  script: ./disable_systemd_resolved.sh

@@ -7,6 +7,7 @@
    - artifacts: ./artifacts
      rootless_user: testuser
  roles:
    - role: disable_systemd_resolved
    - role: rootless_user_ready
  tasks:
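
Once the disable_systemd_resolved role has run, a quick sanity check on the host can confirm that name resolution no longer goes through systemd-resolved. The sketch below is an assumption-laden illustration, not part of this commit: the function name resolved_still_in_use is hypothetical, and both file paths are passed as parameters so it can be tried on scratch copies. It reuses the same grep pattern as the script and also looks for 127.0.0.53, the address of the systemd-resolved stub listener.

```shell
#!/bin/bash
# Hypothetical post-role check (not in the commit).
# Returns 0 if either file still points at systemd-resolved, 1 otherwise.
#
# Usage: resolved_still_in_use NSSWITCH_FILE RESOLV_CONF_FILE
resolved_still_in_use() {
    local nsswitch=$1 resolvconf=$2
    # Same pattern the role's script greps for in nsswitch.conf.
    if [[ -e $nsswitch ]] && grep -q -E 'hosts:.*resolve' "$nsswitch"; then
        return 0
    fi
    # 127.0.0.53 is the systemd-resolved stub listener address.
    if [[ -e $resolvconf ]] &&
       grep -q -E '^nameserver[[:space:]]+127\.0\.0\.53' "$resolvconf"; then
        return 0
    fi
    return 1
}
```

On a test VM it might be run as `resolved_still_in_use /etc/authselect/nsswitch.conf /etc/resolv.conf && echo "systemd-resolved still in use" >&2`.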