Commit Graph

210 Commits

Author SHA1 Message Date
Tao Liu
c743881ae6 virtiofs support for kexec-tools
This patch add virtiofs support for kexec-tools by introducing a new option
for /etc/kdump.conf:

virtiofs myfs

Where myfs is a variable tag name specified in qemu cmdline
"-device vhost-user-fs-pci,tag=myfs".

The patch covers the following cases:
1) Dumping VM's vmcore to a virtiofs shared directory;
2) When the VM's rootfs is a virtiofs shared directory and dumping the
   VM's vmcore to its subdirectory, such as /var/crash;
3) The combination of case 1 & 2: The VM's rootfs is a virtiofs shared
   directory and dumping the VM's vmcore to another virtiofs shared
   directory.

Case 2 & 3 need dracut >= 057, otherwise VM cannot boot from virtiofs
shared rootfs. But it is not the issue of kexec-tools.

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
2022-09-29 12:22:49 +08:00
Baoquan He
1913ea9118 Checking the existence of 40-redhat.rules before modifying
Resolves: bz2106645

The code of commit 163c02970e takes effect in rhel firstly, later
pulled to Fedora. However, Fedora OS doesn't have 40-redhat.rules
in systemd-udev package. With this commit applied, a false positive
warning message can always been seen as below.

So fixing it by checking if 40-redhat.rules exists before handling.
With this change, the false warning is gone.

[root@ ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: No kdump initial ramdisk found.
kdump: Rebuilding /boot/initramfs-5.19.0-rc6+kdump.img
sed: can't read /var/tmp/dracut.NnAV2g/initramfs/usr/lib/udev/rules.d/40-redhat.rules: No such file or directory
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2022-07-21 15:02:32 +08:00
Pingfan Liu
163c02970e ppc64/ppc64le: drop cpu online rule in 40-redhat.rules in kdump initramfs
Onlining secondary cpus breaks kdump completely on KVM on Power hosts
Though we use maxcpus=1 by default but 40-redhat.rules will bring up all
possible cpus by default.

Thus before we get the kernel fix and the systemd rule fix let's remove
the cpu rule in 40-redhat.rules for ppc64/ppc64le kdump initramfs.

This is back ported from RHEL, and original credit goes to Dave Young
<dyoung@redhat.com>

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Tao Liu <ltao@redhat.com>
2021-12-23 15:40:23 +08:00
Coiby Xu
6936fbc1b2 fix broken extra_bins when installing multiple binaries
When there more than one binaries, quoting "$val" would make
dracut-install treat multiple binaries as one binary. Take
"extra_bins /usr/sbin/ping /usr/sbin/ip" as an example, the
following error would occur when building initrd,

dracut-install: ERROR: installing '/usr/sbin/ping /usr/sbin/ip'
dracut: FAILED: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.ODrioZ/initramfs -a /usr/sbin/ping /usr/sbin/ip

Fix it by not quoting the variable and bypassing SC2086 shellcheck.

Fixes: commit 86538ca6e2
       ("bash scripts: fix variable quoting issue")

Acked-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2021-11-05 17:50:45 +08:00
Kairui Song
58d3e6db3a kdump-lib.sh: rework nmcli related functions
This fixes word splitting issue with nmcli args. Current kexec-tools
scripts won't call nmcli with correct arguments when there are space in
network interface name.

nmcli expects multiple parameters, but get_nmcli_value_by_field only
accepts two params and depends on shell word splitting to split the
_nm_show_cmd into multiple params, which is very fragile.
So switch the param order, simplified this function and now multiple
params can be used properly.

And get_nmcli_connection_show_cmd_by_ifname returns multiple
nmcli params in a single variable, it depend on shell word splitting to
split the words when calling nmcli. But this is very fragile and break
easily when there are any special character in the connection path.

This function is only introduced to get and cache the nmcli command
which contains the "connection name".

Actually only cache the "connection path" is enough. Callers should
just call get_nmcli_connection_apath_by_ifname to cache the path, and
a new helper get_nmcli_field_by_conpath is introduced here to get value
from nmcli. This way "connection path" can contain any character.

Also get rid of another nmcli_cmd usage in
get_nmcli_connection_apath_by_ifname which stores multiple params in a
single bash variable separated by space.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-15 23:11:37 +08:00
Kairui Song
b1c794a2cf dracut-kdump.sh: Use stat instead of ls to get vmcore size
ls output is fragile, so use stat instead.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:54 +08:00
Kairui Song
e7118d1de8 Merge kdump-error-handler.sh into kdump.sh
kdump-error-handler.sh does nothing except calling three functions,
it can be easily merged into kdump.sh by using a parameter to run the
error handling routine.

kdump-lib-initramfs.sh was created to hold the three shared functions
and related code, so by merging these two files, kdump-lib-initramfs.sh
can be simplified by a lot.

Following up commits will clean up kdump-lib-initramfs.sh.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:54 +08:00
Kairui Song
a5faa052d4 kdump-lib-initramfs.sh: prepare to be a POSIX compatible lib
Move all functions needed in the second kernel from kdump-lib.sh
to kdump-lib-initramfs.sh, and update shebang headers.

Now, kdump-lib-initramfs.sh is an independent lib script, no longer
depend on kdump-lib.sh, and kdump-lib.sh is no longer needed for
the second kernel.

In later commits, functions in kdump-lib-initramfs.sh will be reworked
to be POSIX compatible, kdump-lib.sh will contain bash only functions.

POSIX shell have very limited features, eg. `local` keyword doesn't
exist in POSIX but we rely on that heavily. So kdump-lib.sh will
use bash syntax and contain the most complex helper and codes.

kdump-lib-initramfs.sh will contain the minimum set of helpers,
and be shared by both the first and second kernel.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:46 +08:00
Kairui Song
0e4b66b1ab bash scripts: reformat with shfmt
This is a batch update done with:
shfmt -s -w mkfadumprd mkdumprd kdumpctl *-module-setup.sh

Clean up code style and reduce code base size, no behaviour change.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
4f75e16700 bash scripts: declare and assign separately
Declare and assign separately to avoid masking return values:
https://github.com/koalaman/shellcheck/wiki/SC2155

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
a4648fc851 bash scripts: fix redundant exit code check
As suggested by:
https://github.com/koalaman/shellcheck/wiki/SC2181

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
86538ca6e2 bash scripts: fix variable quoting issue
Fixed quoting issues found by shellcheck, no feature
change. This should fix many errors when there is space
in any shell variables, eg. dump target's name/path/id.

False positives are marked with "# shellcheck disable=SCXXXX", for
example, args are expected to split so it should not be quoted.

And replaced some `cut -d ' ' -fX` with `awk '{print $X}'` since cut
is fragile, and doesn't work well with any quoted strings that have
redundant space.

Following quoting related issues are fixed (check the link
for example code and what could go wrong):

https://github.com/koalaman/shellcheck/wiki/SC2046
https://github.com/koalaman/shellcheck/wiki/SC2053
https://github.com/koalaman/shellcheck/wiki/SC2068
https://github.com/koalaman/shellcheck/wiki/SC2086
https://github.com/koalaman/shellcheck/wiki/SC2206

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
70978c00e5 bash scripts: replace '[ ]' with '[[ ]]' for bash scripts
kdumpctl, mkdumprd, *-module-setup.sh only target bash, since they
only run in first kernel and depend on dracut, and dracut depends
on bash. So use '[[ ]]' to replace '[ ]'.

This is a batch update done with following command:
`sed -i -e 's/\(\s\)\[\s\([^]]*\)\s\]/\1\[\[\ \2 \]\]/g' kdumpctl, mkdumprd, *-module-setup.sh`
and replaced [ ... -a ... ] with [[ ... ]] && [[ ... ]] manually.

See https://tldp.org/LDP/abs/html/testconstructs.html for more details
on '[[ ]]', it's more versatile, safer, and slightly faster than '[ ]'.

This will also help shfmt to clean up the code in later commits.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
54cc5c44be bash scripts: use $(...) notation instead of legacy ...
This is a batch update done with following command:

`sed -i -e 's/`\([^`]*\)`/\$(\1)/g' mkfadumprd mkdumprd \
 kdumpctl dracut-module-setup.sh dracut-fadump-module-setup.sh \
 dracut-early-kdump-module-setup.sh`

And manually converted some corner cases. This fixes
all related issues detected by shellcheck.
Make it easier to do clean up in later commits.

Check following link for reasons to switch to the new syntax:
https://github.com/koalaman/shellcheck/wiki/SC2006

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
a416930706 bash scripts: always use "read -r"
This helps to strip spaces and avoid mangling backslashes:

https://github.com/koalaman/shellcheck/wiki/SC2162

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
fdfad3102e bash scripts: get rid of unnecessary sed calls
Use bash builtin string substitution instead, as suggested by:
https://github.com/koalaman/shellcheck/wiki/SC2001

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
c4d85142be bash scripts: get rid of expr and let
As suggested by:
https://github.com/koalaman/shellcheck/wiki/SC2219

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
6d45257cc1 bash scripts: remove useless cat
Some `cat` calls are useless, remove them to make it cleaner.
See: https://github.com/koalaman/shellcheck/wiki/SC2002

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
3b0157197b dracut-module-setup.sh: remove surrounding $() for subshell
Some functions are executed in subshell to avoid variable environment
pollution. But the surrounding $() is not needed, and it may lead to
executing output which is unexpected here.

See: https://github.com/koalaman/shellcheck/wiki/SC2091

Signed-off-by: Kairui Song <kasong@redhat.com>
Suggested-by: Coiby Xu <coxu@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
67e559a6b9 dracut-module-setup.sh: make iscsi check fail early if cd failed
As suggested by:
https://github.com/koalaman/shellcheck/wiki/SC2164

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
3b2fa982bb dracut-module-setup.sh: fix a loop over ls issue
Iterating over ls output is fragile:
https://github.com/koalaman/shellcheck/wiki/SC2045

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
dfe7555323 dracut-module-setup.sh: fix a ambiguous variable reference
Wrap the variable with {...}, else it may get interpreted as array due
to the '[' char next to it.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
da3ad9cbda dracut-module-setup.sh: use "*" to expend array as string
As suggested by:
https://github.com/koalaman/shellcheck/wiki/SC2199
The array is not quoted here but implicitly concatenate still happens,
could be harmless but shellcheck complains about it so fix it.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
49dd4fcdbb dracut-module-setup.sh: fix _bondoptions wrong references
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
ba7aa447b2 dracut-module-setup.sh: remove an unused variable
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
46542ccda5 dracut-module-setup.sh: rework kdump_get_ip_route_field
Avoid duplicated echo / cut / grep call, just use sed.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
09ccf88405 kdump-lib.sh: add a config value retrive helper
Add a helper kdump_get_conf_val to replace get_option_value.

It can help cover more corner cases in the code, like when there are
multiple spaces in config file, config value separated by a tab,
heading spaces, or trailing comments.

And this uses "sed group command" and "sed hold buffer", make it much
faster than previous `grep <config> | tail -1`.

This helper is supposed to provide a universal way for kexec-tools
scripts to read in config value. Currently, different scripts are
reading the config in many different fragile ways.

For example, following codes are found in kexec-tools script code base:
  1. grep ^force_rebuild $KDUMP_CONFIG_FILE
     echo $_force_rebuild | cut -d' '  -f2

  2. grep ^kdump_post $KDUMP_CONFIG_FILE | cut -d\  -f2

  3. awk '/^sshkey/ {print $2}' $conf_file

  4. grep ^path $KDUMP_CONFIG_FILE | cut -d' '  -f2-

1, 2, and 4 will fail if the space is replaced by, e.g. a tab

1 and 2 might fail if there are multiple spaces between config name
and config value:
"kdump_post  /var/crash/scripts/kdump-post.sh"
A space will be read instead of config value.

1, 2, 3 will fail if there are space in file path, like:
"kdump_post /var/crash/scripts dir/kdump-post.sh"

4 will fail if there are trailing comments:
"path /var/crash # some comment here"

And all will fail if there are heading space,
" path /var/crash"

And all will most likely cause problems if the config file contains
the same option more than once.

And all of them are slower than the new sed call. Old get_option_value
is also very slow and doesn't handle heading space.

Although we never claim to support heading space or tailing comments
before, it's harmless to be more robust on config reading, and many
conf files in /etc support heading spaces. And have a faster and
safer config reading helper makes it easier to clean up the code.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Kairui Song
a0282ab22c kdump-lib.sh: add a config format and read helper
Add a helper `kdump_read_conf` to replace read_strip_comments.
`kdump_read_conf` does a few more things:

  - remove trailing spaces.
  - format the content, remove duplicated spaces between name and value.
  - read from KDUMP_CONFIG_FILE (/etc/kdump.conf) directly, avoid pasting
    "/etc/kdump.conf" path everywhere in the code.
  - check if config file exists, just in case.

Also unify the environmental variable, now KDUMP_CONFIG_FILE stands for
the default config location.

This helps avoid some shell pitfalls about spaces when reading config.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Philipp Rudo <prudo@redhat.com>
2021-09-14 03:25:29 +08:00
Coiby Xu
b2bbb54d89 Check the existence of /sys/bus/ccwgroup/devices/*/online beforehand
On s390x KVM machines, the following errors would show when building kdump
initramfs that dumps vmcore to a remote target,
    $ kdumpctl rebuild
    /usr/lib/dracut/modules.d/99kdumpbase/module-setup.sh: line 475: /sys/bus/ccwgroup/devices/online: No such file or directory
    /usr/lib/dracut/modules.d/99kdumpbase/module-setup.sh: line 476: [: -ne: unary operator expected

This happens because s390x KVM machines use virtual network and
/sys/bus/ccwgroup/devices/ exists but is empty. Fix it by check
the existence of file "/sys/bus/ccwgroup/devices/*/online".

Fixes: commit 7d47251568
       ("Iterate /sys/bus/ccwgroup/devices to tell if we should set up rd.znet")

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1982474
Reported-by: Jie Li <jieli@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
t Acked-by: Kairui Song <kasong@redhat.com>
2021-07-21 17:10:28 +08:00
Hari Bathini
fa9201b240 fadump: isolate fadump initramfs image within the default one
In case of fadump, the initramfs image has to be built to boot into
the production environment as well as to offload the active crash dump
to the specified dump target (for boot after crash). As the same image
would be used for both boot scenarios, it could not be built optimally
while accommodating both cases.

Use --include to include the initramfs image built for offloading
active crash dump to the specified dump target. Also, introduce a new
out-of-tree dracut module (99zz-fadumpinit) that installs a customized
init program while moving the default /init to /init.dracut. This
customized init program is leveraged to isolate fadump image within
the default initramfs image by kicking off default boot process
(exec /init.dracut) for regular boot scenario and activating fadump
initramfs image, if the system is booting after a crash.

If squash is available, ensure default initramfs image is also built
with squash module to reduce memory consumption in capture kernel.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-06-29 21:35:58 +08:00
Coiby Xu
ad6f60d70d fix format issue in find_online_znet_device
Change spaces to tab to fix alignment issue.

Fixes: commit 7d47251568
       ("Iterate /sys/bus/ccwgroup/devices to tell if we should set up rd.znet")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-06-29 17:11:07 +08:00
Coiby Xu
03f9b91351 check the existence of /sys/bus/ccwgroup/devices before trying to find online network device
/sys/bus/ccwgroup/devices doesn't exist for non-s390x machines which leads to
the warning "find: '/sys/bus/ccwgroup/devices': No such file or directory".
This warning can be eliminated by checking the existence of
"/sys/bus/ccwgroup/devices" beforehand.

Fixes: commit 7d47251568
       ("Iterate /sys/bus/ccwgroup/devices to tell if we should set up rd.znet")

Reported-by: Ruowen Qin <ruqin@redhat.com>
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1974618
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-06-29 17:11:00 +08:00
Coiby Xu
7d47251568 Iterate /sys/bus/ccwgroup/devices to tell if we should set up rd.znet
This patch fixes bz1941106 and bz1941905 which passed empty rd.znet to the
kernel command line in the following cases,
 - The IBM (Z15) KVM guest uses virtio for all devices including network
   device, so there is no znet device for IBM KVM guest. So we can't
   assume a s390x machine always has a znet device.
 - When a bridged network is used, kexec-tools tries to obtain the znet
   configuration from the ifcfg script of the bridged network rather than
   from the ifcfg script of znet device.

We can iterate /sys/bus/ccwgroup/devices to tell if there if there is
a znet network device. By getting an ifname from znet, we can also avoid
mistaking the slave netdev as a znet network device in a bridged network
or bonded network.

Note: This patch also assumes there is only one znet device as commit
7148c0a30d ("add s390x netdev setup")
which greatly simplifies the code. According to IBM [1], there could be
more than znet devices for a z/VM system and a z/VM system may have a
non-znet network device like ConnectX. Since kdump_setup_znet was
introduced in 2012 and so far there is no known customer complaint that
invalidates this assumption I think it's safe to assume an IBM z/VM
system only has one znet device. Besides, there is no z/VM system found
on beaker to test the alternative scenarios.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1941905#c13

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-06-08 10:48:34 +08:00
Kairui Song
a2306346bc Remove the kdump error handler isolation wrapper
The wrapper is introduced in commit 002337c, according to the commit
message, the only usage of the wrapper is when dracut-initqueue calls
"systemctl start emergency" directly. In that case, emergency
is started, but not in a isolation mode, which means dracut-initqueue
is still running. On the other hand, emergency will call
"systemctl start dracut-initqueue" again when default action is dump_to_rootfs.

systemd would block on the last dracut-initqueue, waiting for the first
instance to exit, which leaves us hang.

In previous commit we added initqueue status detect in dump_to_rootfs,
so now even without the wrapper, it will not hang.

And actually, previously, with the wrapper, emergency might still hang
for like 30s. When dracut called emergency service because initqueue
timed out, dump_to_rootfs will try start initqueue again and timeout
again. Now with the wrapper removed, we can avoid these two kinds of
hangs, bacause without the isolation we can detect initqueue service
status correctly in such case.

Also remove the invalid header comments in service file, the service
is not part of systemd code. And sync the service spec with dracut.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2021-06-04 14:26:45 +08:00
Coiby Xu
8178d7a5a1 Warn the user if network scripts are used
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-05-13 17:13:31 +08:00
Coiby Xu
d5f6d38173 Set up bond cmdline by "nmcli --get-values"
Now kdumpctl will exit if failing to set up bond cmdline.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-05-13 17:13:28 +08:00
Coiby Xu
6f1badec78 Set up dns cmdline by parsing "nmcli --get-values"
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-05-13 17:13:25 +08:00
Coiby Xu
8b08b4f17b Set up s390 znet cmdline by "nmcli --get-values"
Now kdumpctl will abort when failing to set up znet.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-05-13 17:13:15 +08:00
Coiby Xu
8a33ffffbc rd.route should use the name from kdump_setup_ifname
This fixes bz1854037 which happens because kexec-tools generates rd.route for
eth0 instead of for kdump-eth0,
 1. "rd.route=168.63.129.16:10.0.0.1:eth0 rd.route=169.254.169.254:10.0.0.1:eth0" is passed to the dracut cmdline by kexec-tools
 2. In the 2rd kernel, dracut/modules.d/35network-manager/nm-config.sh calls
    /usr/libexec/nm-initrd-generator to generate two .nmconnection files
    based on the dracut cmdline, i.e. kdump-eth0.nmconnection and eth0.nmconnection,
    - /run/NetworkManager/system-connections/kdump-eth0.nmconnection
        [connection]
        id=kdump-eth0
        uuid=3ef53b1b-3908-437e-a15f-cf1f3ea2678b
        type=ethernet
        autoconnect-retries=1
        interface-name=kdump-eth0
        multi-connect=1
        permissions=
        wait-device-timeout=60000
        [ethernet]
        mac-address-blacklist=
        [ipv4]
        address1=10.0.0.4/24,10.0.0.1
        dhcp-timeout=90
        dns=168.63.129.16;
        dns-search=
        may-fail=false
        method=manual
        [ipv6]
        addr-gen-mode=eui64
        dhcp-timeout=90
        dns-search=
        method=disabled
        [proxy]

    - /run/NetworkManager/system-connections/eth0.nmconnection
        [connection]
        id=eth0
        uuid=f224dc22-2891-4d7b-8f66-745029df4b53
        type=ethernet
        autoconnect-retries=1
        interface-name=eth0
        multi-connect=1
        permissions=
        [ethernet]
        mac-address-blacklist=
        [ipv4]
        dhcp-timeout=90
        dns=168.63.129.16;
        dns-search=
        method=auto
        route1=168.63.129.16/32,10.0.0.1
        route2=169.254.169.254/32,10.0.0.1
        [ipv6]
        addr-gen-mode=eui64
        dhcp-timeout=90
        dns-search=
        method=auto
        [proxy]

 3. Since there's eth0.nmconnection, NetworkManager will try to get an IP for eth0 regardless of the fact it's a slave NIC and time out
    ```
    $ ip link show
    2: kdump-eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
       link/ether 00:0d:3a:11:86:8b brd ff:ff:ff:ff:ff:ff
    3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master kdump-eth0 state UP mode DEFAULT group default qlen 1000
    ```

Reported-by: Huijing Hei <hhei@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-05-11 02:11:50 +08:00
Coiby Xu
97ee5dc64c get kdump ifname once in kdump_install_net
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-05-11 02:11:50 +08:00
Kairui Song
ee160bf04d Revert "Always set vm.zone_reclaim_mode = 3 in kdump kernel"
This reverts commit 5633e83318.

vm.zone_reclaim_mode may cause trashing on some machines. And after
second thought, vm.zone_reclaim_mode is barely helpful for machines
with high mem stress, so just revert it.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Pingfan Liu <piliu@redhat.com>
2021-04-28 18:05:23 +08:00
Tao Liu
475e33030b Make dracut-squash required for kexec-tools
This patch reverts commit "Make dracut-squash a weak dep".

Although kexec-tools can work without dracut-squash, it is essential
for kdump to run properly in cases [1][2] where minimal amount of memory
consumption is expected. Thus dracut-squash is needed for it.

[1] https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/message/SJX7CW3WLOYSFI2YJKGTUGDBWSCMZXVZ/
[2] https://www.spinics.net/lists/systemd-devel/msg05864.html

Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-04-28 16:13:39 +08:00
Kairui Song
8d0ef743e0 Revert "get kdump ifname once in kdump_install_net"
This reverts commit afda4f4961.
2021-04-28 15:18:53 +08:00
Kairui Song
b0156e9b64 Revert "pass kdumpnic to kdump_setup_/bond/bridge/vlan directly"
This reverts commit 586d767697.
2021-04-28 15:18:53 +08:00
Kairui Song
4753ab2c70 Revert "rd.route should use the name from kdump_setup_ifname"
This reverts commit 18ffd3cb17.
2021-04-28 15:18:53 +08:00
Coiby Xu
18ffd3cb17 rd.route should use the name from kdump_setup_ifname
This fix bz1854037 which happens because kexec-tools generates rd.route for
eth0 instead of for kdump-eth0,
 1. "rd.route=168.63.129.16:10.0.0.1:eth0 rd.route=169.254.169.254:10.0.0.1:eth0" is passed to the dracut cmdline by kexec-tools
 2. In the 2rd kernel,
    - dracut/modules.d/40network/net-lib.sh will write /tmp/net.route.eth0 based on rd.route
    - dracut/modules.d/45ifcfg/write-ifcfg.sh will copy /tmp/net.route.eth0 to /tmp/icfg and then copytree /tmp/ifcfg to /run/initramfs/state/etc/sysconfig/network-scripts
 3. NetworkManager will try to get an IP for eth0 regardless of the fact it's a slave NIC and time out
    ```
    $ ip link show
    2: kdump-eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
       link/ether 00:0d:3a:11:86:8b brd ff:ff:ff:ff:ff:ff
    3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master kdump-eth0 state UP mode DEFAULT group default qlen 1000
    ```

Reported-by: Huijing Hei <hhei@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com
2021-04-25 16:56:32 +08:00
Coiby Xu
586d767697 pass kdumpnic to kdump_setup_/bond/bridge/vlan directly
This avoids calling kdump_setup_ifname repeatedly.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com
2021-04-25 16:52:29 +08:00
Coiby Xu
afda4f4961 get kdump ifname once in kdump_install_net
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com
2021-04-25 16:52:21 +08:00
Coiby Xu
1ca1b71780 Implement IP netmask calculation to replace "ipcalc -m"
Recently, dracut-network drops depedency on dhcp-client which requires
ipcalc. Thus the dependency chain
"kexec-tools -> dracut-network -> dhcp-client -> ipcalc"
is broken. When NIC is configured to a static IP, kexec-tools depended
on "ipcalc -m" to get netmask. This commit implements the shell
equivalent of "ipcalc -m".

The following test code shows cal_netmask_by_prefix is consistent with
"ipcalc -m",

    #!/bin/bash
    . dracut-module-setup.sh

    for i in {0..128}; do
        mask_expected=$(ipcalc -m fe::/$i| cut -d"=" -f2)
        mask_actual=$(cal_netmask_by_prefix $i "-6")
        if [[ "$mask_expected" != "$mask_actual" ]]; then
            echo "prefix="$i, "expected="$mask_expected, "acutal="$mask_actual
            exit
        fi
    done

    echo "IPv6 tests passed"

    for i in {0..32}; do
        mask_expected=$(ipcalc -m 8.8.8.8/$i| cut -d"=" -f2)
        mask_actual=$(cal_netmask_by_prefix $i "")
        if [[ "$mask_expected" != "$mask_actual" ]]; then
            echo "prefix="$i, "expected="$mask_expected, "acutal="$mask_actual
            exit
        fi
    done

    echo "IPv4 tests passed"

    i=-2
    res=$(cal_netmask_by_prefix "$i" "")
    if [[ $? -ne 1 ]]; then
        echo "cal_netmask_by_prefix should exit when prefix<0"
        exit
    fi

    res=$(cal_netmask_by_prefix "$i" "")
    if [[ $? -ne 1 ]]; then
        echo "cal_netmask_by_prefix should exit when prefix<0"
        exit
    fi

    i=33
    $(cal_netmask_by_prefix $i "")
    if [[ $? -ne 1 ]]; then
        echo "cal_netmask_by_prefix should exit when prefix>32 for IPv4"
        exit
    fi

    i=129
    $(cal_netmask_by_prefix $i "-6")
    if [[ $? -ne 1 ]]; then
        echo "cal_netmask_by_prefix should exit when prefix>128 for IPv4"
        exit
    fi

    echo "Bad prefixes tests passed"

    echo "All tests passed"

Reported-by: Jie Li <jieli@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-04-16 18:51:34 +08:00
Coiby Xu
8b4b7bf808 Don't use die in dracut-module-setup.sh
die (in dracut-lib.sh) is supposed to be used in the initramfs environment.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-04-16 18:02:32 +08:00