Commit Graph

28 Commits

Author SHA1 Message Date
Coiby Xu
17c26558d9 tests: use the default crashkernel value
And with commit t5b31b099 ("Simplify the management of the kernel
parameter crashkernel"), the default crashkernel value will be
used for the kernel. But the test VM has a RAM of 768M thus this is no
actual reserved memory for kdump.  Even With the old crashkernel=224M,
network dumping tests like nfs-kdump will fail out of memory when
running against current Fedora Cloud images (>=F37).

This patch address the above two issues by
 1. increasing the RAM of test VM to 1G
 2. installing the kernel-modules which contains the squashfs module in
    order to use the dracut squash module for kdump initrd.

Thanks to the dracut squash module, now even crashkernel=192M (the
default crashkernel value for RAM between 1G and 4G) works for
network dumping. Another benefit brought by this change is the default
crashkernel value can be tested as well.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-06-20 10:24:25 +08:00
Coiby Xu
7cd799462e tests: skip checking if the second partition has the boot label
All the tests failed to run on the Fedora 37 host because the boot
partition failed to be mounted and in turn the key kernel cmdline
parameters like selinux=0 couldn't be added.

The root problem is somehow running lsblk on the second partition
returns an empty label unless we wait for enough time. Before figuring
out the root cause, simply skip check that the second partition
needs to have the boot label.

Note the root problem can be produced by building a test image,
    cd tests
    ./scripts/build-image.sh Fedora-Cloud-Base-37-1.7.x86_64.qcow2 output_image   scripts/build-scripts/base-image_test.sh
    Source image is qcow2, using snapshot...
    Formatting 'build/base-image1.building', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=5368709120 backing_file=Fedora-Cloud-Base-37-1.7.x86_64.qcow2 backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
    It's a image with multiple partitions, using last partition as main partition
    grep: /boot/grub2/grubenv: No such file or directory
    grub2-editenv: error: cannot open `/boot/grub2/grubenv.new': No such file or directory.
    /dev/nbd0 disconnected

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-06-20 10:24:25 +08:00
Coiby Xu
8f243d2ab1 tests: generate correct RPM name
Tests failed to run against Fedora 37 or newer cloud images because of
the following error,
    It's a image with multiple partitions, using last partition as main partition
    'xxx/tests/build/x86_64/kexec-tools-2.0.26-5.fc37.src.rpm' not found
    /dev/nbd0 disconnected
    make: *** [Makefile:73: xxx/tests/output/test-base-image] Error 1

This is because starting with Fedora 37, rpm changes its API,
    # Fedora >= 37
    $ rpm -q --specfile kexec-tools.spec
    kexec-tools-2.0.26-5.fc37.src
    # Fedora 36
    $ rpm -q --specfile kexec-tools.spec
    kexec-tools-2.0.26-5.fc36

The tests depends on rpm to generate correct RPM name. Fix this issue by
removing the trailing .src from the output of "rpm -q --specfile".

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2023-06-20 10:24:08 +08:00
Coiby Xu
cea74a7b3e tests: use .nmconnection to set up test network
F36 has dropped support on ifcfg and as a result current network tests
fails. Use .nmconnection to set up test network instead.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-11-09 14:07:29 +08:00
Tao Liu
6d2c22bb81 selftest: Add lvm2 thin provision for kdump test
Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-11-01 12:20:34 +08:00
Tao Liu
68978f9241 selftest: Only iterate the .sh files for test execution
Previously, all files within $TESTCASEDIR/$test_case are regarded
as shell script files for testing. However there might be config
files under the directory. So let's only iterate the .sh files.

Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-11-01 12:20:34 +08:00
Coiby Xu
2d5df7a512 tests: specify the Fedora version when running fedpkg sources
So fedpkg will fetch the sources that matches given Fedora version.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-08-03 20:14:12 +08:00
Coiby Xu
f91711ba8e tests: specify the backing format for the backing file when using qemu-img create
New version of qemu-img requires specifying the backing format for the
backing file otherwise it will abort.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-08-03 20:14:12 +08:00
Coiby Xu
d347ad591f tests: correctly mount the root and also the boot partitions for Fedora 35, 36 and rawhide Cloud Base Image
Fedora 33 and 34 Cloud Base Images have only one partition with the
following directory structure,
.
├── bin -> usr/bin
├── boot
├── dev
├── etc
├── home
├── root

By comparison, Fedora 35, 36 and 37 Cloud Base Images have multiple
partitions. The root partition which is the last partition has the
following directory,
.
├── home
└── root
    ├── bin -> usr/bin
    ├── boot
    ├── dev
    ├── etc
    ├── home
    ├── root

and the 2nd partition is the boot partition.

This patch address the above changes by mounting {LAST_PARTITION}/root as
to TEMP_ROOT and mount SECOND_PARTITION to TEMP_ROOT/boot. So the test
image can be built successfully.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2022-08-03 20:14:12 +08:00
Tao Liu
aa9e70349b selftest: Add early kdump test
This patch will introduce early kdump test.

It reuses the code of nfs kdump test, in order to setup 2 seperated VMs,
one(the client) for trigger the early kdump and crash, the other(the server)
for saving vmcore and check if vmcore exists. In order to minimize the
repetted code, a soft link is made to copy the same server side code.

Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-01-24 11:18:00 +08:00
Tao Liu
f4ab396574 selftest: run-test.sh: wait for subprocess instead of kill it
When run tests with 2 VMs, for example nfs/ssh kdump tests, client VM will do the
crash and dump, server VM will do vmcore saving and if-vmcore-exists
check.

Previously, when client VM finishes running, run-test.sh will kill the lead background
process, and then check if server VM has outputted "TEST PASSED" or "TEST FAILED" string.
However it didn't wait for server VM to finish. As a result, the server VM's final
outputs are not collected and checked, leaving the test result as "TEST RESULT NOT FOUND"
sometimes.

For example, the following is the pstree status of $(jobs -p) before it
gets killed. We can see the server VM is still running:

run-test.sh,172455 /root/kexec-tools/tests/scripts/run-test.sh --console nfs-early-kdump
  └─run-test.sh,172457 /root/kexec-tools/tests/scripts/run-test.sh --console...
      └─timeout,172480 --foreground 10m /root/kexec-tools/tests/scripts/run-qemu...
          └─qemu-system-x86,172481 -enable-kvm -cpu host -nodefaults...
              ├─{qemu-system-x86},172489
              ├─{qemu-system-x86},172492
              ├─{qemu-system-x86},172493
              ├─{qemu-system-x86},172628
              └─{qemu-system-x86},172629

In this patch, we will wait for $(jobs -p) to finish, in order to get
the complete output of test results.

Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-01-24 11:17:52 +08:00
Coiby Xu
294c965ca3 selftest: kill VM reliably by recursively kill children processes
qemu is launched in nested subprocess and can't be killed by simply
killing the job ids,

        PID    Command
    2269634    │  ├─ sshd: root [priv]
    2269637    │  │  └─ sshd: root@pts/0
    2269638    │  │     └─ -bash
    2269744    │  │        └─ make test-run V=1
    2273117    │  │           └─ /bin/bash /root/kexec-tools-300/tests/scripts/run-test.sh
    2273712    │  │              ├─ /bin/bash /root/kexec-tools-300/tests/scripts/run-test.sh
    2273714    │  │              │  └─ /bin/bash /root/kexec-tools-300/tests/scripts/run-test.sh
    2273737    │  │              │     └─ timeout --foreground 10m /root/kexec-tools-300/tests/scripts/run-qemu -nodefaults -nographic -smp 2 -m 768M -monitor no
    2273738    │  │              │        └─ /usr/bin/qemu-system-x86_64 -enable-kvm -cpu host -nodefaults -nographic -smp 2 -m 768M -monitor none -serial stdio
    2273746    │  │              │           ├─ /usr/bin/qemu-system-x86_64 -enable-kvm -cpu host -nodefaults -nographic -smp 2 -m 768M -monitor none -serial std
    2273797    │  │              ├─ /bin/bash /root/kexec-tools-300/tests/scripts/run-test.sh
    2273798    │  │              │  └─ /bin/bash /root/kexec-tools-300/tests/scripts/run-test.sh
    2273831    │  │              │     └─ timeout --foreground 10m /root/kexec-tools-300/tests/scripts/run-qemu -nodefaults -nographic -smp 2 -m 768M -monitor no
    2273832    │  │              │        └─ /usr/bin/qemu-system-x86_64 -enable-kvm -cpu host -nodefaults -nographic -smp 2 -m 768M -monitor none -serial stdio
    2273840    │  │              │           ├─ /usr/bin/qemu-system-x86_64 -enable-kvm -cpu host -nodefaults -nographic -smp 2 -m 768M -monitor none -serial std

This led to the error "qemu-system-x86_64: can't bind ip=0.0.0.0 to
socket: Address already in use".

This patch will kill qemu by killing all the children of the job id.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-10-15 19:20:25 +08:00
Coiby Xu
62578ace21 selftest: ignore all spaces when compare the dmesg files
For the log entry that has multiple lines, "makedumpfile --dump-dmesg"
would indent the remaining lines while vmcore-dmesg doesn't. For
example, vmcore-dmesg.txt and vmcore-dmesg.txt.2 are the outputs of
vmcore-dmesg and "makedumpfile --dump-dmesg" respectively,

    ```
    diff -u vmcore-dmesg.txt vmcore-dmesg.txt.2
    --- vmcore-dmesg.txt    2021-03-28 22:13:09.986000000 -0400
    +++ vmcore-dmesg.txt.2  2021-03-28 22:13:39.920106131 -0400
    @@ -397,9 +397,9 @@
     [    1.710742] vc vcsa: hash matches
     [    1.711938] RAS: Correctable Errors collector initialized.
     [    1.713736] Unstable clock detected, switching default tracing clock to "global"
    -If you want to keep using the local clock, then add:
    -  "trace_clock=local"
    -on the kernel command line
    +               If you want to keep using the local clock, then add:
    +                 "trace_clock=local"
    +               on the kernel command line
     [    1.750539] ata1.01: NODEV after polling detection
     [    1.750973] ata1.00: ATA-7: QEMU HARDDISK, 2.5+, max UDMA/100
     [    1.752885] ata1.00: 8388608 sectors, multi 16: LBA48
    ```

Quite often, all three tests could fail because of the above difference. So
let's ignore all the spaces. This patch could fix bz1952299 [1].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1952299

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-06-08 22:21:47 +08:00
Coiby Xu
560c0e8a7b selftest: Make test_base_image depends on EXTRA_RPMS
test_base_image should depend on EXTRA_RPMS so it gets rebuild when
EXTRA_RPMS changes.

Fixes: commit bbc064f958
       ("selftest: add EXTRA_RPMs so dracut RPMs can be installed onto
        the image to run the tests")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-06-08 22:21:47 +08:00
Coiby Xu
0a15d859bb selftest: fix the error of misplacing double quotes
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-06-08 22:21:47 +08:00
Coiby Xu
8619f58538 selftest: replace qemu-kvm with one based on dracut's run-qemu
Dracut's run-qemu could find  which virtualization technology to the
user in the order of kvm, kqemu, userspace. Using run-qemu could allow
running tests where qemu-kvm doesn't exist.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-03-24 15:51:02 +08:00
Coiby Xu
bbc064f958 selftest: add EXTRA_RPMs so dracut RPMs can be installed onto the image to run the tests
dracut will build the PRMs which will be installed onto the image to run
the tests.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-03-24 15:50:57 +08:00
Coiby Xu
0dedb2c91a selftest: Fix bug of collecting test RPMs from argument
Currently, TEST_RPMS would be only using the last RPM.
Append each RPM path to TEST_RPMs instead,

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-03-24 15:50:41 +08:00
Kairui Song
7b7e5d0743 selftest: Fix several test issue with Fedora 33
- ssh-copy-id is bugged and not working, use a more robust way to sync
  ssh keys
- systemd-resolvd will bind on port 53 so DHCP server won't work,
  disable systemd-resolvd's builtin DNS server

Signed-off-by: Kairui Song <kasong@redhat.com>
2020-11-18 23:57:13 +08:00
Kairui Song
616d359c5e selftest: add more detailed log and fix a test failure issue
Signed-off-by: Kairui Song <kasong@redhat.com>
2020-11-18 23:57:02 +08:00
Kairui Song
13ac244630 selftest: Update test base image to Fedora 33 2020-11-18 17:15:18 +08:00
Kairui Song
f85a291fcb selftest: Fix qcow2 image format detect
qemu-img will report "qcow2" or "qcow2 backing qcow2" for qcow2 image,
cover both case.

Signed-off-by: Kairui Song <kasong@redhat.com>
2020-11-18 01:53:01 +08:00
Kairui Song
aced2c06a0 selftest: Always use the get_image_fmt helper
Avoid code duplication.
2020-11-18 01:53:01 +08:00
Lianbo Jiang
3221f4e91f increase makdumpfile default message level to 7
Currently, the makedumpfile option '--message-level' is set to 1 when
dumping the vmcore, it only displays the progress indicator message,
but there are no common message and error message, it is important to
report some additional messages, especially for the error message,
which is very useful for the debugging.

In view of this, let's change the message level to 7 by default.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2020-10-27 17:45:32 +08:00
Kairui Song
c44cdb6703 selftest: Show the path of dumped vmcore on test end
Make the test script print following line when the test is finished and vmcore is successfully dumped:

  You can retrive the verify the vmcore file using following command:
  ./scripts/copy-from-image.sh \
      /home/kasong/fedpkg/kexec-tools/tests/output/ssh-kdump/0-server.img \
      /var/crash/192.168.77.62-2020-09-02-05:16:26/vmcore.flat ./
  Kernel package verion is: kernel-core-5.6.6-300.fc32.x86_64

Also add a helper to copy files out of the VM image.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2020-09-17 10:43:02 +08:00
Kairui Song
978a849765 selftest: Add document for selftests
Acked-by: Dave Young <dyoung@redhat.com>
2020-09-17 10:42:58 +08:00
Kairui Song
a8dbd281f7 selftest: Add basic test framework
Now, by execute `make test-run` in tests/, kexec-tools sanity tests
will be performed using VMs. There are currently 3 test cases, for local
kdump, nfs kdump and ssh kdump.

For each test VM, the selftest framework will create a snapshot layer,
do setup as required by the test case, this ensure each test runs in a
clean VM.

This framework will install a custom systemd service that starts when
system have finished booting, and the service will do basic routine
(fetch and set boot counter, etc..), then call the test case which is
installed in /kexec-kdump-test/test.sh in VM.

Each VM will have two serial consoles, one for ordinary console usage,
one for the test communication and log. The test script will watch the
second test console to know the test status.

The test cases are located in tests/scripts/testcases, documents about
the test cases structure will be provided in following commits.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2020-09-17 10:42:54 +08:00
Kairui Song
2457f22baf selftest: Add basic infrastructure to build test image
The Makefile In tests/ could help build a VM image using Fedora cloud
image as base image, or, user can specify a base image using
BASE_IMAGE=<path/to/file>. The current repo will be packeged and
installed in the image, so the image could be used as a test image to
test kexec-tools.

The image building is splited into two steps:
The first step, it either convert the base image to qcow2 or create
a snapshot on it, and install basic packages (dracut, grubby, ...)
and do basic setups (setup crashkernel=, disable selinux, ...).
See tests/scripts/build-scripts/base-image.sh for detail.

The second step, it creates a snapshot on top of the image produced by
the previous step, and install the packaged kexec-tools of current
repo.  See tests/scripts/build-scripts/test-base-image.sh for detail.

In this way, if repo's content is changes, `make` will detect it and
only rebuild the second snapshot which speed up the rebuild by a lot.

The image will be located as tests/output/test-base-image, and in qcow2
format. And default user/password is set to root/fedora.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2020-09-17 10:42:34 +08:00