Commit Graph

15 Commits

Author SHA1 Message Date
Tao Liu
aa9e70349b selftest: Add early kdump test
This patch will introduce early kdump test.

It reuses the code of nfs kdump test, in order to setup 2 seperated VMs,
one(the client) for trigger the early kdump and crash, the other(the server)
for saving vmcore and check if vmcore exists. In order to minimize the
repetted code, a soft link is made to copy the same server side code.

Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-01-24 11:18:00 +08:00
Tao Liu
f4ab396574 selftest: run-test.sh: wait for subprocess instead of kill it
When run tests with 2 VMs, for example nfs/ssh kdump tests, client VM will do the
crash and dump, server VM will do vmcore saving and if-vmcore-exists
check.

Previously, when client VM finishes running, run-test.sh will kill the lead background
process, and then check if server VM has outputted "TEST PASSED" or "TEST FAILED" string.
However it didn't wait for server VM to finish. As a result, the server VM's final
outputs are not collected and checked, leaving the test result as "TEST RESULT NOT FOUND"
sometimes.

For example, the following is the pstree status of $(jobs -p) before it
gets killed. We can see the server VM is still running:

run-test.sh,172455 /root/kexec-tools/tests/scripts/run-test.sh --console nfs-early-kdump
  └─run-test.sh,172457 /root/kexec-tools/tests/scripts/run-test.sh --console...
      └─timeout,172480 --foreground 10m /root/kexec-tools/tests/scripts/run-qemu...
          └─qemu-system-x86,172481 -enable-kvm -cpu host -nodefaults...
              ├─{qemu-system-x86},172489
              ├─{qemu-system-x86},172492
              ├─{qemu-system-x86},172493
              ├─{qemu-system-x86},172628
              └─{qemu-system-x86},172629

In this patch, we will wait for $(jobs -p) to finish, in order to get
the complete output of test results.

Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Coiby Xu <coxu@redhat.com>
2022-01-24 11:17:52 +08:00
Coiby Xu
294c965ca3 selftest: kill VM reliably by recursively kill children processes
qemu is launched in nested subprocess and can't be killed by simply
killing the job ids,

        PID    Command
    2269634    │  ├─ sshd: root [priv]
    2269637    │  │  └─ sshd: root@pts/0
    2269638    │  │     └─ -bash
    2269744    │  │        └─ make test-run V=1
    2273117    │  │           └─ /bin/bash /root/kexec-tools-300/tests/scripts/run-test.sh
    2273712    │  │              ├─ /bin/bash /root/kexec-tools-300/tests/scripts/run-test.sh
    2273714    │  │              │  └─ /bin/bash /root/kexec-tools-300/tests/scripts/run-test.sh
    2273737    │  │              │     └─ timeout --foreground 10m /root/kexec-tools-300/tests/scripts/run-qemu -nodefaults -nographic -smp 2 -m 768M -monitor no
    2273738    │  │              │        └─ /usr/bin/qemu-system-x86_64 -enable-kvm -cpu host -nodefaults -nographic -smp 2 -m 768M -monitor none -serial stdio
    2273746    │  │              │           ├─ /usr/bin/qemu-system-x86_64 -enable-kvm -cpu host -nodefaults -nographic -smp 2 -m 768M -monitor none -serial std
    2273797    │  │              ├─ /bin/bash /root/kexec-tools-300/tests/scripts/run-test.sh
    2273798    │  │              │  └─ /bin/bash /root/kexec-tools-300/tests/scripts/run-test.sh
    2273831    │  │              │     └─ timeout --foreground 10m /root/kexec-tools-300/tests/scripts/run-qemu -nodefaults -nographic -smp 2 -m 768M -monitor no
    2273832    │  │              │        └─ /usr/bin/qemu-system-x86_64 -enable-kvm -cpu host -nodefaults -nographic -smp 2 -m 768M -monitor none -serial stdio
    2273840    │  │              │           ├─ /usr/bin/qemu-system-x86_64 -enable-kvm -cpu host -nodefaults -nographic -smp 2 -m 768M -monitor none -serial std

This led to the error "qemu-system-x86_64: can't bind ip=0.0.0.0 to
socket: Address already in use".

This patch will kill qemu by killing all the children of the job id.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-10-15 19:20:25 +08:00
Coiby Xu
62578ace21 selftest: ignore all spaces when compare the dmesg files
For the log entry that has multiple lines, "makedumpfile --dump-dmesg"
would indent the remaining lines while vmcore-dmesg doesn't. For
example, vmcore-dmesg.txt and vmcore-dmesg.txt.2 are the outputs of
vmcore-dmesg and "makedumpfile --dump-dmesg" respectively,

    ```
    diff -u vmcore-dmesg.txt vmcore-dmesg.txt.2
    --- vmcore-dmesg.txt    2021-03-28 22:13:09.986000000 -0400
    +++ vmcore-dmesg.txt.2  2021-03-28 22:13:39.920106131 -0400
    @@ -397,9 +397,9 @@
     [    1.710742] vc vcsa: hash matches
     [    1.711938] RAS: Correctable Errors collector initialized.
     [    1.713736] Unstable clock detected, switching default tracing clock to "global"
    -If you want to keep using the local clock, then add:
    -  "trace_clock=local"
    -on the kernel command line
    +               If you want to keep using the local clock, then add:
    +                 "trace_clock=local"
    +               on the kernel command line
     [    1.750539] ata1.01: NODEV after polling detection
     [    1.750973] ata1.00: ATA-7: QEMU HARDDISK, 2.5+, max UDMA/100
     [    1.752885] ata1.00: 8388608 sectors, multi 16: LBA48
    ```

Quite often, all three tests could fail because of the above difference. So
let's ignore all the spaces. This patch could fix bz1952299 [1].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1952299

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-06-08 22:21:47 +08:00
Coiby Xu
0a15d859bb selftest: fix the error of misplacing double quotes
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-06-08 22:21:47 +08:00
Coiby Xu
8619f58538 selftest: replace qemu-kvm with one based on dracut's run-qemu
Dracut's run-qemu could find  which virtualization technology to the
user in the order of kvm, kqemu, userspace. Using run-qemu could allow
running tests where qemu-kvm doesn't exist.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-03-24 15:51:02 +08:00
Coiby Xu
0dedb2c91a selftest: Fix bug of collecting test RPMs from argument
Currently, TEST_RPMS would be only using the last RPM.
Append each RPM path to TEST_RPMs instead,

Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2021-03-24 15:50:41 +08:00
Kairui Song
7b7e5d0743 selftest: Fix several test issue with Fedora 33
- ssh-copy-id is bugged and not working, use a more robust way to sync
  ssh keys
- systemd-resolvd will bind on port 53 so DHCP server won't work,
  disable systemd-resolvd's builtin DNS server

Signed-off-by: Kairui Song <kasong@redhat.com>
2020-11-18 23:57:13 +08:00
Kairui Song
616d359c5e selftest: add more detailed log and fix a test failure issue
Signed-off-by: Kairui Song <kasong@redhat.com>
2020-11-18 23:57:02 +08:00
Kairui Song
f85a291fcb selftest: Fix qcow2 image format detect
qemu-img will report "qcow2" or "qcow2 backing qcow2" for qcow2 image,
cover both case.

Signed-off-by: Kairui Song <kasong@redhat.com>
2020-11-18 01:53:01 +08:00
Kairui Song
aced2c06a0 selftest: Always use the get_image_fmt helper
Avoid code duplication.
2020-11-18 01:53:01 +08:00
Lianbo Jiang
3221f4e91f increase makdumpfile default message level to 7
Currently, the makedumpfile option '--message-level' is set to 1 when
dumping the vmcore, it only displays the progress indicator message,
but there are no common message and error message, it is important to
report some additional messages, especially for the error message,
which is very useful for the debugging.

In view of this, let's change the message level to 7 by default.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Acked-by: Kairui Song <kasong@redhat.com>
2020-10-27 17:45:32 +08:00
Kairui Song
c44cdb6703 selftest: Show the path of dumped vmcore on test end
Make the test script print following line when the test is finished and vmcore is successfully dumped:

  You can retrive the verify the vmcore file using following command:
  ./scripts/copy-from-image.sh \
      /home/kasong/fedpkg/kexec-tools/tests/output/ssh-kdump/0-server.img \
      /var/crash/192.168.77.62-2020-09-02-05:16:26/vmcore.flat ./
  Kernel package verion is: kernel-core-5.6.6-300.fc32.x86_64

Also add a helper to copy files out of the VM image.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2020-09-17 10:43:02 +08:00
Kairui Song
a8dbd281f7 selftest: Add basic test framework
Now, by execute `make test-run` in tests/, kexec-tools sanity tests
will be performed using VMs. There are currently 3 test cases, for local
kdump, nfs kdump and ssh kdump.

For each test VM, the selftest framework will create a snapshot layer,
do setup as required by the test case, this ensure each test runs in a
clean VM.

This framework will install a custom systemd service that starts when
system have finished booting, and the service will do basic routine
(fetch and set boot counter, etc..), then call the test case which is
installed in /kexec-kdump-test/test.sh in VM.

Each VM will have two serial consoles, one for ordinary console usage,
one for the test communication and log. The test script will watch the
second test console to know the test status.

The test cases are located in tests/scripts/testcases, documents about
the test cases structure will be provided in following commits.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2020-09-17 10:42:54 +08:00
Kairui Song
2457f22baf selftest: Add basic infrastructure to build test image
The Makefile In tests/ could help build a VM image using Fedora cloud
image as base image, or, user can specify a base image using
BASE_IMAGE=<path/to/file>. The current repo will be packeged and
installed in the image, so the image could be used as a test image to
test kexec-tools.

The image building is splited into two steps:
The first step, it either convert the base image to qcow2 or create
a snapshot on it, and install basic packages (dracut, grubby, ...)
and do basic setups (setup crashkernel=, disable selinux, ...).
See tests/scripts/build-scripts/base-image.sh for detail.

The second step, it creates a snapshot on top of the image produced by
the previous step, and install the packaged kexec-tools of current
repo.  See tests/scripts/build-scripts/test-base-image.sh for detail.

In this way, if repo's content is changes, `make` will detect it and
only rebuild the second snapshot which speed up the rebuild by a lot.

The image will be located as tests/output/test-base-image, and in qcow2
format. And default user/password is set to root/fedora.

Signed-off-by: Kairui Song <kasong@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
2020-09-17 10:42:34 +08:00