TipsAndTricks/DebuggingHardQemuFailures
Sometimes we hit hard-to-debug failures on the autobuilder infrastructure. This page details some of the thinking and tricks used when debugging such failures.
SSH access to the autobuilders often helps and is available to those needing it to debug failures. There are some simple rules/steps:
- Pause the autobuilder (from https://autobuilder.yocto.io/buildslaves/ , e.g. https://autobuilder.yocto.io/buildslaves/fedora26.yocto.io)
- Let RP/Joshua/Halstead/Ross know that debugging is taking place (so we don't accidentally reboot or re-enable it)
- "sudo -iu pokybuild" so there are no permissions issues
- cd to the directory in the failing build (shown at the top of the failing build log)
- make sure auto.conf matches the failing configuration (subsequent builds may have reset it)
- source oe-init-build-env as usual and then build/debug away
When the failure is intermittent this adds extra complexity. The first step is often to build an environment where you can reproduce the issue at will. This means reducing the time taken to trigger the issue and perhaps brute forcing it by running many items in parallel. RP developed a script, "runqemu-parallel", which boots qemu using runqemu, waits for it to reach a login prompt, then immediately starts a new qemu. Around 60 of these can be run at once until failures appear. The script was optimised so that runqemu didn't need to call into bitbake, improving the speed at which it could cycle processes.
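The script itself isn't reproduced here, but as a rough sketch of the brute-force approach (assuming an image has already been deployed for qemux86-64 and that plain runqemu is acceptable, whereas the real runqemu-parallel avoids runqemu's bitbake calls), something like the following can cycle boots in parallel:

# Sketch only: boot qemu repeatedly in 60 parallel instances and flag any boot
# that fails to reach a login prompt. A fixed 5 minute window is used here;
# the real script restarts qemu as soon as the login prompt appears.
for i in $(seq 1 60); do
    (
        while true; do
            timeout 300 runqemu qemux86-64 nographic slirp > qemu-$i.log 2>&1
            if ! grep -q "login:" qemu-$i.log; then
                echo "instance $i failed to reach login, see qemu-$i.log"
                break
            fi
        done
    ) &
done
wait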
Isolating to a particular code area is useful, e.g. does it happen without kvm? without kvm-irqchip? any particular distro?
The qemu boot logs can in some cases show intriguing differences, e.g. these boot messages:

[ 0.000000] BIOS bug: APIC version is 0 for CPU 0/0x0, fixing up to 0x10
[ 0.000000] BIOS bug: APIC version mismatch, boot CPU: 0, CPU 0: version 10
In one case RP ended up building a similar kernel version locally and then copying in the modules and kernel from the broken autobuilder. For Ubuntu this is something like:
sudo apt install git build-essential kernel-package fakeroot libncurses5-dev libssl-dev ccache
git clone -b linux-4.11.y git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
cd linux-stable
cp /boot/config-`uname -r` .config
make -j `getconf _NPROCESSORS_ONLN` deb-pkg LOCALVERSION=-custom
and then copying /boot/XXX and /lib/modules/XXX to the system and running update-grub. The kernel command line can be tweaked in /etc/default/grub. This was after trying the remote binaries, both the native qemu ones and the kernel binary and image, just to isolate the problem to a particular area. Running a failing command and a working one under strace and comparing the logs can also sometimes be useful (although the logs can be large and you may have to replace the pid values in the logs before comparing; meld can help visualise the differences).
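As a small illustration of the strace comparison (qemu-good.sh and qemu-bad.sh are hypothetical wrappers around the working and failing invocations, not real scripts):

# Trace both runs, following child processes.
strace -f -o good.log ./qemu-good.sh
strace -f -o bad.log ./qemu-bad.sh
# With -f each line starts with a pid; strip that column so the logs line up
# (pids embedded in syscall arguments may still need manual substitution).
sed -i 's/^[0-9]* //' good.log bad.log
meld good.log bad.log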
Sometimes it's memory fragmentation that causes the issue, particularly if the issue occurs after sustained uptime and builds. An "echo 1 > /proc/sys/vm/drop_caches" will often make this kind of issue disappear again. If that is the case, http://rpsys.net/fragment.tgz is a version of https://oss.oracle.com/projects/codefragments/src/trunk/fragment-slab/README which works on 4.13-4.15 kernels. Loading it (insmod fragment.ko) and then "echo 900000 > /proc/temp" will fragment the system memory enough to reproduce fragmentation issues. You'll need to experiment to find suitable numbers for your memory size. /proc/pagetypeinfo and /proc/slabinfo contain useful information about how much memory is available at each allocation size order.
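Putting those commands together (run as root; the 900000 value is just the example figure above and needs tuning for your machine):

# Check current fragmentation (free pages per order, per migrate type).
cat /proc/pagetypeinfo
# Load the fragment module described above and fragment memory deliberately.
insmod fragment.ko
echo 900000 > /proc/temp
# If dropping caches makes the failure disappear, fragmentation is a likely cause.
echo 1 > /proc/sys/vm/drop_caches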
Tracing kernel functions can be done with ftrace. This is handy for production kernels where you might not easily be able to rebuild the live kernel, or where you have a system which is in a "broken" state and you want to debug the problem. The magic incantations are something like:
$ trace-cmd record -b 20000 -T -e kmem
which logs all the kernel memory allocation requests. The -T option includes callgraph information. The trace.dat file generated can be huge, e.g. 19.6GB for multiple image tests running in parallel. To find memory allocations that failed (return pointer was NULL):
$ trace-cmd report | grep kmalloc.*ptr=[^0]
You could analyse by CPU to easily find the backtrace that corresponds to the allocation failure:
$ trace-cmd report --cpu 27 | grep kmalloc.*ptr=[^0]
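If you don't know which CPU hit the failure, a simple loop over the per-CPU buffers (the CPU count here is illustrative) can narrow it down:

# Sketch: count failed kmalloc allocations in each CPU's buffer.
for cpu in $(seq 0 47); do
    hits=$(trace-cmd report --cpu $cpu 2>/dev/null | grep -c "kmalloc.*ptr=[^0]")
    [ "$hits" -gt 0 ] && echo "cpu $cpu: $hits failed allocations"
done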
In some cases it's useful to be able to attach gdb to a running qemu image, even cross architecture. To do this I found you can pass "-monitor pty" to qemu through runqemu (as well as "-pidfile X" so you can tell which pid is which qemu). When a qemu hangs, kill all the other qemu processes running and you'll be left with a pty you can connect to with something like "screen /dev/pts/11". From the qemu monitor you can run "gdbserver", which waits for a connection from gdb on tcp port 1234. To set up a suitable gdb, "MACHINE=qemuppc bitbake gdb-cross-powerpc -c addto_recipe_sysroot", then run it against the kernel, e.g. ./tmp/work/x86_64-linux/gdb-cross-powerpc/8.0-r0/recipe-sysroot-native/usr/bin/powerpc-poky-linux-musl/powerpc-poky-linux-musl-gdb ./tmp/deploy/images/qemuppc/vmlinux-qemuppc.bin.
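As a rough end-to-end sketch of that flow (the pts number, pidfile path and machine are illustrative, and qemuparams is assumed to be how the extra options get passed through runqemu):

# Boot through runqemu, adding a monitor pty and a pidfile to tell instances apart.
runqemu qemuppc nographic qemuparams="-monitor pty -pidfile /tmp/qemu1.pid"
# When an instance hangs, kill the other qemus, then attach to its monitor pty:
screen /dev/pts/11
# In the qemu monitor, start the gdb stub (it listens on tcp port 1234 by default):
#   (qemu) gdbserver
# Then connect from the cross gdb built above:
#   (gdb) target remote :1234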
You can tell the currently executing process with something like:
p ((struct task_struct*) 0xcf15c180)->pid
$25 = 108
assuming you can find "current", which in the ppc case is stored in one of the CPU registers.
You can find disassembly of a kernel function with something horrible like:
objdump -d vmlinux -l | grep eventpoll.c -C 500
Other random tips:
- https://elixir.free-electrons.com/linux/latest/source provides a nice way of navigating kernel source code.