Qemuppc Boot Hangs
Latest revision as of 10:31, 17 November 2017
qemuppc is randomly hanging at boot during sanity testing. For example:
- https://autobuilder.yocto.io/builders/nightly-ppc-lsb/builds/581
- https://autobuilder.yocto.io/builders/nightly-ppc/builds/592
Despite initial suspicions, this was not due to entropy problems.
It is *not*:
- Specific to a host OS (observed on centos7, debian8, ubuntu1710)
- Specific to image recipe (happens with sato, sato-sdk, lsb, lsb-sdk, minimal, full-cmdline)
- Specific to DISTRO (poky and poky-lsb)
The hang usually occurs after:
 [ 6.667445] udevd[104]: starting version 3.2.2
 [ 6.743262] udevd[105]: starting eudev-3.2.2
and the boot is stuck in the poll() call in udevadm-settle.c:adm_settle(), which doesn't seem to return from the syscall. Further tracing confirms that it goes into poll_schedule_timeout() in fs/select.c:do_poll() and doesn't come out again. The time parameters all look sane, so it's likely that the kernel is locking up, or sleeping and failing to wake.
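For comparison, this is how a poll() with a finite timeout is expected to behave on a healthy system: with no events pending, the call returns an empty result once the timeout expires. The sketch below is a generic illustration, not the code from udevadm-settle.c; the idle pipe and the helper name poll_with_timeout are assumptions made for the example.

```python
import os
import select
import time


def poll_with_timeout(timeout_ms):
    """Poll an idle pipe and return (events, elapsed_seconds)."""
    read_fd, write_fd = os.pipe()
    try:
        poller = select.poll()
        poller.register(read_fd, select.POLLIN)
        start = time.monotonic()
        # Nothing is ever written to the pipe, so only the timeout
        # can end this call.
        events = poller.poll(timeout_ms)
        elapsed = time.monotonic() - start
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return events, elapsed


if __name__ == "__main__":
    events, elapsed = poll_with_timeout(100)
    print("events:", events, "elapsed: %.3fs" % elapsed)
```

In the hang described above, the equivalent call inside the guest never gets to the point of returning, even though the timeout it was given looks reasonable.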
In some cases the boot proceeds further and hangs in the dropbear key creation.
The timing of the usb-tablet device varies in the boot logs; however, the hang has also been observed with the usb-tablet device removed.
Running a build in parallel with testimage seems to make the issue occur more frequently.
I wondered about a connection with host cpufreq changes. In order to test, I booted with "intel_pstate=disable" on the kernel commandline and tried:
 while true; do cpupower frequency-set -f 1200000 > /dev/null; sleep 0.25; cpupower frequency-set -f 2200000 > /dev/null; done
I also tried pinning the host to the max cpu freq and pinning it to the min cpu freq. I've observed hangs in all combinations, although the likelihood of a hang seemed higher with the alternating frequency. I also tried the script:
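 import time
 import random
 
 # CPU numbers to touch: cpu0 through cpu86
 cpus = range(0, 87)
 # Frequencies in kHz: 1.2 GHz to 2.1 GHz in 100 MHz steps
 # (range() excludes the 2200000 endpoint)
 freqs = range(1200000, 2200000, 100000)
 
 while True:
     # Give every CPU a randomly chosen frequency, then pause briefly
     for cpunum in cpus:
         f = random.choice(freqs)
         with open("/sys/devices/system/cpu/cpu%d/cpufreq/scaling_setspeed" % cpunum, 'w') as speed:
             speed.write("%d\n" % f)
     time.sleep(0.05)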
which also seems to make hangs more likely somehow.
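One thing worth checking when repeating this experiment: writes to scaling_setspeed only take effect when the "userspace" cpufreq governor is active for that CPU. The notes above don't record which governor was in use, so the following is only a sketch of how it could be checked. The helper names governors and not_userspace, and the sysfs_root parameter (there so the code can be pointed at a test directory), are assumptions made for the example.

```python
import glob
import os


def governors(sysfs_root="/sys/devices/system/cpu"):
    """Return a dict mapping cpu name (e.g. 'cpu0') to its current governor."""
    result = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in sorted(glob.glob(pattern)):
        # Path layout is <root>/cpuN/cpufreq/scaling_governor
        cpu = path.split(os.sep)[-3]
        with open(path) as f:
            result[cpu] = f.read().strip()
    return result


def not_userspace(sysfs_root="/sys/devices/system/cpu"):
    """Return the sorted cpu names whose governor is not 'userspace'."""
    return sorted(cpu for cpu, gov in governors(sysfs_root).items()
                  if gov != "userspace")


if __name__ == "__main__":
    for cpu in not_userspace():
        print("%s is not using the userspace governor" % cpu)
```

If this prints nothing on the host, every CPU that exposes cpufreq is on the userspace governor and the frequency writes should be honoured.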
The powersave=off commandline option does not help.