[PATCH] selftests/powerpc: Add script to test HMI functionality

Denis Kirjanov kda at linux-powerpc.org
Wed Nov 18 22:33:27 AEDT 2015


On 11/18/15, Daniel Axtens <dja at axtens.net> wrote:
> HMIs (Hypervisor Management|Maintenance Interrupts) are a class of interrupt
> on POWER systems.
>
> HMI support has traditionally been exceptionally difficult to test. However
> Skiboot ships a tool that, with the correct magic numbers, will inject them.
>
> This, therefore, is a first pass at a script to inject HMIs and monitor
> Linux's response. It injects an HMI on each core on every chip in turn.
> It then watches dmesg to see if it's acknowledged by Linux.
>
> On a Tuletta, I observed that we see 8 (or sometimes 9 or more) events per
> injection, regardless of SMT setting, so we wait for 8 before progressing.
>
> It sits in a new scripts/ directory in selftests/powerpc, because it's not
> designed to be run as part of the regular make selftests process. In
> particular, it is quite possibly going to end up garding lots of your CPUs,
> so it should only be run if you know how to undo that.

Hi Daniel,

Could you explain why it's useful, and what it's useful for? Also,
it's a POWER8 feature, right?
>
> CC: Mahesh J Salgaonkar <mahesh.salgaonkar at in.ibm.com>
> Signed-off-by: Daniel Axtens <dja at axtens.net>
> ---
>  tools/testing/selftests/powerpc/scripts/hmi.sh | 77 ++++++++++++++++++++++++++
>  1 file changed, 77 insertions(+)
>  create mode 100755 tools/testing/selftests/powerpc/scripts/hmi.sh
>
> diff --git a/tools/testing/selftests/powerpc/scripts/hmi.sh b/tools/testing/selftests/powerpc/scripts/hmi.sh
> new file mode 100755
> index 000000000000..ebce03933784
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/scripts/hmi.sh
> @@ -0,0 +1,77 @@
> +#!/bin/sh
> +
> +# do we have ./getscom, ./putscom?
> +if [ -x ./getscom ] && [ -x ./putscom ]; then
> +	GETSCOM=./getscom
> +	PUTSCOM=./putscom
> +elif which getscom > /dev/null; then
> +	GETSCOM=$(which getscom)
> +	PUTSCOM=$(which putscom)
> +else
> +	cat <<EOF
> +Can't find getscom/putscom in . or \$PATH.
> +See https://github.com/open-power/skiboot.
> +The tool is in external/xscom-utils
> +EOF
> +	exit 1
> +fi
> +
> +# We will get 8 HMI events per injection
> +# todo: deal with things being offline
> +expected_hmis=8
> +COUNT_HMIS() {
> +    dmesg | grep -c 'Harmless Hypervisor Maintenance interrupt'
> +}
> +
> +# massively expand snooze delay, allowing injection on all cores
> +ppc64_cpu --smt-snooze-delay=1000000000
> +
> +# when we exit, restore it
> +trap "ppc64_cpu --smt-snooze-delay=100" 0 1
> +
> +# for each chip+core combination
> +# todo - less fragile parsing
> +egrep -o 'OCC: Chip [0-9a-f]+ Core [0-9a-f]' < /sys/firmware/opal/msglog |
> +while read chipcore; do
> +	chip=$(echo "$chipcore"|awk '{print $3}')
> +	core=$(echo "$chipcore"|awk '{print $5}')
> +	fir="0x1${core}013100"
> +
> +	# verify that Core FIR is zero as expected
> +	if [ "$($GETSCOM -c 0x${chip} $fir)" != 0 ]; then
> +		echo "FIR was not zero before injection for chip $chip, core $core. Aborting!"
> +		echo "Result of $GETSCOM -c 0x${chip} $fir:"
> +		$GETSCOM -c 0x${chip} $fir
> +		echo "If you get a -5 error, the core may be in idle state. Try stress-ng."
> +		echo "Otherwise, try $PUTSCOM -c 0x${chip} $fir 0"
> +		exit 1
> +	fi
> +
> +	# keep track of the number of HMIs handled
> +	old_hmis=$(COUNT_HMIS)
> +
> +	# do injection, adding a marker to dmesg for clarity
> +	echo "Injecting HMI on core $core, chip $chip" | tee /dev/kmsg
> +	# inject a RegFile recoverable error
> +	if ! $PUTSCOM -c 0x${chip} $fir 2000000000000000 > /dev/null; then
> +		echo "Error injecting. Aborting!"
> +		exit 1
> +	fi
> +
> +	# now we want to wait for all the HMIs to be processed
> +	# we expect one per thread on the core
> +	i=0;
> +	new_hmis=$(COUNT_HMIS)
> +	while [ $new_hmis -lt $((old_hmis + expected_hmis)) ] && [ $i -lt 12 ]; do
> +	    echo "Seen $((new_hmis - old_hmis)) HMI(s) out of $expected_hmis expected, sleeping"
> +	    sleep 5;
> +	    i=$((i + 1))
> +	    new_hmis=$(COUNT_HMIS)
> +	done
> +	if [ $i = 12 ]; then
> +	    echo "Haven't seen expected $expected_hmis recoveries after 1 min. Aborting."
> +	    exit 1
> +	fi
> +	echo "Processed $expected_hmis events; presumed success. Check dmesg."
> +	echo ""
> +done
> --
> 2.6.2
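
For reference, the wait-for-recovery step in the quoted script can be
pulled out into a small helper. What follows is only a sketch under
stated assumptions: wait_for_count and its arguments are hypothetical
names that do not appear in the patch. It re-runs a counting command
until the count reaches a target or a retry limit is hit. Unlike the
quoted loop, it checks the count before the retry limit, so a count
that only arrives on the final poll is still treated as success.

```sh
#!/bin/sh
# Sketch only: wait_for_count is a hypothetical helper, not part of the patch.
# usage: wait_for_count <target> <max_tries> <delay_seconds> <command...>
# Re-runs <command> until it prints a number >= <target>.
# Returns 0 on success, 1 once <max_tries> retries have been used up.
wait_for_count() {
	target=$1
	max_tries=$2
	delay=$3
	shift 3
	tries=0
	while :; do
		# Run the counting command and capture the number it prints.
		current=$("$@")
		# Test for success first, so the last poll still counts.
		[ "$current" -ge "$target" ] && return 0
		[ "$tries" -ge "$max_tries" ] && return 1
		tries=$((tries + 1))
		sleep "$delay"
	done
}
```

With the names from the quoted script, a call might look like
wait_for_count "$((old_hmis + expected_hmis))" 12 5 COUNT_HMIS || exit 1
which keeps the same 12 polls at 5 second intervals.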
>

