[PATCH 0/1] powerpc/numa: do not skip node 0 in lookup table
Daniel Henrique Barboza
danielhb413 at gmail.com
Sat Sep 5 06:06:57 AEST 2020
I discussed this with Aneesh Kumar on IBM's internal Slack a few weeks
ago, and he informed me that this patch does not make sense with the
design used by the kernel. The kernel assumes that, for node 0, all
associativity domains must also be zero. This is why node 0 is skipped
when creating the distance table.
This of course has consequences for QEMU, so based on that, I've adapted
the QEMU implementation to not touch node 0.
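
To illustrate the behaviour described above, here is a minimal user-space
sketch. It is not the kernel code: the names (lookup_table, init_lookup_table,
model_distance, setup_example), the depth of 4 and the domain values are all
hypothetical, and the rule of doubling the distance for each non-matching
level is my simplified reading of how the distance is derived.

```c
#define MAX_NODES 4
#define DEPTH 4              /* assumed number of associativity levels */
#define LOCAL_DISTANCE 10

/* Zero-initialized: a node that is never filled in keeps all-zero domains. */
static unsigned int lookup_table[MAX_NODES][DEPTH];

/* Record a node's domains. Node 0 is skipped, so its row stays zeroed. */
static void init_lookup_table(int nid, const unsigned int *domains)
{
    if (nid <= 0)
        return;
    for (int i = 0; i < DEPTH; i++)
        lookup_table[nid][i] = domains[i];
}

/* Start at the local distance and double it for each level that differs. */
static int model_distance(int a, int b)
{
    int distance = LOCAL_DISTANCE;

    for (int i = 0; i < DEPTH; i++) {
        if (lookup_table[a][i] == lookup_table[b][i])
            break;
        distance *= 2;
    }
    return distance;
}

/* Hypothetical topology: every node shares only the last domain. */
static void setup_example(void)
{
    static const unsigned int domains[MAX_NODES][DEPTH] = {
        { 5, 5, 5, 9 },      /* node 0: non-zero values, ignored by the skip */
        { 1, 1, 1, 9 },
        { 2, 2, 2, 9 },
        { 3, 3, 3, 9 },
    };

    for (int nid = 0; nid < MAX_NODES; nid++)
        init_lookup_table(nid, domains[nid]);
}
```

After setup_example(), model_distance(1, 2) is 80 because nodes 1 and 2 match
at the last level, while model_distance(0, 1) is 160 because node 0's row is
all zeros and never matches, regardless of the values supplied for it.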
Daniel
On 8/14/20 5:34 PM, Daniel Henrique Barboza wrote:
> Hi,
>
> This is a simple fix that I made while testing NUMA changes
> I'm making in QEMU [1]. Setting any non-zero value for the
> associativity of NUMA node 0 has no impact on the output
> of 'numactl' because the distance_lookup_table is never
> initialized for node 0.
>
> Looking through the LOPAPR spec and git history I found no
> technical reason to skip node 0, which makes me believe this is
> a bug that went under the radar up until now because no one
> attempted to set node 0 associativity like I'm doing now.
>
> For anyone wishing to give it a spin, use the QEMU build
> in [1] and experiment with NUMA distances, such as:
>
> sudo ./qemu-system-ppc64 \
> -machine pseries-5.2,accel=kvm,usb=off,dump-guest-core=off \
> -m 65536 -overcommit mem-lock=off \
> -smp 4,sockets=4,cores=1,threads=1 \
> -rtc base=utc -display none -vga none -nographic -boot menu=on \
> -device spapr-pci-host-bridge,index=1,id=pci.1 \
> -device spapr-pci-host-bridge,index=2,id=pci.2 \
> -device spapr-pci-host-bridge,index=3,id=pci.3 \
> -device spapr-pci-host-bridge,index=4,id=pci.4 \
> -device qemu-xhci,id=usb,bus=pci.0,addr=0x2 \
> -drive file=/home/danielhb/f32.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 \
> -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
> -device usb-kbd,id=input0,bus=usb.0,port=1 \
> -device usb-mouse,id=input1,bus=usb.0,port=2 \
> -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 \
> -msg timestamp=on \
> -numa node,nodeid=0,cpus=0 -numa node,nodeid=1,cpus=1 \
> -numa node,nodeid=2,cpus=2 -numa node,nodeid=3,cpus=3 \
> -numa dist,src=0,dst=1,val=80 -numa dist,src=0,dst=2,val=80 \
> -numa dist,src=0,dst=3,val=80 -numa dist,src=1,dst=2,val=80 \
> -numa dist,src=1,dst=3,val=80 -numa dist,src=2,dst=3,val=80
>
> The current kernel code will ignore the associativity of
> node 0, and numactl will output this:
>
> node distances:
> node    0    1    2    3
>   0:   10  160  160  160
>   1:  160   10   80   80
>   2:  160   80   10   80
>   3:  160   80   80   10
>
> With this patch:
>
> node distances:
> node    0    1    2    3
>   0:   10  160  160  160
>   1:  160   10   80   40
>   2:  160   80   10   20
>   3:  160   40   20   10
>
>
> If anyone wonders, this patch has no conflict with the proposed
> NUMA changes in [2] because Aneesh isn't changing this line.
>
>
> [1] https://github.com/danielhb/qemu/tree/spapr_numa_v1
> [2] https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200731111916.243569-1-aneesh.kumar@linux.ibm.com/
>
>
> Daniel Henrique Barboza (1):
> powerpc/numa: do not skip node 0 when init lookup table
>
> arch/powerpc/mm/numa.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>