lpar issue for ZONE_DEVICE p2pmem in 4.14-rc

Oliver oohall at gmail.com
Fri Oct 27 19:07:07 AEDT 2017


On Thu, Oct 26, 2017 at 1:34 AM, Oliver <oohall at gmail.com> wrote:
> On Tue, Oct 24, 2017 at 7:17 AM, Stephen Bates <sbates at raithlin.com> wrote:
>>
>>> [    3.537780] lpar: Attempting to resize HPT to shift 21
>>> [    3.539251] Unable to resize hash page table to target order 21: -1
>>> [    3.541079] Unable to create mapping for hot added memory 0xc000210000000000..0xc000210004000000: -2
>>
>>> For #1 above please check if your qemu supports H_RESIZE_HPT_* hcalls?
>>
>> Balbir, do you have any suggestions as to how to test for this support? Note I am running this on my x86_64 host so there is no virtualization hardware in my QEMU. My qemu is very recent (QEMU emulator version 2.10.50 (v2.10.0-1026-gd8f932c-dirty)).
>
> Honestly I'd just ignore the resize error. The hash table stores PTEs,
> so it should be sized based on the amount of memory in the system. If
> it's drastically undersized there'll be a performance hit, but
> everything should still work.
>
>>> For create mapping failures, the rc is -ENOENT. Can you help debug this further? We could do hcall tracing or enable debugging.
>>
>> Sure I can help debug. My original email also had all you needed to recreate this issue so that’s an option too?
>
> I'm not too sure what's happening there. My hunch is that the
> hypervisor (qemu in this case) is rejecting the attempt to map the PCI
> device MMIO space as cacheable memory. On bare metal systems this can
> result in cache paradoxes which will kill the system so the hypervisor
> has an incentive to prevent that situation.

So I had a deeper look and found that the hypervisor interface spec
(PAPR) says the hypervisor should reject attempts to map memory with
attributes that are inappropriate for the type of memory being mapped.
The pseries model in qemu interprets this by only allowing cacheable
mappings on memory ranges that it considers to be RAM, while KVM will
allow any mapping provided it has the same cacheable attribute as the
hypervisor's own mapping. Either way, trying to use
devm_memremap_pages() like this on pseries is fundamentally broken.
The alternative approach you mentioned that uses ioremap() should work
fine though.
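
To make the two policies concrete, here is a rough sketch in C. It
only models the rules as described above; none of these names exist in
qemu or KVM, the struct and both functions are made up for
illustration, and the real checks may well look at more than this.

```c
#include <stdbool.h>

/*
 * Hypothetical summary of what the hypervisor knows about a
 * guest-physical range. Not a real qemu or KVM type.
 */
struct range_info {
	bool is_ram;          /* hypervisor considers the range to be RAM */
	bool host_cacheable;  /* hypervisor's own mapping is cacheable */
};

/*
 * qemu-pseries-like rule (assumption, simplified): a cacheable guest
 * mapping is only accepted on ranges treated as RAM. Non-cacheable
 * requests are not modelled here and are simply let through.
 */
static bool qemu_like_allows(const struct range_info *r, bool want_cacheable)
{
	if (want_cacheable && !r->is_ram)
		return false;
	return true;
}

/*
 * KVM-like rule (assumption, simplified): the guest mapping must use
 * the same cacheability as the hypervisor's own mapping of the range.
 */
static bool kvm_like_allows(const struct range_info *r, bool want_cacheable)
{
	return want_cacheable == r->host_cacheable;
}
```

With a PCI BAR described as { .is_ram = false, .host_cacheable = false },
a cacheable request (roughly what devm_memremap_pages() ends up asking
for) is refused by both rules, while a non-cacheable one (the ioremap()
case) gets through both.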

Also, Alexy (+cc) said he was interested in trying this on some real
hardware. Is there a test suite for p2pmem floating around that he can
use?

Thanks,
Oliver

