[Skiboot] [PATCH] core/mem_region: Give up on freeing unmanaged memory block
Gavin Shan
gwshan at linux.vnet.ibm.com
Fri May 26 16:30:35 AEST 2017
On Fri, May 26, 2017 at 04:12:24PM +1000, Oliver O'Halloran wrote:
>On Fri, May 26, 2017 at 3:21 PM, Gavin Shan <gwshan at linux.vnet.ibm.com> wrote:
>> When adding memory regions to global list, the overlapped one
>> (including it's descriptor) is purged. However, the descriptor might
>> be in BSS (not HEAP) section. It will trigger the exception caused
>> by assert() in mem_free() as below backtrace indicates. The memory
>> regions causing this are "ibm,os-reserve" and "hostboot-reserve-0"
>> separately. The former descriptor is resident in BSS section while
>> the later one is in HEAP:
>>
>> CPU 0050 Backtrace:
>> S: 0000000031d43a00 R: 00000000300136d0 .backtrace+0x2c
>> S: 0000000031d43a90 R: 00000000300191ec ._abort+0x4c
>> S: 0000000031d43b10 R: 0000000030019268 .assert_fail+0x34
>> S: 0000000031d43b90 R: 0000000030015d08 .mem_free+0x64
>> S: 0000000031d43c20 R: 0000000030017080 .__free+0x38
>> S: 0000000031d43cb0 R: 0000000030015518 .add_region+0x230
>> S: 0000000031d43d60 R: 0000000030016854 .mem_region_init+0x2c8
>> S: 0000000031d43e30 R: 0000000030014994 .main_cpu_entry+0x3ec
>> S: 0000000031d43f00 R: 0000000030002648 boot_entry+0x198
>>
>> This fixes the issue by doing nothing when trying to freeing memory
>> block which is resident in unmanaged (not HEAP) region.
>>
>> Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>
>Not a fan. The specific bug you're seeing here is because of hostboot
>being broken on FSP machines and reserving memory that it shouldn't.
>This patch papers over the problem by making any invalid free()
>silently fail which is something we should avoid.
>
Well, It's not to fail sliently. The memory block resident in
bss/data can't simply free'd. I agree it's reasonable to avoid freeing
the memory chunks, which in unmanaged from source. If you're going
to do that, this patch can be dropped sliently. On the other hand,
I also found the last skiboot can't boot and it's related to get_hb_reserved_me().
Everything becomes fine when it's simply commented out (or skipped). It seems the
memory blocks (in device-tree) that should have reserved by kernel, aren't
reserved properly. It leads to data/memory corruption eventually. I'm
thinking the correct thing to do is mering the memory region with the
overlapped one in mem_region.c.
Cheers,
Gavin
>> ---
>> core/mem_region.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/core/mem_region.c b/core/mem_region.c
>> index a314558..bfb5ac9 100644
>> --- a/core/mem_region.c
>> +++ b/core/mem_region.c
>> @@ -469,9 +469,10 @@ void mem_free(struct mem_region *region, void *mem, const char *location)
>> if (!mem)
>> return;
>>
>> - /* Your memory is in the region, right? */
>> - assert(mem >= region_start(region) + sizeof(*hdr));
>> - assert(mem < region_start(region) + region->len);
>> + /* Give up if it is in unmanaged and invalid regions */
>> + if (mem < (region_start(region) + sizeof(*hdr)) ||
>> + mem >= (region_start(region) + region->len))
>> + return;
>>
>> /* Grab header. */
>> hdr = mem - sizeof(*hdr);
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Skiboot mailing list
>> Skiboot at lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/skiboot
>
More information about the Skiboot
mailing list