[Skiboot] [PATCH] core/mem_region: Give up on freeing unmanaged memory block

Gavin Shan gwshan at linux.vnet.ibm.com
Fri May 26 16:30:35 AEST 2017


On Fri, May 26, 2017 at 04:12:24PM +1000, Oliver O'Halloran wrote:
>On Fri, May 26, 2017 at 3:21 PM, Gavin Shan <gwshan at linux.vnet.ibm.com> wrote:
>> When adding memory regions to global list, the overlapped one
>> (including it's descriptor) is purged. However, the descriptor might
>> be in BSS (not HEAP) section. It will trigger the exception caused
>> by assert() in mem_free() as below backtrace indicates. The memory
>> regions causing this are "ibm,os-reserve" and "hostboot-reserve-0"
>> separately. The former descriptor is resident in BSS section while
>> the later one is in HEAP:
>>
>> CPU 0050 Backtrace:
>>  S: 0000000031d43a00 R: 00000000300136d0   .backtrace+0x2c
>>  S: 0000000031d43a90 R: 00000000300191ec   ._abort+0x4c
>>  S: 0000000031d43b10 R: 0000000030019268   .assert_fail+0x34
>>  S: 0000000031d43b90 R: 0000000030015d08   .mem_free+0x64
>>  S: 0000000031d43c20 R: 0000000030017080   .__free+0x38
>>  S: 0000000031d43cb0 R: 0000000030015518   .add_region+0x230
>>  S: 0000000031d43d60 R: 0000000030016854   .mem_region_init+0x2c8
>>  S: 0000000031d43e30 R: 0000000030014994   .main_cpu_entry+0x3ec
>>  S: 0000000031d43f00 R: 0000000030002648   boot_entry+0x198
>>
>> This fixes the issue by doing nothing when trying to freeing memory
>> block which is resident in unmanaged (not HEAP) region.
>>
>> Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>
>Not a fan. The specific bug you're seeing here is because of hostboot
>being broken on FSP machines and reserving memory that it shouldn't.
>This patch papers over the problem by making any invalid free()
>silently fail which is something we should avoid.
>

Well, It's not to fail sliently. The memory block resident in
bss/data can't simply free'd. I agree it's reasonable to avoid freeing
the memory chunks, which in unmanaged from source. If you're going
to do that, this patch can be dropped sliently. On the other hand,
I also found the last skiboot can't boot and it's related to get_hb_reserved_me().
Everything becomes fine when it's simply commented out (or skipped). It seems the
memory blocks (in device-tree) that should have reserved by kernel, aren't
reserved properly. It leads to data/memory corruption eventually. I'm
thinking the correct thing to do is mering the memory region with the
overlapped one in mem_region.c.

Cheers,
Gavin

>> ---
>>  core/mem_region.c | 7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/core/mem_region.c b/core/mem_region.c
>> index a314558..bfb5ac9 100644
>> --- a/core/mem_region.c
>> +++ b/core/mem_region.c
>> @@ -469,9 +469,10 @@ void mem_free(struct mem_region *region, void *mem, const char *location)
>>         if (!mem)
>>                 return;
>>
>> -       /* Your memory is in the region, right? */
>> -       assert(mem >= region_start(region) + sizeof(*hdr));
>> -       assert(mem < region_start(region) + region->len);
>> +       /* Give up if it is in unmanaged and invalid regions */
>> +       if (mem < (region_start(region) + sizeof(*hdr)) ||
>> +           mem >= (region_start(region) + region->len))
>> +               return;
>>
>>         /* Grab header. */
>>         hdr = mem - sizeof(*hdr);
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Skiboot mailing list
>> Skiboot at lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/skiboot
>



More information about the Skiboot mailing list