[Skiboot] [PATCH] core/mem_region: Give up on freeing unmanaged memory block

Oliver O'Halloran oohall at gmail.com
Fri May 26 16:57:19 AEST 2017


On Fri, May 26, 2017 at 4:30 PM, Gavin Shan <gwshan at linux.vnet.ibm.com> wrote:
> On Fri, May 26, 2017 at 04:12:24PM +1000, Oliver O'Halloran wrote:
>>On Fri, May 26, 2017 at 3:21 PM, Gavin Shan <gwshan at linux.vnet.ibm.com> wrote:
>>> When adding memory regions to global list, the overlapped one
>>> (including it's descriptor) is purged. However, the descriptor might
>>> be in BSS (not HEAP) section. It will trigger the exception caused
>>> by assert() in mem_free() as below backtrace indicates. The memory
>>> regions causing this are "ibm,os-reserve" and "hostboot-reserve-0"
>>> separately. The former descriptor is resident in BSS section while
>>> the later one is in HEAP:
>>>
>>> CPU 0050 Backtrace:
>>>  S: 0000000031d43a00 R: 00000000300136d0   .backtrace+0x2c
>>>  S: 0000000031d43a90 R: 00000000300191ec   ._abort+0x4c
>>>  S: 0000000031d43b10 R: 0000000030019268   .assert_fail+0x34
>>>  S: 0000000031d43b90 R: 0000000030015d08   .mem_free+0x64
>>>  S: 0000000031d43c20 R: 0000000030017080   .__free+0x38
>>>  S: 0000000031d43cb0 R: 0000000030015518   .add_region+0x230
>>>  S: 0000000031d43d60 R: 0000000030016854   .mem_region_init+0x2c8
>>>  S: 0000000031d43e30 R: 0000000030014994   .main_cpu_entry+0x3ec
>>>  S: 0000000031d43f00 R: 0000000030002648   boot_entry+0x198
>>>
>>> This fixes the issue by doing nothing when trying to freeing memory
>>> block which is resident in unmanaged (not HEAP) region.
>>>
>>> Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>>
>>Not a fan. The specific bug you're seeing here is because of hostboot
>>being broken on FSP machines and reserving memory that it shouldn't.
>>This patch papers over the problem by making any invalid free()
>>silently fail which is something we should avoid.
>>
>
> Well, It's not to fail sliently. The memory block resident in
> bss/data can't simply free'd. I agree it's reasonable to avoid freeing
> the memory chunks, which in unmanaged from source. If you're going
> to do that, this patch can be dropped sliently. On the other hand,
> I also found the last skiboot can't boot and it's related to get_hb_reserved_me().
> Everything becomes fine when it's simply commented out (or skipped). It seems the
> memory blocks (in device-tree) that should have reserved by kernel, aren't
> reserved properly. It leads to data/memory corruption eventually. I'm
> thinking the correct thing to do is mering the memory region with the
> overlapped one in mem_region.c.

In this situation merging the overlapping reservations is fine, but we
can't rely on that being true in general. We've had problems in the
past with reservations for the OCC being overlapped with those for the
hostboot runtime services. In those situations I would prefer we just
crashed early rather than silently corrupting state.

>
> Cheers,
> Gavin
>
>>> ---
>>>  core/mem_region.c | 7 ++++---
>>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/core/mem_region.c b/core/mem_region.c
>>> index a314558..bfb5ac9 100644
>>> --- a/core/mem_region.c
>>> +++ b/core/mem_region.c
>>> @@ -469,9 +469,10 @@ void mem_free(struct mem_region *region, void *mem, const char *location)
>>>         if (!mem)
>>>                 return;
>>>
>>> -       /* Your memory is in the region, right? */
>>> -       assert(mem >= region_start(region) + sizeof(*hdr));
>>> -       assert(mem < region_start(region) + region->len);
>>> +       /* Give up if it is in unmanaged and invalid regions */
>>> +       if (mem < (region_start(region) + sizeof(*hdr)) ||
>>> +           mem >= (region_start(region) + region->len))
>>> +               return;
>>>
>>>         /* Grab header. */
>>>         hdr = mem - sizeof(*hdr);
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> Skiboot mailing list
>>> Skiboot at lists.ozlabs.org
>>> https://lists.ozlabs.org/listinfo/skiboot
>>
>


More information about the Skiboot mailing list