Different SIGSEGV codes (x86 and ppc64le)

Michael Ellerman mpe at ellerman.id.au
Wed Jan 20 16:34:21 AEDT 2016


On Tue, 2016-01-19 at 18:49 -0200, Breno Leitao wrote:

> During some debugging, we found that during a stack overflow, the SIGSEGV code
> returned is different on Power and Intel.
> 
> We were able to narrow down the test case to the following simple code:
> 
>   https://github.com/leitao/stack/blob/master/overflow.c

[So the first thing I did was disable your signal handler, because that just
 complicates things.]

> On Power, the SIGSEGV si->si_code is 2 (SEGV_ACCERR), meaning "access error". On
> the other hand, the same test on x86 returns si->si_code = 1 (SEGV_MAPERR),
> meaning "invalid permission". Any idea why there is such a difference?

This seems to be a result of the stack guard page. Whenever the lowest page of
the stack vma is faulted in, the kernel grows the vma down one page.

That means in do_page_fault() we don't ever see a bad area (i.e. no vma found)
for the stack. Instead we find a vma, and call handle_mm_fault(), which then
tries to expand the stack down in check_stack_guard_page(). Then in
expand_downwards() we call acct_stack_growth() which checks the stack ulimit,
and that is what fails.
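
As a rough userspace illustration of that last check (this is not the kernel's
acct_stack_growth() itself, and the helper name stack_size_within_rlimit() is
made up for this example), the test amounts to comparing the would-be stack
size against the soft RLIMIT_STACK value:

```c
#include <stdbool.h>
#include <sys/resource.h>

/*
 * Userspace illustration only. It mirrors the idea of the stack ulimit
 * test, not the kernel implementation. The function name is hypothetical.
 */
static bool stack_size_within_rlimit(rlim_t new_size)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_STACK, &rl) != 0)
		return false;	/* treat a failed query as "not allowed" */

	if (rl.rlim_cur == RLIM_INFINITY)
		return true;	/* no soft limit, so any size passes this test */

	return new_size <= rl.rlim_cur;
}
```

When this kind of check fails in the kernel, the stack vma is not expanded and
the fault that triggered the growth cannot be satisfied.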

That means the failure comes from handle_mm_fault(), and by that point in the
logic we have already set code to SEGV_ACCERR. So even though we goto bad_area,
code is SEGV_ACCERR and that's what you see.

x86, on the other hand, handles the error path differently: it passes the error
down to mm_fault_error(), which calls bad_area_nosemaphore(), which always
specifies SEGV_MAPERR for VM_FAULT_SIGSEGV.
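
To make the difference concrete, here is a toy model of the two paths. It is
not kernel code: the function names and the two boolean inputs are invented for
illustration, and everything except the choice of si_code is left out. The
initial value of "code" on the powerpc side is an assumption of the model.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>	/* SEGV_MAPERR, SEGV_ACCERR */
#include <stdbool.h>

/*
 * Toy model, NOT kernel code. All names here are hypothetical.
 * A return value of 0 means "fault handled, no signal sent".
 */

/* powerpc-style: one local "code" variable, updated as the fault progresses. */
static int model_powerpc_si_code(bool vma_found, bool stack_growth_ok)
{
	int code = SEGV_MAPERR;	/* assumed default: nothing mapped there */

	if (!vma_found)
		return code;	/* classic bad_area case */

	code = SEGV_ACCERR;	/* a vma exists, so code is switched early */

	if (!stack_growth_ok)
		return code;	/* handle_mm_fault() failed, code is still ACCERR */

	return 0;
}

/* x86-style: a SIGSEGV result from handle_mm_fault() is reported as MAPERR. */
static int model_x86_si_code(bool vma_found, bool stack_growth_ok)
{
	if (!vma_found)
		return SEGV_MAPERR;

	if (!stack_growth_ok)
		return SEGV_MAPERR;	/* mm_fault_error() -> bad_area_nosemaphore() */

	return 0;
}
```

With vma_found true and stack_growth_ok false (the stack overflow case), the
first function yields SEGV_ACCERR and the second SEGV_MAPERR, which matches the
behaviour reported above.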

The kernel describes those error codes as:

  #define SEGV_MAPERR	(__SI_FAULT|1)	/* address not mapped to object */
  #define SEGV_ACCERR	(__SI_FAULT|2)	/* invalid permissions for mapped object */
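
Userspace gets the same two values from <signal.h>. A minimal sketch for
observing which one a stack overflow produces is below. It is not the
overflow.c linked above: the names (segv_code_on_stack_overflow(), recurse(),
on_segv()) and the 1 MiB stack cap are assumptions made for this example. The
overflow happens in a forked child, whose handler runs on an alternate stack
and reports si_code back through a pipe.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_CAP (1024UL * 1024UL)	/* arbitrary 1 MiB cap for the child */

static int report_fd = -1;	/* write end of the pipe, used by the handler */

/* Runs on the alternate stack; only async-signal-safe calls are used. */
static void on_segv(int sig, siginfo_t *si, void *ctx)
{
	int code = si->si_code;

	(void)sig;
	(void)ctx;
	if (write(report_fd, &code, sizeof(code)) != (ssize_t)sizeof(code))
		_exit(2);
	_exit(0);
}

/* Unbounded recursion; the volatile buffer forces a real frame per call. */
static int recurse(int depth)
{
	volatile char pad[4096];

	pad[0] = (char)depth;
	return recurse(depth + 1) + pad[0];
}

/* Returns the si_code seen by a child that overflows its stack, or -1. */
static int segv_code_on_stack_overflow(void)
{
	int fds[2], code = -1;
	pid_t pid;

	if (pipe(fds) != 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	if (pid == 0) {
		static char alt_stack[64 * 1024];
		struct rlimit rl;
		struct sigaction sa;
		stack_t ss;

		close(fds[0]);
		report_fd = fds[1];

		/* Lower the soft limit so the ulimit check fails quickly. */
		if (getrlimit(RLIMIT_STACK, &rl) == 0 && rl.rlim_cur > STACK_CAP) {
			rl.rlim_cur = STACK_CAP;
			setrlimit(RLIMIT_STACK, &rl);
		}

		memset(&ss, 0, sizeof(ss));
		ss.ss_sp = alt_stack;
		ss.ss_size = sizeof(alt_stack);

		memset(&sa, 0, sizeof(sa));
		sa.sa_sigaction = on_segv;
		sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
		sigemptyset(&sa.sa_mask);

		if (sigaltstack(&ss, NULL) != 0 || sigaction(SIGSEGV, &sa, NULL) != 0)
			_exit(1);

		(void)recurse(0);
		_exit(1);	/* not reached */
	}

	close(fds[1]);
	if (read(fds[0], &code, sizeof(code)) != (ssize_t)sizeof(code))
		code = -1;
	close(fds[0]);
	waitpid(pid, NULL, 0);
	return code;
}
```

Calling segv_code_on_stack_overflow() from main() and comparing the result
against SEGV_MAPERR and SEGV_ACCERR shows which value a given kernel and
architecture reports.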

Which one is correct in this case isn't entirely clear. There is a stack
mapping, but you're not allowed to use it because of the stack ulimit, so
arguably ACCERR is more accurate.

However, that's only true because of the stack guard page, which is supposed to
be somewhat invisible to userspace. If I disable the stack guard page logic,
userspace sees SEGV_MAPERR, so it seems that historically that's what is
expected.

So we should probably fix this on powerpc.

It also makes me think the logic we have in do_page_fault() to directly expand
the stack (around line 375) is now dead code.

cheers


