[Linuxppc-users] reasigning fp breaks the call chain

Buse Yilmaz busey at vt.edu
Thu Sep 21 06:55:39 AEST 2017


Hello again,
Yes, it's really hard to give the exact context in an email; I tried to
explain as much as I could. I know it sounds very abstract, apologies for
that.
> "the call chain itself is broken as soon as I switch to the destination",
what exactly does that mean?
I meant that I wasn't able to see the frames of the functions that are
still alive on the stack. As you guessed, Dr. Weigand, I debug using gdb and
tell this by doing a backtrace. The program doesn't crash, but it's missing
some frames on the way back to "main".

>"the callers don't have their frames on the stack yet" -- note that in the
ELFv2 ABI, there is an area of 32 bytes at the very bottom of each frame
including the back chain pointer, the LR save area, and the TOC save area.
This area formally belongs to the *caller* but is not actually used by the
caller -- the caller just allocates it on behalf of functions it calls,
which are free to use that space; all contents of that area are maintained
and used by the called function. So if you set up only the frame of the
callee, but do not properly initialize this area which is formally part of
the caller's frame, then the callee may not function correctly (in
particular when it attempts to return).
Yes, the back chain pointer was also set correctly. Since we aid the
compiler with LLVM's metadata, we collected all the information needed for
resuming execution on the destination architecture.

>- Talking about the back chain field, this is something that other ABIs
don't have in this form. This field links each frame to its caller's frame,
so everything that copies or recreates stack frames must of course also
recompute new values for the back chain fields (just like you apparently do
for SP and FP). Do you do that? I'm wondering because you point out that
you force usage of an FP, but on Power this is usually not necessary since
we already have the back chain that can be used for those purposes you
usually use the FP for on other ABIs.
Yes, we enforced usage of FP by simply passing -fno-omit-frame-pointer.
This was a design decision, simply because we have other architectures
such as x86 and ARM and, as you also pointed out, they use FP. We wanted
to generalize the code to support all the architectures we have as much as
possible (well, this might not be the best idea).
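
For readers following the back chain discussion, here is a minimal sketch of
the 32-byte ELFv2 frame header and of relinking the back chain after frames
have been copied. It is not the project's code: the struct, field and
function names are hypothetical, and the frames are modelled as plain C
objects so the sketch can run on any host rather than on a real Power stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ELFv2 frame header: the 32 bytes at the lowest address of each frame.
   Field names are illustrative; the offsets follow the ELFv2 ABI. */
struct elfv2_frame_header {
    uint64_t back_chain; /* offset 0: address of the caller's frame, 0 ends the chain */
    uint64_t cr_save;    /* offset 8: CR save word plus padding */
    uint64_t lr_save;    /* offset 16: LR save slot, written by the called function */
    uint64_t toc_save;   /* offset 24: TOC pointer (r2) save slot */
};

_Static_assert(offsetof(struct elfv2_frame_header, lr_save) == 16,
               "LR save slot must sit at offset 16");
_Static_assert(sizeof(struct elfv2_frame_header) == 32,
               "frame header must be 32 bytes");

/* Rewrite the back chain fields after frames were copied to a new place.
   frames[0] is the innermost frame, frames[count - 1] the outermost. */
static void relink_back_chain(struct elfv2_frame_header **frames, size_t count)
{
    assert(frames != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        frames[i]->back_chain =
            (i + 1 < count) ? (uint64_t)(uintptr_t)frames[i + 1] : 0;
    }
}

/* Follow the back chain from `sp`, storing up to `max` frame addresses in
   `out`. Returns the number of frames visited. */
static size_t walk_back_chain(const struct elfv2_frame_header *sp,
                              const struct elfv2_frame_header **out,
                              size_t max)
{
    size_t n = 0;
    while (sp != NULL && n < max) {
        out[n++] = sp;
        sp = (const struct elfv2_frame_header *)(uintptr_t)sp->back_chain;
    }
    return n;
}
```

A walker of this kind needs only the stack pointer as a starting point, which
is why a separate FP is usually not required for unwinding on Power.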

>This may seem pretty basic, but have you considered sigsetjmp() and
siglongjmp()?
No, actually I haven't. But thanks for pointing this out, Gerrit. We might
consider this if it makes things easier.
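
As a point of reference, a minimal sketch of the sigsetjmp()/siglongjmp()
pattern looks roughly like this. The function names leaf and run_and_unwind
are hypothetical. Note that this only jumps back to a context that is still
live on the same stack; it does not rebuild frames, so whether it fits a
cross-ISA migration is an open question.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>

static sigjmp_buf resume_point;

/* Hypothetical call chain: recurse `depth` levels, then jump straight back
   to the saved context instead of returning frame by frame. */
static void leaf(int depth)
{
    assert(depth >= 0);
    if (depth == 0)
        siglongjmp(resume_point, 1); /* does not return */
    leaf(depth - 1);
}

/* Returns 1 if control came back through siglongjmp, 0 if leaf() returned
   normally (which it never does in this sketch). */
static int run_and_unwind(int depth)
{
    /* Second argument 1: save the signal mask along with the registers. */
    if (sigsetjmp(resume_point, 1) != 0)
        return 1;
    leaf(depth);
    return 0;
}
```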


I have an update: we resolved the issue. One thing I noticed was that FP
wasn't correctly set after resuming execution on the destination. It was
already pointing to the top of the stack like SP, but execution starts
from the beginning of the function, so the old FP needs to be saved on the
stack first. This is fixed, but frames were still missing. There was also
something wrong with how the frames were walked back: there were 2 frames
with the same return address, one of which should not have existed. With
these two fixes we were able to solve the issue. I'm still confused about
how this code was working properly on the other architectures.
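
A check for the duplicated-frame symptom could be sketched as follows,
assuming the return addresses have already been collected while walking the
frames, innermost first, and that the spurious frame sits next to the real
one. The function name is hypothetical, and this is not the fix that was
actually applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first frame whose return address equals that of
   the frame just before it, or -1 if no adjacent pair matches. */
static ptrdiff_t find_duplicate_return(const uint64_t *ret_addrs, size_t count)
{
    assert(ret_addrs != NULL || count == 0);
    for (size_t i = 1; i < count; i++) {
        if (ret_addrs[i] == ret_addrs[i - 1])
            return (ptrdiff_t)i;
    }
    return -1;
}
```

A hit is only a hint: a directly recursive function legitimately produces
adjacent frames with the same return address.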


Thank you very much for helping me out, even though the information I could
provide was not enough!

On Wed, Sep 20, 2017 at 4:50 PM, Gerrit Huizenga [Notes] <gerrit at us.ibm.com>
wrote:

> This may seem pretty basic, but have you considered sigsetjmp() and
> siglongjmp()?
>
>
> gerrit
>
> --
> "Only those who will risk going too far can possibly find out how far one
> can go." ~ T. S. Eliot
>
> Gerrit Huizenga, STSM
> Power Open Source Ecosystem Lead
> gerrit at us.ibm.com
>
>
> From: "Ulrich Weigand" <Ulrich.Weigand at de.ibm.com>
> To: Buse Yilmaz <busey at vt.edu>
> Cc: linuxppc-users at lists.ozlabs.org
> Date: 09/20/2017 07:07 AM
> Subject: Re: [Linuxppc-users] reasigning fp breaks the call chain
> Sent by: "Linuxppc-users" <linuxppc-users-bounces+gerrit=us.ibm.com at lists.
> ozlabs.org>
> ------------------------------
>
>
>
> Hi Buse,
>
> I'm not sure I completely follow what exactly you're doing. In particular,
> when you write "the call chain itself is broken as soon as I switch to the
> destination", what exactly does that mean? What specifically is "broken"
> and what are the symptoms of that breakage? Is it that the "backtrace"
> command in GDB doesn't show what you expect, is it that when you start
> executing code the function return crashes or doesn't return to the caller
> you expect, or what?
>
> Not knowing in more detail what you're doing, just some observations that
> make me suspicious:
>
> - "the callers don't have their frames on the stack yet" -- note that in
> the ELFv2 ABI, there is an area of 32 bytes at the very bottom of each
> frame including the back chain pointer, the LR save area, and the TOC save
> area. This area formally belongs to the *caller* but is not actually used
> by the caller -- the caller just allocates it on behalf of functions it
> calls, which are free to use that space; all contents of that area are
> maintained and used by the called function. So if you set up only the frame
> of the callee, but do not properly initialize this area which is formally
> part of the caller's frame, then the callee may not function correctly (in
> particular when it attempts to return).
>
> - Talking about the back chain field, this is something that other ABIs
> don't have in this form. This field links each frame to its caller's frame,
> so everything that copies or recreates stack frames must of course also
> recompute new values for the back chain fields (just like you apparently do
> for SP and FP). Do you do that? I'm wondering because you point out that
> you force usage of an FP, but on Power this is usually not necessary since
> we already have the back chain that can be used for those purposes you
> usually use the FP for on other ABIs.
>
>
> Mit freundlichen Gruessen / Best Regards
>
> Ulrich Weigand
>
> --
> Dr. Ulrich Weigand | Phone: +49-7031/16-3727
> STSM, GNU/Linux compilers and toolchain
> IBM Deutschland Research & Development GmbH
> Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk
> Wittkopp
> Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart,
> HRB 243294
>
>
> From: Buse Yilmaz <busey at vt.edu>
> To: linuxppc-users at lists.ozlabs.org
> Date: 19.09.2017 19:08
> Subject: [Linuxppc-users] reasigning fp breaks the call chain
> Sent by: "Linuxppc-users" <linuxppc-users-bounces+uweigand=
> de.ibm.com at lists.ozlabs.org>
> ------------------------------
>
>
>
> Hello,
> I'm working on a project that does migration between machines with
> different ISAs (currently x86_64, AArch64 and PowerPC64). The migration is
> done with a compiler and runtime support based on LLVM (3.7) that does
> stack transformation. It generates binaries for all ISAs and resumes
> execution on the destination architecture. For this purpose we record the
> registers and walk the call chain to record any other information needed,
> such as callee-saved registers, live values, the addresses that FP and SP
> point to, the CFA, the TOC, etc. We enforce the usage of an FP. Then we
> create the same call chain on the destination architecture.
>
> To test our stack transformation, we first try it on a single architecture,
> treating it as a migration from a machine with ISA x to another machine
> with the same ISA. We take an architecture, say PowerPC, divide its stack
> into 2, and walk the call chain on the upper part until the leaf function
> is hit. Then we switch to the lower part, treating it as the destination
> architecture, and rewrite the frames there.
>
> To resume the execution when we switch to the lower part of the stack, we
> jump to the beginning of the leaf function and attach FP and SP accordingly
> (we already know all the register values of this function as well as its
> frame size) and load the register set with correct values.
>
> We're able to walk the chain up on the destination and create all the call
> frames; however, the call chain itself is broken as soon as I switch to the
> destination. To be more precise, it's broken when I move the FP to point to
> SP on the destination stack (this is how LLVM does it: FP points to the top
> of the stack just as SP does). So I'm left with some frames missing and no
> crashes, but the execution is not performed correctly.
>
> I assume that the back chain is broken on the destination since we resume
> starting from the leaf function; at this point the callers don't have their
> frames on the stack yet.
>
> I wonder if creating the frames on the destination in the reverse order
> (i.e. as a normal execution would, filling the stack with frames starting
> from the caller, not the callee) would help.
>
> I'm looking forward to your help. Apologies for what I have described being
> very abstract and long.
>
> P.S. We observe this behavior on neither x86 nor ARM.
>
> Thank you!
>
>
>
> --
> Buse
> _______________________________________________
> Linuxppc-users mailing list
> Linuxppc-users at lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-users
>
>
>
>
>


-- 
Buse