[PATCH] Fix for OProfile callgraph for Power 64 bit user apps

Carl Love cel at us.ibm.com
Sat May 17 02:32:03 EST 2008


On Thu, 2008-05-15 at 11:01 -0700, Carl Love wrote:
> On Thu, 2008-05-15 at 20:47 +1000, Paul Mackerras wrote:
> > Carl Love writes:
> > 
> > > The following patch fixes the 64 bit user code backtrace 
> > > which currently may hang the system.  
> > 
> > What exactly is wrong with it?
> > 
> > Having now taken a much closer look, I now don't think Nate Case's
> > patch addresses this, since it only affects constant size arguments
> > <= 8 to copy_{to,from}_user_inatomic.
> > 
> > However, I don't see why your patch fixes anything.  It means we do
> > two access_ok calls and two __copy_from_user_inatomic calls, for 8
> > bytes, at sp and at sp + 16, rather than doing one access_ok and
> > __copy_from_user_inatomic for 24 bytes at sp.  Why does that make any
> > difference (apart from being slower)?
> > 
> > Paul
> 
> When I tried testing the oprofile call graph on a 64 bit app the system
> consistently hung.  I was able to isolate it to the
> __copy_from_user_inatomic() call.  When I made the change in my patch to
> make sure I was only requesting one of the sizes (8 bytes) listed in the
> case statement, this fixed the issue.  I do not know, nor was I able to
> figure out, why the __copy_from_user_inatomic() call failed trying to
> read 24 bytes.  The system would hang, and any attempt to use printk to
> see what was going on failed because the output did not reach
> the console before the system hung.
> 
> I backed out my patch, put in Nate's patch.  The call graph test ran
> fine.  I then backed out Nate's patch to go back and try to re-validate
> that the system still hangs with the original code and it is not
> hanging.  Not sure why it now seems to work.  I have done some other
> work on the system but I don't see how that would have changed this.
> Argh, I hate chasing phantom bugs!  I was working on 2.6.21. I believe
> the 2.6.21 kernel had not been changed. Let me load the latest 2.6.25
> and start over with a pristine kernel and see if I can reproduce the
> hang.  Sorry for all the hassle.
> 
>                  Carl Love
> 

I installed the latest 2.6.25 kernel and tested the OProfile call graph
on a 64 bit user application.  I did not see any hangs in the tests that
I ran, and I repeated them multiple times.  So, I guess we should drop the
OProfile callgraph patch.  If there still is a problem, it is probably not
in how the OProfile call graph code is written but in the
underlying calls, i.e. __copy_from_user_inatomic().  I will continue to
test the functionality and see if I can find an example where the system
hangs so we can investigate the underlying cause.
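
For reference, the two read patterns Paul compares above (one 24-byte
copy at sp versus two 8-byte copies at sp and sp + 16) can be sketched
roughly as follows.  This is a user-space illustration only, not the
kernel code: fake_access_ok() and fake_copy_from_user() are hypothetical
stand-ins for access_ok() and __copy_from_user_inatomic(), and the
function names read_frame_single() and read_frame_split() are made up
for this sketch.  It assumes the 64 bit PowerPC ELF frame layout, with
the back chain at offset 0 and the saved LR at offset 16.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-in for access_ok(): nonzero if usable. */
static int fake_access_ok(const void *addr, size_t len)
{
	return addr != NULL && len > 0;
}

/*
 * Hypothetical stand-in for __copy_from_user_inatomic(): returns the
 * number of bytes NOT copied, so 0 means success.
 */
static size_t fake_copy_from_user(void *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, len);
	return 0;
}

/* One check and one 24-byte copy at sp. */
int read_frame_single(const uint64_t *sp, uint64_t *next_sp, uint64_t *lr)
{
	uint64_t frame[3];

	if (!fake_access_ok(sp, sizeof(frame)))
		return -1;
	if (fake_copy_from_user(frame, sp, sizeof(frame)))
		return -1;

	*next_sp = frame[0];	/* back chain, offset 0 */
	*lr = frame[2];		/* saved LR, offset 16 */
	return 0;
}

/* Two checks and two 8-byte copies, at sp and at sp + 16. */
int read_frame_split(const uint64_t *sp, uint64_t *next_sp, uint64_t *lr)
{
	if (!fake_access_ok(sp, sizeof(*next_sp)) ||
	    fake_copy_from_user(next_sp, sp, sizeof(*next_sp)))
		return -1;

	/* sp is a uint64_t pointer, so sp + 2 is 16 bytes past sp. */
	if (!fake_access_ok(sp + 2, sizeof(*lr)) ||
	    fake_copy_from_user(lr, sp + 2, sizeof(*lr)))
		return -1;

	return 0;
}
```

With these stand-ins both variants return the same back chain and LR
for a given frame, which matches Paul's point that the split version
should differ only in cost and not in result.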

Thank you for your time on this. 

               Carl Love




