mapping memory in 0xb space

David Gibson david at
Fri Oct 1 14:03:25 EST 2004

On Wed, Sep 29, 2004 at 12:14:08AM -0500, Igor Grobman wrote:
> On Wed, 29 Sep 2004, David Gibson wrote:
> > On Tue, Sep 28, 2004 at 01:52:16PM -0500, Igor Grobman wrote:
> > > On Tue, 28 Sep 2004, David Gibson wrote:
> > > 
> > > >  Recent kernels don't even
> > > > have VSIDs allocated for the 0xb... region.
> > > 
> > > Looking at both 2.6.8 and 2.4.21, I don't see a difference in
> > > get_kernel_vsid() code.
> > 
> > Ok, *very* recent kernels.  The new VSID algorithm has gone into the
> > BK tree since 2.6.8.
> From the description I read, I might be better off using 0xfff.. addresses 
> with that algorithm.  Not a big deal.

Perhaps.  However, there are issues there as well: older kernels have
the same 41-bit address restriction (maybe somewhat extendable) in the
0xf region, just like 0xb.  The new VSID algo gives VSIDs for every
address above 0xc000000000000000 *except* the very last segment.

> > > This leaves segments.  Both
> > > DataAccess_common and DataAccessSLB_common call
> > > do_stab_bolted/do_slb_bolted when confronted with an address in 0xb
> > > region.
> > 
> > Oh, so it does.  That, I think is a 2.4 thing, long gone in 2.6 (even
> > before the SLB rewrite, I'm pretty sure do_slb_bolted was only called
> > for 0xc addresses).
> In my 2.4.21 source, do_slb_bolted does get called for 0xb addresses.
> And thanks for letting me know about power4 being SLB-based.  I was clueless on 
> the issue.

> > > Presumably, this will fault in the segments I am interested in.
> > 
> > Yes, actually, it should.  Ok, I guess the problem is deeper than I
> > thought.
> Or is it?
> > > Also, I narrowed it down to
> > > working (or appearing to work) as long as the highest 5 bits of the page
> > > index (those that end up as partial index in the HPTE) are zero.  This may
> > > just be a weird coincidence.
> > 
> > Could be.
> > 
> > > > Why on earth do you want to do this?
> > > 
> > > Good question ;-).  A long long time ago, I posted on this list and
> > > explained.  Since then, I found what appeared to be a solution, except
> > > that it appears power4 breaks it.  I am building a tool that allows
> > > dynamic splicing of code into a running kernel (see
> > >  In order for this to work, I
> > > need to be able to overwrite a single instruction with a jump to
> > > spliced-in code.  The target of the jump needs to be within the range (26
> > > bits).  Therefore, I have a choice of 0xbfff.. addresses with backward
> > > jumps from 0xc region, or the 0xff.. addresses for absolute jumps.  I
> > > chose 0xbff.., because I found already-working code, originally written
> > > for the performance counter interface.  Am I making more sense now?
> > 
> > Aha!  But this does actually explain the problem - there are only
> > VSIDs assigned for the first 2^41 bytes of each region - so although
> > there are vsids for 0xb000000000000000-0xb00001ffffffffff, there
> > aren't any for 0xbff... addresses.  Likewise the Linux pagetables only
> > cover a 41-bit address range, but that won't matter if you're creating
> > HPTEs directly.
> And this is why I avoided explaining fully in my first email :-).  I'd 
> like to solve one problem at a time.  What I said in my initial email
> is accurate.  Even within the valid VSID range, if the highest 5 bits of 
> the page index are not zero, I get a crash on access
> (e.g. 0xb00001FFFFF00000 crashes, but 0xb00001FFF0000000 works).  

Hrm.  Ok.  I'm not sure why that would be.

> As for why I thought 0xbff would work,  I reasoned that
> since the highest bits are masked out in get_kernel_vsid(), and since 
> nobody else is using the 0xb region, it doesn't matter if I get a VSID 
> that is the same as some other VSID in 0xb region.  However, I did not 
> consider the bug in do_slb_bolted that you describe below.

Yes, with that bug the collision can be with a segment anywhere, not
just in the 0xb region.

> > You may have seen the comment in do_slb_bolted which claims to permit
> > a full 32-bits of ESID - it's wrong.  The code doesn't mask the ESID
> > down to 13 bits as get_kernel_vsid() does, but it probably should - an
> > overlarge ESID will cause collisions with VSIDs from entirely
> > different address spaces, which would be a Bad Thing.
> This must be happening, although I would still like to know why it 
> misbehaves even within the valid VSID range.
> > 
> > Actually, you should be able to allow ESIDs of up to 21 bits there (36
> > bit VSID - 15 bits of "context").  But you will need to make sure
> > get_kernel_vsid(), or whatever you're using to calculate the VAs for
> > the hash HPTEs is updated to match - at the moment I think it will
> > mask down to 13 bits.  I'm not sure if that will get you sufficiently
> > close to 0xc0... for your purposes.
> No, it's not close enough--I really must have that very last segment.   
> It sounds like I was simply getting lucky on the power3 machine.
> Without the mask, I must have been getting random pages, and
> happily overwriting them.  
> Any ideas on how I might  map that very last segment of 0xb, or for
> that matter the very last segment of 0xf ?  It need not be pretty,
> but it cannot involve modifying the kernel source, though it can rely on
> whatever dirty tricks a kernel module might get away with.  I don't
> want to modify the source, because I would like the tool to work on 
> unmodified kernels.

Um... right.  You know, I'm really not sure it's possible without
changing the kernel source, short of binary patching the do_slb_bolted
code from a module.  Sorry.  The segment code's just really not set up
to handle this.

Though, come to that, you only need one segment, so it might not be
that hard to binary patch in a branch to some code of your own which
provides a VSID for that one segment.

> It's starting to sound like an impossible task (at least on non-recent 
> kernels).  I think I might go with a backup suboptimal solution, which 
> involves extra jumps, but at least it might work.

That may be a better idea.

David Gibson			| For every complex problem there is a
david AT	| solution which is simple, neat and
				| wrong.

More information about the Linuxppc64-dev mailing list