mapping memory in 0xb space
Igor Grobman
igor at cs.wisc.edu
Wed Sep 29 15:14:08 EST 2004
On Wed, 29 Sep 2004, David Gibson wrote:
> On Tue, Sep 28, 2004 at 01:52:16PM -0500, Igor Grobman wrote:
> > On Tue, 28 Sep 2004, David Gibson wrote:
> >
> > > Recent kernels don't even
> > > have VSIDs allocated for the 0xb... region.
> >
> > Looking at both 2.6.8 and 2.4.21, I don't see a difference in
> > get_kernel_vsid() code.
>
> Ok, *very* recent kernels. The new VSID algorithm has gone into the
> BK tree since 2.6.8.
From the description I read, I might be better off using 0xfff.. addresses
with that algorithm. Not a big deal.
>
> > This leaves segments. Both
> > DataAccess_common and DataAccessSLB_common call
> > do_stab_bolted/do_slb_bolted when confronted with an address in 0xb
> > region.
>
> Oh, so it does. That, I think is a 2.4 thing, long gone in 2.6 (even
> before the SLB rewrite, I'm pretty sure do_slb_bolted was only called
> for 0xc addresses).
In my 2.4.21 source, do_slb_bolted does get called for 0xb addresses.
And thanks for letting me know that power4 uses the SLB. I was clueless on
the issue.
>
> > Presumably, this will fault in the segments I am interested in.
>
> Yes, actually, it should. Ok, I guess the problem is deeper than I
> thought.
Or is it?
>
>
> > Also, I narrowed it down to
> > working (or appearing to work) as long as the highest 5 bits of the page
> > index (those that end up as partial index in the HPTE) are zero. This may
> > just be a weird coincidence.
>
> Could be.
>
>
> > > Why on earth do you want to do this?
> >
> > Good question ;-). A long long time ago, I posted on this list and
> > explained. Since then, I found what appeared to be a solution, except
> > that it appears power4 breaks it. I am building a tool that allows
> > dynamic splicing of code into a running kernel (see
> > http://www.paradyn.org/html/kerninst.html). In order for this to work, I
> > need to be able to overwrite a single instruction with a jump to
> > spliced-in code. The target of the jump needs to be within the range (26
> > bits). Therefore, I have a choice of 0xbfff.. addresses with backward
> > jumps from 0xc region, or the 0xff.. addresses for absolute jumps. I
> > chose 0xbff.., because I found already-working code, originally written
> > for the performance counter interface. Am I making more sense now?
>
> Aha! But this does actually explain the problem - there are only
> VSIDs assigned for the first 2^41 bytes of each region - so although
> there are vsids for 0xb000000000000000-0xb00001ffffffffff, there
> aren't any for 0xbff... addresses. Likewise the Linux pagetables only
> cover a 41-bit address range, but that won't matter if you're creating
> HPTEs directly.
And this is why I avoided explaining fully in my first email :-). I'd
like to solve one problem at a time. What I said in my initial email
is accurate. Even within the valid VSID range, if the highest 5 bits of
the page index are not zero, I get a crash on access (e.g.
0xb00001FFFFF00000 crashes, but 0xb00001FFF0000000 works).
As for why I thought 0xbff would work, I reasoned that
since the highest bits are masked out in get_kernel_vsid(), and since
nobody else is using the 0xb region, it doesn't matter if I get a VSID
that is the same as some other VSID in 0xb region. However, I did not
consider the bug in do_slb_bolted that you describe below.
>
> You may have seen the comment in do_slb_bolted which claims to permit
> a full 32-bits of ESID - it's wrong. The code doesn't mask the ESID
> down to 13 bits as get_kernel_vsid() does, but it probably should - an
> overlarge ESID will cause collisions with VSIDs from entirely
> different address ranges, which would be a Bad Thing.
This must be happening, although I would still like to know why it
misbehaves even within the valid VSID range.
>
> Actually, you should be able to allow ESIDs of up to 21 bits there (36
> bit VSID - 15 bits of "context"). But you will need to make sure
> get_kernel_vsid(), or whatever you're using to calculate the VAs for
> the hash HPTEs is updated to match - at the moment I think it will
> mask down to 13 bits. I'm not sure if that will get you sufficiently
> close to 0xc0... for your purposes.
No, it's not close enough--I really must have that very last segment.
It sounds like I was simply getting lucky on the power3 machine.
Without the mask, I must have been getting random pages, and
happily overwriting them.
Any ideas on how I might map that very last segment of 0xb, or for
that matter the very last segment of 0xf? It need not be pretty,
but it cannot involve modifying the kernel source, though it can rely on
whatever dirty tricks a kernel module might get away with. I don't
want to modify the source, because I would like the tool to work on
unmodified kernels.
It's starting to sound like an impossible task (at least on non-recent
kernels). I think I might go with a backup suboptimal solution, which
involves extra jumps, but at least it might work.
Thanks again,
Igor
>
> > > On Mon, Sep 27, 2004 at 02:47:15PM -0500, Igor Grobman wrote:
> > > > I would like to be able to remap memory into 0xb space with the
> > > > 2.4.21 kernel. I have code that works fine on a power3 box, but
> > > > fails miserably on my dual power4+ box. I am using btmalloc() from
> > > > pmc.c with a modified range. Normal btmalloc() allocation works
> > > > fine, but if I change the range to start with, e.g.
> > > > 0xb00001FFFFF00000 (instead of 0xb00..), the kernel crashes when
> > > > accessing the allocated page.
> > > >
> > > > For those of you unfamiliar with the btmalloc() code, it finds a
> > > > physical page using get_free_pages(), subtracts PAGE_OFFSET (i.e.
> > > > 0xc00...) to form a physical address, then inserts the page into
> > > > the linux page tables, using the VSID calculated with
> > > > get_kernel_vsid() and the physical address calculated above. It
> > > > also inserts an HPTE into the hardware page table. It doesn't do
> > > > anything with regards to allocating segments. I understand the
> > > > segments should be faulted in. I looked at the code in head.S, and
> > > > it appears that do_stab_bolted should be doing this. Yet, I am
> > > > missing something, because the btmalloc() code does not in fact
> > > > work for pages in the range I specified above.
> > > >
> > > > I am running this on a 7029-6E3 (p615?) machine with two power4+
> > > > processors. I am using the kernel from Suse's 2.4.21-215 source
> > > > package.
> > > >
> > > > Any ideas and attempts to un-confuse me are welcome.
> >
>
>
More information about the Linuxppc64-dev mailing list