[PATCH 3/6] 8xx: invalidate non present TLBs
Joakim Tjernlund
joakim.tjernlund at transmode.se
Fri Oct 9 10:01:34 EST 2009
Benjamin Herrenschmidt <benh at kernel.crashing.org> wrote on 09/10/2009 00:23:48:
>
> On Thu, 2009-10-08 at 15:08 -0700, Dan Malek wrote:
> > Hi Ben.
> >
> > On Oct 8, 2009, at 1:28 PM, Benjamin Herrenschmidt wrote:
> >
> > > While you are around ... I have a question :-)
> >
> > I'll try. Many brain cells have been replaced or lost
> > over the years :-)
>
> Replaced? Lucky you! I only lose mine :-)
>
> > I thought we did a tlbie() (or whatever the equivalent is today)
> > when the PTE was updated in the table. An optimization was to
> > load the TLB with the entry at that time to avoid a subsequent miss.
> > In any case, the TLB entry has to be modified by the software.
>
> Ok, that's my understanding too and I think we had the tlbie in
> update_mmu_cache to do the trick, though the comment is misleading:
> it suggests that the only reason it's there is for the dcbst
> problem. At least that's my understanding. That was lost recently in 2.6
> so I'll have to put it back properly.
So you don't think my "invalidate only !present pages" patch in
do_page_fault is enough?
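
To make the idea concrete, here is a rough user-space toy model, not the
actual patch and not kernel code. It assumes a stale TLB entry can exist for
a page whose PTE is not (yet) present, and that the fault path drops it with
a tlbie-like operation only in that case. Every toy_* name, the table sizes
and the direct-mapped TLB layout are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12   /* 4 KiB pages, assumed */
#define TOY_TLB_SIZE   32   /* arbitrary size for the model */
#define TOY_NPAGES     64   /* arbitrary size for the model */

struct toy_pte       { bool present; };
struct toy_tlb_entry { bool valid; uint32_t vpn; };

/* Static storage, so everything starts out !present / !valid. */
static struct toy_pte       toy_page_table[TOY_NPAGES];
static struct toy_tlb_entry toy_tlb[TOY_TLB_SIZE];

/* Hypothetical stand-in for tlbie: drop any entry matching addr. */
static void toy_tlbie(uint32_t addr)
{
    uint32_t vpn = addr >> TOY_PAGE_SHIFT;

    for (int i = 0; i < TOY_TLB_SIZE; i++)
        if (toy_tlb[i].valid && toy_tlb[i].vpn == vpn)
            toy_tlb[i].valid = false;
}

/* Model of a TLB miss handler that loads an entry without checking
 * whether the PTE is present (an assumption of this sketch). */
static void toy_tlb_load(uint32_t addr)
{
    uint32_t vpn = addr >> TOY_PAGE_SHIFT;

    toy_tlb[vpn % TOY_TLB_SIZE] = (struct toy_tlb_entry){ true, vpn };
}

/* True if the model TLB currently holds an entry for addr. */
static bool toy_tlb_has(uint32_t addr)
{
    uint32_t vpn = addr >> TOY_PAGE_SHIFT;
    struct toy_tlb_entry *e = &toy_tlb[vpn % TOY_TLB_SIZE];

    return e->valid && e->vpn == vpn;
}

/* Fault path: invalidate only when the PTE is not present.
 * Returns true if an invalidate was issued. */
static bool toy_fault_invalidate_if_not_present(uint32_t addr)
{
    uint32_t vpn = addr >> TOY_PAGE_SHIFT;

    assert(vpn < TOY_NPAGES);
    if (!toy_page_table[vpn].present) {
        toy_tlbie(addr);
        return true;
    }
    return false;
}
```

In this model a fault on a present page leaves its TLB entry alone, so the
invalidate cost is only paid on the !present path.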
>
> I don't think we do the pre-load to avoid the second fault, but we
> certainly could and should.
>
> > I don't remember how C was used in the past, but I suspect
> > it just mirrored the Linux VM state. I'm quite certain the MMU
> > HW would never change a TLB entry. Where did you read this?
>
> MPC855UM, chapter 8.6 "Memory attributes":
>
> <<
> • Reference and change bit updates—The MPC850 does not generate an exception for
> an R (reference) bit update. In fact, there is no entry for an R bit in the TLB.
> The change bit (C) is bit 23 in the level-two descriptor, described in Table 8-4.
> Software updates C (changed) bits, but hardware treats the C bit (negated) as a
> write-protect attribute. Therefore, attempting to write to a page marked unmodified
> invalidates that entry and causes an implementation-specific DTLB error exception.
> ^^^^^^^^^^^^^^^^^^^^^^
> If change bits are not needed, set the C bit to one by default in the PTEs.
> >>
>
> And yes, it's weird and that's the only place I think where this is
> mentioned which makes me think it could well be a doco bug.
>
> > For most of the 8xx "early days," I used to just mark all write
> > pages as dirty. For some reason I just overloaded the write/changed
> > into one bit, it avoided TLB Error overhead and I think even some
> > silicon bugs. Since they were tiny and fairly static embedded
> > systems, it didn't have any effect on the way VM was managed.
>
> Well, nowadays at least, most of the time, a writeable page is also
> dirty unless it's a writeable shared mapping, and in that latter case
> you really want to do proper dirty tracking. So I suspect we can
> simplify some of that and let the generic code handle dirty by mapping
> _PAGE_DIRTY to C and removing _PAGE_HWWRITE. We can also remove all
> of the asm munging from DataTLBError, and let the generic C code fixup
> DIRTY and ACCESSED when needed, since those should only rarely need a
> fixup.
>
> > The MMU HW on the 8xx is just a translator. I'm now really
> > certain it won't ever change a TLB entry. It's completely up to
> > software to make all TLB changes.
>
> That makes sense.
>
> > Just make it simple :-)
>
> Yeah. I think we can simplify the current code a lot, which will speed
> up TLB misses (well, nothing much you can do about the infamous errata
> #6 but that's another story). It won't give 2.6 back the 2.4 speed due
> to sheer bloat of the generic code but it might at least offset some of
> the loss by improving the overall TLB miss performance.
It won't get much faster than my current patch. Trapping all DTLB
Errors to C won't make it faster, only more correct, should there be
a bug in the asm version. Actually, there is one that has been there
all along: the guarded flag is not set by DTLB Error.
Jocke
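
For readers following along, the "C bit acts as write-protect, software
fixes up DIRTY/ACCESSED" scheme discussed above might look roughly like
this as a toy model. It is a sketch under assumptions, not the 8xx code:
the TOY_* flag values and function names are invented, and TOY_PAGE_DIRTY
simply stands in for the hardware C bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PTE flag layout, for illustration only. */
#define TOY_PAGE_PRESENT  0x1u
#define TOY_PAGE_RW       0x2u
#define TOY_PAGE_ACCESSED 0x4u
#define TOY_PAGE_DIRTY    0x8u   /* stands in for the C (changed) bit */

enum toy_result { TOY_OK, TOY_DTLB_ERROR };

/* Model of the hardware side: a store to a page whose C bit is clear
 * is treated as a write to a write-protected page and faults. */
static enum toy_result toy_hw_store(uint32_t pte)
{
    if (!(pte & TOY_PAGE_PRESENT))
        return TOY_DTLB_ERROR;
    if (!(pte & TOY_PAGE_DIRTY))
        return TOY_DTLB_ERROR;
    return TOY_OK;
}

/* Model of the software side, done in C rather than asm: if the page is
 * present and writable, mark it dirty and accessed so the retried store
 * succeeds. Returns false for a genuine protection fault. */
static bool toy_fixup_store_fault(uint32_t *pte)
{
    assert(pte != NULL);
    if (!(*pte & TOY_PAGE_PRESENT) || !(*pte & TOY_PAGE_RW))
        return false;
    *pte |= TOY_PAGE_DIRTY | TOY_PAGE_ACCESSED;
    return true;
}
```

The first store to a clean writable page faults once, the fixup sets the
bits, and later stores go straight through. A read-only page is left
untouched and the fault is reported as real.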
>
> Now, I don't have any 8xx gear, so it will be up to Joakim, Scott etc...
> to get that stuff right :-)
Waiting for Rex and Scott to comment/test.