[v3 0/9] parallelized "struct page" zeroing

David Miller davem at davemloft.net
Thu May 11 04:00:26 AEST 2017


From: Matthew Wilcox <willy at infradead.org>
Date: Wed, 10 May 2017 10:17:03 -0700

> On Wed, May 10, 2017 at 11:19:43AM -0400, David Miller wrote:
>> From: Michal Hocko <mhocko at kernel.org>
>> Date: Wed, 10 May 2017 16:57:26 +0200
>> 
>> > Have you measured that? I do not think it would be super hard to
>> > measure. I would be quite surprised if this added much if anything at
>> > all as the whole struct page should be in the cache line already. We do
>> > set reference count and other struct members. Almost nobody should be
>> > looking at our page at this time and stealing the cache line. On the
>> > other hand a large memcpy will basically wipe everything away from the
>> > cpu cache. Or am I missing something?
>> 
>> I guess it might be clearer if you understand what the block
>> initializing stores do on sparc64.  There are no memory accesses at
>> all.
>> 
>> The cpu just zeros out the cache line, that's it.
>> 
>> No L3 cache line is allocated.  So this "wipe everything" behavior
>> will not happen in the L3.
> 
> There's either something wrong with your explanation or my reading
> skills :-)
> 
> "There are no memory accesses"
> "No L3 cache line is allocated"
> 
> You can have one or the other ... either the CPU sends a cacheline-sized
> write of zeroes to memory without allocating an L3 cache line (maybe
> using the store buffer?), or the CPU allocates an L3 cache line and sets
> its contents to zeroes, probably putting it in the last way of the set
> so it's the first thing to be evicted if not touched.

There is no conflict in what I said.

Only an L2 cache line is allocated and cleared, and no memory access
is needed to do that.  L3 is left alone.
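
If it helps, here is a toy model of the difference in C.  It is only a
sketch, not how the sparc64 hardware is implemented: the struct and
function names, the 64-byte line size, and the assumption that an
ordinary store miss fills both L3 and L2 from memory are all made up
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LINE_SIZE 64	/* assumed line size, for illustration only */

/* One cache line in the toy model. */
struct toy_line {
	bool valid;
	unsigned char data[LINE_SIZE];
};

/* A single line's worth of a two-level hierarchy (hypothetical). */
struct toy_hier {
	struct toy_line l2;
	struct toy_line l3;
	unsigned int mem_reads;	/* line fills fetched from memory */
};

/*
 * Ordinary store that misses: in this toy model the line is read from
 * memory and allocated in L3 and L2 before the byte is modified.
 */
static void toy_normal_store(struct toy_hier *h, const unsigned char *mem,
			     unsigned int off, unsigned char val)
{
	assert(off < LINE_SIZE);
	if (!h->l2.valid) {
		h->mem_reads++;
		memcpy(h->l3.data, mem, LINE_SIZE);
		h->l3.valid = true;
		h->l2 = h->l3;
	}
	h->l2.data[off] = val;
}

/*
 * Block-initializing store as described above: an L2 line is allocated
 * and cleared, nothing is read from memory, and L3 is not touched.
 */
static void toy_block_init_store(struct toy_hier *h)
{
	memset(h->l2.data, 0, LINE_SIZE);
	h->l2.valid = true;
}
```

In this model toy_block_init_store() leaves mem_reads at zero and
l3.valid false, whereas toy_normal_store() on a cold line bumps
mem_reads and marks the L3 line valid.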

