[PATCH v2] powerpc: warn on emulation of dcbz instruction in kernel mode

Christian Lamparter chunkeey at gmail.com
Sat Aug 24 09:43:21 AEST 2024


On 8/23/24 9:19 PM, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Aug 23, 2024 at 03:54:59PM +0200, Christoph Hellwig wrote:
>> On Fri, Aug 23, 2024 at 08:06:00AM -0500, Segher Boessenkool wrote:
>>> What does "uncached memory" even mean here?  Literally it would be
>>> I=1 memory (uncachEABLE memory), but more likely you want M=0 memory
>>> here ("non-memory memory", "not well-behaved memory", MMIO often).
>>
>> Regular kernel memory vmapped with pgprot_noncached().
> 
> So, I=1 (and G=1).  Caching inhibited and guarded.  But M=1 (memory
> coherence required) as with any other real memory :-)
> 
>>> If memset() is expected to be used with M=0, you cannot do any serious
>>> optimisations to it at all.  If memset() is expected to be used with I=1
>>> it should use a separate code path for it, probably the caller should
>>> make the distinction.
>>
>> DMA coherent memory which uses uncached memory for platforms that
>> do not provide hardware dma coherence can end up just about anywhere
>> in the kernel.  We could use special routines for a few places in
>> the DMA subsystem, but there might be plenty of others.
> 
> Yeah.  It will just be plenty slow, as we see here, that's what the
> warning is for; but it works just fine :-)
> 
> The memset() code itself could check for the storage attributes, but
> that is probably more expensive than just assuming the happy case.
> Maybe someone could try it out though!

Hmm, Ok! For what it's worth, I can at least test memset with dcbz+trap
against what it was in 2015, without dcbz in the code path. How about that?

I figured that, out of all the offenders (ethernet, crypto and sata), the
sata/hard drive would be the most sensitive device for measuring any
performance difference. The MyBook Live already had a hard drive
(Seagate ST380815AS (very old)) installed... so I went with that.

I tested with OpenWrt, since it has fully working PowerPC images for
the device, I can boot from initramfs (so the HDD/SSD is idle), and it
provides a very bare-minimum "benchmark": hdparm -t.
(hdparm -t ... just reads for three seconds and tells you how much it read.)

the unmodified 6.6.47 kernel scored:

| Timing buffered disk reads: 220 MB in  3.02 seconds =  72.93 MB/sec
| Timing buffered disk reads: 222 MB in  3.02 seconds =  73.50 MB/sec
| Timing buffered disk reads: 216 MB in  3.00 seconds =  71.94 MB/sec

From what I can tell, each hdparm -t /dev/sda causes ~77000 fix_alignment traps.
(/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size says it's 32 and
type is obviously "Data". If I'm not mistaken, this means ~2400 KiB zeroed by
dcbz emulated in the trap handler per run.)
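
The estimate in the parenthesis above can be reproduced with a few lines of
Python. This is only a sketch: it assumes that every fix_alignment trap
corresponds to exactly one emulated dcbz and that each dcbz zeroes one full
32-byte cache line. The function and constant names are made up for
illustration.

```python
def emulated_dcbz_bytes(trap_count: int, line_size: int) -> int:
    """Bytes zeroed by emulation, assuming one full cache line per trap."""
    return trap_count * line_size


# Figures quoted above: ~77000 traps per hdparm run, 32-byte cache lines.
TRAPS_PER_RUN = 77_000
CACHE_LINE_BYTES = 32

total = emulated_dcbz_bytes(TRAPS_PER_RUN, CACHE_LINE_BYTES)
print(f"{total} bytes = {total / 1024:.0f} KiB per run")
```

So only a couple of MiB per run go through the emulation path, against a few
hundred MB read by hdparm in the same three seconds.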

For the test, I added the "old" memset from
<https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/lib/copy_32.S?id=df087e450d7ddc0b15bd8824206d964720b4f5e4#n120>
and replaced 6.6.47's memset in dma_pool_alloc() with it
<https://elixir.bootlin.com/linux/v6.6.47/source/mm/dmapool.c#L435>

now no WARNINGS are triggered and hdparm -t /dev/sda produces:

| Timing buffered disk reads: 220 MB in  3.00 seconds =  73.32 MB/sec
| Timing buffered disk reads: 218 MB in  3.02 seconds =  72.28 MB/sec
| Timing buffered disk reads: 224 MB in  3.03 seconds =  74.02 MB/sec

Virtually no benefit?! Well, the HDD could be too slow. Let's try an old SSD:
a Samsung 840 Evo 120 GB. This one manages to read 1276 MB in 3.06 seconds
(~416 MB/sec) in the same hdparm -t test on a reasonably modern PC when
connected via a usb3<->sata adapter.

unmodified 6.6.47 kernel:

| Timing buffered disk reads: 356 MB in  3.00 seconds = 118.61 MB/sec
| Timing buffered disk reads: 358 MB in  3.01 seconds = 119.12 MB/sec
| Timing buffered disk reads: 358 MB in  3.01 seconds = 119.03 MB/sec

modified 6.6.47 kernel:

| Timing buffered disk reads: 380 MB in  3.01 seconds = 126.30 MB/sec
| Timing buffered disk reads: 374 MB in  3.00 seconds = 124.61 MB/sec
| Timing buffered disk reads: 382 MB in  3.02 seconds = 126.62 MB/sec

Ok! There's something there: ~6% going by the averages of the three runs.
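
For anyone who wants to redo the comparison, here is a minimal sketch that
averages the MB/sec figures from the hdparm -t lines quoted above and prints
the relative change. The helper names (mean_rate, gain_percent) are
hypothetical, and with only three runs per configuration the result is a
rough indication, not a statistically solid number. On the data above it
works out to about 5.8% for the SSD and about 0.6% for the HDD.

```python
import re
from statistics import mean

# Matches the throughput at the end of an "hdparm -t" result line.
RATE_RE = re.compile(r"=\s*([\d.]+)\s*MB/sec")


def mean_rate(output: str) -> float:
    """Average MB/sec over all hdparm -t result lines found in `output`."""
    return mean(float(m.group(1)) for m in RATE_RE.finditer(output))


def gain_percent(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


HDD_UNMODIFIED = """
Timing buffered disk reads: 220 MB in  3.02 seconds =  72.93 MB/sec
Timing buffered disk reads: 222 MB in  3.02 seconds =  73.50 MB/sec
Timing buffered disk reads: 216 MB in  3.00 seconds =  71.94 MB/sec
"""
HDD_MODIFIED = """
Timing buffered disk reads: 220 MB in  3.00 seconds =  73.32 MB/sec
Timing buffered disk reads: 218 MB in  3.02 seconds =  72.28 MB/sec
Timing buffered disk reads: 224 MB in  3.03 seconds =  74.02 MB/sec
"""
SSD_UNMODIFIED = """
Timing buffered disk reads: 356 MB in  3.00 seconds = 118.61 MB/sec
Timing buffered disk reads: 358 MB in  3.01 seconds = 119.12 MB/sec
Timing buffered disk reads: 358 MB in  3.01 seconds = 119.03 MB/sec
"""
SSD_MODIFIED = """
Timing buffered disk reads: 380 MB in  3.01 seconds = 126.30 MB/sec
Timing buffered disk reads: 374 MB in  3.00 seconds = 124.61 MB/sec
Timing buffered disk reads: 382 MB in  3.02 seconds = 126.62 MB/sec
"""

hdd_gain = gain_percent(mean_rate(HDD_UNMODIFIED), mean_rate(HDD_MODIFIED))
ssd_gain = gain_percent(mean_rate(SSD_UNMODIFIED), mean_rate(SSD_MODIFIED))
print(f"HDD: {hdd_gain:+.1f}%  SSD: {ssd_gain:+.1f}%")
```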

Cheers,
Christian

