v2.6 performance slowdown on MPC8xx: Measuring TLB cache misses
Marcelo Tosatti
marcelo.tosatti at cyclades.com
Fri Apr 22 04:32:39 EST 2005
Hi everyone,
I found out that the previous TLB counter numbers were wrong, two
of the values were switched!
CPU is a 48MHz 855T with 32 TLB entries, and 128MB of RAM.
Now I've got valid results. With an idle machine, these are the results
of a /proc/tlbmiss capture session with 1 second interval. Note that
idle actually means about 4 or 5 processes (AcsWeb, cy_pmd, cy_alarm, cy_wdt,
the kernel's keventd) running and switching over, but the CPU is about 96-97%
idle.
As you can see, the rate at which TLB misses happen in v2.6 is
significantly higher, for both the I-TLB and the D-TLB, even with an almost
idle machine.
The v2.6 kernel has grown in size relative to TLB usage (cache footprint),
which is, I am starting to believe, the major cause of this issue. If that
is the case, other platforms will also suffer.
As one example, the number of page addresses which the "sys_read()"
system call needs to fetch into the I-cache in order to execute the task
(the call tree) is about twice as large as in v2.4.
Pantelis Antoniou reported that the 64 TLB-entry versions of MPC8xx
processors do not suffer such a significant performance slowdown.
One point to keep in mind when reading these numbers is that v2.6 counts
twice the misses which result in pte creation (DataTLBMiss->DataTLBError),
but I hope to change that for better precision. In this specific
case I guess it should not be significant, given that no processes are
being created; mostly already-mapped (periodic) routines are running.
I hope that capturing the TLB miss difference between v2.4 and v2.6
on a simple CPU-intensive benchmark such as the "dd" I've been using before,
and multiplying that by the translation cache miss penalty (20-23 clocks
on a miss versus 1 clock on a hit), should give us a good estimate
of the real cost of these misses.
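The estimate described above can be sketched roughly as follows. This is only an illustration, not a measurement tool: the function names are made up for this example, the 48MHz clock and the 20-23 versus 1 clock figures are taken from this mail, and subtracting the 1-clock hit cost from the miss cost is my own assumption about how to count the penalty.

```python
def extra_miss_cycles(misses_v26, misses_v24, miss_clocks=(20, 23), hit_clocks=1):
    """Return (low, high) extra clocks spent on the additional v2.6 TLB misses.

    Assumption: each extra miss costs (miss - hit) clocks more than a hit would.
    """
    extra_misses = misses_v26 - misses_v24
    low, high = miss_clocks
    return (extra_misses * (low - hit_clocks),
            extra_misses * (high - hit_clocks))


def fraction_of_cpu(cycles, interval_s=1.0, cpu_hz=48_000_000):
    """Fraction of the clocks available in the interval that `cycles` uses up."""
    return cycles / (cpu_hz * interval_s)


if __name__ == "__main__":
    # Hypothetical totals for one benchmark run on each kernel.
    low, high = extra_miss_cycles(misses_v26=15749, misses_v24=14581)
    print("extra clocks: %d to %d" % (low, high))
    print("share of CPU: %.4f%% to %.4f%%"
          % (100 * fraction_of_cpu(low), 100 * fraction_of_cpu(high)))
```

Both runs have to cover the same amount of work (or the same wall-clock interval) for the difference to mean anything, and the double counting of DataTLBMiss->DataTLBError in v2.6 would need to be corrected first.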
And I wonder, has nobody noticed this on other arches?
Comments are appreciated.
Capture session of /proc/tlbmiss with 1 second interval:
v2.6:                             v2.4:
I-TLB userspace misses: 2577      I-TLB userspace misses: 2192
I-TLB kernel misses: 1557         I-TLB kernel misses: 1328
D-TLB userspace misses: 7173      D-TLB userspace misses: 6801
D-TLB kernel misses: 4442         D-TLB kernel misses: 4260
*                                 *
I-TLB userspace misses: 5324      I-TLB userspace misses: 4557
I-TLB kernel misses: 3277         I-TLB kernel misses: 2821
D-TLB userspace misses: 14399     D-TLB userspace misses: 13816
D-TLB kernel misses: 9069         D-TLB kernel misses: 8734
*                                 *
I-TLB userspace misses: 8078      I-TLB userspace misses: 7003
I-TLB kernel misses: 4960         I-TLB kernel misses: 4360
D-TLB userspace misses: 22038     D-TLB userspace misses: 20952
D-TLB kernel misses: 13929        D-TLB kernel misses: 13299
*                                 *
I-TLB userspace misses: 10791     I-TLB userspace misses: 9404
I-TLB kernel misses: 6643         I-TLB kernel misses: 5874
D-TLB userspace misses: 29350     D-TLB userspace misses: 27963
D-TLB kernel misses: 18555        D-TLB kernel misses: 17768
*                                 *
I-TLB userspace misses: 13531     I-TLB userspace misses: 11801
I-TLB kernel misses: 8311         I-TLB kernel misses: 7390
D-TLB userspace misses: 36750     D-TLB userspace misses: 35123
D-TLB kernel misses: 23271        D-TLB kernel misses: 22416
*                                 *
I-TLB userspace misses: 16434     I-TLB userspace misses: 14229
I-TLB kernel misses: 10172        I-TLB kernel misses: 8925
D-TLB userspace misses: 51096     D-TLB userspace misses: 42241
D-TLB kernel misses: 34982        D-TLB kernel misses: 26995
*                                 *
I-TLB userspace misses: 19183     I-TLB userspace misses: 16646
I-TLB kernel misses: 11890        I-TLB kernel misses: 10445
D-TLB userspace misses: 58557     D-TLB userspace misses: 49291
D-TLB kernel misses: 39726        D-TLB kernel misses: 31479
*                                 *
I-TLB userspace misses: 21973     I-TLB userspace misses: 19125
I-TLB kernel misses: 13596        I-TLB kernel misses: 12011
D-TLB userspace misses: 65933     D-TLB userspace misses: 56376
D-TLB kernel misses: 44401        D-TLB kernel misses: 36025
*                                 *
I-TLB userspace misses: 24644     I-TLB userspace misses: 21509
I-TLB kernel misses: 15231        I-TLB kernel misses: 13526
D-TLB userspace misses: 73345     D-TLB userspace misses: 63431
D-TLB kernel misses: 49083        D-TLB kernel misses: 40567
*                                 *
I-TLB userspace misses: 27451     I-TLB userspace misses: 23894
I-TLB kernel misses: 16974        I-TLB kernel misses: 15031
D-TLB userspace misses: 80652     D-TLB userspace misses: 70467
D-TLB kernel misses: 53739        D-TLB kernel misses: 45089
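
The counters in the capture only ever grow from one sample to the next, which suggests they are running totals rather than per-second values (an assumption on my side, it is not stated above). Under that assumption, a small sketch like the one below turns consecutive samples into per-interval misses and compares the two kernels. The helper names are made up for this example, and only the first three I-TLB userspace samples are used.

```python
# I-TLB userspace miss counters, first three samples from the capture above.
itlb_user_v26 = [2577, 5324, 8078]
itlb_user_v24 = [2192, 4557, 7003]


def per_interval(samples):
    """Turn cumulative counter readings into per-interval increments."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def relative_increase(v26, v24):
    """Relative excess of v2.6 over v2.4 per interval (0.10 means 10% more)."""
    return [(a - b) / b for a, b in zip(per_interval(v26), per_interval(v24))]


if __name__ == "__main__":
    print("v2.6 misses per interval:", per_interval(itlb_user_v26))
    print("v2.4 misses per interval:", per_interval(itlb_user_v24))
    for ratio in relative_increase(itlb_user_v26, itlb_user_v24):
        print("v2.6 excess: %.1f%%" % (100 * ratio))
```

The same two helpers apply unchanged to the kernel and D-TLB counters.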