[PATCH 0/4] 8xx: Optimize TLB Miss code.

Heiko Schocher hs at denx.de
Mon Mar 8 18:46:29 EST 2010


Hello Joakim,

Joakim Tjernlund wrote:
[...]
> What would be interesting is to skip patch 3 and turn off
> MODULES add PIN_TLB and compare that against your unpatched .33 but
> with MODULES off and PIN_TLB on

run     version

1-4     Linux 2.6.33-rc without module support and PIN_TLB=on
5-8     Linux 2.6.33-rc without module support and PIN_TLB=on + patches 1,2,4

                 L M B E N C H  3 . 0   S U M M A R Y
                 ------------------------------------
		 (Alpha software, do not distribute)

Basic system parameters
------------------------------------------------------------------------------
Host                 OS Description              Mhz  tlb  cache  mem   scal
                                                     pages line   par   load
                                                           bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
tqm8xx    Linux 2.6.33-       powerpc-linux-gnu   66    28    16 1.0100    1
tqm8xx    Linux 2.6.33-       powerpc-linux-gnu   66    28    16 1.0400    1
tqm8xx    Linux 2.6.33-       powerpc-linux-gnu   66    28    16 1.0300    1
tqm8xx    Linux 2.6.33-       powerpc-linux-gnu   66    28    16 1.0100    1
tqm8xx    Linux 2.6.33-       powerpc-linux-gnu   66    28    16 1.0400    1
tqm8xx    Linux 2.6.33-       powerpc-linux-gnu   66    28    16 1.0400    1
tqm8xx    Linux 2.6.33-       powerpc-linux-gnu   66     7    16 1.0400    1
tqm8xx    Linux 2.6.33-       powerpc-linux-gnu   66    28    16 1.0100    1

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
tqm8xx    Linux 2.6.33-   66 2.97 8.91 127. 1238 270. 22.3 92.1 6386 27.K 83.K
tqm8xx    Linux 2.6.33-   66 3.05 8.99 129. 1208 261. 22.3 85.3 6418 27.K 83.K
tqm8xx    Linux 2.6.33-   66 3.05 8.81 128. 1205 270. 22.3 87.3 6342 27.K 82.K
tqm8xx    Linux 2.6.33-   66 3.05 8.82 132. 1215 270. 23.1 86.7 6357 27.K 82.K
tqm8xx    Linux 2.6.33-   66 3.28 9.29 128. 1257 260. 23.9 83.7 6511 28.K 84.K
tqm8xx    Linux 2.6.33-   66 3.34 9.35 126. 1264 271. 23.1 86.6 6437 27.K 84.K
tqm8xx    Linux 2.6.33-   66 3.19 8.97 130. 1212 271. 23.1 95.3 6480 27.K 84.K
tqm8xx    Linux 2.6.33-   66 3.28 8.76 127. 1229 269. 22.9 90.9 6293 27.K 82.K
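For anyone who wants to compare the two groups without eyeballing the rows, here is a minimal sketch that averages runs 1-4 against runs 5-8 for a single column. It assumes the rows appear in run order (first four unpatched, last four with patches 1,2,4), which the summary itself does not state. The helper name compare_runs is made up for illustration and is not part of lmbench. The values are copied from the "null call" and "fork proc" columns of the table above.

```python
from statistics import mean


def compare_runs(values, split=4):
    """Return (baseline mean, patched mean, change in percent).

    Assumption: the first `split` values are the unpatched runs and the
    remaining values are the patched runs, in the order lmbench listed them.
    """
    baseline = values[:split]
    patched = values[split:]
    baseline_mean = mean(baseline)
    patched_mean = mean(patched)
    change_pct = 100.0 * (patched_mean - baseline_mean) / baseline_mean
    return baseline_mean, patched_mean, change_pct


# Columns copied from the "Processor, Processes" table (microseconds).
null_call_us = [2.97, 3.05, 3.05, 3.05, 3.28, 3.34, 3.19, 3.28]
fork_proc_us = [6386, 6418, 6342, 6357, 6511, 6437, 6480, 6293]

if __name__ == "__main__":
    for name, column in (("null call", null_call_us),
                         ("fork proc", fork_proc_us)):
        base, patched, pct = compare_runs(column)
        print(f"{name}: baseline {base:.2f} us, "
              f"patched {patched:.2f} us, change {pct:+.1f}%")
```

With only four runs per group the means are rough, so small differences should be read with care.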

Basic integer operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------
Host                 OS  intgr intgr  intgr  intgr  intgr
                          bit   add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
tqm8xx    Linux 2.6.33-   15.2   17.9 1.2500  124.1  202.4
tqm8xx    Linux 2.6.33-   15.6   18.0 1.1900  124.1  196.4
tqm8xx    Linux 2.6.33-   15.2   17.9 1.2400  124.9  202.5
tqm8xx    Linux 2.6.33-   15.2   17.9 1.2400  124.2  196.8
tqm8xx    Linux 2.6.33-   15.7   17.9 1.5500  124.2  203.6
tqm8xx    Linux 2.6.33-   15.7   17.9 1.5500  124.2  202.1
tqm8xx    Linux 2.6.33-   15.7   17.9 1.5700  125.0  202.2
tqm8xx    Linux 2.6.33-   15.7   17.9 1.5500  121.1  196.4

Basic uint64 operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host                 OS int64  int64  int64  int64  int64
                         bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
tqm8xx    Linux 2.6.33-    15.          12.9 1944.1 1895.2
tqm8xx    Linux 2.6.33-    15.          12.9 1886.3 1894.4
tqm8xx    Linux 2.6.33-    15.          12.9 1944.1 1895.2
tqm8xx    Linux 2.6.33-    15.          12.9 1886.3 1894.8
tqm8xx    Linux 2.6.33-    15.          13.2 1944.1 1894.4
tqm8xx    Linux 2.6.33-    15.          13.2 1944.8 1896.3
tqm8xx    Linux 2.6.33-    15.          13.2 1945.2 1837.4
tqm8xx    Linux 2.6.33-    15.          13.2 1957.8 1907.4

Basic float operations - times in nanoseconds - smaller is better
-----------------------------------------------------------------
Host                 OS  float  float  float  float
                         add    mul    div    bogo
--------- ------------- ------ ------ ------ ------
tqm8xx    Linux 2.6.33- 1011.0 1620.2 5467.0 9868.0
tqm8xx    Linux 2.6.33- 1004.5 1630.1 5468.0 9852.0
tqm8xx    Linux 2.6.33- 1012.2 1620.5 5472.0 9855.0
tqm8xx    Linux 2.6.33- 1011.0 1620.2 5469.0 9866.0
tqm8xx    Linux 2.6.33- 1004.8 1617.3 5503.0 9856.0
tqm8xx    Linux 2.6.33- 1004.9 1577.1 5469.0 9859.0
tqm8xx    Linux 2.6.33- 1011.4 1618.5 5470.0 9859.0
tqm8xx    Linux 2.6.33- 1004.9 1620.5 5471.0 9904.0

Basic double operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host                 OS  double double double double
                         add    mul    div    bogo
--------- ------------- ------ ------ ------ ------
tqm8xx    Linux 2.6.33- 1555.5 2789.5 3725.7  12.8K
tqm8xx    Linux 2.6.33- 1513.2 2772.0 3720.0  12.7K
tqm8xx    Linux 2.6.33- 1555.8 2772.1 3730.0  12.7K
tqm8xx    Linux 2.6.33- 1555.5 2699.0 3725.0  12.7K
tqm8xx    Linux 2.6.33- 1513.8 2699.5 3610.7  12.7K
tqm8xx    Linux 2.6.33- 1566.7 2771.6 3750.0  12.7K
tqm8xx    Linux 2.6.33- 1556.7 2789.2 3612.1  12.6K
tqm8xx    Linux 2.6.33- 1556.7 2698.5 3749.3  12.6K

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                         ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
tqm8xx    Linux 2.6.33-   64.4   74.9  130.2  111.1  180.4   123.2   211.1
tqm8xx    Linux 2.6.33-   67.4   81.0  125.0  117.0  183.7   127.7   208.4
tqm8xx    Linux 2.6.33-   67.5   80.5   92.7  115.3  156.9   128.0   183.8
tqm8xx    Linux 2.6.33-   67.0   80.2   90.5  114.6  159.4   126.8   185.8
tqm8xx    Linux 2.6.33-   82.0   87.8   88.0  116.1  149.3   125.5   182.2
tqm8xx    Linux 2.6.33-   81.7   98.5   97.6  123.8  158.1   135.3   188.0
tqm8xx    Linux 2.6.33-   67.9   87.7   90.7  114.9  151.1   127.3   177.9
tqm8xx    Linux 2.6.33-   67.5   80.3   84.6  113.6  145.7   124.8   170.9

*Local* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
tqm8xx    Linux 2.6.33-  64.4 254.3 455. 648.0       941.8       2505
tqm8xx    Linux 2.6.33-  67.4 261.2 456. 645.8       909.1       2439
tqm8xx    Linux 2.6.33-  67.5 264.8 459. 638.5       932.0       2447
tqm8xx    Linux 2.6.33-  67.0 262.4 454. 643.9       909.9       2442
tqm8xx    Linux 2.6.33-  82.0 302.1 500. 651.4       937.2       2504
tqm8xx    Linux 2.6.33-  81.7 300.2 510. 643.2       909.7       2490
tqm8xx    Linux 2.6.33-  67.9 266.7 498. 645.5       923.4       2442
tqm8xx    Linux 2.6.33-  67.5 260.8 444. 640.3       917.7       2440

*Remote* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host                 OS   UDP  RPC/  TCP   RPC/ TCP
                               UDP         TCP  conn
--------- ------------- ----- ----- ----- ----- ----
tqm8xx    Linux 2.6.33-
tqm8xx    Linux 2.6.33-
tqm8xx    Linux 2.6.33-
tqm8xx    Linux 2.6.33-
tqm8xx    Linux 2.6.33-
tqm8xx    Linux 2.6.33-
tqm8xx    Linux 2.6.33-
tqm8xx    Linux 2.6.33-

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host                 OS   0K File      10K File     Mmap    Prot   Page   100fd
                        Create Delete Create Delete Latency Fault  Fault  selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
tqm8xx    Linux 2.6.33- 6097.6 3731.3  30.3K 4000.0  4026.0  20.5    31.9 131.9
tqm8xx    Linux 2.6.33- 5747.1 3623.2  32.3K 3952.6  4030.0  16.6    31.0 132.7
tqm8xx    Linux 2.6.33- 5405.4 3610.1  32.3K 3921.6  4004.0  15.5    30.0 131.9
tqm8xx    Linux 2.6.33- 5681.8 3891.1  35.7K 4219.4  3966.0 6.038    30.4 128.7
tqm8xx    Linux 2.6.33-  12.7K 3649.6  34.5K 7092.2  4066.0 3.604    31.4 133.6
tqm8xx    Linux 2.6.33- 5405.4 4032.3  38.5K 5494.5  4036.0  18.1    31.0 128.6
tqm8xx    Linux 2.6.33- 5405.4 3610.1  37.0K 7142.9  4078.0  15.4    31.0 133.2
tqm8xx    Linux 2.6.33- 5714.3 3623.2  30.3K 7194.2  4054.0  12.7    29.9 133.0

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------------------------
Host                OS  Pipe AF    TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
tqm8xx    Linux 2.6.33- 14.9 16.1 13.0   21.4   55.6   32.4   34.5 55.7  53.0
tqm8xx    Linux 2.6.33- 14.9 16.2 12.9   21.3   55.5   32.4   34.5 55.7  53.0
tqm8xx    Linux 2.6.33- 14.8 16.0 13.0   21.4   55.6   32.4   34.5 55.7  53.0
tqm8xx    Linux 2.6.33- 15.0 16.2 13.8   21.3   55.6   32.4   34.5 55.7  53.0
tqm8xx    Linux 2.6.33- 14.9 16.0 13.4   21.3   55.7   32.5   34.6 55.8  53.2
tqm8xx    Linux 2.6.33- 15.1 16.2 13.6   21.3   55.7   32.5   34.6 55.8  53.2
tqm8xx    Linux 2.6.33- 15.0 16.2 12.9   21.3   55.7   32.5   34.6 55.8  53.2
tqm8xx    Linux 2.6.33- 15.1 16.2 13.1   21.5   55.7   32.5   34.7 55.8  53.2

Memory latencies in nanoseconds - smaller is better
    (WARNING - may not be correct, check graphs)
------------------------------------------------------------------------------
Host                 OS   Mhz   L1 $   L2 $    Main mem    Rand mem    Guesses
--------- -------------   ---   ----   ----    --------    --------    -------
tqm8xx    Linux 2.6.33-    66   31.7  183.2       184.0      1163.0    No L2 cache?
tqm8xx    Linux 2.6.33-    66   31.7  183.2       184.0      1164.8    No L2 cache?
tqm8xx    Linux 2.6.33-    66   31.7  183.2       184.0      1163.2    No L2 cache?
tqm8xx    Linux 2.6.33-    66   31.7  183.2       183.8      1163.7    No L2 cache?
tqm8xx    Linux 2.6.33-    66   31.8  172.4       173.2      1147.3    No L2 cache?
tqm8xx    Linux 2.6.33-    66   31.8  172.5       173.2      1148.3    No L2 cache?
tqm8xx    Linux 2.6.33-    66   31.8  172.5       173.1      1146.9    No L2 cache?
tqm8xx    Linux 2.6.33-    66   31.8  172.5       173.2      1147.3    No L2 cache?

make[1]: Leaving directory `/home/hs/lmbench-3.0-a9/results'
-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany

