<br><font size=2><tt>openib-general-bounces@openib.org wrote on 05/09/2006 09:49:19 AM:<br>

<br>

&gt; Quoting r. Roland Dreier &lt;rdreier@cisco.com&gt;:<br>

&gt; &gt; The trivial way to do it would be to use the same idea as the current<br>
&gt; &gt; ehca driver: just create a thread for receive CQ events and a thread<br>
&gt; &gt; for send CQ events, and defer CQ polling into those two threads.<br>

</tt></font>

<br><font size=2><tt>I have written a patch like that on top of splitting
the CQ. The problem I found is that the hardware interrupt favors one CPU:
according to my debug output, the two threads run on the same CPU most of
the time. You can check this easily with cat /proc/interrupts and
cat /proc/irq/XXX/smp_affinity. ehca distributes its interrupts evenly
across CPUs on SMP, so it gets the benefit of the two threads and achieves
much better throughput.</tt></font>
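<br><font size=2><tt>To make the check above concrete, here is a minimal
sketch that reads text in the /proc/interrupts layout and reports what share
of an IRQ each CPU handled, and decodes an smp_affinity hex mask into CPU
numbers. The function names, the IRQ number 58 and the sample counts are all
made up for illustration; they are not taken from my debug output.</tt></font>

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq_label: [count per CPU]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if len(counts) == ncpus:           # skip rows without per-CPU counts
            table[label.strip()] = counts
    return table


def cpu_shares(counts):
    """Fraction of an IRQ's interrupts handled by each CPU."""
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def affinity_cpus(mask_text):
    """Decode an smp_affinity hex mask (e.g. '00000003') into CPU numbers."""
    value = int(mask_text.strip().replace(",", ""), 16)
    bits = bin(value)[2:][::-1]            # least significant bit first = CPU0
    return [cpu for cpu, bit in enumerate(bits) if bit == "1"]


# Hypothetical sample, not measured data: IRQ 58 lands almost entirely on CPU0.
SAMPLE = """\
           CPU0       CPU1
 58:     990000      10000   IO-APIC-level  ib_mthca
NMI:          0          0
"""

if __name__ == "__main__":
    table = parse_interrupts(SAMPLE)
    print(cpu_shares(table["58"]))         # per-CPU share for the sample IRQ
    print(affinity_cpus("00000003"))       # CPUs allowed by the mask
```

<br><font size=2><tt>On a real system you would pass in the contents of
/proc/interrupts and /proc/irq/XXX/smp_affinity instead of the sample
strings. A share close to 1.0 on one CPU, even though the mask allows
several, is the pattern I described above.</tt></font>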

<br>

<br><font size=2><tt>The interesting thing is that, with this approach on
mthca, the UP results are much better than the SMP results.</tt></font>

<br><font size=2><tt><br>

&gt; For RX, isn't this basically what NAPI is doing?<br>

&gt; Only NAPI seems better, avoiding interrupts completely and avoiding<br>
&gt; latency hit by only getting triggered on high load ...<br>

&gt; <br>

&gt; -- <br>

&gt; MST<br>

</tt></font>

<br><font size=2><tt>According to results from several different sources,
NAPI gives only a 3%-10% performance improvement on a single CQ.</tt></font>

<br><font size=2><tt>I am now trying a simple NAPI patch on top of splitting
the CQ to see how much performance it gains there.</tt></font>

<br>

<br><font size=2><tt>Thanks</tt></font>

<br><font size=2 face="sans-serif">Shirley Ma<br>

IBM Linux Technology Center<br>

15300 SW Koll Parkway<br>

Beaverton, OR 97006-6063<br>

Phone(Fax): (503) 578-7638</font>