[patch 4/4] powerpc 2.6.21-rt1: reduce scheduling latency by changing tlb flush size

Tsutomu OWA tsutomu.owa at toshiba.co.jp
Tue May 15 18:08:33 EST 2007


At Tue, 15 May 2007 17:38:27 +1000, Benjamin Herrenschmidt wrote:

> > +#ifdef CONFIG_PREEMPT_RT
> > +	/*
> > +	 * Since flushing tlb needs expensive hypervisor call(s) on celleb,
> > +	 * always flush it on RT to reduce scheduling latency.
> > +	 */
> > +	if (machine_is(celleb)) {
> > +		flush_tlb_pending();
> > +		return;
> > +	}
> > +#endif /* CONFIG_PREEMPT_RT */

> Any reason to do that only on celleb ? :-)

  Well, at least it does not harm any other platforms :)

> Also, we might want to still batch, though in smaller amount. 

  Yeah, it's a good point.

>                                                                Have you measured
> the time it takes ? We might want to modulate the amount based on wether we
> are using native hash tables or an hypervisor.

  Yes, here is the trace log.  Accordint it, flushing  9 entries takes about 50us.
It means that flushing 192 (PPC64_TLB_BATCH_NR) entries takes 1ms.
Please note that tracing causes *roughly* 20-30% overhead, though.

  I'm afraid I don't have numbers w/ native version, but I suppose you have :)

# cat /proc/latency_trace
preemption latency trace v1.1.5 on 2.6.21-rt1
--------------------------------------------------------------------
 latency: 60 us, #53/53, CPU#0 | (M:rt VP:0, KP:0, SP:1 HP:1 #P:1)
    -----------------
    | task: inetd-358 (uid:0 nice:0 policy:0 rt_prio:0)
    -----------------
 => started at: .__switch_to+0xa8/0x178 <c00000000000feec>
 => ended at:   .__start+0x4000000000000000/0x8 <00000000>

                 _------=> CPU#            
                / _-----=> irqs-off        
               | / _----=> need-resched    
               || / _---=> hardirq/softirq 
               ||| / _--=> preempt-depth   
               |||| /                      
               |||||     delay             
   cmd     pid ||||| time  |   caller      
      \   /    |||||   \   |   /           
   inetd-358   0D..2    0us : .user_trace_start (.__switch_to)
   inetd-358   0D..2    0us : .rt_up (.user_trace_start)
   inetd-358   0D..3    1us : .rt_mutex_unlock (.rt_up)
   inetd-358   0D..3    2us : .__flush_tlb_pending (.__switch_to)
   inetd-358   0D..4    3us : .flush_hash_range (.__flush_tlb_pending)
   inetd-358   0D..4    3us : .flush_hash_page (.flush_hash_range)
   inetd-358   0D..4    4us : .beat_lpar_hpte_invalidate (.flush_hash_page)
   inetd-358   0D..4    5us : .__spin_lock_irqsave (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5    6us+: .beat_lpar_hpte_getword0 (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   13us : .__spin_unlock_irqrestore (.beat_lpar_hpte_invalidate)
   inetd-358   0D..4   13us : .flush_hash_page (.flush_hash_range)
   inetd-358   0D..4   14us : .beat_lpar_hpte_invalidate (.flush_hash_page)
   inetd-358   0D..4   14us : .__spin_lock_irqsave (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   15us+: .beat_lpar_hpte_getword0 (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   18us : .__spin_unlock_irqrestore (.beat_lpar_hpte_invalidate)
   inetd-358   0D..4   19us : .flush_hash_page (.flush_hash_range)
   inetd-358   0D..4   20us : .beat_lpar_hpte_invalidate (.flush_hash_page)
   inetd-358   0D..4   20us : .__spin_lock_irqsave (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   21us+: .beat_lpar_hpte_getword0 (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   24us : .__spin_unlock_irqrestore (.beat_lpar_hpte_invalidate)
   inetd-358   0D..4   25us : .flush_hash_page (.flush_hash_range)
   inetd-358   0D..4   25us : .beat_lpar_hpte_invalidate (.flush_hash_page)
   inetd-358   0D..4   26us : .__spin_lock_irqsave (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   26us+: .beat_lpar_hpte_getword0 (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   30us : .__spin_unlock_irqrestore (.beat_lpar_hpte_invalidate)
   inetd-358   0D..4   30us : .flush_hash_page (.flush_hash_range)
   inetd-358   0D..4   31us : .beat_lpar_hpte_invalidate (.flush_hash_page)
   inetd-358   0D..4   31us : .__spin_lock_irqsave (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   32us+: .beat_lpar_hpte_getword0 (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   36us : .__spin_unlock_irqrestore (.beat_lpar_hpte_invalidate)
   inetd-358   0D..4   36us : .flush_hash_page (.flush_hash_range)
   inetd-358   0D..4   37us : .beat_lpar_hpte_invalidate (.flush_hash_page)
   inetd-358   0D..4   37us : .__spin_lock_irqsave (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   38us+: .beat_lpar_hpte_getword0 (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   41us : .__spin_unlock_irqrestore (.beat_lpar_hpte_invalidate)
   inetd-358   0D..4   42us : .flush_hash_page (.flush_hash_range)
   inetd-358   0D..4   42us : .beat_lpar_hpte_invalidate (.flush_hash_page)
   inetd-358   0D..4   43us : .__spin_lock_irqsave (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   43us+: .beat_lpar_hpte_getword0 (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   47us : .__spin_unlock_irqrestore (.beat_lpar_hpte_invalidate)
   inetd-358   0D..4   48us : .flush_hash_page (.flush_hash_range)
   inetd-358   0D..4   48us : .beat_lpar_hpte_invalidate (.flush_hash_page)
   inetd-358   0D..4   49us : .__spin_lock_irqsave (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   49us+: .beat_lpar_hpte_getword0 (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   52us : .__spin_unlock_irqrestore (.beat_lpar_hpte_invalidate)
   inetd-358   0D..4   53us : .flush_hash_page (.flush_hash_range)
   inetd-358   0D..4   54us : .beat_lpar_hpte_invalidate (.flush_hash_page)
   inetd-358   0D..4   54us : .__spin_lock_irqsave (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   55us+: .beat_lpar_hpte_getword0 (.beat_lpar_hpte_invalidate)
   inetd-358   0D..5   58us : .__spin_unlock_irqrestore (.beat_lpar_hpte_invalidate)
   inetd-358   0D..2   59us : .user_trace_stop (.__switch_to)
   inetd-358   0D..2   60us : .user_trace_stop (.__switch_to)
-- owa



More information about the Linuxppc-dev mailing list