[RFC PATCH] powerpc/powernv: Provide a way to force a core into SMT4 mode

Paul Mackerras paulus at ozlabs.org
Sat Jan 27 13:34:11 AEDT 2018


On Fri, Jan 26, 2018 at 05:06:10PM -0800, Ram Pai wrote:
> On Thu, Jan 25, 2018 at 04:05:12PM +1100, Paul Mackerras wrote:
> > POWER9 processors up to and including "Nimbus" v2.2 have hardware
> > bugs relating to transactional memory and thread reconfiguration.
> > One of these bugs has a workaround which is to get the core into
> > SMT4 state temporarily.  This workaround is only needed when
> > running bare-metal.
> 
> ..snip..
> >   */
> >  _GLOBAL(power9_idle_stop)
> >  	std	r3, PACA_REQ_PSSCR(r13)
> 	
> this instruction can go a little later and save a few cycles, in the
> case it need not have to stop ?
> 
> > +	sync
> > +	lwz	r5, PACA_DONT_STOP(r13)
> > +	cmpwi	r5, 0
> > +	bne	1f
> 
> I mean 'std r3, ...' can move here.

That would introduce a race condition, where this thread would miss
seeing the other thread's store to paca->dont_stop, and the other
thread would miss seeing this thread's store to paca->requested_psscr.

> > +	/* order setting dont_stop vs testing requested_psscr */
> > +	mb();
> > +	for (thr = 0; thr < threads_per_core; ++thr) {
> > +		if (!tpaca[thr].requested_psscr)
> > +			++awake_threads;
> > +		else
> > +			poke_threads |= (1 << thr);
> 
> 			ppc_msgsnd(...)  can be called here in the else part?

It could, but I wanted to avoid disturbing the other threads with the
msgsnd if it was not necessary.  Hence the second loop to do the
msgsnds once we have determined that we really need to do them.

> 	
> > +	}
> > +
> > +	/* If at least 3 threads are awake, the core is in SMT4 already */
> 
> small nitpick --  this comment mentions SMT4 and 3 threads. But the code
> is generically applicable to SMTn and (n-1) threads.

Sure - it's easier to understand a concrete example than something
more general, that's why the comment is about the specific use case
not the general capability of the code.

> > +	if (awake_threads < threads_per_core - 1) {
> 
> 
> > +		/* We have to wake some threads; we'll use msgsnd */
> > +		for (thr = 0; thr < threads_per_core; ++thr) {
> > +			if (poke_threads & (1 << thr))
> > +				ppc_msgsnd(PPC_DBELL_MSGTYPE, 0,
> > +					   tpaca[thr].hw_cpu_id);
> > +		}
> 
> and this loop can be deleted, which inturn can leads to further optimizations.

... at the cost of other threads taking doorbell interrupts
unnecessarily.  I thought it better to put more burden on the thread
needing this synchronization and less on the other threads (on
average).

Paul.


More information about the Linuxppc-dev mailing list