[Skiboot] 答复: Qestions about skiboot timer

Nicholas Piggin npiggin at gmail.com
Thu Sep 7 14:18:16 AEST 2023


+cc skiboot list

Hey no problem, sorry I only just noticed your msg.

Yes that is the scenario I was imagining. Comment says "Warning: no
locking at the moment, do at init time only", however there are
definitely some pollers that get added after secondary CPUs call in to
their idle loop and therefore they can get jobs and could potentially do
something that runs pollers I think.

Possibly buggy, definitely fragile. There's actually not that many
pollers, maybe it wouldn't be too hard to move them all to timers and
remove poller code.

Thanks,
Nick

On Thu Sep 7, 2023 at 12:16 PM AEST,  wrote:
> hi Nick,
>
>   Thanks for detailed analyzation, benefit a lot. 
>
>   As for question 3, further study the code, opal_poll_lock is only used in opal_add_poller() and opal_del_poller() process.
> The scenario you mentioned remind me , I also think it is a risk. 
>       thread0 is executing  opal_add_poller(), 
>       thread 1 is executing  list_for_each(&opal_pollers, poll_ent, link)  
>   thread 1 may go wild when the link pointer is modifying.   
>
>   Thanks, really appreciate your rigorous analyzation.
> BR
>
> -----邮件原件-----
> 发件人: Nicholas Piggin [mailto:npiggin at gmail.com] 
> 发送时间: 2023年9月6日 13:13
> 收件人: yanqin.zhang at shingroup.cn; abhishek at linux.ibm.com; clombard at linux.ibm.com
> 抄送: 刘宇 <yu.liu at shingroup.cn>
> 主题: Re: Qestions about skiboot timer
>
> On Tue Aug 8, 2023 at 12:26 PM AEST,  wrote:
> >  
> >
> > Hi skiboot experts,
> >
> >  
> >
> > Here are some questions on skiboot timer module, please help to answer:
> >
> >  
> >
> > 1.     On skiboot boot phase, OS not setup, so there is no OPAL_POLL_EVENTS
> > or hardware interrupt that can trigger check_timer().
> >
> > How can we trigger check_timer() in this phase? APPs like PCIe/I2C can 
> > only actively call check_timer() to trigger?
>
> Basically it is driven by opal_run_pollers being called when we are waiting.
>
> I started on some patches that allowed skiboot to take timer interrupts during boot, but from there to actually running non-trivial code would be more work. Likely you'd add a new irq-safe timer variant and just run those, when you had converted its code over to use local irq disable/enable... In the end I didn't see the benefit justify added complexity.
>
> >
> > 2.     Some events can be added in opal poller list, some can be added in
> > timer(TIMER_POLL),
> >
> > They can all be executed in OPAL_POLL_EVENTS –> opal_run_pollers(), Is 
> > there any criterion to add which one?
>
> See commit eec6e53ff8. Looks like TIMER_POLL is preferred, but we never really got around to switching much to the new API. I think pollers were just never such a big problem even if timers have some advantages.
>
> >
> > 3.     timer(TIMER_POLL) has overhead lock in __check_poll_timers, but there
> > is no lock in opal poller process. 
> >
> > Besides, __check_poll_timers remove the timer that processed, so
> > timer(TIMER_POLL) Is a one-time timer, not like opal poller which is 
> > period timer.
> >
> > It seem opal poller is more advantage than timer(TIMER_POLL)?
>
> pollers seem to avoid a lock because they don't support removal.
> But now I look at it I'm nervous. AFAIK, list_add is not designed to be safe vs concurrent list_for_each. There are no memory or compiler barriers. How would you stop the list walker from loading some old value of a ->next pointer of a newly added object and then walk off into the wilderness?
>
> I think we may need to hold the poll lock when running pollers.
>
> Thanks,
> Nick
>  



More information about the Skiboot mailing list