[Skiboot] [PATCH] Ensure we run pollers in cpu_wait_job()
Stewart Smith
stewart at linux.vnet.ibm.com
Thu Oct 8 12:12:13 AEDT 2015
Patrick Williams <patrick at stwcx.xyz> writes:
> On Thu, Oct 08, 2015 at 09:59:09AM +1100, Stewart Smith wrote:
>> Patrick Williams <patrick at stwcx.xyz> writes:
>> > On Wed, Oct 07, 2015 at 06:22:44AM -0500, Patrick Williams wrote:
>> >> On Wed, Oct 07, 2015 at 10:48:42AM +1100, Stewart Smith wrote:
>> >> > unsigned long ticks = usecs_to_tb(5);
>> >> > + unsigned long period = msecs_to_tb(5);
>> >> >
>> >> > + time_waited+=ticks;
>> >> > + if (time_waited % period == 0)
>> >> > + opal_run_pollers();
>> >>
>> >> This feels incorrect to me. If there is a rounding error in your
>> >> msecs_to_tb vs usecs_to_tb there is no certainty that LCM(period,ticks)
>> >> is period. In that case your still not going to run opal_run_pollers as
>> >> often as you would like.
>> >
>> > I wrote this wrong. What I meant was that there is no certainty that:
>> > period % ticks == 0
>>
>> S, I went and wrote a unit test of course (which was lacking for all the
>> tb functions anyway), and then went "well, everything looks good" until
>> I remembered reading the riveting Chapter 6 of Book II of the PowerISA
>> and the bits about the variation of incrementing the timebase over
>> specified frequency and then went "darn... you're right, this isn't
>> about math at all!"
>>
>> Except that here we're okay, because we're not using the timebase
>> register at all, we're using the computed ticks off a constant tb_hz
>> value of 512mhz and if time_wait(ticks) takes longer/shorter than what
>> we ask it doesn't actually matter, we increment time_waited by the exact
>> number of ticks we *meant* to wait for and will end up hitting
>> time_waited % period == 0 when we expect to (or at least close enough
>> to) - as it's just integer math with pretty nice round numbers.
>>
>
> Since tb_hz is currently defined to 512000000, (msecs_to_tb(1) %
> usecs_to_tb(1) == 0), so you don't have any problem. If tb_hz is not a
> nice multiple of 100000, then you would have a problem.
>
> Example:
> tb_hz = 520093696; // 0x1F000000
> usecs_to_tb(5) == 2600
> msecs_to_tb(5) == 2600470
>
> In that case (time_waited % period == 0) will never be true.
>
> If you're confident the tb_hz will always be 512000000, then I guess it
> doesn't matter, but I know Hostboot runs at a time when the tb is
> actually variable.
At least for the tb_hz that's used in the math, pretty confident (at
least for the moment) that we'll be doing the math on a clean 512mhz
value.
I've put an explicit assert in a unit test with a comment about it to
hopefully catch it if we ever go and read what the timebase hz
*actually* is - because, as you point out, things will break.
More information about the Skiboot
mailing list