[Skiboot] [PATCH] i2c: Fix i2c request hang during opal init if timers are not checked
Frederic Barrat
fbarrat at linux.ibm.com
Tue Dec 11 19:17:20 AEDT 2018
Le 11/12/2018 à 04:41, Stewart Smith a écrit :
> Frederic Barrat <fbarrat at linux.ibm.com> writes:
>> If an i2c request cannot go through the first time, because the bus is
>> found in error and need a reset or it's locked by the OCC for example,
>> the underlying i2c implementation is using timers to manage the
>> request. However during opal init, opal pollers may not be called, it
>> depends in the context in which the i2c request is made. If the
>> pollers are not called, the timers are not checked and we can end up
>> with an i2c request which will not move foward and skiboot hangs.
>>
>> Fix it by explicitly checking the timers if we are waiting for an i2c
>> request to complete and it seems to be taking a while.
>
> Okay, I see what happens here: we're doing the i2c request in the main
> thread and doing time_wait() with a value that'll never run pollers and
> thus never check_timers() either.
>
> Looking at the code here, is there any reason we shouldn't just call
> check_timers(false) straight after the time_wait() ?
The only reason I was reluctant to do so is that the default
time_to_wait we get from i2c_run_req() is pretty short, so I didn't want
to degrade a common path for the benefit of a supposedly rarer error
path (bus in use by OCC, bus found in error state, ...).
Also I didn't call opal_run_pollers() for a similar reason, as the call
is more expensive and I got scared by how we treat it in
time_wait_poll(), seemingly avoiding calling it too frequently.
I basically did the minimum required to get the i2c request unstuck, so
that the main thread can get back to do real work during opal init.
Fred
>
>>
>> Signed-off-by: Frederic Barrat <fbarrat at linux.ibm.com>
>> ---
>> core/i2c.c | 16 ++++++++++++++++
>> 1 file changed, 16 insertions(+)
>>
>> diff --git a/core/i2c.c b/core/i2c.c
>> index 26926fc2..3c600024 100644
>> --- a/core/i2c.c
>> +++ b/core/i2c.c
>> @@ -21,6 +21,7 @@
>> #include <opal-msg.h>
>> #include <timebase.h>
>> #include <processor.h>
>> +#include <timer.h>
>>
>> static LIST_HEAD(i2c_bus_list);
>>
>> @@ -177,6 +178,7 @@ int i2c_request_send(int bus_id, int dev_addr, int read_write,
>> struct i2c_bus *bus;
>> uint64_t time_to_wait = 0;
>> struct i2c_sync_userdata ud;
>> + uint64_t timer_period = msecs_to_tb(5), timer_count;
>>
>> bus = i2c_find_bus_by_id(bus_id);
>> if (!bus) {
>> @@ -215,6 +217,7 @@ int i2c_request_send(int bus_id, int dev_addr, int read_write,
>>
>> for (retries = 0; retries <= MAX_NACK_RETRIES; retries++) {
>> waited = 0;
>> + timer_count = 0;
>>
>> i2c_queue_req(req);
>>
>> @@ -224,6 +227,19 @@ int i2c_request_send(int bus_id, int dev_addr, int read_write,
>> time_to_wait = REQ_COMPLETE_POLLING;
>> time_wait(time_to_wait);
>> waited += time_to_wait;
>> + timer_count += time_to_wait;
>> + if (timer_count > timer_period) {
>> + /*
>> + * The above request may be relying on
>> + * timers to complete, yet there may
>> + * not be called, especially during
>> + * opal init. We could be looping here
>> + * forever. So explicitly check the
>> + * timers once in a while
>> + */
>> + check_timers(false);
>> + timer_count = 0;
>> + }
>> } while (!ud.done);
>>
>> lwsync();
>> --
>> 2.19.1
>>
>> _______________________________________________
>> Skiboot mailing list
>> Skiboot at lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/skiboot
>>
>
More information about the Skiboot
mailing list