[Skiboot] [PATCH v7 2/2] Add SBE driver support
Vasant Hegde
hegdevasant at linux.vnet.ibm.com
Tue Apr 17 21:15:03 AEST 2018
On 04/17/2018 12:10 PM, Benjamin Herrenschmidt wrote:
> On Tue, 2018-04-17 at 11:34 +0530, Vasant Hegde wrote:
>> On 04/17/2018 09:38 AM, Benjamin Herrenschmidt wrote:
>>> On Mon, 2018-04-16 at 17:17 +0530, Vasant Hegde wrote:
.../...
>>
>> Oh yes. If someone is polling without lock yes we have trouble.
>> Will fix.
>>
>>> msg->state should be set to "done" *inside* p9_sbe_msg_complete, and
>>> the latter should make sure it snapshots msg->complete before it
>>> changes msg->state (with a nice fat sync in between). It also needs to
>>> take it out of the list before updating the state as well.
>>
>> Yeah. I can do something like bleow in p9_sbe_msg_complete()
>>
>> static void p9_sbe_msg_complete(struct p9_sbe *sbe, struct p9_sbe_msg *msg)
>> {
>> void (*comp)(struct p9_sbe_msg *msg);
>>
>> comp = msg->complete;
>> list_del(&msg);
>
> sync(); <---- important
Yes. Fixed.
>
>> msg->state = sbe_msg_done;
>> if (comp) {
>> unlock(&sbe->lock);
>> comp(msg);
>> lock(&sbe->lock);
>> }
>> }
>>
>> But now we always endup setting msg->state = done. How do we pass error state to
>> caller?
>
> Pass it as an argument to msg_complete ?
Yes. That should work fine.
>
>> May be allocate resp structure for all messages?
>>
>>>
>>>> +/* WARNING: This will drop sbe->lock */
>>>> +static void p9_sbe_process_queue(struct p9_sbe *sbe)
>>>> +{
>>>> + int rc, retry_cnt = 0;
>>>> + struct p9_sbe_msg *msg = NULL;
>>>> +
>>>> + if (p9_sbe_mbox_busy(sbe))
>>>> + return;
>>>> +
>>>> + while (!list_empty(&sbe->msg_list)) {
>>>> + msg = list_top(&sbe->msg_list, struct p9_sbe_msg, link);
>>>> + /* Send message */
>>>> + rc = p9_sbe_msg_send(sbe->chip_id, msg);
>>>> + if (rc == OPAL_SUCCESS)
>>>> + break;
>>>> +
>>>> + prlog(PR_ERR, "Failed to send message to SBE [chip id = %x]\n",
>>>> + sbe->chip_id);
>>>> + if (msg->resp) {
>>>> + p9_sbe_set_primary_rc(msg->resp,
>>>> + SBE_STATUS_PRI_GENERIC_ERR);
>>>> + }
>>>> + msg->state = sbe_msg_error;
>>>> + p9_sbe_msg_complete(sbe, msg);
>>>> +
>>>> + /*
>>>> + * Repeatedly failed to send message to SBE. Lets stop
>>>> + * sending message.
>>>> + */
>>>> + if (retry_cnt++ >= 3) {
>>>> + prlog(PR_ERR, "Temporarily stopped sending "
>>>> + "message to SBE\n");
>>>> + return;
>>>> + }
>>>> + }
>>>> +
>>>> + if (list_empty(&sbe->msg_list))
>>>> + return;
>>>> +
>>>> + prlog(PR_TRACE, "Message queued [chip id = 0x%x]:\n\t Reg0 : %016llx"
>>>> + "\n\t Reg1 : %016llx\n\t Reg2 : %016llx\n\t Reg3 : %016llx\n",
>>>> + sbe->chip_id, msg->reg[0], msg->reg[1], msg->reg[2], msg->reg[3]);
>>>> +
>>>> + msg->timeout = mftb() + msecs_to_tb(SBE_CMD_TIMEOUT_MAX);
>>>> + p9_sbe_mbox_update_state(sbe, sbe_mbox_send);
>>>> + msg->state = sbe_msg_sent;
>>>
>>> I would personally have the 2 above inside p9_sbe_msg_send(), any
>>> reason why not ?
>>
>> I don't have strong reason for that. I just wanted sbe_msg_send to just send
>> message without dealing with message state etc (like I do in sbe_msg_recv). I
>> can move above two statement inside msg_send().
>
> it's not that much better to have that stuff opencoded in the general
> queue processing either ... not biggie tho.
Actually if I move above hunk to send_msg() I can remove list_empty() check as well.
>>
>>
>> .../...
>>
>>>> +
>>>> +static void __p9_sbe_interrupt(struct p9_sbe *sbe)
>>>> +{
>>>> + bool has_response = false;
>>>> + int rc;
>>>> + u64 data = 0, val;
>>>> + struct p9_sbe_msg *msg = NULL;
>>>> +
>>>> /* Read doorbell register */
>>>> - rc = xscom_read(chip_id, PSU_HOST_DOORBELL_REG_RW, &data);
>>>> + rc = xscom_read(sbe->chip_id, PSU_HOST_DOORBELL_REG_RW, &data);
>>>> if (rc) {
>>>> prlog(PR_ERR, "Failed to read SBE to Host doorbell register "
>>>> - "[chip id = %x]\n", chip_id);
>>>> - goto clr_interrupt;
>>>> + "[chip id = %x]\n", sbe->chip_id);
>>>> + p9_sbe_reg_dump(sbe->chip_id);
>>>> + return;
>>>> }
>>>>
>>>> - /* SBE passtrhough command, call prd handler */
>>>> - if (data & SBE_HOST_PASSTHROUGH) {
>>>> - prd_sbe_passthrough(chip_id);
>>>> + /* Called from poller path, nothing to process */
>>>> + if (!data)
>>>> + return;
>>>> +
>>>> + /* Read SBE response before clearing doorbell register */
>>>> + if (data & SBE_HOST_RESPONSE_WAITING) {
>>>> + if (!list_empty(&sbe->msg_list)) {
>>>> + msg = list_top(&sbe->msg_list, struct p9_sbe_msg, link);
>>>> + p9_sbe_handle_response(sbe->chip_id, msg);
>>>> + msg->state = sbe_msg_done;
>>>> + has_response = true;
>>>> + }
>>>> +
>>>> + /* Reset SBE MBOX state */
>>>> + p9_sbe_mbox_update_state(sbe, sbe_mbox_idle);
>>>> }
>>>
>>> Why do you do that one before the doorbell bit clearing ?
>>>
>>> Also you can hit other cases further down such as the "return"
>>> in the xscom_write error case or the SBE_HOST_RESET which will
>>> make you "lose" the content of "msg" and overwrite its state.
>>
>> Yeah. May be I will have separate local "resp" message, copy response to that
>> and then if everything is fine (like writing xscom , handling reset), I will
>> call msg_complete() with this new resp (as in I will copy "resp" to msg->resp).
>
> Why copy ? You don't need to copy anything ...
>
>>>
>>> You may want to use a separate variable:
>>>
>>> struct p9_sbe_msg *resp.
>>>
>>
>> Yeah. Separate message is cleaner.
>
> Separate variable, same message... or not. Up to you.
>>
>>> Also you have another possible issue. If you send a message that takes
>>> a response, and you get both the ack and the resp bits, you'll end
>>> up doing the above, and *then* hit send_complete, is that what you
>>> really want ?
>>
>> Actually this is fine. I'm processing like below
>> Holding lock
>> process ACK -> if message expects response then send_complete just sets message
>> state
>> and returns
>> process response -> At this stage if we have response then it calls msg->complete()
>>
>>
>>>
>>> You should also double check when you get the response bit that
>>> the message at the head of the list is indeed expecting a response.
>>
>> SBE message is serialized. Until we get response to first message we are not
>> sending second one. So top of the message is the one which is waiting for
>> response. Worst case we may
>> get both ACK and response bit .. which we are handling.
>
> You should still look at ACK first, before you look at response.
>>>
>>> If you don't want to worry too much about the lock dropping, what
>>> you can do is a loop, clearing only the bit you processed and reading
>>> the doorbell again, though that's a bit slower.
>>
>> I think current code is fine with above modification. We should be able to
>> handle with single xscom_read.
>
> I'll review again when you do the changes and see if I can convince
> myself that it is.
Sure.
-Vasant
More information about the Skiboot
mailing list