[Skiboot] [PATCH v7 2/2] Add SBE driver support

Tue Apr 17 21:15:03 AEST 2018

On 04/17/2018 12:10 PM, Benjamin Herrenschmidt wrote:
> On Tue, 2018-04-17 at 11:34 +0530, Vasant Hegde wrote:
>> On 04/17/2018 09:38 AM, Benjamin Herrenschmidt wrote:
>>> On Mon, 2018-04-16 at 17:17 +0530, Vasant Hegde wrote:

.../...

>>
>> Oh yes. If someone is polling without lock yes we have trouble.
>> Will fix.
>>
>>> msg->state should be set to "done" *inside* p9_sbe_msg_complete, and
>>> the latter should make sure it snapshots msg->complete before it
>>> changes msg->state (with a nice fat sync in between). It also needs to
>>> take it out of the list before updating the state as well.
>>
>> Yeah. I can do something like bleow in p9_sbe_msg_complete()
>>
>> static void p9_sbe_msg_complete(struct p9_sbe *sbe, struct p9_sbe_msg *msg)
>> {
>> 	void (*comp)(struct p9_sbe_msg *msg);
>>
>> 	comp = msg->complete;
>> 	list_del(&msg);
> 
> 	sync(); <---- important

Yes. Fixed.

> 
>> 	msg->state = sbe_msg_done;
>> 	if (comp) {
>> 		unlock(&sbe->lock);
>> 		comp(msg);
>> 		lock(&sbe->lock);
>> 	}
>> }
>>
>> But now we always endup setting msg->state = done. How do we pass error state to
>> caller?
> 
> Pass it as an argument to msg_complete ?

Yes. That should work fine.

> 
>> May be allocate resp structure for all messages?
>>
>>>
>>>> +/* WARNING: This will drop sbe->lock */
>>>> +static void p9_sbe_process_queue(struct p9_sbe *sbe)
>>>> +{
>>>> +	int rc, retry_cnt = 0;
>>>> +	struct p9_sbe_msg *msg = NULL;
>>>> +
>>>> +	if (p9_sbe_mbox_busy(sbe))
>>>> +		return;
>>>> +
>>>> +	while (!list_empty(&sbe->msg_list)) {
>>>> +		msg = list_top(&sbe->msg_list, struct p9_sbe_msg, link);
>>>> +		/* Send message */
>>>> +		rc = p9_sbe_msg_send(sbe->chip_id, msg);
>>>> +		if (rc == OPAL_SUCCESS)
>>>> +			break;
>>>> +
>>>> +		prlog(PR_ERR, "Failed to send message to SBE [chip id = %x]\n",
>>>> +		      sbe->chip_id);
>>>> +		if (msg->resp) {
>>>> +			p9_sbe_set_primary_rc(msg->resp,
>>>> +					      SBE_STATUS_PRI_GENERIC_ERR);
>>>> +		}
>>>> +		msg->state = sbe_msg_error;
>>>> +		p9_sbe_msg_complete(sbe, msg);
>>>> +
>>>> +		/*
>>>> +		 * Repeatedly failed to send message to SBE. Lets stop
>>>> +		 * sending message.
>>>> +		 */
>>>> +		if (retry_cnt++ >= 3) {
>>>> +			prlog(PR_ERR, "Temporarily stopped sending "
>>>> +			      "message to SBE\n");
>>>> +			return;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	if (list_empty(&sbe->msg_list))
>>>> +		return;
>>>> +
>>>> +	prlog(PR_TRACE, "Message queued [chip id = 0x%x]:\n\t Reg0 : %016llx"
>>>> +	      "\n\t Reg1 : %016llx\n\t Reg2 : %016llx\n\t Reg3 : %016llx\n",
>>>> +	      sbe->chip_id, msg->reg[0], msg->reg[1], msg->reg[2], msg->reg[3]);
>>>> +
>>>> +	msg->timeout = mftb() + msecs_to_tb(SBE_CMD_TIMEOUT_MAX);
>>>> +	p9_sbe_mbox_update_state(sbe, sbe_mbox_send);
>>>> +	msg->state = sbe_msg_sent;
>>>
>>> I would personally have the 2 above inside p9_sbe_msg_send(), any
>>> reason why not ?
>>
>> I don't have strong reason for that. I just wanted sbe_msg_send to just send
>> message without dealing with message state etc (like I do in sbe_msg_recv). I
>> can move above two statement inside msg_send().
> 
> it's not that much better to have that stuff opencoded in the general
> queue processing either ... not biggie tho.

Actually if I move above hunk to send_msg() I can remove list_empty() check as well.

>>
>>
>> .../...
>>
>>>> +
>>>> +static void __p9_sbe_interrupt(struct p9_sbe *sbe)
>>>> +{
>>>> +	bool has_response = false;
>>>> +	int rc;
>>>> +	u64 data = 0, val;
>>>> +	struct p9_sbe_msg *msg = NULL;
>>>> +
>>>>    	/* Read doorbell register */
>>>> -	rc = xscom_read(chip_id, PSU_HOST_DOORBELL_REG_RW, &data);
>>>> +	rc = xscom_read(sbe->chip_id, PSU_HOST_DOORBELL_REG_RW, &data);
>>>>    	if (rc) {
>>>>    		prlog(PR_ERR, "Failed to read SBE to Host doorbell register "
>>>> -		      "[chip id = %x]\n", chip_id);
>>>> -		goto clr_interrupt;
>>>> +		      "[chip id = %x]\n", sbe->chip_id);
>>>> +		p9_sbe_reg_dump(sbe->chip_id);
>>>> +		return;
>>>>    	}
>>>>    
>>>> -	/* SBE passtrhough command, call prd handler */
>>>> -	if (data & SBE_HOST_PASSTHROUGH) {
>>>> -		prd_sbe_passthrough(chip_id);
>>>> +	/* Called from poller path, nothing to process */
>>>> +	if (!data)
>>>> +		return;
>>>> +
>>>> +	/* Read SBE response before clearing doorbell register */
>>>> +	if (data & SBE_HOST_RESPONSE_WAITING) {
>>>> +		if (!list_empty(&sbe->msg_list)) {
>>>> +			msg = list_top(&sbe->msg_list, struct p9_sbe_msg, link);
>>>> +			p9_sbe_handle_response(sbe->chip_id, msg);
>>>> +			msg->state = sbe_msg_done;
>>>> +			has_response = true;
>>>> +		}
>>>> +
>>>> +		/* Reset SBE MBOX state */
>>>> +		p9_sbe_mbox_update_state(sbe, sbe_mbox_idle);
>>>>    	}
>>>
>>> Why do you do that one before the doorbell bit clearing ?
>>>
>>> Also you can hit other cases further down such as the "return"
>>> in the xscom_write error case or the SBE_HOST_RESET which will
>>> make you "lose" the content of "msg" and overwrite its state.
>>
>> Yeah. May be I will have separate local "resp" message, copy response to that
>> and then if everything is fine (like writing xscom , handling reset), I will
>> call msg_complete() with this new resp (as in I will copy "resp" to msg->resp).
> 
> Why copy ? You don't need to copy anything ...
> 
>>>
>>> You may want to use a separate variable:
>>>
>>> struct p9_sbe_msg *resp.
>>>
>>
>> Yeah. Separate message is cleaner.
> 
> Separate variable, same message... or not. Up to you.
>>
>>> Also you have another possible issue. If you send a message that takes
>>> a response, and you get both the ack and the resp bits, you'll end
>>> up doing the above, and *then* hit send_complete, is that what you
>>> really want ?
>>
>> Actually this is fine. I'm processing like below
>> 	Holding lock
>> 	process ACK -> if message expects response then send_complete just sets message
>> state
>> 					and returns
>> 	process response -> At this stage if we have response then it calls msg->complete()
>>
>>
>>>
>>> You should also double check when you get the response bit that
>>> the message at the head of the list is indeed expecting a response.
>>
>> SBE message is serialized. Until we get response to first message we are not
>> sending second one. So top of the message is the one which is waiting for
>> response. Worst case we may
>> get both ACK and response bit .. which we are handling.
> 
> You should still look at ACK first, before you look at response.
>>>
>>> If you don't want to worry too much about the lock dropping, what
>>> you can do is a loop, clearing only the bit you processed and reading
>>> the doorbell again, though that's a bit slower.
>>
>> I think current code is fine with above modification. We should be able to
>> handle with single xscom_read.
> 
> I'll review again when you do the changes and see if I can convince
> myself that it is.

Sure.

-Vasant