[Skiboot] [PATCH v7 2/2] Add SBE driver support

Vasant Hegde hegdevasant at linux.vnet.ibm.com
Tue Apr 17 16:04:06 AEST 2018


On 04/17/2018 09:38 AM, Benjamin Herrenschmidt wrote:
> On Mon, 2018-04-16 at 17:17 +0530, Vasant Hegde wrote:
>>
>> +static inline bool p9_sbe_mbox_busy(struct p9_sbe *sbe)
>> +{
>> +	return sbe->state != sbe_mbox_idle ? true : false;
>> +}
> 
> What's wrong with just return sbe->state != sbe_mbox_idle ?
> 

That should work fine. Will fix.

>> +static inline void p9_sbe_mbox_update_state(struct p9_sbe *sbe,
>> +					    enum p9_sbe_mbox_state state)
>> +{
>> +	sbe->state = state;
>> +}
> 
> What's the point of the above wrapper ? I don't find it makes
> anything clearer...

Ok. Will remove.


.../...

>> +}
>> +
>> +/* WARNING: This will drop sbe->lock */
>> +static void p9_sbe_send_complete(struct p9_sbe *sbe)
>> +{
>> +	struct p9_sbe_msg *msg;
>> +
>> +	if (list_empty(&sbe->msg_list))
>>   		return;
>>   
>> +	msg = list_top(&sbe->msg_list, struct p9_sbe_msg, link);
>> +	/* Need response */
>> +	if (msg->response) {
>> +		msg->state = sbe_msg_wresp;
>> +	} else {
>> +		msg->state = sbe_msg_done;
>> +		p9_sbe_mbox_update_state(sbe, sbe_mbox_idle);
>> +		p9_sbe_msg_complete(sbe, msg);
>> +	}
>> +}
> 
> Subtle race: if somebody is polling on msg->state waiting for it to be
> done without holding the SBE lock (which could happen), then there's a
> risk that the message gets freed right after msg->state is updated.
> 

Oh yes. If someone is polling without lock yes we have trouble.
Will fix.

> msg->state should be set to "done" *inside* p9_sbe_msg_complete, and
> the latter should make sure it snapshots msg->complete before it
> changes msg->state (with a nice fat sync in between). It also needs to
> take it out of the list before updating the state as well.

Yeah. I can do something like bleow in p9_sbe_msg_complete()

static void p9_sbe_msg_complete(struct p9_sbe *sbe, struct p9_sbe_msg *msg)
{
	void (*comp)(struct p9_sbe_msg *msg);

	comp = msg->complete;
	list_del(&msg);

	msg->state = sbe_msg_done;
	if (comp) {
		unlock(&sbe->lock);
		comp(msg);
		lock(&sbe->lock);
	}
}

But now we always endup setting msg->state = done. How do we pass error state to 
caller?
May be allocate resp structure for all messages?

> 
>> +/* WARNING: This will drop sbe->lock */
>> +static void p9_sbe_process_queue(struct p9_sbe *sbe)
>> +{
>> +	int rc, retry_cnt = 0;
>> +	struct p9_sbe_msg *msg = NULL;
>> +
>> +	if (p9_sbe_mbox_busy(sbe))
>> +		return;
>> +
>> +	while (!list_empty(&sbe->msg_list)) {
>> +		msg = list_top(&sbe->msg_list, struct p9_sbe_msg, link);
>> +		/* Send message */
>> +		rc = p9_sbe_msg_send(sbe->chip_id, msg);
>> +		if (rc == OPAL_SUCCESS)
>> +			break;
>> +
>> +		prlog(PR_ERR, "Failed to send message to SBE [chip id = %x]\n",
>> +		      sbe->chip_id);
>> +		if (msg->resp) {
>> +			p9_sbe_set_primary_rc(msg->resp,
>> +					      SBE_STATUS_PRI_GENERIC_ERR);
>> +		}
>> +		msg->state = sbe_msg_error;
>> +		p9_sbe_msg_complete(sbe, msg);
>> +
>> +		/*
>> +		 * Repeatedly failed to send message to SBE. Lets stop
>> +		 * sending message.
>> +		 */
>> +		if (retry_cnt++ >= 3) {
>> +			prlog(PR_ERR, "Temporarily stopped sending "
>> +			      "message to SBE\n");
>> +			return;
>> +		}
>> +	}
>> +
>> +	if (list_empty(&sbe->msg_list))
>> +		return;
>> +
>> +	prlog(PR_TRACE, "Message queued [chip id = 0x%x]:\n\t Reg0 : %016llx"
>> +	      "\n\t Reg1 : %016llx\n\t Reg2 : %016llx\n\t Reg3 : %016llx\n",
>> +	      sbe->chip_id, msg->reg[0], msg->reg[1], msg->reg[2], msg->reg[3]);
>> +
>> +	msg->timeout = mftb() + msecs_to_tb(SBE_CMD_TIMEOUT_MAX);
>> +	p9_sbe_mbox_update_state(sbe, sbe_mbox_send);
>> +	msg->state = sbe_msg_sent;
> 
> I would personally have the 2 above inside p9_sbe_msg_send(), any
> reason why not ?

I don't have strong reason for that. I just wanted sbe_msg_send to just send 
message without dealing with message state etc (like I do in sbe_msg_recv). I 
can move above two statement inside msg_send().


.../...

>> +
>> +static void __p9_sbe_interrupt(struct p9_sbe *sbe)
>> +{
>> +	bool has_response = false;
>> +	int rc;
>> +	u64 data = 0, val;
>> +	struct p9_sbe_msg *msg = NULL;
>> +
>>   	/* Read doorbell register */
>> -	rc = xscom_read(chip_id, PSU_HOST_DOORBELL_REG_RW, &data);
>> +	rc = xscom_read(sbe->chip_id, PSU_HOST_DOORBELL_REG_RW, &data);
>>   	if (rc) {
>>   		prlog(PR_ERR, "Failed to read SBE to Host doorbell register "
>> -		      "[chip id = %x]\n", chip_id);
>> -		goto clr_interrupt;
>> +		      "[chip id = %x]\n", sbe->chip_id);
>> +		p9_sbe_reg_dump(sbe->chip_id);
>> +		return;
>>   	}
>>   
>> -	/* SBE passtrhough command, call prd handler */
>> -	if (data & SBE_HOST_PASSTHROUGH) {
>> -		prd_sbe_passthrough(chip_id);
>> +	/* Called from poller path, nothing to process */
>> +	if (!data)
>> +		return;
>> +
>> +	/* Read SBE response before clearing doorbell register */
>> +	if (data & SBE_HOST_RESPONSE_WAITING) {
>> +		if (!list_empty(&sbe->msg_list)) {
>> +			msg = list_top(&sbe->msg_list, struct p9_sbe_msg, link);
>> +			p9_sbe_handle_response(sbe->chip_id, msg);
>> +			msg->state = sbe_msg_done;
>> +			has_response = true;
>> +		}
>> +
>> +		/* Reset SBE MBOX state */
>> +		p9_sbe_mbox_update_state(sbe, sbe_mbox_idle);
>>   	}
> 
> Why do you do that one before the doorbell bit clearing ?
> 
> Also you can hit other cases further down such as the "return"
> in the xscom_write error case or the SBE_HOST_RESET which will
> make you "lose" the content of "msg" and overwrite its state.

Yeah. May be I will have separate local "resp" message, copy response to that 
and then if everything is fine (like writing xscom , handling reset), I will 
call msg_complete() with this new resp (as in I will copy "resp" to msg->resp).

> 
> You may want to use a separate variable:
>
> struct p9_sbe_msg *resp.
>

Yeah. Separate message is cleaner.


> Also you have another possible issue. If you send a message that takes
> a response, and you get both the ack and the resp bits, you'll end
> up doing the above, and *then* hit send_complete, is that what you
> really want ?

Actually this is fine. I'm processing like below
	Holding lock
	process ACK -> if message expects response then send_complete just sets message 
state
					and returns
	process response -> At this stage if we have response then it calls msg->complete()


> 
> You should also double check when you get the response bit that
> the message at the head of the list is indeed expecting a response.

SBE message is serialized. Until we get response to first message we are not 
sending second one. So top of the message is the one which is waiting for 
response. Worst case we may
get both ACK and response bit .. which we are handling.

> 
> If you don't want to worry too much about the lock dropping, what
> you can do is a loop, clearing only the bit you processed and reading
> the doorbell again, though that's a bit slower.

I think current code is fine with above modification. We should be able to 
handle with single xscom_read.


Thanks!
-Vasant



More information about the Skiboot mailing list