[Skiboot] [PATCH v5 4/5] Add SBE driver support

Tue Jul 25 00:40:03 AEST 2017

On 07/24/2017 04:03 AM, Benjamin Herrenschmidt wrote:
> On Sun, 2017-07-23 at 17:00 +0530, Vasant Hegde wrote:
>>
>> +	/*
>> +	 * SBE uses TB value for scheduling timer. Hence init after
>> +	 * chiptod init
>> +	 */
>> +	sbe_init();
>> +
>
> 	p9_sbe_init();

Sure. Will rename all functions.

will change P8 related function names as well (like p8_sbe_init_timer).

>
>>  	/* Initialize i2c */
>>  	p8_i2c_init();
>>
>> diff --git a/hw/sbe-p9.c b/hw/sbe-p9.c
>> index c9c3feb..3af1094 100644
>> --- a/hw/sbe-p9.c
>> +++ b/hw/sbe-p9.c
>> @@ -23,39 +23,613 @@
>>  #include <sbe-p9.h>
>>  #include <skiboot.h>
>>  #include <timebase.h>
>> -#include <timer.h>
>>  #include <trace.h>
>>  #include <xscom.h>
>
> Add some comment documenting the protocol & doorbell register use.

Sure.

>
>> -void sbe_interrupt(uint32_t chip_id)
>> +struct sbe {
>
> p9_sbe... (and everything else here)

Ok.

.../...

>> +
>> +static u64 sbe_rreg(u32 chip_id, u64 reg)
>> +{
>> +	u64 data = 0;
>> +
>> +	xscom_read(chip_id, reg, &data);
>
> Error handling ?

Missed to handle error here. Will add check .

.../...

>> +
>> +static void sbe_fillmsg(struct sbe_msg *msg, u16 cmd,
>> +			u16 ctrl_flag, u64 reg1, u64 reg2, u64 reg3)
>> +{
>> +	bool response = !!(ctrl_flag & SBE_CMD_CTRL_RESP_REQ);
>> +
>> +	/* Seqence ID is filled by sbe_queue_msg() */
>> +	msg->reg[0] = ((u64)ctrl_flag << 32) | cmd;
>> +	msg->reg[1] = reg1;
>> +	msg->reg[2] = reg2;
>> +	msg->reg[3] = reg3;
>> +
>> +	prlog(PR_TRACE, "MBOX message :\n\t Reg0 : %llx\n\t "
>> +	      "Reg1 : %llx\n\t Reg2 : %llx\n\t Reg3 : %lld\n",
>> +	      msg->reg[0], msg->reg[1], msg->reg[2], msg->reg[3]);
>
> That trace is more useful at the point where the message is queued.

Yes. Will fix .

>
>> +	msg->state = sbe_msg_unused;
>> +	msg->response = response;
>> +}
>> +
>> +/*
>> + * Handles "command with direct data" format only.
>> + *
>> + * Note: All mbox messages of our interest uses direct data format. if we need
>> + *       need indirect data format then we may have to enhance this function.
>
> How does indirect work, can you tell me more ?

In case of direct message data is passed as part of mbox message itself (via 
registers).
But in case of indirect data is passed via memory and we pass memory address in 
mbox command.

 From implementation point of view their is a difference in message format.

register 0 -> word 0 :
	direct cmd  : reserved << 16 | ctrl flag
	indirect cmd: mem_addr_size_dword

Presently this function handles only direct command. Whenever we need support 
for indirect data format, we can tweak this function or add wrapper around this 
function.

>
>> +struct sbe_msg *sbe_mkmsg(u16 cmd, u16 ctrl_flag,
>> +			  u64 reg1, u64 reg2, u64 reg3)
>> +{
>> +	struct sbe_msg *msg;
>> +
>> +	msg = zalloc(sizeof(struct sbe_msg));
>> +	if (!msg) {
>> +		prlog(PR_ERR, "Failed to allocate memory for SBE message\n");
>> +		return NULL;
>> +	}
>> +
>> +	sbe_fillmsg(msg, cmd, ctrl_flag, reg1, reg2, reg3);
>> +	return msg;
>> +}
>> +
>> +static inline bool sbe_busy(struct sbe_msg *msg)
>> +{
>> +	switch (msg->state) {
>> +	case sbe_msg_unused:
>> +	case sbe_msg_queued:
>> +	case sbe_msg_done:
>> +	case sbe_msg_timeout:
>> +	case sbe_msg_cancelled:
>
> Your enum has 8 states, it's easier to test the 3 that
> return true than the 5 that return false no ?

Yes. You are right. Also may be I can get rid of sbe_msg_timeout .

 > Also this
> isn't clear what that means, can you elaborate ?

I've added comment in header file.

159         sbe_msg_unused = 0,     /* Free */
160         sbe_msg_queued,         /* Queued to SBE list */
161         sbe_msg_sent,           /* Sent to SBE */
162         sbe_msg_wresp,          /* Waiting for response */
163         sbe_msg_response,       /* Got response */
164         sbe_msg_done,           /* Complete */
165         sbe_msg_timeout,        /* Message timeout */
166         sbe_msg_cancelled,      /* Message removed from queue */

>
>> +		return false;
>> +	default:	/* + sbe_msg_response, sbe_msg_sent, sbe_msg_wresp */
>> +		break;
>> +	}
>> +
>> +	return true;
>> +}
>> +
>> +/* Read SBE doorbell to confirm its ready to accept command. */
>> +static bool sbe_hw_ready(u32 chip_id)
>>  {
>>  	int rc;
>>  	u64 data;
>> +
>> +	rc = xscom_read(chip_id, PSU_SBE_DOORBELL_REG_RW, &data);
>> +	if (rc) {
>> +		prlog(PR_ERR, "Failed to read Host to SBE doorbell register "
>> +		      "[chip id = %x]\n", chip_id);
>> +		return false;
>> +	}
>> +
>> +	/* Waiting for SBE to process previous message */
>> +	if (data & HOST_SBE_MSG_WAITING)
>> +		return false;
>
> So when you try to send 2 msg back to back, the first one gets sent,
> which sets the above right ?

We populate data registers and then enable above bit in doorbell register.
This will trigger interrupt to SBE.

Yes. First message sets this bit.

>
> Then the SBE clears it ? How long does it take ? Do we get an interrupt
> to know it's been cleared ? Otherwise how do we know we can send the
> next message if it's not a message that takes a response ?

Yes. SBE clears it. I didn't profile how long it takes. Presently I'm assuming 
once SBE clears
this bit we are good to proceed.

And yes, for messages that doesn't need response I'm not waiting. That needs to 
be fixed.

Actually we can request for acknowledgment from SBE (like part of control flag 
in message header).
SBE will read message and sends ACK interrupt to host.

We can enforce this before sending message to SBE. But we have to handle extra 
interrupt.
Is it fine to handle extra interrupt (specially for timer messages)?

>
>> +	return true;
>> +}
>> +
>> +static int sbe_msg_send(u32 chip_id, struct sbe_msg *msg)
>> +{
>> +	int rc, i;
>> +	u64 addr, *data;
>> +
>> +	addr = PSU_HOST_SBE_MBOX_REG0;
>> +	data = &msg->reg[0];
>> +
>> +	for (i = 0; i < NR_HOST_SBE_MBOX_REG; i++) {
>> +		rc = xscom_write(chip_id, addr, *data);
>> +		if (rc)
>> +			return rc;
>> +
>> +		addr++;
>> +		data++;
>> +	}
>> +
>> +	rc = xscom_write(chip_id, PSU_SBE_DOORBELL_REG_OR,
>> +			 HOST_SBE_MSG_WAITING);
>> +	if (rc == OPAL_SUCCESS)
>> +		msg->state = sbe_msg_sent;
>
> I would do the state change in the caller

Makes sense. Will fix in next iteration.

>
>> +
>> +	return rc;
>> +}
>> +
>> +static int sbe_msg_receive(u32 chip_id, struct sbe_msg *resp)
>> +{
>> +	int i;
>> +	int rc = OPAL_SUCCESS;
>> +	u64 addr, *data;
>> +
>> +	addr = PSU_HOST_SBE_MBOX_REG4;
>> +	data = &resp->reg[0];
>> +
>> +	for (i = 0; i < NR_HOST_SBE_MBOX_REG; i++) {
>> +		rc = xscom_read(chip_id, addr, data);
>> +		if (rc)
>> +			return rc;
>> +
>> +		addr++;
>> +		data++;
>> +	}
>> +
>> +	return rc;
>> +}
>> +
>> +/* This is called with sbe->lock */
>> +static void sbe_msg_complete(struct sbe *sbe, struct sbe_msg *msg)
>> +{
>> +	prlog(PR_INSANE, "Completing msg [chip id = %x], reg0 : 0x%llx\n",
>> +	      sbe->chip_id, msg->reg[0]);
>> +
>> +	msg->state = sbe_msg_done;
>> +	if (msg->response)
>> +		msg->resp = sbe->resp;
>> +
>> +	list_del_from(&sbe->msg_list, &msg->link);
>> +
>> +	if (msg->complete) {
>> +		unlock(&sbe->lock);
>> +		msg->complete(msg);
>> +		lock(&sbe->lock);
>> +	}
>> +}
>> +
>> +/* This is called with sbe->lock */
>> +static void sbe_msg_send_complete(struct sbe *sbe, struct sbe_msg *msg)
>> +{
>
> Tis should be inside __sbe_poke_queue(), that arbitrary split into 2
> functions isn't useful

Makes sense. Will fix in next iteration.

>
>> +	u64 timeout;
>> +
>> +	prlog(PR_INSANE, "Completing send [chip id = %x], reg0 : 0x%llx\n",
>> +	      sbe->chip_id, msg->reg[0]);
>> +
>> +	/* Need response */
>> +	if (msg->response) {
>> +		timeout = sbe_get_cmdclass_timeout(msg->reg[0] & 0xffff);
>> +		msg->timeout = mftb() + msecs_to_tb(timeout);
>> +		msg->state = sbe_msg_wresp;
>> +	} else {
>> +		sbe_msg_complete(sbe, msg);
>
> Don't you get some sort of indication from the SBE that the message has
> been accepted ? Otherwise how do you know that it's ready to send a new
> one and you should process you queue further ?

As described above, we can request for ack from SBE. That way we are sure
that SBE has read the message.

>
> Also add a clear comment that sbe_msg_complete can drop the lock and
> carry that comment accross the call chain.

sure.

>
>> +	}
>> +}
>> +
>> +static void __sbe_poke_queue(struct sbe *sbe, struct sbe_msg *msg)
>> +{
>
> Call this something like sbe_process_queue

Sure.

>
>> +	int rc;
>> +
>> +	if (!sbe_hw_ready(sbe->chip_id))
>> +		return;
>> +
>> +	/* Send message */
>> +	rc = sbe_msg_send(sbe->chip_id, msg);
>> +	if (rc) {
>> +		prlog(PR_ERR, "Failed to send message to SBE "
>> +		      "[chip id = %x]\n", sbe->chip_id);
>> +		sbe_set_primary_rc(sbe->resp, SBE_STATUS_PRI_GENERIC_ERR);
>> +		sbe_msg_complete(sbe, msg);
>> +	} else {
>> +		sbe_msg_send_complete(sbe, msg);
>
> I'm not sure why that is a separate function. I would "return" at the
> end of the error case and just do the rest direclty here.

Yes, separate function not required.

>
>> +	}
>> +}
>> +
>> +static void sbe_poke_queue(struct sbe *sbe)
>> +{
>> +	struct sbe_msg *msg;
>> +
>> +	while (!list_empty(&sbe->msg_list)) {
>> +		msg = list_top(&sbe->msg_list, struct sbe_msg, link);
>> +		if (sbe_busy(msg))
>> +			return;
>
> I don't like that you are using the state of the top message in the
> queue to know if the SBE is busy. I'd rather you kept a separate state
> of the mbox itself.

Yeah. Separate state of MBOX is required when we handle SBE R/R etc. I thought 
for now
message state is enough. Let me try to introduce/use per SBE state.

>
>> +		__sbe_poke_queue(sbe, msg);
>> +	}
>> +}
>> +
>> +/*
>> + * WARNING:
>> + *         Only one command is accepted in the command buffer until response
>> + *         to the command is enqueued in the response buffer by SBE.
>> + *
>> + *         Head of msg_list contains in-flight message. Hence we should always
>> + *         add new message to tail of the list.
>> + */
>> +int sbe_queue_msg(u32 chip_id, struct sbe_msg *msg,
>> +		  void (*comp)(struct sbe_msg *msg))
>> +{
>>  	struct proc_chip *chip;
>> +	struct sbe *sbe;
>> +
>> +	if (!msg)
>> +		return OPAL_PARAMETER;
>> +
>> +	if (sbe_default_chip_id == -1)
>> +		return OPAL_HARDWARE;
>> +
>> +	/* Default to SBE on master chip */
>> +	if (chip_id == -1)
>> +		chip = get_chip(sbe_default_chip_id);
>> +	else
>> +		chip = get_chip(chip_id);
>> +
>> +	if (chip == NULL || chip->sbe == NULL)
>> +		return OPAL_INTERNAL_ERROR;
>> +
>> +	sbe = chip->sbe;
>> +
>> +	/* Set completion */
>> +	msg->complete = comp;
>> +	msg->state = sbe_msg_queued;
>> +
>> +	lock(&sbe->lock);
>> +	/* Update sequence number */
>> +	msg->reg[0] = msg->reg[0] | ((u64)sbe->cur_seq << 16);
>> +	sbe->cur_seq++;
>> +
>> +	/* Reset sequence number */
>> +	if (sbe->cur_seq == 0xffff)
>> +		sbe->cur_seq = 1;
>> +
>> +	/* Add message to queue */
>> +	list_add_tail(&sbe->msg_list, &msg->link);
>> +	sbe_poke_queue(sbe);
>> +	unlock(&sbe->lock);
>> +
>> +	return OPAL_SUCCESS;
>> +}
>> +
>> +int sbe_sync_msg(u32 chip_id, struct sbe_msg *msg, bool autofree)
>> +{
>> +	int rc = OPAL_SUCCESS;
>> +
>> +	rc = sbe_queue_msg(chip_id, msg, NULL);
>> +	if (rc)
>> +		goto free_msg;
>> +
>> +	while (msg->state != sbe_msg_done &&
>> +	       msg->state != sbe_msg_cancelled) {
>> +		cpu_relax();
>> +		sbe_timeout_poll(NULL);
>> +	}
>>
> What about timeout and other non-sense states ?

Yes. I missed timeout state. Anyway I'm going to remove that state.

>
>> +	if (msg->state == sbe_msg_done)
>> +		rc = 0;
>> +	else
>> +		rc = -1;
>> +
>> +	if (msg->response && msg->resp)
>> +		rc = sbe_get_primary_rc(msg->resp);
>> +
>> +free_msg:
>> +	if (autofree)
>> +		sbe_freemsg(msg);
>> +
>> +	return rc;
>> +}
>> +
>> +int sbe_cancelmsg(u32 chip_id, struct sbe_msg *msg)
>> +{
>> +	struct proc_chip *chip;
>> +	struct sbe *sbe;
>>
>>  	chip = get_chip(chip_id);
>> -	if (chip == NULL)
>> +	if (chip == NULL || chip->sbe == NULL)
>> +		return OPAL_PARAMETER;
>> +
>> +	sbe = chip->sbe;
>> +
>> +	lock(&sbe->lock);
>> +	if (msg->state != sbe_msg_queued) {
>> +		unlock(&sbe->lock);
>> +		return OPAL_BUSY;
>> +	}
>> +
>> +	list_del_from(&sbe->msg_list, &msg->link);
>> +	msg->state = sbe_msg_cancelled;
>> +	unlock(&sbe->lock);
>
> What if that message has a pending response ? You must not lose that
> state since the mbox will still send you a response... another reason
> to have a mbox state rather than message state.

I'm removing messages that are in queued state (not yet sent to SBE). So I think 
above code is fine.

>
>> +	return OPAL_SUCCESS;
>> +}
>> +
>> +static void sbe_handle_response(struct sbe *sbe)
>> +{
>> +	u16 send_seq, resp_seq;
>> +	int rc;
>> +	struct sbe_msg *msg;
>> +
>> +	if (list_empty(&sbe->msg_list))
>> +		return;
>> +
>> +	/* Clear response registers */
>> +	memset(sbe->resp, 0, sizeof(struct sbe_msg));
>> +
>> +	msg = list_top(&sbe->msg_list, struct sbe_msg, link);
>
> So if the message was cancelled you now give its response to whatever
> the next one was, doesn't look quite right to me.

I'm not canceling already queued message. So its fine.

>
>> +	rc = sbe_msg_receive(sbe->chip_id, sbe->resp);
>> +	if (rc != OPAL_SUCCESS) {
>> +		prlog(PR_ERR, "Failed to read response message "
>> +		      "[chip id = %x]\n", sbe->chip_id);
>> +		sbe_set_primary_rc(sbe->resp, SBE_STATUS_PRI_GENERIC_ERR);
>> +		return;
>> +	}
>> +
>> +	/* Update message state */
>> +	msg->state = sbe_msg_response;
>> +
>> +	/* Validate sequence number */
>> +	send_seq = (msg->reg[0] >> 16) & 0xffff;
>> +	resp_seq = (sbe->resp->reg[0] >> 16) & 0xffff;
>> +	if (send_seq != resp_seq) {
>> +		/*
>> +		 * XXX Reset SBE and repost message.
>> +		 *     Lets send sequence error to caller until SBE reset works.
>> +		 */
>> +		prlog(PR_ERR, "Invalid sequence id [chip id = %x]\n",
>> +		      sbe->chip_id);
>> +		sbe_set_primary_rc(sbe->resp, SBE_STATUS_PRI_SEQ_ERR);
>>  		return;
>> +	}
>> +}
>> +
>> +void sbe_interrupt(uint32_t chip_id)
>> +{
>> +	bool msg_complete = false;
>> +	int rc;
>> +	u64 data = 0;
>> +	struct proc_chip *chip;
>> +	struct sbe_msg *msg;
>> +	struct sbe *sbe;
>> +
>> +	chip = get_chip(chip_id);
>> +	if (chip == NULL || chip->sbe == NULL)
>> +		return;
>> +	sbe = chip->sbe;
>>
>>  	/* Read doorbell register */
>>  	rc = xscom_read(chip_id, PSU_HOST_DOORBELL_REG_RW, &data);
>>  	if (rc) {
>>  		prlog(PR_ERR, "Failed to read SBE to Host doorbell register "
>>  		      "[chip id = %x]\n", chip_id);
>> -		goto clr_interrupt;
>> +		sbe_reg_dump(chip_id);
>> +		return;
>>  	}
>>
>> -	/* SBE passtrhough command, call prd handler */
>> -	if (data & SBE_HOST_PASSTHROUGH) {
>> -		prd_sbe_passthrough(chip_id);
>> +	/* Called from poller path (sbe_timeout_poll), nothing to process */
>> +	if (!data)
>> +		return;
>> +
>> +	/* Read SBE response before clearing doorbell register */
>> +	if (data & SBE_HOST_RESPONSE_WAITING) {
>> +		lock(&sbe->lock);
>> +		sbe_handle_response(sbe);
>> +		msg_complete = true;
>> +		unlock(&sbe->lock);
>>  	}
>>
>> -clr_interrupt:
>> -	/* Clears all the bits */
>> +	/* Clear doorbell register so that we can send next message */
>>  	rc = xscom_write(chip_id, PSU_HOST_DOORBELL_REG_AND,
>>  			 SBE_HOST_RESPONSE_CLEAR);
>>  	if (rc) {
>>  		prlog(PR_ERR, "Failed to clear SBE to Host doorbell "
>>  		      "register [chip id = %x]\n", chip_id);
>> +		return;
>> +	}
>> +
>> +	/* SBE came back from reset */
>> +	if (data & SBE_HOST_RESET) {
>> +		prlog(PR_NOTICE, "Came back from reset [chip id = %x]\n",
>> +		      chip_id);
>> +
>> +		/* Repost message */
>> +		lock(&sbe->lock);
>> +		if (!list_empty(&sbe->msg_list)) {
>> +			msg = list_top(&sbe->msg_list, struct sbe_msg, link);
>> +			msg->state = sbe_msg_queued;
>> +			sbe_poke_queue(sbe);
>> +		}
>> +		unlock(&sbe->lock);
>> +		return;
>> +	}
>> +
>> +	/* SBE passthrough command, call prd handler */
>> +	if (data & SBE_HOST_PASSTHROUGH) {
>> +		prd_sbe_passthrough(chip_id);
>> +	}
>> +
>> +	/* Handle SBE response */
>> +	if (msg_complete) {
>> +		lock(&sbe->lock);
>> +		if (!list_empty(&sbe->msg_list)) {
>> +			msg = list_top(&sbe->msg_list, struct sbe_msg, link);
>> +			sbe_msg_complete(sbe, msg);
>> +		}
>> +		sbe_poke_queue(sbe);
>> +		unlock(&sbe->lock);
>> +	}
>> +}
>
> Lots of lock/unlock happening here. It might be worth to have
> sbe_msg_complete be called sbe_msg_complete_unlock and have it return
> with the lock dropped, and same with sbe_handle_response(). Another
> option is to organize things a bit differently here. You have
> sbe_handle_response() not complete directly, but return the completed
> message. Then you can do all the other stuff you have to do, and
> finally call complete() with the lock dropped at the end of the
> function after you're re-poked the queue.

Yes, too many lock/unlock is happening in this function. Let me try to reognaize 
this function.

>
> Also your current code is racy vs. cancel. You really need to look into
> that a bit more.
>

As mentioned above, I'm cancelling only queued messages (messages that are not 
yet sent to SBE).
I don't see any race here. Did I miss anything here?

Thanks
-Vasant