[PATCH net v3] ibmvnic fix NULL tx_pools and rx_tools issue at do_reset

Mingming Cao mmc at linux.vnet.ibm.com
Wed Aug 26 11:14:25 AEST 2020



> On Aug 25, 2020, at 5:31 PM, David Miller <davem at davemloft.net> wrote:
> 
> From: Dany Madden <drt at linux.ibm.com>
> Date: Tue, 25 Aug 2020 13:26:41 -0400
> 
>> From: Mingming Cao <mmc at linux.vnet.ibm.com>
>> 
>> At the time of do_rest, ibmvnic tries to re-initalize the tx_pools
>> and rx_pools to avoid re-allocating the long term buffer. However
>> there is a window inside do_reset that the tx_pools and
>> rx_pools were freed before re-initialized making it possible to deference
>> null pointers.
>> 
>> This patch fix this issue by always check the tx_pool
>> and rx_pool are not NULL after ibmvnic_login. If so, re-allocating
>> the pools. This will avoid getting into calling reset_tx/rx_pools with
>> NULL adapter tx_pools/rx_pools pointer. Also add null pointer check in
>> reset_tx_pools and reset_rx_pools to safe handle NULL pointer case.
>> 
>> Signed-off-by: Mingming Cao <mmc at linux.vnet.ibm.com>
>> Signed-off-by: Dany Madden <drt at linux.ibm.com>
> 
> Applied, but:
> 
>> +	if (!adapter->rx_pool)
>> +		return -1;
>> +
> 
> This driver has poor error code usage, it's a random mix of hypervisor
> error codes, normal error codes like -EINVAL, and internal error codes.
> Sometimes used all in the same function.
> 

Agree need to improve. For this patch/fix,  -1 is  chosen to follow other part of the driver that check NULL pointer and return -1 . We should  go through all of -1 cases and replace with normal proper error code. That should be a seperate patch. 

> For example:
> 
> static int ibmvnic_send_crq(struct ibmvnic_adapter *adapter,
> 			    union ibmvnic_crq *crq)
> ...
> 	if (!adapter->crq.active &&
> 	    crq->generic.first != IBMVNIC_CRQ_INIT_CMD) {
> 		dev_warn(dev, "Invalid request detected while CRQ is inactive, possible device state change during reset\n");
> 		return -EINVAL;
> 	}
> ...
> 	rc = plpar_hcall_norets(H_SEND_CRQ, ua,
> 				cpu_to_be64(u64_crq[0]),
> 				cpu_to_be64(u64_crq[1]));
> 
> 	if (rc) {
> 		if (rc == H_CLOSED) {
> ...
> 	return rc;
> 
> So obviously this function returns a mix of negative erro codes
> and Hypervisor codes such as H_CLOSED.
> 
> And stuff like:
> 
> 	rc = __ibmvnic_open(netdev);
> 	if (rc)
> 		return IBMVNIC_OPEN_FAILED;

Agree. 

Mingming


More information about the Linuxppc-dev mailing list