[PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity()

Nick Child nnac123 at linux.ibm.com
Thu Jan 9 01:08:51 AEDT 2025


On Tue, Jan 07, 2025 at 03:04:40PM -0800, Yury Norov wrote:
> On Tue, Jan 07, 2025 at 02:43:01PM -0800, Yury Norov wrote:
> > On Tue, Jan 07, 2025 at 04:37:17PM -0600, Nick Child wrote:
> > > On Sat, Dec 28, 2024 at 10:49:35AM -0800, Yury Norov wrote:
> > > > A loop based on cpumask_next_wrap() opencodes the dedicated macro
> > > > for_each_online_cpu_wrap(). Using the macro allows to avoid setting
> > > > bits affinity mask more than once when stride >= num_online_cpus.
> > > > 
> > > > This also helps to drop cpumask handling code in the caller function.
> > > > 
> > > > Signed-off-by: Yury Norov <yury.norov at gmail.com>
> > > > ---
> > > >  drivers/net/ethernet/ibm/ibmvnic.c | 17 ++++++++++-------
> > > >  1 file changed, 10 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> > > > index e95ae0d39948..4cfd90fb206b 100644
> > > > --- a/drivers/net/ethernet/ibm/ibmvnic.c
> > > > +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> > > > @@ -234,11 +234,16 @@ static int ibmvnic_set_queue_affinity(struct ibmvnic_sub_crq_queue *queue,
> > > >  		(*stragglers)--;
> > > >  	}
> > > >  	/* atomic write is safer than writing bit by bit directly */
> > > > -	for (i = 0; i < stride; i++) {
> > > > -		cpumask_set_cpu(*cpu, mask);
> > > > -		*cpu = cpumask_next_wrap(*cpu, cpu_online_mask,
> > > > -					 nr_cpu_ids, false);
> > > > +	for_each_online_cpu_wrap(i, *cpu) {
> > > > +		if (!stride--)
> > > > +			break;
> > > > +		cpumask_set_cpu(i, mask);
> > > >  	}
> > > > +
> > > > +	/* For the next queue we start from the first unused CPU in this queue */
> > > > +	if (i < nr_cpu_ids)
> > > > +		*cpu = i + 1;
> > > > +
> > > This should read '*cpu = i'. Since the loop breaks after incrementing i.
> > > Thanks!
> > 
> > cpumask_next_wrap() makes '+ 1' for you. The for_each_cpu_wrap() starts
> > exactly where you point. So, this '+1' needs to be explicit now.
> > 
> > Does that make sense?
> 
> Ah, I think I see what you mean. It should be like this, right?
> 
>   for_each_online_cpu_wrap(i, *cpu) {
>   	if (!stride--) {
>         	*cpu = i + 1;
>   		break;
>         }
>   	cpumask_set_cpu(i, mask);
>   }
Not quite: for_each_online_cpu_wrap() advances i to the next online cpu
before entering the body of the loop. When we break (because stride is
zero), we exit the loop before i is added to any mask, so i is already
the next unassigned online cpu.
I tested this to make sure; with the patch as is we see unused cpus (e.g. #7, #23):
  IRQ : 256 -> ibmvnic-30000003-tx0
	/proc/irq/256/smp_affinity_list:0-6
  IRQ : 257 -> ibmvnic-30000003-tx1
	/proc/irq/257/smp_affinity_list:16-22
  IRQ : 258 -> ibmvnic-30000003-rx0
	/proc/irq/258/smp_affinity_list:8-14
  IRQ : 259 -> ibmvnic-30000003-rx1
	/proc/irq/259/smp_affinity_list:24-30



More information about the Linuxppc-dev mailing list