[RFC PATCH v3 07/10] sched/core: Push current task from paravirt CPU

Fri Sep 12 03:06:18 AEST 2025

Hello Shrikanth,

On 9/11/2025 10:22 PM, Shrikanth Hegde wrote:
>>> +    if (is_cpu_paravirt(cpu))
>>> +        push_current_from_paravirt_cpu(rq);
>>
>> Does this mean paravirt CPU is capable of handling an interrupt but may
>> not be continuously available to run a task?
> 
> When I run hackbench, which involves a fair bit of IRQ activity, it moves out.
> 
> For example,
> 
> echo 600-710 > /sys/devices/system/cpu/paravirt
> 
> 11:31:54 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 11:31:57 AM  598    2.04    0.00   77.55    0.00   18.37    0.00    1.02    0.00    0.00    1.02
> 11:31:57 AM  599    1.01    0.00   79.80    0.00   17.17    0.00    1.01    0.00    0.00    1.01
> 11:31:57 AM  600    0.00    0.00    0.00    0.00    0.00    0.00    0.99    0.00    0.00   99.01
> 11:31:57 AM  601    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 11:31:57 AM  602    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 
> 
> There could be some workloads which don't move IRQs out, for which an irqbalance change is needed.
> Looking into it.
> 
>> Or is the VMM expected to set
>> the CPU on the paravirt mask and give the vCPU sufficient time to move the
>> task before yanking it away from the pCPU?
>>
> 
> If the vCPU is running something, it is going to run at some point on a pCPU.
> The hypervisor will give the cycles to this vCPU by preempting some other vCPU.
> 
> The idea is that, using this infra, there should be nothing on that paravirt vCPU.
> That way, collectively, the VMM gets only a limited number of requests for pCPUs, which it can satisfy
> without vCPU preemption.

Ack! Just wanted to understand the usage.

P.S. I remember discussions during the last LPC about communicating
this unavailability via CPU capacity. Was that problematic for some
reason? Sorry if I didn't follow this discussion earlier.

[..snip..]
>>> +    local_irq_save(flags);
>>> +    preempt_disable();
>>
>> Disabling IRQs implies preemption is disabled.
>>
> 
> In most places, stop_one_cpu_nowait() is called with preemption and IRQs disabled.
> The stopper runs at the next possible opportunity.

But is there any reason to do both local_irq_save() and
preempt_disable()? include/linux/preempt.h defines preemptible() as:

    #define preemptible()   (preempt_count() == 0 && !irqs_disabled())

so disabling IRQs should be sufficient, right? Or am I missing something?
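
To make the reasoning concrete, here is a minimal userspace sketch of
that boolean logic. It is not kernel code: struct cpu_model and the
model_*() helpers are hypothetical stand-ins for preempt_count(),
irqs_disabled(), local_irq_save()/local_irq_restore() and
preempt_disable()/preempt_enable(), and exist only to exercise the
expression in the macro above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-CPU state that preemptible() consults. */
struct cpu_model {
	int  preempt_count;	/* stand-in for preempt_count() */
	bool irqs_off;		/* stand-in for irqs_disabled() */
};

/* Same boolean shape as the preemptible() definition quoted above. */
static inline bool model_preemptible(const struct cpu_model *c)
{
	return c->preempt_count == 0 && !c->irqs_off;
}

/* Stand-in for local_irq_save(): remember the old state, mask IRQs. */
static inline void model_irq_save(struct cpu_model *c, bool *flags)
{
	*flags = c->irqs_off;
	c->irqs_off = true;
}

/* Stand-in for local_irq_restore(): put the saved state back. */
static inline void model_irq_restore(struct cpu_model *c, bool flags)
{
	c->irqs_off = flags;
}

/* Stand-in for preempt_disable(): bump the nesting count. */
static inline void model_preempt_disable(struct cpu_model *c)
{
	c->preempt_count++;
}

/* Stand-in for preempt_enable(): drop the nesting count. */
static inline void model_preempt_enable(struct cpu_model *c)
{
	assert(c->preempt_count > 0);
	c->preempt_count--;
}
```

In this model, once model_irq_save() has run, model_preemptible()
returns false regardless of whether model_preempt_disable() is called
afterwards, which is the redundancy I am asking about. The sketch says
nothing about whether the stopper path has some other reason to want
the explicit preempt_disable().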

> 
> stop_one_cpu_nowait
>  ->queues the task into stopper list
>     -> wake_up_process(stopper)
>        -> set need_resched
>          -> stopper runs as early as possible.
>         
-- 
Thanks and Regards,
Prateek