[EXTERNAL] [PATCH] target/ppc: Fix mtmsr(d) L=1 variant that loses interrupts
Cédric Le Goater
clg at kaod.org
Wed Apr 15 16:49:58 AEST 2020
On 4/14/20 1:11 PM, Nicholas Piggin wrote:
> If mtmsr L=1 sets MSR[EE] while there is a maskable exception pending,
> it does not cause an interrupt. This causes the test case to hang:
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.gnu.org_archive_html_qemu-2Dppc_2019-2D10_msg00826.html&d=DwIDAg&c=jf_iaSHvJObTbx-siA1ZOg&r=XHJsZhhuWSw9713Fp0ciew&m=TQfi_v-8XYgz7MiMDAZ_CjkyalSh9-EXhQ3oDesUm74&s=pFoesXbioVBh5wCuzEnzwgfze6X7e-a9unkfUgsRwiw&e=
>
> More recently, Linux reduced the occurance of operations (e.g., rfi)
> which stop translation and allow pending interrupts to be processed.
> This started causing hangs in Linux boot in long-running kernel tests,
> running with '-d int' shows the decrementer stops firing despite DEC
> wrapping and MSR[EE]=1.
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.ozlabs.org_pipermail_linuxppc-2Ddev_2020-2DApril_208301.html&d=DwIDAg&c=jf_iaSHvJObTbx-siA1ZOg&r=XHJsZhhuWSw9713Fp0ciew&m=TQfi_v-8XYgz7MiMDAZ_CjkyalSh9-EXhQ3oDesUm74&s=EhkRfxvQMomvneYweWDEIUktCkKykgIqEmdhA0PtiwU&e=
>
> The cause is the broken mtmsr L=1 behaviour, which is contrary to the
> architecture. From Power ISA v3.0B, p.977, Move To Machine State Register,
> Programming Note states:
>
> If MSR[EE]=0 and an External, Decrementer, or Performance Monitor
> exception is pending, executing an mtmsrd instruction that sets
> MSR[EE] to 1 will cause the interrupt to occur before the next
> instruction is executed, if no higher priority exception exists
>
> Fix this by handling L=1 exactly the same way as L=0, modulo the MSR
> bits altered.
>
> The confusion arises from L=0 being "context synchronizing" whereas L=1
> is "execution synchronizing", which is a weaker semantic. However this
> is not a relaxation of the requirement that these exceptions cause
> interrupts when MSR[EE]=1 (e.g., when mtmsr executes to completion as
> TCG is doing here), rather it specifies how a pipelined processor can
> have multiple instructions in flight where one may influence how another
> behaves.
I was expecting more changes but this looks fine.
Reviewed-by: Cédric Le Goater <clg at kaod.org>
> Cc: qemu-stable at nongnu.org
> Reported-by: Anton Blanchard <anton at ozlabs.org>
> Reported-by: Nathan Chancellor <natechancellor at gmail.com>
> Tested-by: Nathan Chancellor <natechancellor at gmail.com>
> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
I gave it a try on PowerNV, pseries and mac99. All good.
Tested-by: Cédric Le Goater <clg at kaod.org>
I don't know how we could include tests in QEMU such as the one Anton
sent. These are good exercisers for our exception model.
Thanks,
C.
> ---
> Thanks very much to Nathan for reporting and testing it, I added his
> Tested-by tag despite a more polished patch, as the the basics are
> still the same (and still fixes his test case here).
>
> This bug possibly goes back to early v2.04 / mtmsrd L=1 support around
> 2007, and the code has been changed several times since then so may
> require some backporting.
>
> 32-bit / mtmsr untested at the moment, I don't have an environment
> handy.
>
>
> target/ppc/translate.c | 46 +++++++++++++++++++++++++-----------------
> 1 file changed, 27 insertions(+), 19 deletions(-)
>
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index b207fb5386..9959259dba 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -4361,30 +4361,34 @@ static void gen_mtmsrd(DisasContext *ctx)
> CHK_SV;
>
> #if !defined(CONFIG_USER_ONLY)
> + if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
> + gen_io_start();
> + }
> if (ctx->opcode & 0x00010000) {
> - /* Special form that does not need any synchronisation */
> + /* L=1 form only updates EE and RI */
> TCGv t0 = tcg_temp_new();
> + TCGv t1 = tcg_temp_new();
> tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)],
> (1 << MSR_RI) | (1 << MSR_EE));
> - tcg_gen_andi_tl(cpu_msr, cpu_msr,
> + tcg_gen_andi_tl(t1, cpu_msr,
> ~(target_ulong)((1 << MSR_RI) | (1 << MSR_EE)));
> - tcg_gen_or_tl(cpu_msr, cpu_msr, t0);
> + tcg_gen_or_tl(t1, t1, t0);
> +
> + gen_helper_store_msr(cpu_env, t1);
> tcg_temp_free(t0);
> + tcg_temp_free(t1);
> +
> } else {
> /*
> * XXX: we need to update nip before the store if we enter
> * power saving mode, we will exit the loop directly from
> * ppc_store_msr
> */
> - if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
> - gen_io_start();
> - }
> gen_update_nip(ctx, ctx->base.pc_next);
> gen_helper_store_msr(cpu_env, cpu_gpr[rS(ctx->opcode)]);
> - /* Must stop the translation as machine state (may have) changed */
> - /* Note that mtmsr is not always defined as context-synchronizing */
> - gen_stop_exception(ctx);
> }
> + /* Must stop the translation as machine state (may have) changed */
> + gen_stop_exception(ctx);
> #endif /* !defined(CONFIG_USER_ONLY) */
> }
> #endif /* defined(TARGET_PPC64) */
> @@ -4394,15 +4398,23 @@ static void gen_mtmsr(DisasContext *ctx)
> CHK_SV;
>
> #if !defined(CONFIG_USER_ONLY)
> - if (ctx->opcode & 0x00010000) {
> - /* Special form that does not need any synchronisation */
> + if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
> + gen_io_start();
> + }
> + if (ctx->opcode & 0x00010000) {
> + /* L=1 form only updates EE and RI */
> TCGv t0 = tcg_temp_new();
> + TCGv t1 = tcg_temp_new();
> tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)],
> (1 << MSR_RI) | (1 << MSR_EE));
> - tcg_gen_andi_tl(cpu_msr, cpu_msr,
> + tcg_gen_andi_tl(t1, cpu_msr,
> ~(target_ulong)((1 << MSR_RI) | (1 << MSR_EE)));
> - tcg_gen_or_tl(cpu_msr, cpu_msr, t0);
> + tcg_gen_or_tl(t1, t1, t0);
> +
> + gen_helper_store_msr(cpu_env, t1);
> tcg_temp_free(t0);
> + tcg_temp_free(t1);
> +
> } else {
> TCGv msr = tcg_temp_new();
>
> @@ -4411,9 +4423,6 @@ static void gen_mtmsr(DisasContext *ctx)
> * power saving mode, we will exit the loop directly from
> * ppc_store_msr
> */
> - if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
> - gen_io_start();
> - }
> gen_update_nip(ctx, ctx->base.pc_next);
> #if defined(TARGET_PPC64)
> tcg_gen_deposit_tl(msr, cpu_msr, cpu_gpr[rS(ctx->opcode)], 0, 32);
> @@ -4422,10 +4431,9 @@ static void gen_mtmsr(DisasContext *ctx)
> #endif
> gen_helper_store_msr(cpu_env, msr);
> tcg_temp_free(msr);
> - /* Must stop the translation as machine state (may have) changed */
> - /* Note that mtmsr is not always defined as context-synchronizing */
> - gen_stop_exception(ctx);
> }
> + /* Must stop the translation as machine state (may have) changed */
> + gen_stop_exception(ctx);
> #endif
> }
>
>
More information about the Linuxppc-dev
mailing list