[Lguest] lguest: mapping switcher would thwack fixmap

Tue Apr 30 09:52:06 EST 2013

Paul Bolle <pebolle at tiscali.nl> writes:
> On Wed, 2013-04-17 at 15:01 +0200, Paul Bolle wrote:
>> On Wed, 2013-04-17 at 12:15 +0200, Paul Bolle wrote:
>> But the machine once again reboots (triple faults?).
>
> 0) I've looked at this using qemu (with a minimal VM running the same
> (patched) kernel). That VM also reboots after launching the lguest tool.
>
> 1) In that VM's kernel the fixmap is found at 0xffd35000 and the
> (patched) switcher is loaded at 0xffd31000. That's to be expected: one
> page for the switcher code, two pages for one CPU and one, mysterious,
> VMA guard page makes 0x4000.
>
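The layout arithmetic in point 1 can be checked with a short sketch. The two addresses and the page counts are taken from the report; the 4 KiB page size is the usual 32-bit x86 value, and the variable names are only illustrative.

```python
# Sanity check of the switcher placement described in point 1.
PAGE_SIZE = 0x1000            # 4 KiB pages (32-bit x86 without large pages)

fixmap_start = 0xffd35000     # reported start of the fixmap
switcher_addr = 0xffd31000    # reported load address of the switcher

# Page counts as described in the report (names are hypothetical).
switcher_code_pages = 1       # the switcher text itself
per_cpu_pages = 2             # two pages per CPU
nr_cpus = 1                   # the VM had a single CPU
guard_pages = 1               # the "mysterious" VMA guard page

total_pages = switcher_code_pages + per_cpu_pages * nr_cpus + guard_pages
total_bytes = total_pages * PAGE_SIZE

print(hex(total_bytes))                  # 0x4000
print(hex(fixmap_start - total_bytes))   # 0xffd31000
```

So the switcher area ends exactly where the fixmap begins, which is consistent with the numbers quoted.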
> 2) qemu's debug log shows:
>
> Triple fault
> CPU Reset (CPU 0)
> EAX=ffd32000 EBX=1c139000 ECX=00000001 EDX=dfbec000
> ESI=ffd32000 EDI=bfb182cc EBP=dd903f2c ESP=ffd32fb8
> EIP=ffd3103d EFL=00000082 [--S----] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =007b 00000000 ffffffff 00cff300 DPL=3 DS   [-WA]
> CS =0050 00000000 ffffffff 00cf9b00 DPL=0 CS32 [-RA]
> SS =0058 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
> DS =007b 00000000 ffffffff 00cff300 DPL=3 DS   [-WA]
> FS =00d8 1ef3c000 ffffffff 008f9300 DPL=0 DS16 [-WA]
> GS =00e0 dfbf3500 00000018 00409100 DPL=0 DS   [--A]
> LDT=0000 00000000 00000000 00008200 DPL=0 LDT
> TR =0080 ffd33020 00000067 00008900 DPL=0 TSS32-avl
> GDT=     ffd33888 000000ff
> IDT=     ffd33088 000007ff
> CR0=8005003b CR2=ffd330c8 CR3=1c139000 CR4=00000610
> DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
> DR6=00000000 DR7=00000000
> CCS=00000fb8 CCD=00000089 CCO=LOGICB  
> EFER=0000000000000000
> FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
> FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
> FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
> FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
> FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
> XMM00=00000000000000000000000000000000 XMM01=0000ff00000000000000000000ff0000
> XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
> XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
> XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
>
> 3) The internet taught me that at a triple fault CR2 holds the faulting
> address (i.e., 0xffd330c8). In this case that is the double fault entry in
> the IDT. See, lguest puts the IDT at 0xffd33088, and 0xffd330c8 is 64
> bytes, or 8 entries, above that, and double fault is vector 8, the ninth
> entry of the IDT. (This all matches neatly with what lguest is doing with the two
> pages it puts in the switcher area per cpu. I'll omit the details. I did
> triple check this.)
>
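The IDT offset calculation in point 3 can be reproduced as follows. This is a minimal sketch: the base, limit and CR2 values come from the register dump, and the 8-byte gate size is the standard 32-bit protected-mode descriptor size.

```python
# Which IDT entry does the faulting address in CR2 fall into?
IDT_BASE = 0xffd33088    # IDT base from the register dump
IDT_LIMIT = 0x7ff        # IDT limit from the register dump
CR2 = 0xffd330c8         # faulting linear address from the dump
GATE_SIZE = 8            # bytes per gate descriptor in 32-bit mode

offset = CR2 - IDT_BASE
vector, remainder = divmod(offset, GATE_SIZE)
nr_entries = (IDT_LIMIT + 1) // GATE_SIZE

print(offset)       # 64
print(vector)       # 8, the double fault (#DF) vector
print(remainder)    # 0, so CR2 points at the start of that gate
print(nr_entries)   # 256
```

Vector 8 is the double fault exception, so the last page fault was taken while the CPU was reading the double fault gate itself.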
> 4) The faulting instruction is at 0xffd3103d. That is in the actual
> switcher code, only 61 bytes deep! Peeking into
> drivers/lguest/x86/switcher_32.o that is just _after_ this instruction:
>         // Once our page table's switched, the Guest is live!
>         // The Host fades as we run this final step.
>         // Our "struct lguest_pages" is now read-only.
>         movl    %ebx, %cr3
>
> That matches what we find in EBX and CR3 above: both contain 0x1c139000.
> That value could be legit, as the VM was running with 512M of memory.
>
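The numbers in point 4 can be checked the same way. The sketch below uses only values from the dump and the report. The assumption that guest RAM is a single contiguous range starting at physical address 0 is a simplification made for this check.

```python
# Where is EIP inside the switcher, and is CR3 a plausible page directory?
PAGE_SIZE = 0x1000
SWITCHER_ADDR = 0xffd31000   # switcher load address from the report
EIP = 0xffd3103d             # faulting instruction pointer from the dump
ESP = 0xffd32fb8             # stack pointer from the dump
CR3 = 0x1c139000             # page directory base from the dump
RAM_BYTES = 512 * 1024 * 1024  # VM memory size (assumed contiguous from 0)

eip_offset = EIP - SWITCHER_ADDR
cr3_aligned = CR3 % PAGE_SIZE == 0
cr3_in_ram = CR3 < RAM_BYTES

# The first per-CPU page follows the single page of switcher code.
first_cpu_page = SWITCHER_ADDR + PAGE_SIZE
esp_in_first_cpu_page = first_cpu_page <= ESP < first_cpu_page + PAGE_SIZE

print(eip_offset)              # 61
print(cr3_aligned, cr3_in_ram) # True True
print(esp_in_first_cpu_page)   # True
```

A page-aligned CR3 below the end of RAM only shows the value is not obviously bogus. It says nothing about whether the page tables it points to actually map the switcher.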
> 5) Any ideas what this all exactly means? Is something going awry when
> we put the page tables at (physical) 0x1c139000 or is that perhaps a
> bogus value? How can I dig deeper into the cause of this triple fault?

Yes, it probably means that the page table is complete crap: we fault
fetching the next instruction, and we fail to access the IDT to deliver
the page fault.

eg. we screwed up the Switcher mapping in the page table.

Can I have your .config so I can try to reproduce here?

Thanks,
Rusty.