[Lguest] lguest: mapping switcher would thwack fixmap

Paul Bolle pebolle at tiscali.nl
Mon May 6 21:11:11 EST 2013


On Mon, 2013-05-06 at 12:46 +0930, Rusty Russell wrote:
> Paul Bolle <pebolle at tiscali.nl> writes:
> > Please note that I've jumped to v3.9 in the meantime. It still triggers
> > this triple fault. But note that CONFIG_MICROCODE_INTEL_EARLY=y causes
> > the unhandled trap 13 error, so this .config will only work (ie, triple
> > fault your machine), on AMD hardware. I hope to send a separate message
> > for this last issue shortly. 
> 
> Hmm, neither for me.  When I remove CONFIG_MICROCODE, it just works
> (well, it panics unable to mount root, but that's due to lack of
> non-modular block device).
> 
> I'm running under KVM on Intel, using my latest kernel.

Well, after doing desperate things, like studying the code, I've finally
chosen a more structured approach: I've littered the lguest driver with
printk's! And that helped me to pinpoint the problem here.

See, basically the last thing I could see was a call to guest_set_pgd()
with idx=1023. So, apparently the guest tells us the page table entry
for the upper 4MB of the virtual address space has changed. But that is
where the Switcher hangs out! And, somehow, the call to
allocate_switcher_mapping() doesn't put the switcher's single page back
into the guest's page tables.
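
To spell out why idx=1023 means "the upper 4MB", here is a minimal
sketch of the index arithmetic. It assumes 32-bit, non-PAE paging (1024
top-level entries of 4MB each). The macro and function names are made up
for illustration and are not the ones the lguest driver uses.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 32-bit, non-PAE paging, so each top-level entry maps 4MB. */
#define PGDIR_SHIFT_4M  22u
#define PTRS_PER_PGD_4M 1024u

/* Index of the top-level entry that covers a virtual address. */
static inline uint32_t pgd_index_of(uint32_t vaddr)
{
    return vaddr >> PGDIR_SHIFT_4M;
}

/* First virtual address covered by a given top-level entry. */
static inline uint32_t pgd_base_of(uint32_t idx)
{
    assert(idx < PTRS_PER_PGD_4M);
    return idx << PGDIR_SHIFT_4M;
}
```

Under that assumption, entry 1023 covers 0xFFC00000 up to the very top
of the address space.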

The very next call to run_guest_once() triggers the Triple Fault. We
call lguest_entry() with a reference to a page table without, say, an
entry for the Switcher. So when switch_to_guest() does "movl %ebx, %cr3"
we're stuck. The next instruction (popl %eax) fails because EIP points
somewhere into those last 4MB of virtual address space, which aren't
part of our current page table any more. Hence Page Fault, and since the
Double Fault entry of the IDT resides there too, we finally end up with
a Triple Fault.

Does that make sense?

Anyhow, adding a line to make sure the Switcher is placed at a 4M
boundary does the trick:
    switcher_addr = (switcher_addr / 0x400000ul) * 0x400000ul;

(That line was copied by hand.)
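
For reference, a standalone sketch of what that line computes: it
rounds an address down to the nearest 4MB boundary. The helper name and
the sample address in the note below are hypothetical, chosen only to
show the arithmetic; this is not the actual patch.

```c
#include <assert.h>

/* 4MB, the same constant as in the hand-copied line above. */
#define SWITCHER_ALIGN 0x400000ul

/* Round addr down to a 4MB boundary (hypothetical helper name). */
static inline unsigned long align_down_4m(unsigned long addr)
{
    unsigned long down = (addr / SWITCHER_ALIGN) * SWITCHER_ALIGN;

    /* Since 4MB is a power of two, masking gives the same result. */
    assert(down == (addr & ~(SWITCHER_ALIGN - 1)));
    return down;
}
```

For example, align_down_4m(0xFFFFE000ul) gives 0xFFC00000, and an
address that is already on a 4MB boundary comes back unchanged.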

And now I'm able to actually run a guest in qemu (that is, end up in a
functional dracut emergency shell). That is well past the point where
the guest used to cause a Triple Fault.

(The Intel early microcode stuff prevents me from running lguest on real
hardware. I haven't yet recompiled my kernel. I'm glad to do so if you
want additional testing.)


Paul Bolle


