[Lguest] [PATCH 18/25] [PATCH] turn priviled operations into macros in entry.S
Steven Rostedt
rostedt at goodmis.org
Thu Aug 9 03:07:55 EST 2007
--
On Wed, 8 Aug 2007, Andi Kleen wrote:
> On Wed, Aug 08, 2007 at 09:47:05AM -0400, Steven Rostedt wrote:
> >
> > [...]
> > asm volatile ("pushq %2; pushq %%rsp; pushfq; pushq %3; call *%6;"
> > /* The stack we pushed is off by 8, due to the
> > previous pushq */
> > "addq $8, %%rsp"
>
> Weird stack inbalance, i'm surprised that works. %6 must be doing strange
> things.
Heh, no it's very subtle. We do the "push %rsp" after the "push %2". So on
return from the call, the stack that was pushed, is really 8 bytes off.
Because we didn't save the stack pointer from the entry of the asm, we
saved it after doing that "push %2". :-)
We get back to this call via a retq from the HV switcher code. So that rsp
that was pushed will be the stack.
>
> > /* Need to read manual, does rdmsr clear
> > * the top 32 bits of rax? */
> > xor %rax, %rax
> >
> > movl $MSR_GS_BASE,%ecx
> > rdmsr
>
>
> This seems incredibly slow. Since GS changes are controlled in
> the kernel why don't you just cache them in the per cpu area?
> The only case where user can reload it is using segment
> selector changes, which can be also handled.
>
> Also why do you need the guest gs value anyways?
I did a swapgs here, so we have the HV GS value. I don't know of an easy
way to read the GS value to put it into the RSP.
>
> You could just use your own stack and let the guest
> switch to its own.
Well, it is quite complex, since we can't save anything in the writable
section that the guest can't also write to. Remember, we are in ring 1 for
the guest kernel, so we have no protection. All the host related stuff is
in read only sections when we are using the guest cr3.
What I would like to do is have the GS point to the read only section, and
load the RSP from a value there. But first we need to store the old value
of RSP. The problem is, where do you store it. Hmm, I guess I can create
my own "big" offset from the read only section :)
OK this gives me an idea, I *can* point the kernel GS to the vcpu read
only section, and have a way to store the value with the GS offset. The
read only section is always mapped before the read write section, so it
should be easy to calculate the diff!
Thanks for the comment, you guided me to this great idea!
-- Steve
More information about the Lguest
mailing list