[Lguest] proposed patch for mmap'ed private block device

Rusty Russell rusty at rustcorp.com.au
Fri May 23 13:20:12 EST 2008


On Friday 23 May 2008 00:16:46 ron minnich wrote:
> On Thu, May 22, 2008 at 5:11 AM, Rusty Russell <rusty at rustcorp.com.au> wrote:
> > OK, patch is nice.  It only works for small disks unfortunately, but it's
> > simpler than using a (sparse) temporary file and a bitmap (as I did in
> > qemu).
>
> do you want a more standard patch format, and where do I send it? I'd
> like to get this in for real as I keep forward porting it :-)

For lguest64, I'd say yes.  For 32-bit it's a little too limited to be 
generically useful I think.

That said, perhaps this is the motivation we need to create an extension API.  
My plan was to try to dlopen ./<optionname>.so if we don't recognise an 
option.  It shouldn't need access to too much stuff, to create new device 
types at least?
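
A minimal sketch of that fallback, to make the idea concrete. The entry
point name "lguest_ext_init", its signature, and the helper names are
assumptions invented for this sketch; nothing like this exists in the
launcher today.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical entry point each extension would export; the name and
 * signature are assumptions for this sketch, not part of lguest. */
typedef int (*ext_init_fn)(const char *arg);

/* Build "./<optionname>.so" from "--<optionname>".  Returns 0 on success,
 * -1 if the option is malformed or the buffer is too small. */
static int ext_path(const char *option, char *buf, size_t len)
{
	int n;

	if (strncmp(option, "--", 2) != 0 || option[2] == '\0')
		return -1;
	/* Refuse names containing '/' so the lookup stays in the current dir. */
	if (strchr(option + 2, '/'))
		return -1;
	n = snprintf(buf, len, "./%s.so", option + 2);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Try to handle an unrecognised option by loading an extension.
 * Returns the extension's result, or -1 if none could be loaded. */
static int try_extension(const char *option, const char *arg)
{
	char path[256];
	void *handle;
	ext_init_fn init;

	if (ext_path(option, path, sizeof(path)) != 0)
		return -1;
	handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
	if (!handle)
		return -1;
	/* POSIX-documented idiom for turning dlsym()'s void * into a
	 * function pointer. */
	*(void **)&init = dlsym(handle, "lguest_ext_init");
	if (!init) {
		dlclose(handle);
		return -1;
	}
	/* The handle stays open: the extension's code must outlive this call. */
	return init(arg);
}
```

The option parser would call try_extension() only after its own table
of known options fails to match, and report the usual usage error if it
returns -1.  Older glibc needs -ldl at link time.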

> On 2.6.23 I could reliably start and run 100 or more guests, which
> allowed me to prototype cluster code on my laptop. One part of this
> picture *seems* to be that different tapx devices end up with the same
> mac, but that didn't happen last night with 50 guests, so that can't
> be all of it.
>
> The oops was (apparently) in a switch_to, and happened after all
> guests had booted and I pinged one of them, so I feel network is in
> this picture somehow.

Once I've finished patch shuffling, I'll see if I can reproduce it here.

I'll add it above "Reproduce Ron's gdb crash".

> BTW I'm intrigued by your idea of mapping disk directly into guest.
> I'd like to preserve the copy-on-write semantics of the mmap block
> device. I wonder if we could do the following:
> 1. set up special E820 "copy on write" map entry
> 2. kernel boots, maps that e820 segment write-protected
> 3. write faults on that segment result in copy-on-write behavior for
> the page in question

I don't know enough about Linux mm/vfs innards to know if it could use those 
pages directly in the page cache, or would end up making copies (I know we 
have some execute-in-place support though).  Perhaps it could be used as a 
giant ramdisk.

The fact that it's COW should be transparent to the guest, no?
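
For reference, the host-side behaviour under discussion can be sketched
with a plain private mapping.  This is an illustration of MAP_PRIVATE
semantics only, not the code from Ron's patch, and map_private_disk() is
a name made up for the sketch.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map a disk image copy-on-write.  Returns the mapping, or MAP_FAILED. */
static void *map_private_disk(const char *path, size_t len)
{
	int fd = open(path, O_RDONLY);
	void *p;

	if (fd < 0)
		return MAP_FAILED;
	/* MAP_PRIVATE: the first write to a page gives this process its own
	 * copy of that page; the underlying file is never modified. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	close(fd);	/* The mapping keeps its own reference to the file. */
	return p;
}
```

Whoever reads and writes through the returned pointer just sees ordinary
memory: writes stick for the lifetime of the mapping, and the image on
disk stays pristine for the next guest.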

Cheers,
Rusty.


