[PATCH/RFC] ps3/block: Add ps3vram-ng driver for accessing video RAM as block device
Geert Uytterhoeven
Geert.Uytterhoeven at sonycom.com
Mon Mar 9 21:52:12 EST 2009
On Mon, 9 Mar 2009, Jens Axboe wrote:
> On Mon, Mar 09 2009, Jens Axboe wrote:
> > On Mon, Mar 09 2009, Geert Uytterhoeven wrote:
> > > On Fri, 6 Mar 2009, Jens Axboe wrote:
> > > > On Fri, Mar 06 2009, Geert Uytterhoeven wrote:
> > > > > On Fri, 6 Mar 2009, Jens Axboe wrote:
> > > > > > On Fri, Mar 06 2009, Geert Uytterhoeven wrote:
> > > > > > > On Fri, 6 Mar 2009, Jens Axboe wrote:
> > > > > > > > On Thu, Mar 05 2009, Geert Uytterhoeven wrote:
> > > > > > > > > But then I noticed ps3vram_make_request() may be called concurrently,
> > > > > > > > > so I had to add a mutex to avoid data corruption. This slows the
> > > > > > > > > driver down, and in the end, the version with a thread turns out to be
> > > > > > > > > ca. 1% faster. The version without a thread is about 50 lines less
> > > > > > > > > code, though.
> > > > > > > >
> > > > > > > > That is correct, ->make_request_fn may get reentered. I'm not surprised
> > > > > > > > that performance dropped if you just shoved everything under a mutex.
> > > > > > > > You could be a little more smart and queue concurrent bio's for
> > > > > > > > processing when the current one is complete though, there are several
> > > > > > > > approaches there that be a lot faster than going all the way through the
> > > > > > > > IO stack and scheduler just to avoid concurrency.
> > > > > > >
> > > > > > > Yes, using a spinlock and queueing requests on a list if the driver is
> > > > > > > busy can be done after 2.6.29...
> > > > > >
> > > > > > Certainly. Even just replacing your current mutex with a spinlock during
> > > > > > the memcpy() would surely be a lot faster. Or even just grabbing the
> > > > > > mutex before calling into the write for the duration of the bio. The way
> > > > > > you do it is certain context switch death :-)
> > > > >
> > > > > It's not just the memcpy(). ps3vram_{up,down}load() call msleep(), so
> > > > > I cannot use a spinlock.
> > > >
> > > > Ah right, I hadn't looked close enough. But putting the mutex_lock()
> > > > outside of the bio_for_each_segment() is going to be much faster than
> > > > getting/releasing it for each segment.
> > >
> > > It doesn't seem to make any measurable difference, so I'm gonna leave it for
> > > now.
> >
> > It will depend on where the bio's are coming from. If they are all
> > single segment, then there will be no difference. If they contain
> > multiple segments, you reduce the lock/release by that amount.
> >
> > But yeah, just leave it as-is for now. You can send a final patch for
> > inclusion though. Unless I'm mistaken, I only saw the original and then
> > an incremental patch for changing it to ->make_request_fn?
>
> There was a full version, my mistake. I got confused by the removal of
Indeed.
> the old driver in another directory :-)
Can you please ack it? Thx!
With kind regards,
Geert Uytterhoeven
Software Architect
Sony Techsoft Centre Europe
The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium
Phone: +32 (0)2 700 8453
Fax: +32 (0)2 700 8622
E-mail: Geert.Uytterhoeven at sonycom.com
Internet: http://www.sony-europe.com/
A division of Sony Europe (Belgium) N.V.
VAT BE 0413.825.160 · RPR Brussels
Fortis · BIC GEBABEBB · IBAN BE41293037680010
More information about the Linuxppc-dev
mailing list