[PATCH v12 00/31] Speculative page faults

Joel Fernandes joel at joelfernandes.org
Tue Dec 15 05:10:20 AEDT 2020


On Mon, Dec 14, 2020 at 10:36:29AM +0100, Laurent Dufour wrote:
> Le 14/12/2020 à 03:03, Joel Fernandes a écrit :
> > On Tue, Jul 07, 2020 at 01:31:37PM +0800, Chinwen Chang wrote:
> > [..]
> > > > > Hi Laurent,
> > > > > 
> > > > > We merged SPF v11 and some patches from v12 into our platforms. After
> > > > > several experiments, we observed SPF has obvious improvements on the
> > > > > launch time of applications, especially for those high-TLP ones,
> > > > > 
> > > > > # launch time of applications(s):
> > > > > 
> > > > > package           version      w/ SPF      w/o SPF      improve(%)
> > > > > ------------------------------------------------------------------
> > > > > Baidu maps        10.13.3      0.887       0.98         9.49
> > > > > Taobao            8.4.0.35     1.227       1.293        5.10
> > > > > Meituan           9.12.401     1.107       1.543        28.26
> > > > > WeChat            7.0.3        2.353       2.68         12.20
> > > > > Honor of Kings    1.43.1.6     6.63        6.713        1.24
> > > > 
> > > > That's great news, thanks for reporting this!
> > > > 
> > > > > 
> > > > > By the way, we have verified our platforms with those patches and
> > > > > achieved the goal of mass production.
> > > > 
> > > > Another good news!
> > > > For my information, what is your targeted hardware?
> > > > 
> > > > Cheers,
> > > > Laurent.
> > > 
> > > Hi Laurent,
> > > 
> > > Our targeted hardware belongs to ARM64 multi-core series.
> > 
> > Hello!
> > 
> > I was trying to develop an intuition about why does SPF give improvement for
> > you on small CPU systems. This is just a high-level theory but:
> > 
> > 1. Assume the improvement is because of elimination of "blocking" on
> > mmap_sem.
> > Could it be that the mmap_sem is acquired in write-mode unnecessarily in some
> > places, thus causing blocking on mmap_sem in other paths? If so, is it
> > feasible to convert such usages to acquiring them in read-mode?
> 
> That's correct, and the goal of this series is to try not holding the
> mmap_sem in read mode during page fault processing.
> 
> Converting mmap_sem holder from write to read mode is not so easy and that
> work as already been done in some places. If you think there are areas where
> this could be done, you're welcome to send patches fixing that.
> 
> > 2. Assume the improvement is because of lesser read-side contention on
> > mmap_sem.
> > On small CPU systems, I would not expect reducing cache-line bouncing to give
> > such a dramatic improvement in performance as you are seeing.
> 
> I don't think cache line bouncing reduction is the main sourcec of
> performance improvement, I would rather think this is the lower part here.
> I guess this is mainly because during loading time a lot of page fault is
> occuring and thus SPF is reducing the contention on the mmap_sem.

Thanks for the reply. I think I also wrongly assumed that acquiring mmap
rwsem in write mode in a syscall makes SPF moot. Peter explained to me on IRC
that tere's still perf improvement in write mode if an unrelated VMA is
modified while another VMA is faulting.  CMIIW - not an mm expert by any
stretch.

Thanks!

 - Joel



More information about the Linuxppc-dev mailing list