[Skiboot] [PATCH] core/pci: Introduce virtual device and filter cache
alistair at popple.id.au
Thu Dec 8 13:25:42 AEDT 2016
On Thu, 8 Dec 2016 11:10:39 AM Stewart Smith wrote:
> Alistair Popple <alistair at popple.id.au> writes:
> > On Tue, 13 Sep 2016 06:01:14 PM Gavin Shan wrote:
> >> On Mon, Sep 12, 2016 at 10:59:18AM +1000, Alistair Popple wrote:
> >> >On Thu, 8 Sep 2016 05:19:32 PM Gavin Shan wrote:
> >> >> This introduces virtual device and filter cache to speed up the
> >> >> searching for PCI virtual device and config space register filter
> >> >> in PCI config access path. With this applied, the original bandwidth
> >> >> is regained during the NPU1 bandwidth testing.
> >> >
> >> >Have you done any profiling that shows the virtual pci filters are the
> > source
> >> >of the slow down? I am quite confused why this is an issue as once the
> > links
> >> >are setup nothing should be touching the virtual config spaces (apart from
> > a
> >> >watchdog thread running once a second in the nvidia device driver).
> >> >
> >> Alistair, I didn't do the profiling. As the code change included in
> >> this patch is just introduce a virtual PCI device and filter cache.
> >> Both of them are used in PCI config access with help of the virtual
> >> PCI filters. It proves we have lots of PCI config traffic hitting the
> >> virtual PCI filters indirectly, right?
> > Perhaps it does indirectly, but the bandwidth could also be lower due to
> > incorrect implementation of config space (especially as we shouldn't be
> > getting any config space traffic). So it would be good to get more direct
> > proof that the slow down is due to lots of config space traffic during the
> > bandwidth test so that:
> > 1) We can be sure it's just a code performance issue and not a config space
> > implementation detail.
> > 2) So that we can go back to nVidia and work out why there are so many config
> > space accesses.
> > All we really need to do is either the bandwidth test under perf or add some
> > printf's to the config space filter.
> >> >I am concerned the slow down may be due to some other reason such as link
> >> >training not work correctly so would like some confirmation that slow
> > config
> >> >space access is what is causing this. Also if the slow down is due to
> > frequent
> >> >config space accesses then we should be looking to reduce the frequency of
> >> >accesses. There is probably also plenty of scope to further optimise these
> >> >code paths as they were assumed to be on a fairly slow path when written.
> >> >
> >> Yeah, I agree the PCI config access is the slow path. I'm not sure why
> >> we have huge PCI config traffic during the testing. I will rerun the test
> >> with more code to reveal how much PCI config read/write are issued and
> >> their distribution when I get a chance. It's not high priority as I think
> > :-)
> > What I meant is that no effort has been put into optimisation of that code
> > path because it shouldn't be on a performance critical path. Given the PCI
> > virtual device patches are upstream we should work this out sooner rather than
> > later :-)
> Did we ever get a resolution on this? I was kind of holding off on
> merging until some kind of consensus occured that this was the slow path?
I haven't had a chance to look into it myself yet. Gavin have you?
More information about the Skiboot