[Skiboot] [PATCH] core/pci: Introduce virtual device and filter cache

Gavin Shan gwshan at linux.vnet.ibm.com
Fri Dec 9 11:04:47 AEDT 2016


On Thu, Dec 08, 2016 at 01:25:42PM +1100, Alistair Popple wrote:
>On Thu, 8 Dec 2016 11:10:39 AM Stewart Smith wrote:
>> Alistair Popple <alistair at popple.id.au> writes:
>> > On Tue, 13 Sep 2016 06:01:14 PM Gavin Shan wrote:
>> >> On Mon, Sep 12, 2016 at 10:59:18AM +1000, Alistair Popple wrote:
>> >> >On Thu, 8 Sep 2016 05:19:32 PM Gavin Shan wrote:
>> >> >> This introduces virtual device and filter cache to speed up the
>> >> >> searching for PCI virtual device and config space register filter
>> >> >> in PCI config access path. With this applied, the original bandwidth
>> >> >> is regained during the NPU1 bandwidth testing.
>> >> >
>> >> >Have you done any profiling that shows the virtual pci filters are the 
>> > source 
>> >> >of the slow down? I am quite confused why this is an issue as once the 
>> > links 
>> >> >are setup nothing should be touching the virtual config spaces (apart from 
>> > a 
>> >> >watchdog thread running once a second in the nvidia device driver).
>> >> >
>> >> 
>> >> Alistair, I didn't do the profiling. As the code change included in
>> >> this patch is just introduce a virtual PCI device and filter cache.
>> >> Both of them are used in PCI config access with help of the virtual
>> >> PCI filters. It proves we have lots of PCI config traffic hitting the
>> >> virtual PCI filters indirectly, right?
>> >
>> > Perhaps it does indirectly, but the bandwidth could also be lower due to 
>> > incorrect implementation of config space (especially as we shouldn't be 
>> > getting any config space traffic). So it would be good to get more direct 
>> > proof that the slow down is due to lots of config space traffic during the 
>> > bandwidth test so that:
>> >
>> > 1) We can be sure it's just a code performance issue and not a config space 
>> > implementation detail.
>> > 2) So that we can go back to nVidia and work out why there are so many config 
>> > space accesses.
>> >
>> > All we really need to do is either the bandwidth test under perf or add some 
>> > printf's to the config space filter.
>> >
>> >> >I am concerned the slow down may be due to some other reason such as link 
>> >> >training not work correctly so would like some confirmation that slow 
>> > config 
>> >> >space access is what is causing this. Also if the slow down is due to 
>> > frequent 
>> >> >config space accesses then we should be looking to reduce the frequency of 
>> >> >accesses. There is probably also plenty of scope to further optimise these 
>> >> >code paths as they were assumed to be on a fairly slow path when written.
>> >> >
>> >> 
>> >> Yeah, I agree the PCI config access is the slow path. I'm not sure why
>> >> we have huge PCI config traffic during the testing. I will rerun the test
>> >> with more code to reveal how much PCI config read/write are issued and
>> >> their distribution when I get a chance. It's not high priority as I think 
>> > :-)
>> >
>> > What I meant is that no effort has been put into optimisation of that code 
>> > path because it shouldn't be on a performance critical path. Given the PCI 
>> > virtual device patches are upstream we should work this out sooner rather than 
>> > later :-)
>> 
>> Did we ever get a resolution on this? I was kind of holding off on
>> merging until some kind of consensus occured that this was the slow path?
>
>I haven't had a chance to look into it myself yet. Gavin have you?
>

No, I didn't get chance looking at it, but why not just merge it?  With it,
the performance is improved. Isn't the result we want? :-)

Thanks,
Gavin



More information about the Skiboot mailing list