[SLOF] [PATCH] pci-phb: Fix stack underflow in phb-pci-walk-bridge
Thomas Huth
thuth at redhat.com
Wed Oct 12 18:42:42 AEDT 2016
On 12.10.2016 02:24, Alexey Kardashevskiy wrote:
> On 12/10/16 02:31, Thomas Huth wrote:
>> On 10.10.2016 05:32, Alexey Kardashevskiy wrote:
>>> On 23/09/16 04:02, Thomas Huth wrote:
>>>> The sequence "my-space pci-htype@ pci-out" in phb-pci-walk-bridge
>>>> is bugged: pci-htype@ already consumes the my-space item from the
>>>> stack, only leaving one item for pci-out. But pci-out needs two
>>>> input items on the stack, the PCI address and a character item.
>>>> So this rather should be "my-space dup pci-htype@ pci-out" instead.
>>>> However, using the output of pci-htype@ as input character for
>>>> pci-out also does not make much sense, since this is likely an
>>>> unprintable character. So let's simply use a question mark here
>>>> instead to indicate that we did not recognize the type of the
>>>> PCI device.
>>>>
>>>> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1377083
>>>> Signed-off-by: Thomas Huth <thuth at redhat.com>
>>>
>>> Thanks, applied.
>>>
>>> I got a question though - why did you add buglink here? It says that adding
>>> 2 bridges crashes SLOF.
>>>
>>> 1) it is not clear at all why PCI header type is not 0 or 1
>>> 2) how this patch fixes it
>>> 3) how "pci: Fix secondary and subordinate PCI bus enumeration with
>>> board-qemu" fixes it if this patch does not fix it
>>
>> Have a look at https://bugzilla.redhat.com/show_bug.cgi?id=1377083#c5 :
>> There are actually two problems discovered by the bug. First, SLOF
>> crashes when using multiple pci-bridges. This patch here just fixes the
>> crash, but SLOF is then still unable to the PCI devices. That second
>> problem is fixed by my other patch ("pci: Fix secondary and subordinate
>> PCI bus enumeration with board-qemu").
>
> But the crash happens when PCI header type is not 0 or 1 - how is this even
> possible, what is that "52 4498 (�) : ffff ffff"? I fail to see how it can
> be fixed by just by changing ascending vs. descending scan order
Ah, sorry, I missed that this context was missing ;-) The problem is
that SLOF wrote wrong values in the PCI bridge's secondary and
subordinate bus number registers, so access to the devices that are
attached to such a bridge is not working at all. That means read
accesses to the config space of such device result in a 0xff value -
i.e. the PCI header type was reported as 0xff and SLOF then crashed.
Thomas
More information about the SLOF
mailing list