[Skiboot] [PATCH v7 18/22] fadump: Add documentation
Vasant Hegde
hegdevasant at linux.vnet.ibm.com
Sat May 18 21:12:43 AEST 2019
On 05/16/2019 11:05 AM, Nicholas Piggin wrote:
> Vasant Hegde's on May 14, 2019 9:23 pm:
>> On 05/09/2019 10:28 AM, Nicholas Piggin wrote:
>>> Vasant Hegde's on April 13, 2019 7:15 pm:
>>>> diff --git a/doc/opal-api/opal-fadump-manage-173.rst b/doc/opal-api/opal-fadump-manage-173.rst
>>>> new file mode 100644
>>>> index 000000000..916167503
>>>> --- /dev/null
>>>> +++ b/doc/opal-api/opal-fadump-manage-173.rst
>>>> @@ -0,0 +1,73 @@
>>>> +.. _opal-api-fadump-manage:
>>>> +
>>>> +OPAL fadump manage call
>>>> +=======================
>>>> +::
>>>> +
>>>> + #define OPAL_FADUMP_MANAGE 173
>>>> +
>>>> +This call is used to manage FADUMP (aka MPIPL) on the OPAL platform.
>>>> +The Linux kernel uses this call to register/unregister FADUMP.
>>>> +
>>>> +Parameters
>>>> +----------
>>>> +::
>>>> +
>>>> + uint64_t command
>>>> + void *data
>>>> + uint64_t dsize
>>>> +
>>>> +``command``
>>>> + The ``command`` parameter supports the following values:
>>>> +
>>>> +::
>>>> +
>>>> + 0x01 - Register for fadump
>>>> + 0x02 - Unregister fadump
>>>> + 0x03 - Invalidate existing fadump
>>>> +
>>>> +``data``
>>>> + ``data`` is valid when ``command`` is 0x01 (registration).
>>>> + We use the fadump structure (see below) to pass the Linux
>>>> + kernel's memory reservation details.
>>>> +
>>>> +::
>>>> +
>>>> +
>>>> + struct fadump_section {
>>>> + u8 source_type;
>>>> + u8 reserved[7];
>>>> + u64 source_addr;
>>>> + u64 source_size;
>>>> + u64 dest_addr;
>>>> + u64 dest_size;
>>>> + } __packed;
>>>> +
>>>> + struct fadump {
>>>> + u16 fadump_section_size;
>>>> + u16 section_count;
>>>> + u32 crashing_cpu;
>>>> + u64 reserved;
>>>> + struct fadump_section section[];
>>>> + };
>>>
>>> This API seems quite complicated. The kernel wants to tell firmware to
>>> preserve some ranges of memory in case of reboot, and to have those
>>> ranges advertised to the reboot kernel.
>>
>> The kernel informs OPAL about the ranges of memory to be preserved
>> during MPIPL (source, destination, size).
>
> Well it also contains crashing_cpu, type, and comes in this clunky
> structure.
crashing_cpu : This information is passed by OPAL to the kernel during the
MPIPL boot, so that the kernel can generate a proper backtrace for the OPAL
dump. It is not needed for registration. This is *OPAL*-generated
information; the kernel won't pass it. (For a kernel-initiated crash, the
kernel keeps track of the crashing CPU's pt_regs data and uses that to
generate the vmcore.)
Type : Identifies the memory content type (OPAL, kernel, etc.). During MPIPL
registration we pass this data to HDAT; hostboot simply copies it back into
the result table inside HDAT. During the MPIPL boot, OPAL passes this
information to the kernel so that the kernel can generate the proper dumps.
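
To make both fields concrete, here is an untested sketch of what the
registering kernel fills in. The FADUMP_SOURCE_TYPE_* values and the helper
name are placeholders I made up for illustration, not part of the patch:

#include <stdint.h>
#include <string.h>

typedef uint8_t  u8;
typedef uint64_t u64;

/* struct fadump_section, as in the quoted patch */
struct fadump_section {
	u8  source_type;
	u8  reserved[7];
	u64 source_addr;
	u64 source_size;
	u64 dest_addr;
	u64 dest_size;
} __attribute__((packed));

/* Placeholder type values for illustration only; the real ones would
 * come from the OPAL API headers. */
#define FADUMP_SOURCE_TYPE_KERNEL	0x01
#define FADUMP_SOURCE_TYPE_OPAL		0x02

/* Registration side: describe one range to preserve across MPIPL.
 * Note there is no crashing-CPU handling here -- that information is
 * produced by OPAL on the MPIPL boot path, not by the registering
 * kernel. */
static void fadump_fill_section(struct fadump_section *sec,
				u64 src, u64 dest, u64 size)
{
	memset(sec, 0, sizeof(*sec));
	sec->source_type = FADUMP_SOURCE_TYPE_KERNEL;
	sec->source_addr = src;
	sec->source_size = size;
	sec->dest_addr   = dest;
	sec->dest_size   = size;
}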
>
>> After reboot, we will get the result range from hostboot. We pass that
>> to the kernel via the device tree.
>>
>>>
>>> Why not just an API which can add a range, and delete a range, and
>>> that's it? Range would just be physical start, end, plus an arbitrary
>>> tag (which caller can use to retrieve metadata that is used to
>>> decipher the dump).
>>
>> We want a one-to-one mapping between source and destination.
>
> Ah yes, sure that too. So two calls, one which adds or removes
> (source, dest, length) entries, and another which sets a tag.
Sorry, I'm still not seeing what we gain from multiple calls here:
- With a structure we can pass all the information in one call, so the
kernel can make a single call for registration.
- It's controlled by a version field (we realized the need for a version
field during review, and I will add it in v8), which makes it easy to
handle compatibility issues.
- It's easy to extend/modify later without breaking the API. If we just
pass source, destination, and length, then any future change forces us to
add a new API.
The only thing I mixed up in the structure is the `crashing_cpu`
information. It is not needed for registration; it is needed during the
MPIPL boot for the OPAL core. Maybe this is what is creating confusion.
Maybe we can remove this field from the structure and put it in the device
tree instead. A rough sketch of the single registration call follows below.
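
Continuing the earlier sketch (same typedefs and struct fadump_section),
something along these lines; the version field placement and the
opal_fadump_manage() prototype are assumptions matching the doc above, not
final:

#include <stdlib.h>

typedef uint16_t u16;
typedef uint32_t u32;

/* OPAL call wrapper; the real prototype lives in the firmware/kernel
 * headers. This one just mirrors the documented parameters. */
extern int64_t opal_fadump_manage(uint64_t command, void *data,
				  uint64_t dsize);

#define OPAL_FADUMP_REGISTER	0x01

/* struct fadump as in the patch; the version field placement below is
 * illustrative only (the field itself is planned for v8). */
struct fadump {
	u16 fadump_section_size;
	u16 section_count;
	u32 crashing_cpu;	/* may move to the device tree instead */
	u16 version;		/* hypothetical placement, planned for v8 */
	u8  reserved[6];
	struct fadump_section section[];
};

/* One call registers every preserved range at once. */
static int64_t fadump_register(struct fadump_section *secs, u16 nr)
{
	size_t size = sizeof(struct fadump) + nr * sizeof(*secs);
	struct fadump *fdm = calloc(1, size);
	int64_t rc;

	if (!fdm)
		return -1;

	fdm->version = 1;
	fdm->fadump_section_size = sizeof(struct fadump_section);
	fdm->section_count = nr;
	memcpy(fdm->section, secs, nr * sizeof(*secs));

	rc = opal_fadump_manage(OPAL_FADUMP_REGISTER, fdm, size);
	free(fdm);
	return rc;
}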
>
>> Also we have
>> to update this information in HDAT so that hostboot can access it.
>
> That's okay though, isn't it? You can return failure if you don't
> have enough room.
Yes, that's fine.
>
>> Also, having a structure allows us to pass all this information nicely to OPAL.
>
> I don't think OPAL needs to know about the kernel crash metadata, and
> it could get its own by looking at addresses and tags that come up.
As explained above, the kernel won't pass metadata to OPAL. The kernel keeps
track of the crashing CPU information and uses it during vmcore generation.
> Although I'm not really convinced it's a good idea to have a
> cooperative system where you have kernel and OPAL both managing crash
> dumps at the same time...
OPAL is not going to manage dumps. During registration it updates HDAT with
the information needed to capture the dump, and during the MPIPL boot it
just passes that information to the kernel. The kernel then generates both
the vmcore and the opalcore based on the information provided by OPAL.
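
Roughly, the MPIPL boot path in the kernel would then do something like the
following (untested; the node path and property name are placeholders, the
real device tree bindings are still under discussion):

#include <linux/of.h>

/* MPIPL boot path: "/ibm,opal/dump" and "result-table" below are
 * illustrative names only, not settled bindings. */
static bool fadump_mpipl_ranges_present(void)
{
	struct device_node *np;
	const void *prop;
	int len;

	np = of_find_node_by_path("/ibm,opal/dump");
	if (!np)
		return false;

	prop = of_get_property(np, "result-table", &len);
	of_node_put(np);

	/* The caller would parse the (source, dest, size) triples out of
	 * 'prop' and feed them to vmcore/opalcore generation. */
	return prop != NULL;
}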
> I really think OPAL crash information and
> especially when the host is running could benefit from more thought.
I think the OPAL core is really useful for debugging OPAL issues.
>
>> Finally, this is a similar concept to what we have in PowerVM LPARs as
>> well. Hence I have added the structure.
>
> Is that a point for or against this structure? :)
In this case, I'm in favor of the structure :-)
-Vasant