[PATCH v5 1/3] powerpc: make fadump resilient with memory add/remove events
Aneesh Kumar K V
aneesh.kumar at linux.ibm.com
Fri Nov 17 16:31:02 AEDT 2023
On 11/17/23 10:03 AM, Sourabh Jain wrote:
> Hi Aneesh,
>
> Thanks for reviewing the patch.
>
> On 15/11/23 10:14, Aneesh Kumar K.V wrote:
>> Sourabh Jain<sourabhjain at linux.ibm.com> writes:
>>
>> ....
>>
>>> diff --git a/arch/powerpc/include/asm/fadump-internal.h b/arch/powerpc/include/asm/fadump-internal.h
>>> index 27f9e11eda28..7be3d8894520 100644
>>> --- a/arch/powerpc/include/asm/fadump-internal.h
>>> +++ b/arch/powerpc/include/asm/fadump-internal.h
>>> @@ -42,7 +42,25 @@ static inline u64 fadump_str_to_u64(const char *str)
>>> #define FADUMP_CPU_UNKNOWN (~((u32)0))
>>> -#define FADUMP_CRASH_INFO_MAGIC fadump_str_to_u64("FADMPINF")
>>> +/*
>>> + * The introduction of new fields in the fadump crash info header has
>>> + * led to a change in the magic key, from `FADMPINF` to `FADMPSIG`.
>>> + * This alteration ensures backward compatibility, enabling the kernel
>>> + * with the updated fadump crash info to handle kernel dumps from older
>>> + * kernels.
>>> + *
>>> + * To prevent the need for further changes to the magic number in the
>>> + * event of future modifications to the fadump header, a version field
>>> + * has been introduced to track the fadump crash info header version.
>>> + *
>>> + * Historically, there was no connection between the magic number and
>>> + * the fadump crash info header version. However, moving forward, the
>>> + * `FADMPINF` magic number in header will be treated as version 0, while
>>> + * the `FADMPSIG` magic number in header will include a version field to
>>> + * determine its version.
>>> + */
>>> +#define FADUMP_CRASH_INFO_MAGIC fadump_str_to_u64("FADMPSIG")
>>> +#define FADUMP_VERSION 1
>>>
>> Can we keep the old magic details as
>>
>> #define FADUMP_CRASH_INFO_MAGIC_OLD fadump_str_to_u64("FADMPINF")
>> #define FADUMP_CRASH_INFO_MAGIC fadump_str_to_u64("FADMPSIG")
>
> Sure.
>
>> Also considering the struct need not be backward compatible, can we just
>> do
>>
>> struct fadump_crash_info_header {
>> u64 magic_number;
>> u32 crashing_cpu;
>> u64 elfcorehdr_addr;
>> u64 elfcorehdr_size;
>> u64 vmcoreinfo_raddr;
>> u64 vmcoreinfo_size;
>> struct pt_regs regs;
>> struct cpumask cpu_mask;
>> };
>> static inline bool fadump_compatible(struct fadump_crash_info_header
>> *fdh)
>> {
>> return (fdh->magic_number == FADUMP_CRASH_INFO_MAGIC)
>> }
>>
>> and fail fadump if we find it not compatible?
>
> Agree that it is unsafe to collect a dump with an incompatible fadump crash info header.
>
> Given that I am updating the fadump crash info header, we can make a few arrangements
> like adding a size filed for the dynamic size attribute like pt_regs and cpumask to ensure
> better compatibility in the future.
>
> Additionally, let's introduce a version field to the fadump crash info header to avoid changing
> the magic number in the future.
>
I am not sure whether we need to add all the complexity to enable supporting different fadump kernel
version. Is that even a possible use case with fadump? Can't we always assume that with fadump the
crash kernel and fadump kernel will be same version? if yes we can simply fail with a magic number
mismatch because that indicates an user config error?
-aneesh
More information about the Linuxppc-dev
mailing list