[PATCH v5 1/3] powerpc: make fadump resilient with memory add/remove events

Sourabh Jain sourabhjain at linux.ibm.com
Wed Nov 22 21:35:59 AEDT 2023


Hello Michael,


On 22/11/23 10:47, Michael Ellerman wrote:
> Aneesh Kumar K V <aneesh.kumar at linux.ibm.com> writes:
> ...
>> I am not sure whether we need to add all the complexity to enable supporting different fadump kernel
>> version. Is that even a possible use case with fadump? Can't we always assume that with fadump the
>> crash kernel and fadump kernel will be same version?
> How sure are we of that?
>
> Don't we go through grub when we boot into the 2nd kernel. And so
> couldn't it choose to boot a different kernel, for whatever reason.
>
> I don't think we need to support different pt_reg / cpumask sizes, but
> requiring the exact same kernel version is too strict I think.
Agree.
>
> But maybe I'm wrong. Would be good to hear what distro folks think.

How about checking fadump crash info header compatibility in the 
following way?

static bool is_fadump_header_compatible(struct fadump_crash_info_header 
*fdh)
{
     if (fdh->magic_number == FADUMP_CRASH_INFO_MAGIC_OLD) {
         pr_err("Old magic number, can't process the dump.");
         return false;
     }

     if (fdh->magic_number != FADUMP_CRASH_INFO_MAGIC) {
         pr_err("Fadump header is corrupted.");
         return false;
     }

     /*
      * If the kernel version of the first/crashed kernel and the 
second/fadump
      * kernel is not same, then only collect the dump if the size of all
      * non-primitive type members of the fadump header is the same 
across kernels.
      */
     if (strcmp(fdh->kernel_version, init_uts_ns.name.release)) {
         if (fdh->pt_regs_sz != sizeof(struct pt_regs) || 
fdh->cpu_mask_sz != sizeof(struct cpumask)) {
             pr_err("Fadump header size mismatch.\n")
             return false;
         } else
             pr_warn("Kernel version mismatch; dump data is unreliable.\n");
     }

     return true;
}

And the new fadump crash info header will be: As suggested by Hari.

/* fadump crash info structure */
struct fadump_crash_info_header {
     u64        magic_number;
+  u32        version;
     u32        crashing_cpu;
     u64        elfcorehdr_addr;
+  u64        elfcorehdr_size;
+  u64        vmcoreinfo_raddr;
+  u64        vmcoreinfo_size;
+  u8          kernel_version[__NEW_UTS_LEN + 1];
+  u32        pt_regs_sz;
     struct pt_regs    regs;
+  u32        cpu_mask_sz;
     struct cpumask    cpu_mask;
};

Thanks,
Sourabh Jain


More information about the Linuxppc-dev mailing list