[PATCH] kdump: Fix for machine checkstop on DMA fault

Haren Myneni hbabu at us.ibm.com
Fri Mar 24 10:06:22 EST 2006

linuxppc-dev-bounces+hbabu=us.ibm.com at ozlabs.org wrote on 03/23/2006 
12:12:58 PM:

> On Thu, Mar 23, 2006 at 12:19:04AM -0600, Olof Johansson wrote:
> > The crash kernel needs to be even more careful, and instead read out
> > the entries that are mapped and reserve them. This would require a bit
> > more plumbing since there's no way to read an entry right now, but 
> > it would remove that hole.
> Actually, what's probably easier is to allocate some entries when the
> purgatory is set up, and make the crash kernel only use those by 
> modifying the device tree accordingly. Sort of how regular memory is handled right
> now. That'd be a cleaner solution with less changes needed.
> The trick will be to get a decent size contiguous allocation, but the
> same applies for the memory reserve.

Olof, thanks for your comments and suggestions.

On JS21, immediately after the TCE entries are initialized, the machine 
checkstops with the error "Internal CPU 1 Fault Error" on the BladeCenter MM. 
If we do not initialize the TCE entries for the crash kernel, ongoing DMA 
is allowed to continue into the old kernel's memory. I thought that ongoing 
DMA would be stopped later, when the drivers reset their devices. I think 
some hardening is already included in some drivers to take care of this 
behavior, but I might be wrong. So far, I have seen the e100 issue after 
testing on p5, p4, js20 and js21. It may just have been a lucky scenario.
So, I will keep the same change (posted here) and add your suggestion. 
Right? Can we apply the same approach for power-4 as well?
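
To make the suggestion above concrete, here is a minimal user-space sketch 
of reserving a contiguous run of TCE entries for the crash kernel, which 
then initializes only that window and leaves the old mappings alone. It is 
a model only, not the kernel's iommu code: the table size, struct tce_table, 
tce_reserve_range() and crash_tce_init() are hypothetical names, and the 
convention that 0 means "unmapped" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TCE_ENTRIES 64               /* hypothetical table size */

struct tce_table {
    uint64_t entry[TCE_ENTRIES];     /* 0 means "unmapped" in this model */
    size_t   crash_start;            /* first entry reserved for the crash kernel */
    size_t   crash_count;            /* number of reserved entries */
};

/* First kernel: find `count` contiguous unmapped entries and set them aside. */
static bool tce_reserve_range(struct tce_table *t, size_t count)
{
    size_t run = 0;

    if (count == 0 || count > TCE_ENTRIES)
        return false;

    for (size_t i = 0; i < TCE_ENTRIES; i++) {
        /* Extend the current free run, or restart it on a mapped entry. */
        run = (t->entry[i] == 0) ? run + 1 : 0;
        if (run == count) {
            t->crash_start = i + 1 - count;
            t->crash_count = count;
            return true;
        }
    }
    return false;                    /* no contiguous run of that size */
}

/*
 * Crash kernel: clear only the reserved window. Entries outside it keep
 * their old translations, so DMA still in flight from the crashed kernel
 * continues to hit valid mappings.
 */
static void crash_tce_init(struct tce_table *t)
{
    assert(t->crash_start + t->crash_count <= TCE_ENTRIES);

    for (size_t i = 0; i < t->crash_count; i++)
        t->entry[t->crash_start + i] = 0;
}
```

In a real setup the reserved range would have to be passed to the crash 
kernel somehow, for example through the device tree as suggested above. 
The sketch keeps it in the same structure for brevity.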


> -Olof