[V2] cxl: Check periodically the coherent platform function's state
mpe at ellerman.id.au
Wed May 11 07:48:40 AEST 2016
On Fri, 2016-04-22 at 13:39:22 UTC, Christophe Lombard wrote:
> In the PowerVM environment, the PHYP CoherentAccel component manages
> the state of the Coherent Accelerator Processor Interface adapter and
> virtualizes CAPI resources, handles CAPP, PSL, PSL Slice errors - and
> interrupts - and provides a new set of hcalls for the OS APIs to utilize
> the Accelerator Function Unit (AFU).
> During the course of operation, a coherent platform function can
> encounter errors. Some possible reasons for errors are:
> • Hardware recoverable and unrecoverable errors
> • Transient and over-threshold correctable errors
> PHYP implements its own state model for the coherent platform function.
> The state of the AFU is available through a hcall.
> The current implementation of the cxl driver, for the PowerVM
> environment, checks this state of the AFU only when an action is
> requested - open a device, ioctl command, memory map, attach/detach a
> process - from an external driver - cxlflash, libcxl. If an error is
> detected, the cxl driver handles the error according to the content of the
> Power Architecture Platform Requirements document.
> But in case of low-level trouble (or error injection), the PHYP
> component may reset the card and change the AFU state. The PHYP
> interface doesn't provide any way to be notified when that happens, which
> implies that the cxl driver:
> • cannot immediately handle the state change of the AFU.
> • cannot notify other drivers (cxlflash, ...)
> The purpose of this patch is to wake up the cpu periodically to check
> the current state of each AFU and to see if we need to enter an error
> recovery path.
> Signed-off-by: Christophe Lombard <clombard at linux.vnet.ibm.com>
> Acked-by: Ian Munsie <imunsie at au1.ibm.com>
Applied to powerpc next, thanks.
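
For readers skimming the archive, the periodic check described in the quoted
commit message can be sketched in plain user-space C. This is only an
illustration under stated assumptions, not the patch itself: the state names,
struct afu, poll_afus(), start_error_recovery() and fake_query_state() are all
hypothetical, and the callback merely stands in for the hcall that returns the
AFU state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states; the real PHYP state model is not spelled out above. */
enum afu_state { AFU_STATE_NORMAL, AFU_STATE_ERROR };

struct afu {
    int id;
    enum afu_state last_seen; /* state observed on the previous pass */
    int recoveries;           /* how many times recovery was started */
};

/* Stand-in for the hcall that reports the current AFU state. */
typedef enum afu_state (*query_state_fn)(const struct afu *afu);

/* Placeholder for the error recovery path; here it only counts calls. */
static void start_error_recovery(struct afu *afu)
{
    afu->recoveries++;
}

/*
 * One pass of the periodic check: query every AFU and start recovery
 * for each one whose state changed to something other than normal.
 * Returns the number of AFUs that entered recovery during this pass.
 */
static int poll_afus(struct afu *afus, size_t n, query_state_fn query)
{
    int entered = 0;

    assert(query != NULL);
    for (size_t i = 0; i < n; i++) {
        enum afu_state now = query(&afus[i]);

        if (now == afus[i].last_seen)
            continue; /* nothing changed since the last pass */

        afus[i].last_seen = now;
        if (now != AFU_STATE_NORMAL) {
            start_error_recovery(&afus[i]);
            entered++;
        }
    }
    return entered;
}

/* Fake query for experimentation: pretends AFU 1 has been reset. */
static enum afu_state fake_query_state(const struct afu *afu)
{
    return afu->id == 1 ? AFU_STATE_ERROR : AFU_STATE_NORMAL;
}
```

In a driver, a pass like poll_afus() would presumably be run at a fixed
interval from a timer, delayed work item or kernel thread; the commit message
does not say which mechanism the patch uses. Comparing against the last seen
state keeps an AFU that stays in error from re-entering recovery on every pass.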