If an afu interrupt is in flight when an eeh error is triggered the
control still reaches the function native_irq_multiplexed and the
PE-Handle read from the CXL_PSL_PEHandle_An register is 0xffff. The
function then erroneously assumes that the interrupt belonged to a
detached context and generates a warning with full stack dump in the
kernel log complaining:
"Unable to demultiplex CXL PSL IRQ for PE 65535 DSISR
ffffffff DAR
ffffffff. (Possible AFU HW issue - was a term/remove acked with
outstanding transactions"
To fix this the patch adds new code to the function
native_irq_multiplexed function to compares the read value of register
CXL_PSL_PEHandle_An to ~0ULL. If true then logs a warning message
saying that the interrupt is being ignored and returns IRQ_HANDLED from
the irq handler.
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
struct cxl_afu *afu = data;
struct cxl_context *ctx;
struct cxl_irq_info irq_info;
- int ph = cxl_p2n_read(afu, CXL_PSL_PEHandle_An) & 0xffff;
- int ret;
-
+ u64 phreg = cxl_p2n_read(afu, CXL_PSL_PEHandle_An);
+ int ph, ret;
+
+ /* check if eeh kicked in while the interrupt was in flight */
+ if (unlikely(phreg == ~0ULL)) {
+ dev_warn(&afu->dev,
+ "Ignoring slice interrupt(%d) due to fenced card",
+ irq);
+ return IRQ_HANDLED;
+ }
+ /* Mask the pe-handle from register value */
+ ph = phreg & 0xffff;
if ((ret = native_get_irq_info(afu, &irq_info))) {
WARN(1, "Unable to get CXL IRQ Info: %i\n", ret);
return fail_psl_irq(afu, &irq_info);