When hardware is removed on a Stratus, the system may crash like this:

ACPI: PCI interrupt for device 0000:7c:00.1 disabled
Trying to free nonexistent resource <00000000a8000000-00000000afffffff>
Trying to free nonexistent resource <00000000a4800000-00000000a480ffff>
uhci_hcd 0000:7e:1d.0: remove, state 1
usb usb2: USB disconnect, address 1
usb 2-1: USB disconnect, address 2
Unable to handle kernel paging request at 0000000000100100 RIP:
[<ffffffff88021950>] :uhci_hcd:uhci_scan_schedule+0xa2/0x89c

#4 [ffff81011de17e50] uhci_scan_schedule at ffffffff88021918
#5 [ffff81011de17ed0] uhci_irq at ffffffff88023cb8
#6 [ffff81011de17f10] usb_hcd_irq at ffffffff801f1c1f
#7 [ffff81011de17f20] handle_IRQ_event at ffffffff8001123b
#8 [ffff81011de17f50] __do_IRQ at ffffffff800ba749

This occurs because an interrupt handler scans uhci->skelqh while it
is being freed. We do the right thing: we disable interrupts in the
device and do no processing if the interrupt is shared with another
source. But it is still possible that another CPU, already inside
the handler, gets delayed somewhere (e.g. in a loop) until after we
have started freeing.
The agreed-upon solution is to wait for interrupts to play out
before proceeding. No other barriers are necessary.
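
For illustration only, here is a user-space C model of the ordering
described above: disable the interrupt source, wait for any handler
still running on another CPU to finish, and only then free the
schedule. This is a sketch, not the patch itself. Every name in it
(fake_hc, fake_irq, fake_synchronize_irq, fake_stop, demo_teardown)
is hypothetical, and a thread stands in for the other CPU. In the
kernel, the primitive that waits for running handlers on an IRQ line
is synchronize_irq().

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the host controller state. */
struct fake_hc {
	atomic_int irq_enabled;	/* device-level interrupt enable */
	atomic_int in_handler;	/* handlers currently executing */
	atomic_int quit;	/* tells the interrupt source to exit */
	long *skelqh;		/* stands in for uhci->skelqh */
};

/* Models the handler: touches skelqh only while interrupts are enabled. */
static void fake_irq(struct fake_hc *hc)
{
	atomic_fetch_add(&hc->in_handler, 1);
	if (atomic_load(&hc->irq_enabled))
		hc->skelqh[0]++;	/* "scan the schedule" */
	atomic_fetch_sub(&hc->in_handler, 1);
}

/* Models waiting for in-flight handlers to play out. */
static void fake_synchronize_irq(struct fake_hc *hc)
{
	while (atomic_load(&hc->in_handler) != 0)
		sched_yield();
}

/* Teardown: disable, wait, and only then free. */
static void fake_stop(struct fake_hc *hc)
{
	atomic_store(&hc->irq_enabled, 0);
	fake_synchronize_irq(hc);
	free(hc->skelqh);
	hc->skelqh = NULL;
}

/* Thread body standing in for another CPU taking interrupts. */
static void *irq_source(void *arg)
{
	struct fake_hc *hc = arg;

	while (!atomic_load(&hc->quit))
		fake_irq(hc);
	return NULL;
}

/* Runs one teardown against a live interrupt source; 0 on success. */
int demo_teardown(void)
{
	struct fake_hc hc;
	pthread_t t;

	atomic_init(&hc.irq_enabled, 1);
	atomic_init(&hc.in_handler, 0);
	atomic_init(&hc.quit, 0);
	hc.skelqh = calloc(1, sizeof(*hc.skelqh));
	if (!hc.skelqh)
		return -1;
	if (pthread_create(&t, NULL, irq_source, &hc) != 0) {
		free(hc.skelqh);
		return -1;
	}

	fake_stop(&hc);		/* handler may still fire, but sees irq_enabled == 0 */

	atomic_store(&hc.quit, 1);
	pthread_join(t, NULL);
	return hc.skelqh == NULL ? 0 : -1;
}
```

Without the wait in fake_stop(), a handler that had already passed
the irq_enabled check could still dereference skelqh after free(),
which is the model's version of the crash above. The model relies on
the default sequentially consistent atomics; it says nothing about
what ordering the real handler and uhci_stop() path need.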
A backport of this patch was tested on a 2.6.18 based kernel.
Testing of 2.6.32-based kernels is under way, but it takes us
forever (months) to turn this around. So I think it's a good
patch and we should keep it.
Tracked in RH bz#516851
Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>