nvme-pci: Remove watchdog timer
author Keith Busch <keith.busch@intel.com>
Wed, 7 Jun 2017 18:32:50 +0000 (20:32 +0200)
committer Christoph Hellwig <hch@lst.de>
Thu, 15 Jun 2017 12:30:08 +0000 (14:30 +0200)
commit b2a0eb1a0ac72869c910a79d935a0b049ec78ad9
tree 5950dcdb125ab720107e27eea4bb83391df04ab8
parent 97f6ef6464dbd235a4d9bdfc05d949aab24fc927

The controller status polling was added to preemptively reset a failed
controller. This early detection gave commands that would otherwise
time out a chance to be retried, and found broken links when the platform
did not support hotplug.

This once-per-second MMIO read, however, created more problems than
it solved. It often raced with PCIe hotplug events, which required
complicated synchronization between work queues; it frequently triggered
PCIe Completion Timeout errors that also led to fatal machine checks; and
it unnecessarily disrupted low power modes by running on idle controllers.

This patch removes the watchdog timer and instead checks controller
health only on an IO timeout, when we have a reason to believe something
is wrong. If the controller has failed, the driver disables it immediately
and schedules a reset.
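
The timeout-driven check can be illustrated with a minimal userspace
sketch. It is not the code from this patch: struct sketch_dev and every
sketch_* function are hypothetical stand-ins, the register read is
simulated, and skipping the reset while one is already in progress is an
assumption. Only the CSTS bit positions (CFS, NSSRO) come from the NVMe
specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CSTS bit positions as defined by the NVMe specification. */
#define CSTS_CFS   (1u << 1)	/* Controller Fatal Status */
#define CSTS_NSSRO (1u << 4)	/* NVM Subsystem Reset Occurred */

/* Hypothetical stand-in for the driver's per-device state. */
struct sketch_dev {
	uint32_t csts;		/* simulated CSTS register contents */
	bool subsystem;		/* controller supports subsystem reset */
	bool resetting;		/* a reset is already in progress */
	bool enabled;		/* controller is currently enabled */
	bool reset_scheduled;	/* reset work has been queued */
	unsigned int csts_reads;	/* how many times CSTS was read */
};

/* Simulated MMIO read; a real driver would read the mapped register. */
static uint32_t sketch_read_csts(struct sketch_dev *dev)
{
	dev->csts_reads++;
	return dev->csts;
}

/* Decide whether the status value indicates a failed controller. */
static bool sketch_should_reset(const struct sketch_dev *dev, uint32_t csts)
{
	bool nssro = dev->subsystem && (csts & CSTS_NSSRO);

	/* Assumption: do not request a second reset while one is running. */
	if (dev->resetting)
		return false;

	return (csts & CSTS_CFS) || nssro;
}

/*
 * Called only when an IO command times out, so an idle controller is
 * never polled. Returns true if the timeout was handled by disabling
 * the controller and scheduling a reset.
 */
static bool sketch_io_timeout(struct sketch_dev *dev)
{
	uint32_t csts;

	assert(dev != NULL);
	csts = sketch_read_csts(dev);

	if (!sketch_should_reset(dev, csts))
		return false;	/* ordinary timeout handling, not modeled */

	dev->enabled = false;		/* disable immediately */
	dev->reset_scheduled = true;	/* request a reset */
	return true;
}
```

In this sketch the status register is read once per timeout rather than
once per second, so the number of reads tracks the number of timeouts.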

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
drivers/nvme/host/pci.c