nmi_watchdog: Fall back to software events when no hardware PMU is detected
author Don Zickus <dzickus@redhat.com>
Fri, 12 Feb 2010 22:19:20 +0000 (17:19 -0500)
committer Ingo Molnar <mingo@elte.hu>
Sun, 14 Feb 2010 08:19:44 +0000 (09:19 +0100)
Not all arches have a PMU, and not all of those that do have
perf_event support for it.  The nmi_watchdog fails in those
cases.  Fall back to using software events, which generate
nmi_watchdog traffic from local APIC interrupts.

Tested on a Pentium4 and it worked as expected, except for
detecting cpu lockups.

The problem with using software events as a cpu lockup detector
is that the nmi_watchdog assumes that if local APIC interrupts
stop incrementing, then the cpu is probably locked up.  But with
software events we use the local APIC to trigger the
nmi_watchdog callback that checks whether local APIC interrupts
are still firing.  Obviously they are, otherwise the callback
would not have been triggered.
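
The circularity can be shown with a small user-space model.  This
is only a sketch, not the kernel code: struct wd_state, wd_check(),
simulate() and WD_STALL_LIMIT are hypothetical names, and the
threshold of 5 checks is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu watchdog state; not the kernel's layout. */
struct wd_state {
	unsigned long last_irq_count;	/* count seen at the previous check */
	unsigned int stalled_checks;	/* consecutive checks without progress */
};

#define WD_STALL_LIMIT 5	/* assumed threshold, for illustration only */

/* Returns true once the interrupt count has not moved for
 * WD_STALL_LIMIT consecutive checks. */
static bool wd_check(struct wd_state *st, unsigned long irq_count)
{
	assert(st != NULL);
	if (irq_count != st->last_irq_count) {
		st->last_irq_count = irq_count;
		st->stalled_checks = 0;
		return false;
	}
	return ++st->stalled_checks >= WD_STALL_LIMIT;
}

/* Model a cpu whose interrupts fire for `alive` periods and then stop.
 * nmi_driven: the check runs every period, whatever the interrupt does.
 * otherwise:  the check runs only from the interrupt itself. */
static bool simulate(bool nmi_driven, unsigned int alive, unsigned int periods)
{
	struct wd_state st = { 0, 0 };
	unsigned long irq_count = 0;
	unsigned int t;

	for (t = 0; t < periods; t++) {
		bool irq_fires = t < alive;

		if (irq_fires)
			irq_count++;
		if ((nmi_driven || irq_fires) && wd_check(&st, irq_count))
			return true;	/* lockup reported */
	}
	return false;			/* lockup never reported */
}
```

With nmi_driven set, a cpu whose interrupts stop is reported after
WD_STALL_LIMIT further periods.  With it clear, the check stops
running at the same moment the interrupts do, so nothing is ever
reported.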

The algorithm to detect cpu lockups is the same as the old
nmi_watchdog. Perhaps we need to find a better way to detect
lockups?

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
LKML-Reference: <1266013161-31197-3-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

diff --git a/kernel/nmi_watchdog.c b/kernel/nmi_watchdog.c
index 73c1954a97bb8cd40d558f1ff3b1d3b7aaccec8f..4f23505d887d116d9ecd83b3401a7e413b0cf9e4 100644
--- a/kernel/nmi_watchdog.c
+++ b/kernel/nmi_watchdog.c
@@ -166,8 +166,12 @@ cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
                wd_attr.sample_period = hw_nmi_get_sample_period();
                event = perf_event_create_kernel_counter(&wd_attr, hotcpu, -1, wd_overflow);
                if (IS_ERR(event)) {
-                       printk(KERN_ERR "nmi watchdog failed to create perf event on %i: %p\n", hotcpu, event);
-                       return NOTIFY_BAD;
+                       wd_attr.type = PERF_TYPE_SOFTWARE;
+                       event = perf_event_create_kernel_counter(&wd_attr, hotcpu, -1, wd_overflow);
+                       if (IS_ERR(event)) {
+                               printk(KERN_ERR "nmi watchdog failed to create perf event on %i: %p\n", hotcpu, event);
+                               return NOTIFY_BAD;
+                       }
                }
                per_cpu(nmi_watchdog_ev, hotcpu) = event;
                perf_event_enable(per_cpu(nmi_watchdog_ev, hotcpu));
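
The retry pattern in the hunk above can be modeled outside the
kernel.  The following is a minimal sketch under stated assumptions:
every name in it (ev_type, ev_attr, event, create_fn,
create_with_fallback and the mock backends) is hypothetical, and a
NULL return stands in for the ERR_PTR value that the kernel code
tests with IS_ERR().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the perf event type and attribute. */
enum ev_type { EV_HARDWARE, EV_SOFTWARE };
struct ev_attr { enum ev_type type; };
struct event { enum ev_type type; };

/* A backend creates an event for the requested type, or returns NULL. */
typedef struct event *(*create_fn)(const struct ev_attr *attr);

/* Try a hardware event first; if that fails, switch the attribute to
 * the software type and try once more.  Returns NULL only if both
 * attempts fail. */
static struct event *create_with_fallback(struct ev_attr *attr, create_fn create)
{
	struct event *ev;

	assert(attr != NULL && create != NULL);
	attr->type = EV_HARDWARE;
	ev = create(attr);
	if (ev == NULL) {
		attr->type = EV_SOFTWARE;
		ev = create(attr);
	}
	return ev;
}

static struct event hw_event = { EV_HARDWARE };
static struct event sw_event = { EV_SOFTWARE };

/* Mock backend: machine with a usable PMU. */
static struct event *create_with_pmu(const struct ev_attr *attr)
{
	return attr->type == EV_HARDWARE ? &hw_event : &sw_event;
}

/* Mock backend: no PMU, only software events can be created. */
static struct event *create_no_pmu(const struct ev_attr *attr)
{
	return attr->type == EV_SOFTWARE ? &sw_event : NULL;
}

/* Mock backend: nothing can be created. */
static struct event *create_nothing(const struct ev_attr *attr)
{
	(void)attr;
	return NULL;
}
```

Passing create_no_pmu yields the software event and leaves
attr->type set to EV_SOFTWARE.  Passing create_nothing yields NULL,
which corresponds to the path that still prints the error and
returns NOTIFY_BAD.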