drm/i915: Apply a cond_resched() to the saturated signaler
author    Chris Wilson <chris@chris-wilson.co.uk>
          Tue, 4 Apr 2017 12:05:31 +0000 (13:05 +0100)
committer Chris Wilson <chris@chris-wilson.co.uk>
          Tue, 4 Apr 2017 12:48:21 +0000 (13:48 +0100)

If the engine is continually completing nops, we can saturate the
signaler and keep it working indefinitely. This angers the NMI watchdog!

A good example is to disable semaphores on snb and run igt/gem_exec_nop -
the parallel, multi-engine workloads are more than sufficient to hog the
CPU, preventing the system from even processing ICMP echo replies.

v2: Tvrtko dug into cond_resched() on x86 and found that it only
depended upon preempt_count and not tif_need_resched() - which means
that we would always call schedule() at that point.
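
The control flow this patch introduces can be sketched in userspace as
follows. This is only an illustrative model, not the driver code: struct sim,
sim_need_resched(), sim_schedule() and sim_signaler() are hypothetical
stand-ins, and the "timeslice" counter merely imitates the scheduler raising
need_resched() after the thread has run for a while.

```c
#include <stdbool.h>

/* Hypothetical model state; none of these names exist in i915. */
struct sim {
	int pending;     /* completed requests waiting to be signaled */
	int timeslice;   /* requests handled before "need_resched" is raised */
	int since_yield; /* requests handled since the last yield */
	int yields;      /* number of times the CPU was relinquished */
	int signaled;    /* total requests signaled */
};

/* Stand-in for need_resched(): true once the fake timeslice is used up. */
static bool sim_need_resched(const struct sim *s)
{
	return s->since_yield >= s->timeslice;
}

/* Stand-in for schedule(): give up the CPU and start a new timeslice. */
static void sim_schedule(struct sim *s)
{
	s->yields++;
	s->since_yield = 0;
}

static void sim_signaler(struct sim *s)
{
	for (;;) {
		/* Default: nothing was processed, so go to sleep. */
		bool do_schedule = true;

		if (s->pending > 0) {
			/* Busy path: signal one completed request... */
			s->pending--;
			s->signaled++;
			s->since_yield++;
			/* ...and yield only if the scheduler asks us to. */
			do_schedule = sim_need_resched(s);
		}

		if (do_schedule) {
			if (s->pending == 0)
				break; /* models kthread_should_stop() */
			sim_schedule(s);
		}
	}
}
```

With 10 pending requests and a timeslice of 4, the model yields after the
4th and 8th requests and then exits once the queue drains, rather than
either yielding after every request or never yielding at all.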

Fixes: c81d46138da6 ("drm/i915: Convert trace-irq to the breadcrumb waiter")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170404120531.10737-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
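
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 308c56a021abb5f4a3634e665e2161444016951c..9ccbf26124c6169d2e35a6e915c3da09845ba269 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c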
@@ -580,6 +580,8 @@ static int intel_breadcrumbs_signaler(void *arg)
        signaler_set_rtpriority();
 
        do {
+               bool do_schedule = true;
+
                set_current_state(TASK_INTERRUPTIBLE);
 
                /* We are either woken up by the interrupt bottom-half,
@@ -626,7 +628,18 @@ static int intel_breadcrumbs_signaler(void *arg)
                        spin_unlock_irq(&b->rb_lock);
 
                        i915_gem_request_put(request);
-               } else {
+
+                       /* If the engine is saturated we may be continually
+                        * processing completed requests. This angers the
+                        * NMI watchdog if we never let anything else
+                        * have access to the CPU. Let's pretend to be nice
+                        * and relinquish the CPU if we burn through the
+                        * entire RT timeslice!
+                        */
+                       do_schedule = need_resched();
+               }
+
+               if (unlikely(do_schedule)) {
                        DEFINE_WAIT(exec);
 
                        if (kthread_should_park())