If the engine is continually completing nops, we can saturate the
signaler and keep it working indefinitely. This angers the NMI watchdog!
A good example is to disable semaphores on snb and run igt/gem_exec_nop -
the parallel, multi-engine workloads are more than sufficient to hog the
CPU, preventing the system from even processing ICMP echo replies.

v2: Tvrtko dug into cond_resched() on x86 and found that it only
depended upon preempt_count and not tif_need_resched() - which means
that we would always call schedule() at that point.
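
The control flow this leaves us with can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not the
driver code: struct sim, sim_need_resched() and run_signaler() are
hypothetical names, and the "timeslice" is just a counter of processed
requests standing in for the scheduler setting the need-resched flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in i915. */
struct sim {
	int pending;     /* completed requests still waiting to be signaled */
	int slice_left;  /* iterations left before the modelled timeslice expires */
	int yields;      /* times the loop gave up the CPU while still busy */
};

/* Stand-in for need_resched(): true once the modelled timeslice is used up. */
static bool sim_need_resched(const struct sim *s)
{
	return s->slice_left <= 0;
}

/*
 * Drain 'requests' completed requests and return how often the loop
 * yielded while work was still outstanding. 'slice' is the assumed
 * number of requests that fit into one timeslice.
 */
static int run_signaler(int requests, int slice)
{
	struct sim s = { .pending = requests, .slice_left = slice, .yields = 0 };

	assert(slice > 0);

	for (;;) {
		/* Default: nothing was signaled, so go to sleep. */
		bool do_schedule = true;

		if (s.pending > 0) {
			/* Signal one completed request. */
			s.pending--;
			s.slice_left--;

			/* Stay on the CPU unless the scheduler wants it back. */
			do_schedule = sim_need_resched(&s);
		}

		if (do_schedule) {
			if (s.pending == 0)
				break; /* idle: the real thread would sleep here */

			/* Busy but out of timeslice: yield, then resume. */
			s.yields++;
			s.slice_left = slice;
		}
	}

	return s.yields;
}
```

In this model, run_signaler(10, 4) gives up the CPU twice while draining
ten requests, whereas run_signaler(3, 100) never yields because the
timeslice is not exhausted before the work runs out.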
Fixes: c81d46138da6 ("drm/i915: Convert trace-irq to the breadcrumb waiter")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170404120531.10737-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
(cherry picked from commit a7980a640cbd339aa80f406d1786a275a2c320bc)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
signaler_set_rtpriority();
do {
+ bool do_schedule = true;
+
set_current_state(TASK_INTERRUPTIBLE);
/* We are either woken up by the interrupt bottom-half,
spin_unlock_irq(&b->rb_lock);
i915_gem_request_put(request);
- } else {
+
+ /* If the engine is saturated we may be continually
+ * processing completed requests. This angers the
+ * NMI watchdog if we never let anything else
+ * have access to the CPU. Let's pretend to be nice
+ * and relinquish the CPU if we burn through the
+ * entire RT timeslice!
+ */
+ do_schedule = need_resched();
+ }
+
+ if (unlikely(do_schedule)) {
DEFINE_WAIT(exec);
if (kthread_should_park())