drm/i915: Prevent TLB error on first execution on SNB
authorChris Wilson <chris@chris-wilson.co.uk>
Fri, 13 Feb 2015 14:35:59 +0000 (14:35 +0000)
committerJani Nikula <jani.nikula@intel.com>
Tue, 10 Mar 2015 13:30:23 +0000 (15:30 +0200)
Long ago I found that I was getting sporadic errors when booting SNB,
with the symptom being that the first batch died with IPEHR != *ACTHD,
typically caused by the TLB being invalid. These magically disappeared
if I held the forcewake during the entire ring initialisation sequence.
(It can probably be shortened to a short critical section, but the whole
initialisation is full of register writes and so we would be taking and
releasing forcewake almost continually, and so holding it over the
entire sequence will probably be a net win!)

Note some of the kernels I encounted the issue already had the deferred
forcewake release, so it is still relevant.

I know that there have been a few other reports with similar failure
conditions on SNB, I think such as
References: https://bugs.freedesktop.org/show_bug.cgi?id=80913

v2: Wrap i915_gem_init_hw() with its own security blanket as we take
that path following resume and reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
drivers/gpu/drm/i915/i915_gem.c

index ac7fe39d38a30eb541a1b90f40a6c01bc299838b..5b205863b6596d7fa72e6f465a017d9bcb204e78 100644 (file)
@@ -4793,6 +4793,9 @@ i915_gem_init_hw(struct drm_device *dev)
        if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
                return -EIO;
 
+       /* Double layer security blanket, see i915_gem_init() */
+       intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
        if (dev_priv->ellc_size)
                I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
 
@@ -4825,7 +4828,7 @@ i915_gem_init_hw(struct drm_device *dev)
        for_each_ring(ring, dev_priv, i) {
                ret = ring->init_hw(ring);
                if (ret)
-                       return ret;
+                       goto out;
        }
 
        for (i = 0; i < NUM_L3_SLICES(dev); i++)
@@ -4842,9 +4845,11 @@ i915_gem_init_hw(struct drm_device *dev)
                DRM_ERROR("Context enable failed %d\n", ret);
                i915_gem_cleanup_ringbuffer(dev);
 
-               return ret;
+               goto out;
        }
 
+out:
+       intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
        return ret;
 }
 
@@ -4878,6 +4883,14 @@ int i915_gem_init(struct drm_device *dev)
                dev_priv->gt.stop_ring = intel_logical_ring_stop;
        }
 
+       /* This is just a security blanket to placate dragons.
+        * On some systems, we very sporadically observe that the first TLBs
+        * used by the CS may be stale, despite us poking the TLB reset. If
+        * we hold the forcewake during initialisation these problems
+        * just magically go away.
+        */
+       intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
        ret = i915_gem_init_userptr(dev);
        if (ret)
                goto out_unlock;
@@ -4904,6 +4917,7 @@ int i915_gem_init(struct drm_device *dev)
        }
 
 out_unlock:
+       intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
        mutex_unlock(&dev->struct_mutex);
 
        return ret;