summaryrefslogtreecommitdiff
path: root/drivers
diff options
context:
space:
mode:
authorChris Wilson <chris@chris-wilson.co.uk>2016-04-09 10:57:53 +0100
committerChris Wilson <chris@chris-wilson.co.uk>2016-04-09 12:08:53 +0100
commit9b9ed3093613288247a27a55a6dd07f1222150f1 (patch)
treeeef209b9bf3d813b9f37dadf47a73a3cb483ac99 /drivers
parent782f6bc0aba037436d6a04d19b23f8b61020a576 (diff)
downloadlwn-9b9ed3093613288247a27a55a6dd07f1222150f1.tar.gz
lwn-9b9ed3093613288247a27a55a6dd07f1222150f1.zip
drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+
In order to ensure seqno/irq coherency, we currently read a ring register. The mmio transaction following the interrupt delays the inspection of the seqno long enough for the MI_STORE_DWORD_IMM to update the CPU cache. However, it is only the memory timing that is important for the purposes of the delay, we do not need nor desire the extra forcewake. v3: Update commentary Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2] Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-1-git-send-email-chris@chris-wilson.co.uk
Diffstat (limited to 'drivers')
-rw-r--r--drivers/gpu/drm/i915/intel_ringbuffer.c13
1 files changed, 11 insertions, 2 deletions
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 39aa318ad779..69cc3bc20495 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1573,10 +1573,19 @@ gen6_ring_get_seqno(struct intel_engine_cs *engine, bool lazy_coherency)
{
/* Workaround to force correct ordering between irq and seqno writes on
* ivb (and maybe also on snb) by reading from a CS register (like
- * ACTHD) before reading the status page. */
+ * ACTHD) before reading the status page.
+ *
+ * Note that this effectively stalls the read by the time it takes to
+ * do a memory transaction, which more or less ensures that the write
+ * from the GPU has sufficient time to invalidate the CPU cacheline.
+ * Alternatively we could delay the interrupt from the CS ring to give
+ * the write time to land, but that would incur a delay after every
+ * batch i.e. much more frequent than a delay when waiting for the
+ * interrupt (with the same net latency).
+ */
if (!lazy_coherency) {
struct drm_i915_private *dev_priv = engine->dev->dev_private;
- POSTING_READ(RING_ACTHD(engine->mmio_base));
+ POSTING_READ_FW(RING_ACTHD(engine->mmio_base));
}
return intel_read_status_page(engine, I915_GEM_HWS_INDEX);