author | Chris Wilson <chris@chris-wilson.co.uk> | 2016-04-09 10:57:53 +0100 |
---|---|---|
committer | Chris Wilson <chris@chris-wilson.co.uk> | 2016-04-09 12:08:53 +0100 |
commit | 9b9ed3093613288247a27a55a6dd07f1222150f1 (patch) | |
tree | eef209b9bf3d813b9f37dadf47a73a3cb483ac99 /drivers | |
parent | 782f6bc0aba037436d6a04d19b23f8b61020a576 (diff) | |
drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+
In order to ensure seqno/irq coherency, we currently read a ring register.
The mmio transaction following the interrupt delays the inspection of
the seqno long enough for the MI_STORE_DWORD_IMM to update the CPU
cache. However, only the memory timing matters for the purposes of the
delay; we neither need nor want the extra forcewake.
v3: Update commentary
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-1-git-send-email-chris@chris-wilson.co.uk
Diffstat (limited to 'drivers')
-rw-r--r-- | drivers/gpu/drm/i915/intel_ringbuffer.c | 13 |
1 files changed, 11 insertions, 2 deletions
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 39aa318ad779..69cc3bc20495 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1573,10 +1573,19 @@ gen6_ring_get_seqno(struct intel_engine_cs *engine, bool lazy_coherency)
 {
 	/* Workaround to force correct ordering between irq and seqno writes on
 	 * ivb (and maybe also on snb) by reading from a CS register (like
-	 * ACTHD) before reading the status page. */
+	 * ACTHD) before reading the status page.
+	 *
+	 * Note that this effectively stalls the read by the time it takes to
+	 * do a memory transaction, which more or less ensures that the write
+	 * from the GPU has sufficient time to invalidate the CPU cacheline.
+	 * Alternatively we could delay the interrupt from the CS ring to give
+	 * the write time to land, but that would incur a delay after every
+	 * batch i.e. much more frequent than a delay when waiting for the
+	 * interrupt (with the same net latency).
+	 */
 	if (!lazy_coherency) {
 		struct drm_i915_private *dev_priv = engine->dev->dev_private;
-		POSTING_READ(RING_ACTHD(engine->mmio_base));
+		POSTING_READ_FW(RING_ACTHD(engine->mmio_base));
 	}
 
 	return intel_read_status_page(engine, I915_GEM_HWS_INDEX);
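The behavioural difference the patch relies on can be illustrated with a small userspace model. This is only a sketch, not driver code: `struct mock_gpu`, `mock_read()`, `mock_read_fw()` and the two `mock_get_seqno_*()` helpers are hypothetical names invented for the illustration. It assumes, following the commit message, that the only relevant difference between the two read variants is whether forcewake is taken around the register access. It cannot reproduce the timing effect itself, only the bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device; not the i915 data structures. */
struct mock_gpu {
	uint32_t regs[16];        /* fake MMIO register file */
	uint32_t status_page[64]; /* fake hardware status page */
	unsigned forcewake_gets;  /* how often forcewake was acquired */
	unsigned mmio_reads;      /* how many register reads were issued */
};

/* Illustrative indices only; they do not match real register offsets. */
enum { MOCK_ACTHD = 3, MOCK_HWS_SEQNO = 4 };

/* Raw read: one register transaction, no forcewake handling. */
static uint32_t mock_read_fw(struct mock_gpu *gpu, unsigned reg)
{
	assert(reg < 16);
	gpu->mmio_reads++;
	return gpu->regs[reg];
}

/* Full read: acquire forcewake, then perform the same raw access. */
static uint32_t mock_read(struct mock_gpu *gpu, unsigned reg)
{
	gpu->forcewake_gets++;
	return mock_read_fw(gpu, reg);
}

/* Before the patch: the barrier read also takes forcewake. */
static uint32_t mock_get_seqno_old(struct mock_gpu *gpu, bool lazy_coherency)
{
	if (!lazy_coherency)
		(void)mock_read(gpu, MOCK_ACTHD);
	return gpu->status_page[MOCK_HWS_SEQNO];
}

/* After the patch: same register transaction, no forcewake. */
static uint32_t mock_get_seqno_new(struct mock_gpu *gpu, bool lazy_coherency)
{
	if (!lazy_coherency)
		(void)mock_read_fw(gpu, MOCK_ACTHD);
	return gpu->status_page[MOCK_HWS_SEQNO];
}
```

In this model both variants issue exactly one register read when `lazy_coherency` is false, so whatever delay the transaction provides is preserved; only the forcewake acquisition disappears. With `lazy_coherency` set, neither variant touches the register file.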