GitHub/moto-9609/android_kernel_motorola_exynos9610.git
7 years agodrm/i915/gvt: Make mmio_attribute as type u8 to save 1.5MB memory
Changbin Du [Tue, 6 Jun 2017 07:56:11 +0000 (15:56 +0800)]
drm/i915/gvt: Make mmio_attribute as type u8 to save 1.5MB memory

Type u8 is big enough to contain all MMIO attribute flags. As the
total MMIO size is 2MB so we saved 1.5MB memory.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: Cleanup struct intel_gvt_mmio_info
Changbin Du [Tue, 6 Jun 2017 07:56:10 +0000 (15:56 +0800)]
drm/i915/gvt: Cleanup struct intel_gvt_mmio_info

The size, length, addr_mask fields actually are not necessary. Every
tracked mmio has DWORD size, and addr_mask is a legacy field.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: Optimize MMIO register handling for some large MMIO blocks
Changbin Du [Tue, 6 Jun 2017 07:56:09 +0000 (15:56 +0800)]
drm/i915/gvt: Optimize MMIO register handling for some large MMIO blocks

Some of traced MMIO registers are a large continuous section. These
stuffed the MMIO lookup hash table and so waste lots of memory and
get much lower lookup performance.

Here we picked out these sections by special handling. These sections
include:
  o Display pipe registers, total 768.
  o The PVINFO page, total 1024.
  o MCHBAR_MIRROR, total 65536.
  o CSR_MMIO, total 3072.

So we removed 70,400 items from the hash table, and speed up guest
boot time by ~500ms.

v2:
  o add a local function find_mmio_block().
  o fix comments.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: add gtt_invalidate API to flush the GTT TLB
Chuanxiao Dong [Fri, 2 Jun 2017 07:34:24 +0000 (15:34 +0800)]
drm/i915/gvt: add gtt_invalidate API to flush the GTT TLB

add gtt_invalidate API to handle the GTT TLB flush instead of
hiding in write_pte64 function. This can avoid overkill when using
write_pte64

Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: Add runtime_pm get/put to proctect MMIO accessing
Chuanxiao Dong [Fri, 2 Jun 2017 07:34:23 +0000 (15:34 +0800)]
drm/i915/gvt: Add runtime_pm get/put to proctect MMIO accessing

In some cases, GVT-g is accessing MMIO without holding runtime_pm
and this patch can add the inline API for doing the runtime_pm get/put
to make sure when accessing HW MMIO the i915 HW is really powered on.

Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: remove redundant -Wall
Nick Desaulniers [Sun, 21 May 2017 07:15:27 +0000 (00:15 -0700)]
drm/i915/gvt: remove redundant -Wall

This flag is already set in the top level Makefile of the kernel.

Also, by having set CONFIG_DRM_I915_GVT, thereby appending -Wall to
ccflags, you undo all the -Wno-* cflags previously set in the Make
variable KBUILD_CFLAGS.

For example:

cc foo.c -Wall -Wno-format -Wall

resets -Wformat.

Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: Legacy HSW related MMIO handler clean up
fred gao [Thu, 25 May 2017 07:32:27 +0000 (15:32 +0800)]
drm/i915/gvt: Legacy HSW related MMIO handler clean up

remove all the legacy pre-BDW mmio handlers and the corresponding
usage/definition since pre-BDW platforms are not supported in GVT
environment.

v2:
- clean up all the left dirty code before BDW, e.g
  all D_HSW usage and itself, D_IVB, D_PRE_BDW. (Zhenyu)
v3:
- change is based on gvt-staging. (Zhenyu)

Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: Trigger scheduling after context complete
Ping Gao [Wed, 24 May 2017 01:14:11 +0000 (09:14 +0800)]
drm/i915/gvt: Trigger scheduling after context complete

The time based scheduler poll context busy status at every
micro-second during vGPU switch, it will make GPU idle for a while
when the context is very small and completed before the next
micro-second arrival. Trigger scheduling immediately after context
complete will eliminate GPU idle and improve performance.

Create two vGPU with same type, run Heaven simultaneously:
Before this patch:
 +---------+----------+----------+
 |         |  vGPU1   |   vGPU2  |
 +---------+----------+----------+
 |  Heaven |  357     |    354   |
 +-------------------------------+

After this patch:
 +---------+----------+----------+
 |         |  vGPU1   |   vGPU2  |
 +---------+----------+----------+
 |  Heaven |  397     |    398   |
 +-------------------------------+

v2: Let need_reschedule protect by gvt-lock.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: Support event based scheduling
Ping Gao [Wed, 24 May 2017 12:30:17 +0000 (20:30 +0800)]
drm/i915/gvt: Support event based scheduling

This patch decouple the time slice calculation and scheduler, let
other event be able to trigger scheduling without impact the
calculation for QoS.

v2: add only one new enum definition.
v3: fix typo.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: Delete gvt_dbg_cmd() in cmd_parser_exec()
Xiong Zhang [Mon, 22 May 2017 21:38:09 +0000 (05:38 +0800)]
drm/i915/gvt: Delete gvt_dbg_cmd() in cmd_parser_exec()

Since cmd message have been recorded in trace, gvt_dbg_cmd isn't
necessary. This will reduce much of dmesg as gvt_dbg_cmd is repeated
on each workload.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: Change flood gvt dmesg into trace
Xiong Zhang [Mon, 22 May 2017 21:38:08 +0000 (05:38 +0800)]
drm/i915/gvt: Change flood gvt dmesg into trace

Currently gvt dmesg is so heavy at drm.debug=0x2 that guest and
host almost couldn't run on xengt.

This patch transfer these repeated messages into trace, so dmesg
is light at drm.debug=0x2, and user could get the target message through
trace event and trace filter.

Suggested-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: clean up the unused last_ctx_submit_time of struct intel_vgpu
Changbin Du [Mon, 22 May 2017 09:46:47 +0000 (17:46 +0800)]
drm/i915/gvt: clean up the unused last_ctx_submit_time of struct intel_vgpu

Clean up it as it is not used now.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: add RING_INSTDONE and SC_INSTDONE mmio handler in GVT-g
Weinan Li [Fri, 19 May 2017 15:48:34 +0000 (23:48 +0800)]
drm/i915/gvt: add RING_INSTDONE and SC_INSTDONE mmio handler in GVT-g

kernel hangcheck needs to check RING_INSTDONE and SC_INSTDONE registers'
state to know if hardware is still running. In GVT-g environment, we need
to emulate these registers changing for all the guests although they are
not render owner. Here we return the physical state for all the guests,
then if INSTDONE is changing guest can know hardware is still running
although its workload is pending.

Read INSTDONE isn't one correct way to know if guest trigger gfx reset,
especially with Linux guest, it will read ACTH first, then check INSTDONE
and SUBSLICE registers to check if hardware is still running, at last
trigger gfx reset when it finds all the registers is frozen. In Windows
guest, read INSTDONE usually happens when OS detect TDR.

With the difference between Windows and Linux guest, "disable_warn_untrack"
may let debug log run into wrong state(Linux guest trigger hangcheck
with no ACTHD changed, then check INSTDONE), but actually there is no TDR
happened.

The new policy is always WARN with untrack MMIO r/w. Bad effect is many
noisy untrack mmio warning logs exist when real TDR happen. Even so you can
control the log output or not by setting the debug mask bit.

v2: remove log in instdone_mmio_read

Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: implement per-vm mmio switching optimization
Changbin Du [Thu, 4 May 2017 02:52:38 +0000 (10:52 +0800)]
drm/i915/gvt: implement per-vm mmio switching optimization

Commit ab9da627906a ("drm/i915: make context status notifier head be
per engine") gives us a chance to inspect every single request. Then
we can eliminate unnecessary mmio switching for same vGPU. We only
need mmio switching for different VMs (including host).

This patch introduced a new general API intel_gvt_switch_mmio() to
replace the old intel_gvt_load/restore_render_mmio(). This function
can be further optimized for vGPU to vGPU switching.

To support individual ring switch, we track the owner who occupy
each ring. When another VM or host request a ring we do the mmio
context switching. Otherwise no need to switch the ring.

This optimization is very useful if only one guest has plenty of
workloads and the host is mostly idle. The best case is no mmio
switching will happen.

v2:
  o fix missing ring switch issue. (chuanxiao)
  o support individual ring switch.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: refactor function intel_vgpu_submit_execlist
Changbin Du [Thu, 4 May 2017 10:36:54 +0000 (18:36 +0800)]
drm/i915/gvt: refactor function intel_vgpu_submit_execlist

The function intel_vgpu_submit_execlist could be more simpler. It
actually does:
  1) validate the submission. The first context must be valid,
     and all two must be privilege_access.
  2) submit valid contexts. The first one need emulate schedule_in.

We do not need a bitmap, valid desc copy valid_desc. Local variable
emulate_schedule_in also can be optimized out.

v2: dump desc content in err msg (Zhi Wang)

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915/gvt: rewrite the trace gvt:gvt_command using trace style approach
Changbin Du [Wed, 3 May 2017 01:20:10 +0000 (09:20 +0800)]
drm/i915/gvt: rewrite the trace gvt:gvt_command using trace style approach

The gvt:gvt_command trace involve unnecessary overhead even this trace is
not enabled. We need improve it.

The kernel trace infrastructure provide a full api to define a trace event.
We should leverage them if possible. And one important thing is that a trace
point should store raw data but not format string.

This patch include two part work:
1) Refactor the gvt_command trace definition, including:
  o only store raw trace data.
  o use __dynamic_array() to declare a variable size buffer.
  o use __print_array() to format raw cmd data.
  o rename vm_id as vgpu_id.

2) Improve the trace invoking, including:
  o remove the cycles calculation for handler. We can get this data
    by any perf tool.
  o do not make a backup for raw cmd data which just doesn't make sense.

With this patch, this trace has no overhead if it is not enabled. And we are
trace style now.

The final output example:
  gvt workload 0-211   [000] ...1   120.555964: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e880, raw cmd {0x4000000}
  gvt workload 0-211   [000] ...1   120.556014: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e884, raw cmd {0x7a000004,0x1004000,0xe1511018,0x0,0x7d,0x0}
  gvt workload 0-211   [000] ...1   120.556062: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e89c, raw cmd {0x7a000004,0x140000,0x0,0x0,0x0,0x0}
  gvt workload 0-211   [000] ...1   120.556110: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e8b4, raw cmd {0x10400002,0xe1511018,0x0,0x7d}

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
7 years agodrm/i915: Update DRIVER_DATE to 20170529
Daniel Vetter [Mon, 29 May 2017 07:00:58 +0000 (09:00 +0200)]
drm/i915: Update DRIVER_DATE to 20170529

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
7 years agodrm/i915: Keep the forcewake timer alive for 1ms past the most recent use
Chris Wilson [Fri, 26 May 2017 13:22:09 +0000 (14:22 +0100)]
drm/i915: Keep the forcewake timer alive for 1ms past the most recent use

Currently the timer is armed for 1ms after the first use and is killed
immediately, dropping the forcewake as early as possible. However, for
very frequent operations the forcewake dance has a large impact on
latency and keeping the timer alive until we are idle is preferred. To
achieve this, if we call intel_uncore_forcewake_get whilst the timer is
alive (repeated use), then set a flag to restart the timer on expiry
rather than drop the forcewake usage count. The timer is racy, the
consequence of the race is to expire the timer earlier than is now
desired but does not impact on correct behaviour. The offset the race
slightly, we set the active flag again on intel_uncore_forcewake_put.

The effect should be to reduce the jitter of reacquiring the fw every
1ms on a busy system. However, the cost is to keep the timer alive for
an extra 1ms on a nearly idle system. We chose to incur the jitter
previously to keep the timer off for as much as possible.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170526132209.14640-1-chris@chris-wilson.co.uk
7 years agodrm/i915/guc: capture GuC logs if FW fails to load
Daniele Ceraolo Spurio [Mon, 22 May 2017 17:50:28 +0000 (10:50 -0700)]
drm/i915/guc: capture GuC logs if FW fails to load

We're currently deleting the GuC logs if the FW fails to load, but those
are still useful to understand why the loading failed. Keeping the
object around allows us to access them after driver load is completed.

v2: keep the object around instead of using kernel memory (chris)
    don't store the logs in the gpu_error struct (Chris)
    add a check on guc_log_level to avoid snapshotting empty logs

v3: use separate debugfs for error log (Chris)

v4: rebased

v5: clean up obj selection, move err_load inside guc_log, move err_load
    cleanup, rename functions (Michal)

v6: move obj back to intel_guc, move functions to intel_uc.c, don't
    clear obj on new GuC load, free object only if enable_guc_loading
    is set (Michal)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1495475428-19295-1-git-send-email-daniele.ceraolospurio@intel.com
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Tested-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/guc: Introduce buffer based cmd transport
Michal Wajdeczko [Fri, 26 May 2017 11:13:25 +0000 (11:13 +0000)]
drm/i915/guc: Introduce buffer based cmd transport

Buffer based command transport can replace MMIO based mechanism.
It may be used to perform host-2-guc and guc-to-host communication.

Portions of this patch are based on work by:
 Michel Thierry <michel.thierry@intel.com>
 Robert Beckett <robert.beckett@intel.com>
 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

v2: use gem_object_pin_map (Chris)
    don't use DEBUG_RATELIMITED (Chris)
    don't track action stats (Chris)
    simplify next fence (Chris)
    use READ_ONCE (Chris)
    move blob allocation to new function (Chris)

v3: use static owner id (Daniele)
v4: but keep channel initialization generic (Daniele)
    and introduce owner_sub_id (Daniele)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170526111326.87280-3-michal.wajdeczko@intel.com
7 years agodrm/i915/guc: Disable send function on fini
Michal Wajdeczko [Fri, 26 May 2017 11:13:24 +0000 (11:13 +0000)]
drm/i915/guc: Disable send function on fini

In earlier patch 789a625 we were enabling send function only
after successful init. For completeness, we should make sure
that we disable it on fini.

v2: don't group steps by submission flag (Chris)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170526111326.87280-2-michal.wajdeczko@intel.com
7 years agodrm: Add definition for eDP backlight frequency
Puthikorn Voravootivat [Tue, 23 May 2017 22:38:04 +0000 (15:38 -0700)]
drm: Add definition for eDP backlight frequency

This patch adds the following definition
- Bit mask for EDP_PWMGEN_BIT_COUNT and min/max cap
  register which only use bit 0:4
- Base frequency (27 MHz) for backlight PWM frequency
  generator.

Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170523223805.46372-5-puthik@chromium.org
7 years agodrm/i915: Drop AUX backlight enable check for backlight control
Puthikorn Voravootivat [Tue, 23 May 2017 22:38:01 +0000 (15:38 -0700)]
drm/i915: Drop AUX backlight enable check for backlight control

There are some panel that
(1) does not support display backlight enable via AUX
(2) support display backlight adjustment via AUX
(3) support display backlight enable via eDP BL_ENABLE pin

The current driver required that (1) must be support to enable (2).
This patch drops that requirement.

Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170523223805.46372-2-puthik@chromium.org
7 years agodrm/i915: Consolidate #ifdef CONFIG_INTEL_IOMMU
Chris Wilson [Thu, 25 May 2017 12:16:12 +0000 (13:16 +0100)]
drm/i915: Consolidate #ifdef CONFIG_INTEL_IOMMU

We depend on intel_iommu_gfx_mapped for various workarounds, but that is
only available under an #ifdef CONFIG_INTEL_IOMMU. Refactor all the
cut-and-paste ifdefs to a common routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170525121612.2190-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
7 years agodrm/i915: Only GGTT vma may be pinned and prevent shrinking
Chris Wilson [Thu, 25 May 2017 07:25:28 +0000 (08:25 +0100)]
drm/i915: Only GGTT vma may be pinned and prevent shrinking

As only GGTT vma may be permanently pinned and are always at the head of
the object's vma list, as soon as we seen a ppGTT vma we can stop
searching for any_vma_pinned().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170525072528.11185-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
7 years agodrm/i915: Serialize GTT/Aperture accesses on BXT
Jon Bloomfield [Wed, 24 May 2017 15:54:11 +0000 (08:54 -0700)]
drm/i915: Serialize GTT/Aperture accesses on BXT

BXT has a H/W issue with IOMMU which can lead to system hangs when
Aperture accesses are queued within the GAM behind GTT Accesses.

This patch avoids the condition by wrapping all GTT updates in stop_machine
and using a flushing read prior to restarting the machine.

The stop_machine guarantees no new Aperture accesses can begin while
the PTE writes are being emmitted. The flushing read ensures that
any following Aperture accesses cannot begin until the PTE writes
have been cleared out of the GAM's fifo.

Only FOLLOWING Aperture accesses need to be separated from in flight
PTE updates. PTE Writes may follow tightly behind already in flight
Aperture accesses, so no flushing read is required at the start of
a PTE update sequence.

This issue was reproduced by running
igt/gem_readwrite and
igt/gem_render_copy
simultaneously from different processes, each in a tight loop,
with INTEL_IOMMU enabled.

This patch was originally published as:
drm/i915: Serialize GTT Updates on BXT

v2: Move bxt/iommu detection into static function
    Remove #ifdef CONFIG_INTEL_IOMMU protection
    Make function names more reflective of purpose
    Move flushing read into static function

v3: Tidy up for checkpatch.pl

Testcase: igt/gem_concurrent_blit
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1495641251-30022-1-git-send-email-jon.bloomfield@intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915: Convert i915_gem_object_ops->flags values to use BIT()
Chris Wilson [Tue, 23 May 2017 10:31:16 +0000 (11:31 +0100)]
drm/i915: Convert i915_gem_object_ops->flags values to use BIT()

Having just watched someone add a new value, 0x3, without realising that
the flags were bit values, I have come to appreciate the value in using
BIT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170523103116.32239-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
7 years agodrm/i915/selftests: Silence compiler warning in igt_ctx_exec
Chris Wilson [Tue, 23 May 2017 19:44:12 +0000 (20:44 +0100)]
drm/i915/selftests: Silence compiler warning in igt_ctx_exec

The compiler doesn't always spot the guard that object is allocated on
the first pass, leading to:

drivers/gpu/drm/i915/selftests/i915_gem_context.c: warning: 'obj' may be used uninitialized in this function [-Wuninitialized]:  => 370:8

v2: Make it more obvious by setting obj to NULL on the first pass and
any later pass where we need to reallocate.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 791ff39ae32a ("drm/i915: Live testing for context execution")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
c: <drm-intel-fixes@lists.freedesktop.org> # v4.12-rc1+
Link: http://patchwork.freedesktop.org/patch/msgid/20170523194412.1195-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
7 years agodrm/i915/guc: Skip port assign on first iteration of GuC dequeue
Michał Winiarski [Tue, 23 May 2017 10:23:59 +0000 (12:23 +0200)]
drm/i915/guc: Skip port assign on first iteration of GuC dequeue

If port[0] is occupied and we're trying to dequeue request from
different context, we will inevitably hit BUG_ON in port_assign.
Let's skip it - similar to what we're doing in execlists counterpart.

Fixes: 77f0d0e925e8a0 ("drm/i915/execlists: Pack the count into the low bits of the port.request")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Wajdeczko <michal.wajdeczko@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170523102400.9614-2-michal.winiarski@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915: Remove misleading comment in request_alloc
Michał Winiarski [Tue, 23 May 2017 10:23:58 +0000 (12:23 +0200)]
drm/i915: Remove misleading comment in request_alloc

Passing NULL ctx to request_alloc would lead to null-ptr-deref.

v2: Let's not replace the comment with a BUG_ON

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170523102400.9614-1-michal.winiarski@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/g33: Improve reset reliability
Mika Kuoppala [Mon, 22 May 2017 09:02:44 +0000 (12:02 +0300)]
drm/i915/g33: Improve reset reliability

We improved the reset reliablity on gen4 with
stopping all engines before commencing reset, in
commit 2c80353f3cd0 ("drm/i915/g4x: Improve gpu reset reliability")

Evidence indicates that this same trick works with g33.

v2: proper gen naming, comment readability (Chris)

Testcase: igt/gem_busy/*-hang #blb-e6850
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170522090244.2557-1-mika.kuoppala@intel.com
7 years agoRevert "drm/i915: Restore lost "Initialized i915" welcome message"
Daniel Vetter [Wed, 17 May 2017 13:15:57 +0000 (15:15 +0200)]
Revert "drm/i915: Restore lost "Initialized i915" welcome message"

This reverts commit bc5ca47c0af4f949ba889e666b7da65569e36093.

Gabriel put this back into generic code with

commit 75f6dfe3e652e1adef8cc1b073c89f3e22103a8f
Author: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Date:   Wed Dec 28 12:32:11 2016 -0200

    drm: Deduplicate driver initialization message

but somehow he missed Chris' patch to add the message meanwhile.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101025
Fixes: 75f6dfe3e652 ("drm: Deduplicate driver initialization message")
Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.11+
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517131557.7836-1-daniel.vetter@ffwll.ch
7 years agodrm/i915/huc: Update GLK HuC version
Anusha Srivatsa [Thu, 18 May 2017 17:47:11 +0000 (10:47 -0700)]
drm/i915/huc: Update GLK HuC version

Update version of HuC from 01.07.1748 to the
version 02.00.1748

Cc: Ander Conselvan <ander.conselvan.de.oliveira@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1495129631-2930-1-git-send-email-anusha.srivatsa@intel.com
7 years agodrm/i915: Check for allocation failure
Colin Ian King [Fri, 19 May 2017 17:56:17 +0000 (18:56 +0100)]
drm/i915: Check for allocation failure

The memory allocation for C is not being null checked and hence we
could end up with a null pointer dereference. Fix this with a null
pointer check. (I really should have noticed this when I was fixing an
earlier issue.)

Detected by CoverityScan, CID#1436406 ("Dereference null return")

Fixes: 47624cc3301b60 ("drm/i915: Import the kfence selftests for i915_sw_fence")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170519175617.7036-1-colin.king@canonical.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/guc: Remove action status and statistics from debugfs
Michal Wajdeczko [Mon, 15 May 2017 17:06:09 +0000 (17:06 +0000)]
drm/i915/guc: Remove action status and statistics from debugfs

Usefulness of these stats was over-advertised.

v2: remove duplicated engine stats (Chris)

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170515170610.35528-1-michal.wajdeczko@intel.com
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/g4x: Improve gpu reset reliability
Mika Kuoppala [Fri, 19 May 2017 09:13:40 +0000 (12:13 +0300)]
drm/i915/g4x: Improve gpu reset reliability

ELK seems to very picky about the preconditions to reset.
Evidence on Eaglelake (8086:2e12 (rev 03)) shows that it does
not like if reset occurs when there is active ring.

Ville found out that there is workaround with name
'WaMediaResetMainRingCleanup' which suggests that we need to
cleanup rings before resetting. It is unclear what cleanup
exactly means but evidence shows that stopping the ring
does have an effect on reset reliability. This patch makes
reset successful on hangs induced by chained batches (the igt ones).
Note that if the hang is inside a shader, it is possible
that our attempts to stop the ring achieves anything.

v2: zero ctl,head,tail also. bug ref. use driver debugs (Chris)
v3: specify platform on testcases, comment tidyup (Chris)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100942
Testcase: igt/gem_busy/*-hang #elk
Testcase: igt/gem_ringfill/hang-* #elk
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170519091340.21439-1-mika.kuoppala@intel.com
7 years agodrm/i915/guc: Remove last submission result from debugfs
Michal Wajdeczko [Thu, 18 May 2017 11:31:04 +0000 (11:31 +0000)]
drm/i915/guc: Remove last submission result from debugfs

Debugfs does not seems to be a right place to display transient data.
If we want to capture errors, we should log them.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170518113104.54400-3-michal.wajdeczko@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/guc: Remove failed doorbell stat from debugfs
Michal Wajdeczko [Thu, 18 May 2017 11:31:03 +0000 (11:31 +0000)]
drm/i915/guc: Remove failed doorbell stat from debugfs

This stat is almost always zero unless fatal error occurs,
which should be reported by other means anyway.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170518113104.54400-2-michal.wajdeczko@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/guc: Remove stale comment for q_fail
Michal Wajdeczko [Thu, 18 May 2017 11:31:02 +0000 (11:31 +0000)]
drm/i915/guc: Remove stale comment for q_fail

This member was dropped long time ago.

Fixes: 774439e1 ("drm/i915/guc: re-optimise i915_guc_client layout")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170518113104.54400-1-michal.wajdeczko@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915: Reorder media/render reset on g4x
Chris Wilson [Thu, 18 May 2017 20:48:11 +0000 (21:48 +0100)]
drm/i915: Reorder media/render reset on g4x

Ville found a reference to WaMediaResetBeforeFullReset which we presume
means that we should simply do the media reset first.

References: https://bugs.freedesktop.org/show_bug.cgi?id=100942
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170518204811.7408-2-chris@chris-wilson.co.uk
7 years agodrm/i915: Try harder to reset the GPU
Chris Wilson [Thu, 18 May 2017 20:48:10 +0000 (21:48 +0100)]
drm/i915: Try harder to reset the GPU

Repeat the reset a couple of times if at first we do not succeed.

v2: differentiate which path/engine failed with a debug message

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170513083726.502-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170518204811.7408-1-chris@chris-wilson.co.uk
7 years agodrm/i915/selftests: Pretend to be a gfx pci device
Chris Wilson [Thu, 18 May 2017 09:46:15 +0000 (10:46 +0100)]
drm/i915/selftests: Pretend to be a gfx pci device

Set the class on our mock pci device to GFX. This should be useful for
utilities like intel-iommu that special case gfx devices.

References: https://bugs.freedesktop.org/show_bug.cgi?id=101080
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170518094638.5469-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
7 years agodrm/i915: Check C for null pointer rather than B
Colin Ian King [Thu, 18 May 2017 13:39:42 +0000 (14:39 +0100)]
drm/i915: Check C for null pointer rather than B

There are two occasions where pointer B is being check for a NULL
when it should be pointer C instead. Fix these.

Detected by CoverityScan, CID#1436348,1436349 ("Logically Dead Code")

Fixes: 47624cc3301b60 ("drm/i915: Import the kfence selftests for i915_sw_fence")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170518133942.5660-1-colin.king@canonical.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915: Fix new -Wint-in-bool-context gcc compiler warning
Hans de Goede [Thu, 18 May 2017 11:06:44 +0000 (13:06 +0200)]
drm/i915: Fix new -Wint-in-bool-context gcc compiler warning

This commit fixes the following compiler warning:

drivers/gpu/drm/i915/intel_dsi.c: In function ‘intel_dsi_prepare’:
drivers/gpu/drm/i915/intel_dsi.c:1487:23: warning:
    ?: using integer constants in boolean context [-Wint-in-bool-context]
       PORT_A ? PORT_C : PORT_A),

Fixes: f4c3a88e5f04 ("drm/i915: Tighten mmio arrays for MIPI_PORT")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170518110644.9902-1-hdegoede@redhat.com
7 years agodrm/i915/skl+: use linetime latency if ddb size is not available
Kumar, Mahesh [Wed, 17 May 2017 11:58:29 +0000 (17:28 +0530)]
drm/i915/skl+: use linetime latency if ddb size is not available

This patch make changes to use linetime latency if allocated
DDB size during plane watermark calculation is not available.
linetime is the time, display engine takes to fetch one line worth of
pixels with given pixel clock rate.
This is required to implement new DDB allocation algorithm.

In New Algorithm DDB is allocated based on WM values, because of which
number of DDB blocks will not be available during WM calculation,
So this "linetime latency" is suggested by SV/HW team to be used during
switch-case for WM blocks selection.
linetime latency us = pipe horizontal total pixels/adjusted pixel rate MHz

Changes since v1:
 - Rebase on top of Paulo's patch series
Changes since v2:
 - Fix if-else condition (pointed by Maarten)
Changes since v3:
 - Use common function for timetime_us calculation (Paulo)
 - rebase on drm-tip
Changes since v4:
 - Use consistent name for fixed_point operation
Changes since v5:
 - Improve commit message
 - rename skl_get_linetime_us to intel_get_linetime_us
 - fix watermark result selection (Matt)

Signed-off-by: "Mahesh Kumar" <mahesh1.kumar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-11-mahesh1.kumar@intel.com
7 years agodrm/i915/skl+: Perform wm level calculations in separate function
Kumar, Mahesh [Wed, 17 May 2017 11:58:28 +0000 (17:28 +0530)]
drm/i915/skl+: Perform wm level calculations in separate function

Instead of iterating over planes & wm levels in a single function use
skl_compute_wm_level function to interate over WM levels.
Change name of function to skl_compute_wm_levels (Matt).

These changes are to clean-up WM code & will help in making only new
ddb algorithm related changes in later patch in series.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-10-mahesh1.kumar@intel.com
7 years agodrm/i915/skl+: Watermark calculation cleanup
Kumar, Mahesh [Wed, 17 May 2017 11:58:27 +0000 (17:28 +0530)]
drm/i915/skl+: Watermark calculation cleanup

This patch cleanup/reorganises the watermark calculation functions.
This patch make use of already available macro
"drm_atomic_crtc_state_for_each_plane_state" to walk through
plane_state list instead of calculating plane_state in function itself.

This restructuring will help later patch for new DDB allocation
algorithm to do only algo related changes.

Changes from V1:
 - split the patch in two parts as per Matt's comment

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-9-mahesh1.kumar@intel.com
7 years agodrm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
Kumar, Mahesh [Wed, 17 May 2017 11:58:26 +0000 (17:28 +0530)]
drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation

DDB minimum requirement of crtc configuration (cumulative of all the
enabled planes in crtc) may exceed the allocated DDB for crtc/pipe.
This patch make changes to fail the flip/ioctl if minimum requirement
for pipe exceeds the total ddb allocated to the pipe.
Previously it succeeded but making alloc_size a negative value. Which
will make subsequent calculations for plane ddb allocation bogus & may
lead to screen corruption or system hang.

Changes from V1:
 - Improve commit message as per Ander's comment
 - Remove extra parentheses (Ander)

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-8-mahesh1.kumar@intel.com
7 years agodrm/i915/skl+: no need to memset again
Kumar, Mahesh [Wed, 17 May 2017 11:58:25 +0000 (17:28 +0530)]
drm/i915/skl+: no need to memset again

We are already doing memset of ddb structure at the begining of skl_allocate_pipe_ddb
function, No need to again do a memset.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-7-mahesh1.kumar@intel.com
7 years agodrm/i915/skl: Fail the flip if no FB for WM calculation
Kumar, Mahesh [Wed, 17 May 2017 11:58:24 +0000 (17:28 +0530)]
drm/i915/skl: Fail the flip if no FB for WM calculation

Fail the flip if no FB is present but plane_state is set as visible.
Above is not a valid combination so instead of continue fail the flip.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-6-mahesh1.kumar@intel.com
7 years agodrm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
Kumar, Mahesh [Wed, 17 May 2017 11:58:23 +0000 (17:28 +0530)]
drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point

This patch make changes to calculate adjusted plane pixel rate &
plane downscale amount using fixed_point functions available.
This patch will give uniformity in code, & will help to avoid mixing of
32bit uint32_t variable for fixed-16.16 with fixed_16_16_t variables in
later patch in the series.

Changes from V1:
 - Rebase based on wrapper name change
 - Remove unnecessary comment

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-5-mahesh1.kumar@intel.com
7 years agodrm/i915: Use fixed_16_16 wrapper for division operation
Kumar, Mahesh [Wed, 17 May 2017 11:58:22 +0000 (17:28 +0530)]
drm/i915: Use fixed_16_16 wrapper for division operation

Don't use fixed_16_16 structure members directly, instead use wrapper to
perform fixed_16_16 division operation.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-4-mahesh1.kumar@intel.com
7 years agodrm/i915: Add more wrapper for fixed_point_16_16 operations
Kumar, Mahesh [Wed, 17 May 2017 11:58:21 +0000 (17:28 +0530)]
drm/i915: Add more wrapper for fixed_point_16_16 operations

This patch adds few wrapper to perform fixed_point_16_16 operations
mul_round_up_u32_fixed16 : Multiplies u32 and fixed_16_16_t variables
   & returns u32 result with rounding-up.
mul_fixed16 : Multiplies two fixed_16_16_t variable & returns fixed_16_16
div_round_up_fixed16 : Perform division operation on fixed_16_16_t
       variables & return u32 result with round-off
div_round_up_u32_fixed16 : devide uint32_t variable by fixed_16_16 variable
   and round_up the result to uint32_t.

These wrappers will be used by later patches in the series.

Changes from V1:
 - Rename wrapper as per Matt's comment
Changes from V2:
 - Fix indentation

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-3-mahesh1.kumar@intel.com
7 years agodrm/i915: fix naming of fixed_16_16 wrapper.
Kumar, Mahesh [Wed, 17 May 2017 11:58:20 +0000 (17:28 +0530)]
drm/i915: fix naming of fixed_16_16 wrapper.

fixed_16_16_div_round_up(_u64), wrapper for fixed_16_16 division
operation don't really round_up the result. Wrapper round_up only the
fraction part of the result to make it 16-bit.
This patch eliminates round_up keyword from the wrapper.

Later patch will introduce the new wrapper to do rounding-off the result
and give unt32_t output to cleanup mix use of fixed_16_16_t & uint32_t
variables.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-2-mahesh1.kumar@intel.com
7 years agodrm/i915: Don't force serialisation on marking up execlists irq posted
Chris Wilson [Wed, 17 May 2017 12:10:07 +0000 (13:10 +0100)]
drm/i915: Don't force serialisation on marking up execlists irq posted

Since we coordinate with the execlists tasklet using a locked schedule
operation that ensures that after we set the engine->irq_posted we
always have an invocation of the tasklet, we do not need to use a locked
operation to set the engine->irq_posted itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-12-chris@chris-wilson.co.uk
7 years agodrm/i915: Stop inlining the execlists IRQ handler
Chris Wilson [Wed, 17 May 2017 12:10:06 +0000 (13:10 +0100)]
drm/i915: Stop inlining the execlists IRQ handler

As the handler is now quite complex, involving a few atomics, the cost
of the function preamble is negligible in comparison and so we should
leave the function out-of-line for better I$.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-11-chris@chris-wilson.co.uk
7 years agodrm/i915/execlists: Reduce lock contention between schedule/submit_request
Chris Wilson [Wed, 17 May 2017 12:10:05 +0000 (13:10 +0100)]
drm/i915/execlists: Reduce lock contention between schedule/submit_request

If we do not require to perform priority bumping, and we haven't yet
submitted the request, we can update its priority in situ and skip
acquiring the engine locks -- thus avoiding any contention between us
and submit/execute.

v2: Remove the stack element from the list if we can do the early
assignment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-10-chris@chris-wilson.co.uk
7 years agodrm/i915: Create a kmem_cache to allocate struct i915_priolist from
Chris Wilson [Wed, 17 May 2017 12:10:04 +0000 (13:10 +0100)]
drm/i915: Create a kmem_cache to allocate struct i915_priolist from

The i915_priolist are allocated within an atomic context on a path where
we wish to minimise latency. If we use a dedicated kmem_cache, we have
the advantage of a local freelist from which to service new requests
that should keep the latency impact of an allocation small. Though
currently we expect the majority of requests to be at default priority
(and so hit the preallocate priolist), once userspace starts using
priorities they are likely to use many fine grained policies improving
the utilisation of a private slab.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-9-chris@chris-wilson.co.uk
7 years agodrm/i915: Split execlist priority queue into rbtree + linked list
Chris Wilson [Wed, 17 May 2017 12:10:03 +0000 (13:10 +0100)]
drm/i915: Split execlist priority queue into rbtree + linked list

All the requests at the same priority are executed in FIFO order. They
do not need to be stored in the rbtree themselves, as they are a simple
list within a level. If we move the requests at one priority into a list,
we can then reduce the rbtree to the set of priorities. This should keep
the height of the rbtree small, as the number of active priorities can not
exceed the number of active requests and should be typically only a few.

Currently, we have ~2k possible different priority levels, that may
increase to allow even more fine grained selection. Allocating those in
advance seems a waste (and may be impossible), so we opt for allocating
upon first use, and freeing after its requests are depleted. To avoid
the possibility of an allocation failure causing us to lose a request,
we preallocate the default priority (0) and bump any request to that
priority if we fail to allocate it the appropriate plist. Having a
request (that is ready to run, so not leading to corruption) execute
out-of-order is better than leaking the request (and its dependency
tree) entirely.

There should be a benefit to reducing execlists_dequeue() to principally
using a simple list (and reducing the frequency of both rbtree iteration
and balancing on erase) but for typical workloads, request coalescing
should be small enough that we don't notice any change. The main gain is
from improving PI calls to schedule, and the explicit list within a
level should make request unwinding simpler (we just need to insert at
the head of the list rather than the tail and not have to make the
rbtree search more complicated).

v2: Avoid use-after-free when deleting a depleted priolist

v3: Michał found the solution to handling the allocation failure
gracefully. If we disable all priority scheduling following the
allocation failure, those requests will be executed in fifo and we will
ensure that this request and its dependencies are in strict fifo (even
when it doesn't realise it is only a single list). Normal scheduling is
restored once we know the device is idle, until the next failure!
Suggested-by: Michał Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-8-chris@chris-wilson.co.uk
7 years agodrm/i915: Use a define for the default priority [0]
Chris Wilson [Wed, 17 May 2017 12:10:02 +0000 (13:10 +0100)]
drm/i915: Use a define for the default priority [0]

Explicitly assign the default priority, and give it a name. After much
discussion, we have chosen to call it I915_PRIORITY_NORMAL!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-7-chris@chris-wilson.co.uk
7 years agodrm/i915: Don't mark an execlists context-switch when idle
Chris Wilson [Wed, 17 May 2017 12:10:01 +0000 (13:10 +0100)]
drm/i915: Don't mark an execlists context-switch when idle

If we *know* that the engine is idle, i.e. we have not more contexts in
flight, we can skip any spurious CSB idle interrupts. These spurious
interrupts seem to arrive long after we assert that the engines are
completely idle, triggering later assertions:

[  178.896646] intel_engine_is_idle(bcs): interrupt not handled, irq_posted=2
[  178.896655] ------------[ cut here ]------------
[  178.896658] kernel BUG at drivers/gpu/drm/i915/intel_engine_cs.c:226!
[  178.896661] invalid opcode: 0000 [#1] SMP
[  178.896663] Modules linked in: i915(E) x86_pkg_temp_thermal(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) intel_gtt(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) aesni_intel(E) prime_numbers(E) evdev(E) aes_x86_64(E) drm(E) crypto_simd(E) cryptd(E) glue_helper(E) mei_me(E) mei(E) lpc_ich(E) efivars(E) mfd_core(E) battery(E) video(E) acpi_pad(E) button(E) tpm_tis(E) tpm_tis_core(E) tpm(E) autofs4(E) i2c_i801(E) fan(E) thermal(E) i2c_designware_platform(E) i2c_designware_core(E)
[  178.896694] CPU: 1 PID: 522 Comm: gem_exec_whispe Tainted: G            E   4.11.0-rc5+ #14
[  178.896702] task: ffff88040aba8d40 task.stack: ffffc900003f0000
[  178.896722] RIP: 0010:intel_engine_init_global_seqno+0x1db/0x1f0 [i915]
[  178.896725] RSP: 0018:ffffc900003f3ab0 EFLAGS: 00010246
[  178.896728] RAX: 0000000000000000 RBX: ffff88040af54000 RCX: 0000000000000000
[  178.896731] RDX: ffff88041ec933e0 RSI: ffff88041ec8cc48 RDI: ffff88041ec8cc48
[  178.896734] RBP: ffffc900003f3ac8 R08: 0000000000000000 R09: 000000000000047d
[  178.896736] R10: 0000000000000040 R11: ffff88040b344f80 R12: 0000000000000000
[  178.896739] R13: ffff88040bce0000 R14: ffff88040bce52d8 R15: ffff88040bce0000
[  178.896742] FS:  00007f2cccc2d8c0(0000) GS:ffff88041ec80000(0000) knlGS:0000000000000000
[  178.896746] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  178.896749] CR2: 00007f41ddd8f000 CR3: 000000040bb03000 CR4: 00000000001406e0
[  178.896752] Call Trace:
[  178.896768]  reset_all_global_seqno.part.33+0x4e/0xd0 [i915]
[  178.896782]  i915_gem_request_alloc+0x304/0x330 [i915]
[  178.896795]  i915_gem_do_execbuffer+0x8a1/0x17d0 [i915]
[  178.896799]  ? remove_wait_queue+0x48/0x50
[  178.896812]  ? i915_wait_request+0x300/0x590 [i915]
[  178.896816]  ? wake_up_q+0x70/0x70
[  178.896819]  ? refcount_dec_and_test+0x11/0x20
[  178.896823]  ? reservation_object_add_excl_fence+0xa5/0x100
[  178.896835]  i915_gem_execbuffer2+0xab/0x1f0 [i915]
[  178.896844]  drm_ioctl+0x1e6/0x460 [drm]
[  178.896858]  ? i915_gem_execbuffer+0x260/0x260 [i915]
[  178.896862]  ? dput+0xcf/0x250
[  178.896866]  ? full_proxy_release+0x66/0x80
[  178.896869]  ? mntput+0x1f/0x30
[  178.896872]  do_vfs_ioctl+0x8f/0x5b0
[  178.896875]  ? ____fput+0x9/0x10
[  178.896878]  ? task_work_run+0x80/0xa0
[  178.896881]  SyS_ioctl+0x3c/0x70
[  178.896885]  entry_SYSCALL_64_fastpath+0x17/0x98
[  178.896888] RIP: 0033:0x7f2ccb455ca7
[  178.896890] RSP: 002b:00007ffcabec72d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  178.896894] RAX: ffffffffffffffda RBX: 000055f897a44b90 RCX: 00007f2ccb455ca7
[  178.896897] RDX: 00007ffcabec74a0 RSI: 0000000040406469 RDI: 0000000000000003
[  178.896900] RBP: 00007f2ccb70a440 R08: 00007f2ccb70d0a4 R09: 0000000000000000
[  178.896903] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  178.896905] R13: 000055f89782d71a R14: 00007ffcabecf838 R15: 0000000000000003
[  178.896908] Code: 00 31 d2 4c 89 ef 8d 70 48 41 ff 95 f8 06 00 00 e9 68 fe ff ff be 0f 00 00 00 48 c7 c7 48 dc 37 a0 e8 fa 33 d6 e0 e9 0b ff ff ff <0f> 0b 0f 0b 0f 0b 0f 0b 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00

On the other hand, by ignoring the interrupt do we risk running out of
space in CSB ring? Testing for a few hours suggests not, i.e. that we
only seem to get the odd delayed CSB idle notification.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-6-chris@chris-wilson.co.uk
7 years agodrm/i915/execlists: Pack the count into the low bits of the port.request
Chris Wilson [Wed, 17 May 2017 12:10:00 +0000 (13:10 +0100)]
drm/i915/execlists: Pack the count into the low bits of the port.request

add/remove: 1/1 grow/shrink: 5/4 up/down: 391/-578 (-187)
function                                     old     new   delta
execlists_submit_ports                       262     471    +209
port_assign.isra                               -     136    +136
capture                                     6344    6359     +15
reset_common_ring                            438     452     +14
execlists_submit_request                     228     238     +10
gen8_init_common_ring                        334     341      +7
intel_engine_is_idle                         106     105      -1
i915_engine_info                            2314    2290     -24
__i915_gem_set_wedged_BKL                    485     411     -74
intel_lrc_irq_handler                       1789    1604    -185
execlists_update_context                     294       -    -294

The most important change there is the improve to the
intel_lrc_irq_handler and excclist_submit_ports (net improvement since
execlists_update_context is now inlined).

v2: Use the port_api() for guc as well (even though currently we do not
pack any counters in there, yet) and hide all port->request_count inside
the helpers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-5-chris@chris-wilson.co.uk
7 years agodrm/i915: Redefine ptr_pack_bits() and friends
Chris Wilson [Wed, 17 May 2017 12:09:59 +0000 (13:09 +0100)]
drm/i915: Redefine ptr_pack_bits() and friends

Rebrand the current (pointer | bits) pack/unpack utility macros as
explicit bit twiddling for PAGE_SIZE so that we can use the more
flexible underlying macros for different bits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-4-chris@chris-wilson.co.uk
7 years agodrm/i915: Make ptr_unpack_bits() more function-like
Chris Wilson [Wed, 17 May 2017 12:09:58 +0000 (13:09 +0100)]
drm/i915: Make ptr_unpack_bits() more function-like

ptr_unpack_bits() is a function-like macro, as such it is meant to be
replaceable by a function. In this case, we should be passing in the
out-param as a pointer.

Bizarrely this does affect code generation:

function                                     old     new   delta
i915_gem_object_pin_map                      409     389     -20

An improvement(?) in this case, but one can't help wonder what
strict-aliasing optimisations we are preventing.

The generated code looks identical in using ptr_unpack_bits (no extra
motions to stack, the pointer and bits appear to be kept in registers),
the difference appears to be code ordering and with a reorder it is able
to use smaller forward jumps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-3-chris@chris-wilson.co.uk
7 years agodrm/i915: Import the kfence selftests for i915_sw_fence
Chris Wilson [Wed, 17 May 2017 12:09:57 +0000 (13:09 +0100)]
drm/i915: Import the kfence selftests for i915_sw_fence

A long time ago, I wrote some selftests for the struct kfence idea. Now
that we have infrastructure in i915/igt for running kselftests, include
some for i915_sw_fence.

v2: INIT_WORK_ONSTACK/destroy_work_on_stack (Mika)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-2-chris@chris-wilson.co.uk
7 years agodrm/i915: Remove kref from i915_sw_fence
Chris Wilson [Wed, 17 May 2017 12:09:56 +0000 (13:09 +0100)]
drm/i915: Remove kref from i915_sw_fence

My original intention was for i915_sw_fence to be the base class and
provide the reference count for the container. This was from starting
with a design to handle async_work. In practice, for i915 we embed
fences into structs which have their own independent reference counting,
making the i915_sw_fence.kref duplicitous. If we remove the kref, we
remove the i915_sw_fence's ability to free itself and its independence,
it can only exist within a container and must be supplied with a
callback to handle its release.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-1-chris@chris-wilson.co.uk
7 years agodrm/i915/gen9: Reintroduce WaEnableYV12BugFixInHalfSliceChicken7
Arkadiusz Hiler [Fri, 12 May 2017 11:20:15 +0000 (13:20 +0200)]
drm/i915/gen9: Reintroduce WaEnableYV12BugFixInHalfSliceChicken7

This basically reverts commit 465418c6064c
("drm/i915/gen9: Remove WaEnableYV12BugFixInHalfSliceChicken7")
with small addition - marking it as affecting GLK as well.

It was incorrectly considered fixed in production steppings.

References: HSD#2126385, HSD#2131381, HSDES#1504433555, BSID#0764
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[Mika: s/KBL/GLK on commit message]
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170512112015.19082-1-arkadiusz.hiler@intel.com
7 years agodrm/i915: use vma->size for appgtt allocate_va_range
Matthew Auld [Tue, 16 May 2017 08:55:14 +0000 (09:55 +0100)]
drm/i915: use vma->size for appgtt allocate_va_range

For the aliasing ppgtt we clear the va range up to vma->size, but seem
to allocate up to vma->node.size, which is a little inconsistent given
that vma->node.size >= vma->size. Not that is really matters all that
much since we preallocate anyway, but for consistency just use
vma->size.

Fixes: ff685975d97f ("drm/i915: Move allocate_va_range to GTT")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170516085514.5853-1-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agogpu: drm: i915: compress logic into one line
Gustavo A. R. Silva [Mon, 15 May 2017 22:00:28 +0000 (17:00 -0500)]
gpu: drm: i915: compress logic into one line

Simplify logic to avoid unnecessary variable declaration and assignment.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170515220028.GA15149@embeddedgus
7 years agogpu: drm: i915: remove dead code
Gustavo A. R. Silva [Mon, 15 May 2017 21:56:05 +0000 (16:56 -0500)]
gpu: drm: i915: remove dead code

Local variable has_reduced_clock is assigned to a constant value and it is
never updated again. Remove this variable and the dead code it guards.

Addresses-Coverity-ID: 1362230
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170515215605.GA14963@embeddedgus
7 years agodrm/i915/guc:fix spelling mistake: "adddress" -> "address"
Colin Ian King [Tue, 16 May 2017 09:22:35 +0000 (10:22 +0100)]
drm/i915/guc:fix spelling mistake: "adddress" -> "address"

Trivial fix to spelling mistake in seq_printf message.

Fixes: a8b9370fc79c1 ("drm/i915/guc: Dump the GuC stage descriptor pool in debugfs")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170516092235.28640-1-colin.king@canonical.com
7 years agodrm/i915/glk: Calculate high/low switch count for GLK
Madhav Chauhan [Tue, 9 May 2017 13:29:24 +0000 (18:59 +0530)]
drm/i915/glk: Calculate high/low switch count for GLK

As per BSPEC, high/low switch count to be programmed in
terms of byteclock using exit_zero_count and prep_count.
For Geminilake exit/prep counts are already calculated
in terms of byteclock. This patch calculates high/low
switch count using counts value in byteclock, old calculation
leads to screen flicker/shift issue while resuming from S3/S4.

Signed-off-by: Madhav Chauhan <madhav.chauhan@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1494336565-19185-1-git-send-email-madhav.chauhan@intel.com
7 years agodrm/i915: Fixup 64bit divides in timelines selftest
Chris Wilson [Sat, 13 May 2017 09:41:54 +0000 (10:41 +0100)]
drm/i915: Fixup 64bit divides in timelines selftest

Some 64b divides snuck in when doing the prng timing compensation.

Fixes: 4797948071f6 ("drm/i915: Squash repeated awaits on the same fence")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170513094154.3581-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
7 years agodrm/i915: Update DRIVER_DATE to 20170515
Daniel Vetter [Mon, 15 May 2017 07:11:48 +0000 (09:11 +0200)]
drm/i915: Update DRIVER_DATE to 20170515

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
7 years agodrm/i915/perf: rate limit spurious oa report notice
Robert Bragg [Thu, 11 May 2017 15:43:31 +0000 (16:43 +0100)]
drm/i915/perf: rate limit spurious oa report notice

This change is pre-emptively aiming to avoid a potential cause of kernel
logging noise in case some condition were to result in us seeing invalid
OA reports.

The workaround for the OA unit's tail pointer race condition is what
avoids the primary known cause of invalid reports being seen and with
that in place we aren't expecting to see this notice but it can't be
entirely ruled out.

Just in case some condition does lead to the notice then it's likely
that it will be triggered repeatedly while attempting to append a
sequence of reports and depending on the configured OA sampling
frequency that might be a large number of repeat notices.

v2: (Chris) avoid inconsistent warning on throttle with
    printk_ratelimit()
v3: (Matt) init and summarise with stream init/close not driver init/fini

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-9-lionel.g.landwerlin@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/perf: better pipeline aged/aging tail updates
Robert Bragg [Thu, 11 May 2017 15:43:30 +0000 (16:43 +0100)]
drm/i915/perf: better pipeline aged/aging tail updates

This updates the tail pointer race workaround handling to updating the
'aged' pointer before looking to start aging a new one. There's the
possibility that there is already new data available and so we can
immediately start aging a new pointer without having to first wait for a
later hrtimer callback (and then another to age).

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-8-lionel.g.landwerlin@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/perf: improve invalid OA format debug message
Robert Bragg [Thu, 11 May 2017 15:43:29 +0000 (16:43 +0100)]
drm/i915/perf: improve invalid OA format debug message

A minor improvement to debugging output

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-7-lionel.g.landwerlin@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/perf: improve tail race workaround
Robert Bragg [Thu, 11 May 2017 15:43:28 +0000 (16:43 +0100)]
drm/i915/perf: improve tail race workaround

There's a HW race condition between OA unit tail pointer register
updates and writes to memory whereby the tail pointer can sometimes get
ahead of what's been written out to the OA buffer so far (in terms of
what's visible to the CPU).

Although this can be observed explicitly while copying reports to
userspace by checking for a zeroed report-id field in tail reports, we
want to account for this earlier, as part of the _oa_buffer_check to
avoid lots of redundant read() attempts.

Previously the driver used to define an effective tail pointer that
lagged the real pointer by a 'tail margin' measured in bytes derived
from OA_TAIL_MARGIN_NSEC and the configured sampling frequency.
Unfortunately this was flawed considering that the OA unit may also
automatically generate non-periodic reports (such as on context switch)
or the OA unit may be enabled without any periodic sampling.

This improves how we define a tail pointer for reading that lags the
real tail pointer by at least %OA_TAIL_MARGIN_NSEC nanoseconds, which
gives enough time for the corresponding reports to become visible to the
CPU.

The driver now maintains two tail pointers:
 1) An 'aging' tail with an associated timestamp that is tracked until we
    can trust the corresponding data is visible to the CPU; at which point
    it is considered 'aged'.
 2) An 'aged' tail that can be used for read()ing.

The two separate pointers let us decouple read()s from tail pointer aging.

The tail pointers are checked and updated at a limited rate within a
hrtimer callback (the same callback that is used for delivering POLLIN
events) and since we're now measuring the wall clock time elapsed since
a given tail pointer was read the mechanism no longer cares about
the OA unit's periodic sampling frequency.

The natural place to handle the tail pointer updates was in
gen7_oa_buffer_is_empty() which is called as part of blocking reads and
the hrtimer callback used for polling, and so this was renamed to
oa_buffer_check() considering the added side effect while checking
whether the buffer contains data.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-6-lionel.g.landwerlin@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/perf: no head/tail ref in gen7_oa_read
Robert Bragg [Thu, 11 May 2017 15:43:27 +0000 (16:43 +0100)]
drm/i915/perf: no head/tail ref in gen7_oa_read

This avoids redundantly passing an (inout) head and tail pointer to
gen7_append_oa_reports() from gen7_oa_read which doesn't need to
reference either itself.

Moving the head/tail reads and writes into gen7_append_oa_reports should
have no functional effect except to avoid some redundant head pointer
writes in cases where nothing was copied to userspace.

This is a stepping stone towards updating how the head and tail pointer
state is managed to improve the workaround for the OA unit's tail
pointer race. It reduces the number of places we need to read/write the
head and tail pointers.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-5-lionel.g.landwerlin@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/perf: avoid read back of head register
Robert Bragg [Thu, 11 May 2017 15:43:26 +0000 (16:43 +0100)]
drm/i915/perf: avoid read back of head register

There's no need for the driver to keep reading back the head pointer
from hardware since the hardware doesn't update it automatically. This
way we can treat any invalid head pointer value as a software/driver
bug instead of spurious hardware behaviour.

This change is also a small stepping stone towards re-working how
the head and tail state is managed as part of an improved workaround
for the tail register race condition.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-4-lionel.g.landwerlin@intel.com
7 years agodrm/i915/perf: avoid poll, read, EAGAIN busy loops
Robert Bragg [Thu, 11 May 2017 15:43:25 +0000 (16:43 +0100)]
drm/i915/perf: avoid poll, read, EAGAIN busy loops

If the function for checking whether there is OA buffer data available
(during a poll or blocking read) has false positives then we want to
avoid a situation where the subsequent read() returns EAGAIN (after
a more accurate check) followed by a poll() immediately reporting
the same false positive POLLIN event and effectively maintaining a
busy loop until there really is data.

This makes sure that we clear the .pollin event status whenever we
return EAGAIN to userspace which will throttle subsequent POLLIN events
and repeated attempts to read to the 5ms intervals of the hrtimer
callback we have.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-3-lionel.g.landwerlin@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915/perf: fix gen7_append_oa_reports comment
Robert Bragg [Thu, 11 May 2017 15:43:24 +0000 (16:43 +0100)]
drm/i915/perf: fix gen7_append_oa_reports comment

If I'm going to complain about a back-to-front convention then the least
I can do is not muddle the comment up too.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-2-lionel.g.landwerlin@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915: Restore brightness level in aux backlight driver
Puthikorn Voravootivat [Thu, 11 May 2017 23:02:23 +0000 (16:02 -0700)]
drm/i915: Restore brightness level in aux backlight driver

Some panel will default to zero brightness when turning the
panel off and on again. This patch restores last brightness
level back when panel is turning back on.

Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511230225.142870-8-puthik@chromium.org
7 years agodrm/i915: Set backlight mode before enable backlight
Puthikorn Voravootivat [Thu, 11 May 2017 23:02:21 +0000 (16:02 -0700)]
drm/i915: Set backlight mode before enable backlight

We should set backlight mode register before set register to
enable the backlight.

Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511230225.142870-6-puthik@chromium.org
7 years agodrm/i915: Correctly enable backlight brightness adjustment via DPCD
Puthikorn Voravootivat [Thu, 11 May 2017 23:02:18 +0000 (16:02 -0700)]
drm/i915: Correctly enable backlight brightness adjustment via DPCD

intel_dp_aux_enable_backlight() assumed that the register
BACKLIGHT_BRIGHTNESS_CONTROL_MODE can only has value 01
(DP_EDP_BACKLIGHT_CONTROL_MODE_PRESET) when initialize.

This patch fixed that by handling all cases of that register.

Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511230225.142870-3-puthik@chromium.org
7 years agodrm/i915: Fix cap check for intel_dp_aux_backlight driver
Puthikorn Voravootivat [Thu, 11 May 2017 23:02:17 +0000 (16:02 -0700)]
drm/i915: Fix cap check for intel_dp_aux_backlight driver

intel_dp_aux_backlight driver should check for the
DP_EDP_BACKLIGHT_BRIGHTNESS_AUX_SET_CAP before enable the driver.

Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511230225.142870-2-puthik@chromium.org
7 years agodrm/i915: don't do allocate_va_range again on PIN_UPDATE
Matthew Auld [Fri, 12 May 2017 09:14:23 +0000 (10:14 +0100)]
drm/i915: don't do allocate_va_range again on PIN_UPDATE

If a vma is already bound to a ppgtt, we incorrectly call
allocate_va_range again when doing a PIN_UPDATE, which will result in
over accounting within our paging structures, such that when we do
unbind something we don't actually destroy the structures and end up
inadvertently recycling them. In reality this probably isn't too bad,
but once we start touching PDEs and PDPEs for 64K/2M/1G pages this
apparent recycling will manifest into lots of really, really subtle
bugs.

v2: Fix the testing of vma->flags for aliasing_ppgtt_bind_vma

Fixes: ff685975d97f ("drm/i915: Move allocate_va_range to GTT")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170512091423.26085-1-chris@chris-wilson.co.uk
7 years agodrm/i915: set initialised only when init_context callback is NULL
Chuanxiao Dong [Thu, 11 May 2017 10:07:42 +0000 (18:07 +0800)]
drm/i915: set initialised only when init_context callback is NULL

During execlist_context_deferred_alloc() we presumed that the context is
uninitialised (we only just allocated the state object for it!) and
chose to optimise away the later call to engine->init_context() if
engine->init_context were NULL. This breaks with GVT's contexts that are
marked as pre-initialised to avoid us annoyingly calling
engine->init_context(). The fix is to not override ce->initialised if it
is already true.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1494497262-24855-1-git-send-email-chuanxiao.dong@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agodrm/i915: Do not sync RCU during shrinking
Joonas Lahtinen [Wed, 10 May 2017 11:00:40 +0000 (14:00 +0300)]
drm/i915: Do not sync RCU during shrinking

Due to the complex dependencies between workqueues and RCU, which
are not easily detected by lockdep, do not synchronize RCU during
shrinking.

On low-on-memory systems (mem=1G for example), the RCU sync leads
to all system workqueus freezing and unrelated lockdep splats are
displayed according to reports. GIT bisecting done by J. R.
Okajima points to the commit where RCU syncing was extended.

RCU sync gains us very little benefit in real life scenarios
where the amount of memory used by object backing storage is
dominant over the metadata under RCU, so drop it altogether.

 " Yeeeaah, if core could just, go ahead and reclaim RCU
   queues, that'd be great. "

  - Chris Wilson, 2016 (0eafec6d3244)

v2: More information to commit message.
v3: Remove "grep _rcu_" escapee from i915_gem_shrink_all (Andrea)

Fixes: c053b5a506d3 ("drm/i915: Don't call synchronize_rcu_expedited under struct_mutex")
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: J. R. Okajima <hooanon05g@gmail.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Hugh Dickins <hughd@google.com>
Tested-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: J. R. Okajima <hooanon05g@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: <stable@vger.kernel.org> # v4.11+
Link: http://patchwork.freedesktop.org/patch/msgid/1494414040-11160-1-git-send-email-joonas.lahtinen@linux.intel.com
7 years agodrm/i915/guc: Make scratch register base and count flexible
Michal Wajdeczko [Wed, 10 May 2017 12:59:27 +0000 (12:59 +0000)]
drm/i915/guc: Make scratch register base and count flexible

We are using some scratch registers in MMIO based send function.
Make their base and count flexible in preparation of upcoming
GuC firmware/hardware changes. While around, change cmd len
parameter verification from WARN_ON to GEM_BUG_ON as we don't
need this all the time.

v2: call out WARN/GEM_BUG change in the commit msg (Daniele)
v3: don't overqualify the ints (Chris)
v4: rebase and use proper enum

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
7 years agodrm/i915/guc: Move notification code into virtual function
Michal Wajdeczko [Wed, 10 May 2017 12:59:26 +0000 (12:59 +0000)]
drm/i915/guc: Move notification code into virtual function

Prepare for alternate GuC notification mechanism.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
[Joonas: Added newlines]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
7 years agodrm/i915: Remove vma unpin in intel_plane_destroy
Maarten Lankhorst [Thu, 11 May 2017 08:28:44 +0000 (10:28 +0200)]
drm/i915: Remove vma unpin in intel_plane_destroy

commit a667fb402c1e856209bf9e77ba41fc1cf356b867
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Thu Dec 15 15:29:44 2016 +0100

    drm/i915: Disable all crtcs during driver unload, v2.

made sure that all crtc's are disabled on driver unload, but only the
following commit made sure all fb's are cleaned up correctly:

commit 9b2104f423de5c148749a07e8197dbab4c449877
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Tue Feb 21 14:51:40 2017 +0100

    drm/atomic: Make disable_all helper fully disable the crtc.

Finally remove this and add a WARN_ON when vma is set. It should
have been removed by intel_cleanup_plane_fb().

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511082844.13965-2-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
7 years agodrm/i915: Fix hw state verifier access to crtc->state.
Maarten Lankhorst [Thu, 11 May 2017 08:28:43 +0000 (10:28 +0200)]
drm/i915: Fix hw state verifier access to crtc->state.

We shouldn't inspect crtc->state, instead grab the crtc state.
At this point the hw state verifier should be able to run even if
crtc->state has been updated (which cannot currently happen).

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511082844.13965-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
7 years agodrm/i915/guc: Dump the GuC stage descriptor pool in debugfs
Oscar Mateo [Wed, 10 May 2017 15:04:51 +0000 (15:04 +0000)]
drm/i915/guc: Dump the GuC stage descriptor pool in debugfs

We are missing pieces of information that could be useful for GuC
debugging.

v2: Reuse some code (Joonas)

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
[Joonas: Removed extra newline and s/uint32_t/u32/ for checkpatch.pl]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1494428691-20672-1-git-send-email-oscar.mateo@intel.com
7 years agodrm/i915: Fix __intel_wait_for_register_fw to not sleep in atomic
Daniel Vetter [Wed, 10 May 2017 15:19:32 +0000 (17:19 +0200)]
drm/i915: Fix __intel_wait_for_register_fw to not sleep in atomic

The unconditionally fallback to the blocking wait_for resulted in
impressive fireworks at boot-up on my snb here. Make sure if we set
the slow timeout to 0 that we never ever sleep. The tail of the
callchain was

intel_wait_for_register
-> __intel_wait_for_register_fw
  -> usleep_range
     -> BOOM

It blew up in intel_crt_detect load detection code on the
ADPA_CRT_HOTPLUG_FORCE_TRIGGER in the ADPA register.

v2: Shut up gcc.

v3: Use uninitialized_var() (Chris).

Fixes: 0564654340e2 ("drm/i915: Acquire uncore.lock over intel_uncore_wait_for_register()")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1494429572-15118-1-git-send-email-daniel.vetter@ffwll.ch
7 years agodrm/i915: Simplify cursor register write sequence
Ville Syrjälä [Mon, 27 Mar 2017 18:55:46 +0000 (21:55 +0300)]
drm/i915: Simplify cursor register write sequence

It looks like simply writing all the cursor register every single
time might be slightly faster than checking to see of each of
them need to be written. So if any other register apart from
CURPOS needs to be written let's just write all the registers.

CURPOS is left as a special case mainly for 845/865 where we have to
disable the cursor to change many of the cursor parameters. This
introduces a slight chance of the cursor flickering when things get
updated (since we're not currently doing the vblank evade for cursor
updates). If we write CURPOS alone then that obviously can't happen.
And let's follow the same pattern in the i9xx code just for symmetry.
I wasn't able to see a singificant performance difference between
this and just writing all the registers unconditionally.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-16-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
7 years agodrm/i915: Relax 845/865 CURBASE alignemnt requirement to 32 bytes
Ville Syrjälä [Mon, 27 Mar 2017 18:55:45 +0000 (21:55 +0300)]
drm/i915: Relax 845/865 CURBASE alignemnt requirement to 32 bytes

Supposedly 845/865 require only 32 byte alignment for CURBASE. Let's
relax the checks to allow that instead of demanding 4KiB alignment.
This will allow cursor panning in 8 pixel units.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-15-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
7 years agodrm/i915: Handle fb offset and src coordinates for cursors
Ville Syrjälä [Mon, 27 Mar 2017 18:55:44 +0000 (21:55 +0300)]
drm/i915: Handle fb offset and src coordinates for cursors

The cursor plane doesn't have any kind of source offset register, so
the only form of panning possible is via a the base address register.
The alignment required by CURBASE ranges from 32B to 16KiB depending
on the platform. Let's make sure the user didn't ask for something
we can't do.

Obviously this is impossible to hit via the legacy cursor ioctl since
the src offsets are always 0, but via the plane/atomic ioctls the user
can ask for pretty much anything so we have to deal with this.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-14-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
7 years agodrm/i915: Fix gen3 physical cursor alignment requirements
Ville Syrjälä [Mon, 27 Mar 2017 18:55:43 +0000 (21:55 +0300)]
drm/i915: Fix gen3 physical cursor alignment requirements

Bspec tells us that gen3 platforms need 4KiB alignment for CURBASE
rather than the 256 byte alignment required by i85x. Let's fix that
and pull the code to determine the correct alignment to a helper
function.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-13-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
7 years agodrm/i915: Support variable cursor height on ivb+
Ville Syrjälä [Mon, 27 Mar 2017 18:55:42 +0000 (21:55 +0300)]
drm/i915: Support variable cursor height on ivb+

IVB introduced the CUR_FBC_CTL register which allows reducing the cursor
height down to 8 lines from the otherwise square cursor dimensions.
Implement support for it. CUR_FBC_CTL can't be used when the cursor
is rotated.

Commandeer the otherwise unused cursor->cursor.size to track the
current value of CUR_FBC_CTL to optimize away redundant CUR_FBC_CTL
writes, and to notice when we need to arm the update via CURBASE if
just CUR_FBC_CTL changes.

v2: Reverse the gen check to make it sane
v3: Only enable CUR_FBC_CTL when cursor is enabled, adapt to
    earlier code changes which means we now actually turn off
    the cursor when we're supposed to unlike v2
v4: Add a comment about rotation vs. CUR_FBC_CTL,
    rebase due to 'dirty' (Chris)
v5: Rebase to the atomic world
    Handle 180 degree rotation
    Add HAS_CUR_FBC()
v6: Rebase
v7: Rebase due to I915_WRITE_FW/uncore.lock
    s/size/fbc_ctl/

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-12-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>