Paulo Zanoni [Tue, 1 Apr 2014 18:37:12 +0000 (15:37 -0300)]
drm/i915: add GEN5_IRQ_FINI
Same as the _INIT macro: the goal is to reuse the GEN8 macros, but
there are still some slight differences.
v2: - Rebase.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 1 Apr 2014 18:37:11 +0000 (15:37 -0300)]
drm/i915: use GEN8_IRQ_INIT on GEN5
And rename it to GEN5_IRQ_INIT.
We discussed doing equivalent changes in July 2013, and I even
sent a patch series for this: "[PATCH 00/15] Unify interrupt register
init/reset". Now that the BDW code was merged, I have one more
argument in favor of these changes.
Here's what really changes with the Gen 5 IRQ init code:
- We now clear the IIR registers at preinstall (they are also
cleared at postinstall, but we will change that later).
- We have an additional POSTING_READ at the IMR register.
v2: - Fix typo in commit message.
- Add POSTING_READ calls to the macros (Ben, Daniel, Jani).
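As a rough illustration of the two behavioural changes listed above, here is
a user-space sketch that models one IMR/IER/IIR triplet in plain memory. The
struct, the helper names and the mask-all/disable-all steps are assumptions
made for this example and are not the macro body from the patch; only the
IIR clear and the read-back of IMR are taken from this message.

```c
#include <stdint.h>

/* Hypothetical in-memory stand-in for one IMR/IER/IIR register triplet. */
struct irq_regs {
	uint32_t imr;           /* interrupt mask register */
	uint32_t ier;           /* interrupt enable register */
	uint32_t iir;           /* interrupt identity register */
	unsigned int imr_reads; /* counts read-backs, standing in for POSTING_READ */
};

/* IIR is modelled as write-one-to-clear: every bit set in val is cleared. */
static void write_iir(struct irq_regs *r, uint32_t val)
{
	r->iir &= ~val;
}

/* Read IMR back so the preceding write is known to have landed. */
static uint32_t posting_read_imr(struct irq_regs *r)
{
	r->imr_reads++;
	return r->imr;
}

/* Assumed preinstall sequence: mask everything, disable, clear pending bits. */
static void irq_init_sketch(struct irq_regs *r)
{
	r->imr = 0xffffffffu;
	(void)posting_read_imr(r);
	r->ier = 0;
	write_iir(r, 0xffffffffu);
}
```

After irq_init_sketch() runs, no interrupt is pending or enabled, whatever
state the registers were in before.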
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 1 Apr 2014 18:37:10 +0000 (15:37 -0300)]
drm/i915: also use GEN5_IRQ_INIT with south display interrupts
This interrupt gets initialized with a different IER value, so it was
not using the macro. The problem is that we plan to modify the macro
to make it do additional things, and we want the SDE interrupts
updated too. So let's make sure we call the macro, then, after it, we
do the necessary SDE-specific changes.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 1 Apr 2014 18:37:09 +0000 (15:37 -0300)]
drm/i915: add GEN5_IRQ_INIT macro
The goal is to reuse the GEN8 macros, but a few changes are needed, so
let's make things easier to review.
I could also use these macros on older code, but since I plan to
change how the interrupts are initialized, we'll risk breaking the
older code in the next commits, so I'll leave this out for now.
v2: - Rebase.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pradeep Bhat [Fri, 28 Mar 2014 04:44:57 +0000 (10:14 +0530)]
drm/i915: Adding VBT fields to support eDP DRRS feature
This patch reads the DRRS support and Mode type from VBT fields.
The read information will be stored in VBT struct during BIOS
parsing. The above functionality is needed for decision making
whether DRRS feature is supported in i915 driver for eDP panels.
This information helps us decide if seamless DRRS can be done
at runtime to support certain power saving features. This patch
was tested by setting necessary bit in VBT struct and merging
the new VBT with system BIOS so that we can read the value.
v2: Incorporated review comments from Chris Wilson
Removed "intel_" as a prefix for DRRS specific declarations.
v3: Incorporated Jani's review comments
Removed function which deduces drrs mode from panel_type. Modified some
print statements. Made changes to use DRRS_NOT_SUPPORTED as 0 instead of -1.
v4: Incorporated Jani's review comments.
Modifications around setting vbt drrs_type.
Signed-off-by: Pradeep Bhat <pradeep.bhat@intel.com>
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
[danvet: Drop the misleading/redundant comment about the added drrs
field in the vbt struct as discussed with Jani on irc.]
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ville Syrjälä [Tue, 1 Apr 2014 09:59:08 +0000 (12:59 +0300)]
drm: Make drm_clflush_virt_range() void*
Currently drm_clflush_virt_range() takes a char* so the caller probably
has to do pointless casting to avoid compiler warnings. Make the
argument void* instead to avoid such issues.
v2: Use void* arithmetic (Chris)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ville Syrjälä [Tue, 1 Apr 2014 07:54:36 +0000 (10:54 +0300)]
drm/i915: Refactor gmch hpd irq handling
Pull all the gmch platform hotplug interrupt handling into one
function.
v2: Move the IIR check to the caller
s/drm_i915_private_t/struct drm_i915_private/
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add posting read comment suggested by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ben Widawsky [Tue, 1 Apr 2014 00:16:43 +0000 (17:16 -0700)]
drm/i915/bdw: RPS frequency bits are the same as HSW
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ben Widawsky [Tue, 1 Apr 2014 00:16:42 +0000 (17:16 -0700)]
drm/i915/bdw: Extract rp_state_caps logic
We have a need for duplicated parsing of the RP_STATE_CAPS register (and
the setting of the associated fields). To reuse some code, we can
extract the function into a simple helper.
This patch also addresses the fact that we missed doing this for gen8,
something we should have done anyway.
This could be two patches, one to extract, and one to add gen8, but it's
trivial enough that I think one is fine. I will accept a request to
split it. Please notice the fix addressed by v2 below.
Valleyview is left untouched because it is different.
v2: Logically rebased on top of
commit dd0a1aa19bd3d7203e58157b84cea78bbac605ac
Author: Jeff McGee <jeff.mcgee@intel.com>
Date: Tue Feb 4 11:32:31 2014 -0600
drm/i915: Restore rps/rc6 on reset
Note with the above change the fix for gen8 is also handled (which was
not the case in Jeff's original patch).
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ben Widawsky [Tue, 1 Apr 2014 00:16:41 +0000 (17:16 -0700)]
drm/i915/bdw: Set initial rps freq to RP1
Programming it outside of the rp0-rp1 range is considered a programming
error. Since we do not know that the previous value would actually be in
the range, program something we've read from the hardware, and therefore
know will work.
This is potentially an issue for platforms whose ranges are outside the
norms given in the programming guide (i.e. early silicon).
v2: Use RP1 instead of RPn
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ville Syrjälä [Mon, 31 Mar 2014 15:21:26 +0000 (18:21 +0300)]
drm/i915: Split dp post_disable hooks
Split the post_disable hooks for DP to g4x and vlv variants. We'll
need another variant soon, so this should make it look a bit cleaner.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ville Syrjälä [Mon, 31 Mar 2014 18:29:41 +0000 (21:29 +0300)]
drm/i915: Kill crtc->plane checks from the primary plane update hooks
These were apparently meant to protect the SAREA which only has
room for two pipes, but things clearly went a bit wonky when
first the .update_plane() hooks were split up and then pipe C
got introduced.
The checks actually protecting the SAREA live in
intel_crtc_update_sarea() these days, so the checks in the primary
plane update hooks are just historical leftovers which are to be
eliminated.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Mon, 31 Mar 2014 14:23:03 +0000 (16:23 +0200)]
drm/i915: Deprecate UMS harder
Progress according to the deprecation plan laid out in
commit b30324adaf8d2e5950a602bde63030d15a61826f
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Nov 13 22:11:25 2013 +0100
drm/i915: Deprecated UMS support
and disable UMS for 3.16. Note that it has been over 5 years since the
last UMS-supporting piece of userspace was released.
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jani Nikula [Mon, 31 Mar 2014 11:27:22 +0000 (14:27 +0300)]
drm/i915: drop the typedef for drm_i915_private_t
There are no longer users of drm_i915_private_t. Drop the typedef. Good
riddance.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add the hunk in i915_cmd_parser.c here which had to be
relocated due to how this was merged.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Damien Lespiau [Mon, 31 Mar 2014 10:24:08 +0000 (11:24 +0100)]
drm/i915: Use a macro to express the range of valid gens for reg_read
The reg_read whitelist has a gen bitmask to code the gens we're allowing
the register to be read on. Until now, it was a literal, but we can be
a bit more expressive.
To ease the review, a small test program:
$ cat bit-range.c
#include <stdio.h>
#include <stdint.h>
#define U32_C(x) x ## U
#define GENMASK(h, l) (((U32_C(1) << ((h) - (l) + 1)) - 1) << (l))
#define GEN_RANGE(l, h) GENMASK(h, l)
int main(int argc, char **argv)
{
printf("0x%08x\n", GEN_RANGE(1, 1));
printf("0x%08x\n", GEN_RANGE(1, 2));
printf("0x%08x\n", GEN_RANGE(4, 4));
printf("0x%08x\n", GEN_RANGE(4, 5));
printf("0x%08x\n", GEN_RANGE(1, 31));
printf("0x%08x\n", GEN_RANGE(4, 8));
return 0;
}
$ ./bit-range
0x00000002
0x00000006
0x00000010
0x00000030
0xfffffffe
0x000001f0
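To show how such a mask might then be consulted, here is a small sketch of a
per-gen check against one whitelist entry. The struct layout, the helper
name and the register offset are hypothetical and chosen for illustration;
only the GENMASK/GEN_RANGE definitions mirror the test program above.

```c
#include <stdbool.h>
#include <stdint.h>

#define GENMASK(h, l) (((1U << ((h) - (l) + 1)) - 1) << (l))
#define GEN_RANGE(l, h) GENMASK(h, l)

/* Hypothetical whitelist entry: a register offset plus the gens allowed to read it. */
struct reg_read_entry {
	uint32_t offset;
	uint32_t gen_bitmask;
};

/* A gen is allowed when its bit is set in the entry's mask. */
static bool gen_allowed(const struct reg_read_entry *e, unsigned int gen)
{
	return (e->gen_bitmask & (1U << gen)) != 0;
}
```

With an entry declared as { 0x1000, GEN_RANGE(4, 8) }, gens 4 through 8 pass
the check and everything else is refused.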
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Deepak S [Mon, 31 Mar 2014 06:00:02 +0000 (11:30 +0530)]
drm/i915: Match debugfs interface name to new RPS naming
Let's change the i915_cur_delayinfo to i915_frequency_info to be in sync
with new RPS naming convention.
v2: Add "i915_frequency_info" as debugfs interface name (Ben)
Signed-off-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Damien Lespiau [Fri, 28 Mar 2014 16:54:26 +0000 (16:54 +0000)]
drm/i915: Hide the per forcewake-engine register ranges
These defines are only used in intel_uncore.c.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Damien Lespiau [Fri, 28 Mar 2014 16:54:25 +0000 (16:54 +0000)]
drm/i915: Hide vlv_force_wake_{get, put}() in intel_uncore.c
Those functions aren't used outside this file anymore.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Christoph Jaeger [Fri, 28 Mar 2014 09:19:24 +0000 (10:19 +0100)]
drm/i915: drop __FUNCTION__ as argument to DRM_DEBUG_KMS
DRM_DEBUG_KMS includes printing the function name.
Signed-off-by: Christoph Jaeger <christophjaeger@linux.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Damien Lespiau [Fri, 28 Mar 2014 14:17:49 +0000 (14:17 +0000)]
drm/i915: Don't store the max cursor width/height in the crtc
Those values are global, only used in one function, and already stored
in mode_config.cursor_{width,height}.
As a result, this initialization code has been moved from the
crtc_init() function to the global modeset_init() one.
I also renamed CURSOR_{WIDTH,HEIGHT} to MAX_CURSOR_{WIDTH,HEIGHT} to be
more accurate about what these values really are.
Cc: Sagar Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Imre Deak [Thu, 27 Mar 2014 15:45:11 +0000 (17:45 +0200)]
drm/i915: vlv: get power domain for eDP vdd
Besides the D0 device state, some platforms also need the proper power
wells to be on, so get the port power domain reference instead of an RPM
reference.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Imre Deak [Thu, 27 Mar 2014 15:45:10 +0000 (17:45 +0200)]
drm/i915: vlv: cache current CD clock rate
Instead of reading out the CD clock rate from the HW at each modeset, do
this only during driver init and resume and use the cached value during
modeset. This moves things towards a state where the sw and hw side
setup is separated. It's also needed for VLV RPM, where we don't put
device into D0 state until modeset_global_resources is called and thus
can't access any display/gfx registers.
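The caching pattern described above can be sketched in user space roughly as
follows. The struct and function names are invented for this example and do
not correspond to the driver's code; the "hardware" is a plain field, and a
counter records how often it is read.

```c
/* Hypothetical state: a fake hardware value plus the software copy. */
struct cdclk_state {
	unsigned int hw_khz;     /* value a register read would return */
	unsigned int cached_khz; /* software copy used at modeset time */
	unsigned int hw_reads;   /* how many times the "hardware" was read */
};

/* Meant to be called at driver init and resume: the only hardware access. */
static void cdclk_update_cache(struct cdclk_state *s)
{
	s->hw_reads++;
	s->cached_khz = s->hw_khz;
}

/* Meant to be called at modeset time: never touches the hardware. */
static unsigned int cdclk_get(const struct cdclk_state *s)
{
	return s->cached_khz;
}
```

Because cdclk_get() only returns the stored copy, it stays usable while the
device is not yet in D0.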
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Thu, 27 Mar 2014 09:06:14 +0000 (09:06 +0000)]
drm/i915: Add PM interrupt details and RPS thresholds to debugfs
When trying to determine whether RPS is working as intended, more
information is better. In particular, what interrupts are being
generated and the various thresholds for generating them.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Kenneth Graunke [Wed, 26 Mar 2014 05:52:03 +0000 (22:52 -0700)]
drm/i915: Add OACONTROL to the command parser register whitelist.
Mesa needs to be able to write OACONTROL in order to expose the
Observability Architecture's performance counters via OpenGL.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
[danvet: Add comment that this is just a temporary work-around and
that we need to check more things before we can allow OACONTROL writes
for real everywhere.]
[danvet 2: Squash in fixup to avoid a DRM_ERROR due to unsorted reg
list, spotted by Jani.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Brad Volkin [Tue, 18 Feb 2014 18:15:57 +0000 (10:15 -0800)]
drm/i915: Enable command parsing by default
v2: rebased
OTC-Tracker: AXIA-4631
Change-Id: I6747457e1fe7494bd42787af51198fcba398ad78
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Resolve tiny conflict in module option text.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Brad Volkin [Tue, 18 Feb 2014 18:15:56 +0000 (10:15 -0800)]
drm/i915: Add a CMD_PARSER_VERSION getparam
So userspace can query the kernel for command parser support.
v2: Add i915_cmd_parser_get_version(), history log, and kerneldoc
OTC-Tracker: AXIA-4631
Change-Id: I58af650db9f6753c2dcac9c54ab432fd31db302f
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Brad Volkin [Tue, 18 Feb 2014 18:15:55 +0000 (10:15 -0800)]
drm/i915: Reject commands that would store to global HWS page
PIPE_CONTROL and MI_FLUSH_DW have bits that would write to the
hardware status page. The driver stores request tracking info
there, so don't let userspace overwrite it.
v2: trailing comma fix, rebased
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Brad Volkin [Tue, 18 Feb 2014 18:15:54 +0000 (10:15 -0800)]
drm/i915: Enable PPGTT command parser checks
Various commands that access memory have a bit to determine whether
the graphics address specified in the command should use the GGTT or
PPGTT for translation. These checks ensure that the bit indicates
PPGTT translation.
Most of these checks use the existing bit-checking infrastructure.
The PIPE_CONTROL and MI_FLUSH_DW commands, however, are multi-function
commands. The GGTT/PPGTT bit is only relevant for certain uses of the
command. As such, this change also extends the bit-checking code to
include a "condition" mask and offset. If the condition mask is non-zero
then the parser only performs the bit check when the bits specified by
the condition mask/offset are also non-zero.
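The conditional check described above could look roughly like this. It is a
sketch under assumptions: the descriptor layout, the field names and the
helper are hypothetical, and the offsets are taken to be dword indices into
the command.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor for one bit check; field names are illustrative. */
struct bit_check {
	uint32_t offset;           /* dword index of the bits to check */
	uint32_t mask;             /* bits to examine */
	uint32_t expected;         /* required value of (dword & mask) */
	uint32_t condition_offset; /* dword index of the condition bits */
	uint32_t condition_mask;   /* 0 means the check is unconditional */
};

/* Returns true when the command passes the check. */
static bool check_bits(const uint32_t *cmd, const struct bit_check *c)
{
	/* With a condition mask, skip the check when the condition bits are clear. */
	if (c->condition_mask != 0 &&
	    (cmd[c->condition_offset] & c->condition_mask) == 0)
		return true;

	return (cmd[c->offset] & c->mask) == c->expected;
}
```

A descriptor with condition_mask set to 0 behaves like the existing
unconditional checks, so one code path can serve both kinds of command.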
NOTE: At this point in the series PPGTT must be enabled for the parser
to work correctly. If it's not enabled, userspace will not be setting
the PPGTT bits the way the parser requires. VLV is the only platform
where this is a problem, so at this point, we disable parsing for VLV.
v2: whitespace and trailing commas fixes, rebased
OTC-Tracker: AXIA-4631
Change-Id: I3f4c76b6734f1956ec47e698230f97d0998ff92b
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop the unnecessary cast Jani spotted.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Brad Volkin [Tue, 18 Feb 2014 18:15:53 +0000 (10:15 -0800)]
drm/i915: Reject commands that explicitly generate interrupts
The driver leaves most interrupts masked during normal operation,
so there would have to be additional work to enable userspace to
safely request/receive an interrupt.
v2: trailing commas, rebased
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Brad Volkin [Tue, 18 Feb 2014 18:15:52 +0000 (10:15 -0800)]
drm/i915: Enable register whitelist checks
MI_STORE_REGISTER_MEM, MI_LOAD_REGISTER_MEM, and MI_LOAD_REGISTER_IMM
commands allow userspace access to registers. Only certain registers
should be allowed for such access, so enable checking for those commands.
Each ring gets its own register whitelist.
MI_LOAD_REGISTER_REG on HSW also allows register access but is currently
unused by userspace components. Leave it rejected.
PIPE_CONTROL and MEDIA_VFE_STATE allow register access based on certain
bits being set. Reject those as well.
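A minimal sketch of a per-ring whitelist lookup, assuming the whitelist is a
flat table of register offsets. The function name, the table representation
and the linear scan are assumptions made for this example, not the parser's
actual data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if reg_addr appears in the given ring's whitelist table. */
static bool reg_is_whitelisted(const uint32_t *table, size_t count,
			       uint32_t reg_addr)
{
	for (size_t i = 0; i < count; i++)
		if (table[i] == reg_addr)
			return true;

	/* Not listed: the command touching this register should be rejected. */
	return false;
}
```

Passing a different table per ring gives each ring its own set of allowed
registers without changing the lookup itself.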
v2: trailing commas, rebased
OTC-Tracker: AXIA-4631
Change-Id: Ie614a2f0eb2e5917de809e5a17957175d24cc44f
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Brad Volkin [Tue, 18 Feb 2014 18:15:51 +0000 (10:15 -0800)]
drm/i915: Add register whitelist for DRM master
These are used to implement scanline waits in the X server.
v2: Use #defines instead of magic numbers
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Brad Volkin [Tue, 18 Feb 2014 18:15:50 +0000 (10:15 -0800)]
drm/i915: Add register whitelists for mesa
These registers are currently used by mesa for blitting,
transform feedback extensions, and performance monitoring
extensions.
v2: REG64 macro
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Brad Volkin [Tue, 18 Feb 2014 18:15:49 +0000 (10:15 -0800)]
drm/i915: Allow some privileged commands from master
The Intel DDX uses these to implement scanline waits in the X server.
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Brad Volkin [Tue, 18 Feb 2014 18:15:48 +0000 (10:15 -0800)]
drm/i915: Reject privileged commands
The spec defines most of these commands as privileged. A few others,
like the semaphore mbox command and some display commands, are also
reserved for the driver's use. Subsequent patches relax some of
these restrictions.
v2: Rebased
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Brad Volkin [Tue, 18 Feb 2014 18:15:47 +0000 (10:15 -0800)]
drm/i915: Initial command parser table definitions
Add command tables defining irregular length commands for each ring.
This requires a few new command opcode definitions.
v2: Whitespace adjustment in command definitions, sparse fix for !F
OTC-Tracker: AXIA-4631
Change-Id: I064bceb457e15f46928058352afe76d918c58ef5
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ben Widawsky [Tue, 25 Mar 2014 01:06:00 +0000 (18:06 -0700)]
drm/i915: Allow full PPGTT with param override
When PPGTT was disabled by default, the patch also prevented the user
from overriding this behavior via module parameter. Being able to test
this on arbitrary kernels is extremely beneficial to track down the
remaining bugs. The patch that prevented this was:
commit 93a25a9e2d67765c3092bfaac9b855d95e39df97
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Mar 6 09:40:43 2014 +0100
drm/i915: Disable full ppgtt by default
By default PPGTT is set to -1. 0 means off, 1 means aliasing only, 2
means full, all other values are reserved.
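The value mapping above can be written down as a small helper. This is only
a sketch: the enum, the function name and the choice to fall back to the
driver's own default for -1 and for reserved values are assumptions made
for illustration.

```c
/* Hypothetical names for the three documented settings. */
enum ppgtt_mode {
	PPGTT_OFF = 0,
	PPGTT_ALIASING = 1,
	PPGTT_FULL = 2,
};

/*
 * Map the module parameter to a mode. driver_default is whatever the
 * driver would pick on its own for the platform.
 */
static enum ppgtt_mode resolve_ppgtt(int param, enum ppgtt_mode driver_default)
{
	switch (param) {
	case 0:
		return PPGTT_OFF;
	case 1:
		return PPGTT_ALIASING;
	case 2:
		return PPGTT_FULL;
	default:
		/* -1 (the default) and reserved values: assumed to defer. */
		return driver_default;
	}
}
```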
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ben Widawsky [Sun, 23 Mar 2014 05:47:21 +0000 (22:47 -0700)]
drm/i915: Split out GTT specific header file
This file contains all necessary defines, prototypes and typedefs for
manipulating GEN graphics address translation (this does not include the
legacy AGP driver).
Reiterating the comment in the header,
"Please try to maintain the following order within this file unless it
makes sense to do otherwise. From top to bottom:
1. typedefs
2. #defines, and macros
3. structure definitions
4. function prototypes
Within each section, please try to order by generation in ascending
order, from top to bottom (ie. GEN6 on the top, GEN8 on the bottom)."
I've made some minor cleanups, and fixed a couple of typos while here -
but there should be no functional changes.
The purpose of the patch is to reduce clutter in our main header file,
making room for new growth, and make documentation of our interfaces
easier by splitting things out.
With a little more work, like making i915_gtt a pointer, we could
potentially completely isolate this header from i915_drv.h. At the
moment however, I don't think it's worth the effort.
Personally, I would have liked to put the PTE encoding functions in this
file too, but I didn't want to rock the boat too much.
A similar patch has been in use on my machine for some time. This exact
patch though has only been compile tested.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Tue, 18 Mar 2014 09:26:04 +0000 (10:26 +0100)]
drm/i915: make semaphore signaller detection more robust
Extract all this logic into a new helper function
semaphore_wait_to_signaller_ring because:
- The current code has way too much magic.
- The current code doesn't look at bit 16, which encodes VECS signallers
on HSW. Those are just added after the fact, so can't be encoded in
a neat formula.
- The current logic can't blow up since it limits its value range
sufficiently, but that's a bit too tricky to rely on in my opinion.
Especially when we start to add bdw support.
- I'm not a big fan of the explicit ring->semaphore_register list, but
I think it's more robust to use the same mapping both when
constructing the semaphore commands and when decoding them.
- Finally add a FIXME comment about lack of broadwell support here,
like in the earlier ipehr semaphore cmd detection function.
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Actually drop the untrue claim in the commit message Chris
pointed out.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Fri, 14 Mar 2014 23:08:56 +0000 (00:08 +0100)]
drm/i915: Add FIXME for bdw semaphore detection in hangcheck
Currently not an issue since we don't emit semaphores, but better
not forget about those.
As a little prep work extract the ipehr decoding for cleaner control
flow. And apply a bit of polish.
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Fri, 21 Mar 2014 17:18:54 +0000 (17:18 +0000)]
drm/i915: Rename GFX_TLB_INVALIDATE_ALWAYS
The documentation calls this GFX_MODE bit "Flush TLB invalidate Mode".
However, that is not a good name for an enable bit as it doesn't make it
clear what is enabled. An even worse name is GFX_TLB_INVALIDATE_ALWAYS
as enabling that bit actually prevents the TLB from being invalidated at
every flush. This leads to great confusion when reading code and
proposed patches. To get around this try to bake in what is enabled by
setting the bit and call it GFX_TLB_INVALIDATE_EXPLICIT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Gupta, Sourab" <sourab.gupta@intel.com>
Acked-by: "Gupta, Sourab" <sourab.gupta@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 1 Apr 2014 17:55:13 +0000 (14:55 -0300)]
drm/i915: don't get/put runtime PM at the debugfs forcewake file
Because gen6_gt_force_wake_{get,put} should already be responsible for
getting/putting runtime PM. If we keep these calls, debugfs will not
be testing the get/put calls of the forcewake functions.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 1 Apr 2014 17:55:12 +0000 (14:55 -0300)]
drm/i915: fix WARNs when reading DDI state while suspended
If runtime PM is enabled and we unset all modes, we will runtime
suspend after __intel_set_mode(), then function
intel_modeset_check_state() will try to read the HW state while it is
suspended and trigger lots of WARNs because it shouldn't be reading
registers.
So in this patch we make intel_ddi_connector_get_hw_state() return
false in case the power domain is disabled, and we also make
intel_display_power_enabled() return false in case the device is
suspended. Notice that we can't just use
intel_display_power_enabled_sw() because while the driver is being
initialized the power domain refcounts are not reflecting the real
state of the hardware.
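The guard order described above can be modelled in user space as follows.
All names here are hypothetical stand-ins rather than the driver functions,
and the counter only exists to show that no register read happens while the
device is suspended.

```c
#include <stdbool.h>

/* Hypothetical device model for this sketch. */
struct dev_model {
	bool suspended;         /* device is runtime suspended */
	bool domain_enabled;    /* power domain state as seen in hardware */
	bool connector_on;      /* what a register read would report */
	unsigned int reg_reads; /* register accesses performed so far */
};

/* A suspended device is reported as having the power domain disabled. */
static bool power_enabled(const struct dev_model *d)
{
	if (d->suspended)
		return false;
	return d->domain_enabled;
}

/* Only read the connector state when the domain is known to be powered. */
static bool connector_get_hw_state(struct dev_model *d)
{
	if (!power_enabled(d))
		return false;

	d->reg_reads++;
	return d->connector_on;
}
```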
Just for reference, I have previously published an alternate patch for
this problem, called "drm/i915: get runtime PM at intel_set_mode".
Testcase: igt/pm_pc8
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 1 Apr 2014 17:55:11 +0000 (14:55 -0300)]
drm/i915: don't read cursor registers on powered down pipes
At i915_display_info, don't call cursor_position() for a disabled
CRTC, since the CRTC may be on a powered down pipe, and this will
cause "Unclaimed register before interrupt" error messages.
Testcase: igt/pm_pc8/debugfs-read
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 1 Apr 2014 17:55:10 +0000 (14:55 -0300)]
drm/i915: get runtime PM at i915_display_info
Otherwise we may get some WARNs complaining that we're reading a
register while we're suspended.
Testcase: igt/pm_pc8/debugfs-read
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 1 Apr 2014 17:55:09 +0000 (14:55 -0300)]
drm/i915: don't read pp_ctrl_reg if we're suspended
... at edp_have_panel_vdd. Just return false, saying we don't have the
panel VDD since the device is suspended.
We started getting WARNs about this problem since the patch that
started checking if we're suspended while reading registers.
Example backtrace provided by Paulo:
[ 63.572201] [drm:hsw_enable_pc8] Enabling package C8+
[ 63.581831] [drm:i915_runtime_suspend] Device suspended
[ 63.664798] ------------[ cut here ]------------
[ 63.664824] WARNING: CPU: 3 PID: 828 at
drivers/gpu/drm/i915/intel_uncore.c:47
assert_device_not_suspended.isra.7+0x32/0x40 [i915]()
[ 63.664826] Device suspended
[ 63.664828] Modules linked in: ccm fuse ip6table_filter ip6_tables
ebtable_nat ebtables arc4 ath9k_htc ath9k_common ath9k_hw mac80211 ath
cfg80211 iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal coretemp
microcode i2c_i801 e1000e pcspkr serio_raw lpc_ich ptp pps_core mei_me
mei mfd_core dm_crypt i915 crc32_pclmul crc32c_intel
ghash_clmulni_intel i2c_algo_bit drm_kms_helper drm video
[ 63.664867] CPU: 3 PID: 828 Comm: kworker/3:3 Not tainted 3.14.0+ #153
[ 63.664869] Hardware name: Intel Corporation Shark Bay Client
platform/WhiteTip Mountain 1, BIOS HSWLPTU1.86C.0133.R00.1309172123
09/17/2013
[ 63.664887] Workqueue: events edp_panel_vdd_work [i915]
[ 63.664889] 0000000000000009 ffff88009d745c28 ffffffff8167ec6f ffff88009d745c70
[ 63.664895] ffff88009d745c60 ffffffff8106c8ed ffff880036278000 00000000000c7204
[ 63.664900] ffff88014f2d3040 ffff880036278070 0000000000000001 ffff88009d745cc0
[ 63.664905] Call Trace:
[ 63.664911] [<ffffffff8167ec6f>] dump_stack+0x4d/0x66
[ 63.664916] [<ffffffff8106c8ed>] warn_slowpath_common+0x7d/0xa0
[ 63.664920] [<ffffffff8106c95c>] warn_slowpath_fmt+0x4c/0x50
[ 63.664926] [<ffffffff810bd6be>] ? mark_held_locks+0xae/0x130
[ 63.664941] [<ffffffffa00d80d2>] assert_device_not_suspended.isra.7+0x32/0x40 [i915]
[ 63.664956] [<ffffffffa00d99d2>] gen6_read32+0x32/0x120 [i915]
[ 63.664969] [<ffffffffa00d99a0>] ? gen6_read8+0x120/0x120 [i915]
[ 63.664985] [<ffffffffa0106f8f>] edp_have_panel_vdd+0x3f/0x50 [i915]
[ 63.665000] [<ffffffffa01074e8>] edp_panel_vdd_off_sync+0x58/0x1c0 [i915]
[ 63.665004] [<ffffffff8108a06c>] ? process_one_work+0x18c/0x560
[ 63.665018] [<ffffffffa0107684>] edp_panel_vdd_work+0x34/0x50 [i915]
[ 63.665022] [<ffffffff8108a0d7>] process_one_work+0x1f7/0x560
[ 63.665026] [<ffffffff8108a06c>] ? process_one_work+0x18c/0x560
[ 63.665031] [<ffffffff8108ae2b>] worker_thread+0x11b/0x3a0
[ 63.665035] [<ffffffff8108ad10>] ? manage_workers.isra.21+0x2a0/0x2a0
[ 63.665039] [<ffffffff810916fc>] kthread+0xfc/0x120
[ 63.665043] [<ffffffff81091600>] ? kthread_create_on_node+0x230/0x230
[ 63.665048] [<ffffffff8169082c>] ret_from_fork+0x7c/0xb0
[ 63.665052] [<ffffffff81091600>] ? kthread_create_on_node+0x230/0x230
[ 63.665054] ---[ end trace 1250bcc890af9999 ]---
[ 63.665060] [drm:edp_panel_vdd_off_sync] Turning eDP VDD off
[ 63.665061] ------------[ cut here ]------------
Testcase: igt/pm_pc8
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 1 Apr 2014 17:55:08 +0000 (14:55 -0300)]
drm/i915: get runtime PM at i915_reg_read_ioctl
To avoid WARNs when we call it.
Testcase: igt/pm_pc8/reg-read-ioctl
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75693
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 1 Apr 2014 17:55:07 +0000 (14:55 -0300)]
drm/i915: don't schedule force_wake_timer at gen6_read
So far force_wake_timer was only used by gen6_gt_force_wake_put. Since
we always had balanced gen6_gt_force_wake_get/put calls, we could
guarantee balanced calls to intel_runtime_pm_get/put.
Commit 8232644ccf099548710843e97360a3fcd6d28e04, "drm/i915: Convert
the forcewake worker into a timer func" started scheduling the
force_wake_timer at gen6_read, which resulted in an unbalanced
runtime_pm refcount.
So this commit just reverts to the old behavior until we can find a
proper way to use delayed force_wake from the register read/write
macros without leaving the runtime_pm refcounts unbalanced and without
runtime suspending the driver while forcewake is active.
Testcase: igt/pm_pc8/rte
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76544
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Imre Deak [Mon, 31 Mar 2014 12:10:44 +0000 (15:10 +0300)]
drm/i915: vlv: reserve the GT power context only once during driver init
Atm we reserve/allocate and free the power context during GT power
enable/disable time. There is no need to do this, we can reserve/allocate
the buffer once during driver loading and free it during driver cleanup.
The re-reservation can also fail in case the driver previously manages to
allocate something on the given fixed address.
The buffer isn't expected to move even if allocated by the BIOS; for
safety, add an assert to check this assumption.
This also fixed a bug for Ville, where re-reserving the context failed
during a GPU reset (I assume because something else got allocated on its
fixed address).
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jani Nikula [Mon, 31 Mar 2014 11:27:21 +0000 (14:27 +0300)]
drm/i915: prefer struct drm_i915_private to drm_i915_private_t
Remove the rest of the references to drm_i915_private_t. No functional
changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop hunk in i915_cmd_parser.c]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jani Nikula [Mon, 31 Mar 2014 11:27:20 +0000 (14:27 +0300)]
drm/i915/overlay: prefer struct drm_i915_private to drm_i915_private_t
No functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jani Nikula [Mon, 31 Mar 2014 11:27:19 +0000 (14:27 +0300)]
drm/i915/ringbuffer: prefer struct drm_i915_private to drm_i915_private_t
No functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jani Nikula [Mon, 31 Mar 2014 11:27:18 +0000 (14:27 +0300)]
drm/i915/display: prefer struct drm_i915_private to drm_i915_private_t
No functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jani Nikula [Mon, 31 Mar 2014 11:27:17 +0000 (14:27 +0300)]
drm/i915/irq: prefer struct drm_i915_private to drm_i915_private_t
Also drop any unnecessary casts. No functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jani Nikula [Mon, 31 Mar 2014 11:27:16 +0000 (14:27 +0300)]
drm/i915/gem: prefer struct drm_i915_private to drm_i915_private_t
No functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jani Nikula [Mon, 31 Mar 2014 11:27:15 +0000 (14:27 +0300)]
drm/i915/dma: prefer struct drm_i915_private to drm_i915_private_t
Also drop any unnecessary casts. No functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jani Nikula [Mon, 31 Mar 2014 11:27:14 +0000 (14:27 +0300)]
drm/i915/debugfs: prefer struct drm_i915_private to drm_i915_private_t
No functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Fri, 28 Mar 2014 08:03:34 +0000 (08:03 +0000)]
drm/i915: Mask PM/RPS interrupt generation based on activity
The speculation is that we can conserve more power by masking off
the interrupts at source (PMINTRMSK) rather than filtering them by the
up/down thresholds (RPINTLIM). We can select which events we know will
be active based on the current frequency versus our imposed range, i.e.
if at minimum, we know we will not want to generate any more
down-interrupts and vice versa.
v2: We only need the TIMEOUT when above min frequency.
v3: Tweak VLV at the same time
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
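The event selection described in "drm/i915: Mask PM/RPS interrupt generation based on activity" can be sketched roughly as below. This is only an illustration of the idea, not the driver's code: the event bit names and the helper rps_wanted_events() are hypothetical and do not reflect the real PMINTRMSK register layout.

```c
#include <assert.h>

/* Hypothetical event bits, for illustration only. */
enum {
	EV_UP           = 1u << 0, /* request to raise the frequency */
	EV_DOWN         = 1u << 1, /* request to lower the frequency */
	EV_DOWN_TIMEOUT = 1u << 2, /* idle timeout, also lowers it */
};

/*
 * Return the set of events worth leaving unmasked for the current
 * frequency, given the imposed [min, max] range.
 */
static unsigned int rps_wanted_events(int cur, int min, int max)
{
	unsigned int events = 0;

	assert(min <= max);

	/* Above the minimum we can still go down (v2: TIMEOUT only here). */
	if (cur > min)
		events |= EV_DOWN | EV_DOWN_TIMEOUT;
	/* Below the maximum we can still go up. */
	if (cur < max)
		events |= EV_UP;

	return events;
}
```

Everything not returned by such a helper would be masked at the source, so at the minimum frequency no down-interrupts are generated at all.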
Chris Wilson [Thu, 27 Mar 2014 08:24:20 +0000 (08:24 +0000)]
drm/i915: Refactor gen6_set_rps
What used to be a short-circuit now needs to adjust interrupt masking in
response to user requests for changing the min/max allowed frequencies.
This is currently done by a special case and early return, but the next
patch adds another common action to take, so refactor the code to reduce
duplication.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Thu, 27 Mar 2014 08:24:19 +0000 (08:24 +0000)]
Revert "drm/i915: Disable/Enable PM Intrrupts based on the current freq."
This reverts commit 2754436913b94626a5414d82f0996489628c513d.
Conflicts:
drivers/gpu/drm/i915/i915_irq.c
The partial application of interrupt masking without regard to other
pathways for adjusting the RPS frequency results in completely disabling
the PM interrupts. This leads to excessive power consumption as the GPU
is kept at max clocks (until the failsafe mechanism fires of explicitly
downclocking the GPU when all requests are idle). Or equally as bad for
the UX, the GPU is kept at minimum clocks and prevented from upclocking
in response to a requirement for more power.
Testcase: pm_rps/blocking
Cc: Deepak S <deepak.s@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ville Syrjälä [Fri, 28 Mar 2014 21:29:32 +0000 (23:29 +0200)]
drm/i915: Make sure vsyncshift is positive
If vsyncshift comes out as negative, add one htotal to it to get the
corresponding positive value.
This is rather theoretical as it would require a mode where the
hsync+back porch is very long and the active+front porch very short.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
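The adjustment described in "drm/i915: Make sure vsyncshift is positive" amounts to a one-line wrap-around. A minimal sketch, assuming plain int values and a hypothetical helper name (the driver applies this inline when programming the timings):

```c
#include <assert.h>

/*
 * Hypothetical helper: if the computed vsyncshift is negative, add one
 * htotal to get the corresponding positive value.
 */
static int normalize_vsyncshift(int vsyncshift, int htotal)
{
	assert(htotal > 0);

	if (vsyncshift < 0)
		vsyncshift += htotal;

	return vsyncshift;
}
```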
Ville Syrjälä [Fri, 28 Mar 2014 21:29:31 +0000 (23:29 +0200)]
drm/i915: Fix the interlace mode selection for gmch platforms
PIPECONF_INTERLACE_W_FIELD_INDICATION is only meant to be used for sdvo
since it implies a slightly weird vsync shift of htotal/2. For everything
else we should use PIPECONF_INTERLACE_W_SYNC_SHIFT and let the value in
the VSYNCSHIFT register take effect.
The only exception is gen3 simply because VSYNCSHIFT didn't exist yet.
Gen2 doesn't support interlaced modes at all, so we can drop the
explicit gen2 checks.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ville Syrjälä [Fri, 28 Mar 2014 21:29:30 +0000 (23:29 +0200)]
drm/i915: Program VSYNCSHIFT in a more consistent manner
When interlaced sdvo output is used, vsyncshift should supposedly
be (htotal-1)/2. In reality PIPECONF/TRANSCONF will override it by
using the legacy vsyncshift interlace mode which causes the hardware
to ignore the VSYNCSHIFT register.
The only odd thing here is that on PCH platforms we program the
VSYNCSHIFT on both CPU and PCH, and it's not entirely clear if both
sides have to agree on the value or not. On the CPU side there's no
way to override the value via PIPECONF anymore, so if we want to make
the CPU side agree with the PCH side, we should probably program the
appropriate value into VSYNCSHIFT manually. So let's do that, but for
now leave the PCH side to still use the legacy interlace mode in
TRANSCONF.
We can also drop the gen2 check since gen2 doesn't support interlaced
modes at all.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jesse Barnes [Thu, 27 Mar 2014 18:56:08 +0000 (11:56 -0700)]
drm/i915/vlv: use W_SYNC_SHIFT for interlaced modes on VLV
This makes HDMI testers happier on VLV platforms. It may be that we
need it for any non-SVO platform, but I don't have any tests to back
that up, so I'm leaving other pre-ILK platforms alone for now.
Tested-by: "Clint Taylor <clinton.a.taylor@intel.com>"
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74964
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Wed, 26 Mar 2014 22:42:53 +0000 (23:42 +0100)]
drm/i915: restrict vt-d stolen memory workaround to pre-gen8
We want future generations to at least attempt to use all features, so
restrict the stolen memory disabling when vt-d is enabled to the
latest generation we have reports for. Which is a HSW per the original
report.
Also, once we get a bit of a hold on some of the mysterious framebuffer in
stolen memory issues that still haunt bugzilla, we should probably
drop this hack again and see what happens.
This was introduced in
commit 0f4706d2740f2a221cd502922b22e522009041d9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Mar 18 14:50:50 2014 +0200
drm/i915: Disable stolen memory when DMAR is active
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
References: https://bugs.freedesktop.org/show_bug.cgi?id=68535
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Mon, 31 Mar 2014 08:40:13 +0000 (10:40 +0200)]
Merge tag 'v3.14' into drm-intel-next-queued
Linux 3.14
The vt-d w/a merged late in 3.14-rc needs a bit of fine-tuning, hence
backmerge.
Conflicts:
drivers/gpu/drm/i915/i915_gem_gtt.c
drivers/gpu/drm/i915/intel_ddi.c
drivers/gpu/drm/i915/intel_dp.c
All trivial adjacent-lines-changed type conflicts, so trivial that git
doesn't even show them in the merge commit.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Linus Torvalds [Mon, 31 Mar 2014 03:40:15 +0000 (20:40 -0700)]
Linux 3.14
Linus Torvalds [Mon, 31 Mar 2014 00:26:08 +0000 (17:26 -0700)]
Merge branch 'for-linus-2' of git://git./linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"Switch mnt_hash to hlist, turning the races between __lookup_mnt() and
hash modifications into false negatives from __lookup_mnt() (instead
of hangs)"
On the false negatives from __lookup_mnt():
"The *only* thing we care about is not getting stuck in __lookup_mnt().
If it misses an entry because something in front of it just got moved
around, etc, we are fine. We'll notice that mount_lock mismatch and
that'll be it"
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
switch mnt_hash to hlist
don't bother with propagate_mnt() unless the target is shared
keep shadowed vfsmounts together
resizable namespace.c hashes
Randy Dunlap [Fri, 28 Mar 2014 16:45:33 +0000 (09:45 -0700)]
MAINTAINERS: resume as Documentation maintainer
I am the new kernel tree Documentation maintainer (except for parts that
are handled by other people, of course).
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Rob Landley <rob@landley.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 31 Mar 2014 00:20:40 +0000 (17:20 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
"Some more updates for the input subsystem.
You will get a fix for race in mousedev that has been causing quite a
few oopses lately and a small fixup for force feedback support in
evdev"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: mousedev - fix race when creating mixed device
Input: don't modify the id of ioctl-provided ff effect on upload failure
Eric Paris [Sun, 30 Mar 2014 23:07:54 +0000 (19:07 -0400)]
AUDIT: Allow login in non-init namespaces
It is possible to configure your PAM stack to refuse login if audit
messages (about the login) were unable to be sent. This is common in
many distros and thus normal configuration of many containers. The PAM
modules determine if audit is enabled/disabled in the kernel based on
the return value from sending an audit message on the netlink socket.
If userspace gets back ECONNREFUSED it believes audit is disabled in the
kernel. If it gets any other error, it refuses to let the login
proceed.
Just about ever since the introduction of namespaces the kernel audit
subsystem has returned EPERM if the task sending a message was not in
the init user or pid namespace. So many forms of containers have never
worked if audit was enabled in the kernel.
BUT if the container was not in net_init then the kernel network code
would send ECONNREFUSED (instead of the audit code sending EPERM). Thus
by pure accident/dumb luck/bug if an admin configured the PAM stack to
reject all logins that didn't talk to audit, but then ran the login
utility in the non-init_net namespace, it would work!! Clearly this was
a bug, but it is a bug some people expected.
With the introduction of network namespace support in 3.14-rc1 the two
bugs stopped cancelling each other out. Now, containers in the
non-init_net namespace refused to let users log in (just like PAM was
configured!) Obviously some people were not happy that what used to let
users log in, now didn't!
This fix is kinda hacky. We return ECONNREFUSED for all non-init
relevant namespaces. That means that not only will the old broken
non-init_net setups continue to work, now the broken non-init_pid or
non-init_user setups will 'work'. They don't really work, since audit
isn't logging things. But it's what most users want.
In 3.15 we should have patches to support not only the non-init_net
(3.14) namespace but also the non-init_pid and non-init_user namespace.
So all will be right in the world. This just opens the doors wide open
on 3.14 and hopefully makes users happy, if not the audit system...
Reported-by: Andre Tomt <andre@tomt.net>
Reported-by: Adam Richter <adam_richter2004@yahoo.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
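The interaction described in "AUDIT: Allow login in non-init namespaces" can be modelled with two small functions. This is a simplified sketch, not the PAM or kernel code: both helper names are hypothetical, errors are represented as positive errno values, and the namespace check is reduced to a single boolean.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum login_decision { LOGIN_ALLOW, LOGIN_DENY };

/*
 * Kernel side after the fix (simplified): a sender outside the init
 * namespaces gets ECONNREFUSED instead of EPERM.
 */
static int audit_send_errno(bool in_init_namespaces)
{
	return in_init_namespaces ? 0 : ECONNREFUSED;
}

/*
 * Userspace side as the commit message describes it: ECONNREFUSED is
 * read as "audit is disabled", any other error refuses the login.
 */
static enum login_decision pam_login_decision(int send_errno)
{
	assert(send_errno >= 0);

	if (send_errno == 0 || send_errno == ECONNREFUSED)
		return LOGIN_ALLOW;

	return LOGIN_DENY;
}
```

With the old EPERM return value the second function would deny the login, which is the behavior containers were hitting.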
Theodore Ts'o [Sun, 30 Mar 2014 14:20:01 +0000 (10:20 -0400)]
ext4: atomically set inode->i_flags in ext4_set_inode_flags()
Use cmpxchg() to atomically set i_flags instead of clearing out the
S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the
EXT4_IMMUTABLE_FL, EXT4_APPEND_FL flags, since this opens up a race
where an immutable file has the immutable flag cleared for a brief
window of time.
Reported-by: John Sullivan <jsrhbz@kanargh.force9.co.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
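The compare-and-exchange loop described in "ext4: atomically set inode->i_flags in ext4_set_inode_flags()" can be sketched in userspace with C11 atomics standing in for the kernel's cmpxchg(). The flag values, the mask and the helper name below are illustrative assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative flag bits; the kernel's S_* values are not assumed here. */
#define FL_SYNC      0x01u
#define FL_APPEND    0x04u
#define FL_IMMUTABLE 0x08u
#define FL_MANAGED   (FL_SYNC | FL_APPEND | FL_IMMUTABLE)

/*
 * Replace only the managed bits of *flags with new_flags in a single
 * atomic step, so no reader ever sees them transiently cleared.
 */
static void set_managed_flags(_Atomic unsigned int *flags,
			      unsigned int new_flags)
{
	unsigned int old, want;

	assert(flags);

	old = atomic_load(flags);
	do {
		want = (old & ~FL_MANAGED) | (new_flags & FL_MANAGED);
		/* On failure, old is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(flags, &old, want));
}
```

The clear-then-set sequence this replaces would briefly expose a value with the immutable bit off; here the old and new values are swapped in one step.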
Al Viro [Fri, 21 Mar 2014 01:10:51 +0000 (21:10 -0400)]
switch mnt_hash to hlist
fixes RCU bug - walking through hlist is safe in face of element moves,
since it's self-terminating. Cyclic lists are not - if we end up jumping
to another hash chain, we'll loop infinitely without ever hitting the
original list head.
[fix for dumb braino folded]
Spotted by: Max Kellermann <mk@cm4all.com>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
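The termination argument in "switch mnt_hash to hlist" can be illustrated with a toy model. The node type and both walkers below are hypothetical and far simpler than the kernel's list_head/hlist_node; the step limit exists only so that the non-terminating case can be observed.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	struct node *next;
};

/*
 * NULL-terminated (hlist-style) walk: it ends on whichever chain the
 * walker happens to be on. Returns 1 if it terminated by itself.
 */
static int walk_hlist(const struct node *pos, int limit)
{
	int steps = 0;

	while (pos && steps < limit) {
		pos = pos->next;
		steps++;
	}
	return pos == NULL;
}

/*
 * Cyclic walk: it ends only on reaching the head it started from.
 * Returns 1 if that head was reached within the step limit.
 */
static int walk_cyclic(const struct node *head, const struct node *pos,
		       int limit)
{
	int steps = 0;

	assert(head);

	while (pos != head && steps < limit) {
		pos = pos->next;
		steps++;
	}
	return pos == head;
}
```

If an element is moved to another cyclic chain while a walker stands on it, the walker never meets its original head again; with a NULL-terminated chain it simply runs off the end and reports a false negative.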
Al Viro [Fri, 21 Mar 2014 14:14:08 +0000 (10:14 -0400)]
don't bother with propagate_mnt() unless the target is shared
If the dest_mnt is not shared, propagate_mnt() does nothing -
there's no mounts to propagate to and thus no copies to create.
Might as well not bother calling it in that case.
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 21 Mar 2014 00:34:43 +0000 (20:34 -0400)]
keep shadowed vfsmounts together
preparation for switching mnt_hash to hlist
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 28 Feb 2014 18:46:44 +0000 (13:46 -0500)]
resizable namespace.c hashes
* switch allocation to alloc_large_system_hash()
* make sizes overridable by boot parameters (mhash_entries=, mphash_entries=)
* switch mountpoint_hashtable from list_head to hlist_head
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Sat, 29 Mar 2014 22:01:09 +0000 (15:01 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"A late breaking fix from John. (The bug fixed has a hard lockup
potential, but that was not observed, warnings were)"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Revert to calling clock_was_set_delayed() while in irq context
Linus Torvalds [Sat, 29 Mar 2014 22:00:27 +0000 (15:00 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull Ceph fix from Sage Weil:
"This drops a bad assert that a few users have been hitting but we've
only recently been able to track down"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: drop an unsafe assertion
Dmitry Torokhov [Thu, 6 Mar 2014 20:57:24 +0000 (12:57 -0800)]
Input: mousedev - fix race when creating mixed device
We should not be using static variable mousedev_mix in methods that can be
called before that singleton gets assigned. While at it let's add open and
close methods to mousedev structure so that we do not need to test if we
are dealing with multiplexor or normal device and simply call appropriate
method directly.
This fixes: https://bugzilla.kernel.org/show_bug.cgi?id=71551
Reported-by: GiulioDP <depasquale.giulio@gmail.com>
Tested-by: GiulioDP <depasquale.giulio@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Elias Vanderstuyft [Sat, 29 Mar 2014 19:08:45 +0000 (12:08 -0700)]
Input: don't modify the id of ioctl-provided ff effect on upload failure
If a new (id == -1) ff effect was uploaded from userspace,
ff-core.c::input_ff_upload() will have assigned a positive number to the
new effect id. Currently, evdev.c::evdev_do_ioctl() will save this new id
to userspace, regardless of whether the upload succeeded or not.
On upload failure, this can be confusing because the dev->ff->effects[]
array will not contain an element at the index of that new effect id.
This patch fixes this by leaving the id unchanged after upload fails.
Note: Unfortunately applications should still expect changed effect id for
quite some time.
This has been discussed on:
http://www.mail-archive.com/linux-input@vger.kernel.org/msg08513.html
("ff-core effect id handling in case of a failed effect upload")
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Elias Vanderstuyft <elias.vds@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
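The behavior described in "Input: don't modify the id of ioctl-provided ff effect on upload failure" can be sketched as follows. The struct, the stand-in upload function and the ioctl helper are all hypothetical; they only model the rule that the caller's id is written back on success and left alone on failure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much reduced stand-in for an ff effect. */
struct effect {
	int id;
};

/*
 * Stand-in for the upload step: assigns an id to a new effect
 * (id == -1) and then succeeds or fails as requested.
 */
static int fake_upload(struct effect *e, int fail)
{
	if (e->id == -1)
		e->id = 3;
	return fail ? -22 : 0;
}

/* Copy the assigned id back to the caller only when the upload worked. */
static int do_upload_ioctl(int *user_id, int fail)
{
	struct effect e;
	int err;

	assert(user_id != NULL);

	e.id = *user_id;
	err = fake_upload(&e, fail);
	if (err)
		return err; /* leave *user_id unchanged */

	*user_id = e.id;
	return 0;
}
```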
Alex Elder [Tue, 25 Mar 2014 13:36:02 +0000 (15:36 +0200)]
rbd: drop an unsafe assertion
Olivier Bonvalet reported having repeated crashes due to a failed
assertion he was hitting in rbd_img_obj_callback():
Assertion failure in rbd_img_obj_callback() at line 2165:
rbd_assert(which >= img_request->next_completion);
With a lot of help from Olivier with reproducing the problem
we were able to determine the object and image requests had
already been completed (and often freed) at the point the
assertion failed.
There was a great deal of discussion on the ceph-devel mailing list
about this. The problem only arose when there were two (or more)
object requests in an image request, and the problem was always
seen when the second request was being completed.
The problem is due to a race in the window between setting the
"done" flag on an object request and checking the image request's
next completion value. When the first object request completes, it
checks to see if its successor request is marked "done", and if
so, that request is also completed. In the process, the image
request's next_completion value is updated to reflect that both
the first and second requests are completed. By the time the
second request is able to check the next_completion value, it
has been set to a value *greater* than its own "which" value,
which caused an assertion to fail.
Fix this problem by skipping over any completion processing
unless the completing object request is the next one expected.
Test only for inequality (not >=), and eliminate the bad
assertion.
Tested-by: Olivier Bonvalet <ob@daevel.fr>
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com>
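The fix described in "rbd: drop an unsafe assertion" can be sketched with a single-threaded model. The struct and function below are hypothetical and omit the locking the real driver needs; they only show the "skip unless this is the next expected request" test that replaces the assertion.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_OBJS 8

/* Hypothetical, simplified image request. */
struct img_request {
	unsigned int next_completion;
	unsigned int obj_count;
	bool done[MAX_OBJS];
};

/*
 * Mark object request "which" as done. Completion processing runs only
 * if it is the next one expected; it then also completes any directly
 * following requests already marked done. Returns how many requests
 * were completed by this call.
 */
static unsigned int complete_obj(struct img_request *img, unsigned int which)
{
	unsigned int completed = 0;

	assert(which < img->obj_count && img->obj_count <= MAX_OBJS);

	img->done[which] = true;

	/* Test only for inequality: someone else handles (or handled) us. */
	if (which != img->next_completion)
		return 0;

	while (which < img->obj_count && img->done[which]) {
		which++;
		completed++;
	}
	img->next_completion = which;

	return completed;
}
```

A second request that arrives after next_completion has already moved past it returns without doing anything, where the old code asserted which >= next_completion and crashed.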
Linus Torvalds [Fri, 28 Mar 2014 22:09:37 +0000 (15:09 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) We've discovered a common error in several networking drivers, they
put VLAN offload features into ->vlan_features, which would suggest
that they support offloading 2 or more levels of VLAN encapsulation.
Not only do these devices not do that, but we don't have the
infrastructure yet to handle that at all.
Fixes from Vlad Yasevich.
2) Fix tcpdump crash with bridging and vlans, also from Vlad.
3) Some MAINTAINERS updates for random32 and bonding.
4) Fix late reseeds of prandom generator, from Sasha Levin.
5) Bridge doesn't handle stacked vlans properly, fix from Toshiaki
Makita.
6) Fix deadlock in openvswitch, from Flavio Leitner.
7) get_timewait4_sock() doesn't report delay times correctly, fix from
Eric Dumazet.
8) Duplicate address detection and addrconf verification need to run in
contexts where RTNL can be obtained. Move them to run from a
workqueue. From Hannes Frederic Sowa.
9) Fix route refcount leaking in ip tunnels, from Pravin B Shelar.
10) Don't return -EINTR from non-blocking recvmsg() on AF_UNIX sockets,
from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (28 commits)
vlan: Warn the user if lowerdev has bad vlan features.
veth: Turn off vlan rx acceleration in vlan_features
ifb: Remove vlan acceleration from vlan_features
qlge: Do not propaged vlan tag offloads to vlans
bridge: Fix crash with vlan filtering and tcpdump
net: Account for all vlan headers in skb_mac_gso_segment
MAINTAINERS: bonding: change email address
MAINTAINERS: bonding: change email address
ipv6: move DAD and addrconf_verify processing to workqueue
tcp: fix get_timewait4_sock() delay computation on 64bit
openvswitch: fix a possible deadlock and lockdep warning
bridge: Fix handling stacked vlan tags
bridge: Fix inabillity to retrieve vlan tags when tx offload is disabled
vhost: validate vhost_get_vq_desc return value
vhost: fix total length when packets are too short
random32: avoid attempt to late reseed if in the middle of seeding
random32: assign to network folks in MAINTAINERS
net/mlx4_core: pass pci_device_id.driver_data to __mlx4_init_one during reset
core, nfqueue, openvswitch: Orphan frags in skb_zerocopy and handle errors
vlan: Set hard_header_len according to available acceleration
...
David S. Miller [Fri, 28 Mar 2014 21:17:16 +0000 (17:17 -0400)]
Merge branch 'vlan_offloads'
Vlad Yasevich says:
====================
Audit all drivers for correct vlan_features.
Some drivers set vlan acceleration features in vlan_features. This causes
issues with Q-in-Q/802.1ad configurations.
Audit all the drivers for correct vlan_features. Fix broken ones.
Add a warning to vlan code to help catch future offenders.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Fri, 28 Mar 2014 02:14:49 +0000 (22:14 -0400)]
vlan: Warn the user if lowerdev has bad vlan features.
Some drivers incorrectly assign vlan acceleration features to
vlan_features thus causing issues for Q-in-Q vlan configurations.
Warn the user of such cases.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Fri, 28 Mar 2014 02:14:48 +0000 (22:14 -0400)]
veth: Turn off vlan rx acceleration in vlan_features
For completeness, turn off vlan rx acceleration in vlan_features so
that it doesn't show up on q-in-q setups.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Fri, 28 Mar 2014 02:14:47 +0000 (22:14 -0400)]
ifb: Remove vlan acceleration from vlan_features
Do not include vlan acceleration features in vlan_features as that
precludes correct Q-in-Q operation.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Fri, 28 Mar 2014 02:14:46 +0000 (22:14 -0400)]
qlge: Do not propaged vlan tag offloads to vlans
qlge driver turns off NETIF_F_HW_CTAG_FILTER, but forgets to
turn off HW_CTAG_TX and HW_CTAG_RX on vlan devices. With the
current settings, q-in-q will only generate a single vlan header.
Remember to mask off CTAG_TX and CTAG_RX features in vlan_features.
CC: Shahed Shaikh <shahed.shaikh@qlogic.com>
CC: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
CC: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Fri, 28 Mar 2014 01:51:18 +0000 (21:51 -0400)]
bridge: Fix crash with vlan filtering and tcpdump
When the vlan filtering is enabled on the bridge, but
the filter is not configured on the bridge device itself,
running tcpdump on the bridge device will result in
an Oops with NULL pointer dereference. The reason
is that br_pass_frame_up() will bypass the vlan
check because promisc flag is set. It will then try
to get the table pointer and process the packet based
on the table. Since the table pointer is NULL, we oops.
Catch this special condition in br_handle_vlan().
Reported-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Thu, 27 Mar 2014 21:26:18 +0000 (17:26 -0400)]
net: Account for all vlan headers in skb_mac_gso_segment
skb_network_protocol() already accounts for multiple vlan
headers that may be present in the skb. However, skb_mac_gso_segment()
doesn't know anything about it and assumes that skb->mac_len
is set correctly to skip all mac headers. That may not
always be the case. If we are simply forwarding the packet (via
bridge or macvtap), all vlan headers may not be accounted for.
A simple solution is to allow skb_network_protocol to return
the vlan depth it has calculated. This way skb_mac_gso_segment
will correctly skip all mac headers.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
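The idea in "net: Account for all vlan headers in skb_mac_gso_segment" (have the protocol lookup also report how deep the vlan stack is) can be sketched on a raw frame buffer. The function below is a hypothetical userspace analogue, not the kernel's skb_network_protocol(); the constants are the standard Ethernet header length, vlan tag length and 802.1Q/802.1ad ethertypes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN     14      /* dst MAC + src MAC + ethertype */
#define VLAN_HLEN    4       /* TCI + encapsulated ethertype */
#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8

/*
 * Return the first non-vlan ethertype of the frame and store in *depth
 * the total length of the MAC header including every vlan tag.
 * Returns 0 if the frame is too short.
 */
static uint16_t network_protocol(const uint8_t *frame, size_t len,
				 size_t *depth)
{
	size_t off = ETH_HLEN;
	uint16_t proto;

	assert(frame && depth);

	if (len < ETH_HLEN)
		return 0;

	proto = (uint16_t)((frame[12] << 8) | frame[13]);
	while (proto == ETH_P_8021Q || proto == ETH_P_8021AD) {
		if (len < off + VLAN_HLEN)
			return 0;
		/* The inner ethertype follows the 2-byte TCI. */
		proto = (uint16_t)((frame[off + 2] << 8) | frame[off + 3]);
		off += VLAN_HLEN;
	}

	*depth = off;
	return proto;
}
```

A caller that needs to skip all MAC headers can then use the returned depth instead of trusting a precomputed header length.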
Veaceslav Falico [Thu, 27 Mar 2014 17:43:50 +0000 (18:43 +0100)]
MAINTAINERS: bonding: change email address
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jay Vosburgh [Thu, 27 Mar 2014 17:33:44 +0000 (10:33 -0700)]
MAINTAINERS: bonding: change email address
Update my email address.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 28 Mar 2014 20:57:13 +0000 (13:57 -0700)]
Merge branch 'akpm' (patches from Andrew Morton)
Merge two fixes from Andrew Morton:
"The x86 fix should come from x86 guys but they appear to be
conferencing or otherwise distracted.
The ocfs2 fix is a bit of a mess - the code runs into an immediate
NULL deref and we're trying to work out how this got through test and
review, but we haven't heard from Goldwyn in the past few days.
Sasha's patch fixes the oops, but the feature as a whole is probably
broken. So this is a stopgap for 3.14 - I'll aim to get the real
fixes into 3.14.x"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
x86: fix boot on uniprocessor systems
ocfs2: check if cluster name exists before deref
Artem Fetishev [Fri, 28 Mar 2014 20:33:39 +0000 (13:33 -0700)]
x86: fix boot on uniprocessor systems
On x86 uniprocessor systems topology_physical_package_id() returns -1
which causes rapl_cpu_prepare() to leave rapl_pmu variable uninitialized
which leads to GPF in rapl_pmu_init().
See arch/x86/kernel/cpu/perf_event_intel_rapl.c.
It turns out that physical_package_id and core_id can actually be
retrieved for uniprocessor systems too. Enabling them also fixes
rapl_pmu code.
Signed-off-by: Artem Fetishev <artem_fetishev@epam.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sasha Levin [Fri, 28 Mar 2014 20:33:38 +0000 (13:33 -0700)]
ocfs2: check if cluster name exists before deref
Commit c74a3bdd9b52 ("ocfs2: add clustername to cluster connection") is
trying to strlcpy a string which was explicitly passed as NULL in the
very same patch, triggering a NULL ptr deref.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: strlcpy (lib/string.c:388 lib/string.c:151)
CPU: 19 PID: 19426 Comm: trinity-c19 Tainted: G W
3.14.0-rc7-next-20140325-sasha-00014-g9476368-dirty #274
RIP: strlcpy (lib/string.c:388 lib/string.c:151)
Call Trace:
ocfs2_cluster_connect (fs/ocfs2/stackglue.c:350)
ocfs2_cluster_connect_agnostic (fs/ocfs2/stackglue.c:396)
user_dlm_register (fs/ocfs2/dlmfs/userdlm.c:679)
dlmfs_mkdir (fs/ocfs2/dlmfs/dlmfs.c:503)
vfs_mkdir (fs/namei.c:3467)
SyS_mkdirat (fs/namei.c:3488 fs/namei.c:3472)
tracesys (arch/x86/kernel/entry_64.S:749)
akpm: this patch probably disables the feature. A temporary thing to
avoid trivial oopses.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hannes Frederic Sowa [Thu, 27 Mar 2014 17:28:07 +0000 (18:28 +0100)]
ipv6: move DAD and addrconf_verify processing to workqueue
addrconf_join_solict and addrconf_join_anycast may cause actions which
need rtnl locked, especially on first address creation.
A new DAD state is introduced which defers processing of the initial
DAD processing into a workqueue.
To get rtnl lock we need to push the code paths which depend on those
calls up to workqueues, specifically addrconf_verify and the DAD
processing.
(v2)
addrconf_dad_failure needs to be queued up to the workqueue, too. This
patch introduces a new DAD state and stops the DAD processing in the
workqueue (this is because of the possible ipv6_del_addr processing
which removes the solicited multicast address from the device).
addrconf_verify_lock is removed, too. After the transition it is not
needed any more.
As we are not processing in bottom half anymore we need to be a bit more
careful about disabling bottom halves when we lock spinlocks which are also
used in bh.
Relevant backtrace:
[ 541.030090] RTNL: assertion failed at net/core/dev.c (4496)
[ 541.031143] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 3.10.33-1-amd64-vyatta #1
[ 541.031145] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 541.031146] ffffffff8148a9f0 000000000000002f ffffffff813c98c1 ffff88007c4451f8
[ 541.031148] 0000000000000000 0000000000000000 ffffffff813d3540 ffff88007fc03d18
[ 541.031150] 0000880000000006 ffff88007c445000 ffffffffa0194160 0000000000000000
[ 541.031152] Call Trace:
[ 541.031153] <IRQ> [<ffffffff8148a9f0>] ? dump_stack+0xd/0x17
[ 541.031180] [<ffffffff813c98c1>] ? __dev_set_promiscuity+0x101/0x180
[ 541.031183] [<ffffffff813d3540>] ? __hw_addr_create_ex+0x60/0xc0
[ 541.031185] [<ffffffff813cfe1a>] ? __dev_set_rx_mode+0xaa/0xc0
[ 541.031189] [<ffffffff813d3a81>] ? __dev_mc_add+0x61/0x90
[ 541.031198] [<ffffffffa01dcf9c>] ? igmp6_group_added+0xfc/0x1a0 [ipv6]
[ 541.031208] [<ffffffff8111237b>] ? kmem_cache_alloc+0xcb/0xd0
[ 541.031212] [<ffffffffa01ddcd7>] ? ipv6_dev_mc_inc+0x267/0x300 [ipv6]
[ 541.031216] [<ffffffffa01c2fae>] ? addrconf_join_solict+0x2e/0x40 [ipv6]
[ 541.031219] [<ffffffffa01ba2e9>] ? ipv6_dev_ac_inc+0x159/0x1f0 [ipv6]
[ 541.031223] [<ffffffffa01c0772>] ? addrconf_join_anycast+0x92/0xa0 [ipv6]
[ 541.031226] [<ffffffffa01c311e>] ? __ipv6_ifa_notify+0x11e/0x1e0 [ipv6]
[ 541.031229] [<ffffffffa01c3213>] ? ipv6_ifa_notify+0x33/0x50 [ipv6]
[ 541.031233] [<ffffffffa01c36c8>] ? addrconf_dad_completed+0x28/0x100 [ipv6]
[ 541.031241] [<ffffffff81075c1d>] ? task_cputime+0x2d/0x50
[ 541.031244] [<ffffffffa01c38d6>] ? addrconf_dad_timer+0x136/0x150 [ipv6]
[ 541.031247] [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6]
[ 541.031255] [<ffffffff8105313a>] ? call_timer_fn.isra.22+0x2a/0x90
[ 541.031258] [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6]
Hunks and backtrace stolen from a patch by Stephen Hemminger.
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 27 Mar 2014 14:19:19 +0000 (07:19 -0700)]
tcp: fix get_timewait4_sock() delay computation on 64bit
It seems I missed one change in get_timewait4_sock() to compute
the remaining time before deletion of IPV4 timewait socket.
This could result in wrong output in /proc/net/tcp for tm->when field.
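As a rough illustration of this kind of 64-bit pitfall, the sketch below
assumes the expiry time is stored truncated to 32 bits while the clock is a
64-bit tick counter. The function and parameter names are hypothetical and
are not taken from the kernel source.

```c
#include <stdint.h>

/* Assumption: the expiry time is stored truncated to 32 bits, while the
 * clock itself is a 64-bit tick counter. */

/* Wrong on 64-bit: the 32-bit expiry is widened before subtracting, so the
 * high bits of the clock end up in the result. */
static int64_t remaining_wrong(uint32_t expiry32, uint64_t now)
{
    return (int64_t)((uint64_t)expiry32 - now);
}

/* Subtract in 32 bits and read the result as signed, so the truncation of
 * the timestamp cancels out. */
static int32_t remaining_ok(uint32_t expiry32, uint64_t now)
{
    return (int32_t)(expiry32 - (uint32_t)now);
}
```

With a clock value just past 2^32, the widened subtraction yields a large
negative number, while the 32-bit subtraction yields the expected small
positive delay.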
Fixes: 96f817fedec4 ("tcp: shrink tcp6_timewait_sock by one cache line")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Flavio Leitner [Thu, 27 Mar 2014 14:05:34 +0000 (11:05 -0300)]
openvswitch: fix a possible deadlock and lockdep warning
There are two problematic situations.
A deadlock can happen when is_percpu is false because it can get
interrupted while holding the spinlock. Then it executes
ovs_flow_stats_update() in softirq context which tries to get
the same lock.
The second situation is that when is_percpu is true, the code
correctly disables BH but only for the local CPU, so the
following can happen when locking the remote CPU without
disabling BH:
CPU#0                            CPU#1
 ovs_flow_stats_get()
  stats_read()
 +->spin_lock remote CPU#1       ovs_flow_stats_get()
 |  <interrupted>                 stats_read()
 |  ...                       +--> spin_lock remote CPU#0
 |                            |    <interrupted>
 |  ovs_flow_stats_update()   |    ...
 |   spin_lock local CPU#0 <--+   ovs_flow_stats_update()
 +---------------------------------- spin_lock local CPU#1
This patch disables BH for both cases fixing the deadlocks.
Acked-by: Jesse Gross <jesse@nicira.com>
=================================
[ INFO: inconsistent lock state ]
3.14.0-rc8-00007-g632b06a #1 Tainted: G I
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/0/0 [HC0[0]:SC1[5]:HE1:SE0] takes:
(&(&cpu_stats->lock)->rlock){+.?...}, at: [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
{SOFTIRQ-ON-W} state was registered at:
[<ffffffff810f973f>] __lock_acquire+0x68f/0x1c40
[<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
[<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
[<ffffffffa05dd9e4>] ovs_flow_stats_get+0xc4/0x1e0 [openvswitch]
[<ffffffffa05da855>] ovs_flow_cmd_fill_info+0x185/0x360 [openvswitch]
[<ffffffffa05daf05>] ovs_flow_cmd_build_info.constprop.27+0x55/0x90 [openvswitch]
[<ffffffffa05db41d>] ovs_flow_cmd_new_or_set+0x4dd/0x570 [openvswitch]
[<ffffffff816c245d>] genl_family_rcv_msg+0x1cd/0x3f0
[<ffffffff816c270e>] genl_rcv_msg+0x8e/0xd0
[<ffffffff816c0239>] netlink_rcv_skb+0xa9/0xc0
[<ffffffff816c0798>] genl_rcv+0x28/0x40
[<ffffffff816bf830>] netlink_unicast+0x100/0x1e0
[<ffffffff816bfc57>] netlink_sendmsg+0x347/0x770
[<ffffffff81668e9c>] sock_sendmsg+0x9c/0xe0
[<ffffffff816692d9>] ___sys_sendmsg+0x3a9/0x3c0
[<ffffffff8166a911>] __sys_sendmsg+0x51/0x90
[<ffffffff8166a962>] SyS_sendmsg+0x12/0x20
[<ffffffff817e3ce9>] system_call_fastpath+0x16/0x1b
irq event stamp: 1740726
hardirqs last enabled at (1740726): [<ffffffff8175d5e0>] ip6_finish_output2+0x4f0/0x840
hardirqs last disabled at (1740725): [<ffffffff8175d59b>] ip6_finish_output2+0x4ab/0x840
softirqs last enabled at (1740674): [<ffffffff8109be12>] _local_bh_enable+0x22/0x50
softirqs last disabled at (1740675): [<ffffffff8109db05>] irq_exit+0xc5/0xd0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&cpu_stats->lock)->rlock);
<Interrupt>
lock(&(&cpu_stats->lock)->rlock);
*** DEADLOCK ***
5 locks held by swapper/0/0:
#0: (((&ifa->dad_timer))){+.-...}, at: [<ffffffff810a7155>] call_timer_fn+0x5/0x320
#1: (rcu_read_lock){.+.+..}, at: [<ffffffff81788a55>] mld_sendpack+0x5/0x4a0
#2: (rcu_read_lock_bh){.+....}, at: [<ffffffff8175d149>] ip6_finish_output2+0x59/0x840
#3: (rcu_read_lock_bh){.+....}, at: [<ffffffff8168ba75>] __dev_queue_xmit+0x5/0x9b0
#4: (rcu_read_lock){.+.+..}, at: [<ffffffffa05e41b5>] internal_dev_xmit+0x5/0x110 [openvswitch]
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G I 3.14.0-rc8-00007-g632b06a #1
Hardware name: /DX58SO, BIOS SOX5810J.86A.5599.2012.0529.2218 05/29/2012
0000000000000000 0fcf20709903df0c ffff88042d603808 ffffffff817cfe3c
ffffffff81c134c0 ffff88042d603858 ffffffff817cb6da 0000000000000005
ffffffff00000001 ffff880400000000 0000000000000006 ffffffff81c134c0
Call Trace:
<IRQ> [<ffffffff817cfe3c>] dump_stack+0x4d/0x66
[<ffffffff817cb6da>] print_usage_bug+0x1f4/0x205
[<ffffffff810f7f10>] ? check_usage_backwards+0x180/0x180
[<ffffffff810f8963>] mark_lock+0x223/0x2b0
[<ffffffff810f96d3>] __lock_acquire+0x623/0x1c40
[<ffffffff810f5707>] ? __lock_is_held+0x57/0x80
[<ffffffffa05e26c6>] ? masked_flow_lookup+0x236/0x250 [openvswitch]
[<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
[<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
[<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
[<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
[<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
[<ffffffffa05dcc64>] ovs_dp_process_received_packet+0x84/0x120 [openvswitch]
[<ffffffff810f93f7>] ? __lock_acquire+0x347/0x1c40
[<ffffffffa05e3bea>] ovs_vport_receive+0x2a/0x30 [openvswitch]
[<ffffffffa05e4218>] internal_dev_xmit+0x68/0x110 [openvswitch]
[<ffffffffa05e41b5>] ? internal_dev_xmit+0x5/0x110 [openvswitch]
[<ffffffff8168b4a6>] dev_hard_start_xmit+0x2e6/0x8b0
[<ffffffff8168be87>] __dev_queue_xmit+0x417/0x9b0
[<ffffffff8168ba75>] ? __dev_queue_xmit+0x5/0x9b0
[<ffffffff8175d5e0>] ? ip6_finish_output2+0x4f0/0x840
[<ffffffff8168c430>] dev_queue_xmit+0x10/0x20
[<ffffffff8175d641>] ip6_finish_output2+0x551/0x840
[<ffffffff8176128a>] ? ip6_finish_output+0x9a/0x220
[<ffffffff8176128a>] ip6_finish_output+0x9a/0x220
[<ffffffff8176145f>] ip6_output+0x4f/0x1f0
[<ffffffff81788c29>] mld_sendpack+0x1d9/0x4a0
[<ffffffff817895b8>] mld_send_initial_cr.part.32+0x88/0xa0
[<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
[<ffffffff8178e301>] ipv6_mc_dad_complete+0x31/0x50
[<ffffffff817690d7>] addrconf_dad_completed+0x147/0x220
[<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
[<ffffffff8176934f>] addrconf_dad_timer+0x19f/0x1c0
[<ffffffff810a71e9>] call_timer_fn+0x99/0x320
[<ffffffff810a7155>] ? call_timer_fn+0x5/0x320
[<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
[<ffffffff810a76c4>] run_timer_softirq+0x254/0x3b0
[<ffffffff8109d47d>] __do_softirq+0x12d/0x480
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita [Thu, 27 Mar 2014 12:46:56 +0000 (21:46 +0900)]
bridge: Fix handling stacked vlan tags
If a bridge with vlan_filtering enabled receives frames with stacked
vlan tags, i.e., they have two vlan tags, br_vlan_untag() strips not
only the outer tag but also the inner tag.
br_vlan_untag() is called only from br_handle_vlan(), and in this case,
it is enough to set skb->vlan_tci to 0 here, because vlan_tci has already
been set before calling br_handle_vlan().
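A minimal model of the idea, assuming the outer tag is held out of band in
vlan_tci and any inner tag still sits in the frame bytes. struct frame_model
and untag_outer() are hypothetical stand-ins, not the kernel's struct sk_buff
or br_vlan_untag().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of interest in struct sk_buff. */
struct frame_model {
    uint16_t vlan_tci;      /* outer tag, held out of band (0 = none here) */
    const uint8_t *data;    /* frame bytes; may begin with an inner tag */
    size_t len;
};

/* Drop only the outer tag: the frame bytes, including any inner tag,
 * are left untouched. */
static void untag_outer(struct frame_model *f)
{
    f->vlan_tci = 0;
}
```

Because the function never looks at the frame bytes, a second tag embedded
there survives the untag step.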
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita [Thu, 27 Mar 2014 12:46:55 +0000 (21:46 +0900)]
bridge: Fix inability to retrieve vlan tags when tx offload is disabled
Bridge vlan code (br_vlan_get_tag()) assumes that all frames have vlan_tci
if they are tagged, but if vlan tx offload is manually disabled on bridge
device and frames are sent from vlan device on the bridge device, the tags
are embedded in skb->data and they break this assumption.
Extract embedded vlan tags and move them to vlan_tci at ingress.
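The extraction step can be sketched in userspace roughly as follows. This is
an illustration under stated assumptions, not the bridge code: struct frame,
pull_embedded_tag() and the constants are hypothetical, and only a single
802.1Q tag (TPID 0x8100) directly after the MAC addresses is handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN     6        /* bytes per MAC address */
#define TAG_LEN     4        /* TPID (2 bytes) + TCI (2 bytes) */
#define TPID_8021Q  0x8100

/* Hypothetical model: raw frame bytes plus an out-of-band tag slot. */
struct frame {
    uint8_t  data[64];
    size_t   len;
    uint16_t vlan_tci;   /* tag control information */
    int      has_tag;    /* nonzero when vlan_tci is valid */
};

/* If the frame carries a tag in its bytes and none out of band, move the
 * tag into vlan_tci and close the 4-byte gap. Returns 1 if a tag was
 * moved, 0 otherwise. */
static int pull_embedded_tag(struct frame *f)
{
    const size_t off = 2 * MAC_LEN;   /* TPID follows dst and src MAC */
    uint16_t tpid;

    /* Need the tag plus the 2-byte EtherType that follows it. */
    if (f->has_tag || f->len < off + TAG_LEN + 2)
        return 0;

    tpid = (uint16_t)(f->data[off] << 8 | f->data[off + 1]);
    if (tpid != TPID_8021Q)
        return 0;

    f->vlan_tci = (uint16_t)(f->data[off + 2] << 8 | f->data[off + 3]);
    f->has_tag = 1;

    memmove(f->data + off, f->data + off + TAG_LEN,
            f->len - off - TAG_LEN);
    f->len -= TAG_LEN;
    return 1;
}
```

After the call, later code can rely on the tag being in vlan_tci regardless
of whether the sender used tx offload.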
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael S. Tsirkin [Thu, 27 Mar 2014 10:53:37 +0000 (12:53 +0200)]
vhost: validate vhost_get_vq_desc return value
vhost fails to validate a negative error code
returned from vhost_get_vq_desc, causing
a crash: we are using -EFAULT, which is 0xfffffff2,
as vector size, which exceeds the allocated size.
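The failure mode and the check can be sketched in userspace as below. The
helper get_desc(), the wrapper collect_head() and VQ_SIZE are hypothetical;
the only point carried over from the description is that a negative errno
must be caught before the value is treated as an unsigned size or index.

```c
#include <errno.h>

#define VQ_SIZE 256   /* hypothetical ring size */

/* Hypothetical stand-in for a helper that returns a descriptor index on
 * success or a negative errno on failure. */
static int get_desc(int fail)
{
    return fail ? -EFAULT : 3;
}

/* Returns the number of heads collected (1), or a negative errno. */
static int collect_head(unsigned int *heads, int fail)
{
    int r = get_desc(fail);

    /* Check for an error before the value is used as an unsigned index;
     * without this, -EFAULT would be read as a huge positive number. */
    if (r < 0)
        return r;
    if ((unsigned int)r >= VQ_SIZE)
        return -EINVAL;

    heads[0] = (unsigned int)r;
    return 1;
}
```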
The code in question was introduced in commit
8dd014adfea6f173c1ef6378f7e5e7924866c923
vhost-net: mergeable buffers support
CVE-2014-0055
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael S. Tsirkin [Thu, 27 Mar 2014 10:00:26 +0000 (12:00 +0200)]
vhost: fix total length when packets are too short
When mergeable buffers are disabled, and the
incoming packet is too large for the rx buffer,
get_rx_bufs returns success.
This was intentional in order to make recvmsg
truncate the packet and then handle_rx would
detect err != sock_len and drop it.
Unfortunately we pass the original sock_len to
recvmsg - which means we use parts of iov not fully
validated.
Fix this up by detecting this overrun and doing packet drop
immediately.
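A simplified sketch of detecting the overrun while gathering buffers, under
the assumption that the caller passes a quota of 1 when mergeable buffers
are disabled. gather_rx_bufs(), MAX_IOV and the sentinel return value are
hypothetical and only loosely modeled on the description above.

```c
#include <stddef.h>

#define MAX_IOV 1024   /* hypothetical cap on buffers per packet */

/* Gather up to `quota` receive buffers to hold `datalen` bytes.
 * Returns the number of buffers used, 0 if the ring ran out of buffers,
 * or MAX_IOV + 1 to signal that the packet does not fit and should be
 * dropped. Assumes quota <= MAX_IOV. */
static int gather_rx_bufs(const size_t *buf_len, int avail, int quota,
                          size_t datalen)
{
    int used = 0;

    while (datalen > 0 && used < quota) {
        size_t n;

        if (used == avail)
            return 0;   /* nothing left in the ring: caller retries later */

        n = buf_len[used] < datalen ? buf_len[used] : datalen;
        datalen -= n;
        used++;
    }

    /* Quota exhausted with data left over: report the overrun instead of
     * returning success. */
    if (datalen > 0)
        return MAX_IOV + 1;

    return used;
}
```

A caller can then compare the result against MAX_IOV and discard the packet
at once, rather than relying on a later length mismatch.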
CVE-2014-0077
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>