Tejun Heo [Mon, 19 Nov 2012 16:13:36 +0000 (08:13 -0800)]
cgroup: make CSS_* flags bit masks instead of bit positions
Currently, CSS_* flags are defined as bit positions and manipulated
using atomic bitops. There's no reason to use atomic bitops for them
and bit positions are clunkier to deal with than bit masks. Make
CSS_* bit masks instead and use the usual C bitwise operators to
access them.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
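To illustrate the difference described above, here is a minimal user-space sketch contrasting the two styles. The DEMO_* names and helper functions are hypothetical and are not the kernel's CSS_* definitions; atomic bitops are left out entirely.

```c
#include <stdbool.h>

/* Hypothetical flag names for illustration; not the kernel's definitions. */

/* Style 1: flags as bit positions, which need a shift at every use. */
enum { DEMO_ROOT_BIT = 0, DEMO_CLEAR_BIT = 1 };

static bool demo_test_pos(unsigned long flags, int bit)
{
	return (flags >> bit) & 1UL;
}

/* Style 2: flags as bit masks, usable directly with |, & and ~. */
enum { DEMO_ROOT = 1 << 0, DEMO_CLEAR = 1 << 1 };

static unsigned long demo_set(unsigned long flags, unsigned long mask)
{
	return flags | mask;
}

static unsigned long demo_clear(unsigned long flags, unsigned long mask)
{
	return flags & ~mask;
}

static bool demo_test(unsigned long flags, unsigned long mask)
{
	return (flags & mask) != 0;
}
```

With masks, several flags can be set or tested in one expression, e.g. demo_set(flags, DEMO_ROOT | DEMO_CLEAR), which is awkward with bit positions.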
Tejun Heo [Mon, 19 Nov 2012 16:13:36 +0000 (08:13 -0800)]
cgroup: cgroup->dentry isn't a RCU pointer
cgroup->dentry is marked and used as a RCU pointer; however, it isn't
one - the final dentry put doesn't go through call_rcu(). cgroup and
dentry share the same RCU freeing rule via synchronize_rcu() in
cgroup_diput() (kfree_rcu() used on cgrp is unnecessary). If cgrp is
accessible under RCU read lock, so is its dentry and dereferencing
cgrp->dentry doesn't need any further RCU protection or annotation.
While not being accurate, before the previous patch, the RCU accessors
served a purpose as memory barriers - cgroup->dentry used to be
assigned after the cgroup was made visible to cgroup_path(), so the
assignment and dereferencing in cgroup_path() needed the memory
barrier pair. Now that list_add_tail_rcu() happens after
cgroup->dentry is assigned, this no longer is necessary.
Remove the now unnecessary and misleading RCU annotations from
cgroup->dentry. To make up for the removal of rcu_dereference_check()
in cgroup_path(), add an explicit rcu_lockdep_assert(), which asserts
the dereference rule of @cgrp, not cgrp->dentry.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Tejun Heo [Mon, 19 Nov 2012 16:13:36 +0000 (08:13 -0800)]
cgroup: create directory before linking while creating a new cgroup
While creating a new cgroup, cgroup_create() links the newly allocated
cgroup into various places before trying to create its directory.
Because cgroup life-cycle is tied to the vfs objects, this makes it
impossible to use cgroup_rmdir() for rolling back creation - the
removal logic depends on having full vfs objects.
This patch moves directory creation above linking and collects linking
operations in one place. This allows directory creation failure to
share error exit path with css allocation failures and any failure
sites afterwards (to be added later) can use cgroup_rmdir() logic to
undo creation.
Note that this also makes the memory barriers around cgroup->dentry,
which currently is misleadingly using RCU operations, unnecessary.
This will be handled in the next patch.
While at it, locking BUG_ON() on i_mutex is converted to
lockdep_assert_held().
v2: Patch originally removed %NULL dentry check in cgroup_path();
however, Li pointed out that this patch doesn't make it
unnecessary as ->create() may call cgroup_path(). Drop the
change for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Tejun Heo [Mon, 19 Nov 2012 16:13:36 +0000 (08:13 -0800)]
cgroup: open-code cgroup_create_dir()
The operation order of cgroup creation is about to change and
cgroup_create_dir() is more of a hindrance than a proper abstraction.
Open-code it by moving the parent nlink adjustment next to self nlink
adjustment in cgroup_create_file() and the rest to cgroup_create().
This patch doesn't introduce any behavior change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Tejun Heo [Mon, 19 Nov 2012 16:13:35 +0000 (08:13 -0800)]
cgroup: initialize cgrp->allcg_node in init_cgroup_housekeeping()
Not strictly necessary but it's annoying to have uninitialized
list_head around.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Tejun Heo [Mon, 19 Nov 2012 16:13:35 +0000 (08:13 -0800)]
cgroup: remove incorrect dget/dput() pair in cgroup_create_dir()
cgroup_create_dir() does weird dancing with dentry refcnt. On
success, it gets and then puts it achieving nothing. On failure, it
puts but there is no matching get anywhere, leading to the following
oops if cgroup_create_file() fails for whatever reason.
------------[ cut here ]------------
kernel BUG at /work/os/work/fs/dcache.c:552!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in:
CPU 2
Pid: 697, comm: mkdir Not tainted 3.7.0-rc4-work+ #3 Bochs Bochs
RIP: 0010:[<ffffffff811d9c0c>] [<ffffffff811d9c0c>] dput+0x1dc/0x1e0
RSP: 0018:ffff88001a3ebef8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88000e5b1ef8 RCX: 0000000000000403
RDX: 0000000000000303 RSI: 2000000000000000 RDI: ffff88000e5b1f58
RBP: ffff88001a3ebf18 R08: ffffffff82c76960 R09: 0000000000000001
R10: ffff880015022080 R11: ffd9bed70f48a041 R12: 00000000ffffffea
R13: 0000000000000001 R14: ffff88000e5b1f58 R15: 00007fff57656d60
FS: 00007ff05fcb3800(0000) GS:ffff88001fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004046f0 CR3: 000000001315f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mkdir (pid: 697, threadinfo ffff88001a3ea000, task ffff880015022080)
Stack:
ffff88001a3ebf48 00000000ffffffea 0000000000000001 0000000000000000
ffff88001a3ebf38 ffffffff811cc889 0000000000000001 ffff88000e5b1ef8
ffff88001a3ebf68 ffffffff811d1fc9 ffff8800198d7f18 ffff880019106ef8
Call Trace:
[<ffffffff811cc889>] done_path_create+0x19/0x50
[<ffffffff811d1fc9>] sys_mkdirat+0x59/0x80
[<ffffffff811d2009>] sys_mkdir+0x19/0x20
[<ffffffff81be1e02>] system_call_fastpath+0x16/0x1b
Code: 00 48 8d 90 18 01 00 00 48 89 93 c0 00 00 00 4c 89 a0 18 01 00 00 48 8b 83 a0 00 00 00 83 80 28 01 00 00 01 e8 e6 6f a0 00 eb 92 <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41
RIP [<ffffffff811d9c0c>] dput+0x1dc/0x1e0
RSP <ffff88001a3ebef8>
---[ end trace 1277bcfd9561ddb0 ]---
Fix it by dropping the unnecessary dget/dput() pair.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: stable@vger.kernel.org
Tejun Heo [Fri, 9 Nov 2012 17:12:30 +0000 (09:12 -0800)]
cgroup_freezer: implement proper hierarchy support
Up until now, cgroup_freezer didn't implement hierarchy properly.
cgroups could be arranged in hierarchy but it didn't make any
difference in how each cgroup_freezer behaved. They all operated
separately.
This patch implements proper hierarchy support. If a cgroup is
frozen, all its descendants are frozen. A cgroup is thawed iff it and
all its ancestors are THAWED. freezer.self_freezing shows the current
freezing state for the cgroup itself. freezer.parent_freezing shows
whether the cgroup is freezing because any of its ancestors is
freezing.
freezer_post_create() locks the parent and new cgroup and inherits the
parent's state and freezer_change_state() applies new state top-down
using cgroup_for_each_descendant_pre() which guarantees that no child
can escape its parent's state. update_if_frozen() uses
cgroup_for_each_descendant_post() to propagate frozen states
bottom-up.
Synchronization could be coarser and simpler with a single mutex
protecting all hierarchy operations. A finer-grained approach was used
because it wasn't too difficult for cgroup_freezer, and I think it's
beneficial to have an example implementation; cgroup_freezer is
rather simple and can serve as a good one.
As this makes cgroup_freezer properly hierarchical,
freezer_subsys.broken_hierarchy marking is removed.
Note that this patch changes userland visible behavior - freezing a
cgroup now freezes all its descendants too. This behavior change is
intended and has been warned via .broken_hierarchy.
v2: Michal spotted a bug in freezer_change_state() - descendants were
inheriting from the wrong ancestor. Fixed.
v3: Documentation/cgroups/freezer-subsystem.txt updated.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
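The propagation rules above can be sketched in plain user-space C. This is only an illustration under stated assumptions: struct demo_node, the DEMO_* flags and all demo_* functions are hypothetical, there is no locking or RCU, and recursion stands in for cgroup_for_each_descendant_pre().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags mirroring the self/parent split described above. */
enum {
	DEMO_FREEZING_SELF   = 1 << 0,
	DEMO_FREEZING_PARENT = 1 << 1,
	DEMO_FREEZING        = DEMO_FREEZING_SELF | DEMO_FREEZING_PARENT,
};

struct demo_node {
	struct demo_node *parent;
	struct demo_node *first_child;
	struct demo_node *next_sibling;
	unsigned int state;
};

/* Link a child and let it inherit the parent's freezing state. */
static void demo_add_child(struct demo_node *parent, struct demo_node *child)
{
	child->parent = parent;
	child->next_sibling = parent->first_child;
	parent->first_child = child;
	if (parent->state & DEMO_FREEZING)
		child->state |= DEMO_FREEZING_PARENT;
}

/* Top-down walk: each node is updated before its own children. */
static void demo_propagate(struct demo_node *node)
{
	for (struct demo_node *c = node->first_child; c; c = c->next_sibling) {
		if (node->state & DEMO_FREEZING)
			c->state |= DEMO_FREEZING_PARENT;
		else
			c->state &= ~(unsigned int)DEMO_FREEZING_PARENT;
		demo_propagate(c);
	}
}

/* Set or clear the node's own request, then push the result down. */
static void demo_change_state(struct demo_node *node, bool freeze)
{
	if (freeze)
		node->state |= DEMO_FREEZING_SELF;
	else
		node->state &= ~(unsigned int)DEMO_FREEZING_SELF;
	demo_propagate(node);
}

/* A node is thawed only if neither it nor any ancestor is freezing. */
static bool demo_is_thawed(const struct demo_node *node)
{
	return !(node->state & DEMO_FREEZING);
}
```

Because each child looks only at its immediate parent's combined state, thawing an ancestor does not thaw a subtree whose own root still has its self flag set.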
Tejun Heo [Fri, 9 Nov 2012 17:12:30 +0000 (09:12 -0800)]
cgroup_freezer: add ->post_create() and ->pre_destroy() and track online state
A cgroup is online and visible to iteration between ->post_create()
and ->pre_destroy(). This patch introduces CGROUP_FREEZER_ONLINE and
toggles it from the newly added freezer_post_create() and
freezer_pre_destroy() while holding freezer->lock such that a
cgroup_freezer can be reliably distinguished to be online. This will
be used by full hierarchy support.
ONLINE test is added to freezer_apply_state() but it currently doesn't
make any difference as freezer_write() can only be called for an
online cgroup.
Adjusting system_freezing_cnt on destruction is moved from
freezer_destroy() to the new freezer_pre_destroy() for consistency.
This patch doesn't introduce any noticeable behavior change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Tejun Heo [Fri, 9 Nov 2012 17:12:30 +0000 (09:12 -0800)]
cgroup_freezer: introduce CGROUP_FREEZING_[SELF|PARENT]
Introduce FREEZING_SELF and FREEZING_PARENT and make FREEZING OR of
the two flags. This is to prepare for full hierarchy support.
freezer_apply_state() is updated such that it can handle setting and
clearing of both flags. The two flags are also exposed to userland
via read-only files self_freezing and parent_freezing.
Other than the added cgroupfs files, this patch doesn't introduce any
behavior change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Tejun Heo [Fri, 9 Nov 2012 17:12:30 +0000 (09:12 -0800)]
cgroup_freezer: make freezer->state mask of flags
freezer->state was an enum value - one of THAWED, FREEZING and FROZEN.
As the scheduled full hierarchy support requires more than one
freezing condition, switch it to mask of flags. If FREEZING is not
set, it's thawed. FREEZING is set if freezing or frozen. If frozen,
both FREEZING and FROZEN are set. Now that tasks can be attached to
an already frozen cgroup, this also makes freezing condition checks
more natural.
This patch doesn't introduce any behavior change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
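As a rough illustration of the flag encoding described above, the sketch below maps a state mask back to the three names. The DEMO_* values and demo_state_str() are hypothetical; only the relationships between the flags are taken from the commit message.

```c
/* Hypothetical values; only the relationships come from the text above. */
enum {
	DEMO_FREEZING = 1 << 0, /* set while freezing or frozen */
	DEMO_FROZEN   = 1 << 1, /* set in addition once fully frozen */
};

/* Translate a state mask into the name of the old enum value. */
static const char *demo_state_str(unsigned int state)
{
	if (state & DEMO_FROZEN)
		return "FROZEN";
	if (state & DEMO_FREEZING)
		return "FREEZING";
	return "THAWED";
}
```

A "should tasks here be frozen?" check then reduces to testing the single FREEZING bit, whether or not the cgroup has already reached FROZEN.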
Tejun Heo [Fri, 9 Nov 2012 17:12:30 +0000 (09:12 -0800)]
cgroup_freezer: prepare freezer_change_state() for full hierarchy support
* Make freezer_change_state() take bool @freeze instead of enum
freezer_state.
* Separate freezer_apply_state() out of freezer_change_state().
This makes freezer_change_state() a rather silly thin wrapper. It
will be filled with hierarchy handling later on.
This patch doesn't introduce any behavior change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Tejun Heo [Fri, 9 Nov 2012 17:12:29 +0000 (09:12 -0800)]
cgroup_freezer: trivial cleanups
* Clean-up indentation and line-breaks. Drop the invalid comment
about freezer->lock.
* Make all internal functions take @freezer instead of both @cgroup
and @freezer.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Tejun Heo [Fri, 9 Nov 2012 17:12:29 +0000 (09:12 -0800)]
cgroup: implement generic child / descendant walk macros
Currently, cgroup doesn't provide any generic helper for walking a
given cgroup's children or descendants. This patch adds the following
three macros.
* cgroup_for_each_child() - walk immediate children of a cgroup.
* cgroup_for_each_descendant_pre() - visit all descendants of a cgroup
in pre-order tree traversal.
* cgroup_for_each_descendant_post() - visit all descendants of a
cgroup in post-order tree traversal.
All three only require the user to hold RCU read lock during
traversal. Verifying that each iterated cgroup is online is the
responsibility of the user. When used with proper synchronization,
cgroup_for_each_descendant_pre() can be used to propagate state
updates to descendants in a reliable way. See comments for details.
v2: s/config/state/ in commit message and comments per Michal. More
documentation on synchronization rules.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Li Zefan <lizefan@huawei.com>
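For readers unfamiliar with the two traversal orders, here is a small user-space sketch. It is not the kernel macros: struct demo_cg and the demo_walk_* functions are hypothetical, recursion replaces the iterator macros, and no RCU protection is modelled. The caller must supply an output array large enough for all descendants.

```c
#include <stddef.h>

/* Hypothetical tree node; the kernel uses list_heads under RCU instead. */
struct demo_cg {
	int id;
	struct demo_cg *first_child;
	struct demo_cg *next_sibling;
};

/* Pre-order: record each descendant before its own children. */
static size_t demo_walk_pre(const struct demo_cg *root, int *out, size_t n)
{
	for (const struct demo_cg *c = root->first_child; c; c = c->next_sibling) {
		out[n++] = c->id;
		n = demo_walk_pre(c, out, n);
	}
	return n;
}

/* Post-order: record each descendant after all of its children. */
static size_t demo_walk_post(const struct demo_cg *root, int *out, size_t n)
{
	for (const struct demo_cg *c = root->first_child; c; c = c->next_sibling) {
		n = demo_walk_post(c, out, n);
		out[n++] = c->id;
	}
	return n;
}
```

Pre-order suits pushing state down from ancestors, since a parent is always seen before its children; post-order suits collecting state upward, since all children are seen before their parent.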
Tejun Heo [Fri, 9 Nov 2012 17:12:29 +0000 (09:12 -0800)]
cgroup: use rculist ops for cgroup->children
Use RCU safe list operations for cgroup->children. This will be used
to implement cgroup children / descendant walking which can be used by
controllers.
Note that cgroup_create() now puts a new cgroup at the end of the
->children list instead of head. This isn't strictly necessary but is
done so that the iteration order is more conventional.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Tejun Heo [Fri, 9 Nov 2012 17:12:29 +0000 (09:12 -0800)]
cgroup: add cgroup_subsys->post_create()
Currently, there's no way for a controller to find out whether a new
cgroup finished all ->create() allocations successfully and is
considered "live" by cgroup.
This becomes a problem later when we add generic descendants walking
to cgroup which can be used by controllers as controllers don't have a
synchronization point where they can synchronize against new cgroups
appearing in such walks.
This patch adds ->post_create(). It's called after all ->create()
succeeded and the cgroup is linked into the generic cgroup hierarchy.
This plays the counterpart of ->pre_destroy().
When used in combination with the to-be-added generic descendant
iterators, ->post_create() can be used to implement reliable state
inheritance. It will be explained with the descendant iterators.
v2: Added a paragraph about its future use w/ descendant iterators per
Michal.
v3: Forgot to add ->post_create() invocation to cgroup_load_subsys().
Fixed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Glauber Costa <glommer@parallels.com>
Tao Ma [Thu, 8 Nov 2012 13:36:38 +0000 (21:36 +0800)]
cgroup: set 'start' with the right value in cgroup_path.
'start' is set to buf + buflen and then decremented immediately.
Just set it to 'buf + buflen - 1' directly.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
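The pattern being tidied up is a path built backwards from the end of a buffer. A minimal sketch of that pattern follows; demo_build_path() and its parameters are hypothetical and this is not the kernel's cgroup_path() implementation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: builds "/parent/child" backwards into buf from
 * an array of names ordered leaf first. Returns a pointer to the start
 * of the string inside buf, or NULL if buf is too small.
 */
static char *demo_build_path(char *buf, size_t buflen,
			     const char *const *names, size_t count)
{
	char *start;

	if (buflen == 0)
		return NULL;

	/* Point at the last byte directly instead of buf + buflen, then --. */
	start = buf + buflen - 1;
	*start = '\0';

	for (size_t i = 0; i < count; i++) {
		size_t len = strlen(names[i]);

		/* Need room for the name plus its leading '/'. */
		if ((size_t)(start - buf) < len + 1)
			return NULL;
		start -= len;
		memcpy(start, names[i], len);
		*--start = '/';
	}
	return start;
}
```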
Tejun Heo [Tue, 6 Nov 2012 17:16:53 +0000 (09:16 -0800)]
device_cgroup: add lockdep asserts
device_cgroup uses RCU safe ->exceptions list which is write-protected
by devcgroup_mutex and has had some issues using locking correctly.
Add lockdep asserts to utility functions so that future errors can be
easily detected.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Tejun Heo [Tue, 6 Nov 2012 20:26:23 +0000 (12:26 -0800)]
Merge branch 'cgroup/for-3.7-fixes' into cgroup/for-3.8
This is to receive device_cgroup fixes so that further device_cgroup
changes can be made in cgroup/for-3.8.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Tue, 6 Nov 2012 17:17:37 +0000 (09:17 -0800)]
device_cgroup: fix RCU usage
dev_cgroup->exceptions is protected with devcgroup_mutex for writes
and RCU for reads; however, RCU usage isn't correct.
* dev_exception_clean() doesn't use RCU variant of list_del() and
kfree(). The function can race with may_access() and may_access()
may end up dereferencing already freed memory. Use list_del_rcu()
and kfree_rcu() instead.
* may_access() may be called only with RCU read locked but doesn't use
RCU safe traversal over ->exceptions. Use list_for_each_entry_rcu().
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Cc: stable@vger.kernel.org
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Aristeu Rozanski [Tue, 6 Nov 2012 15:25:04 +0000 (07:25 -0800)]
device_cgroup: fix unchecked cgroup parent usage
In 4cef7299b478687 ("device_cgroup: add proper checking when changing
default behavior") the cgroup parent usage is unchecked. root will not
have a parent and trying to use device.{allow,deny} will cause problems.
For some reason my stressing scripts didn't test the root directory so I
didn't catch it on my regular tests.
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: James Morris <jmorris@namei.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Mon, 5 Nov 2012 17:21:51 +0000 (09:21 -0800)]
Merge branch 'cgroup-rmdir-updates' into cgroup/for-3.8
Pull rmdir updates into for-3.8 so that further callback updates can
be put on top. This pull created a trivial conflict between the
following two commits.
8c7f6edbda ("cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them")
ed95779340 ("cgroup: kill cgroup_subsys->__DEPRECATED_clear_css_refs")
The former added a field to cgroup_subsys and the latter removed one
from it. They happen to be colocated causing the conflict. Keeping
what's added and removing what's removed resolves the conflict.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Mon, 5 Nov 2012 17:16:59 +0000 (09:16 -0800)]
cgroup: make ->pre_destroy() return void
All ->pre_destroy() implementations return 0 now, which is the only
allowed return value. Make it return void.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Michal Hocko [Fri, 26 Oct 2012 11:37:33 +0000 (13:37 +0200)]
hugetlb: do not fail in hugetlb_cgroup_pre_destroy
Now that pre_destroy callbacks are called from a context where no task
can attach to the group and no child group can be added, there is no
remaining way for hugetlb_pre_destroy to fail.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Glauber Costa <glommer@parallels.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Michal Hocko [Fri, 26 Oct 2012 11:37:32 +0000 (13:37 +0200)]
memcg: make mem_cgroup_reparent_charges non failing
Now that pre_destroy callbacks are called from a context where no task
can attach to the group and no child group can be added, there is no
remaining way for mem_cgroup_pre_destroy to fail.
mem_cgroup_pre_destroy doesn't have to take a reference to memcg's css
because all css' are marked dead already.
tj: Remove now unused local variable @cgrp from
mem_cgroup_reparent_charges().
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Glauber Costa <glommer@parallels.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Mon, 5 Nov 2012 17:16:59 +0000 (09:16 -0800)]
cgroup: remove CGRP_WAIT_ON_RMDIR, cgroup_exclude_rmdir() and cgroup_release_and_wakeup_rmdir()
CGRP_WAIT_ON_RMDIR is another kludge which was added to make cgroup
destruction rollback somewhat working. cgroup_rmdir() used to drain
CSS references and CGRP_WAIT_ON_RMDIR and the associated waitqueue and
helpers were used to allow the task performing rmdir to wait for the
next relevant event.
Unfortunately, the wait is visible to controllers too and the
mechanism got exposed to memcg by 887032670d ("cgroup avoid permanent
sleep at rmdir").
Now that the draining and retries are gone, CGRP_WAIT_ON_RMDIR is
unnecessary. Remove it and all the mechanisms supporting it. Note
that memcontrol.c changes are essentially a revert of 887032670d
("cgroup avoid permanent sleep at rmdir").
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Tejun Heo [Mon, 5 Nov 2012 17:16:59 +0000 (09:16 -0800)]
cgroup: deactivate CSS's and mark cgroup dead before invoking ->pre_destroy()
Because ->pre_destroy() could fail and can't be called under
cgroup_mutex, cgroup destruction did something very ugly.
1. Grab cgroup_mutex and verify it can be destroyed; fail otherwise.
2. Release cgroup_mutex and call ->pre_destroy().
3. Re-grab cgroup_mutex and verify it can still be destroyed; fail
otherwise.
4. Continue destroying.
In addition to being ugly, it has always been broken in various ways.
For example, memcg ->pre_destroy() expects the cgroup to be inactive
after it's done but tasks can be attached and detached between #2 and
#3 and the conditions that memcg verified in ->pre_destroy() might no
longer hold by the time control reaches #3.
Now that ->pre_destroy() is no longer allowed to fail, we can switch
to the following.
1. Grab cgroup_mutex and verify it can be destroyed; fail otherwise.
2. Deactivate CSS's and mark the cgroup removed thus preventing any
further operations which can invalidate the verification from #1.
3. Release cgroup_mutex and call ->pre_destroy().
4. Re-grab cgroup_mutex and continue destroying.
After this change, controllers can safely assume that ->pre_destroy()
will be called only once for a given cgroup and, once
->pre_destroy() is called, the cgroup will stay dormant till it's
destroyed.
This removes the only reason ->pre_destroy() can fail - new task being
attached or child cgroup being created in between. Error out path is
removed and ->pre_destroy() invocation is open coded in
cgroup_rmdir().
v2: cgroup_call_pre_destroy() removal moved to this patch per Michal.
Commit message updated per Glauber.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Glauber Costa <glommer@parallels.com>
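The new four-step ordering can be sketched in user space as follows. Everything here is a simplifying assumption: struct demo_group, demo_mutex and the demo_* functions are hypothetical, a pthread mutex stands in for cgroup_mutex, and a counter stands in for the ->pre_destroy() callbacks.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for cgroup_mutex. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

struct demo_group {
	int nr_tasks;
	int nr_children;
	bool removed;
	int pre_destroy_calls; /* counts simulated ->pre_destroy() calls */
};

/* Attaching fails once the group has been marked removed. */
static int demo_attach_task(struct demo_group *g)
{
	int ret = 0;

	pthread_mutex_lock(&demo_mutex);
	if (g->removed)
		ret = -ENODEV;
	else
		g->nr_tasks++;
	pthread_mutex_unlock(&demo_mutex);
	return ret;
}

static int demo_rmdir(struct demo_group *g)
{
	int ret = 0;

	/* Steps 1 and 2: verify, then mark removed under the same lock. */
	pthread_mutex_lock(&demo_mutex);
	if (g->removed)
		ret = -ENODEV;
	else if (g->nr_tasks || g->nr_children)
		ret = -EBUSY;
	else
		g->removed = true;
	pthread_mutex_unlock(&demo_mutex);
	if (ret)
		return ret;

	/* Step 3: callback runs unlocked; nothing can re-populate the group. */
	g->pre_destroy_calls++;

	/* Step 4: re-grab the lock and finish; no re-verification needed. */
	pthread_mutex_lock(&demo_mutex);
	pthread_mutex_unlock(&demo_mutex);
	return 0;
}
```

Since the removed mark is set before the lock is dropped, the check made in step 1 cannot be invalidated while the callback runs, which is why the callback is invoked at most once per group.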
Tejun Heo [Mon, 5 Nov 2012 17:16:59 +0000 (09:16 -0800)]
cgroup: use cgroup_lock_live_group(parent) in cgroup_create()
This patch makes cgroup_create() fail if @parent is marked removed.
This is to prepare for further updates to cgroup_rmdir() path.
Note that this change isn't strictly necessary. cgroup can only be
created via mkdir and the removed marking and dentry removal happen
without releasing cgroup_mutex, so cgroup_create() can never race with
cgroup_rmdir(). Even after the scheduled updates to cgroup_rmdir(),
cgroup_mkdir() and cgroup_rmdir() are synchronized by i_mutex
rendering the added liveliness check unnecessary.
Do it anyway such that locking is contained inside cgroup proper and
we don't get nasty surprises if we ever grow another caller of
cgroup_create().
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Tejun Heo [Mon, 5 Nov 2012 17:16:58 +0000 (09:16 -0800)]
cgroup: kill CSS_REMOVED
CSS_REMOVED is one of the several contortions which were necessary to
support css reference draining on cgroup removal. All css->refcnts
which need draining should be deactivated and verified to equal zero
atomically w.r.t. css_tryget(). If any one isn't zero, all refcnts
needed to be re-activated and css_tryget() shouldn't fail in the
process.
This was achieved by letting css_tryget() busy-loop until either the
refcnt is reactivated (failed removal attempt) or CSS_REMOVED is set
(committing to removal).
Now that css refcnt draining is no longer used, there's no need for
atomic rollback mechanism. css_tryget() simply can look at the
reference count and fail if it's deactivated - it's never getting
re-activated.
This patch removes CSS_REMOVED and updates __css_tryget() to fail if
the refcnt is deactivated. As deactivation and removal are a single
step now, they no longer need to be protected against css_tryget()
happening from irq context. Remove local_irq_disable/enable() from
cgroup_rmdir().
Note that this removes css_is_removed() whose only user is VM_BUG_ON()
in memcontrol.c. We can replace it with a check on the refcnt but
given that the only use case is a debug assert, I think it's better to
simply unexport it.
v2: Comment updated and explanation on local_irq_disable/enable()
added per Michal Hocko.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
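The simplified tryget behaviour can be sketched with C11 atomics. This is an illustration under assumptions, not the kernel code: struct demo_css and the demo_* functions are hypothetical, deactivation is modelled by adding a large negative bias (DEMO_DEACT_BIAS) so that the counter goes negative, and counter overflow is ignored.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed bias: adding it makes any small positive count negative. */
#define DEMO_DEACT_BIAS INT_MIN

struct demo_css {
	atomic_int refcnt;
};

/* Start with the base reference held. */
static void demo_css_init(struct demo_css *css)
{
	atomic_init(&css->refcnt, 1);
}

/* Take a reference unless the counter has been deactivated. */
static bool demo_css_tryget(struct demo_css *css)
{
	int v = atomic_load(&css->refcnt);

	while (v >= 0) {
		/* On failure v is reloaded with the current value; retry. */
		if (atomic_compare_exchange_weak(&css->refcnt, &v, v + 1))
			return true;
	}
	/* Deactivated: fail right away, there is no rollback to wait for. */
	return false;
}

/* One-way deactivation; the counter is never re-activated. */
static void demo_css_deactivate(struct demo_css *css)
{
	atomic_fetch_add(&css->refcnt, DEMO_DEACT_BIAS);
}
```

The loop only retries when another thread changed the counter concurrently; it never spins waiting for a flag, which is the busy-wait the commit removes.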
Tejun Heo [Mon, 5 Nov 2012 17:16:58 +0000 (09:16 -0800)]
cgroup: kill cgroup_subsys->__DEPRECATED_clear_css_refs
2ef37d3fe4 ("memcg: Simplify mem_cgroup_force_empty_list error
handling") removed the last user of __DEPRECATED_clear_css_refs. This
patch removes __DEPRECATED_clear_css_refs and mechanisms to support
it.
* Conditionals dependent on __DEPRECATED_clear_css_refs removed.
* cgroup_clear_css_refs() can no longer fail. All that needs to be
done are deactivating refcnts, setting CSS_REMOVED and putting the
base reference on each css. Remove cgroup_clear_css_refs() and the
failure path, and open-code the loops into cgroup_rmdir().
This patch keeps the two for_each_subsys() loops separate while open
coding them. They can be merged now but there are scheduled changes
which need them to be separate, so keep them separate to reduce the
amount of churn.
local_irq_save/restore() from cgroup_clear_css_refs() are replaced
with local_irq_disable/enable() for simplicity. This is safe as
cgroup_rmdir() is always called with IRQ enabled. Note that this IRQ
switching is necessary to ensure that css_tryget() isn't called from
IRQ context on the same CPU while lower context is between CSS
deactivation and setting CSS_REMOVED as css_tryget() would hang
forever in such cases waiting for CSS to be re-activated or
CSS_REMOVED set. This will go away soon.
v2: cgroup_call_pre_destroy() removal dropped per Michal. Commit
message updated to explain local_irq_disable/enable() conversion.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Linus Torvalds [Sun, 4 Nov 2012 19:07:39 +0000 (11:07 -0800)]
Linux 3.7-rc4
Linus Torvalds [Sat, 3 Nov 2012 22:27:21 +0000 (15:27 -0700)]
Merge tag 'nfs-for-3.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Fix a bunch of deadlock situations:
* State recovery can deadlock if we fail to release sequence ids
before scheduling the recovery thread.
* Calling deactivate_super() from an RPC workqueue thread can
deadlock because of the call to rpc_shutdown_client.
- Display the device name correctly in /proc/*/mounts
- Fix a number of incorrect error return values:
* When NFSv3 mounts fail due to a timeout.
* On NFSv4.1 backchannel setup failure
* On NFSv4 open access checks
- pnfs_find_alloc_layout() must check the layout pointer for NULL
- Fix a regression in the legacy DNS resolver
* tag 'nfs-for-3.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS4: nfs4_opendata_access should return errno
NFSv4: Initialise the NFSv4.1 slot table highest_used_slotid correctly
SUNRPC: return proper errno from backchannel_rqst
NFS: add nfs_sb_deactive_async to avoid deadlock
nfs: Show original device name verbatim in /proc/*/mount{s,info}
nfsv3: Make v3 mounts fail with ETIMEDOUTs instead EIO on mountd timeouts
nfs: Check whether a layout pointer is NULL before free it
NFS: fix bug in legacy DNS resolver.
NFSv4: nfs4_locku_done must release the sequence id
NFSv4.1: We must release the sequence id when we fail to get a session slot
NFS: Wait for session recovery to finish before returning
Linus Torvalds [Sat, 3 Nov 2012 22:25:14 +0000 (15:25 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/rzhang/linux
Pull thermal management & ACPI update from Zhang Rui,
Ho humm. Normally these things go through Len. But it's just three
small fixes, I guess I can pull directly too.
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
exynos4_tmu_driver_ids should be exynos_tmu_driver_ids.
ACPI video: Ignore errors after _DOD evaluation.
thermal: solve compilation errors in rcar_thermal
Linus Torvalds [Sat, 3 Nov 2012 22:14:54 +0000 (15:14 -0700)]
Merge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux
Pull i2c embedded fixes from Wolfram Sang:
"Two patches are usual stuff.
The bigger patch is needed to correct a wrong decision made in this
merge window. We hoped to get the PIOQUEUE mode in the mxs driver
working with DMA, but it turned out to be too broken (leading to data
loss), so we now think it is best to remove it entirely and work only
with DMA now. The patch should be in 3.7, IMO, so users never get
the chance to use both modes in parallel."
* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
i2c: tegra: set irq name as device name
i2c-nomadik: Fixup clock handling
i2c: mxs: remove broken PIOQUEUE support
Linus Torvalds [Sat, 3 Nov 2012 22:13:49 +0000 (15:13 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Scattered selection of fixes:
- radeon: load detect fixes from SuSE/AMD
- intel: misc i830, sdvo regression, vesafb kickoff ums fix
- exynos: maintainers entry update + fixes
- udl: fix stride scanout issue
it's slightly bigger than I'd probably like, but nothing looked
dangerous enough to hold off on."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/udl: fix stride issues scanning out stride != width*bpp
drm/radeon: add load detection support for ext DAC on R200 (v2)
DRM/radeon: For single CRTC GPUs move handling of CRTC_CRT_ON to crtc_dpms().
DRM/Radeon: Fix TV DAC Load Detection for single CRTC chips.
DRM/Radeon: Clean up code in TV DAC load detection.
drm/radeon: fix ATPX function documentation
drivers/gpu/drm/radeon/evergreen_cs.c: Remove unnecessary semicolon
DRM/Radeon: On DVI-I use Load Detection when EDID is bogus.
DRM/Radeon: Fix primary DAC Load Detection for RV100 chips.
DRM/Radeon: Fix Load Detection on legacy primary DAC.
drm: exynos: removed warning due to missing typecast for mixer driver data
drm/exynos: add support for ARCH_MULTIPLATFORM
MAINTAINERS: Add git repository for Exynos DRM
drm/exynos: fix display on issue
drm/i915: Only kick out vesafb if we takeover the fbcon with KMS
drm/i915: be less verbose about inability to provide vendor backlight
drm/i915: clear the entire sdvo infoframe buffer
drm/i915: VGA needs to be on pipe A on i830M
drm/i915: fix overlay on i830M
Linus Torvalds [Sat, 3 Nov 2012 03:48:41 +0000 (20:48 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
"First post-Sandy pull request"
1) Fix antenna gain handling and initialization of chan->max_reg_power
in wireless, from Felix Fietkau.
2) Fix nexthop handling in H.323 conntrack helper, from Julian
Anastasov.
3) Only process 80211 mesh config header in certain kinds of frames,
from Javier Cardona.
4) 80211 management frame header length needs to be validated, from
Johannes Berg.
5) Don't access free'd SKBs in ath9k driver, from Felix Fietkau.
6) Test for permanent state correctly in VXLAN driver, from Stephen
Hemminger.
7) BNX2X bug fixes from Yaniv Rosner and Dmitry Kravkov.
8) Fix off by one errors in bonding, from Nikolay Aleksandrov.
9) Fix divide by zero in TCP-Illinois congestion control. From Jesper
Dangaard Brouer.
10) TCP metrics code says "Yo dawg, I heard you like sizeof, so I did a
sizeof of a sizeof, so you can size your size" Fix from Julian
Anastasov.
11) Several drivers do mdiobus_free without first doing an
mdiobus_unregister leading to stray pointer references. Fix from
Peter Senna Tschudin.
12) Fix OOPS in l2tp_eth_create() error path, it's another dangling
pointer kinda situation. Fix from Tom Parkin.
13) Hardware driven by the vmxnet3 driver can't handle larger than 16K
fragments, so split them up when necessary. From Eric Dumazet.
14) Handle zero-length data in tcp_send_rcvq() properly. Fix
from Pavel Emelyanov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
tcp-repair: Handle zero-length data put in rcv queue
vmxnet3: must split too big fragments
l2tp: fix oops in l2tp_eth_create() error path
cxgb4: Fix unable to get UP event from the LLD
drivers/net/phy/mdio-bitbang.c: Call mdiobus_unregister before mdiobus_free
drivers/net/ethernet/nxp/lpc_eth.c: Call mdiobus_unregister before mdiobus_free
bnx2x: fix HW initialization using fw 7.8.x
tcp: Fix double sizeof in new tcp_metrics code
net: fix divide by zero in tcp algorithm illinois
net: sctp: Fix typo in net/sctp
bonding: fix second off-by-one error
bonding: fix off-by-one error
bnx2x: Disable FCoE for 57840 since not yet supported by FW
bnx2x: Fix no link on 577xx 10G-baseT
bnx2x: Fix unrecognized SFP+ module after driver is loaded
bnx2x: Fix potential incorrect link speed provision
bnx2x: Restore global registers back to default.
bnx2x: Fix link down in 57712 following LFA
bnx2x: Fix 57810 1G-KR link against certain switches.
ixgbe: PTP get_ts_info missing software support
...
Pavel Emelyanov [Mon, 29 Oct 2012 05:05:33 +0000 (05:05 +0000)]
tcp-repair: Handle zero-length data put in rcv queue
When sending data into a tcp socket in repair state we should check
for the amount of data being 0 explicitly. Otherwise we'll have an skb
with seq == end_seq in rcv queue, but tcp doesn't expect this to happen
(in particular a warn_on in tcp_recvmsg shoots).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Reported-by: Giorgos Mavrikas <gmavrikas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 29 Oct 2012 07:30:49 +0000 (07:30 +0000)]
vmxnet3: must split too big fragments
vmxnet3 has a 16Kbytes limit per tx descriptor, that happened to work
as long as we provided PAGE_SIZE fragments.
Our stack can now build larger fragments, so we need to split them to
the 16kbytes boundary.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: jongman heo <jongman.heo@samsung.com>
Tested-by: jongman heo <jongman.heo@samsung.com>
Cc: Shreyas Bhatewara <sbhatewara@vmware.com>
Reviewed-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
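The splitting arithmetic described in the vmxnet3 fix above can be sketched as follows. This is a minimal userspace illustration, assuming the 16 KiB per-descriptor limit stated in the commit message; `MAX_TX_BUF_SIZE`, `txd_needed()` and `split_fragment()` are hypothetical names chosen for this sketch and are not taken from the driver source.

```c
#include <stddef.h>

/* Assumed per-descriptor limit (16 KiB), as stated in the commit message. */
#define MAX_TX_BUF_SIZE ((size_t)1 << 14)

/* Number of tx descriptors needed to cover a fragment of `len` bytes. */
static size_t txd_needed(size_t len)
{
    return (len + MAX_TX_BUF_SIZE - 1) / MAX_TX_BUF_SIZE;
}

/*
 * Split one fragment into chunks of at most MAX_TX_BUF_SIZE bytes.
 * Chunk lengths are written to `out` (capacity `cap`); the return value
 * is the number of chunks, or 0 if `out` is too small.
 */
static size_t split_fragment(size_t len, size_t *out, size_t cap)
{
    size_t n = 0;

    while (len > 0) {
        size_t chunk = len < MAX_TX_BUF_SIZE ? len : MAX_TX_BUF_SIZE;

        if (n == cap)
            return 0;
        out[n++] = chunk;
        len -= chunk;
    }
    return n;
}
```

A PAGE_SIZE fragment (4096 bytes on many systems) always fits in a single descriptor, which is why the limit went unnoticed until larger fragments appeared.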
Tom Parkin [Mon, 29 Oct 2012 23:41:48 +0000 (23:41 +0000)]
l2tp: fix oops in l2tp_eth_create() error path
When creating an L2TPv3 Ethernet session, if register_netdev() should fail for
any reason (for example, automatic naming for "l2tpeth%d" interfaces hits the
32k-interface limit), the netdev is freed in the error path. However, the
l2tp_eth_sess structure's dev pointer is left uncleared, and this results in
l2tp_eth_delete() then attempting to unregister the same netdev later in the
session teardown. This results in an oops.
To avoid this, clear the session dev pointer in the error path.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
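The dangling-pointer pattern behind the l2tp fix above can be modelled in userspace. This is only a sketch: `fake_netdev`, `fake_session`, `fake_register()` and the other names are stand-ins invented for illustration, not the l2tp or netdev APIs.

```c
#include <stddef.h>
#include <stdlib.h>

struct fake_netdev { int registered; };
struct fake_session { struct fake_netdev *dev; };

/* Stand-in for register_netdev(); fails when `fail` is non-zero. */
static int fake_register(struct fake_netdev *dev, int fail)
{
    if (fail)
        return -1;
    dev->registered = 1;
    return 0;
}

static int session_create(struct fake_session *s, int fail)
{
    struct fake_netdev *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return -1;
    s->dev = dev;
    if (fake_register(dev, fail) < 0) {
        free(dev);
        s->dev = NULL; /* the fix: leave no stale pointer behind */
        return -1;
    }
    return 0;
}

/* Teardown only touches the device if the session still owns one. */
static void session_delete(struct fake_session *s)
{
    if (s->dev) {
        free(s->dev);
        s->dev = NULL;
    }
}
```

Without the `s->dev = NULL` line in the error path, `session_delete()` would free the same object a second time, which is the userspace analogue of the oops described above.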
Jonghwan Choi [Tue, 23 Oct 2012 06:54:42 +0000 (14:54 +0800)]
exynos4_tmu_driver_ids should be exynos_tmu_driver_ids.
Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com>
Reviewed-by: Amit Daniel Kachhap <amit.kachhap@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Igor Murzov [Sat, 13 Oct 2012 00:41:25 +0000 (04:41 +0400)]
ACPI video: Ignore errors after _DOD evaluation.
There are systems where the video module is known to work fine regardless
of broken _DOD and ignoring returned value here doesn't cause
any issues later. This should fix brightness controls on some laptops.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=47861
Signed-off-by: Igor Murzov <e-mail@date.by>
Reviewed-by: Sergey V <sftp.mtuci@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Devendra Naga [Wed, 31 Oct 2012 08:46:10 +0000 (17:46 +0900)]
thermal: solve compilation errors in rcar_thermal
The following errors were reported:
drivers/thermal/rcar_thermal.c: In function ‘rcar_thermal_probe’:
drivers/thermal/rcar_thermal.c:214:10: warning: passing argument 3 of ‘thermal_zone_device_register’ makes integer from pointer without a cast [enabled by default]
include/linux/thermal.h:166:29: note: expected ‘int’ but argument is of type ‘struct rcar_thermal_priv *’
drivers/thermal/rcar_thermal.c:214:10: error: too few arguments to function ‘thermal_zone_device_register’
include/linux/thermal.h:166:29: note: declared here
make[1]: *** [drivers/thermal/rcar_thermal.o] Error 1
make: *** [drivers/thermal/rcar_thermal.o] Error 2
with gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
Signed-off-by: Devendra Naga <develkernel412222@gmail.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Vipul Pandya [Mon, 29 Oct 2012 02:02:36 +0000 (02:02 +0000)]
cxgb4: Fix unable to get UP event from the LLD
If T4 configuration file gets loaded from the /lib/firmware/cxgb4/ directory
then offload capabilities of the cards were getting disabled during
initialization. Hence ULDs do not get an UP event from the LLD.
Signed-off-by: Jay Hernandez <jay@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Senna Tschudin [Sun, 28 Oct 2012 06:12:01 +0000 (06:12 +0000)]
drivers/net/phy/mdio-bitbang.c: Call mdiobus_unregister before mdiobus_free
Based on commit b27393aecf66199f5ddad37c302d3e0cfadbe6c0
Calling mdiobus_free without calling mdiobus_unregister causes
BUG_ON(). This patch fixes the issue.
The semantic patch that found this issue(http://coccinelle.lip6.fr/):
// <smpl>
@@
expression E;
@@
... when != mdiobus_unregister(E);
+ mdiobus_unregister(E);
mdiobus_free(E);
// </smpl>
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Senna Tschudin [Sun, 28 Oct 2012 06:12:00 +0000 (06:12 +0000)]
drivers/net/ethernet/nxp/lpc_eth.c: Call mdiobus_unregister before mdiobus_free
Based on commit b27393aecf66199f5ddad37c302d3e0cfadbe6c0
Calling mdiobus_free without calling mdiobus_unregister causes
BUG_ON(). This patch fixes the issue.
The semantic patch that found this issue(http://coccinelle.lip6.fr/):
// <smpl>
@@
expression E;
@@
... when != mdiobus_unregister(E);
+ mdiobus_unregister(E);
mdiobus_free(E);
// </smpl>
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Tested-by: Roland Stigge <stigge@antcom.de>
Tested-by: Alexandre Pereira da Silva <aletes.xgr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Sun, 28 Oct 2012 21:59:04 +0000 (21:59 +0000)]
bnx2x: fix HW initialization using fw 7.8.x
Since commit 96bed4b9 (use FW 7.8.2) BRB HW block needs to be
initialized using fw values for all devices.
Otherwise ETS on 57712/578xx will not work.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 2 Nov 2012 23:56:39 +0000 (16:56 -0700)]
Merge tag 'pm-for-3.7-rc4' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management update from Rafael J. Wysocki:
"Change the email address of the powernow-k8 maintainer."
* tag 'pm-for-3.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq / powernow-k8: Change maintainer's email address
Linus Torvalds [Fri, 2 Nov 2012 23:11:15 +0000 (16:11 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input subsystem fixes from Dmitry Torokhov:
"Just a few driver fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: tsc40 - remove wrong announcement of pressure support
Input: lpc32xx-keys - select INPUT_MATRIXKMAP
Input: pxa27x_keypad - clear pending interrupts on keypad config
Input: wacom - correct bad Cintiq 24HD check
Input: wacom - add INPUT_PROP_DIRECT flag to Cintiq 24HD
Input: egalax_ts - get gpio from devicetree
Weston Andros Adamson [Fri, 2 Nov 2012 22:00:56 +0000 (18:00 -0400)]
NFS4: nfs4_opendata_access should return errno
Return errno - not an NFS4ERR_. This worked because NFS4ERR_ACCESS == EACCES.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Linus Torvalds [Fri, 2 Nov 2012 22:08:20 +0000 (15:08 -0700)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull more scsi target fixes from Nicholas Bellinger:
"This series is a second round of target fixes for v3.7-rc4 that have
come into target-devel over the last days, and are important enough to
be applied ASAP.
All are being CC'ed to stable. The most important two are:
- target: Re-add explict zeroing of INQUIRY bounce buffer memory to
fix a regression for handling zero-length payloads, a bug that went
in during v3.7-rc1, and hit >= v3.6.3 stable. (nab + paolo)
- iscsi-target: Fix a long-standing missed R2T wakeup race in TX
thread processing when using a single queue slot. (Roland)
Thanks to Roland & PureStorage team for helping to track down this
long standing race with iscsi-target single queue slot operation.
Also, the tcm_fc(FCoE) regression bug that was observed recently with
-rc2 code has also been resolved with the cancel_delayed_work() return
bugfix (commit c0158ca64da5: "workqueue: cancel_delayed_work() should
return %false if work item is idle") now in -rc3. Thanks again to Yi
Zou, MDR, Robert Love @ Intel for helping to track this down."
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: Fix incorrect usage of nested IRQ spinlocks in ABORT_TASK path
iscsi-target: Fix missed wakeup race in TX thread
target: Avoid integer overflow in se_dev_align_max_sectors()
target: Don't return success from module_init() if setup fails
target: Re-add explict zeroing of INQUIRY bounce buffer memory
Linus Torvalds [Fri, 2 Nov 2012 20:27:52 +0000 (13:27 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"An e-mail address update, and fix a compile error on SPARC"
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: Only include of_match_table with CONFIG_OF_GPIO
hwmon, fam15h_power: Change email address, MAINTAINERS entry
Linus Torvalds [Fri, 2 Nov 2012 20:27:01 +0000 (13:27 -0700)]
Merge tag 'frv-fixes-20121102' of git://git./linux/kernel/git/dhowells/linux-frv
Pull FRV fixes from David Howells:
"A collection of small fixes for the FRV architecture."
* tag 'frv-fixes-20121102' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-frv:
frv: fix the broken preempt
frv: switch to saner kernel_execve() semantics
FRV: Fix the new-style kernel_thread() stuff
FRV: Fix the preemption handling
FRV: gcc-4.1.2 also inlines weak functions
FRV: Don't objcopy the GNU build_id note
FRV: Add missing linux/export.h #inclusions
Linus Torvalds [Fri, 2 Nov 2012 20:26:11 +0000 (13:26 -0700)]
Merge tag 'stable/for-linus-3.7-rc4-tag' of git://git./linux/kernel/git/konrad/xen
Pull Xen bugfixes from Konrad Rzeszutek Wilk:
- Use appropriate macros instead of hand-rolling our own (ARM).
- Fixes if FB/KBD closed unexpectedly.
- Fix memory leak in /dev/gntdev ioctl calls.
- Fix overflow check in xenbus_file_write.
- Document cleanup.
- Performance optimization when migrating guests.
* tag 'stable/for-linus-3.7-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/mmu: Use Xen specific TLB flush instead of the generic one.
xen/arm: use the __HVC macro
xen/xenbus: fix overflow check in xenbus_file_write()
xen-kbdfront: handle backend CLOSED without CLOSING
xen-fbfront: handle backend CLOSED without CLOSING
xen/gntdev: don't leak memory from IOCTL_GNTDEV_MAP_GRANT_REF
x86: remove obsolete comment from asm/xen/hypervisor.h
Sasha Levin [Tue, 30 Oct 2012 18:45:57 +0000 (14:45 -0400)]
hashtable: introduce a small and naive hashtable
This hashtable implementation is using hlist buckets to provide a simple
hashtable to prevent it from getting reimplemented all over the kernel.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
[ Merging this now, so that subsystems can start applying Sasha's
patches that use this - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
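To illustrate the idea of a small bucket-based hashtable as described above, here is a minimal userspace sketch. It is not the kernel's implementation or API: `struct node`, `struct table`, `bucket_of()`, `table_add()` and `table_find()` are hypothetical names, and a plain singly linked chain stands in for the kernel's hlist buckets.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_BITS 4
#define TABLE_SIZE (1u << TABLE_BITS)

struct node {
    uint32_t key;
    int value;
    struct node *next; /* chain within one bucket */
};

struct table {
    struct node *buckets[TABLE_SIZE];
};

/* Multiplicative hash; keeps the top TABLE_BITS bits of the product. */
static unsigned bucket_of(uint32_t key)
{
    return (uint32_t)(key * 2654435761u) >> (32 - TABLE_BITS);
}

/* Insert at the head of the bucket chain; the caller owns the node. */
static void table_add(struct table *t, struct node *n)
{
    unsigned b = bucket_of(n->key);

    n->next = t->buckets[b];
    t->buckets[b] = n;
}

/* Walk only the one bucket the key can live in. */
static struct node *table_find(struct table *t, uint32_t key)
{
    struct node *n;

    for (n = t->buckets[bucket_of(key)]; n; n = n->next)
        if (n->key == key)
            return n;
    return NULL;
}
```

The table stores no data itself; nodes are embedded in or allocated by the caller, which keeps the structure naive and small in the spirit of the commit message.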
Al Viro [Fri, 2 Nov 2012 16:05:44 +0000 (12:05 -0400)]
frv: fix the broken preempt
Just get %icc2 into the state we would have after local_irq_disable()
and physical IRQ having happened since then. Then we can simply
use preempt_schedule_irq() and be done with the whole mess.
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 15 Oct 2012 14:53:17 +0000 (10:53 -0400)]
frv: switch to saner kernel_execve() semantics
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David Howells [Fri, 2 Nov 2012 13:20:43 +0000 (13:20 +0000)]
FRV: Fix the new-style kernel_thread() stuff
The kernel_thread() changes for FRV don't work, and FRV fails to boot,
starting with:
commit 02ce496f152df87be081a64796498942c433a2fd
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Tue Sep 18 22:18:51 2012 -0400
Subject: frv: split ret_from_fork, simplify kernel_thread() a lot
The problem is that the userspace registers are completely cleared when a
kernel thread is created and all subsequent user threads are then copied from
that. Unfortunately, however, the TBR and PSR registers are restored from the
pt_regs and the values they should be set to are clobbered by the memset.
Instead, copy across the old user registers as normal, and then merely alter
GR8 and GR9 in it if we're going to execute a kernel thread.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Fri, 2 Nov 2012 13:20:42 +0000 (13:20 +0000)]
FRV: Fix the preemption handling
Fix the preemption handling in FRV code where the PREEMPT_ACTIVE value is
incorrectly loaded into the threadinfo flags rather than the threadinfo
preemption count.
Unfortunately, the code cannot be simply converted to use
preempt_schedule_irq() as is because FRV uses virtual interrupt disablement to
cut down on the cost of actually disabling interrupts and thus
local_irq_enable() doesn't actually enable interrupts.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Al Viro <viro@ZenIV.linux.org.uk>
David Howells [Fri, 2 Nov 2012 13:20:42 +0000 (13:20 +0000)]
FRV: gcc-4.1.2 also inlines weak functions
gcc-4.1.2 inlines weak functions, which causes FRV to fail when the dummy
thread_info_cache_init() gets inlined into start_kernel().
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Fri, 2 Nov 2012 13:20:42 +0000 (13:20 +0000)]
FRV: Don't objcopy the GNU build_id note
Don't let objcopy transfer the GNU build_id note into the loadable image as it
is located at address 0 and the image ends up >3G in size.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Fri, 2 Nov 2012 13:20:42 +0000 (13:20 +0000)]
FRV: Add missing linux/export.h #inclusions
Add missing linux/export.h #inclusions to the FRV arch.
Signed-off-by: David Howells <dhowells@redhat.com>
Laxman Dewangan [Thu, 1 Nov 2012 16:38:14 +0000 (22:08 +0530)]
i2c: tegra: set irq name as device name
When looking at the irq names for tegra i2c, every instance's
irq name shows up as tegra_i2c.
Pass the device name properly so that the irq names include the
instance, like tegra-i2c.0, tegra-i2c.1 etc.
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Philippe Begnic [Wed, 10 Oct 2012 11:02:26 +0000 (13:02 +0200)]
i2c-nomadik: Fixup clock handling
Make sure to clk_prepare as well as clk_enable.
Signed-off-by: Philippe Begnic <philippe.begnic@stericsson.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Wolfram Sang [Fri, 12 Oct 2012 10:55:16 +0000 (11:55 +0100)]
i2c: mxs: remove broken PIOQUEUE support
This I2C master can do DMA and PIOQUEUE (PIO with FIFO). Originally,
only PIOQUEUE was supported and it had issues, then DMA support was added
this cycle. The original intention was to keep PIOQUEUE since it has
less overhead, which is nice for small transfers. However, runtime
switching between PIOQUEUE and DMA depending on the transfer size never
worked despite a lot of trying. Since PIOQUEUE mode itself was flaky
(polling at places where interrupts failed to work) and the
implementation also imposed a size limit for transfers, it is best to
remove the support, so users don't fall over its limitations. It also
makes the driver a lot cleaner and more robust. If somebody really wants
less overhead, plain PIO mode could still be implemented with the
additional advantage that this mode is also available on MX23.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Reviewed-by: Marek Vasut <marex@denx.de>
Linus Torvalds [Fri, 2 Nov 2012 00:48:19 +0000 (17:48 -0700)]
Merge tag 'xtensa-next-20121101' of git://github.com/czankel/xtensa-linux
Pull Xtensa fixes from Chris Zankel:
"Some important bug fixes.
With the change to uapi, there was a bug introduced that results in an
empty syscall table (multi-inclusion bug). Switching to the generic
thread/execve allowed us to fix a bug we had in vfork()."
* tag 'xtensa-next-20121101' of git://github.com/czankel/xtensa-linux:
xtensa: switch to generic sys_execve()
xtensa: switch to generic kernel_execve()
xtensa: switch to generic kernel_thread()
xtensa: reset windowbase/windowstart when cloning the VM
xtensa: use physical addresses for bus addresses
xtensa: allow multi-inclusion for uapi/unistd.h
Dave Airlie [Thu, 1 Nov 2012 03:47:09 +0000 (13:47 +1000)]
drm/udl: fix stride issues scanning out stride != width*bpp
When buffer sharing with the i915 and using a 1680x1050 monitor,
the i915 gives us a 6912 buffer for the 6720 width, the code doesn't
render this properly as it uses one value to set the base address for
reading from the vmap and for where to start on the device.
This fixes it by calculating the values correctly for the device and
for the pixmap. No idea how I haven't seen this before now.
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
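The stride mismatch in the udl fix above comes down to which row pitch is used when computing a byte offset. The sketch below uses the numbers from the commit message (a 6912-byte stride for a 6720-byte-wide row); `pixel_offset()` is a hypothetical helper, and the 4 bytes per pixel is an assumption inferred from 6720 / 1680.

```c
#include <stddef.h>

/*
 * Byte offset of pixel (x, y) in a buffer whose rows start `pitch`
 * bytes apart. For a tightly packed buffer, pitch == width * bpp;
 * a shared buffer may use a larger, padded pitch.
 */
static size_t pixel_offset(size_t x, size_t y, size_t pitch,
                           size_t bytes_per_pixel)
{
    return y * pitch + x * bytes_per_pixel;
}
```

With a padded source pitch of 6912 and a packed pitch of 6720, the two offsets for the same pixel drift apart by 192 bytes per row, so a single shared base offset cannot be right for both the source buffer and the destination.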
Jamie Lentin [Thu, 1 Nov 2012 23:55:43 +0000 (23:55 +0000)]
hwmon: Only include of_match_table with CONFIG_OF_GPIO
The following fixes build errors on sparc. Without any DT support,
of_match_ptr is NULL and the below is a no-op. However, if just
CONFIG_OF is defined then so is of_match_ptr.
All useful parts of the gpio-fan DT support rely on CONFIG_OF_GPIO
anyway, so of_match_table should too.
Signed-off-by: Jamie Lentin <jm@lentin.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Dave Airlie [Fri, 2 Nov 2012 00:30:37 +0000 (10:30 +1000)]
Merge branch 'exynos-drm-fixes' of git://git./linux/kernel/git/daeinki/drm-exynos into drm-fixes
Inki writes:
"As I posted before, we have added a new git repository for Exynos drm
to MAINTAINERS file so change it to new one like below,
from git://git.infradead.org/users/kmpark/linux-samsung
to git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos
And this pull request includes the following:
- fix display on issue when user requested dpms mode changing.
- add git repository for Exynos drm to MAINTAINERS file.
- add support for ARCH_MULTIPLATFORM.
- and code cleanup."
* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm: exynos: removed warning due to missing typecast for mixer driver data
drm/exynos: add support for ARCH_MULTIPLATFORM
MAINTAINERS: Add git repository for Exynos DRM
drm/exynos: fix display on issue
Dave Airlie [Fri, 2 Nov 2012 00:29:22 +0000 (10:29 +1000)]
Merge branch 'drm-fixes-3.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Alex writes:
"This request is mostly load detection fixes from Egbert and me."
* 'drm-fixes-3.7' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: add load detection support for ext DAC on R200 (v2)
DRM/radeon: For single CRTC GPUs move handling of CRTC_CRT_ON to crtc_dpms().
DRM/Radeon: Fix TV DAC Load Detection for single CRTC chips.
DRM/Radeon: Clean up code in TV DAC load detection.
drm/radeon: fix ATPX function documentation
drivers/gpu/drm/radeon/evergreen_cs.c: Remove unnecessary semicolon
DRM/Radeon: On DVI-I use Load Detection when EDID is bogus.
DRM/Radeon: Fix primary DAC Load Detection for RV100 chips.
DRM/Radeon: Fix Load Detection on legacy primary DAC.
Dave Airlie [Fri, 2 Nov 2012 00:26:03 +0000 (10:26 +1000)]
Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
Daniel Vetter writes:
"Nothing big at all for -fixes, just small stuff:
- Two patches to fix bugs on i830M
- ums regression fixer due to kicking firmware fbs (Chris)
- tune down a too loud warning (Jani)
- be more careful with sdvo infoframes, which fixes a long-standing
sdvo-hdmi regression"
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: Only kick out vesafb if we takeover the fbcon with KMS
drm/i915: be less verbose about inability to provide vendor backlight
drm/i915: clear the entire sdvo infoframe buffer
drm/i915: VGA needs to be on pipe A on i830M
drm/i915: fix overlay on i830M
Trond Myklebust [Mon, 29 Oct 2012 20:48:40 +0000 (16:48 -0400)]
NFSv4: Initialise the NFSv4.1 slot table highest_used_slotid correctly
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Julian Anastasov [Tue, 30 Oct 2012 12:03:09 +0000 (12:03 +0000)]
tcp: Fix double sizeof in new tcp_metrics code
Fix double sizeof when parsing IPv6 address from
user space because it breaks get/del by specific IPv6 address.
Problem noticed by David Binderman:
https://bugzilla.kernel.org/show_bug.cgi?id=49171
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
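A double sizeof of the kind described above is easy to reproduce in isolation. The following is a sketch under assumptions, not the kernel code: `struct addr6` is a hypothetical 16-byte stand-in for an IPv6 address, and the two helpers only show what each expression evaluates to.

```c
#include <stddef.h>

/* Hypothetical stand-in for a 16-byte IPv6 address. */
struct addr6 {
    unsigned char bytes[16];
};

/* sizeof applied to a sizeof expression yields sizeof(size_t). */
static size_t buggy_len(void)
{
    return sizeof(sizeof(struct addr6));
}

/* The intended length: the size of the address itself. */
static size_t intended_len(void)
{
    return sizeof(struct addr6);
}
```

Because `sizeof(size_t)` is typically 4 or 8, a length check written with the doubled operator would reject every correctly sized 16-byte address, which is consistent with get/del by a specific IPv6 address breaking.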
Jesper Dangaard Brouer [Wed, 31 Oct 2012 02:45:32 +0000 (02:45 +0000)]
net: fix divide by zero in tcp algorithm illinois
Reading TCP stats when using TCP Illinois congestion control algorithm
can cause a divide by zero kernel oops.
The division by zero occurs in tcp_illinois_info() at:
do_div(t, ca->cnt_rtt);
where ca->cnt_rtt can become zero (when rtt_reset is called)
Steps to Reproduce:
1. Register tcp_illinois:
# sysctl -w net.ipv4.tcp_congestion_control=illinois
2. Monitor internal TCP information via command "ss -i"
# watch -d ss -i
3. Establish new TCP conn to machine
Either it fails at the initial conn, or else it needs to wait
for a loss or a reset.
This is only related to reading stats. The function avg_delay() also
performs the same divide, but is guarded with a (ca->cnt_rtt > 0) at its
calling point in update_params(). Thus, simply fix tcp_illinois_info().
Function tcp_illinois_info() / get_info() is called without
socket lock. Thus, eliminate any race condition on ca->cnt_rtt
by using a local stack variable. Simply reuse info.tcpv_rttcnt,
as it's already set to ca->cnt_rtt.
Function avg_delay() is not affected by this race condition, as
it's called with the socket lock.
Cc: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
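The approach described in the TCP Illinois fix above, reading the counter once into a local and checking that copy before dividing, can be sketched like this. The struct and function names are hypothetical; only the field name `cnt_rtt` is borrowed from the commit message.

```c
#include <stdint.h>

/* Hypothetical state; another context may reset cnt_rtt to zero. */
struct ca_state {
    uint32_t cnt_rtt;
    uint64_t sum_rtt;
};

/*
 * Take one snapshot of the counter and use only that copy for both
 * the zero check and the division, so a concurrent reset between the
 * two cannot produce a divide by zero.
 */
static uint64_t avg_rtt_snapshot(const volatile struct ca_state *ca)
{
    uint32_t cnt = ca->cnt_rtt;
    uint64_t sum = ca->sum_rtt;

    if (cnt == 0)
        return 0;
    return sum / cnt;
}
```

Checking `ca->cnt_rtt` and then dividing by `ca->cnt_rtt` again would read the field twice, which is exactly the window the local variable closes.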
Masanari Iida [Wed, 31 Oct 2012 05:48:19 +0000 (05:48 +0000)]
net: sctp: Fix typo in net/sctp
Correct spelling typo in net/sctp/socket.c
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nikolay@redhat.com [Wed, 31 Oct 2012 06:03:52 +0000 (06:03 +0000)]
bonding: fix second off-by-one error
Fix off-by-one error because IFNAMSIZ == 16 and when this
code gets executed we stick a NULL byte where we should not.
How to reproduce:
with CONFIG_CC_STACKPROTECTOR=y (otherwise it may pass by silently)
modprobe bonding; echo 1 > /sys/class/net/bond0/bonding/mode;
echo "AAAAAAAAAAAAAAAA" > /sys/class/net/bond0/bonding/active_slave;
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Note: Sorry for the second patch but I missed this one while checking
the file. You can squash them into one patch.
Signed-off-by: David S. Miller <davem@davemloft.net>
nikolay@redhat.com [Wed, 31 Oct 2012 04:42:51 +0000 (04:42 +0000)]
bonding: fix off-by-one error
Fix off-by-one error because IFNAMSIZ == 16 and when this
code gets executed we stick a NULL byte where we should not.
How to reproduce:
with CONFIG_CC_STACKPROTECTOR=y (otherwise it may pass by silently)
modprobe bonding; echo 1 > /sys/class/net/bond0/bonding/mode;
echo "AAAAAAAAAAAAAAAA" > /sys/class/net/bond0/bonding/primary;
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
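Both bonding fixes above concern a terminator written one byte past a 16-byte name buffer. A minimal sketch of a bounded copy follows; `NAME_SIZE` mirrors the IFNAMSIZ == 16 value quoted in the commit messages, and `copy_name()` is a hypothetical helper, not the bonding code.

```c
#include <stddef.h>
#include <string.h>

#define NAME_SIZE 16 /* mirrors IFNAMSIZ == 16 from the commit message */

/*
 * Copy at most NAME_SIZE - 1 bytes and terminate inside the buffer.
 * Returns 0 if the name fit, -1 if it had to be truncated.
 */
static int copy_name(char dst[NAME_SIZE], const char *src)
{
    size_t len = strlen(src);
    size_t n = len < NAME_SIZE - 1 ? len : NAME_SIZE - 1;

    memcpy(dst, src, n);
    dst[n] = '\0'; /* index is at most NAME_SIZE - 1, never NAME_SIZE */
    return len > n ? -1 : 0;
}
```

The 16-character string from the reproduction steps is the smallest input that triggers the problem: it fills all NAME_SIZE bytes, leaving no room for the terminator.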
Weston Andros Adamson [Thu, 1 Nov 2012 15:21:53 +0000 (11:21 -0400)]
SUNRPC: return proper errno from backchannel_rqst
The one and only caller (in fs/nfs/nfs4client.c) uses the result
as an errno and would have interpreted an error as EPERM.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
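Why a bare failure value is read as EPERM can be shown with a short sketch. The function names are hypothetical, and `-ENOMEM` is chosen here only for illustration; the point is that a caller treating the result as a negative errno sees -1 as -EPERM on systems where EPERM is 1, such as Linux.

```c
#include <errno.h>

/* Returns a bare -1 on failure; a caller expecting -errno reads -EPERM. */
static int setup_buggy(int fail)
{
    return fail ? -1 : 0;
}

/* Returns a meaningful negative errno instead (illustrative choice). */
static int setup_fixed(int fail)
{
    return fail ? -ENOMEM : 0;
}
```

With the fixed variant the caller can propagate the result unchanged and report an out-of-memory condition rather than a misleading permission error.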
Dmitry Kravkov [Wed, 31 Oct 2012 05:46:58 +0000 (05:46 +0000)]
bnx2x: Disable FCoE for 57840 since not yet supported by FW
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Wed, 31 Oct 2012 05:46:57 +0000 (05:46 +0000)]
bnx2x: Fix no link on 577xx 10G-baseT
Since the Warpcore supports various link types, need to set only the correct
supported modes for XFI which is the serdes interface for the 10G-baseT PHY.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Wed, 31 Oct 2012 05:46:56 +0000 (05:46 +0000)]
bnx2x: Fix unrecognized SFP+ module after driver is loaded
When SFP+ module is plugged in after driver is already loaded, it may not be
recognized, so set SFP module recognition time up to 300ms, without resetting
the module power in the middle.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Wed, 31 Oct 2012 05:46:55 +0000 (05:46 +0000)]
bnx2x: Fix potential incorrect link speed provision
Fix possible incorrect link speed provision following rapid link speed change.
Clear link speed mask after each link change, and not only after link down.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Wed, 31 Oct 2012 05:46:54 +0000 (05:46 +0000)]
bnx2x: Restore global registers back to default.
Several KR registers were not set correctly back to default after
loopback test, so set those global registers over the global WC lane (zero)
rather than the current lane.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Wed, 31 Oct 2012 05:46:53 +0000 (05:46 +0000)]
bnx2x: Fix link down in 57712 following LFA
In case of link flap avoidance between PXE boot and bnx2x, set the appropriate
PHY DEVAD even if LFA kicks in.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Wed, 31 Oct 2012 05:46:52 +0000 (05:46 +0000)]
bnx2x: Fix 57810 1G-KR link against certain switches.
Fix 1G KR link by restoring CL72 misc control register to default value rather
than 0.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 1 Nov 2012 15:27:02 +0000 (08:27 -0700)]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fix from Marcelo Tosatti.
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: fix vcpu->mmio_fragments overflow
Jacob Keller [Wed, 31 Oct 2012 22:30:54 +0000 (22:30 +0000)]
ixgbe: PTP get_ts_info missing software support
This patch corrects the ethtool get_ts_info function which did not state that
software timestamping was supported, even though it is.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
CC: Stable <stable@vger.kernel.org> [3.5]
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Deucher [Wed, 31 Oct 2012 15:51:03 +0000 (11:51 -0400)]
drm/radeon: add load detection support for ext DAC on R200 (v2)
The R200 asics use an external DAC for the secondary DAC.
The current KMS code tries to use code for the integrated
TV DAC for R200 which leads to unpredictable results since
R200 does not have an integrated TV DAC. This patch ports
the external DAC load detection support from the UMS
driver to KMS.
v2: fix typo in loop break logic
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Egbert Eich <eich@suse.de>
Egbert Eich [Mon, 29 Oct 2012 12:46:48 +0000 (13:46 +0100)]
DRM/radeon: For single CRTC GPUs move handling of CRTC_CRT_ON to crtc_dpms().
On all dual CRTC GPUs the CRTC_CRT_ON in the RADEON_CRTC_EXT_CNTL register
controls the CRTC of the primary DAC. Therefore it is set in the DAC DPMS
function.
This is different for GPUs with a single CRTC but a primary and a
TV DAC: here it controls the single CRTC no matter where it is routed.
Therefore we set it here. This avoids an elaborate on/off state tracking
since both primary_dac_dpms() and tv_dac_dpms() functions would have
to touch this bit.
On single CRTC GPUs with just one DAC it's irrelevant where this bit
is handled.
agd5f: fix warning
Signed-off-by: Egbert Eich <eich@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Egbert Eich [Tue, 30 Oct 2012 16:42:27 +0000 (17:42 +0100)]
DRM/Radeon: Fix TV DAC Load Detection for single CRTC chips.
The RN50 has a TV DAC but only a single CRTC. For load detection this
DAC is controlled by the primary CRTC.
Signed-off-by: Egbert Eich <eich@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Egbert Eich [Tue, 30 Oct 2012 16:42:26 +0000 (17:42 +0100)]
DRM/Radeon: Clean up code in TV DAC load detection.
Signed-off-by: Egbert Eich <eich@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Igor Murzov [Thu, 25 Oct 2012 13:09:00 +0000 (17:09 +0400)]
drm/radeon: fix ATPX function documentation
Fix copy-and-pasted documentation.
Signed-off-by: Igor Murzov <e-mail@date.by>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Peter Senna Tschudin [Wed, 24 Oct 2012 14:42:26 +0000 (16:42 +0200)]
drivers/gpu/drm/radeon/evergreen_cs.c: Remove unnecessary semicolon
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@r1@
statement S;
position p,p1;
@@
S@p1;@p
@script:python r2@
p << r1.p;
p1 << r1.p1;
@@
if p[0].line != p1[0].line_end:
    cocci.include_match(False)
@@
position r1.p;
@@
-;@p
// </smpl>
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Egbert Eich [Wed, 24 Oct 2012 16:32:52 +0000 (18:32 +0200)]
DRM/Radeon: On DVI-I use Load Detection when EDID is bogus.
The Radeon driver uses the analog/digital flag to determine if the
DAC or the TMDS encoder should be enabled on a DVI-I connector.
If the EDID is bogus this flag is no longer reliable. This fix
adds a fallback to DAC load detection to determine if anything
is connected to the DAC. If not and a (bogus) EDID is found it
assumes a digital display is connected.
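The decision order described above can be sketched in plain C. This is only an illustration, not the driver's code: the enum, the function name pick_dvi_i_output() and its boolean inputs are hypothetical, and the final case (no EDID and no load) is an assumption that the message does not cover.

```c
#include <stdbool.h>

/* Hypothetical result type for this sketch. */
enum sketch_output { SKETCH_NONE, SKETCH_ANALOG, SKETCH_DIGITAL };

/*
 * Choose which encoder to enable on a DVI-I connector.
 * All parameters are hypothetical inputs standing in for driver state.
 */
static enum sketch_output pick_dvi_i_output(bool edid_found, bool edid_sane,
					     bool edid_says_digital,
					     bool dac_load_detected)
{
	/* A sane EDID: trust its analog/digital flag. */
	if (edid_found && edid_sane)
		return edid_says_digital ? SKETCH_DIGITAL : SKETCH_ANALOG;

	/* Bogus or missing EDID: fall back to DAC load detection. */
	if (dac_load_detected)
		return SKETCH_ANALOG;

	/* Nothing on the DAC, but a (bogus) EDID was read: assume digital. */
	if (edid_found)
		return SKETCH_DIGITAL;

	/* Assumption of this sketch: no EDID and no load means no display. */
	return SKETCH_NONE;
}
```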
This works around problems with some crappy IPMI devices using
Radeon ES1000.
Signed-off-by: Egbert Eich <eich@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Egbert Eich [Wed, 24 Oct 2012 16:31:19 +0000 (18:31 +0200)]
DRM/Radeon: Fix primary DAC Load Detection for RV100 chips.
For Radeon 7500 ATI recommends a DAC_FORCE value of 0x1ac. This value
works better on ES1000 (RV100) chips, too, as it doesn't produce any false
positives on any cards I have tested. Therefore let's assume that this
value is good for all RV100 and RV200 chipset generations.
Signed-off-by: Egbert Eich <eich@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Egbert Eich [Wed, 24 Oct 2012 16:29:49 +0000 (18:29 +0200)]
DRM/Radeon: Fix Load Detection on legacy primary DAC.
An uninitialized variable led to broken load detection.
Signed-off-by: Egbert Eich <eich@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Steve Hodgson [Wed, 31 Oct 2012 17:24:02 +0000 (10:24 -0700)]
target: Fix incorrect usage of nested IRQ spinlocks in ABORT_TASK path
This patch changes core_tmr_abort_task() to use spin_lock -> spin_unlock
around se_cmd->t_state_lock while spin_lock_irqsave is held via
se_sess->sess_cmd_lock.
Signed-off-by: Steve Hodgson <steve@purestorage.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Roland Dreier [Wed, 31 Oct 2012 16:16:46 +0000 (09:16 -0700)]
iscsi-target: Fix missed wakeup race in TX thread
The sleeping code in iscsi_target_tx_thread() is susceptible to the classic
missed wakeup race:
- TX thread finishes handle_immediate_queue() and handle_response_queue(),
thinks both queues are empty.
- Another thread adds a queue entry and does wake_up_process(), which does
nothing because the TX thread is still awake.
- TX thread does schedule_timeout() and sleeps forever.
In practice this can kill an iSCSI connection if for example an initiator
does single-threaded writes and the target misses the wakeup window when
queueing an R2T; in this case the connection will be stuck until the
initiator loses patience and does some task management operation (or kills
the connection entirely).
Fix this by converting to wait_event_interruptible(), which does not
suffer from this sort of race.
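A user-space analogue may help show why re-checking the condition closes the window. The sketch below is not the kernel code: it uses a POSIX condition variable as a rough stand-in for wait_event_interruptible(), and struct tx_queue, tx_wait(), tx_post() and run_handoff() are hypothetical names. The point is that the sleeper tests the condition under the same lock the waker takes, so a post that arrives before the sleep is still seen.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the TX thread's queue state. */
struct tx_queue {
	pthread_mutex_t lock;
	pthread_cond_t  wake;
	bool            pending;   /* true when an entry is queued */
};

/*
 * Sleeper: the condition is tested under the lock and re-tested after
 * every wakeup, so a post that lands before the wait is not lost.
 */
static void tx_wait(struct tx_queue *q)
{
	pthread_mutex_lock(&q->lock);
	while (!q->pending)
		pthread_cond_wait(&q->wake, &q->lock);
	q->pending = false;
	pthread_mutex_unlock(&q->lock);
}

/* Waker: publish the condition under the same lock, then signal. */
static void tx_post(struct tx_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->pending = true;
	pthread_cond_signal(&q->wake);
	pthread_mutex_unlock(&q->lock);
}

static void *poster(void *arg)
{
	tx_post(arg);
	return NULL;
}

/* Returns 1 once the waiter has consumed the posted entry, 0 on error. */
static int run_handoff(void)
{
	struct tx_queue q = { .pending = false };
	pthread_t t;

	pthread_mutex_init(&q.lock, NULL);
	pthread_cond_init(&q.wake, NULL);
	if (pthread_create(&t, NULL, poster, &q) != 0)
		return 0;
	tx_wait(&q);   /* completes whether the post came first or not */
	pthread_join(t, NULL);
	pthread_cond_destroy(&q.wake);
	pthread_mutex_destroy(&q.lock);
	return 1;
}
```

The racy pattern described above corresponds to testing the queue without the lock and then sleeping unconditionally; here the test and the sleep are one atomic step with respect to tx_post().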
Signed-off-by: Roland Dreier <roland@purestorage.com>
Cc: Andy Grover <agrover@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Roland Dreier [Wed, 31 Oct 2012 16:16:45 +0000 (09:16 -0700)]
target: Avoid integer overflow in se_dev_align_max_sectors()
The expression (max_sectors * block_size) might overflow a u32
(indeed, since iblock sets max_hw_sectors to UINT_MAX, it is
guaranteed to overflow and end up with a much-too-small result in many
common cases). Fix this by doing an equivalent calculation that
doesn't require multiplication.
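The arithmetic can be checked in user space. The following is a minimal sketch, not the kernel function: SKETCH_PAGE_SIZE, align_with_mul() and align_no_mul() are hypothetical names, the page size is assumed to be 4096, and block_size is assumed to be non-zero.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size, for illustration */

/* Overflow-prone form: the product can wrap a 32-bit value. */
static uint32_t align_with_mul(uint32_t max_sectors, uint32_t block_size)
{
	uint32_t bytes = max_sectors * block_size;   /* may wrap modulo 2^32 */

	bytes -= bytes % SKETCH_PAGE_SIZE;           /* round down to a page */
	return bytes / block_size;
}

/*
 * Multiplication-free form: round the sector count down to a multiple
 * of the number of sectors that fit in one page.
 */
static uint32_t align_no_mul(uint32_t max_sectors, uint32_t block_size)
{
	uint32_t alignment = SKETCH_PAGE_SIZE / block_size;

	if (alignment == 0)      /* block larger than a page */
		alignment = 1;
	return max_sectors - max_sectors % alignment;
}
```

With max_sectors = UINT32_MAX and block_size = 512, the first form wraps and yields 8388600 sectors, while the second yields 4294967288. For small inputs such as 1001 sectors, both return 1000.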
While we're touching this code, avoid splitting a printk format across
two lines and use pr_info(...) instead of printk(KERN_INFO ...).
Signed-off-by: Roland Dreier <roland@purestorage.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Roland Dreier [Wed, 31 Oct 2012 16:16:44 +0000 (09:16 -0700)]
target: Don't return success from module_init() if setup fails
If the call to core_dev_release_virtual_lun0() fails, then nothing
sets ret to anything other than 0, so even though everything is
torn down and freed, target_core_init_configfs() will seem to succeed
and the module will be loaded. Fix this by passing the return value
on up the chain.
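The shape of this bug is easy to reproduce outside the kernel. The sketch below is an illustration under assumptions, not the target code: setup_first(), setup_second(), teardown_first(), init_buggy() and init_fixed() are hypothetical, and the fail argument exists only to simulate a failing step.

```c
#include <errno.h>

/* Hypothetical setup steps; each returns 0 or a negative errno. */
static int setup_first(void) { return 0; }
static int setup_second(int fail) { return fail ? -ENOMEM : 0; }
static void teardown_first(void) { }

/* Buggy shape: the failing call's result is tested but never stored. */
static int init_buggy(int fail)
{
	int ret;

	ret = setup_first();
	if (ret < 0)
		return ret;

	if (setup_second(fail) < 0)
		goto out;
	return 0;
out:
	teardown_first();
	return ret;   /* still 0 from setup_first(), so the caller sees success */
}

/* Fixed shape: keep the return value and pass it up the chain. */
static int init_fixed(int fail)
{
	int ret;

	ret = setup_first();
	if (ret < 0)
		return ret;

	ret = setup_second(fail);
	if (ret < 0)
		goto out;
	return 0;
out:
	teardown_first();
	return ret;
}
```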
Signed-off-by: Roland Dreier <roland@purestorage.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Thu, 1 Nov 2012 05:04:26 +0000 (22:04 -0700)]
target: Re-add explicit zeroing of INQUIRY bounce buffer memory
This patch fixes a regression in spc_emulate_inquiry() code where the
local scope bounce buffer was no longer getting its memory zeroed,
causing various problems with SCSI initiators that depend upon areas
of INQUIRY EVPD=0x83 payload having been zeroed.
This bug was introduced with the following v3.7-rc1 patch + CC'ed
stable commit:
commit ffe7b0e9326d9c68f5688bef691dd49f1e0d3651
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Fri Sep 7 17:30:38 2012 +0200
target: support zero allocation length in INQUIRY
Go ahead and re-add the missing memset of bounce buffer memory to be
copied into the outgoing se_cmd descriptor kmapped SGL payload.
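Why the memset matters can be shown with a small user-space sketch. It is not spc_emulate_inquiry(): build_response(), SKETCH_BUF_LEN and the payload arguments are hypothetical, and the buffer size is an arbitrary assumption.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_LEN 16   /* assumed size, for illustration only */

/*
 * Build a response in a local bounce buffer and copy the whole buffer
 * out. Without the memset, bytes that the payload does not write would
 * carry whatever happened to be on the stack.
 */
static void build_response(unsigned char *out, const unsigned char *payload,
			   size_t payload_len)
{
	unsigned char buf[SKETCH_BUF_LEN];

	memset(buf, 0, sizeof(buf));   /* the step that had gone missing */
	if (payload_len > sizeof(buf))
		payload_len = sizeof(buf);
	memcpy(buf, payload, payload_len);
	memcpy(out, buf, sizeof(buf));
}
```

A caller that passes a two-byte payload gets those two bytes followed by fourteen zero bytes, which is what initiators that parse the remaining fields rely on.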
Reported-by: Kelsey Prantis <kelsey.prantis@intel.com>
Cc: Kelsey Prantis <kelsey.prantis@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andy Grover <agrover@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Linus Torvalds [Wed, 31 Oct 2012 22:42:08 +0000 (15:42 -0700)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull scsi target fixes from Nicholas Bellinger:
"These are the current target pending fixes headed for v3.7-rc4 code.
This includes the following highlights:
- Fix long-standing qla2xxx target bug where certain fc_port_t state
transitions could cause the internal session b-tree list to become
out-of-sync. (Roland)
- Fix task management double free of se_cmd descriptor in exception
path for users of target_submit_tmr(). (nab)
- Re-introduce simple NOP emulation of REZERO_UNIT, SEEK_6, and
SEEK_10 SCSI-2 commands in order to support legacy initiators that
still require them. (Bernhard)
Note these three patches are also CC'ed to stable.
Also, there are a couple of outstanding (external) regressions that are
still being tracked down for tcm_fc(FCoE) and tcm_vhost fabrics for
v3.7.0 code, so please expect another PULL as these issues are
identified and resolved."
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: reintroduce some obsolete SCSI-2 commands
target: Fix double-free of se_cmd in target_complete_tmr_failure
qla2xxx: Update target lookup session tables when a target session changes
tcm_qla2xxx: Format VPD page 83h SCSI name string according to SPC
qla2xxx: Add missing ->vport_slock while calling qlt_update_vp_map