Joonsoo Kim [Mon, 29 Apr 2013 22:07:39 +0000 (15:07 -0700)]
mm, vmalloc: remove list management of vmlist after initializing vmalloc
Now, there is no need to maintain vmlist after initializing vmalloc. So
remove related code and data structure.
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Mon, 29 Apr 2013 22:07:37 +0000 (15:07 -0700)]
mm, vmalloc: export vmap_area_list, instead of vmlist
Although our intention is to unexport internal structure entirely, but
there is one exception for kexec. kexec dumps address of vmlist and
makedumpfile uses this information.
We are about to remove vmlist, then another way to retrieve information
of vmalloc layer is needed for makedumpfile. For this purpose, we
export vmap_area_list, instead of vmlist.
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Mon, 29 Apr 2013 22:07:35 +0000 (15:07 -0700)]
mm, vmalloc: iterate vmap_area_list, instead of vmlist, in vmallocinfo()
This patch is a preparatory step for removing vmlist entirely. For
above purpose, we change iterating a vmap_list codes to iterating a
vmap_area_list. It is somewhat trivial change, but just one thing
should be noticed.
Using vmap_area_list in vmallocinfo() introduce ordering problem in SMP
system. In s_show(), we retrieve some values from vm_struct.
vm_struct's values is not fully setup when va->vm is assigned. Full
setup is notified by removing VM_UNLIST flag without holding a lock.
When we see that VM_UNLIST is removed, it is not ensured that vm_struct
has proper values in view of other CPUs. So we need smp_[rw]mb for
ensuring that proper values is assigned when we see that VM_UNLIST is
removed.
Therefore, this patch not only change a iteration list, but also add a
appropriate smp_[rw]mb to right places.
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Mon, 29 Apr 2013 22:07:34 +0000 (15:07 -0700)]
mm, vmalloc: iterate vmap_area_list in get_vmalloc_info()
This patch is a preparatory step for removing vmlist entirely. For
above purpose, we change iterating a vmap_list codes to iterating a
vmap_area_list. It is somewhat trivial change, but just one thing
should be noticed.
vmlist is lack of information about some areas in vmalloc address space.
For example, vm_map_ram() allocate area in vmalloc address space, but it
doesn't make a link with vmlist. To provide full information about
vmalloc address space is better idea, so we don't use va->vm and use
vmap_area directly. This makes get_vmalloc_info() more precise.
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Mon, 29 Apr 2013 22:07:32 +0000 (15:07 -0700)]
mm, vmalloc: iterate vmap_area_list, instead of vmlist in vread/vwrite()
Now, when we hold a vmap_area_lock, va->vm can't be discarded. So we can
safely access to va->vm when iterating a vmap_area_list with holding a
vmap_area_lock. With this property, change iterating vmlist codes in
vread/vwrite() to iterating vmap_area_list.
There is a little difference relate to lock, because vmlist_lock is mutex,
but, vmap_area_lock is spin_lock. It may introduce a spinning overhead
during vread/vwrite() is executing. But, these are debug-oriented
functions, so this overhead is not real problem for common case.
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Mon, 29 Apr 2013 22:07:30 +0000 (15:07 -0700)]
mm, vmalloc: protect va->vm by vmap_area_lock
Inserting and removing an entry to vmlist is linear time complexity, so
it is inefficient. Following patches will try to remove vmlist
entirely. This patch is preparing step for it.
For removing vmlist, iterating vmlist codes should be changed to
iterating a vmap_area_list. Before implementing that, we should make
sure that when we iterate a vmap_area_list, accessing to va->vm doesn't
cause a race condition. This patch ensure that when iterating a
vmap_area_list, there is no race condition for accessing to vm_struct.
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Mon, 29 Apr 2013 22:07:28 +0000 (15:07 -0700)]
mm, vmalloc: move get_vmalloc_info() to vmalloc.c
Now get_vmalloc_info() is in fs/proc/mmu.c. There is no reason that this
code must be here and it's implementation needs vmlist_lock and it iterate
a vmlist which may be internal data structure for vmalloc.
It is preferable that vmlist_lock and vmlist is only used in vmalloc.c
for maintainability. So move the code to vmalloc.c
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Mon, 29 Apr 2013 22:07:27 +0000 (15:07 -0700)]
mm, vmalloc: change iterating a vmlist to find_vm_area()
This patchset removes vm_struct list management after initializing
vmalloc. Adding and removing an entry to vmlist is linear time
complexity, so it is inefficient. If we maintain this list, overall
time complexity of adding and removing area to vmalloc space is O(N),
although we use rbtree for finding vacant place and it's time complexity
is just O(logN).
And vmlist and vmlist_lock is used many places of outside of vmalloc.c.
It is preferable that we hide this raw data structure and provide
well-defined function for supporting them, because it makes that they
cannot mistake when manipulating theses structure and it makes us easily
maintain vmalloc layer.
For kexec and makedumpfile, I export vmap_area_list, instead of vmlist.
This comes from Atsushi's recommendation. For more information, please
refer below link. https://lkml.org/lkml/2012/12/6/184
This patch:
The purpose of iterating a vmlist is finding vm area with specific virtual
address. find_vm_area() is provided for this purpose and more efficient,
because it uses a rbtree. So change it.
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Darrick J. Wong [Mon, 29 Apr 2013 22:07:25 +0000 (15:07 -0700)]
mm: make snapshotting pages for stable writes a per-bio operation
Walking a bio's page mappings has proved problematic, so create a new
bio flag to indicate that a bio's data needs to be snapshotted in order
to guarantee stable pages during writeback. Next, for the one user
(ext3/jbd) of snapshotting, hook all the places where writes can be
initiated without PG_writeback set, and set BIO_SNAP_STABLE there.
We must also flag journal "metadata" bios for stable writeout, since
file data can be written through the journal. Finally, the
MS_SNAP_STABLE mount flag (only used by ext3) is now superfluous, so get
rid of it.
[akpm@linux-foundation.org: rename _submit_bh()'s `flags' to `bio_flags', delobotomize the _submit_bh declaration]
[akpm@linux-foundation.org: teeny cleanup]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gerald Schaefer [Mon, 29 Apr 2013 22:07:23 +0000 (15:07 -0700)]
mm/hugetlb: add more arch-defined huge_pte functions
Commit
abf09bed3cce ("s390/mm: implement software dirty bits")
introduced another difference in the pte layout vs. the pmd layout on
s390, thoroughly breaking the s390 support for hugetlbfs. This requires
replacing some more pte_xxx functions in mm/hugetlbfs.c with a
huge_pte_xxx version.
This patch introduces those huge_pte_xxx functions and their generic
implementation in asm-generic/hugetlb.h, which will now be included on
all architectures supporting hugetlbfs apart from s390. This change
will be a no-op for those architectures.
[akpm@linux-foundation.org: fix warning]
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Hillf Danton <dhillf@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.cz> [for !s390 parts]
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Josh Triplett [Mon, 29 Apr 2013 22:07:22 +0000 (15:07 -0700)]
fs: don't compile in drop_caches.c when CONFIG_SYSCTL=n
drop_caches.c provides code only invokable via sysctl, so don't compile it
in when CONFIG_SYSCTL=n.
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Mon, 29 Apr 2013 22:07:21 +0000 (15:07 -0700)]
cgroup: remove css_get_next
Now that we have generic and well ordered cgroup tree walkers there is
no need to keep css_get_next in the place.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Ying Han <yinghan@google.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Mon, 29 Apr 2013 22:07:19 +0000 (15:07 -0700)]
memcg: further simplify mem_cgroup_iter
mem_cgroup_iter basically does two things currently. It takes care of
the house keeping (reference counting, raclaim cookie) and it iterates
through a hierarchy tree (by using cgroup generic tree walk). The code
would be much more easier to follow if we move the iteration outside of
the function (to __mem_cgrou_iter_next) so the distinction is more
clear. This patch doesn't introduce any functional changes.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Ying Han <yinghan@google.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Mon, 29 Apr 2013 22:07:18 +0000 (15:07 -0700)]
memcg: simplify mem_cgroup_iter
The current implementation of mem_cgroup_iter has to consider both css
and memcg to find out whether no group has been found (css==NULL - aka
the loop is completed) and that no memcg is associated with the found
node (!memcg - aka css_tryget failed because the group is no longer
alive). This leads to awkward tweaks like tests for css && !memcg to
skip the current node.
It will be much easier if we got rid off css variable altogether and
only rely on memcg. In order to do that the iteration part has to skip
dead nodes. This sounds natural to me and as a nice side effect we will
get a simple invariant that memcg is always alive when non-NULL and all
nodes have been visited otherwise.
We could get rid of the surrounding while loop but keep it in for now to
make review easier. It will go away in the following patch.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Ying Han <yinghan@google.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Mon, 29 Apr 2013 22:07:17 +0000 (15:07 -0700)]
memcg: relax memcg iter caching
Now that the per-node-zone-priority iterator caches memory cgroups
rather than their css ids we have to be careful and remove them from the
iterator when they are on the way out otherwise they might live for
unbounded amount of time even though their group is already gone (until
the global/targeted reclaim triggers the zone under priority to find out
the group is dead and let it to find the final rest).
We can fix this issue by relaxing rules for the last_visited memcg.
Instead of taking a reference to the css before it is stored into
iter->last_visited we can just store its pointer and track the number of
removed groups from each memcg's subhierarchy.
This number would be stored into iterator everytime when a memcg is
cached. If the iter count doesn't match the curent walker root's one we
will start from the root again. The group counter is incremented
upwards the hierarchy every time a group is removed.
The iter_lock can be dropped because racing iterators cannot leak the
reference anymore as the reference count is not elevated for
last_visited when it is cached.
Locking rules got a bit complicated by this change though. The iterator
primarily relies on rcu read lock which makes sure that once we see a
valid last_visited pointer then it will be valid for the whole RCU walk.
smp_rmb makes sure that dead_count is read before last_visited and
last_dead_count while smp_wmb makes sure that last_visited is updated
before last_dead_count so the up-to-date last_dead_count cannot point to
an outdated last_visited. css_tryget then makes sure that the
last_visited is still alive in case the iteration races with the cached
group removal (css is invalidated before mem_cgroup_css_offline
increments dead_count).
In short:
mem_cgroup_iter
rcu_read_lock()
dead_count = atomic_read(parent->dead_count)
smp_rmb()
if (dead_count != iter->last_dead_count)
last_visited POSSIBLY INVALID -> last_visited = NULL
if (!css_tryget(iter->last_visited))
last_visited DEAD -> last_visited = NULL
next = find_next(last_visited)
css_tryget(next)
css_put(last_visited) // css would be invalidated and parent->dead_count
// incremented if this was the last reference
iter->last_visited = next
smp_wmb()
iter->last_dead_count = dead_count
rcu_read_unlock()
cgroup_rmdir
cgroup_destroy_locked
atomic_add(CSS_DEACT_BIAS, &css->refcnt) // subsequent css_tryget fail
mem_cgroup_css_offline
mem_cgroup_invalidate_reclaim_iterators
while(parent = parent_mem_cgroup)
atomic_inc(parent->dead_count)
css_put(css) // last reference held by cgroup core
Spotted by Ying Han.
Original idea from Johannes Weiner.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ying Han <yinghan@google.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Mon, 29 Apr 2013 22:07:15 +0000 (15:07 -0700)]
memcg: rework mem_cgroup_iter to use cgroup iterators
mem_cgroup_iter curently relies on css->id when walking down a group
hierarchy tree. This is really awkward because the tree walk depends on
the groups creation ordering. The only guarantee is that a parent node is
visited before its children.
Example:
1) mkdir -p a a/d a/b/c
2) mkdir -a a/b/c a/d
Will create the same trees but the tree walks will be different:
1) a, d, b, c
2) a, b, c, d
Commit
574bd9f7c7c1 ("cgroup: implement generic child / descendant walk
macros") has introduced generic cgroup tree walkers which provide either
pre-order or post-order tree walk. This patch converts css->id based
iteration to pre-order tree walk to keep the semantic with the original
iterator where parent is always visited before its subtree.
cgroup_for_each_descendant_pre suggests using post_create and
pre_destroy for proper synchronization with groups addidition resp.
removal. This implementation doesn't use those because a new memory
cgroup is initialized sufficiently for iteration in mem_cgroup_css_alloc
already and css reference counting enforces that the group is alive for
both the last seen cgroup and the found one resp. it signals that the
group is dead and it should be skipped.
If the reclaim cookie is used we need to store the last visited group
into the iterator so we have to be careful that it doesn't disappear in
the mean time. Elevated reference count on the css keeps it alive even
though the group have been removed (parked waiting for the last dput so
that it can be freed).
Per node-zone-prio iter_lock has been introduced to ensure that
css_tryget and iter->last_visited is set atomically. Otherwise two
racing walkers could both take a references and only one release it
leading to a css leak (which pins cgroup dentry).
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Ying Han <yinghan@google.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Mon, 29 Apr 2013 22:07:14 +0000 (15:07 -0700)]
memcg: keep prev's css alive for the whole mem_cgroup_iter
The patchset tries to make mem_cgroup_iter saner in the way how it walks
hierarchies. css->id based traversal is far from being ideal as it is not
deterministic because it depends on the creation ordering. Additional to
that css_id is considered a burden for cgroup maintainers because it is
quite some code and memcg is the last user of it. After this series only
the swap accounting uses css_id but that one will follow up later.
Diffstat (if we exclude removed/added comments) looks quite
promising. We got rid of some code:
$ git diff mmotm... | grep -v "^[+-][[:space:]]*[/ ]\*" | diffstat
b/include/linux/cgroup.h | 3 ---
kernel/cgroup.c | 33 ---------------------------------
mm/memcontrol.c | 4 +++-
3 files changed, 3 insertions(+), 37 deletions(-)
The first patch is just preparatory and it changes when we release css of
the previously returned memcg. Nothing controlversial.
The second patch is the core of the patchset and it replaces css_get_next
based on css_id by the generic cgroup pre-order. This brings some
chalanges for the last visited group caching during the reclaim
(mem_cgroup_per_zone::reclaim_iter). We have to use memcg pointers
directly now which means that we have to keep a reference to those groups'
css to keep them alive.
I also folded iter_lock introduced by https://lkml.org/lkml/2013/1/3/295
in the previous version into this patch. Johannes felt the race I was
describing should be mostly harmless and I haven't been able to trigger it
so the lock doesn't deserve its own patch. It is still needed
temporarily, though, because the reference counting on iter->last_visited
depends on it. It will go away with the next patch.
The next patch fixups an unbounded cgroup removal holdoff caused by the
elevated css refcount. The issue has been observed by Ying Han. Johannes
wasn't impressed by the previous version of the fix
(https://lkml.org/lkml/2013/2/8/379) which cleaned up pending references
during mem_cgroup_css_offline when a group is removed. He has suggested a
different way when the iterator checks whether a cached memcg is still
valid or no. More on that in the patch but the basic idea is that every
memcg tracks the number removed subgroups and iterator records this number
when a group is cached. These numbers are checked before
iter->last_visited is about to be used and the iteration is restarted if
it is invalid.
The fourth and fifth patches are an attempt for simplification of the
mem_cgroup_iter. css juggling is removed and the iteration logic is moved
to a helper so that the reference counting and iteration are separated.
The last patch just removes css_get_next as there is no user for it any
longer.
My testing looked as follows:
A (use_hierarchy=1, limit_in_bytes=150M)
/|\
1 2 3
Children groups were created so that the number is never higher than 3 and
their limits were random between 50-100M. Each group hosts a kernel build
(starting with tar -xf so the tree is not shared and make -jNUM_CPUs/3)
and terminated after random time - up to 5 minutes) and then it is
removed.
This should exercise both leaf and hierarchical reclaim as well as races
with cgroup removals and debugging messages I added on top proved that.
100 groups were created during the test.
This patch:
css reference counting keeps the cgroup alive even though it has been
already removed. mem_cgroup_iter relies on this fact and takes a
reference to the returned group. The reference is then released on the
next iteration or mem_cgroup_iter_break. mem_cgroup_iter currently
releases the reference right after it gets the last css_id.
This is correct because neither prev's memcg nor cgroup are accessed after
then. This will change in the next patch so we need to hold the group
alive a bit longer so let's move the css_put at the end of the function.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Ying Han <yinghan@google.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:07:12 +0000 (15:07 -0700)]
mm/x86: use free_highmem_page() to free highmem pages into buddy system
Use helper function free_highmem_page() to free highmem pages into
the buddy system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Cong Wang <amwang@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Attilio Rao <attilio.rao@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:07:11 +0000 (15:07 -0700)]
mm/um: use free_highmem_page() to free highmem pages into buddy system
Use helper function free_highmem_page() to free highmem pages into
the buddy system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:07:10 +0000 (15:07 -0700)]
mm/SPARC: use free_highmem_page() to free highmem pages into buddy system
Use helper function free_highmem_page() to free highmem pages into
the buddy system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:07:08 +0000 (15:07 -0700)]
mm/PPC: use free_highmem_page() to free highmem pages into buddy system
Use helper function free_highmem_page() to free highmem pages into
the buddy system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: "Suzuki K. Poulose" <suzuki@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:07:07 +0000 (15:07 -0700)]
mm/MIPS: use free_highmem_page() to free highmem pages into buddy system
Use helper function free_highmem_page() to free highmem pages into
the buddy system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Cong Wang <amwang@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:07:06 +0000 (15:07 -0700)]
mm/microblaze: use free_highmem_page() to free highmem pages into buddy system
Use helper function free_highmem_page() to free highmem pages into
the buddy system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:07:05 +0000 (15:07 -0700)]
mm/metag: use free_highmem_page() to free highmem pages into buddy system
Use helper function free_highmem_page() to free highmem pages into
the buddy system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:07:04 +0000 (15:07 -0700)]
mm/FRV: use free_highmem_page() to free highmem pages into buddy system
Use helper function free_highmem_page() to free highmem pages into
the buddy system.
Also fix a bug that totalhigh_pages should be increased when freeing
a highmem page into the buddy system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:07:03 +0000 (15:07 -0700)]
mm/ARM: use free_highmem_page() to free highmem pages into buddy system
Use helper function free_highmem_page() to free highmem pages into
the buddy system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:07:00 +0000 (15:07 -0700)]
mm: introduce free_highmem_page() helper to free highmem pages into buddy system
The original goal of this patchset is to fix the bug reported by
https://bugzilla.kernel.org/show_bug.cgi?id=53501
Now it has also been expanded to reduce common code used by memory
initializion.
This is the second part, which applies to the previous part at:
http://marc.info/?l=linux-mm&m=
136289696323825&w=2
It introduces a helper function free_highmem_page() to free highmem
pages into the buddy system when initializing mm subsystem.
Introduction of free_highmem_page() is one step forward to clean up
accesses and modificaitons of totalhigh_pages, totalram_pages and
zone->managed_pages etc. I hope we could remove all references to
totalhigh_pages from the arch/ subdirectory.
We have only tested these patchset on x86 platforms, and have done basic
compliation tests using cross-compilers from ftp.kernel.org. That means
some code may not pass compilation on some architectures. So any help
to test this patchset are welcomed!
There are several other parts still under development:
Part3: refine code to manage totalram_pages, totalhigh_pages and
zone->managed_pages
Part4: introduce helper functions to simplify mem_init() and remove the
global variable num_physpages.
This patch:
Introduce helper function free_highmem_page(), which will be used by
architectures with HIGHMEM enabled to free highmem pages into the buddy
system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Suzuki K. Poulose" <suzuki@in.ibm.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Attilio Rao <attilio.rao@citrix.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Cong Wang <amwang@redhat.com>
Cc: David Daney <david.daney@cavium.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rik van Riel <riel@redhat.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:58 +0000 (15:06 -0700)]
mm,kexec: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:57 +0000 (15:06 -0700)]
mm/metag: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:56 +0000 (15:06 -0700)]
mm/arc: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:55 +0000 (15:06 -0700)]
mm/xtensa: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:53 +0000 (15:06 -0700)]
mm/x86: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:52 +0000 (15:06 -0700)]
mm/unicore32: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:51 +0000 (15:06 -0700)]
mm/um: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:50 +0000 (15:06 -0700)]
mm/SH: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:49 +0000 (15:06 -0700)]
mm/score: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:48 +0000 (15:06 -0700)]
mm/s390: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:47 +0000 (15:06 -0700)]
mm/ppc: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anatolij Gustschin <agust@denx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:46 +0000 (15:06 -0700)]
mm/parisc: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:45 +0000 (15:06 -0700)]
mm/openrisc: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Also include <asm/sections.h> to avoid local declarations.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:44 +0000 (15:06 -0700)]
mm/mn10300: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:43 +0000 (15:06 -0700)]
mm/MIPS: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:42 +0000 (15:06 -0700)]
mm/microblaze: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:41 +0000 (15:06 -0700)]
mm/m68k: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:40 +0000 (15:06 -0700)]
mm/m32r: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Also include <asm/sections.h> to avoid local declarations.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:39 +0000 (15:06 -0700)]
mm/IA64: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:38 +0000 (15:06 -0700)]
mm/h8300: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:37 +0000 (15:06 -0700)]
mm/FRV: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:36 +0000 (15:06 -0700)]
mm/cris: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Also include <asm/sections.h> to avoid local declaration.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Mikael Starvik <starvik@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:34 +0000 (15:06 -0700)]
mm/c6x: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:33 +0000 (15:06 -0700)]
mm/blackfin: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:27 +0000 (15:06 -0700)]
mm/avr32: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:26 +0000 (15:06 -0700)]
mm/ARM: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:25 +0000 (15:06 -0700)]
mm/alpha: use common help functions to free reserved pages
Use common help functions to free reserved pages. Also include
<asm/sections.h> to avoid local declarations.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Mon, 29 Apr 2013 22:06:21 +0000 (15:06 -0700)]
mm: introduce common help functions to deal with reserved/managed pages
The original goal of this patchset is to fix the bug reported by
https://bugzilla.kernel.org/show_bug.cgi?id=53501
Now it has also been expanded to reduce common code used by memory
initializion.
This is the first part, which applies to v3.9-rc1.
It introduces following common helper functions to simplify
free_initmem() and free_initrd_mem() on different architectures:
adjust_managed_page_count():
will be used to adjust totalram_pages, totalhigh_pages,
zone->managed_pages when reserving/unresering a page.
__free_reserved_page():
free a reserved page into the buddy system without adjusting
page statistics info
free_reserved_page():
free a reserved page into the buddy system and adjust page
statistics info
mark_page_reserved():
mark a page as reserved and adjust page statistics info
free_reserved_area():
free a continous ranges of pages by calling free_reserved_page()
free_initmem_default():
default method to free __init pages.
We have only tested these patchset on x86 platforms, and have done basic
compliation tests using cross-compilers from ftp.kernel.org. That means
some code may not pass compilation on some architectures. So any help to
test this patchset are welcomed!
There are several other parts still under development:
Part2: introduce free_highmem_page() to simplify freeing highmem pages
Part3: refine code to manage totalram_pages, totalhigh_pages and
zone->managed_pages
Part4: introduce helper functions to simplify mem_init() and remove the
global variable num_physpages.
This patch:
Code to deal with reserved/managed pages are duplicated by many
architectures, so introduce common help functions to reduce duplicated
code. These common help functions will also be used to concentrate code
to modify totalram_pages and zone->managed_pages, which makes the code
much more clear.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hillf Danton [Mon, 29 Apr 2013 22:06:19 +0000 (15:06 -0700)]
mm/vmscan.c: minor cleanup for kswapd
Local variable total_scanned is no longer used.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Mon, 29 Apr 2013 22:06:18 +0000 (15:06 -0700)]
direct-io: submit bio after boundary buffer is added to it
Currently, dio_send_cur_page() submits bio before current page and cached
sdio->cur_page is added to the bio if sdio->boundary is set. This is
actually wrong because sdio->boundary means the current buffer is the last
one before metadata needs to be read. So we should rather submit the bio
after the current page is added to it.
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Tested-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Mon, 29 Apr 2013 22:06:17 +0000 (15:06 -0700)]
direct-io: fix boundary block handling
When we read/write a file sequentially, we will read/write not only the
data blocks but also the indirect blocks that may not be physically
adjacent to the data blocks. So filesystems set the BH_Boundary flag to
submit the previous I/O before reading/writing an indirect block.
However the generic direct IO code mishandles buffer_boundary(), setting
sdio->boundary before each submit_page_section() call which results in
sending only one page bios as underlying code thinks this page is the last
in the contiguous extent. So fix the problem by setting sdio->boundary
only if the current page is really the last one in the mapped extent.
With this patch and "direct-io: submit bio after boundary buffer is added
to it" I've measured about 10% throughput improvement of direct IO reads
on ext3 with SATA harddrive (from 90 MB/s to 100 MB/s). With ramdisk, the
improvement was about 3-fold (from 350 MB/s to 1.2 GB/s). For other
filesystems (such as ext4), the improvements won't be as visible because
the frequency of BH_Boundary flag being set is much smaller.
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Tested-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Toshi Kani [Mon, 29 Apr 2013 22:06:16 +0000 (15:06 -0700)]
mm: walk_memory_range(): fix typo in comment
Fix a typo "end_pft" in the comment of walk_memory_range().
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vineet Gupta [Mon, 29 Apr 2013 22:06:15 +0000 (15:06 -0700)]
memblock: add assertion for zero allocation alignment
This came to light when calling memblock allocator from arc port (for
copying flattended DT). If a "0" alignment is passed, the allocator
round_up() call incorrectly rounds up the size to 0.
round_up(num, alignto) => ((num - 1) | (alignto -1)) + 1
While the obvious allocation failure causes kernel to panic, it is better
to warn the caller to fix the code.
Tejun suggested that instead of BUG_ON(!align) - which might be
ineffective due to pending console init and such, it is better to WARN_ON,
and continue the boot with a reasonable default align.
Caller passing @size need not be handled similarly as the subsequent
panic will indicate that anyhow.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hillf Danton [Mon, 29 Apr 2013 22:06:14 +0000 (15:06 -0700)]
rmap: recompute pgoff for unmapping huge page
We have to recompute pgoff if the given page is huge, since result based
on HPAGE_SIZE is not approapriate for scanning the vma interval tree, as
shown by commit
36e4f20af833 ("hugetlb: do not use vma_hugecache_offset()
for vma_prio_tree_foreach").
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul E. McKenney [Mon, 29 Apr 2013 22:06:13 +0000 (15:06 -0700)]
vm: adjust ifdef for TINY_RCU
There is an ifdef in page_cache_get_speculative() that checks for !SMP
and TREE_RCU, which has been an impossible combination since the advent
of TINY_RCU. The ifdef enables a fastpath that is valid when preemption
is disabled by rcu_read_lock() in UP systems, which is the case when
TINY_RCU is enabled. This commit therefore adjusts the ifdef to
generate the fastpath when TINY_RCU is enabled.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Mon, 29 Apr 2013 22:06:12 +0000 (15:06 -0700)]
mm/shmem.c: remove an ifdef
Create a CONFIG_MMU=y stub for ramfs_nommu_expand_for_mapping() in the
usual fashion.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Mon, 29 Apr 2013 22:06:11 +0000 (15:06 -0700)]
mm, show_mem: suppress page counts in non-blockable contexts
On large systems with a lot of memory, walking all RAM to determine page
types may take a half second or even more.
In non-blockable contexts, the page allocator will emit a page allocation
failure warning unless __GFP_NOWARN is specified. In such contexts, irqs
are typically disabled and such a lengthy delay may even result in NMI
watchdog timeouts.
To fix this, suppress the page walk in such contexts when printing the
page allocation failure warning.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robert Jarzmik [Mon, 29 Apr 2013 22:06:10 +0000 (15:06 -0700)]
mm: trace filemap add and del
Use the events API to trace filemap loading and unloading of file pieces
into the page cache.
This patch aims at tracing the eviction reload cycle of executable and
shared libraries pages in a memory constrained environment.
The typical usage is to spot a specific device and inode (for example
/lib/libc.so) to see the eviction cycles, and find out if frequently
used code is rather spread across many pages (bad) or coallesced (good).
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Mon, 29 Apr 2013 22:06:08 +0000 (15:06 -0700)]
HWPOISON: check dirty flag to match against clean page
Currently page_action() does not check dirty flag to determine whether
the error page is "clean mlocked/unevictable LRU" page. This doesn't
cause any misjudgement because we do matching against "dirty
mlocked/unevictable LRU" just before the check. But in order to make
code consistent and/or to avoid potential regression, we had better
check dirty flag explicitly.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Suggested-by: Chen Gong <gong.chen@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ming Lei [Mon, 29 Apr 2013 22:06:07 +0000 (15:06 -0700)]
fs/read_write.c: fix generic_file_llseek() comment
Commit
ef3d0fd27e90 ("vfs: do (nearly) lockless generic_file_llseek")
has removed i_mutex from generic_file_llseek, so update the comment
accordingly.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
James Hogan [Mon, 29 Apr 2013 22:06:06 +0000 (15:06 -0700)]
debug_locks.h: make warning more verbose
The WARN_ON(1) in DEBUG_LOCKS_WARN_ON is surprisingly awkward to track
down when it's hit, as it's usually buried in macros, causing multiple
instances to land on the same line number.
This patch makes it more useful by switching to:
WARN(1, "DEBUG_LOCKS_WARN_ON(%s)", #c);
so that the particular DEBUG_LOCKS_WARN_ON is more easily identified and
grep'd for. For example:
WARNING: at kernel/mutex.c:198 _mutex_lock_nested+0x31c/0x380()
DEBUG_LOCKS_WARN_ON(l->magic != l)
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sachin Kamat [Mon, 29 Apr 2013 22:06:00 +0000 (15:06 -0700)]
ocfs2/dlm: remove redundant null pointer check
kfree on a NULL pointer is a no-op. Remove the redundant null pointer
check.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Mon, 29 Apr 2013 22:05:59 +0000 (15:05 -0700)]
ocfs2: fix NULL dereference for moving extents
We can't dereference "bg" before it has been assigned. GCC should have
warned about this but "bg" was initialized to NULL. I've fixed that as
well.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Mon, 29 Apr 2013 22:05:58 +0000 (15:05 -0700)]
ocfs2: fix error handling in ocfs2_ioctl_move_extents()
Smatch complains that if we hit an error (for example if the file is
immutable) then "range" has uninitialized stack data and we copy it to
the user.
I've re-written the error handling to avoid this problem and make it a
little cleaner as well.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wei Yongjun [Mon, 29 Apr 2013 22:05:57 +0000 (15:05 -0700)]
ocfs2: fix error return code in ocfs2_info_handle_freefrag()
Fix to return a negative error code from the error handling case instead
of 0, as returned elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Liu [Mon, 29 Apr 2013 22:05:56 +0000 (15:05 -0700)]
ocfs2: delay inode update transactions after verifying the input flags
There is no need to start the inode update transactions before/while
verifying the input flags. As a refinement, this patch delay the
transactions utill the pre-check up is ok.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Borislav Petkov [Mon, 29 Apr 2013 22:05:54 +0000 (15:05 -0700)]
scripts/decodecode: make faulting insn ptr more robust
It can accidentally happen that the faulting insn (the exact instruction
bytes) is repeated a little further on in the trace. This causes that
same instruction to be tagged twice, see example below.
What we want to do, however, is to track back from the end of the whole
disassembly so many lines as the slice which starts with the faulting
instruction is long. This leads us to the actual faulting instruction
and *then* we tag it.
While we're at it, we can drop the sed "g" flag because we address only
this one line.
Also, if we point to an instruction which changes decoding depending on
the slice being objdumped, like a Jcc insn, for example, we do not even
tag it as a faulting instruction because the instruction decode changes
in the second slice but we use that second format as a regex on the
fsrst disassembled buffer and more often than not that instruction
doesn't match.
Again, simply tag the line which is deduced from the original "<>"
marking we've received from the kernel.
This also solves the pathologic issue of multiple tagging like this:
29:* 0f 0b ud2 <-- trapping instruction
2b:* 0f 0b ud2 <-- trapping instruction
2d:* 0f 0b ud2 <-- trapping instruction
Double tagging example:
Code: 34 dd 40 30 ad 81 48 c7 c0 80 f6 00 00 48 8b 3c 30 48 01 c6 b8 ff ff ff ff 48 8d 57 f0 48 39 f7 74 2f 49 8b 4c 24 08 48 8b 47 f0 <48> 39 48 08 75 0e eb 2a 66 90 48 8b 40 f0 48 39 48 08 74 1e 48
All code
========
0: 34 dd xor $0xdd,%al
2: 40 30 ad 81 48 c7 c0 xor %bpl,-0x3f38b77f(%rbp)
9: 80 f6 00 xor $0x0,%dh
c: 00 48 8b add %cl,-0x75(%rax)
f: 3c 30 cmp $0x30,%al
11: 48 01 c6 add %rax,%rsi
14: b8 ff ff ff ff mov $0xffffffff,%eax
19: 48 8d 57 f0 lea -0x10(%rdi),%rdx
1d: 48 39 f7 cmp %rsi,%rdi
20: 74 2f je 0x51
22: 49 8b 4c 24 08 mov 0x8(%r12),%rcx
27: 48 8b 47 f0 mov -0x10(%rdi),%rax
2b:* 48 39 48 08 cmp %rcx,0x8(%rax) <-- trapping instruction
2f: 75 0e jne 0x3f
31: eb 2a jmp 0x5d
33: 66 90 xchg %ax,%ax
35: 48 8b 40 f0 mov -0x10(%rax),%rax
39:* 48 39 48 08 cmp %rcx,0x8(%rax) <-- trapping instruction
3d: 74 1e je 0x5d
3f: 48 rex.W
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rob Landley [Mon, 29 Apr 2013 22:05:53 +0000 (15:05 -0700)]
mkcapflags.pl: convert to mkcapflags.sh
Generate asm-x86/cpufeature.h with posix-2008 commands instead of perl.
Signed-off-by: Rob Landley <rob@landley.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowell@redhat.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anurup m [Mon, 29 Apr 2013 22:05:52 +0000 (15:05 -0700)]
fs/fscache/stats.c: fix memory leak
There is a kernel memory leak observed when the proc file
/proc/fs/fscache/stats is read.
The reason is that in fscache_stats_open, single_open is called and the
respective release function is not called during release. Hence fix
with correct release function - single_release().
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=57101
Signed-off-by: Anurup m <anurup.m@huawei.com>
Cc: shyju pv <shyju.pv@huawei.com>
Cc: Sanil kumar <sanil.kumar@huawei.com>
Cc: Nataraj m <nataraj.m@huawei.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zhou Zhu [Mon, 29 Apr 2013 22:05:50 +0000 (15:05 -0700)]
drivers/video/mmp: remove legacy hw definitions
Removed legacy hw definitions in hw/mmp_ctrl.h. These definitions are
for earlier soc versions and are not supported in this driver.
Signed-off-by: Zhou Zhu <zzhu3@marvell.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: Lisa Du <cldu@marvell.com>
Cc: Guoqing Li <ligq@marvell.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
H Hartley Sweeten [Mon, 29 Apr 2013 22:05:44 +0000 (15:05 -0700)]
drivers/video/ep93xx-fb.c: fix section mismatch and use module_platform_driver
Remove the __init tags from the ep93xxfb_calc_fbsize() and
ep93xxfb_alloc_videomem() functions to fix the section mismatch
warnings.
Use module_platform_driver() to remove the init/exit boilerplate.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sachin Kamat [Mon, 29 Apr 2013 22:05:43 +0000 (15:05 -0700)]
drivers/video/exynos/exynos_mipi_dsi.c: convert to devm_ioremap_resource()
Use the newly introduced devm_ioremap_resource() instead of
devm_request_and_ioremap() which provides more consistent error
handling.
devm_ioremap_resource() provides its own error messages; so all explicit
error messages can be removed from the failure code paths.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Donghwa Lee <dh09.lee@samsung.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Reviewed-by: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Haiyang Zhang [Mon, 29 Apr 2013 22:05:42 +0000 (15:05 -0700)]
drivers/video: add Hyper-V Synthetic Video Frame Buffer Driver
This is the driver for the Hyper-V Synthetic Video, which supports
screen resolution up to Full HD 1920x1080 on Windows Server 2012 host,
and 1600x1200 on Windows Server 2008 R2 or earlier. It also solves the
double mouse cursor issue of the emulated video mode.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Cc: Olaf Hering <olaf@aepfle.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Devendra Naga [Mon, 29 Apr 2013 22:05:31 +0000 (15:05 -0700)]
drivers/video/console/fbcon_cw.c: fix compiler warning in cw_update_attr
With make W=1 we get
drivers/video/console/fbcon_cw.c: In function `cw_update_attr':
drivers/video/console/fbcon_cw.c:30:8: warning: variable `t' set but not used [-Wunused-but-set-variable]
fixed by removing as since its used nowhere
Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shubhrajyoti Datta [Mon, 29 Apr 2013 22:05:25 +0000 (15:05 -0700)]
matroxfb: convert struct i2c_msg initialization to C99 format
Convert the struct i2c_msg initialization to C99 format. This makes
maintaining and editing the code simpler. Also helps once other fields
like transferred are added in future.
Thanks to Julia Lawall for automating the conversion.
Signed-off-by: Shubhrajyoti D <shubhrajyoti@ti.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Julia Lawall <julia@diku.dk>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chen Gang [Mon, 29 Apr 2013 22:05:19 +0000 (15:05 -0700)]
kernel/audit_tree.c: tree will leak memory when failure occurs in audit_trim_trees()
audit_trim_trees() calls get_tree(). If a failure occurs we must call
put_tree().
[akpm@linux-foundation.org: run put_tree() before mutex_lock() for small scalability improvement]
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chen Gang [Mon, 29 Apr 2013 22:05:18 +0000 (15:05 -0700)]
kernel/auditfilter.c: tree and watch will memory leak when failure occurs
In audit_data_to_entry() when a failure occurs we must check and free
the tree and watch to avoid a memory leak.
test:
plan:
test command:
"auditctl -a exit,always -w /etc -F auid=-1"
(on fedora17, need modify auditctl to let "-w /etc" has effect)
running:
under fedora17 x86_64, 2 CPUs 3.20GHz, 2.5GB RAM.
let 15 auditctl processes continue running at the same time.
monitor command:
watch -d -n 1 "cat /proc/meminfo | awk '{print \$2}' \
| head -n 4 | xargs \
| awk '{print \"used \",\$1 - \$2 - \$3 - \$4}'"
result:
for original version:
will use up all memory, within 3 hours.
kill all auditctl, the memory still does not free.
for new version (apply this patch):
after 14 hours later, not find issues.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gao feng [Mon, 29 Apr 2013 22:05:16 +0000 (15:05 -0700)]
audit: remove unnecessary #if CONFIG_AUDIT
The files which include kernel/audit.h are complied only when
CONFIG_AUDIT is set.
Just like audit_pid, there is no need to surround audit_ever_enabled
with CONFIG_AUDIT.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gao feng [Mon, 29 Apr 2013 22:05:15 +0000 (15:05 -0700)]
audit: remove duplicate export of audit_enabled
audit_enabled has already been exported in include/linux/audit.h. and
kernel/audit.h includes include/linux/audit.h, no need to export
aduit_enabled again in kernel/audit.h
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gao feng [Mon, 29 Apr 2013 22:05:14 +0000 (15:05 -0700)]
audit: don't check if kauditd is valid every time
We only need to check if kauditd is valid after we start it, if kauditd
is invalid, we will set kauditd_task to NULL. So next time, we will
start kauditd again.
It means if kauditd_task is not NULL,it must be valid.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rakib Mullick [Mon, 29 Apr 2013 22:05:13 +0000 (15:05 -0700)]
kernel/auditsc.c: use kzalloc instead of kmalloc+memset
In audit_alloc_context() use kzalloc instead of kmalloc+memset. Also
rename audit_zero_context() to audit_set_context(), to represent it's
inner workings properly.
[akpm@linux-foundation.org: remove audit_set_context() altogether - fold it into its caller]
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Mon, 29 Apr 2013 22:05:12 +0000 (15:05 -0700)]
kthread: kill task_get_live_kthread()
task_get_live_kthread() looks confusing and unneeded. It does
get_task_struct() but only kthread_stop() needs this, it can be called
even if the calller doesn't have a reference when we know that this
kthread can't exit until we do kthread_stop().
kthread_park() and kthread_unpark() do not need get_task_struct(), the
callers already have the reference. And it can not help if we can race
with the exiting kthread anyway, kthread_park() can hang forever in this
case.
Change kthread_park() and kthread_unpark() to use to_live_kthread(),
change kthread_stop() to do get_task_struct() by hand and remove
task_get_live_kthread().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Mon, 29 Apr 2013 22:05:01 +0000 (15:05 -0700)]
kthread: introduce to_live_kthread()
"k->vfork_done != NULL" with a barrier() after to_kthread(k) in
task_get_live_kthread(k) looks unclear, and sub-optimal because we load
->vfork_done twice.
All we need is to ensure that we do not return to_kthread(NULL). Add a
new trivial helper which loads/checks ->vfork_done once, this also looks
more understandable.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 29 Apr 2013 19:19:23 +0000 (12:19 -0700)]
Merge tag 'usb-3.10-rc1' of git://git./linux/kernel/git/gregkh/usb
Pull USB patches from Greg Kroah-Hartman:
"Here's the big USB pull request for 3.10-rc1.
Lots of USB patches here, the majority being USB gadget changes and
USB-serial driver cleanups, the rest being ARM build fixes / cleanups,
and individual driver updates. We also finally got some chipidea
fixes, which have been delayed for a number of kernel releases, as the
maintainer has now reappeared.
All of these have been in linux-next for a while"
* tag 'usb-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (568 commits)
USB: ehci-msm: USB_MSM_OTG needs USB_PHY
USB: OHCI: avoid conflicting platform drivers
USB: OMAP: ISP1301 needs USB_PHY
USB: lpc32xx: ISP1301 needs USB_PHY
USB: ftdi_sio: enable two UART ports on ST Microconnect Lite
usb: phy: tegra: don't call into tegra-ehci directly
usb: phy: phy core cannot yet be a module
USB: Fix initconst in ehci driver
usb-storage: CY7C68300A chips do not support Cypress ATACB
USB: serial: option: Added support Olivetti Olicard 145
USB: ftdi_sio: correct ST Micro Connect Lite PIDs
ARM: mxs_defconfig: add CONFIG_USB_PHY
ARM: imx_v6_v7_defconfig: add CONFIG_USB_PHY
usb: phy: remove exported function from __init section
usb: gadget: zero: put function instances on unbind
usb: gadget: f_sourcesink.c: correct a copy-paste misnomer
usb: gadget: cdc2: fix error return code in cdc_do_config()
usb: gadget: multi: fix error return code in rndis_do_config()
usb: gadget: f_obex: fix error return code in obex_bind()
USB: storage: convert to use module_usb_driver()
...
Linus Torvalds [Mon, 29 Apr 2013 19:16:17 +0000 (12:16 -0700)]
Merge tag 'tty-3.10-rc1' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial driver update from Greg Kroah-Hartman:
"Here's the big tty/serial driver merge request for 3.10-rc1
Once again, Jiri has a number of TTY driver fixes and cleanups, and
Peter Hurley came through with a bunch of ldisc fixes that resolve a
number of reported issues. There are some other serial driver
cleanups as well.
All of these have been in the linux-next tree for a while"
* tag 'tty-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (117 commits)
tty/serial/sirf: fix MODULE_DEVICE_TABLE
serial: mxs: drop superfluous {get|put}_device
serial: mxs: fix buffer overflow
ARM: PL011: add support for extended FIFO-size of PL011-r1p5
serial_core.c: add put_device() after device_find_child()
tty: Fix unsafe bit ops in tty_throttle_safe/unthrottle_safe
serial: sccnxp: Replace pdata.init/exit with regulator API
serial: sccnxp: Do not override device name
TTY: pty, fix compilation warning
TTY: rocket, fix compilation warning
TTY: ircomm: fix DTR being raised on hang up
TTY: synclinkmp: fix DTR being raised on hang up
TTY: synclink_gt: fix DTR being raised on hang up
TTY: synclink: fix DTR being raised on hang up
serial: 8250_dw: Fix the stub for dw8250_probe_acpi()
serial: 8250_dw: Convert to devm_ioremap()
serial: 8250_dw: Set port capabilities based on CPR register
serial: 8250_dw: Let ACPI code extract the DMA client info
serial: 8250_dw: Support clk framework also with ACPI
serial: 8250_dw: Enable runtime PM
...
Linus Torvalds [Mon, 29 Apr 2013 18:34:17 +0000 (11:34 -0700)]
Merge tag 'staging-3.10-rc1' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver tree update from Greg Kroah-Hartman:
"Here's the big staging driver tree update for 3.10-rc1
This update contains loads of comedi driver cleanups and fixes in
here, iio updates, android driver changes, and other various staging
driver cleanups.
Thanks to some drivers being removed, and the comedi driver cleanups,
we have removed more code than we added:
627 files changed, 65145 insertions(+), 76321 deletions(-)
which is always nice to see.
All of these have been in linux-next for a while."
* tag 'staging-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (940 commits)
staging: comedi: ni_labpc: fix legacy driver build
staging: comedi: das800: cleanup the cio-das802/16 fifo comments
staging: comedi: das800: rename CamelCase vars in das800_ai_do_cmd()
staging: comedi: das800: tidy up the private data
staging: comedi: das800: tidy up das800_interrupt()
staging: comedi: das800: tidy up das800_ai_insn_read()
staging: comedi: das800: tidy up das800_di_insn_bits()
staging: comedi: das800: tidy up das800_do_insn_bits()
staging: comedi: das800: remove extra divisor calculation call
staging: comedi: das800: rename {enable,disable}_das800
staging: comedi: das800: tidy up subdevice init
staging: comedi: das800: allow attaching without interrupt support
staging: comedi: das800: interrupts are required for async command support
staging: comedi: das800: tidy up das800_ai_do_cmdtest()
staging: comedi: das800: remove 'volatile' on private data variables
staging: comedi: das800: cleanup the boardinfo
staging: comedi: das800: cleanup range table declarations
staging: comedi: das800: introduce das800_ind_{write, read}()
staging: comedi: das800: remove forward declarations
staging: comedi: das800: move das800_set_frequency()
...
Linus Torvalds [Mon, 29 Apr 2013 18:31:50 +0000 (11:31 -0700)]
Merge tag 'driver-core-3.10-rc1' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core update from Greg Kroah-Hartman:
"Here's the merge request for the driver core tree for 3.10-rc1
It's pretty small, just a number of driver core and sysfs updates and
fixes, all of which have been in linux-next for a while now.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
Fixed conflict in kernel/rtmutex-tester.c, the locking tree had a better
fix for the same sysfs file mode problem.
* tag 'driver-core-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
PM / Runtime: Idle devices asynchronously after probe|release
driver core: handle user namespaces properly with the uid/gid devtmpfs change
driver core: devtmpfs: fix compile failure with CONFIG_UIDGID_STRICT_TYPE_CHECKS
devtmpfs: add base.h include
driver core: add uid and gid to devtmpfs
sysfs: check if one entry has been removed before freeing
sysfs: fix crash_notes_size build warning
sysfs: fix use after free in case of concurrent read/write and readdir
rtmutex-tester: fix mode of sysfs files
Documentation: Add ABI entry for crash_notes and crash_notes_size
sysfs: Add crash_notes_size to export percpu note size
driver core: platform_device.h: fix checkpatch errors and warnings
driver core: platform.c: fix checkpatch errors and warnings
driver core: warn that platform_driver_probe can not use deferred probing
sysfs: use atomic_inc_unless_negative in sysfs_get_active
base: core: WARN() about bogus permissions on device attributes
device: separate all subsys mutexes
Linus Torvalds [Mon, 29 Apr 2013 18:18:34 +0000 (11:18 -0700)]
Merge tag 'char-misc-3.10-rc1' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver update from Greg Kroah-Hartman:
"Here's the big char / misc driver update for 3.10-rc1
A number of various driver updates, the majority being new
functionality in the MEI driver subsystem (it's now a subsystem, it
started out just a single driver), extcon updates, memory updates,
hyper-v updates, and a bunch of other small stuff that doesn't fit in
any other tree.
All of these have been in linux-next for a while"
* tag 'char-misc-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (148 commits)
Tools: hv: Fix a checkpatch warning
tools: hv: skip iso9660 mounts in hv_vss_daemon
tools: hv: use FIFREEZE/FITHAW in hv_vss_daemon
tools: hv: use getmntent in hv_vss_daemon
Tools: hv: Fix a checkpatch warning
tools: hv: fix checks for origin of netlink message in hv_vss_daemon
Tools: hv: fix warnings in hv_vss_daemon
misc: mark spear13xx-pcie-gadget as broken
mei: fix krealloc() misuse in in mei_cl_irq_read_msg()
mei: reduce flow control only for completed messages
mei: reseting -> resetting
mei: fix reading large reposnes
mei: revamp mei_irq_read_client_message function
mei: revamp mei_amthif_irq_read_message
mei: revamp hbm state machine
Revert "drivers/scsi: use module_pcmcia_driver() in pcmcia drivers"
Revert "scsi: pcmcia: nsp_cs: remove module init/exit function prototypes"
scsi: pcmcia: nsp_cs: remove module init/exit function prototypes
mei: wd: fix line over 80 characters
misc: tsl2550: Use dev_pm_ops
...
Linus Torvalds [Mon, 29 Apr 2013 16:45:48 +0000 (09:45 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon update from Guenter Roeck:
- New drivers for NCT6775, NCT6776, NCT6779, and LM95234.
- Added support for LTC2974, LTC3883, LM25056, TMP431, TMP432, ADT7310,
and ADT7320 to existing drivers.
- Various code cleanups and minor improvements.
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (54 commits)
hwmon: (nct6775) Fix coding style problems
hwmon: (nct6775) Constify strings
hwmon: (tmp401) Add support for TMP432
hwmon: (tmp401) Add support for update_interval attribute
hwmon: (tmp401) Reset valid flag when resetting temperature history
hwmon: (tmp401) Simplification and cleanup
hwmon: (tmp401) Use sysfs_create_group / sysfs_remove_group
hwmon: (tmp401) Drop unused defines, use BIT for bit masks
hwmon: (nct6775) Use ARRAY_SIZE for loops where possible
documentation: hwmon: Fix typo in documentation/hwmon
hwmon: (nct6775) Enable both AUXTIN and VIN3 on NCT6776
hwmon: (ad7314) use spi_get_drvdata() and spi_set_drvdata()
MAINTAINERS: Add myself as maintainer for the NCT6775 driver
hwmon: (nct6775) Expand scope of supported chips
hwmon: (gpio-fan) Use is_visible to determine if attributes should be created
hwmon: (tmp401) Fix device detection for TMP411B and TMP411C
hwmon: Add driver for LM95234
hwmon: (tmp401) Add support for TMP431
hwmon: (pmbus/lm25066) Add support for LM25056
hwmon: (pmbus/lm25066) Refactor device specific coefficients
...
Linus Torvalds [Mon, 29 Apr 2013 16:40:35 +0000 (09:40 -0700)]
Merge tag 'pinctrl-for-v3.10' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pinctrl update from Linus Walleij:
"These are the pinctrl changes for v3.10:
- Patrice Chotard contributed a new configuration debugfs interface
and reintroduced fine-grained locking into the core: instead of
having a "big pinctrl lock" we have a per-controller lock and
specialized locks for the global controller and pinctrl handle
lists.
- Haoijan Zhuang deleted all the PXA and MMP2 pinctrl drivers and
replaced them with pinctrl-single (which is also used by other
SoCs) so we are gaining consolidation. The platform particulars
now come in through the device tree.
- Haoijan also added support for generic pin config into the
pinctrl-single driver which is another big consolidation win.
- Finally also GPIO ranges are now supported by the pinctrl-single
driver.
- Tomasz Figa contributed a new Samsung S3C pinctrl driver, bringing
more of the older Samsung platforms under the pinctrl umbrella and
out of arch/arm.
- Maxime Ripard contributed new Allwinner A10/A13 drivers.
- Sachin Kamat, Wei Yongjun and Axel Lin did a lot of cleanups."
* tag 'pinctrl-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (66 commits)
pinctrl: move subsystem mutex to pinctrl_dev struct
pinctrl/pinconfig: fix misplaced goto
pinctrl: s3c64xx: Fix build error caused by undefined chained_irq_enter
pinctrl/pinconfig: add debug interface
pinctrl: abx500: fix issue when no pdata
pinctrl: pinctrl-single: add missing double quote
pinctrl: sunxi: Rename wemac functions to emac
pinctrl: exynos5440: add gpio interrupt support
pinctrl: exynos5440: fix probe failure due to missing pin-list in config nodes
pinctrl: ab8505: Staticize some symbols
pinctrl: ab8540: Staticize some symbols
pinctrl: ab9540: Staticize some symbols
pinctrl: ab8500: Staticize some symbols
pinctrl: abx500: Staticize some symbols
pinctrl: Add pinctrl-s3c64xx driver
pinctrl: samsung: Handle banks with two configuration registers
pinctrl: samsung: Remove hardcoded register offsets
pinctrl: samsung: Split pin bank description into two structures
pinctrl: samsung: Include pinctrl-exynos driver data conditionally
pinctrl: samsung: Protect bank registers with a spinlock
...
Linus Torvalds [Mon, 29 Apr 2013 16:35:27 +0000 (09:35 -0700)]
Merge tag 'fbdev-for-3.10' of git://gitorious.org/linux-omap-dss2/linux
Pull fbdev updates from Tomi Valkeinen:
- use vm_iomap_memory() in various fb drivers to map the fb memory to
userspace
- Cleanups for the videomode and display_timing features
- Updates to vt8500, wm8505 and auo-k190x fb drivers
* tag 'fbdev-for-3.10' of git://gitorious.org/linux-omap-dss2/linux: (36 commits)
fbdev: fix check for fb_mmap's mmio availability
fbdev: improve fb_mmap bounds checks
fbdev/ps3fb: use vm_iomap_memory()
fbdev/sgivwfb: use vm_iomap_memory()
fbdev/vermillion: use vm_iomap_memory()
fbdev/sa1100fb: use vm_iomap_memory()
fbdev/fb-puv3: use vm_iomap_memory()
fbdev/controlfb: use vm_iomap_memory()
fbdev/omapfb: use vm_iomap_memory()
video: vt8500: fix Kconfig for videomode
video/s3c: move platform_data out of arch/arm
video/exynos: remove unnecessary header inclusions
drivers/video: fsl-diu-fb: add hardware cursor support
drivers: video: use module_platform_driver_probe()
ARM: OMAP: remove "config FB_OMAP_CONSISTENT_DMA_SIZE"
video: wm8505fb: Convert to devm_ioremap_resource()
AUO-K190x: Add resolutions for portrait displays
AUO-K190x: add framebuffer rotation support
AUO-K190x: add a 16bit truecolor mode
AUO-K190x: make color handling more flexible
...
Linus Torvalds [Mon, 29 Apr 2013 16:30:25 +0000 (09:30 -0700)]
Merge tag 'pci-v3.10-changes' of git://git./linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"PCI changes for the v3.10 merge window:
PCI device hotplug
- Remove ACPI PCI subdrivers (Jiang Liu, Myron Stowe)
- Make acpiphp builtin only, not modular (Jiang Liu)
- Add acpiphp mutual exclusion (Jiang Liu)
Power management
- Skip "PME enabled/disabled" messages when not supported (Rafael
Wysocki)
- Fix fallback to PCI_D0 (Rafael Wysocki)
Miscellaneous
- Factor quirk_io_region (Yinghai Lu)
- Cache MSI capability offsets & cleanup (Gavin Shan, Bjorn Helgaas)
- Clean up EISA resource initialization and logging (Bjorn Helgaas)
- Fix prototype warnings (Andy Shevchenko, Bjorn Helgaas)
- MIPS: Initialize of_node before scanning bus (Gabor Juhos)
- Fix pcibios_get_phb_of_node() declaration "weak" annotation (Gabor
Juhos)
- Add MSI INTX_DISABLE quirks for AR8161/AR8162/etc (Xiong Huang)
- Fix aer_inject return values (Prarit Bhargava)
- Remove PME/ACPI dependency (Andrew Murray)
- Use shared PCI_BUS_NUM() and PCI_DEVID() (Shuah Khan)"
* tag 'pci-v3.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (63 commits)
vfio-pci: Use cached MSI/MSI-X capabilities
vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
PCI: Remove "extern" from function declarations
PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h
PCI: Use msix_table_size() directly, drop multi_msix_capable()
PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros
PCI: Drop is_64bit_address() and is_mask_bit_support() macros
PCI: Drop msi_data_reg() macro
PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros
PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly
PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc
PCI: Clean up MSI/MSI-X capability #defines
PCI: Use cached MSI-X cap while enabling MSI-X
PCI: Use cached MSI cap while enabling MSI interrupts
PCI: Remove MSI/MSI-X cap check in pci_msi_check_device()
PCI: Cache MSI/MSI-X capability offsets in struct pci_dev
PCI: Use u8, not int, for PM capability offset
[SCSI] megaraid_sas: Use correct #define for MSI-X capability
PCI: Remove "extern" from function declarations
...
Linus Torvalds [Mon, 29 Apr 2013 16:23:17 +0000 (09:23 -0700)]
Merge tag 'please-pull-misc-3.10' of git://git./linux/kernel/git/aegl/linux
Pull ia64 fixes from Tony Luck:
"Bundle of miscellaneous ia64 fixes for 3.10 merge window."
* tag 'please-pull-misc-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
Add size restriction to the kdump documentation
Fix example error_injection_tool
Fix build error for numa_clear_node() under IA64
Fix initialization of CMCI/CMCP interrupts
Change "select DMAR" to "select INTEL_IOMMU"
Wrong asm register contraints in the kvm implementation
Wrong asm register contraints in the futex implementation
Remove cast for kmalloc return value
Fix kexec oops when iosapic was removed
iosapic: fix a minor typo in comments
Add WB/UC check for early_ioremap
Fix broken fsys_getppid()
tiocx: check retval from bus_register()