GitHub/mt8127/android_kernel_alcatel_ttab.git
15 years agomm: remove pagevec_swap_free()
KOSAKI Motohiro [Tue, 31 Mar 2009 22:23:14 +0000 (15:23 -0700)]
mm: remove pagevec_swap_free()

pagevec_swap_free() is now unused.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: don't free swap slots on page deactivation
Johannes Weiner [Tue, 31 Mar 2009 22:23:13 +0000 (15:23 -0700)]
mm: don't free swap slots on page deactivation

The pagevec_swap_free() at the end of shrink_active_list() was introduced
in 68a22394 "vmscan: free swap space on swap-in/activation" when
shrink_active_list() was still rotating referenced active pages.

In 7e9cd48 "vmscan: fix pagecache reclaim referenced bit check" this was
changed, the rotating removed but the pagevec_swap_free() after the
rotation loop was forgotten, applying now to the pagevec of the
deactivation loop instead.

Now swap space is freed for deactivated pages.  And only for those that
happen to be on the pagevec after the deactivation loop.

Complete 7e9cd48 and remove the rest of the swap freeing.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: move pagevec stripping to save unlock-relock
Johannes Weiner [Tue, 31 Mar 2009 22:23:12 +0000 (15:23 -0700)]
mm: move pagevec stripping to save unlock-relock

In shrink_active_list() after the deactivation loop, we strip buffer heads
from the potentially remaining pages in the pagevec.

Currently, this drops the zone's lru lock for stripping, only to reacquire
it again afterwards to update statistics.

It is not necessary to strip the pages before updating the stats, so move
the whole thing out of the protected region and save the extra locking.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: MinChan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: reintroduce and deprecate rlimit based access for SHM_HUGETLB
Ravikiran G Thirumalai [Tue, 31 Mar 2009 22:21:26 +0000 (15:21 -0700)]
mm: reintroduce and deprecate rlimit based access for SHM_HUGETLB

Allow non root users with sufficient mlock rlimits to be able to allocate
hugetlb backed shm for now.  Deprecate this though.  This is being
deprecated because the mlock based rlimit checks for SHM_HUGETLB is not
consistent with mmap based huge page allocations.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Reviewed-by: Mel Gorman <mel@csn.ul.ie>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: fix SHM_HUGETLB to work with users in hugetlb_shm_group
Ravikiran G Thirumalai [Tue, 31 Mar 2009 22:19:40 +0000 (15:19 -0700)]
mm: fix SHM_HUGETLB to work with users in hugetlb_shm_group

Fix hugetlb subsystem so that non root users belonging to
hugetlb_shm_group can actually allocate hugetlb backed shm.

Currently non root users cannot even map one large page using SHM_HUGETLB
when they belong to the gid in /proc/sys/vm/hugetlb_shm_group.  This is
because allocation size is verified against RLIMIT_MEMLOCK resource limit
even if the user belongs to hugetlb_shm_group.

This patch
1. Fixes hugetlb subsystem so that users with CAP_IPC_LOCK and users
   belonging to hugetlb_shm_group don't need to be restricted with
   RLIMIT_MEMLOCK resource limits
2. This patch also disables mlock based rlimit checking (which will
   be reinstated and marked deprecated in a subsequent patch).

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Reviewed-by: Mel Gorman <mel@csn.ul.ie>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agovfs: add/use account_page_dirtied()
Edward Shishkin [Tue, 31 Mar 2009 22:19:39 +0000 (15:19 -0700)]
vfs: add/use account_page_dirtied()

Add a helper function account_page_dirtied().  Use that from two
callsites.  reiser4 adds a function which adds a third callsite.

Signed-off-by: Edward Shishkin<edward.shishkin@gmail.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agovmscan: respect higher order in zone_reclaim()
Johannes Weiner [Tue, 31 Mar 2009 22:19:38 +0000 (15:19 -0700)]
vmscan: respect higher order in zone_reclaim()

During page allocation, there are two stages of direct reclaim that are
applied to each zone in the preferred list.  The first stage using
zone_reclaim() reclaims unmapped file backed pages and slab pages if over
defined limits as these are cheaper to reclaim.  The caller specifies the
order of the target allocation but the scan control is not being correctly
initialised.

The impact is that the correct number of pages are being reclaimed but
that lumpy reclaim is not being applied.  This increases the chances of a
full direct reclaim via try_to_free_pages() is required.

This patch initialises the order field of the scan control as requested by
the caller.

[mel@csn.ul.ie: rewrote changelog]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: add comment why mark_page_accessed() would be better than pte_mkyoung() in follow...
KOSAKI Motohiro [Tue, 31 Mar 2009 22:19:37 +0000 (15:19 -0700)]
mm: add comment why mark_page_accessed() would be better than pte_mkyoung() in follow_page()

At first look, mark_page_accessed() in follow_page() seems a bit strange.
It seems pte_mkyoung() would be better consistent with other kernel code.

However, it is intentional. The commit log said:

    ------------------------------------------------
    commit 9e45f61d69be9024a2e6bef3831fb04d90fac7a8
    Author: akpm <akpm>
    Date:   Fri Aug 15 07:24:59 2003 +0000

    [PATCH] Use mark_page_accessed() in follow_page()

    Touching a page via follow_page() counts as a reference so we should be
    either setting the referenced bit in the pte or running mark_page_accessed().

    Altering the pte is tricky because we haven't implemented an atomic
    pte_mkyoung().  And mark_page_accessed() is better anyway because it has more
    aging state: it can move the page onto the active list.

    BKrev: 3f3c8acbplT8FbwBVGtth7QmnqWkIw
    ------------------------------------------------

The atomic issue is still true nowadays. adding comment help to understand
code intention and it would be better.

[akpm@linux-foundation.org: clarify text]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agovmscan: clip swap_cluster_max in shrink_all_memory()
Johannes Weiner [Tue, 31 Mar 2009 22:19:35 +0000 (15:19 -0700)]
vmscan: clip swap_cluster_max in shrink_all_memory()

shrink_inactive_list() scans in sc->swap_cluster_max chunks until it hits
the scan limit it was passed.

shrink_inactive_list()
{
do {
isolate_pages(swap_cluster_max)
shrink_page_list()
} while (nr_scanned < max_scan);
}

This assumes that swap_cluster_max is not bigger than the scan limit
because the latter is checked only after at least one iteration.

In shrink_all_memory() sc->swap_cluster_max is initialized to the overall
reclaim goal in the beginning but not decreased while reclaim is making
progress which leads to subsequent calls to shrink_inactive_list()
reclaiming way too much in the one iteration that is done unconditionally.

Set sc->swap_cluster_max always to the proper goal before doing
  shrink_all_zones()
    shrink_list()
      shrink_inactive_list().

While the current shrink_all_memory() happily reclaims more than actually
requested, this patch fixes it to never exceed the goal:

unpatched
   wanted=10000 reclaimed=13356
   wanted=10000 reclaimed=19711
   wanted=10000 reclaimed=10289
   wanted=10000 reclaimed=17306
   wanted=10000 reclaimed=10700
   wanted=10000 reclaimed=10004
   wanted=10000 reclaimed=13301
   wanted=10000 reclaimed=10976
   wanted=10000 reclaimed=10605
   wanted=10000 reclaimed=10088
   wanted=10000 reclaimed=15000

patched
   wanted=10000 reclaimed=10000
   wanted=10000 reclaimed=9599
   wanted=10000 reclaimed=8476
   wanted=10000 reclaimed=8326
   wanted=10000 reclaimed=10000
   wanted=10000 reclaimed=10000
   wanted=10000 reclaimed=9919
   wanted=10000 reclaimed=10000
   wanted=10000 reclaimed=10000
   wanted=10000 reclaimed=10000
   wanted=10000 reclaimed=10000
   wanted=10000 reclaimed=9624
   wanted=10000 reclaimed=10000
   wanted=10000 reclaimed=10000
   wanted=8500 reclaimed=8092
   wanted=316 reclaimed=316

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: MinChan Kim <minchan.kim@gmail.com>
Acked-by: Nigel Cunningham <ncunningham@crca.org.au>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: shrink_all_memory(): use sc.nr_reclaimed
MinChan Kim [Tue, 31 Mar 2009 22:19:34 +0000 (15:19 -0700)]
mm: shrink_all_memory(): use sc.nr_reclaimed

Commit a79311c14eae4bb946a97af25f3e1b17d625985d "vmscan: bail out of
direct reclaim after swap_cluster_max pages" moved the nr_reclaimed
counter into the scan control to accumulate the number of all reclaimed
pages in a reclaim invocation.

shrink_all_memory() can use the same mechanism. it increase code
consistency and redability.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: MinChan Kim <minchan.kim@gmail.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: don't call mark_page_accessed() in do_swap_page()
KOSAKI Motohiro [Tue, 31 Mar 2009 22:19:33 +0000 (15:19 -0700)]
mm: don't call mark_page_accessed() in do_swap_page()

commit bf3f3bc5e734706730c12a323f9b2068052aa1f0 (mm: don't
mark_page_accessed in fault path) only remove the mark_page_accessed() in
filemap_fault().

Therefore, swap-backed pages and file-backed pages have inconsistent
behavior.  mark_page_accessed() should be removed from do_swap_page().

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: introduce for_each_populated_zone() macro
KOSAKI Motohiro [Tue, 31 Mar 2009 22:19:31 +0000 (15:19 -0700)]
mm: introduce for_each_populated_zone() macro

Impact: cleanup

In almost cases, for_each_zone() is used with populated_zone().  It's
because almost function doesn't need memoryless node information.
Therefore, for_each_populated_zone() can help to make code simplify.

This patch has no functional change.

[akpm@linux-foundation.org: small cleanup]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agovmscan: rename sc.may_swap to may_unmap
Johannes Weiner [Tue, 31 Mar 2009 22:19:30 +0000 (15:19 -0700)]
vmscan: rename sc.may_swap to may_unmap

sc.may_swap does not only influence reclaiming of anon pages but pages
mapped into pagetables in general, which also includes mapped file pages.

In shrink_page_list():

if (!sc->may_swap && page_mapped(page))
goto keep_locked;

For anon pages, this makes sense as they are always mapped and reclaiming
them always requires swapping.

But mapped file pages are skipped here as well and it has nothing to do
with swapping.

The real effect of the knob is whether mapped pages are unmapped and
reclaimed or not.  Rename it to `may_unmap' to have its name match its
actual meaning more precisely.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: MinChan Kim <minchan.kim@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoget_mm_hiwater_xxx: trivial, s/define/inline/
Oleg Nesterov [Tue, 31 Mar 2009 22:19:29 +0000 (15:19 -0700)]
get_mm_hiwater_xxx: trivial, s/define/inline/

Andrew pointed out get_mm_hiwater_xxx() evaluate "mm" argument thrice/twice,
make them inline.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hugh@veritas.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agooom_kill: don't call for int_sqrt(0)
Cyrill Gorcunov [Tue, 31 Mar 2009 22:19:27 +0000 (15:19 -0700)]
oom_kill: don't call for int_sqrt(0)

There is no need to call for int_sqrt if argument is 0.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agovmap: remove needless lock and list in vmap
MinChan Kim [Tue, 31 Mar 2009 22:19:26 +0000 (15:19 -0700)]
vmap: remove needless lock and list in vmap

vmap's dirty_list is unused.  It's for optimizing flushing.  but Nick
didn't write the code yet.  so, we don't need it until time as it is
needed.

This patch removes vmap_block's dirty_list and codes related to it.

Signed-off-by: MinChan Kim <minchan.kim@gmail.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: mminit_validate_memmodel_limits(): remove redundant test
Cyrill Gorcunov [Tue, 31 Mar 2009 22:19:25 +0000 (15:19 -0700)]
mm: mminit_validate_memmodel_limits(): remove redundant test

In case if start_pfn overlap the upper bound no need to test end_pfn again
since we have it already trimmed.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: remove struct tty_operations::read_proc
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:25 +0000 (15:19 -0700)]
proc tty: remove struct tty_operations::read_proc

struct tty_operations::proc_fops took it's place and there is one less
create_proc_read_entry() user now!

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch xtensa iss console to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:24 +0000 (15:19 -0700)]
proc tty: switch xtensa iss console to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch ia64 simserial to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:23 +0000 (15:19 -0700)]
proc tty: switch ia64 simserial to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch amiserial to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:23 +0000 (15:19 -0700)]
proc tty: switch amiserial to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch ircomm to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:22 +0000 (15:19 -0700)]
proc tty: switch ircomm to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch usb-serial to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:21 +0000 (15:19 -0700)]
proc tty: switch usb-serial to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch serial_core to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:21 +0000 (15:19 -0700)]
proc tty: switch serial_core to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch sdio_uart to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:20 +0000 (15:19 -0700)]
proc tty: switch sdio_uart to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch synclinkmp to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:20 +0000 (15:19 -0700)]
proc tty: switch synclinkmp to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch synclink_gt to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:19 +0000 (15:19 -0700)]
proc tty: switch synclink_gt to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch synclink to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:18 +0000 (15:19 -0700)]
proc tty: switch synclink to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch stallion to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:18 +0000 (15:19 -0700)]
proc tty: switch stallion to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch synclink_cs to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:17 +0000 (15:19 -0700)]
proc tty: switch synclink_cs to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch istallion to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:16 +0000 (15:19 -0700)]
proc tty: switch istallion to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch ip2 to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:16 +0000 (15:19 -0700)]
proc tty: switch ip2 to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: switch cyclades to ->proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:15 +0000 (15:19 -0700)]
proc tty: switch cyclades to ->proc_fops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoproc tty: add struct tty_operations::proc_fops
Alexey Dobriyan [Tue, 31 Mar 2009 22:19:15 +0000 (15:19 -0700)]
proc tty: add struct tty_operations::proc_fops

Used for gradual switch of TTY drivers from using ->read_proc which helps
with gradual switch from ->read_proc for the whole tree.

As side effect, fix possible race condition when ->data initialized after
PDE is hooked into proc tree.

->proc_fops takes precedence over ->read_proc.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Tue, 31 Mar 2009 01:46:43 +0000 (18:46 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  wireless: remove duplicated .ndo_set_mac_address
  netfilter: xtables: fix IPv6 dependency in the cluster match
  tg3: Add GRO support.
  niu: Add GRO support.
  ucc_geth: Fix use-after-of_node_put() in ucc_geth_probe().
  gianfar: Fix use-after-of_node_put() in gfar_of_init().
  kernel: remove HIPQUAD()
  netpoll: store local and remote ip in net-endian
  netfilter: fix endian bug in conntrack printks
  dmascc: fix incomplete conversion to network_device_ops
  gso: Fix support for linear packets
  skbuff.h: fix missing kernel-doc
  ni5010: convert to net_device_ops

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Tue, 31 Mar 2009 01:46:12 +0000 (18:46 -0700)]
Merge git://git./linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Fix reset hangs on Niagara systems.
  cpumask: use mm_cpumask() wrapper: sparc
  cpumask: remove dangerous CPU_MASK_ALL_PTR, &CPU_MASK_ALL.: sparc
  cpumask: remove the now-obsoleted pcibus_to_cpumask(): sparc
  cpumask: remove cpu_coregroup_map: sparc
  cpumask: prepare for iterators to only go to nr_cpu_ids/nr_cpumask_bits.: sparc
  cpumask: prepare for iterators to only go to nr_cpu_ids/nr_cpumask_bits.: sparc64
  cpumask: Use accessors code.: sparc64
  cpumask: Use accessors code: sparc
  cpumask: arch_send_call_function_ipi_mask: sparc
  cpumask: Use smp_call_function_many(): sparc64

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask
Linus Torvalds [Tue, 31 Mar 2009 01:00:26 +0000 (18:00 -0700)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-cpumask

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask:
  oprofile: Thou shalt not call __exit functions from __init functions
  cpumask: remove the now-obsoleted pcibus_to_cpumask(): generic
  cpumask: remove cpumask_t from core
  cpumask: convert rcutorture.c
  cpumask: use new cpumask_ functions in core code.
  cpumask: remove references to struct irqaction's mask field.
  cpumask: use mm_cpumask() wrapper: kernel/fork.c
  cpumask: use set_cpu_active in init/main.c
  cpumask: remove node_to_first_cpu
  cpumask: fix seq_bitmap_*() functions.
  cpumask: remove dangerous CPU_MASK_ALL_PTR, &CPU_MASK_ALL

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest-and-virtio
Linus Torvalds [Tue, 31 Mar 2009 00:57:39 +0000 (17:57 -0700)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-lguest-and-virtio

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest-and-virtio:
  lguest: barrier me harder
  lguest: use bool instead of int
  lguest: use KVM hypercalls
  lguest: wire up pte_update/pte_update_defer
  lguest: fix spurious BUG_ON() on invalid guest stack.
  virtio: more neatening of virtio_ring macros.
  virtio: fix BAD_RING, START_US and END_USE macros

15 years agoMerge branch 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
Linus Torvalds [Tue, 31 Mar 2009 00:54:32 +0000 (17:54 -0700)]
Merge branch 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6

* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
  hwmon: (fschmd) Add support for the FSC Hades IC
  hwmon: (fschmd) Add support for the FSC Syleus IC
  i2c-i801: Instantiate FSC hardware montioring chips
  dmi: Let dmi_walk() users pass private data
  hwmon: Define a standard interface for chassis intrusion detection
  Move the pcf8591 driver to hwmon
  hwmon: (w83627ehf) Only expose in6 or temp3 on the W83667HG
  hwmon: (w83627ehf) Add support for W83667HG
  hwmon: (w83627ehf) Invert fan pin variables logic
  hwmon: (hdaps) Fix Thinkpad X41 axis inversion
  hwmon: (hdaps) Allow inversion of separate axis
  hwmon: (ds1621) Clean up documentation
  hwmon: (ds1621) Avoid unneeded register access
  hwmon: (ds1621) Clean up register access
  hwmon: (ds1621) Reorder code statements

15 years agoMerge branch 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 31 Mar 2009 00:17:35 +0000 (17:17 -0700)]
Merge branch 'locking-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (33 commits)
  lockdep: fix deadlock in lockdep_trace_alloc
  lockdep: annotate reclaim context (__GFP_NOFS), fix SLOB
  lockdep: annotate reclaim context (__GFP_NOFS), fix
  lockdep: build fix for !PROVE_LOCKING
  lockstat: warn about disabled lock debugging
  lockdep: use stringify.h
  lockdep: simplify check_prev_add_irq()
  lockdep: get_user_chars() redo
  lockdep: simplify get_user_chars()
  lockdep: add comments to mark_lock_irq()
  lockdep: remove macro usage from mark_held_locks()
  lockdep: fully reduce mark_lock_irq()
  lockdep: merge the !_READ mark_lock_irq() helpers
  lockdep: merge the _READ mark_lock_irq() helpers
  lockdep: simplify mark_lock_irq() helpers #3
  lockdep: further simplify mark_lock_irq() helpers
  lockdep: simplify the mark_lock_irq() helpers
  lockdep: split up mark_lock_irq()
  lockdep: generate usage strings
  lockdep: generate the state bit definitions
  ...

15 years agoMerge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan...
Linus Torvalds [Mon, 30 Mar 2009 23:06:04 +0000 (16:06 -0700)]
Merge branch 'proc-linus' of git://git./linux/kernel/git/adobriyan/proc

* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc:
  Revert "proc: revert /proc/uptime to ->read_proc hook"
  proc 2/2: remove struct proc_dir_entry::owner
  proc 1/2: do PDE usecounting even for ->read_proc, ->write_proc
  proc: fix sparse warnings in pagemap_read()
  proc: move fs/proc/inode-alloc.txt comment into a source file

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Mon, 30 Mar 2009 22:12:14 +0000 (15:12 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/rafael/suspend-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PCI PM: Make pci_prepare_to_sleep() disable wake-up if needed
  radeonfb: Use __pci_complete_power_transition()
  PCI PM: Introduce __pci_[start|complete]_power_transition() (rev. 2)
  PCI PM: Restore config spaces of all devices during early resume
  PCI PM: Make pci_set_power_state() handle devices with no PM support
  PCI PM: Put devices into low power states during late suspend (rev. 2)
  PCI PM: Move pci_restore_standard_config to pci-driver.c
  PCI PM: Use pci_set_power_state during early resume
  PCI PM: Consistently use variable name "error" for pm call return values
  kexec: Change kexec jump code ordering
  PM: Change hibernation code ordering
  PM: Change suspend code ordering
  PM: Rework handling of interrupts during suspend-resume
  PM: Introduce functions for suspending and resuming device interrupts

15 years agodma-debug: fix printk formats (i386)
Randy Dunlap [Mon, 30 Mar 2009 21:08:44 +0000 (14:08 -0700)]
dma-debug: fix printk formats (i386)

Fix printk format warnings in dma-debug:

  lib/dma-debug.c:645: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'
  lib/dma-debug.c:662: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'
  lib/dma-debug.c:676: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'
  lib/dma-debug.c:686: warning: format '%016llx' expects type 'long long unsigned int', but argument 6 has type 'dma_addr_t'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: xattr_create is unused with xattrs disabled
Jeff Mahoney [Mon, 30 Mar 2009 20:49:58 +0000 (16:49 -0400)]
reiserfs: xattr_create is unused with xattrs disabled

This patch ifdefs xattr_create when xattrs aren't enabled.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: fix build breakage
Alexander Beregalov [Mon, 30 Mar 2009 20:32:40 +0000 (00:32 +0400)]
reiserfs: fix build breakage

Fix this build error when REISERFS_FS_POSIX_ACL is not set:

  fs/reiserfs/inode.c: In function 'reiserfs_new_inode':
  fs/reiserfs/inode.c:1919: warning: passing argument 1 of 'reiserfs_inherit_default_acl' from incompatible pointer type
  fs/reiserfs/inode.c:1919: warning: passing argument 2 of 'reiserfs_inherit_default_acl' from incompatible pointer type
  fs/reiserfs/inode.c:1919: warning: passing argument 3 of 'reiserfs_inherit_default_acl' from incompatible pointer type
  fs/reiserfs/inode.c:1919: error: too many arguments to function 'reiserfs_inherit_default_acl'

due to a missing transaction-handle argument in the non-acl
compatibility function.

Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agolockdep: fix deadlock in lockdep_trace_alloc
Peter Zijlstra [Fri, 20 Mar 2009 10:13:20 +0000 (11:13 +0100)]
lockdep: fix deadlock in lockdep_trace_alloc

Heiko reported that we grab the graph lock with irqs enabled.

Fix this by providng the same wrapper as all other lockdep entry
functions have.

Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <npiggin@suse.de>
LKML-Reference: <1237544000.24626.52.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoRevert "proc: revert /proc/uptime to ->read_proc hook"
Alexey Dobriyan [Fri, 20 Feb 2009 14:07:22 +0000 (17:07 +0300)]
Revert "proc: revert /proc/uptime to ->read_proc hook"

This reverts commit 6c87df37dcb9c6c33923707fa5191e0a65874d60.

proc files implemented through seq_file do pread(2) now.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
15 years agoproc 2/2: remove struct proc_dir_entry::owner
Alexey Dobriyan [Wed, 25 Mar 2009 19:48:06 +0000 (22:48 +0300)]
proc 2/2: remove struct proc_dir_entry::owner

Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning enything, and release it later resulting
in module refcount underflow.

We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.

But this leaves ->owner as easy-manipulative field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.

->read_proc/->write_proc were just fixed to not require ->owner for
protection.

rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.

Removing ->owner will also make PDE smaller.

So, let's nuke it.

Kudos to Jeff Layton for reminding about this, let's say, oversight.

http://bugzilla.kernel.org/show_bug.cgi?id=12454

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
15 years agoproc 1/2: do PDE usecounting even for ->read_proc, ->write_proc
Alexey Dobriyan [Fri, 20 Feb 2009 14:04:33 +0000 (17:04 +0300)]
proc 1/2: do PDE usecounting even for ->read_proc, ->write_proc

struct proc_dir_entry::owner is going to be removed. Now it's only necessary
to protect PDEs which are using ->read_proc, ->write_proc hooks.

However, ->owner assignments are racy and make it very easy for someone to switch
->owner on live PDE (as some subsystems do) without fixing refcounts and so on.

http://bugzilla.kernel.org/show_bug.cgi?id=12454

So, ->owner is on death row.

Proxy file operations exist already (proc_file_operations), just bump usecount
when necessary.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
15 years agoproc: fix sparse warnings in pagemap_read()
Milind Arun Choudhary [Fri, 20 Feb 2009 13:56:45 +0000 (16:56 +0300)]
proc: fix sparse warnings in pagemap_read()

fs/proc/task_mmu.c:696:12: warning: cast removes address space of expression
fs/proc/task_mmu.c:696:9: warning: incorrect type in assignment (different address spaces)
fs/proc/task_mmu.c:696:9:    expected unsigned long long [noderef] [usertype] <asn:1>*out
fs/proc/task_mmu.c:696:9:    got unsigned long long [usertype] *<noident>
fs/proc/task_mmu.c:697:12: warning: cast removes address space of expression
fs/proc/task_mmu.c:697:9: warning: incorrect type in assignment (different address spaces)
fs/proc/task_mmu.c:697:9:    expected unsigned long long [noderef] [usertype] <asn:1>*end
fs/proc/task_mmu.c:697:9:    got unsigned long long [usertype] *<noident>
fs/proc/task_mmu.c:723:12: warning: cast removes address space of expression
fs/proc/task_mmu.c:723:26: error: subtraction of different types can't work (different address spaces)
fs/proc/task_mmu.c:725:24: error: subtraction of different types can't work (different address spaces)

Signed-off-by: Milind Arun Choudhary <milindchoudhary@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
15 years agoproc: move fs/proc/inode-alloc.txt comment into a source file
Randy Dunlap [Tue, 13 Jan 2009 10:53:48 +0000 (13:53 +0300)]
proc: move fs/proc/inode-alloc.txt comment into a source file

so that people will realize that it exists and can update it as needed.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
15 years agolockdep: annotate reclaim context (__GFP_NOFS), fix SLOB
Ingo Molnar [Sun, 15 Mar 2009 05:03:11 +0000 (06:03 +0100)]
lockdep: annotate reclaim context (__GFP_NOFS), fix SLOB

Impact: build fix

fix typo in mm/slob.c:

 mm/slob.c:469: error: â€˜flags’ undeclared (first use in this function)
 mm/slob.c:469: error: (Each undeclared identifier is reported only once
 mm/slob.c:469: error: for each function it appears in.)

Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090128135457.350751756@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied...
Linus Torvalds [Mon, 30 Mar 2009 20:54:50 +0000 (13:54 -0700)]
Merge branch 'drm-next' of git://git./linux/kernel/git/airlied/drm-2.6

* 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (53 commits)
  drm: detect hdmi monitor by hdmi identifier (v3)
  drm: drm_fops.c unlock missing on error path
  drm: reorder struct drm_ioctl_desc to save space on 64 bit builds
  radeon: add some new pci ids
  drm: read EDID extensions from monitor
  drm: Use a little stash on the stack to avoid kmalloc in most DRM ioctls.
  drm/radeon: add regs required for occlusion queries support
  drm/i915: check the return value from the copy from user
  drm/radeon: fix logic in r600_page_table_init() to match ati_gart
  drm/radeon: r600 ptes are 64-bit, cleanup cleanup function.
  drm/radeon: don't call irq changes on r600 suspend/resume
  drm/radeon: fix r600 writeback across suspend/resume
  drm/radeon: fix r600 writeback setup.
  drm: fix warnings about new mappings in info code.
  drm/radeon: NULL noise: drivers/gpu/drm/radeon/radeon_*.c
  drm/radeon: fix r600 pci mapping calls.
  drm/radeon: r6xx/r7xx: fix possible oops in r600_page_table_cleanup()
  radeon: call the correct idle function, logic got inverted.
  drm/radeon: RS600: fix interrupt handling
  drm/r600: fix rptr address along lines of previous fixes to radeon.
  ...

15 years agoMerge branch 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip...
Linus Torvalds [Mon, 30 Mar 2009 20:41:00 +0000 (13:41 -0700)]
Merge branch 'iommu-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
  dma-debug: make memory range checks more consistent
  dma-debug: warn of unmapping an invalid dma address
  dma-debug: fix dma_debug_add_bus() definition for !CONFIG_DMA_API_DEBUG
  dma-debug/x86: register pci bus for dma-debug leak detection
  dma-debug: add a check dma memory leaks
  dma-debug: add checks for kernel text and rodata
  dma-debug: print stacktrace of mapping path on unmap error
  dma-debug: Documentation update
  dma-debug: x86 architecture bindings
  dma-debug: add function to dump dma mappings
  dma-debug: add checks for sync_single_sg_*
  dma-debug: add checks for sync_single_range_*
  dma-debug: add checks for sync_single_*
  dma-debug: add checking for [alloc|free]_coherent
  dma-debug: add add checking for map/unmap_sg
  dma-debug: add checking for map/unmap_page/single
  dma-debug: add core checking functions
  dma-debug: add debugfs interface
  dma-debug: add kernel command line parameters
  dma-debug: add initialization code
  ...

Fix trivial conflicts due to whitespace changes in arch/x86/kernel/pci-nommu.c

15 years agoPCI PM: Make pci_prepare_to_sleep() disable wake-up if needed
Rafael J. Wysocki [Mon, 30 Mar 2009 19:46:27 +0000 (21:46 +0200)]
PCI PM: Make pci_prepare_to_sleep() disable wake-up if needed

If the device is not supposed to wake up the system, ie. when
device_may_wakeup(&dev->dev) returns 'false', pci_prepare_to_sleep()
should pass 'false' to pci_enable_wake() so that it calls the
platform to disable the wake-up capability of the device.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
15 years agoradeonfb: Use __pci_complete_power_transition()
Rafael J. Wysocki [Thu, 26 Mar 2009 21:52:08 +0000 (22:52 +0100)]
radeonfb: Use __pci_complete_power_transition()

Use __pci_complete_power_transition() to finalize the transition into
D2 after programming the PMCSR of the device directly.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
15 years agoPCI PM: Introduce __pci_[start|complete]_power_transition() (rev. 2)
Rafael J. Wysocki [Thu, 26 Mar 2009 21:51:40 +0000 (22:51 +0100)]
PCI PM: Introduce __pci_[start|complete]_power_transition() (rev. 2)

The radeonfb driver needs to program the device's PMCSR directly due
to some quirky hardware it has to handle (see
http://bugzilla.kernel.org/show_bug.cgi?id=12846 for details) and
after doing that it needs to call the platform (usually ACPI) to
finish the power transition of the device.  Currently it uses
pci_set_power_state() for this purpose, however making a specific
assumption about the internal behavior of this function, which has
changed recently so that this assumption is no longer satisfied.
For this reason, introduce __pci_complete_power_transition() that may
be called by the radeonfb driver to complete the power transition of
the device.  For symmetry, introduce __pci_start_power_transition().

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
15 years agoPCI PM: Restore config spaces of all devices during early resume
Rafael J. Wysocki [Mon, 16 Mar 2009 21:40:50 +0000 (22:40 +0100)]
PCI PM: Restore config spaces of all devices during early resume

At present the configuration spaces of PCI devices that have no
drivers or no PM support in the drivers (either legacy or through a
pm object) are not saved during suspend and, consequently, they are
not restored during resume.  This generally may lead to the state of
the system being slightly inconsistent after the resume, so it's
better to save and restore the configuration spaces of these devices
as well.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
15 years agoPCI PM: Make pci_set_power_state() handle devices with no PM support
Rafael J. Wysocki [Mon, 16 Mar 2009 21:40:36 +0000 (22:40 +0100)]
PCI PM: Make pci_set_power_state() handle devices with no PM support

There is a problem with PCI devices without any PM support (either
native or through the platform) that pci_set_power_state() always
returns error code for them, even if they are being put into D0.
However, such devices are always in D0, so pci_set_power_state()
should return success when attempting to put such a device into D0.
It also should update the current_state field for these devices as
appropriate.  This modification is necessary so that the standard
configuration registers of these devices are successfully restored by
pci_restore_standard_config() during the "early" phase of resume.

In addition, pci_set_power_state() should check the value of
current_state before calling the platform to change the power state
of the device to avoid doing that unnecessarily.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
15 years agoPCI PM: Put devices into low power states during late suspend (rev. 2)
Rafael J. Wysocki [Mon, 16 Mar 2009 21:40:26 +0000 (22:40 +0100)]
PCI PM: Put devices into low power states during late suspend (rev. 2)

Once we have allowed timer interrupts to be enabled during the late
phase of suspending devices, we are now able to use the generic
pci_set_power_state() to put PCI devices into low power states at
that time.  We can also use some related platform callbacks, like the
ones preparing devices for wake-up, during the late suspend.

Doing this will allow us to avoid the race condition where a device
using shared interrupts is put into a low power state with interrupts
enabled and then an interrupt (for another device) comes in and
confuses its driver.  At the same time, devices that don't support
the native PCI PM or that require some additional, platform-specific
operations to be carried out to put them into low power states will
be handled as appropriate.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
15 years agoPCI PM: Move pci_restore_standard_config to pci-driver.c
Rafael J. Wysocki [Mon, 16 Mar 2009 21:40:18 +0000 (22:40 +0100)]
PCI PM: Move pci_restore_standard_config to pci-driver.c

Move pci_restore_standard_config() from pci.c to pci-driver.c and
make it static.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
15 years agoPCI PM: Use pci_set_power_state during early resume
Rafael J. Wysocki [Mon, 16 Mar 2009 21:40:08 +0000 (22:40 +0100)]
PCI PM: Use pci_set_power_state during early resume

Once we have allowed timer interrupts to be enabled during the early
phase of resuming devices, we are now able to use the generic
pci_set_power_state() to put PCI devices into D0 at that time.  Then,
the platform-specific PM code will have a chance to handle devices
that don't implement the native PCI PM or that require some
additional, platform-specific operations to be carried out to power
them up.  Also, by doing this we can simplify the code quite a bit.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
15 years agoPCI PM: Consistently use variable name "error" for pm call return values
Frans Pop [Mon, 16 Mar 2009 21:39:56 +0000 (22:39 +0100)]
PCI PM: Consistently use variable name "error" for pm call return values

I noticed two functions use a variable "i" to store the return value of PM
function calls while the rest of the file uses "error". As "i" normally
indicates a counter of some sort it seems better to keep this consistent.

Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
15 years agokexec: Change kexec jump code ordering
Rafael J. Wysocki [Mon, 16 Mar 2009 21:34:35 +0000 (22:34 +0100)]
kexec: Change kexec jump code ordering

Change the ordering of the kexec jump code so that the nonboot CPUs
are disabled after calling device drivers' "late suspend" methods.

This change reflects the recent modifications of the power management
code that is also used by kexec jump.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
15 years agoPM: Change hibernation code ordering
Rafael J. Wysocki [Mon, 16 Mar 2009 21:34:26 +0000 (22:34 +0100)]
PM: Change hibernation code ordering

Change the ordering of the hibernation core code so that the platform
"prepare" callbacks are executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change (along with the previous analogous change of the suspend
core code) will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
15 years agoPM: Change suspend code ordering
Rafael J. Wysocki [Mon, 16 Mar 2009 21:34:15 +0000 (22:34 +0100)]
PM: Change suspend code ordering

Change the ordering of the suspend core code so that the platform
"prepare" callback is executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
15 years agoPM: Rework handling of interrupts during suspend-resume
Rafael J. Wysocki [Mon, 16 Mar 2009 21:34:06 +0000 (22:34 +0100)]
PM: Rework handling of interrupts during suspend-resume

Use the functions introduced in by the previous patch,
suspend_device_irqs(), resume_device_irqs() and check_wakeup_irqs(),
to rework the handling of interrupts during suspend (hibernation) and
resume.  Namely, interrupts will only be disabled on the CPU right
before suspending sysdevs, while device drivers will be prevented
from receiving interrupts, with the help of the new helper function,
before their "late" suspend callbacks run (and analogously during
resume).

In addition, since the device interrups are now disabled before the
CPU has turned all interrupts off and the CPU will ACK the interrupts
setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
any wake-up interrupts are pending and abort suspend if that's the
case.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
15 years agoPM: Introduce functions for suspending and resuming device interrupts
Rafael J. Wysocki [Mon, 16 Mar 2009 21:33:49 +0000 (22:33 +0100)]
PM: Introduce functions for suspending and resuming device interrupts

Introduce helper functions allowing us to prevent device drivers from
getting any interrupts (without disabling interrupts on the CPU)
during suspend (or hibernation) and to make them start to receive
interrupts again during the subsequent resume.  These functions make it
possible to keep timer interrupts enabled while the "late" suspend and
"early" resume callbacks provided by device drivers are being
executed.  In turn, this allows device drivers' "late" suspend and
"early" resume callbacks to sleep, execute ACPI callbacks etc.

The functions introduced here will be used to rework the handling of
interrupts during suspend (hibernation) and resume.  Namely,
interrupts will only be disabled on the CPU right before suspending
sysdevs, while device drivers will be prevented from receiving
interrupts, with the help of the new helper function, before their
"late" suspend callbacks run (and analogously during resume).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
15 years agohwmon: (fschmd) Add support for the FSC Hades IC
Hans de Goede [Mon, 30 Mar 2009 19:46:45 +0000 (21:46 +0200)]
hwmon: (fschmd) Add support for the FSC Hades IC

Add support for the Hades to the FSC hwmon driver.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
15 years agohwmon: (fschmd) Add support for the FSC Syleus IC
Hans de Goede [Mon, 30 Mar 2009 19:46:45 +0000 (21:46 +0200)]
hwmon: (fschmd) Add support for the FSC Syleus IC

Many thanks to Fujitsu Siemens Computers for providing docs and a
machine to test the driver on.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
15 years agoi2c-i801: Instantiate FSC hardware montioring chips
Hans de Goede [Mon, 30 Mar 2009 19:46:44 +0000 (21:46 +0200)]
i2c-i801: Instantiate FSC hardware montioring chips

Detect various FSC hwmon IC's based on DMI tables and then let
the i2c-i801 driver instantiate the i2c client devices. Note that
some of the info in the added table is indentical for all rows, still
this is kept in the table to keep the code general and thus (hopefully)
easily extensible in the future.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
15 years agodmi: Let dmi_walk() users pass private data
Jean Delvare [Mon, 30 Mar 2009 19:46:44 +0000 (21:46 +0200)]
dmi: Let dmi_walk() users pass private data

At the moment, dmi_walk() lacks flexibility, users can't pass data to
the callback function. Add a pointer for private data to make this
function more flexible.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Roland Dreier <rolandd@cisco.com>
15 years agohwmon: Define a standard interface for chassis intrusion detection
Jean Delvare [Mon, 30 Mar 2009 19:46:44 +0000 (21:46 +0200)]
hwmon: Define a standard interface for chassis intrusion detection

Define a standard interface for the chassis intrusion detection feature
some hardware monitoring chips have. Some drivers have custom sysfs
entries for it, but a standard interface would allow integration with
user-space (namely libsensors.)

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Hans de Goede <j.w.r.degoede@hhs.nl>
Acked-by: Matt Roberds <mattroberds@cox.net>
15 years agoMove the pcf8591 driver to hwmon
Jean Delvare [Mon, 30 Mar 2009 19:46:43 +0000 (21:46 +0200)]
Move the pcf8591 driver to hwmon

Directory drivers/i2c/chips is going away, so drivers there must find
new homes. For the pcf8591 driver, the best choice seems to be the
hwmon subsystem. While the Philips PCF8591 device isn't a typical
hardware monitoring chip, its DAC interface is compatible with the
hwmon one, so it fits somewhat.

If a better subsystem is ever created for ADC/DAC chips, the driver
could be moved there.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
15 years agohwmon: (w83627ehf) Only expose in6 or temp3 on the W83667HG
Gong Jun [Mon, 30 Mar 2009 19:46:43 +0000 (21:46 +0200)]
hwmon: (w83627ehf) Only expose in6 or temp3 on the W83667HG

The pin for in6 and temp3 is shared on the W83667HG, so only one of
these features can be supported on any given system. Let the driver
select which one depending on the temp3 disabled bit.

Signed-off-by: Gong Jun <JGong@nuvoton.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
15 years agohwmon: (w83627ehf) Add support for W83667HG
Gong Jun [Mon, 30 Mar 2009 19:46:42 +0000 (21:46 +0200)]
hwmon: (w83627ehf) Add support for W83667HG

Add initial support for the Nuvoton W83667HG chip to the w83627ehf
driver. It has been tested on ASUS P5QL PRO by Gong Jun.

At the moment there is still a usability issue which is that only in6
or temp3 can be present on the W83667HG, so the driver shouldn't
expose both. This will be addressed later.

Signed-off-by: Gong Jun <JGong@nuvoton.com>
Acked-by: David Hubbard <david.c.hubbard@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
15 years agohwmon: (w83627ehf) Invert fan pin variables logic
Jean Delvare [Mon, 30 Mar 2009 19:46:42 +0000 (21:46 +0200)]
hwmon: (w83627ehf) Invert fan pin variables logic

Use positive logic for fan pin variables (variable is set if pin is
used for fan), instead of negative logic which is error prone.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Gong Jun <JGong@nuvoton.com>
15 years agohwmon: (hdaps) Fix Thinkpad X41 axis inversion
Frank Seidel [Mon, 30 Mar 2009 19:46:42 +0000 (21:46 +0200)]
hwmon: (hdaps) Fix Thinkpad X41 axis inversion

Fix for kernel.org bug #7154: hdaps inversion of actual Thinkpad
X41's Y-axis.

Signed-off-by: Frank Seidel <frank@f-seidel.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
15 years agohwmon: (hdaps) Allow inversion of separate axis
Frank Seidel [Mon, 30 Mar 2009 19:46:41 +0000 (21:46 +0200)]
hwmon: (hdaps) Allow inversion of separate axis

Fix for kernel.org bug #7154: hdaps inversion of each axis. This
version is based on the work from Michael Ruoss <miruoss@student.ethz.ch>.

Signed-off-by: Frank Seidel <frank@f-seidel.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
15 years agohwmon: (ds1621) Clean up documentation
Jean Delvare [Mon, 30 Mar 2009 19:46:41 +0000 (21:46 +0200)]
hwmon: (ds1621) Clean up documentation

* The alarms sysfs file is deprecated, and individual alarm files are
  self-explanatory.
* The driver doesn't implement high-reslution temperature readings so
  don't document that.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
15 years agohwmon: (ds1621) Avoid unneeded register access
Jean Delvare [Mon, 30 Mar 2009 19:46:40 +0000 (21:46 +0200)]
hwmon: (ds1621) Avoid unneeded register access

Register access over SMBus isn't cheap, so avoid register access where
possible:
* Only write back the configuration register if it changed.
* Don't refresh the register cache when we don't have to.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
15 years agohwmon: (ds1621) Clean up register access
Jean Delvare [Mon, 30 Mar 2009 19:46:40 +0000 (21:46 +0200)]
hwmon: (ds1621) Clean up register access

Fix a few oddities in how the ds1621 driver accesses the registers:
* We don't need a wrapper to access the configuration register.
* Check for error before calling swab16. Error checking isn't
  complete yet, but that's a start.
* Device-specific read functions should never be called during
  detection, as by definition we don't know what device we are talking
  to at that point.
* Likewise, don't assume that register reads succeed during detection.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
15 years agohwmon: (ds1621) Reorder code statements
Jean Delvare [Mon, 30 Mar 2009 19:46:40 +0000 (21:46 +0200)]
hwmon: (ds1621) Reorder code statements

Reorder the ds1621 driver code so that we can get rid of forward
function declarations.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
15 years agoMerge branch 'reiserfs-updates' from Jeff Mahoney
Linus Torvalds [Mon, 30 Mar 2009 19:29:21 +0000 (12:29 -0700)]
Merge branch 'reiserfs-updates' from Jeff Mahoney

* reiserfs-updates: (35 commits)
  reiserfs: rename [cn]_* variables
  reiserfs: rename p_._ variables
  reiserfs: rename p_s_tb to tb
  reiserfs: rename p_s_inode to inode
  reiserfs: rename p_s_bh to bh
  reiserfs: rename p_s_sb to sb
  reiserfs: strip trailing whitespace
  reiserfs: cleanup path functions
  reiserfs: factor out buffer_info initialization
  reiserfs: add atomic addition of selinux attributes during inode creation
  reiserfs: use generic readdir for operations across all xattrs
  reiserfs: journaled xattrs
  reiserfs: use generic xattr handlers
  reiserfs: remove i_has_xattr_dir
  reiserfs: make per-inode xattr locking more fine grained
  reiserfs: eliminate per-super xattr lock
  reiserfs: simplify xattr internal file lookups/opens
  reiserfs: Clean up xattrs when REISERFS_FS_XATTR is unset
  reiserfs: remove IS_PRIVATE helpers
  reiserfs: remove link detection code
  ...

Fixed up conflicts manually due to:
 - quota name cleanups vs variable naming changes:
fs/reiserfs/inode.c
fs/reiserfs/namei.c
fs/reiserfs/stree.c
        fs/reiserfs/xattr.c
 - exported include header cleanups
include/linux/reiserfs_fs.h

15 years agoreiserfs: rename [cn]_* variables
Jeff Mahoney [Mon, 30 Mar 2009 18:02:50 +0000 (14:02 -0400)]
reiserfs: rename [cn]_* variables

This patch renames n_, c_, etc variables to something more sane.  This
is the sixth in a series of patches to rip out some of the awful
variable naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: rename p_._ variables
Jeff Mahoney [Mon, 30 Mar 2009 18:02:49 +0000 (14:02 -0400)]
reiserfs: rename p_._ variables

This patch is a simple s/p_._//g to the reiserfs code.  This is the
fifth in a series of patches to rip out some of the awful variable
naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: rename p_s_tb to tb
Jeff Mahoney [Mon, 30 Mar 2009 18:02:48 +0000 (14:02 -0400)]
reiserfs: rename p_s_tb to tb

This patch is a simple s/p_s_tb/tb/g to the reiserfs code.  This is the
fourth in a series of patches to rip out some of the awful variable
naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: rename p_s_inode to inode
Jeff Mahoney [Mon, 30 Mar 2009 18:02:47 +0000 (14:02 -0400)]
reiserfs: rename p_s_inode to inode

This patch is a simple s/p_s_inode/inode/g to the reiserfs code.  This
is the third in a series of patches to rip out some of the awful
variable naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: rename p_s_bh to bh
Jeff Mahoney [Mon, 30 Mar 2009 18:02:46 +0000 (14:02 -0400)]
reiserfs: rename p_s_bh to bh

This patch is a simple s/p_s_bh/bh/g to the reiserfs code.  This is the
second in a series of patches to rip out some of the awful variable
naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: rename p_s_sb to sb
Jeff Mahoney [Mon, 30 Mar 2009 18:02:45 +0000 (14:02 -0400)]
reiserfs: rename p_s_sb to sb

This patch is a simple s/p_s_sb/sb/g to the reiserfs code.  This is the
first in a series of patches to rip out some of the awful variable
naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: strip trailing whitespace
Jeff Mahoney [Mon, 30 Mar 2009 18:02:44 +0000 (14:02 -0400)]
reiserfs: strip trailing whitespace

This patch strips trailing whitespace from the reiserfs code.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: cleanup path functions
Jeff Mahoney [Mon, 30 Mar 2009 18:02:43 +0000 (14:02 -0400)]
reiserfs: cleanup path functions

This patch cleans up some redundancies in the reiserfs tree path code.

decrement_bcount() is essentially the same function as brelse(), so we use
that instead.

decrement_counters_in_path() is exactly the same function as pathrelse(), so
we kill that and use pathrelse() instead.

There's also a bit of cleanup that makes the code a bit more readable.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: factor out buffer_info initialization
Jeff Mahoney [Mon, 30 Mar 2009 18:02:42 +0000 (14:02 -0400)]
reiserfs: factor out buffer_info initialization

This is the first in a series of patches to make balance_leaf() not
quite so insane.

This patch factors out the open coded initializations of buffer_info
structures and defines a few initializers for the 4 cases they're used.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: add atomic addition of selinux attributes during inode creation
Jeff Mahoney [Mon, 30 Mar 2009 18:02:41 +0000 (14:02 -0400)]
reiserfs: add atomic addition of selinux attributes during inode creation

Some time ago, some changes were made to make security inode attributes
be atomically written during inode creation.  ReiserFS fell behind in
this area, but with the reworking of the xattr code, it's now fairly
easy to add.

The following patch adds the ability for security attributes to be added
automatically during inode creation.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: use generic readdir for operations across all xattrs
Jeff Mahoney [Mon, 30 Mar 2009 18:02:40 +0000 (14:02 -0400)]
reiserfs: use generic readdir for operations across all xattrs

The current reiserfs xattr implementation open codes reiserfs_readdir
and frees the path before calling the filldir function.  Typically, the
filldir function is something that modifies the file system, such as a
chown or an inode deletion that also require reading of an inode
associated with each direntry.  Since the file system is modified, the
path retained becomes invalid for the next run.  In addition, it runs
backwards in attempt to minimize activity.

This is clearly suboptimal from a code cleanliness perspective as well
as performance-wise.

This patch implements a generic reiserfs_for_each_xattr that uses the
generic readdir and a specific filldir routine that simply populates an
array of dentries and then performs a specific operation on them.  When
all files have been operated on, it then calls the operation on the
directory itself.

The result is a noticable code reduction and better performance.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: journaled xattrs
Jeff Mahoney [Mon, 30 Mar 2009 18:02:39 +0000 (14:02 -0400)]
reiserfs: journaled xattrs

Deadlocks are possible in the xattr code between the journal lock and the
xattr sems.

This patch implements journalling for xattr operations. The benefit is
twofold:
 * It gets rid of the deadlock possibility by always ensuring that xattr
   write operations are initiated inside a transaction.
 * It corrects the problem where xattr backing files aren't considered any
   differently than normal files, despite the fact they are metadata.

I discussed the added journal load with Chris Mason, and we decided that
since xattrs (versus other journal activity) is fairly rare, the introduction
of larger transactions to support journaled xattrs wouldn't be too big a deal.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: use generic xattr handlers
Jeff Mahoney [Mon, 30 Mar 2009 18:02:38 +0000 (14:02 -0400)]
reiserfs: use generic xattr handlers

Christoph Hellwig had asked me quite some time ago to port the reiserfs
xattrs to the generic xattr interface.

This patch replaces the reiserfs-specific xattr handling code with the
generic struct xattr_handler.

However, since reiserfs doesn't split the prefix and name when accessing
xattrs, it can't leverage generic_{set,get,list,remove}xattr without
needlessly reconstructing the name on the back end.

Update 7/26/07: Added missing dput() to deletion path.
Update 8/30/07: Added missing mark_inode_dirty when i_mode is used to
                represent an ACL and no previous ACL existed.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: remove i_has_xattr_dir
Jeff Mahoney [Mon, 30 Mar 2009 18:02:37 +0000 (14:02 -0400)]
reiserfs: remove i_has_xattr_dir

With the changes to xattr root locking, the i_has_xattr_dir flag
is no longer needed. This patch removes it.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: make per-inode xattr locking more fine grained
Jeff Mahoney [Mon, 30 Mar 2009 18:02:36 +0000 (14:02 -0400)]
reiserfs: make per-inode xattr locking more fine grained

The per-inode locking can be made more fine-grained to surround just the
interaction with the filesystem itself.  This really only applies to
protecting reads during a write, since concurrent writes are barred with
inode->i_mutex at the vfs level.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoreiserfs: eliminate per-super xattr lock
Jeff Mahoney [Mon, 30 Mar 2009 18:02:35 +0000 (14:02 -0400)]
reiserfs: eliminate per-super xattr lock

With the switch to using inode->i_mutex locking during lookups/creation
in the xattr root, the per-super xattr lock is no longer needed.

This patch removes it.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>