GitHub/exynos8895/android_kernel_samsung_universal8895.git
18 years ago[PATCH] More page migration: use migration entries for file pages
Christoph Lameter [Fri, 23 Jun 2006 09:03:38 +0000 (02:03 -0700)]
[PATCH] More page migration: use migration entries for file pages

This implements the use of migration entries to preserve ptes of file backed
pages during migration.  Processes can therefore be migrated back and forth
without losing their connection to pagecache pages.

Note that we implement the migration entries only for linear mappings.
Nonlinear mappings still require the unmapping of the ptes for migration.

And another writepage() ugliness shows up.  writepage() can drop the page
lock.  Therefore we have to remove migration ptes before calling writepage()
in order to avoid having migration entries point to unlocked pages.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] More page migration: do not inc/dec rss counters
Christoph Lameter [Fri, 23 Jun 2006 09:03:38 +0000 (02:03 -0700)]
[PATCH] More page migration: do not inc/dec rss counters

If we install a migration entry then the rss does not really decrease since the
page is just moved somewhere else.  We can save ourselves the work of
decrementing and later incrementing which will just eventually cause cacheline
bouncing.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Swapless page migration: modify core logic
Christoph Lameter [Fri, 23 Jun 2006 09:03:37 +0000 (02:03 -0700)]
[PATCH] Swapless page migration: modify core logic

Use the migration entries for page migration

This modifies the migration code to use the new migration entries.  It now
becomes possible to migrate anonymous pages without having to add a swap
entry.

We add a couple of new functions to replace migration entries with the proper
ptes.

We cannot take the tree_lock for migrating anonymous pages anymore.  However,
we know that we hold the only remaining reference to the page when the page
count reaches 1.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Swapless page migration: rip out swap based logic
Christoph Lameter [Fri, 23 Jun 2006 09:03:36 +0000 (02:03 -0700)]
[PATCH] Swapless page migration: rip out swap based logic

Rip the page migration logic out.

Remove all code that has to do with swapping during page migration.

This also guts the ability to migrate pages to swap.  No one used that, so
let's let it go for good.

Page migration should be a bit broken after this patch.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Swapless page migration: add R/W migration entries
Christoph Lameter [Fri, 23 Jun 2006 09:03:35 +0000 (02:03 -0700)]
[PATCH] Swapless page migration: add R/W migration entries

Implement read/write migration ptes

We take the upper two swapfiles for the two types of migration ptes and define
a series of macros in swapops.h.

The VM is modified to handle the migration entries.  Migration entries can
only be encountered when the page they are pointing to is locked.  This limits
the number of places one has to fix.  We also check in copy_pte_range() and in
mprotect_pte_range() for migration ptes.

We check for migration ptes in do_swap_page() and call a function that will
then wait on the page lock.  This allows us to effectively stop all accesses
to a page.

Migration entries are created by try_to_unmap() if called for migration and
removed by local functions in migrate.c.
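
As a rough illustration (not part of the patch), the encoding can be modelled
in userspace.  The 64-bit container, the 5-bit type field and the helper
bodies below are assumptions made for the sketch; only the helper names and
the use of the two highest swap types come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: 5 type bits at the top of a 64-bit value, offset below. */
#define MAX_SWAPFILES_SHIFT 5
#define SWP_TYPE_SHIFT      (64 - MAX_SWAPFILES_SHIFT)
#define SWP_OFFSET_MASK     ((UINT64_C(1) << SWP_TYPE_SHIFT) - 1)

/* The two highest swap types are taken for migration entries. */
#define SWP_MIGRATION_READ  30
#define SWP_MIGRATION_WRITE 31

typedef struct { uint64_t val; } swp_entry_t;

static swp_entry_t swp_entry(unsigned type, uint64_t offset)
{
    swp_entry_t e = { ((uint64_t)type << SWP_TYPE_SHIFT) |
                      (offset & SWP_OFFSET_MASK) };
    return e;
}

static unsigned swp_type(swp_entry_t e)   { return (unsigned)(e.val >> SWP_TYPE_SHIFT); }
static uint64_t swp_offset(swp_entry_t e) { return e.val & SWP_OFFSET_MASK; }

/* Assumption: the page frame number sits where a swap offset would go. */
static swp_entry_t make_migration_entry(uint64_t pfn, bool write)
{
    return swp_entry(write ? SWP_MIGRATION_WRITE : SWP_MIGRATION_READ, pfn);
}

static bool is_migration_entry(swp_entry_t e)
{
    return swp_type(e) == SWP_MIGRATION_READ ||
           swp_type(e) == SWP_MIGRATION_WRITE;
}

static bool is_write_migration_entry(swp_entry_t e)
{
    return swp_type(e) == SWP_MIGRATION_WRITE;
}
```

Because the entry looks like a swap entry with a reserved type, the existing
swap-pte paths only need a type check to recognise it.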

From: Hugh Dickins <hugh@veritas.com>

  Several times while testing swapless page migration (I've no NUMA, just
  hacking it up to migrate recklessly while running load), I've hit the
  BUG_ON(!PageLocked(p)) in migration_entry_to_page.

  This comes from an orphaned migration entry, unrelated to the current
  correctly locked migration, but hit by remove_anon_migration_ptes as it
  checks an address in each vma of the anon_vma list.

  Such an orphan may be left behind if an earlier migration raced with fork:
  copy_one_pte can duplicate a migration entry from parent to child, after
  remove_anon_migration_ptes has checked the child vma, but before it has
  removed it from the parent vma.  (If the process were later to fault on this
  orphaned entry, it would hit the same BUG from migration_entry_wait.)

  This could be fixed by locking anon_vma in copy_one_pte, but we'd rather
  not.  There's no such problem with file pages, because vma_prio_tree_add
  adds child vma after parent vma, and the page table locking at each end is
  enough to serialize.  Follow that example with anon_vma: add new vmas to the
  tail instead of the head.

  (There's no corresponding problem when inserting migration entries,
  because a missed pte will leave the page count and mapcount high, which is
  allowed for.  And there's no corresponding problem when migrating via swap,
  because a leftover swap entry will be correctly faulted.  But the swapless
  method has no refcounting of its entries.)

From: Ingo Molnar <mingo@elte.hu>

  pte_unmap_unlock() takes the pte pointer as an argument.

From: Hugh Dickins <hugh@veritas.com>

  Several times while testing swapless page migration, gcc has tried to exec
  a pointer instead of a string: smells like COW mappings are not being
  properly write-protected on fork.

  The protection in copy_one_pte looks very convincing, until at last you
  realize that the second arg to make_migration_entry is a boolean "write",
  and SWP_MIGRATION_READ is 30.

  Anyway, it's better done like in change_pte_range, using
  is_write_migration_entry and make_migration_entry_read.
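
A minimal sketch of the pitfall described above, under simplifying
assumptions: an entry is reduced to its type, and downgrade_buggy() and
downgrade_fixed() are hypothetical names, not functions from the patch.

```c
#include <stdbool.h>

#define SWP_MIGRATION_READ  30
#define SWP_MIGRATION_WRITE 31

/* Simplified: the second argument of the real helper, a boolean "write". */
static int make_migration_entry(bool write)
{
    return write ? SWP_MIGRATION_WRITE : SWP_MIGRATION_READ;
}

static bool is_write_migration_entry(int type)
{
    return type == SWP_MIGRATION_WRITE;
}

/* Buggy pattern: a type constant is passed where a boolean is expected.
 * 30 is non-zero, so it converts to true and a WRITE entry comes back. */
static int downgrade_buggy(int type)
{
    (void)type;
    return make_migration_entry(SWP_MIGRATION_READ);
}

/* Fixed pattern: test for a write entry, then build a read entry. */
static int downgrade_fixed(int type)
{
    return is_write_migration_entry(type) ? make_migration_entry(false) : type;
}
```

With the buggy pattern the child keeps a writable entry after fork, which is
consistent with the missing COW write protection seen in testing.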

From: Hugh Dickins <hugh@veritas.com>

  Remove unnecessary obfuscation from sys_swapon's range check on swap type,
  which blew up causing memory corruption once swapless migration made
  MAX_SWAPFILES no longer 2 ^ MAX_SWAPFILES_SHIFT.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
From: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] page migration cleanup: move fallback handling into special function
Christoph Lameter [Fri, 23 Jun 2006 09:03:33 +0000 (02:03 -0700)]
[PATCH] page migration cleanup: move fallback handling into special function

Move the fallback code into a new fallback function and make the function
behave like any other migration function.  This requires retaking the lock if
pageout() drops it.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] page migration cleanup: pass "mapping" to migration functions
Christoph Lameter [Fri, 23 Jun 2006 09:03:33 +0000 (02:03 -0700)]
[PATCH] page migration cleanup: pass "mapping" to migration functions

Change handling of address spaces.

Pass a pointer to the address space in which the page is migrated to all
migration functions.  This avoids repeatedly having to retrieve the address
space pointer from the page and checking it for validity.  The old page
mapping will change once migration has gone to a certain step, so it is less
confusing to have the pointer always available.

Move the setting of the mapping and index for the new page into
migrate_pages().

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] page migration cleanup: extract try_to_unmap from migration functions
Christoph Lameter [Fri, 23 Jun 2006 09:03:32 +0000 (02:03 -0700)]
[PATCH] page migration cleanup: extract try_to_unmap from migration functions

Extract try_to_unmap and rename remove_references -> move_mapping

try_to_unmap() may significantly change the page state by for example setting
the dirty bit.  It is therefore best to unmap in migrate_pages() before
calling any migration functions.

migrate_page_remove_references() will then only move the new page in place of
the old page in the mapping.  Rename the function to
migrate_page_move_mapping().

This allows us to get rid of the special unmapping for the fallback path.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] page migration cleanup: drop nr_refs in remove_references()
Christoph Lameter [Fri, 23 Jun 2006 09:03:29 +0000 (02:03 -0700)]
[PATCH] page migration cleanup: drop nr_refs in remove_references()

Drop nr_refs parameter from migrate_page_remove_references()

The nr_refs parameter is not really useful since the number of remaining
references is always

1 for anonymous pages without a mapping
2 for pages with a mapping
3 for pages with a mapping and PagePrivate set.

Remove the early check for the number of references since we are checking
page_mapcount() earlier.  Ultimately only the refcount matters after the
tree_lock has been obtained.
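
The three cases above can be written as one small function.  This is only a
sketch: expected_page_refs() is a hypothetical name, and the comments on who
holds each reference are assumptions, not statements from the patch.

```c
#include <stdbool.h>

/* References expected to remain on a page that is about to be migrated. */
static int expected_page_refs(bool has_mapping, bool page_private)
{
    int refs = 1;               /* assumed: held by the migration code */

    if (has_mapping) {
        refs += 1;              /* assumed: held by the mapping */
        if (page_private)
            refs += 1;          /* assumed: held via PagePrivate data */
    }
    return refs;
}
```

Since the count follows from the page state, a caller-supplied nr_refs
carries no extra information.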

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] page migration cleanup: remove useless definitions
Christoph Lameter [Fri, 23 Jun 2006 09:03:29 +0000 (02:03 -0700)]
[PATCH] page migration cleanup: remove useless definitions

Remove the export for migrate_page_remove_references() and migrate_page_copy()
that are unlikely to be used directly by filesystems implementing migration.
The export was useful when buffer_migrate_page() lived in fs/buffer.c but it
has now been moved to migrate.c in the migration reorg.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] page migration cleanup: group functions
Christoph Lameter [Fri, 23 Jun 2006 09:03:28 +0000 (02:03 -0700)]
[PATCH] page migration cleanup: group functions

Reorder functions in migrate.c.  Group all migration functions for struct
address_space_operations together.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] page migration cleanup: rename "ignrefs" to "migration"
Christoph Lameter [Fri, 23 Jun 2006 09:03:27 +0000 (02:03 -0700)]
[PATCH] page migration cleanup: rename "ignrefs" to "migration"

migrate is a better name since it is only used by page migration.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] writeback: fix range handling
OGAWA Hirofumi [Fri, 23 Jun 2006 09:03:26 +0000 (02:03 -0700)]
[PATCH] writeback: fix range handling

When a writeback_control's `start' and `end' fields are used to
indicate a one-byte-range starting at file offset zero, the required
values of .start=0,.end=0 mean that the ->writepages() implementation
has no way of telling that it is being asked to perform a range
request.  Because we're currently overloading (start == 0 && end == 0)
to mean "this is not a write-a-range request".

To make all this sane, the patch changes how writeback_control expresses a
range.

A caller that invokes ->writepages() to write pages now always sets the
range, either range_start/range_end or range_cyclic.

If range_cyclic is true, ->writepages() treats the range as cyclic;
otherwise it just uses range_start and range_end.

This patch does the following:

    - Add LLONG_MAX, LLONG_MIN, ULLONG_MAX to include/linux/kernel.h
      -1 is usually ok for range_end (type is long long). But, if someone did,

            range_end += val;               range_end is "val - 1"
            u64val = range_end >> bits;     u64val is "~(0ULL)"

      or something, they are wrong. So, this adds LLONG_MAX to avoid nasty
      things, and uses LLONG_MAX for range_end.

    - All callers of ->writepages() set range_start/end or range_cyclic.

    - Fix updates of ->writeback_index. It already seems a bit strange.
      If the scan starts at 0 and ends because of the nr_to_write check,
      this last index may reduce the chance of scanning the end of the
      file.  So, this updates ->writeback_index only if range_cyclic is
      true or the whole file is scanned.
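
A small userspace sketch of the new convention.  struct wbc_range and the
helper names are hypothetical; only the three field names and the use of
LLONG_MAX come from the description above.

```c
#include <limits.h>
#include <stdbool.h>

/* Only the fields discussed here; the real writeback_control has more. */
struct wbc_range {
    long long range_start;
    long long range_end;
    bool range_cyclic;
};

/* Whole file: an explicit range, no longer encoded as start == end == 0. */
static struct wbc_range whole_file(void)
{
    struct wbc_range r = { 0, LLONG_MAX, false };
    return r;
}

/* One byte at offset zero is now a distinguishable range request. */
static struct wbc_range first_byte(void)
{
    struct wbc_range r = { 0, 0, false };
    return r;
}

static bool covers_whole_file(const struct wbc_range *r)
{
    return !r->range_cyclic && r->range_start == 0 &&
           r->range_end == LLONG_MAX;
}

/* Why -1 is a poor end marker: arithmetic on it yields a plausible value. */
static long long end_plus(long long range_end, long long val)
{
    return range_end + val;
}
```

Here end_plus(-1, 10) yields 9, which looks like a valid offset; that is the
kind of surprise LLONG_MAX is meant to avoid.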

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Steven French <sfrench@us.ibm.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] buglet in radix_tree_tag_set
Peter Zijlstra [Fri, 23 Jun 2006 09:03:25 +0000 (02:03 -0700)]
[PATCH] buglet in radix_tree_tag_set

The comment states: 'Setting a tag on a not-present item is a BUG.' Hence
if 'index' is larger than the maxindex, the item _cannot_ be present, so it
should also be a BUG.

Also, this allows the following statement (assume a fresh tree):

  radix_tree_tag_set(root, 16, 1);

to fail silently, but when preceded by:

  radix_tree_insert(root, 32, item);

it would BUG, because the height has been extended by the insert.

In neither case was 16 present.
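
The inconsistency can be shown with a toy model.  Everything below is an
assumption for illustration: a single 64-slot node, a height of 0 for the
empty tree, and hypothetical function names.  It is not the radix-tree code.

```c
#include <stdbool.h>

enum tag_result { TAG_SET, TAG_IGNORED, TAG_BUG };

struct toy_tree {
    unsigned height;        /* 0 = empty tree, 1 = one node of 64 slots */
    bool present[64];
};

static unsigned long toy_maxindex(const struct toy_tree *t)
{
    return t->height ? 63 : 0;
}

/* Insert an item; index must be below 64 in this toy model. */
static void toy_insert(struct toy_tree *t, unsigned long index)
{
    t->height = 1;
    t->present[index] = true;
}

/* Old behaviour: an out-of-range index returned quietly. */
static enum tag_result tag_set_old(const struct toy_tree *t, unsigned long index)
{
    if (index > toy_maxindex(t))
        return TAG_IGNORED;
    return t->present[index] ? TAG_SET : TAG_BUG;
}

/* Patched behaviour: any absent item is a BUG, in range or not. */
static enum tag_result tag_set_new(const struct toy_tree *t, unsigned long index)
{
    if (index > toy_maxindex(t) || !t->present[index])
        return TAG_BUG;
    return TAG_SET;
}
```

In the model, tagging index 16 is ignored on a fresh tree but hits the BUG
case after index 32 has been inserted, although 16 is absent both times.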

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] slab: redzone double-free detection
Pekka Enberg [Fri, 23 Jun 2006 09:03:24 +0000 (02:03 -0700)]
[PATCH] slab: redzone double-free detection

At present our slab debugging tells us that it detected a double-free or
corruption - it does not distinguish between them.  Sometimes it's useful
to be able to differentiate between these two types of information.

Add double-free detection to redzone verification when freeing an object.
As explained by Manfred, when we are freeing an object, both redzones
should be RED_ACTIVE.  However, if both are RED_INACTIVE, we are trying to
free an object that was already free'd.
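
A sketch of the classification this describes.  The function and enum names
are hypothetical and the marker values are illustrative; only the rule (both
active, both inactive, anything else) comes from the text above.

```c
#define RED_INACTIVE 0x5A2CF071UL   /* illustrative marker: object is free */
#define RED_ACTIVE   0x170FC2A5UL   /* illustrative marker: object in use */

enum free_check { FREE_OK, FREE_DOUBLE_FREE, FREE_CORRUPTED };

/* Classify the two redzones of an object that is being freed. */
static enum free_check check_redzones_on_free(unsigned long rz1,
                                              unsigned long rz2)
{
    if (rz1 == RED_ACTIVE && rz2 == RED_ACTIVE)
        return FREE_OK;             /* a live object: normal free */
    if (rz1 == RED_INACTIVE && rz2 == RED_INACTIVE)
        return FREE_DOUBLE_FREE;    /* already freed once */
    return FREE_CORRUPTED;          /* a redzone was overwritten */
}
```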

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] likely cleanup: remove unlikely in sys_mprotect()
Hua Zhong [Fri, 23 Jun 2006 09:03:23 +0000 (02:03 -0700)]
[PATCH] likely cleanup: remove unlikely in sys_mprotect()

With likely/unlikely profiling on my not-so-busy, typical development system
there are 5k misses vs 2k hits.  So I guess we should remove the unlikely.

Signed-off-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] radix-tree: small
Nick Piggin [Fri, 23 Jun 2006 09:03:22 +0000 (02:03 -0700)]
[PATCH] radix-tree: small

Reduce radix tree node memory usage by about a factor of 4 for small files
(< 64K).  There are pointer traversal and memory usage costs for large
files with dense pagecache.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] radix-tree: direct data
Nick Piggin [Fri, 23 Jun 2006 09:03:22 +0000 (02:03 -0700)]
[PATCH] radix-tree: direct data

The ability to have height 0 radix trees (a direct pointer to the data item
rather than going through a full node->slot) quietly disappeared with
old-2.6-bkcvs commit ffee171812d51652f9ba284302d9e5c5cc14bdfd.  On 64-bit
machines this causes nearly 600 bytes to be used for every <= 4K file in
pagecache.

Re-introduce this feature, root tags stored in spare ->gfp_mask bits.

Simplify radix_tree_delete's complex tag clearing arrangement (which would
become even more complex) by just falling back to tag clearing functions
(the pagecache radix-tree never uses this path anyway, so the icache
savings will mean it's actually a speedup).

On my 4GB G5, this saves 8MB RAM per kernel source+object tree in
pagecache.

Pagecache lookup, insertion, and removal speed for small files will also be
improved.

This makes RCU radix tree harder, but it's worth it.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] change gen_pool allocator to not touch managed memory
Dean Nelson [Fri, 23 Jun 2006 09:03:21 +0000 (02:03 -0700)]
[PATCH] change gen_pool allocator to not touch managed memory

Modify the gen_pool allocator (lib/genalloc.c) to utilize a bitmap scheme
instead of the buddy scheme.  The purpose of this change is to eliminate
the touching of the actual memory being allocated.
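
To illustrate the idea only: a toy bitmap allocator that keeps all of its
bookkeeping outside the managed region.  The struct, the function names and
the 64-chunk limit are assumptions; this is not the lib/genalloc.c interface.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_CHUNKS 64      /* hypothetical: 64 chunks tracked in one word */

struct toy_pool {
    uintptr_t base;         /* start address of the managed region */
    size_t chunk_size;      /* allocation granularity in bytes */
    uint64_t used;          /* bit i set => chunk i is allocated */
};

static uint64_t chunk_mask(unsigned n)
{
    return n == 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* First-fit search for n free chunks; returns 0 on failure.
 * The region itself is never read or written, only the bitmap. */
static uintptr_t toy_pool_alloc(struct toy_pool *p, unsigned n)
{
    if (n == 0 || n > POOL_CHUNKS)
        return 0;
    for (unsigned i = 0; i + n <= POOL_CHUNKS; i++) {
        uint64_t mask = chunk_mask(n) << i;
        if ((p->used & mask) == 0) {
            p->used |= mask;
            return p->base + (uintptr_t)i * p->chunk_size;
        }
    }
    return 0;
}

static void toy_pool_free(struct toy_pool *p, uintptr_t addr, unsigned n)
{
    unsigned i = (unsigned)((addr - p->base) / p->chunk_size);
    p->used &= ~(chunk_mask(n) << i);
}
```

A buddy scheme typically threads its free lists through the free blocks
themselves, which is what a pool of uncached or device memory cannot allow.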

Since the change modifies the interface, a change to the uncached allocator
(arch/ia64/kernel/uncached.c) is also required.

Both Andrey Volkov and Jes Sorensen have expressed a desire that the
gen_pool allocator not write to the memory being managed. See the
following:

  http://marc.theaimsgroup.com/?l=linux-kernel&m=113518602713125&w=2
  http://marc.theaimsgroup.com/?l=linux-kernel&m=113533568827916&w=2

Signed-off-by: Dean Nelson <dcn@sgi.com>
Cc: Andrey Volkov <avolkov@varma-el.com>
Acked-by: Jes Sorensen <jes@trained-monkey.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mm: introduce remap_vmalloc_range()
Nick Piggin [Fri, 23 Jun 2006 09:03:20 +0000 (02:03 -0700)]
[PATCH] mm: introduce remap_vmalloc_range()

Add remap_vmalloc_range, vmalloc_user, and vmalloc_32_user so that drivers
can have a nice interface for remapping vmalloc memory.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Unify pxm_to_node() and node_to_pxm()
Yasunori Goto [Fri, 23 Jun 2006 09:03:19 +0000 (02:03 -0700)]
[PATCH] Unify pxm_to_node() and node_to_pxm()

Consolidate the various arch-specific implementations of pxm_to_node() and
node_to_pxm() into a single generic version.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] swsusp: rework memory shrinker
Rafael J. Wysocki [Fri, 23 Jun 2006 09:03:18 +0000 (02:03 -0700)]
[PATCH] swsusp: rework memory shrinker

Rework the swsusp's memory shrinker in the following way:

- Simplify balance_pgdat() by removing all of the swsusp-related code
  from it.

- Make shrink_all_memory() use shrink_slab() and a new function
  shrink_all_zones() which calls shrink_active_list() and
  shrink_inactive_list() directly for each zone in a way that's optimized
  for suspend.

In shrink_all_memory() we try to free exactly as many pages as the caller
asks for, preferably in one shot, starting from easier targets.  If slab
caches are huge, they are most likely to have enough pages to reclaim.
The inactive lists are next (the zones with more inactive pages go first)
etc.

Each time shrink_all_memory() attempts to shrink the active and inactive
lists for each zone in 5 passes.  In the first pass, only the inactive
lists are taken into consideration.  In the next two passes the active
lists are also shrunk, but mapped pages are not reclaimed.  In the last
two passes the active and inactive lists are shrunk and mapped pages are
reclaimed as well.  The aim of this is to alter the reclaim logic to choose
the best pages to keep on resume and improve the responsiveness of the
resumed system.
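
The five passes can be summarised as a small policy table.  This is a sketch
of the description above only; struct pass_policy and policy_for_pass() are
hypothetical names.

```c
#include <stdbool.h>

struct pass_policy {
    bool scan_active;     /* shrink the active list as well */
    bool reclaim_mapped;  /* allow mapped pages to be reclaimed */
};

/* pass is 0..4; the inactive list is shrunk in every pass. */
static struct pass_policy policy_for_pass(int pass)
{
    struct pass_policy p;

    p.scan_active = pass >= 1;      /* passes 1-4 */
    p.reclaim_mapped = pass >= 3;   /* passes 3-4 only */
    return p;
}
```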

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] slab: stop using list_for_each
Christoph Hellwig [Fri, 23 Jun 2006 09:03:17 +0000 (02:03 -0700)]
[PATCH] slab: stop using list_for_each

Use the _entry variant everywhere to clean the code up a tiny bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] slab: clean up kmem_getpages
Christoph Hellwig [Fri, 23 Jun 2006 09:03:17 +0000 (02:03 -0700)]
[PATCH] slab: clean up kmem_getpages

The last ifdef addition hit the ugliness threshold on this function, so:

 - rename the variable i to nr_pages so it's somewhat descriptive
 - remove the addr variable and do the page_address call at the very end
 - instead of ifdef'ing the whole alloc_pages_node call just make the
   __GFP_COMP addition to flags conditional
 - rewrite the __GFP_COMP comment to make sense

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] tightening hugetlb strict accounting
Chen, Kenneth W [Fri, 23 Jun 2006 09:03:15 +0000 (02:03 -0700)]
[PATCH] tightening hugetlb strict accounting

Current hugetlb strict accounting for shared mappings always assumes the
mapping starts at file offset zero and reserves pages between zero and the
size of the file.  This assumption often reserves (or locks down) a lot more
pages than necessary if the application maps at a non-zero file offset.
libhugetlbfs is one example that requires proper reservation for shared
mappings that start at a non-zero offset.

This patch extends the reservation and hugetlb strict accounting to support
any arbitrary pair of (offset, len), resulting in a much more robust and
accurate scheme.  More importantly, it won't lock down any hugetlb pages
outside the file mapping.
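
A worked comparison of the two accounting rules, assuming 2 MB huge pages
and huge-page-aligned offset and length.  Both function names are
hypothetical, and overlapping mappings are ignored in this sketch.

```c
#define HPAGE_SHIFT 21  /* assumed 2 MB huge pages */

/* Old rule: everything from file offset 0 to the end of the mapping. */
static unsigned long reserve_from_zero(unsigned long long offset,
                                       unsigned long long len)
{
    return (unsigned long)((offset + len) >> HPAGE_SHIFT);
}

/* New rule: only the huge pages inside [offset, offset + len). */
static unsigned long reserve_range(unsigned long long offset,
                                   unsigned long long len)
{
    unsigned long long first = offset >> HPAGE_SHIFT;
    unsigned long long last = (offset + len) >> HPAGE_SHIFT;

    return (unsigned long)(last - first);
}
```

For a two-huge-page mapping at an offset of eight huge pages, the old rule
reserves ten pages and the new rule reserves two.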

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] reserve space for swap label
Andreas Dilger [Fri, 23 Jun 2006 09:03:14 +0000 (02:03 -0700)]
[PATCH] reserve space for swap label

Reserve space in the swap disk header for a LABEL and UUID to be specified.
This has been possible with util-linux-2.12b (via e2fsprogs 1.36
libblkid), and is used by at least FC3 and later.  The kernel doesn't
really care about this, but the space shouldn't accidentally be used by
something else either.

Also make the on-disk structures be fixed-size types, instead of "int",
though I don't know of any architecture in use where an "int" isn't the
same size as a "__u32" (all current kernel arches have it as "unsigned
int").

Signed-off-by: Andreas Dilger <adilger@shaw.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mm: fix typos in comments in mm/oom_kill.c
Dave Peterson [Fri, 23 Jun 2006 09:03:13 +0000 (02:03 -0700)]
[PATCH] mm: fix typos in comments in mm/oom_kill.c

This fixes a few typos in the comments in mm/oom_kill.c.

Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] support for panic at OOM
KAMEZAWA Hiroyuki [Fri, 23 Jun 2006 09:03:13 +0000 (02:03 -0700)]
[PATCH] support for panic at OOM

This patch adds a panic_on_oom sysctl under sys.vm.

When sysctl vm.panic_on_oom = 1, the kernel panics instead of killing rogue
processes.  And if vm.panic_on_oom is 0 the kernel will do oom_kill() in
the same way as it does today.  Of course, the default value is 0 and only
root can modify it.

In general, the oom_killer works well and kills rogue processes, so the whole
system can survive.  But there are environments where a panic is preferable
to killing some processes.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] squash duplicate page_to_pfn and pfn_to_page
Andy Whitcroft [Fri, 23 Jun 2006 09:03:12 +0000 (02:03 -0700)]
[PATCH] squash duplicate page_to_pfn and pfn_to_page

We have architectures where the size of page_to_pfn and pfn_to_page is
significant enough to overall image size that they wish to push them out of
line.  However, in the process we have grown a second copy of the
implementation of each of these routines for each memory model.  Share the
implementation, exposing it either inline or out-of-line as required.
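
A userspace sketch of the sharing idea for a flat memory map.  The macro
names, the toy mem_map and the ARCH_PFN_OFFSET value are assumptions for
illustration, not the definitions added by this patch.

```c
#include <stddef.h>

struct page { unsigned long flags; };

#define ARCH_PFN_OFFSET 16UL        /* assumed first valid pfn */
static struct page mem_map[8];      /* toy flat memory map */

/* One shared definition of each conversion ... */
#define MODEL_PFN_TO_PAGE(pfn)  (mem_map + ((pfn) - ARCH_PFN_OFFSET))
#define MODEL_PAGE_TO_PFN(page) \
    ((unsigned long)((page) - mem_map) + ARCH_PFN_OFFSET)

/* ... exposed out of line here.  An inline build would instead define
 * pfn_to_page/page_to_pfn as aliases of the same two macros. */
struct page *pfn_to_page(unsigned long pfn)
{
    return MODEL_PFN_TO_PAGE(pfn);
}

unsigned long page_to_pfn(struct page *page)
{
    return MODEL_PAGE_TO_PFN(page);
}
```

Either way the arithmetic exists in one place, so the inline and out-of-line
variants cannot drift apart.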

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] wait_table and zonelist initializing for memory hotadd: update zonelists
Yasunori Goto [Fri, 23 Jun 2006 09:03:11 +0000 (02:03 -0700)]
[PATCH] wait_table and zonelist initializing for memory hotadd: update zonelists

In the current code, a zonelist is assumed to be built once and never
modified.  But memory hotplug can add a new zone/pgdat, so the zonelists
must be updated.

This patch modifies build_all_zonelists() so that it can reconfigure a
pgdat's zonelists.

To update them safely, this patch uses stop_machine_run(), which keeps other
cpus from touching the zonelists while they are being updated.

In the old version (V2 of node hotadd), the kernel updated them after zone
initialization.  But present_pages of the new zone is still 0 at that point,
because online_page() has not been called yet.  build_zonelists() checks
present_pages to find present zones, so that was too early.  So, I moved the
update to after online_pages().

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] wait_table and zonelist initializing for memory hotadd: wait_table initialization
Yasunori Goto [Fri, 23 Jun 2006 09:03:10 +0000 (02:03 -0700)]
[PATCH] wait_table and zonelist initializing for memory hotadd: wait_table initialization

The wait_table is initialized according to zone size at boot time.  But we
cannot know the maximum zone size when memory hotplug is enabled, since it
can change, and resizing the wait_table is hard.

So the kernel allocates and initializes the wait_table at its maximum size.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] wait_table and zonelist initializing for memory hotadd: add return code for...
Yasunori Goto [Fri, 23 Jun 2006 09:03:10 +0000 (02:03 -0700)]
[PATCH] wait_table and zonelist initializing for memory hotadd: add return code for init_current_empty_zone

When add_zone() is called against an empty (not populated) zone, we have to
initialize the zone, since it was not initialized at boot time.  But
init_currently_empty_zone() may fail while allocating the wait table.  So
this patch catches its error code.

Changes to wait_table are in the next patch.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] wait_table and zonelist initializing for memory hotadd: change to meminit...
Yasunori Goto [Fri, 23 Jun 2006 09:03:09 +0000 (02:03 -0700)]
[PATCH] wait_table and zonelist initializing for memory hotadd: change to meminit for build_zonelist

Change definitions of some functions and data from __init to __meminit.

With this patch, these functions and data can be used after bootup by the
hot-add code.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] wait_table and zonelist initializing for memory hotadd: change name of wait_t...
Yasunori Goto [Fri, 23 Jun 2006 09:03:08 +0000 (02:03 -0700)]
[PATCH] wait_table and zonelist initializing for memory hotadd: change name of wait_table_size()

This just renames wait_table_size() to wait_table_hash_nr_entries().

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] migration: remove unnecessary PageSwapCache checks
Christoph Lameter [Fri, 23 Jun 2006 09:03:08 +0000 (02:03 -0700)]
[PATCH] migration: remove unnecessary PageSwapCache checks

Remove two unnecessary PageSwapCache checks.  The page refcount is raised
and therefore page migration cannot occur in either function.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] slab: page mapping cleanup
Pekka Enberg [Fri, 23 Jun 2006 09:03:07 +0000 (02:03 -0700)]
[PATCH] slab: page mapping cleanup

Clean up slab allocator page mapping a bit.  The memory allocated for a
slab is physically contiguous, so it is okay to assume struct pages are too;
kill the long-standing comment.  Furthermore, rename set_slab_attr to
slab_map_pages and add a comment explaining why it's needed.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] PG_uncached is ia64 only
Andrew Morton [Fri, 23 Jun 2006 09:03:06 +0000 (02:03 -0700)]
[PATCH] PG_uncached is ia64 only

As Nick points out, only ia64 uses PG_uncached.  So we can push it up into the
higher bits of the lower half of page->flags and make room for another flag on
32-bit machines.

Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Jesse Barnes <jbarnes@sgi.com>
Cc: Jes Sorensen <jes@trained-monkey.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] slab: extract cache_free_alien from __cache_free
Pekka Enberg [Fri, 23 Jun 2006 09:03:05 +0000 (02:03 -0700)]
[PATCH] slab: extract cache_free_alien from __cache_free

Move alien object freeing to cache_free_alien() to reduce #ifdef clutter in
__cache_free().

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Page Migration: Make do_swap_page redo the fault
Christoph Lameter [Fri, 23 Jun 2006 09:03:04 +0000 (02:03 -0700)]
[PATCH] Page Migration: Make do_swap_page redo the fault

It is better to redo the complete fault if do_swap_page() finds that the
page is not in PageSwapCache() because the page migration code may have
replaced the swap pte already with a pte pointing to valid memory.

do_swap_page() may interpret an invalid swap entry without this patch
because we do not reload the pte if we are looping back.  The page
migration code may already have reused the swap entry referenced by our
local swp_entry.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] zone handle unaligned zone boundaries
Andy Whitcroft [Fri, 23 Jun 2006 09:03:01 +0000 (02:03 -0700)]
[PATCH] zone handle unaligned zone boundaries

The buddy allocator requires that boundaries between contiguous
zones be aligned with the MAX_ORDER ranges.  Where they are not, we
will incorrectly merge pages across zone boundaries.  This can lead to pages
from the wrong zone being handed out.

Originally the buddy allocator would check that buddies were in the same
zone by referencing the zone start and end page frame numbers.  This was
removed as it became very expensive and the buddy allocator already made
the assumption that zone boundaries were aligned.

It is clear that not all configurations and architectures are honouring
this alignment requirement.  Therefore it seems safest to reintroduce
support for non-aligned zone boundaries.  This patch introduces a new check:
when considering a page as a buddy, it compares the zone_table index for the
two pages and refuses to merge the pages where they do not match.  The
zone_table index is unique for each node/zone combination when
FLATMEM/DISCONTIGMEM is enabled and for each section/zone combination when
SPARSEMEM is enabled (a SPARSEMEM section is at least a MAX_ORDER size).
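
The check described above can be sketched in userspace C.  This is only an
illustration, not the kernel code: struct demo_page, its zone_id field (standing
in for the zone_table index) and the demo_* helpers are all hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified page descriptor; not the kernel's struct page. */
struct demo_page {
    unsigned int zone_id;  /* stands in for the zone_table index */
    unsigned int order;    /* order of the free block this page heads */
    bool is_free;          /* true when the page heads a free block */
};

/* Index of the buddy of the block starting at page_idx for the given order. */
static size_t demo_buddy_index(size_t page_idx, unsigned int order)
{
    return page_idx ^ ((size_t)1 << order);
}

/* A candidate is mergeable only if it is free, of the same order,
 * and carries the same zone id as the page being freed. */
static bool demo_page_is_buddy(const struct demo_page *page,
                               const struct demo_page *buddy,
                               unsigned int order)
{
    if (page->zone_id != buddy->zone_id)
        return false;  /* never merge across a zone boundary */
    return buddy->is_free && buddy->order == order;
}
```

Comparing a small per-page id keeps the test cheap compared with looking up the
zone start and end page frame numbers on every merge.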

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] for_each_possible_cpu: xfs
KAMEZAWA Hiroyuki [Fri, 23 Jun 2006 09:03:00 +0000 (02:03 -0700)]
[PATCH] for_each_possible_cpu: xfs

for_each_cpu() actually iterates across all possible CPUs.  We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs.  This is inefficient and
possibly buggy.

We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.

This patch replaces for_each_cpu with for_each_possible_cpu in xfs.
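
To illustrate why the name matters, here is a minimal userspace sketch of
iterating over "possible" versus "online" CPU sets.  The bitmask layout, the
demo_for_each_cpu_mask macro and the demo_* helpers are assumptions made for the
example; they are not the kernel's cpumask API.

```c
#include <stdint.h>

#define DEMO_NR_CPUS 8

/* Hypothetical CPU masks: bit n set means CPU n is in the set. */
struct demo_cpu_masks {
    uint32_t possible;  /* CPUs that could ever be present */
    uint32_t online;    /* CPUs currently running */
};

/* Iterate over every set bit of mask, lowest CPU number first. */
#define demo_for_each_cpu_mask(cpu, mask) \
    for ((cpu) = 0; (cpu) < DEMO_NR_CPUS; (cpu)++) \
        if ((mask) & (UINT32_C(1) << (cpu)))

/* Counts the CPUs visited when walking the "possible" set. */
static int demo_count_possible(const struct demo_cpu_masks *m)
{
    int cpu, n = 0;

    demo_for_each_cpu_mask(cpu, m->possible)
        n++;
    return n;
}

/* Counts the CPUs visited when walking only the "online" set. */
static int demo_count_online(const struct demo_cpu_masks *m)
{
    int cpu, n = 0;

    demo_for_each_cpu_mask(cpu, m->online)
        n++;
    return n;
}
```

A loop that only needs running CPUs does extra work, and may touch
uninitialised per-CPU state, if it walks the possible set by accident.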

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] XFS: Use the dentry passed to statfs() to limit the scope of the results
David Howells [Fri, 23 Jun 2006 09:02:59 +0000 (02:02 -0700)]
[PATCH] XFS: Use the dentry passed to statfs() to limit the scope of the results

Enable XFS to limit the statfs() results to the project quota covering the
dentry used as a base for the call.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] VFS: Permit filesystem to perform statfs with a known root dentry
David Howells [Fri, 23 Jun 2006 09:02:58 +0000 (02:02 -0700)]
[PATCH] VFS: Permit filesystem to perform statfs with a known root dentry

Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.

This complements the get_sb() patch.  That reduced the significance of
sb->s_root, allowing NFS to place a fake root there.  However, NFS does
require a dentry to use as a target for the statfs operation.  This permits
the root in the vfsmount to be used instead.

linux/mount.h has been added where necessary to make allyesconfig build
successfully.

Interest has also been expressed for use with the FUSE and XFS filesystems.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] VFS: Permit filesystem to override root dentry on mount
David Howells [Fri, 23 Jun 2006 09:02:57 +0000 (02:02 -0700)]
[PATCH] VFS: Permit filesystem to override root dentry on mount

Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.

The filesystem is then required to manually set the superblock and root dentry
pointers.  For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).

The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.

This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing.  In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.

The patch also makes the following changes:

 (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
     pointer argument and return an integer, so most filesystems have to change
     very little.

 (*) If one of the convenience functions is not used, then get_sb() should
     normally call simple_set_mnt() to instantiate the vfsmount. This will
     always return 0, and so can be tail-called from get_sb().

 (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
     dcache upon superblock destruction rather than shrink_dcache_anon().

     This is required because the superblock may now have multiple trees that
     aren't actually bound to s_root, but that still need to be cleaned up. The
     currently called functions assume that the whole tree is rooted at s_root,
     and that anonymous dentries are not the roots of trees, which results in
     dentries being left unculled.

     However, with the way NFS superblock sharing is currently set to be
     implemented, these assumptions are violated: the root of the filesystem is
     simply a dummy dentry and inode (the real inode for '/' may well be
     inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
     with child trees.

     [*] Anonymous until discovered from another tree.

 (*) The documentation has been adjusted, including the additional bit of
     changing ext2_* into foo_* in the documentation.
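
The two ways of instantiating a vfsmount described above can be sketched as
follows.  This is a userspace approximation under stated assumptions: the
demo_* structs are stripped-down stand-ins for the kernel structures, and
demo_set_mnt_at() is a hypothetical helper showing the "set mnt_root and mnt_sb
directly" case, not a function from the patch.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct demo_dentry { int refcount; };
struct demo_super_block { struct demo_dentry *s_root; };
struct demo_vfsmount {
    struct demo_super_block *mnt_sb;
    struct demo_dentry *mnt_root;
};

/* Take a reference on a dentry (the kernel uses dget() for this). */
static struct demo_dentry *demo_dget(struct demo_dentry *d)
{
    if (d)
        d->refcount++;
    return d;
}

/* Default behaviour: root the mount at the superblock's own s_root. */
static int demo_simple_set_mnt(struct demo_vfsmount *mnt,
                               struct demo_super_block *sb)
{
    mnt->mnt_sb = sb;
    mnt->mnt_root = demo_dget(sb->s_root);
    return 0;  /* always succeeds, so get_sb() can tail-call it */
}

/* Shared-superblock case: root the mount at some other dentry. */
static int demo_set_mnt_at(struct demo_vfsmount *mnt,
                           struct demo_super_block *sb,
                           struct demo_dentry *root)
{
    mnt->mnt_sb = sb;
    mnt->mnt_root = demo_dget(root);
    return 0;
}
```

In the second form several mounts can share one superblock while each is rooted
at a different dentry, which is the NFS case the message refers to.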

[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Fix cdrom being confused on using kdump
Rachita Kothiyal [Fri, 23 Jun 2006 09:02:56 +0000 (02:02 -0700)]
[PATCH] Fix cdrom being confused on using kdump

I have seen the cdrom drive appearing confused on using kdump on certain
x86_64 systems.  During the booting up of the second kernel, the following
message would keep flooding the console, and the booting would not proceed
any further.

hda: cdrom_pc_intr: The drive appears confused (ireason = 0x01)

In this patch, whenever we hit a confused state in the interrupt
handler with DRQ set, we end the request and return ide_stopped.  With
this change I no longer see the status error.

Signed-off-by: Rachita Kothiyal <rachita@in.ibm.com>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6
Linus Torvalds [Fri, 23 Jun 2006 06:09:42 +0000 (23:09 -0700)]
Merge /pub/scm/linux/kernel/git/gregkh/usb-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  [PATCH] Driver core: fix locking issues with the devices that are attached to classes
  [PATCH] USB: get USB suspend to work again

18 years ago[PATCH] Driver core: fix locking issues with the devices that are attached to classes
Greg Kroah-Hartman [Fri, 23 Jun 2006 00:17:32 +0000 (17:17 -0700)]
[PATCH] Driver core: fix locking issues with the devices that are attached to classes

Doh, that was foolish...

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
18 years ago[PATCH] USB: get USB suspend to work again
Greg Kroah-Hartman [Thu, 22 Jun 2006 20:29:52 +0000 (13:29 -0700)]
[PATCH] USB: get USB suspend to work again

Yeah, it's a hack, but it is only temporary until Alan's patches
reworking this area make it in.  We really should not care what devices
below us are doing, especially when we do not really know what type of
devices they are.  This patch relies on the fact that the endpoint
devices do not have a driver assigned to them.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
18 years agoMerge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-mmc
Linus Torvalds [Fri, 23 Jun 2006 05:47:06 +0000 (22:47 -0700)]
Merge branch 'devel' of /home/rmk/linux-2.6-mmc

* 'devel' of master.kernel.org:/home/rmk/linux-2.6-mmc:
  [ARM] 3565/1: AT91RM9200 MMC update
  [MMC] Convert all hosts except mmci to use data->blksz

18 years agoMerge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Fri, 23 Jun 2006 05:46:28 +0000 (22:46 -0700)]
Merge branch 'devel' of /home/rmk/linux-2.6-arm

* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (21 commits)
  [ARM] 3629/1: S3C24XX: fix missing bracket in regs-dsc.h
  [ARM] 3537/1: Rework DMA-bounce locking for finer granularity
  [ARM] 3601/1: i.MX/MX1 DMA error handling for signaled channels only
  [ARM] 3597/1: ixp4xx/nslu2: Board support for new LED subsystem
  [ARM] 3595/1: ixp4xx/nas100d: Board support for new LED subsystem
  [ARM] 3626/1: ARM EABI: fix syscall restarting
  [ARM] 3628/1: S3C24XX: add get_rate call to struct clk
  [ARM] 3627/1: S3C24XX: split s3c2410 clocks from core clocks
  [ARM] 3613/1: S3C2410: Add sysdev and sysclass
  [ARM] 3624/1: Report true modem control line states
  [ARM] 3620/2: ixp23xx: add uengine loader support
  [ARM] 3618/1: add defconfig for logicpd pxa270 card engine
  [ARM] 3617/1: ep93xx: fix slightly incorrect timer tick rate
  [ARM] 3616/1: fix timer handler wrap logic for a number of platforms
  [ARM] 3615/1: ixp23xx: use platform devices for physmap flash
  [ARM] 3614/1: ep93xx: use platform devices for physmap flash
  [ARM] 3621/1: fix compilation breakage for pnx4008
  [ARM] 3623/1: pnx4008: move GPIO-related defines to gpio.h
  [ARM] 3622/1: pnx4008: remove clk_use/clk_unuse
  [ARM] Enable VFP to be built when non-VFP capable CPUs are selected
  ...

18 years agoMerge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-serial
Linus Torvalds [Fri, 23 Jun 2006 05:45:53 +0000 (22:45 -0700)]
Merge branch 'devel' of /home/rmk/linux-2.6-serial

* 'devel' of master.kernel.org:/home/rmk/linux-2.6-serial:
  [ARM] 3600/1: increase amba-pl010 UART_NR to 8
  [ARM] 3571/1: netX: serial driver for Hilscher netX

18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
Linus Torvalds [Fri, 23 Jun 2006 05:40:00 +0000 (22:40 -0700)]
Merge /pub/scm/linux/kernel/git/davej/cpufreq

* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Fix ondemand vs suspend deadlock
  [CPUFREQ] Fix powernow-k8 SMP kernel on UP hardware bug.
  [PATCH] redirect speedstep-centrino maintainer mail to cpufreq list
  [CPUFREQ] correct powernow-k8 fid/vid masks for extended parts
  [CPUFREQ] Clarify powernow-k8 cpu_family statements

18 years agoMerge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
Linus Torvalds [Fri, 23 Jun 2006 05:15:09 +0000 (22:15 -0700)]
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (33 commits)
  [PATCH] myri10ge - drop workaround pci_save_state() disabling MSI
  [PATCH] myri10ge - drop workaround for the missing AER ext cap on nVidia CK804
  via-velocity: the link is not correctly detected when the device starts
  [PATCH] add b44 to maintainers
  [PATCH] WAN: ioremap() failure checks in drivers
  [PATCH] WAN: register_hdlc_device() doesn't need dev_alloc_name()
  [PATCH] skb_padto()-area fixes in 8390, wavelan
  [PATCH] make drivers/net/forcedeth.c:nv_update_pause() static
  [PATCH] network driver for Hilscher netx
  [PATCH] Dereference in tokenring/olympic.c
  [PATCH] Array overrun in drivers/net/wireless/wavelan.c
  [PATCH] Remove useless check in drivers/net/pcmcia/xirc2ps_cs.c
  [PATCH] 8139cp: add ethtool eeprom support
  [PATCH] 8139cp: fix eeprom read command length
  [PATCH] b44: update b44 Kconfig entry
  [PATCH] b44: update version to 1.01
  [PATCH] b44: add wol for old nic
  [PATCH] b44: add parameter
  [PATCH] b44: add wol
  [PATCH] b44: fix manual speed/duplex/autoneg settings
  ...

18 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
Linus Torvalds [Fri, 23 Jun 2006 05:11:30 +0000 (22:11 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (139 commits)
  [POWERPC] re-enable OProfile for iSeries, using timer interrupt
  [POWERPC] support ibm,extended-*-frequency properties
  [POWERPC] Extra sanity check in EEH code
  [POWERPC] Dont look for class-code in pci children
  [POWERPC] Fix mdelay badness on shared processor partitions
  [POWERPC] disable floating point exceptions for init
  [POWERPC] Unify ppc syscall tables
  [POWERPC] mpic: add support for serial mode interrupts
  [POWERPC] pseries: Print PCI slot location code on failure
  [POWERPC] spufs: one more fix for 64k pages
  [POWERPC] spufs: fail spu_create with invalid flags
  [POWERPC] spufs: clear class2 interrupt status before wakeup
  [POWERPC] spufs: fix Makefile for "make clean"
  [POWERPC] spufs: remove stop_code from struct spu
  [POWERPC] spufs: fix spu irq affinity setting
  [POWERPC] spufs: further abstract priv1 register access
  [POWERPC] spufs: split the Cell BE support into generic and platform dependant parts
  [POWERPC] spufs: dont try to access SPE channel 1 count
  [POWERPC] spufs: use kzalloc in create_spu
  [POWERPC] spufs: fix initial state of wbox file
  ...

Manually resolved conflicts in:
drivers/net/phy/Makefile
include/asm-powerpc/spu.h

18 years ago[PATCH] myri10ge - drop workaround pci_save_state() disabling MSI
Brice Goglin [Fri, 23 Jun 2006 01:12:36 +0000 (21:12 -0400)]
[PATCH] myri10ge - drop workaround pci_save_state() disabling MSI

We don't need to restore the state right after saving it for later recovery
since commit 99dc804d9bcc2c53f4c20c291bf4e185312a1a0c (PCI: disable msi mode
in pci_disable_device) now prevents pci_save_state() from disabling MSI.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] myri10ge - drop workaround for the missing AER ext cap on nVidia CK804
Brice Goglin [Fri, 23 Jun 2006 01:11:59 +0000 (21:11 -0400)]
[PATCH] myri10ge - drop workaround for the missing AER ext cap on nVidia CK804

We don't need to hardcode the AER capability of the nVidia CK804 chipset
anymore since commit cf34a8e07f02c76f3f1232eecb681301a3d7b10b (PCI: nVidia
quirk to make AER PCI-E extended capability visible) now makes sure that
this cap will be available to pci_find_ext_capability().

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years agoMerge branch 'upstream' of git://electric-eye.fr.zoreil.com/home/romieu/linux-2.6...
Jeff Garzik [Fri, 23 Jun 2006 03:33:23 +0000 (23:33 -0400)]
Merge branch 'upstream' of git://electric-eye.fr.zoreil.com/home/romieu/linux-2.6 into upstream

18 years ago[PATCH] add b44 to maintainers
Gary Zambrano [Fri, 23 Jun 2006 00:26:20 +0000 (17:26 -0700)]
[PATCH] add b44 to maintainers

Add b44 to the MAINTAINERS file.

Signed-off-by: Gary Zambrano <zambrano@broadcom.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] WAN: ioremap() failure checks in drivers
Krzysztof Halasa [Thu, 22 Jun 2006 20:29:28 +0000 (22:29 +0200)]
[PATCH] WAN: ioremap() failure checks in drivers

Eric Sesterhenn found that pci200syn initialization lacks a return
statement in the ioremap() error path (Coverity bug id #195). It looks
like more WAN drivers have problems with ioremap().
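
The class of bug can be shown with a small userspace sketch.  Everything here is
hypothetical: demo_map_fn stands in for ioremap(), and struct demo_card and
demo_card_init() are illustrative names, not code from any of the WAN drivers.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical device state; the mapper stands in for ioremap(). */
struct demo_card {
    void *regs;
    int initialized;
};

typedef void *(*demo_map_fn)(unsigned long phys, size_t len);

static char demo_window[16];

/* Mapper that succeeds, returning a fixed buffer. */
static void *demo_map_ok(unsigned long phys, size_t len)
{
    (void)phys;
    (void)len;
    return demo_window;
}

/* Mapper that fails, as ioremap() does when it returns NULL. */
static void *demo_map_fail(unsigned long phys, size_t len)
{
    (void)phys;
    (void)len;
    return NULL;
}

/* Returns 0 on success, -ENOMEM if the mapping fails. */
static int demo_card_init(struct demo_card *card, demo_map_fn map_fn,
                          unsigned long phys, size_t len)
{
    card->initialized = 0;
    card->regs = map_fn(phys, len);
    if (card->regs == NULL)
        return -ENOMEM;  /* the early return that the buggy path lacked */

    card->initialized = 1;  /* only reached with a valid mapping */
    return 0;
}
```

Without the early return, initialization would carry on and later dereference
the NULL mapping.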

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] WAN: register_hdlc_device() doesn't need dev_alloc_name()
Krzysztof Halasa [Thu, 22 Jun 2006 20:20:19 +0000 (22:20 +0200)]
[PATCH] WAN: register_hdlc_device() doesn't need dev_alloc_name()

David Boggs noticed that register_hdlc_device() no longer needs
to call dev_alloc_name() as it's called by register_netdev().
register_hdlc_device() is currently equivalent to register_netdev().

hdlc_setup() is now EXPORTed as per David's request.

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] skb_padto()-area fixes in 8390, wavelan
Alan Cox [Thu, 22 Jun 2006 13:25:34 +0000 (14:25 +0100)]
[PATCH] skb_padto()-area fixes in 8390, wavelan

On Thu, 2006-06-22 at 21:29 +1000, Herbert Xu wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> >
> > The 8390 change (corrected version) also makes 8390.c faster so should
> > be applied anyway, and the orinoco one fixes some code that isn't even
> > needed and someone forgot to remove long ago. Otherwise the skb_padto
>
> Yeah I agree totally.  However, I haven't actually seen the fixed 8390
> version being posted yet or at least not to netdev :)

Ah the resounding clang of a subtle hint ;)

Signed-off-by: Alan Cox <alan@redhat.com>
- Return 8390.c to the old way of handling short packets (which is also
faster)

- Remove the skb_padto from orinoco. This got left in when the padding bad
write patch was added and is actually not needed. This is fixing a merge
error way back when.

- Wavelan can also use the stack based buffer trick if you want
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] make drivers/net/forcedeth.c:nv_update_pause() static
Adrian Bunk [Thu, 22 Jun 2006 10:03:29 +0000 (12:03 +0200)]
[PATCH] make drivers/net/forcedeth.c:nv_update_pause() static

This patch makes the needlessly global nv_update_pause() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] network driver for Hilscher netx
Sascha Hauer [Thu, 22 Jun 2006 05:11:13 +0000 (07:11 +0200)]
[PATCH] network driver for Hilscher netx

This is a patch for the Hilscher netx built-in Ethernet ports. The
netx board support was merged into 2.6.17-git2.
The netx is an ARM926-based SoC.

Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
--
 drivers/net/Kconfig             |   11
 drivers/net/Makefile            |    1
 drivers/net/netx-eth.c          |  516 ++++++++++++++++++++++++++++++++++++++++
 include/asm-arm/arch-netx/eth.h |   27 ++
 4 files changed, 555 insertions(+)
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] Dereference in tokenring/olympic.c
Eric Sesterhenn [Wed, 21 Jun 2006 14:17:17 +0000 (16:17 +0200)]
[PATCH] Dereference in tokenring/olympic.c

hi,

Coverity found (bug id #225) that we might call free_netdev()
with a NULL argument when alloc_trdev() fails. This patch
changes the goto so we don't call free_netdev() for
dev == NULL.
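
The goto-based unwinding involved can be sketched like this.  It is a userspace
illustration only: struct demo_netdev, demo_free_netdev() and demo_probe() are
hypothetical, and the fail_alloc/fail_setup flags merely simulate failures at
each step.

```c
#include <errno.h>
#include <stdlib.h>

struct demo_netdev { void *priv; };

static int demo_free_calls;  /* counts calls, for illustration only */

static void demo_free_netdev(struct demo_netdev *dev)
{
    demo_free_calls++;
    free(dev->priv);   /* dereferences dev, so dev must not be NULL */
    free(dev);
}

/* fail_alloc / fail_setup simulate a failure at each step. */
static int demo_probe(int fail_alloc, int fail_setup)
{
    struct demo_netdev *dev;
    int err;

    dev = fail_alloc ? NULL : calloc(1, sizeof(*dev));
    if (!dev) {
        err = -ENOMEM;
        goto out;          /* nothing allocated yet: skip the free */
    }

    if (fail_setup) {
        err = -EIO;
        goto out_free;     /* dev is valid here, so it must be freed */
    }

    demo_free_netdev(dev); /* demo only: release immediately on success */
    return 0;

out_free:
    demo_free_netdev(dev);
out:
    return err;
}
```

Each label undoes only what had been set up before the jump, so the allocation
failure path never reaches the free.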

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] Array overrun in drivers/net/wireless/wavelan.c
Eric Sesterhenn [Wed, 21 Jun 2006 14:40:24 +0000 (16:40 +0200)]
[PATCH] Array overrun in drivers/net/wireless/wavelan.c

hi,

This is another array overrun spotted by Coverity (id #507).
We should check the index against the array size before using it.
Not sure why the driver doesn't use ARRAY_SIZE instead of its
own macro.
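
A minimal sketch of the bounds check, assuming a hypothetical lookup table.
DEMO_ARRAY_SIZE mirrors the idea of the kernel's ARRAY_SIZE macro, and the table
contents are illustrative, not taken from the wavelan driver.

```c
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical lookup table; values are illustrative only. */
static const int demo_channel_freq[] = { 2412, 2417, 2422, 2427 };

/* Returns the table entry, or -1 when the index is out of range. */
static int demo_lookup_freq(size_t idx)
{
    if (idx >= DEMO_ARRAY_SIZE(demo_channel_freq))
        return -1;  /* check before indexing, never after */
    return demo_channel_freq[idx];
}
```

Deriving the limit from the array itself keeps the check correct if entries are
added or removed later.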

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] Remove useless check in drivers/net/pcmcia/xirc2ps_cs.c
Eric Sesterhenn [Wed, 21 Jun 2006 14:10:48 +0000 (16:10 +0200)]
[PATCH] Remove useless check in drivers/net/pcmcia/xirc2ps_cs.c

hi,

Coverity choked on this check (id #223), assuming that
skb might be NULL and used anyway later. Since
start_hard_xmit() always gets called with a valid
skb, the check is useless and this patch removes it.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] 8139cp: add ethtool eeprom support
Philip Craig [Wed, 21 Jun 2006 01:33:27 +0000 (11:33 +1000)]
[PATCH] 8139cp: add ethtool eeprom support

Implement the ethtool eeprom operations for the 8139cp driver.
Tested on x86 and big-endian ARM.

Signed-off-by: Philip Craig <philipc@snapgear.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] 8139cp: fix eeprom read command length
Philip Craig [Wed, 21 Jun 2006 01:33:26 +0000 (11:33 +1000)]
[PATCH] 8139cp: fix eeprom read command length

The read command for the 93C46/93C56 EEPROMs should be 3 bits plus
the address.  This doesn't appear to affect the operation of the
read command, but similar errors for write commands do cause failures.
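
As a rough sketch of "3 bits plus the address": the command is a start bit and a
two-bit read opcode, followed by the address bits.  The helper names are
hypothetical, and the address widths used in the note below (6 bits for a 93C46,
8 for a 93C56, both in 16-bit organisation) are assumptions rather than values
taken from the patch.

```c
#include <stdint.h>

#define DEMO_EE_READ_OP 0x6u  /* start bit (1) followed by read opcode (10) */

/* Number of bits clocked out for a read: 3 command bits plus the address. */
static unsigned int demo_read_cmd_bits(unsigned int addr_len)
{
    return 3u + addr_len;
}

/* Build the read command word: opcode bits sit above the address bits. */
static uint32_t demo_read_cmd(uint32_t addr, unsigned int addr_len)
{
    uint32_t addr_mask = (UINT32_C(1) << addr_len) - 1;

    return (DEMO_EE_READ_OP << addr_len) | (addr & addr_mask);
}
```

With 6 address bits the command is 9 bits long; with 8 address bits it is 11.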

Signed-off-by: Philip Craig <philipc@snapgear.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] b44: update b44 Kconfig entry
Gary Zambrano [Tue, 20 Jun 2006 22:34:46 +0000 (15:34 -0700)]
[PATCH] b44: update b44 Kconfig entry

Deleted "EXPERIMENTAL" from b44 entry in Kconfig.

Signed-off-by: Gary Zambrano <zambrano@broadcom.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] b44: update version to 1.01
Gary Zambrano [Tue, 20 Jun 2006 22:34:40 +0000 (15:34 -0700)]
[PATCH] b44: update version to 1.01

Update the driver version to 1.01

Signed-off-by: Gary Zambrano <zambrano@broadcom.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] b44: add wol for old nic
Gary Zambrano [Tue, 20 Jun 2006 22:34:36 +0000 (15:34 -0700)]
[PATCH] b44: add wol for old nic

This patch adds wol support for the older 440x nics that use pattern matching.
This patch is a redo thanks to feedback from Michael Chan and Francois Romieu.

Signed-off-by: Gary Zambrano <zambrano@broadcom.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] b44: add parameter
Gary Zambrano [Tue, 20 Jun 2006 22:34:26 +0000 (15:34 -0700)]
[PATCH] b44: add parameter

This patch adds a parameter to init_hw() to not completely initialize
the nic for wol.

Signed-off-by: Gary Zambrano <zambrano@broadcom.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] b44: add wol
Gary Zambrano [Tue, 20 Jun 2006 22:34:23 +0000 (15:34 -0700)]
[PATCH] b44: add wol

Adds wol to the driver.
This is a redo of a previous patch thanks to feedback from Francois Romieu.

Signed-off-by: Gary Zambrano <zambrano@broadcom.com>

Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] b44: fix manual speed/duplex/autoneg settings
Gary Zambrano [Tue, 20 Jun 2006 22:34:15 +0000 (15:34 -0700)]
[PATCH] b44: fix manual speed/duplex/autoneg settings

Fixes for speed/duplex/autoneg settings and driver settings info.
This is a redo of a previous patch thanks to feedback from Jeff Garzik.

Signed-off-by: Gary Zambrano <zambrano@broadcom.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] AT91RM9200 Ethernet #4: Suspend/Resume
Andrew Victor [Tue, 20 Jun 2006 10:19:13 +0000 (12:19 +0200)]
[PATCH] AT91RM9200 Ethernet #4: Suspend/Resume

Adds power-management (suspend/resume) support to the AT91RM9200
Ethernet driver.
Patch from David Brownell.

Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] AT91RM9200 Ethernet #3: Cleanup
Andrew Victor [Tue, 20 Jun 2006 10:10:57 +0000 (12:10 +0200)]
[PATCH] AT91RM9200 Ethernet #3: Cleanup

Moved global ether_clk variable into controller data structure.
Patch from David Brownell.

Davicom 9161 PHY was being incorrectly displayed as "9196".
Patch from Brian Stafford.

clk_get() doesn't return NULL on error, so the return value needs to be
tested with IS_ERR().
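
The error-pointer convention behind this can be sketched in userspace C.  The
demo_* helpers only imitate the idea of the kernel's ERR_PTR(), PTR_ERR() and
IS_ERR(); demo_clk_get() and struct demo_clk are hypothetical and do not reflect
the real clock API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Recover the errno value from an error pointer. */
static long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True when p lies in the top DEMO_MAX_ERRNO addresses, i.e. encodes an error. */
static int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_clk { int rate; };

static struct demo_clk demo_ether_clk = { 100 };

/* Hypothetical lookup: returns an error pointer, never NULL, on failure. */
static struct demo_clk *demo_clk_get(int exists)
{
    if (!exists)
        return demo_err_ptr(-ENOENT);
    return &demo_ether_clk;
}
```

A NULL test on such a return value never fires, because the failure value is a
non-NULL pointer near the top of the address space.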

Whitespace cleanup.

Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] AT91RM9200 Ethernet #2: MII interface
Andrew Victor [Tue, 20 Jun 2006 09:59:05 +0000 (11:59 +0200)]
[PATCH] AT91RM9200 Ethernet #2: MII interface

Adds support for the MII ioctls via generic_mii_ioctl().
Patch from Brian Stafford.

Set the mii.phy_id to the detected PHY address, otherwise ethtool cannot
access PHYs other than 0.
Patch from Roman Kolesnikov.

Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] AT91RM9200 Ethernet #1: Link poll
Andrew Victor [Tue, 20 Jun 2006 09:50:23 +0000 (11:50 +0200)]
[PATCH] AT91RM9200 Ethernet #1: Link poll

For Ethernet PHYs that don't have an IRQ pin or boards that don't
connect the IRQ pin to the processor, we enable a timer to poll the
PHY's link state.

Patch originally supplied by Eric Benard and Roman Kolesnikov.

Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[PATCH] IP27: Really set PCI64_ATTR_VIRTUAL, not PCI64_ATTR_PREC.
Ralf Baechle [Sat, 17 Jun 2006 17:57:39 +0000 (18:57 +0100)]
[PATCH] IP27: Really set PCI64_ATTR_VIRTUAL, not PCI64_ATTR_PREC.

IOC3's homegrown DMA mapping functions that are used to optimize things
a little on IP27 set the wrong bit.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years agoMerge branch 'master' into upstream
Jeff Garzik [Fri, 23 Jun 2006 02:51:46 +0000 (22:51 -0400)]
Merge branch 'master' into upstream

18 years agovia-velocity: the link is not correctly detected when the device starts
Francois Romieu [Thu, 22 Jun 2006 22:47:06 +0000 (00:47 +0200)]
via-velocity: the link is not correctly detected when the device starts

The patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=6711

Signed-off-by: Roy Marples <uberlord@gentoo.org>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6
Linus Torvalds [Thu, 22 Jun 2006 22:08:56 +0000 (15:08 -0700)]
Merge /pub/scm/linux/kernel/git/gregkh/i2c-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6: (44 commits)
  [PATCH] I2C: I2C controllers go into right place on sysfs
  [PATCH] hwmon-vid: Add support for Intel Core and Conroe
  [PATCH] lm70: New hardware monitoring driver
  [PATCH] hwmon: Fix the Kconfig header
  [PATCH] i2c-i801: Merge setup function
  [PATCH] i2c-i801: Better pci subsystem integration
  [PATCH] i2c-i801: Cleanups
  [PATCH] i2c-i801: Remove PCI function check
  [PATCH] i2c-i801: Remove force_addr parameter
  [PATCH] i2c-i801: Fix block transaction poll loops
  [PATCH] scx200_acb: Documentation update
  [PATCH] scx200_acb: Mark scx200_acb_probe __init
  [PATCH] scx200_acb: Use PCI I/O resource when appropriate
  [PATCH] i2c: Mark block write buffers as const
  [PATCH] i2c-ocores: Minor cleanups
  [PATCH] abituguru: Fix fan detection
  [PATCH] abituguru: Review fixes
  [PATCH] abituguru: New hardware monitoring driver
  [PATCH] w83792d: Add missing data access locks
  [PATCH] w83792d: Fix setting the PWM value
  ...

18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/w1-2.6
Linus Torvalds [Thu, 22 Jun 2006 22:08:34 +0000 (15:08 -0700)]
Merge /pub/scm/linux/kernel/git/gregkh/w1-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/w1-2.6:
  [PATCH] w1: warning fix
  [PATCH] w1: clean up W1_CON dependency.
  [PATCH] drivers/w1/w1.c: fix a compile error
  [PATCH] W1: fix dependencies of W1_SLAVE_DS2433_CRC
  [PATCH] W1: possible cleanups
  [PATCH] W1: cleanups
  [PATCH] w1 exports
  [PATCH] w1: Use mutexes instead of semaphores.
  [PATCH] w1: Make w1 connector notifications depend on connector.
  [PATCH] w1: netlink: Mark netlink group 1 as unused.
  [PATCH] w1: Move w1-connector definitions into linux/include/connector.h
  [PATCH] w1: Userspace communication protocol over connector.
  [PATCH] w1: Replace dscore and ds_w1_bridge with ds2490 driver.
  [PATCH] w1: Added default generic read/write operations.

18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
Linus Torvalds [Thu, 22 Jun 2006 22:07:59 +0000 (15:07 -0700)]
Merge /pub/scm/linux/kernel/git/gregkh/pci-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (27 commits)
  [PATCH] PCI: nVidia quirk to make AER PCI-E extended capability visible
  [PATCH] PCI: fix issues with extended conf space when MMCONFIG disabled because of e820
  [PATCH] PCI: Bus Parity Status sysfs interface
  [PATCH] PCI: fix memory leak in MMCONFIG error path
  [PATCH] PCI: fix error with pci_get_device() call in the mpc85xx driver
  [PATCH] PCI: MSI-K8T-Neo2-Fir: run only where needed
  [PATCH] PCI: fix race with pci_walk_bus and pci_destroy_dev
  [PATCH] PCI: clean up pci documentation to be more specific
  [PATCH] PCI: remove unneeded msi code
  [PATCH] PCI: don't move ioapics below PCI bridge
  [PATCH] PCI: cleanup unused variable about msi driver
  [PATCH] PCI: disable msi mode in pci_disable_device
  [PATCH] PCI: Allow MSI to work on kexec kernel
  [PATCH] PCI: AMD 8131 MSI quirk called too late, bus_flags not inherited ?
  [PATCH] PCI: Move various PCI IDs to header file
  [PATCH] PCI Bus Parity Status-broken hardware attribute, EDAC foundation
  [PATCH] PCI: i386/x86_84: disable PCI resource decode on device disable
  [PATCH] PCI ACPI: Rename the functions to avoid multiple instances.
  [PATCH] PCI: don't enable device if already enabled
  [PATCH] PCI: Add a "enable" sysfs attribute to the pci devices to allow userspace (Xorg) to enable devices without doing foul direct access
  ...

18 years ago[PATCH] x86_64: use select for GART_IOMMU to enable AGP
Roman Zippel [Thu, 22 Jun 2006 21:47:35 +0000 (14:47 -0700)]
[PATCH] x86_64: use select for GART_IOMMU to enable AGP

The AGP default doesn't work well with other selects, so use a select for
GART_IOMMU as well.  Remove a redundant default for SWIOTLB as well.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] zlib_inflate: Upgrade library code to a recent version
Richard Purdie [Thu, 22 Jun 2006 21:47:34 +0000 (14:47 -0700)]
[PATCH] zlib_inflate: Upgrade library code to a recent version

Upgrade the zlib_inflate implementation in the kernel from a patched
version 1.1.3/4 to a patched 1.2.3.

The code in the kernel is about seven years old and I noticed that the
external zlib library's inflate performance was significantly faster (~50%)
than the code in the kernel on ARM (and faster again on x86_32).

For comparison the newer deflate code is 20% slower on ARM and 50% slower
on x86_32 but gives an approx 1% compression ratio improvement.  I don't
consider this to be an improvement for kernel use so have no plans to
change the zlib_deflate code.

Various changes have been made to the zlib code in the kernel, the most
significant being the extra functions/flush option used by ppp_deflate.
This update reimplements the features PPP needs to ensure it continues to
work.

This code has been tested on ARM under both JFFS2 (with zlib compression
enabled) and ppp_deflate and on x86_32.  JFFS2 sees an approx. 10% real-world
file read speed improvement.

This patch also removes ZLIB_VERSION as it no longer has a correct value.
We don't need version checks anyway as the kernel's module handling will
take care of that for us.  This removal is also more in keeping with the
zlib author's wishes (http://www.zlib.net/zlib_faq.html#faq24) and I've
added something to the zlib.h header to note it's a modified version.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Acked-by: Joern Engel <joern@wh.fh-wedel.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] vgacon: make VGA_MAP_MEM take size, remove extra use
Bjorn Helgaas [Thu, 22 Jun 2006 21:47:32 +0000 (14:47 -0700)]
[PATCH] vgacon: make VGA_MAP_MEM take size, remove extra use

VGA_MAP_MEM translates to ioremap() on some architectures.  It makes sense
to do this to vga_vram_base, because we're going to access memory between
vga_vram_base and vga_vram_end.

But it doesn't really make sense to map starting at vga_vram_end, because
we aren't going to access memory starting there.  On ia64, which always has
to be different, ioremapping vga_vram_end gives you something completely
incompatible with ioremapped vga_vram_start, so vga_vram_size ends up being
nonsense.

As a bonus, we often know the size up front, so we can use ioremap()
correctly, rather than giving it a zero size.
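
As a rough user-space illustration of the idea (not the actual vgacon
code), the sketch below maps the window once from its base with an
explicit size and derives the end address from the mapped base.
fake_ioremap(), last_request and map_vga_window() are hypothetical names
invented for this example; only VGA_MAP_MEM and ioremap() come from the
description above.

```c
/* Hypothetical stand-in for ioremap(): records the request and returns
 * the "mapped" address unchanged.  This is not the kernel API. */
struct map_request {
	unsigned long phys;
	unsigned long size;
};

static struct map_request last_request;

static unsigned long fake_ioremap(unsigned long phys, unsigned long size)
{
	last_request.phys = phys;
	last_request.size = size;
	return phys;
}

/* The macro takes a size, so the mapping can cover exactly the range
 * that will be accessed instead of being requested with size zero. */
#define VGA_MAP_MEM(x, s) fake_ioremap((x), (s))

/* Map the window once, starting at the base, and compute the end from
 * the mapped base plus the size rather than mapping the end address. */
static unsigned long map_vga_window(unsigned long base, unsigned long size,
				    unsigned long *end)
{
	unsigned long mapped = VGA_MAP_MEM(base, size);

	*end = mapped + size;
	return mapped;
}
```

Because the end is computed from the mapped base, end minus base is always
the requested size, whatever address the mapping function returns.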

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] prune_one_dentry() tweaks
Andrew Morton [Thu, 22 Jun 2006 21:47:31 +0000 (14:47 -0700)]
[PATCH] prune_one_dentry() tweaks

- Add description of d_lock handling to comments over prune_one_dentry().

- It has three callsites - uninline it, saving 200 bytes of text.

Cc: Jan Blunck <jblunck@suse.de>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: Olaf Hering <olh@suse.de>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Fix dcache race during umount
NeilBrown [Thu, 22 Jun 2006 21:47:28 +0000 (14:47 -0700)]
[PATCH] Fix dcache race during umount

The race is that the shrink_dcache_memory shrinker could get called while a
filesystem is being unmounted, and could try to prune a dentry belonging to
that filesystem.

If it does, then it will call in to iput on the inode while the dentry is
no longer able to be found by the umounting process.  If iput takes a
while, generic_shutdown_super could get all the way through
shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without
ever waiting on this particular inode.

Eventually the superblock gets freed anyway and if the iput tried to touch
it (which some filesystems certainly do), it will lose.  The promised
"Self-destruct in 5 seconds" doesn't lead to a nice day.

The race is closed by holding s_umount while calling prune_one_dentry on
someone else's dentry.  As a down_read_trylock is used,
shrink_dcache_memory will no longer try to prune the dentry of a filesystem
that is being unmounted, and unmount will not be able to start until any
such active prune_one_dentry completes.
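
The trylock pattern can be sketched in user space roughly as follows.
This is a toy model built on C11 atomics, not the kernel's rwsem: every
toy_* name is hypothetical, and the writer side uses a trylock only to
keep the example non-blocking (the real umount path waits for the lock).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy reader/writer lock: count > 0 means that many readers hold it,
 * -1 means a writer (umount) holds it, 0 means it is free. */
struct toy_rwsem {
	atomic_int count;
};

/* Take the lock for reading only if no writer holds it; never wait. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	int old = atomic_load(&sem->count);

	while (old >= 0) {
		/* On failure 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&sem->count, &old, old + 1))
			return true;
	}
	return false;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	atomic_fetch_sub(&sem->count, 1);
}

/* Writer side: succeeds only when there are no readers and no writer. */
static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&sem->count, &expected, -1);
}

static void toy_up_write(struct toy_rwsem *sem)
{
	atomic_store(&sem->count, 0);
}

struct toy_super_block {
	struct toy_rwsem s_umount;
	int pruned;		/* counts simulated prune operations */
};

/* Prune on behalf of the memory shrinker: skip the filesystem entirely
 * if it is being unmounted, otherwise hold s_umount across the prune. */
static bool toy_prune_foreign_dentry(struct toy_super_block *sb)
{
	if (!toy_down_read_trylock(&sb->s_umount))
		return false;
	sb->pruned++;		/* stands in for prune_one_dentry() */
	toy_up_read(&sb->s_umount);
	return true;
}
```

While a prune holds the read side, toy_down_write_trylock() fails, which
models unmount being unable to start until the prune completes.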

This requires that prune_dcache *knows* which filesystem (if any) it is
doing the prune on behalf of so that it can be careful of other
filesystems.  shrink_dcache_memory isn't called on behalf of any
filesystem, and so is careful of everything.

shrink_dcache_anon is now passed a super_block rather than the s_anon list
out of the superblock, so it can get the s_anon list itself, and can pass
the superblock down to prune_dcache.

If prune_dcache finds a dentry that it cannot free, it leaves it where it
is (at the tail of the list) and exits, on the assumption that some other
thread will be removing that dentry soon.  To try to make sure that some
work gets done, a limited number of dentries which are untouchable are
skipped over while choosing the dentry to work on.

I believe this race was first found by Kirill Korotaev.

Cc: Jan Blunck <jblunck@suse.de>
Acked-by: Kirill Korotaev <dev@openvz.org>
Cc: Olaf Hering <olh@suse.de>
Acked-by: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] avoid tasklist_lock at getrusage for multithreaded case too
Ravikiran G Thirumalai [Thu, 22 Jun 2006 21:47:26 +0000 (14:47 -0700)]
[PATCH] avoid tasklist_lock at getrusage for multithreaded case too

Avoid taking tasklist_lock at getrusage for the multithreaded case too.
We don't need to take the tasklist lock for thread traversal of a process
since Oleg's do-__unhash_process-under-siglock.patch and related work.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] remove steal_locks()
Miklos Szeredi [Thu, 22 Jun 2006 21:47:22 +0000 (14:47 -0700)]
[PATCH] remove steal_locks()

This patch removes the steal_locks() function.

steal_locks() doesn't work correctly with any filesystem that does its own
lock management, including NFS, CIFS, etc.

In addition it has weird semantics on local filesystems in case tasks
sharing file-descriptor tables are doing POSIX locking operations in
parallel to execve().

The steal_locks() function has an effect on applications doing:

clone(CLONE_FILES)
  /* in child */
  lock
  execve
  lock

POSIX locks acquired before execve (by "child", "parent" or any further
task sharing files_struct) will after the execve be owned exclusively by
"child".

According to Chris Wright some LSB/LTP kind of suite triggers without the
stealing behavior, but there's no known real-world application that would
also fail.

Apps using NPTL are not affected, since all other threads are killed before
execve.

Apps using LinuxThreads are only affected if they

  - have multiple threads during exec (LinuxThreads doesn't kill other
    threads, the app may do it with pthread_kill_other_threads_np())
  - rely on POSIX locks being inherited across exec

Both conditions are documented, but not their interaction.

Apps using clone() natively are affected if they

  - use clone(CLONE_FILES)
  - rely on POSIX locks being inherited across exec

The above scenarios are unlikely, but possible.

If the patch is vetoed, there's a plan B, that involves mostly keeping the
weird stealing semantics, but changing the way lock ownership is handled so
that network and local filesystems work consistently.

That would add more complexity though, so this solution seems to be
preferred by most people.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Fix a race condition between ->i_mapping and iput()
OGAWA Hirofumi [Thu, 22 Jun 2006 21:47:21 +0000 (14:47 -0700)]
[PATCH] Fix a race condition between ->i_mapping and iput()

This race caused an oops, and can be reproduced with the following.

    while true; do
        dd if=/dev/zero of=/dev/.static/dev/hdg1 bs=512 count=1000 & sync
    done

This race condition was between __sync_single_inode() and iput().

          cpu0 (fs's inode)                 cpu1 (bdev's inode)
          -----------------                 -------------------
                                       close("/dev/hda2")
                                       [...]
__sync_single_inode()
   /* copy the bdev's ->i_mapping */
   mapping = inode->i_mapping;

                                       generic_forget_inode()
                                          bdev_clear_inode()
                                             /* restore the fs's ->i_mapping */
                                             inode->i_mapping = &inode->i_data;
                                          /* bdev's inode was freed */
                                          destroy_inode(inode);

   if (wait) {
      /* dereference a freed bdev's mapping->host */
      filemap_fdatawait(mapping);  /* Oops */

Since __sync_single_inode() only takes a ref-count on the fs's inode,
another process can close() and free the bdev's inode while the fs's inode
is being written.  So __sync_single_inode() accesses the freed ->i_mapping
and oopses.

This patch takes a ref-count on the bdev's inode for the fs's inode before
setting a ->i_mapping, and the clear_inode() of the fs's inode does iput() on
the bdev's inode.  So if the fs's inode is still living, bdev's inode
shouldn't be freed.
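
The pinning scheme can be modelled in a few lines of single-threaded C.
This is only a sketch of the reference-counting idea, with hypothetical
toy_* names and a plain int counter; the real code works on struct inode
and needs proper locking.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	int refcount;				/* number of holders */
	bool freed;				/* set when the last reference goes */
	struct toy_inode *mapping_owner;	/* inode whose mapping we borrow */
};

static void toy_iget(struct toy_inode *inode)
{
	inode->refcount++;
}

static void toy_iput(struct toy_inode *inode)
{
	if (--inode->refcount == 0)
		inode->freed = true;	/* stands in for destroy_inode() */
}

/* Borrow the bdev's mapping: pin the bdev inode before pointing at it. */
static void toy_set_mapping(struct toy_inode *fs, struct toy_inode *bdev)
{
	toy_iget(bdev);
	fs->mapping_owner = bdev;
}

/* When the fs inode is cleared, release the pinned bdev inode. */
static void toy_clear_inode(struct toy_inode *fs)
{
	if (fs->mapping_owner) {
		toy_iput(fs->mapping_owner);
		fs->mapping_owner = NULL;
	}
}
```

With the extra reference in place, a close() that drops the opener's
reference no longer frees the bdev inode; it goes away only once the fs
inode has been cleared as well.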

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] PCI: Add PCI_CAP_ID_VNDR
Brice Goglin [Thu, 22 Jun 2006 21:47:20 +0000 (14:47 -0700)]
[PATCH] PCI: Add PCI_CAP_ID_VNDR

Add the vendor-specific extended capability PCI_CAP_ID_VNDR.  It is required
by the Myri-10G Ethernet driver.
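
For illustration, a driver-independent sketch of how such a capability
is located: walk the capability list in a snapshot of the 256-byte
config space until the wanted ID is found.  The register offsets and the
ID value 0x09 follow the usual PCI definitions; find_capability() and
the byte-array interface are assumptions made for this example (in-kernel
code would normally use pci_find_capability() on a struct pci_dev).

```c
#include <stdint.h>

#define PCI_STATUS		0x06	/* 16-bit status register */
#define PCI_STATUS_CAP_LIST	0x10	/* capability list is present */
#define PCI_CAPABILITY_LIST	0x34	/* offset of the first capability */
#define PCI_CAP_LIST_ID		0	/* capability ID byte */
#define PCI_CAP_LIST_NEXT	1	/* pointer to the next capability */
#define PCI_CAP_ID_VNDR		0x09	/* vendor-specific capability */

/* Return the offset of the first capability with the given ID in a
 * snapshot of the config space, or 0 if it is not present. */
static uint8_t find_capability(const uint8_t cfg[256], uint8_t cap_id)
{
	uint16_t status = (uint16_t)(cfg[PCI_STATUS] |
				     (cfg[PCI_STATUS + 1] << 8));
	uint8_t pos;
	int ttl = 48;	/* guard against malformed, looping lists */

	if (!(status & PCI_STATUS_CAP_LIST))
		return 0;

	/* The low two bits of each pointer are reserved. */
	pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;
	while (pos >= 0x40 && ttl--) {
		if (cfg[pos + PCI_CAP_LIST_ID] == cap_id)
			return pos;
		pos = cfg[pos + PCI_CAP_LIST_NEXT] & 0xfc;
	}
	return 0;
}
```

A caller would pass PCI_CAP_ID_VNDR as cap_id and treat a return value
of 0 as "capability not present".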

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] myri10ge build fix
Andrew Morton [Thu, 22 Jun 2006 21:47:19 +0000 (14:47 -0700)]
[PATCH] myri10ge build fix

Someone changed skb_linearize().

Cc: Brice Goglin <bgoglin@myri.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] suspend_console() warning fix
Andrew Morton [Thu, 22 Jun 2006 21:47:18 +0000 (14:47 -0700)]
[PATCH] suspend_console() warning fix

kernel/power/main.c: In function 'suspend_prepare':
kernel/power/main.c:89: warning: implicit declaration of function 'suspend_console'
kernel/power/main.c: In function 'suspend_finish':
kernel/power/main.c:137: warning: implicit declaration of function 'resume_console'

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Keys: Fix race between two instantiators of a key
David Howells [Thu, 22 Jun 2006 21:47:18 +0000 (14:47 -0700)]
[PATCH] Keys: Fix race between two instantiators of a key

Add a revocation notification method to the key type and call it whilst
the key's semaphore is still write-locked after setting the revocation
flag.

The patch then uses this to maintain a reference on the task_struct of the
process that calls request_key() for as long as the authorisation key
remains unrevoked.

This fixes a potential race between two processes both of which have
assumed the authority to instantiate a key (one may have forked the other
for example).  The problem is that there's no locking around the check for
revocation of the auth key and the use of the task_struct it points to, nor
does the auth key keep a reference on the task_struct.

Access to the "context" pointer in the auth key must thenceforth be done
with the auth key semaphore held.  The revocation method is called with the
target key semaphore held write-locked and the search of the context
process's keyrings is done with the auth key semaphore read-locked.

The check for the revocation state of the auth key just prior to searching
it is done after the auth key is read-locked for the search.  This ensures
that the auth key can't be revoked between the check and the search.

The revocation notification method is added so that the context task_struct
can be released as soon as instantiation happens rather than waiting for
the auth key to be destroyed, thus avoiding the unnecessary pinning of the
requesting process.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] selinux: add hooks for key subsystem
Michael LeMay [Thu, 22 Jun 2006 21:47:17 +0000 (14:47 -0700)]
[PATCH] selinux: add hooks for key subsystem

Introduce SELinux hooks to support the access key retention subsystem
within the kernel.  Incorporate new flask headers from a modified version
of the SELinux reference policy, with support for the new security class
representing retained keys.  Extend the "key_alloc" security hook with a
task parameter representing the intended ownership context for the key
being allocated.  Attach security information to root's default keyrings
within the SELinux initialization routine.

Has passed David's testsuite.

Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] NTFS: Critical bug fix (affects MIPS and possibly others)
Anton Altaparmakov [Thu, 22 Jun 2006 21:47:15 +0000 (14:47 -0700)]
[PATCH] NTFS: Critical bug fix (affects MIPS and possibly others)

Many thanks to Pauline Ng for the detailed bug report and analysis!

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sparc build breakage
Al Viro [Thu, 22 Jun 2006 21:47:14 +0000 (14:47 -0700)]
[PATCH] sparc build breakage

rd_prompt et al. depend on CONFIG_BLK_DEV_RAM, not CONFIG_BLK_INITRD; now
that those are independent, setup.c blows with INITRD on and BLK_DEV_RAM
off.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] UML: fix wall_to_monotonic initialization
Jeff Dike [Thu, 22 Jun 2006 21:47:09 +0000 (14:47 -0700)]
[PATCH] UML: fix wall_to_monotonic initialization

Change a variable from unsigned to signed in order to get sign-extension
when the thing is negated.  Without this, uptime is horribly confused.
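
The general effect can be shown with a small standalone example.  The
types and function names below are chosen for illustration only and are
not taken from the UML source; the sketch assumes a platform where int
is 32 bits wide.

```c
#include <stdint.h>

/* Negating an unsigned 32-bit value wraps modulo 2^32; widening the
 * result to 64 bits then zero-extends, giving a large positive number. */
static int64_t negate_unsigned(uint32_t secs)
{
	return (int64_t)(-secs);
}

/* Negating a signed value yields a negative number, which is
 * sign-extended when widened to 64 bits. */
static int64_t negate_signed(int32_t secs)
{
	return (int64_t)(-secs);
}
```

For an input of 5, the unsigned variant produces 4294967291 while the
signed variant produces -5, which is the kind of difference that would
throw off an offset such as wall_to_monotonic.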

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>