David Gibson [Wed, 22 Mar 2006 08:08:56 +0000 (00:08 -0800)]
[PATCH] hugepage: Make {alloc,free}_huge_page() local
Originally, mm/hugetlb.c just handled the hugepage physical allocation path
and its {alloc,free}_huge_page() functions were used from the arch specific
hugepage code. These days those functions are only used within mm/hugetlb.c
itself. Therefore, this patch makes them static and removes their
prototypes from hugetlb.h. This requires a small rearrangement of code in
mm/hugetlb.c to avoid a forward declaration.
This patch causes no regressions on the libhugetlbfs testsuite (ppc64,
POWER5).
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Gibson [Wed, 22 Mar 2006 08:08:55 +0000 (00:08 -0800)]
[PATCH] hugepage: Strict page reservation for hugepage inodes
These days, hugepages are demand-allocated at first fault time. There's a
somewhat dubious (and racy) heuristic when making a new mmap() to check if
there are enough available hugepages to fully satisfy that mapping.
A particularly obvious case where the heuristic breaks down is where a
process maps its hugepages not as a single chunk, but as a bunch of
individually mmap()ed (or shmat()ed) blocks without touching and
instantiating the pages in between allocations. In this case the size of
each block is compared against the total number of available hugepages.
It's thus easy for the process to become overcommitted, because each block
mapping will succeed, although the total number of hugepages required by
all blocks exceeds the number available. In particular, this defeats
programs which detect a mapping failure and adjust their hugepage usage
downward accordingly.
The patch below addresses this problem, by strictly reserving a number of
physical hugepages for hugepage inodes which have been mapped, but not
instantiated. MAP_SHARED mappings are thus "safe" - they will fail on
mmap(), not later with an OOM SIGKILL. MAP_PRIVATE mappings can still
trigger an OOM. (Actually SHARED mappings can technically still OOM, but
only if the sysadmin explicitly reduces the hugepage pool between mapping
and instantiation)
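As a rough userspace illustration of the difference (not the kernel code; the
struct, field and function names below are hypothetical), compare a per-mapping
check against the whole pool with a check that also records a reservation:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pool model; names are illustrative, not taken from mm/hugetlb.c. */
struct hugepage_pool {
	unsigned long free_pages;     /* hugepages not yet instantiated */
	unsigned long reserved_pages; /* promised to mapped but untouched ranges */
};

/* Heuristic: each mapping is compared against the whole pool on its own. */
static bool map_heuristic(const struct hugepage_pool *pool, unsigned long npages)
{
	return npages <= pool->free_pages;
}

/*
 * Strict reservation: a mapping succeeds only if enough unreserved pages
 * remain, and a successful mapping reduces what later mappings can claim.
 * Assumes the invariant reserved_pages <= free_pages.
 */
static bool map_reserve(struct hugepage_pool *pool, unsigned long npages)
{
	if (npages > pool->free_pages - pool->reserved_pages)
		return false;
	pool->reserved_pages += npages;
	return true;
}
```

With a pool of 4 pages, two 3-page mappings both pass map_heuristic(), while
map_reserve() refuses the second one at mapping time.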
This patch appears to address the problem at hand - it allows DB2 to start
correctly, for instance, which previously suffered the failure described
above.
This patch causes no regressions on the libhugetlbfs testsuite, and makes a
test (designed to catch this problem) pass which previously failed (ppc64,
POWER5).
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Gibson [Wed, 22 Mar 2006 08:08:53 +0000 (00:08 -0800)]
[PATCH] hugepage: serialize hugepage allocation and instantiation
Currently, no lock or mutex is held between allocating a hugepage and
inserting it into the pagetables / page cache. When we do go to insert the
page into pagetables or page cache, we recheck and may free the newly
allocated hugepage. However, since the number of hugepages in the system
is strictly limited, and it's usual to want to use all of them, this can
still lead to spurious allocation failures.
For example, suppose two processes are both mapping (MAP_SHARED) the same
hugepage file, large enough to consume the entire available hugepage pool.
If they race instantiating the last page in the mapping, they will both
attempt to allocate the last available hugepage. One will fail, of course,
returning OOM from the fault and thus causing the process to be killed,
despite the fact that the entire mapping can, in fact, be instantiated.
The patch fixes this race by the simple method of adding a (sleeping) mutex
to serialize the hugepage fault path between allocation and insertion into
pagetables and/or page cache. It would be possible to avoid the
serialization by catching the allocation failures, waiting on some
condition, then rechecking to see if someone else has instantiated the page
for us. Given the likely frequency of hugepage instantiations, it seems
very doubtful it's worth the extra complexity.
This patch causes no regression on the libhugetlbfs testsuite, and one
test, which can trigger this race, now passes where it previously failed.
Actually, the test still sometimes fails, though less often and only as a
shmat() failure, rather than processes getting OOM killed by the VM. The dodgy
heuristic tests in fs/hugetlbfs/inode.c for whether there's enough hugepage
space aren't protected by the new mutex, and it would be ugly to make them so,
so there's still a race there. Another patch to replace those tests with
something saner, for this reason as well as others, is coming...
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Gibson [Wed, 22 Mar 2006 08:08:51 +0000 (00:08 -0800)]
[PATCH] hugepage: Small fixes to hugepage clear/copy path
Move the loops used in mm/hugetlb.c to clear and copy hugepages to their
own functions for clarity. As we do so, we add some checks of need_resched
- we are, after all, copying megabytes of memory here. We also add
might_sleep() accordingly. We generally dropped locks around the clear and
copy already, but not everyone has PREEMPT enabled, so we should still be
checking explicitly.
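A minimal userspace sketch of the idea, assuming a 4k base page and using a
hypothetical cond_resched_sketch() counter in place of the kernel's
rescheduling check: the hugepage is cleared one base page at a time, with a
chance to yield before each chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SUBPAGE_SIZE 4096UL /* assumed base page size for this sketch */

/* Hypothetical stand-in for a reschedule check; it only counts calls. */
static unsigned long resched_checks;

static void cond_resched_sketch(void)
{
	resched_checks++;
}

/* Clear a large buffer chunk by chunk, checking for a reschedule each time. */
static void clear_huge_page_sketch(unsigned char *hpage, size_t nr_subpages)
{
	size_t i;

	for (i = 0; i < nr_subpages; i++) {
		cond_resched_sketch();
		memset(hpage + i * SUBPAGE_SIZE, 0, SUBPAGE_SIZE);
	}
}
```

The point of the per-chunk check is that a non-preemptible kernel would
otherwise clear the whole hugepage without ever offering to reschedule.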
For this to work, we need to remove the clear_huge_page() from
alloc_huge_page(), which is called with the page_table_lock held in the COW
path. We move the clear_huge_page() to just after the alloc_huge_page() in
the hugepage no-page path. In the COW path, the new page is about to be
copied over, so clearing it was just a waste of time anyway. So as a side
effect we also fix the fact that we held the page_table_lock for far too
long in this path by calling alloc_huge_page() under it.
It causes no regressions on the libhugetlbfs testsuite (ppc64, POWER5).
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Zhang, Yanmin [Wed, 22 Mar 2006 08:08:50 +0000 (00:08 -0800)]
[PATCH] Enable mprotect on huge pages
2.6.16-rc3 uses hugetlb on-demand paging, but it doesn't support hugetlb
mprotect.
From: David Gibson <david@gibson.dropbear.id.au>
Remove a test from the mprotect() path which checks that the mprotect()ed
range on a hugepage VMA is hugepage aligned (yes, really, the sense of
is_aligned_hugepage_range() is the opposite of what you'd guess :-/).
In fact, we don't need this test. If the given addresses match the
beginning/end of a hugepage VMA they must already be suitably aligned. If
they don't, then mprotect_fixup() will attempt to split the VMA. The very
first test in split_vma() will check for a badly aligned address on a
hugepage VMA and return -EINVAL if necessary.
From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
On i386 and x86-64, pte flag _PAGE_PSE collides with _PAGE_PROTNONE. The
identity of the hugetlb pte is lost when changing page protection via mprotect.
A page fault that occurs later will trigger a bug check in huge_pte_alloc().
The fix is to always make the new pte a hugetlb pte and also to clean up
legacy code where _PAGE_PRESENT is forced on in the pre-faulting days.
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Steven Pratt [Wed, 22 Mar 2006 08:08:48 +0000 (00:08 -0800)]
[PATCH] readahead: fix initial window size calculation
The current get_init_ra_size is not optimal across different IO
sizes and max_readahead values. Here is a quick summary of sizes computed
under current design and under the attached patch. All of these assume 1st
IO at offset 0, or 1st detected sequential IO.
32k max, 4k request
  old     new
  -----------------
    8k      8k
   16k     16k
   32k     32k
128k max, 4k request
  old     new
  -----------------
   32k     16k
   64k     32k
  128k     64k
  128k    128k
128k max, 32k request
  old     new
  -----------------
   32k     64k  <-----
   64k    128k
  128k    128k
512k max, 4k request
  old     new
  -----------------
    4k     32k  <----
   16k     64k
   64k    128k
  128k    256k
  512k    512k
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Steven Pratt <slpratt@austin.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 22 Mar 2006 08:08:47 +0000 (00:08 -0800)]
[PATCH] readahead: ->prev_page can overrun the ahead window
If get_next_ra_size() does not grow fast enough, ->prev_page can overrun
the ahead window. This means the caller will read the pages from
->ahead_start + ->ahead_size to ->prev_page synchronously.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Steven Pratt <slpratt@austin.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Hugh Dickins [Wed, 22 Mar 2006 08:08:46 +0000 (00:08 -0800)]
[PATCH] shmem: inline to avoid warning
shmem.c was named and shamed in Jesper's "Building 100 kernels" warnings:
shmem_parse_mpol is only used when CONFIG_TMPFS parses mount options; and
only called from that one site, so mark it inline like its non-NUMA stub.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Christoph Lameter [Wed, 22 Mar 2006 08:08:45 +0000 (00:08 -0800)]
[PATCH] vmscan: remove obsolete checks from shrink_list() and fix unlikely in refill_inactive_zone()
As suggested by Marcelo:
1. The optimization introduced recently for not calling
page_referenced() during zone reclaim makes two additional checks in
shrink_list unnecessary.
2. The if (unlikely(sc->may_swap)) in refill_inactive_zone is optimized
for the zone_reclaim case. However, most people's systems only do swap.
Undo that.
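For readers unfamiliar with the hint in item 2, here is a small sketch of
what is being undone. It assumes GCC or Clang (for __builtin_expect); the
struct and function names are hypothetical. The hint changes only which
branch the compiler lays out as the fast path, never the result.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's branch hint (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical cut-down scan_control with just the field of interest. */
struct scan_control_sketch {
	int may_swap;
};

/* Before: the hint says may_swap is rarely set (the zone_reclaim case). */
static int consider_mapped_hinted(const struct scan_control_sketch *sc)
{
	if (unlikely(sc->may_swap))
		return 1;
	return 0;
}

/* After: no hint, so neither branch is favoured by the compiler. */
static int consider_mapped_plain(const struct scan_control_sketch *sc)
{
	if (sc->may_swap)
		return 1;
	return 0;
}
```

Both variants return the same value for every input, so dropping the hint is
purely a code layout decision for the common swapping case.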
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Michael Buesch [Wed, 22 Mar 2006 08:08:44 +0000 (00:08 -0800)]
[PATCH] Uninline sys_mmap common code (reduce binary size)
Remove the inlining of the new vs old mmap system call common code. This
reduces the size of the resulting vmlinux for defconfig as follows:
mb@pc1:~/develop/git/linux-2.6$ size vmlinux.mmap*
   text    data     bss     dec     hex filename
3303749  521524  186564 4011837  3d373d vmlinux.mmapinline
3303557  521524  186564 4011645  3d367d vmlinux.mmapnoinline
The new sys_mmap2() also has one function call overhead removed now.
(It was probably already optimized to a jmp before, but anyway...)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:43 +0000 (00:08 -0800)]
[PATCH] mm: optimise page_count
Optimise page_count compound page test and make it consistent with similar
functions.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:42 +0000 (00:08 -0800)]
[PATCH] mm: more CONFIG_DEBUG_VM
Put a few more checks under CONFIG_DEBUG_VM
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 22 Mar 2006 08:08:42 +0000 (00:08 -0800)]
[PATCH] mm: prep_zero_page() in irq is a bug
prep_zero_page() uses KM_USER0 and hence may not be used from IRQ context, at
least for highmem pages.
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:41 +0000 (00:08 -0800)]
[PATCH] mm: cleanup prep_ stuff
Move the prep_ stuff into prep_new_page.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:40 +0000 (00:08 -0800)]
[PATCH] remove set_page_count() outside mm/
set_page_count usage outside mm/ is limited to setting the refcount to 1.
Remove set_page_count from outside mm/, and replace those users with
init_page_count() and set_page_refcounted().
This allows more debug checking, and tighter control on how code is allowed
to play around with page->_count.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:35 +0000 (00:08 -0800)]
[PATCH] remove set_page_count(page, 0) users (outside mm)
A couple of places set_page_count(page, 1) that don't need to.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:34 +0000 (00:08 -0800)]
[PATCH] mm: nommu use compound pages
Now that compound page handling is properly fixed in the VM, move nommu
over to using compound pages rather than rolling their own refcounting.
nommu vm page refcounting is broken anyway, but there is no need to have
divergent code in the core VM now, nor when it gets fixed.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: David Howells <dhowells@redhat.com>
(Needs testing, please).
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:33 +0000 (00:08 -0800)]
[PATCH] mm: make __put_page internal
Remove __put_page from outside the core mm/. It is dangerous because it does
not handle compound pages nicely, and misses 1->0 transitions. If a user
later appears that really needs the extra speed we can reevaluate.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:33 +0000 (00:08 -0800)]
[PATCH] x86_64: pageattr remove __put_page
Remove page_count and __put_page from x86-64 pageattr
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:32 +0000 (00:08 -0800)]
[PATCH] x86_64: pageattr use single list
Use page->lru.next to implement the singly linked list of pages rather than
the struct deferred_page which needs to be allocated and freed for each
page.
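A minimal sketch of the technique, with cut-down stand-ins for struct page
and struct list_head and hypothetical helper names: the list link is threaded
through page->lru.next, so no separate node has to be allocated per page.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel types; only the fields used here. */
struct list_head {
	struct list_head *next, *prev;
};

struct page {
	unsigned long flags;
	struct list_head lru;
};

/* Head of the singly linked list of deferred pages. */
static struct page *deferred_pages;

/* Push a page, reusing lru.next as the "next page" pointer. */
static void defer_page(struct page *pg)
{
	pg->lru.next = (struct list_head *)deferred_pages;
	deferred_pages = pg;
}

/* Pop the most recently deferred page, or NULL if the list is empty. */
static struct page *pop_deferred(void)
{
	struct page *pg = deferred_pages;

	if (pg)
		deferred_pages = (struct page *)pg->lru.next;
	return pg;
}
```

This only works while the page is not on any real LRU list, since lru.next
is borrowed to hold a struct page pointer rather than a list_head pointer.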
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:31 +0000 (00:08 -0800)]
[PATCH] i386: pageattr remove __put_page
Stop using __put_page and page_count in i386 pageattr.c
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:30 +0000 (00:08 -0800)]
[PATCH] sg: use compound pages
sg increments the refcount of constituent pages in its higher order memory
allocations when they are about to be mapped by userspace. This is done so
the subsequent get_page/put_page when doing the mapping and unmapping does not
free the page.
Move over to the preferred way, that is, using compound pages instead. This
fixes a whole class of possible obscure bugs where a get_user_pages on a
constituent page may outlast the user mappings or even the driver.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Douglas Gilbert <dougg@torque.net>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Hugh Dickins [Wed, 22 Mar 2006 08:08:29 +0000 (00:08 -0800)]
[PATCH] remove VM_DONTCOPY bogosities
Now that it's madvisable, remove two pieces of VM_DONTCOPY bogosity:
1. There was and is no logical reason why VM_DONTCOPY should be in the
list of flags which forbid vma merging (and those drivers which set
it are also setting VM_IO, which itself forbids the merge).
2. It's hard to understand the purpose of the VM_HUGETLB, VM_DONTCOPY
block in vm_stat_account: but never mind, it's under CONFIG_HUGETLB,
which (unlike CONFIG_HUGETLB_PAGE or CONFIG_HUGETLBFS) has never been
defined.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Wu Fengguang [Wed, 22 Mar 2006 08:08:28 +0000 (00:08 -0800)]
[PATCH] mm: shrink_inactive_list() nr_scan accounting fix
In shrink_inactive_list(), nr_scan is not accounted when nr_taken is 0.
But 0 pages taken does not mean 0 pages scanned.
Move the goto statement below the accounting code to fix it.
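The shape of the fix can be sketched as follows. This is a simplified model,
not the vmscan code: the zone struct, the function names and the way nr_taken
and nr_scan are passed in are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical cut-down zone with only the scan counter. */
struct zone_sketch {
	unsigned long pages_scanned;
};

/* Before: bailing out early skips the scan accounting. */
static unsigned long shrink_buggy(struct zone_sketch *zone,
				  unsigned long nr_taken, unsigned long nr_scan)
{
	unsigned long nr_reclaimed = 0;

	if (nr_taken == 0)
		goto done;
	zone->pages_scanned += nr_scan;
	nr_reclaimed = nr_taken; /* stand-in for the real reclaim work */
done:
	return nr_reclaimed;
}

/* After: account the scan first, then bail out if nothing was taken. */
static unsigned long shrink_fixed(struct zone_sketch *zone,
				  unsigned long nr_taken, unsigned long nr_scan)
{
	unsigned long nr_reclaimed = 0;

	zone->pages_scanned += nr_scan;
	if (nr_taken == 0)
		goto done;
	nr_reclaimed = nr_taken; /* stand-in for the real reclaim work */
done:
	return nr_reclaimed;
}
```

When 32 pages are scanned but none taken, the first variant leaves
pages_scanned untouched while the second records all 32.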
Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Wu Fengguang [Wed, 22 Mar 2006 08:08:23 +0000 (00:08 -0800)]
[PATCH] mm: isolate_lru_pages() scan count fix
In isolate_lru_pages(), *scanned reports one more scan because the scan
counter is increased one more time on exit of the while-loop.
Change the while-loop to for-loop to fix it.
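To see where the extra count comes from, here is a sketch that assumes the
old loop condition post-incremented the counter (the list is modelled as a
plain length, and both function names are hypothetical):

```c
#include <assert.h>

/* While-loop form: the failing final test still increments scan. */
static unsigned long scan_while(unsigned long nr_to_scan,
				unsigned long list_len, unsigned long *scanned)
{
	unsigned long scan = 0, taken = 0;

	while (scan++ < nr_to_scan && list_len > 0) {
		list_len--;
		taken++;
	}
	*scanned = scan;
	return taken;
}

/* For-loop form: scan is incremented only after a completed iteration. */
static unsigned long scan_for(unsigned long nr_to_scan,
			      unsigned long list_len, unsigned long *scanned)
{
	unsigned long scan, taken = 0;

	for (scan = 0; scan < nr_to_scan && list_len > 0; scan++) {
		list_len--;
		taken++;
	}
	*scanned = scan;
	return taken;
}
```

Asking for 4 pages from a list of 10 reports 5 scanned in the while-loop form
and 4 in the for-loop form; both take 4 pages.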
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Christoph Lameter [Wed, 22 Mar 2006 08:08:22 +0000 (00:08 -0800)]
[PATCH] zone_reclaim: additional comments and cleanup
Add some comments to explain how zone reclaim works, and fix the
following issues:
- PF_SWAPWRITE needs to be set for RECLAIM_SWAP to be able to write
out pages to swap. Currently RECLAIM_SWAP may not do that.
- remove setting nr_reclaimed pages after slab reclaim since the slab shrinking
code does not use that and the nr_reclaimed pages is just right for the
intended follow up action.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 22 Mar 2006 08:08:21 +0000 (00:08 -0800)]
[PATCH] vmscan: rename functions
We have:
try_to_free_pages
->shrink_caches(struct zone **zones, ..)
  ->shrink_zone(struct zone *, ...)
    ->shrink_cache(struct zone *, ...)
      ->shrink_list(struct list_head *, ...)
    ->refill_inactive_list(struct zone *, ...)
which is fairly irrational.
Rename things so that we have
try_to_free_pages
->shrink_zones(struct zone **zones, ..)
  ->shrink_zone(struct zone *, ...)
    ->shrink_inactive_list(struct zone *, ...)
      ->shrink_page_list(struct list_head *, ...)
    ->shrink_active_list(struct zone *, ...)
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 22 Mar 2006 08:08:20 +0000 (00:08 -0800)]
[PATCH] vmscan return nr_reclaimed
Change all the vmscan functions to return the number-of-reclaimed pages and
remove scan_control.nr_reclaimed.
Saves ten-odd bytes of text and makes things clearer and more consistent.
The patch also changes the behaviour of zone_reclaim() when it falls back to
slab shrinking. Christoph says:
"Setting this to one means that we will rescan and shrink the slab for
each allocation if we are out of zone memory and RECLAIM_SLAB is set. Plus
if we do an order 0 allocation we do not go off node as intended.
"We better set this to zero. This means the allocation will go offnode
despite us having potentially freed lots of memory on the zone. Future
allocations can then again be done from this zone."
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 22 Mar 2006 08:08:19 +0000 (00:08 -0800)]
[PATCH] vmscan: use unsigned longs
Turn basically everything in vmscan.c into `unsigned long'. This is to avoid
the possibility that some piece of code in there might decide to operate upon
more than 4G (or even 2G) of pages in one hit.
This might be silly, but we'll need it one day.
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 22 Mar 2006 08:08:18 +0000 (00:08 -0800)]
[PATCH] vmscan: scan_control cleanup
Initialise as much of scan_control as possible at the declaration site. This
tidies things up a bit and assures us that all unmentioned fields are zeroed
out.
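The zeroing guarantee comes from C's initializer rules: members not named in
an initializer list are zero-initialised. A small sketch, with a hypothetical
cut-down struct and made-up field values:

```c
#include <assert.h>

/* Hypothetical cut-down scan_control; the field set is illustrative only. */
struct scan_control_sketch {
	unsigned int gfp_mask;
	int may_writepage;
	int may_swap;
	int swap_cluster_max;
	unsigned long nr_mapped;
};

static struct scan_control_sketch make_scan_control(unsigned int gfp_mask)
{
	/* Fields not mentioned here (may_writepage, nr_mapped) start as 0. */
	struct scan_control_sketch sc = {
		.gfp_mask = gfp_mask,
		.may_swap = 1,
		.swap_cluster_max = 32,
	};

	return sc;
}
```

Compared with declaring the struct and assigning fields one by one, this
leaves no window in which a forgotten field holds stack garbage.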
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Christoph Lameter [Wed, 22 Mar 2006 08:08:18 +0000 (00:08 -0800)]
[PATCH] Thin out scan_control: remove nr_to_scan and priority
Make nr_to_scan and priority a parameter instead of putting it into scan
control. This allows various small optimizations and IMHO makes the code
easier to read.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 22 Mar 2006 08:08:17 +0000 (00:08 -0800)]
[PATCH] slab: use on_each_cpu()
Slab duplicates on_each_cpu().
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 22 Mar 2006 08:08:16 +0000 (00:08 -0800)]
[PATCH] on_each_cpu(): disable local interrupts
When on_each_cpu() runs the callback on other CPUs, it runs with local
interrupts disabled. So we should run the function with local interrupts
disabled on this CPU, too.
And do the same for UP, so the callback is run in the same environment on both
UP and SMP. (strictly it should do preempt_disable() too, but I think
local_irq_disable is sufficiently equivalent).
Also uninlines on_each_cpu(). softirq.c was the most appropriate file I could
find, but it doesn't seem to justify creating a new file.
Oh, and fix up that comment over (under?) x86's smp_call_function(). It
drives me nuts.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Christoph Lameter [Wed, 22 Mar 2006 08:08:15 +0000 (00:08 -0800)]
[PATCH] slab: Remove SLAB_NO_REAP option
SLAB_NO_REAP is documented as an option that will cause this slab not to be
reaped under memory pressure. However, that is not what happens. The only
thing that SLAB_NO_REAP controls at the moment is the reclaim of the unused
slab elements that were allocated in batch in cache_reap(). Cache_reap()
is run every few seconds independently of memory pressure.
Could we remove the whole thing? It's only used by three slabs anyway and
I cannot find a reason for having this option.
There is an additional problem with SLAB_NO_REAP. If set then the recovery
of objects from alien caches is switched off. Objects not freed on the
same node where they were initially allocated will only be reused if a
certain amount of objects accumulates from one alien node (not very likely)
or if the cache is explicitly shrunk. (Strangely __cache_shrink does not
check for SLAB_NO_REAP)
Getting rid of SLAB_NO_REAP fixes the problems with alien cache freeing.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Randy Dunlap [Wed, 22 Mar 2006 08:08:14 +0000 (00:08 -0800)]
[PATCH] slab: fix kernel-doc warnings
Fix kernel-doc warnings in mm/slab.c.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pekka Enberg [Wed, 22 Mar 2006 08:08:13 +0000 (00:08 -0800)]
[PATCH] mm: kill kmem_cache_t usage
We have struct kmem_cache now so use it instead of the old typedef.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ravikiran G Thirumalai [Wed, 22 Mar 2006 08:08:12 +0000 (00:08 -0800)]
[PATCH] slab: remove cachep->spinlock
Remove cachep->spinlock. Locking has moved to the kmem_list3 and most of
the structures protected earlier by cachep->spinlock are now protected by
the l3->list_lock. Slab cache tunables like batchcount are always accessed
with the cache_chain_mutex held.
Patch tested on SMP and NUMA kernels with dbench processes running,
constant onlining/offlining, and constant cache tuning, all at the same
time.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Cc: Christoph Lameter <christoph@lameter.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 22 Mar 2006 08:08:11 +0000 (00:08 -0800)]
[PATCH] slab cleanup
slab.c has become a bit revolting again. Try to repair it.
- Coding style fixes
- Don't do assignments-in-if-statements.
- Don't typecast assignments to/from void*
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pekka Enberg [Wed, 22 Mar 2006 08:08:11 +0000 (00:08 -0800)]
[PATCH] slab: extract setup_cpu_cache
Extract setup_cpu_cache() function from kmem_cache_create() to make the
latter a little less complex.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pekka Enberg [Wed, 22 Mar 2006 08:08:10 +0000 (00:08 -0800)]
[PATCH] slab: object to index mapping cleanup
Clean up the object to index mapping that has been spread around mm/slab.c.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adrian Bunk [Wed, 22 Mar 2006 08:08:09 +0000 (00:08 -0800)]
[PATCH] kcalloc(): INT_MAX -> ULONG_MAX
Since size_t has the same size as a long on all architectures, it's enough
for overflow checks to check against ULONG_MAX.
This change could allow a compiler better optimization (especially in the
n=1 case).
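A userspace sketch of the overflow check being described, using malloc() and
memset() in place of the kernel allocator (the function name is hypothetical,
and the sketch assumes size_t is no wider than unsigned long, as the message
states for the kernel's architectures):

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Allocate a zeroed array of n elements, refusing if n * size would wrap. */
static void *kcalloc_sketch(size_t n, size_t size)
{
	void *p;

	if (n != 0 && size > ULONG_MAX / n)
		return NULL;
	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}
```

When n is the constant 1, the test reduces to size > ULONG_MAX, which can
never be true for an unsigned long sized value, so a compiler is free to drop
the check entirely.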
The practical effect seems to be positive, but quite small:
text data bss dec hex filename
21762380 5859870 1848928 29471178 1c1b1ca vmlinux-old
21762211 5859870 1848928 29471009 1c1b121 vmlinux-patched
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:08 +0000 (00:08 -0800)]
[PATCH] hugepage allocator cleanup
Insert "fresh" huge pages into the hugepage allocator by the same means as
they are freed back into it. This reduces code size and allows
enqueue_huge_page to be inlined into the hugepage free fastpath.
Eliminate occurrences of hugepages on the free list with non-zero refcount.
This can allow stricter refcount checks in future. Also required for
lockless pagecache.
Signed-off-by: Nick Piggin <npiggin@suse.de>
"This patch also eliminates a leak "cleaned up" by re-clobbering the
refcount on every allocation from the hugepage freelists. With respect to
the lockless pagecache, the crucial aspect is to eliminate unconditional
set_page_count() to 0 on pages with potentially nonzero refcounts, though
closer inspection suggests the assignments removed are entirely spurious."
Acked-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:07 +0000 (00:08 -0800)]
[PATCH] mm: cleanup bootmem
The bootmem code added to page_alloc.c duplicated some page freeing code
that it really doesn't need to because it is not so performance critical.
While we're here, make prefetching work properly by actually prefetching
the page we're about to use before prefetching ahead to the next one (ie.
get the most important transaction started first). Also prefetch just a
single page ahead rather than leaving a gap of 16.
Jack Steiner reported no problems with SGI's ia64 simulator.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:06 +0000 (00:08 -0800)]
[PATCH] mm: page_state comment more
Clarify that preemption needs to be guarded against with the
__xxx_page_state functions.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:05 +0000 (00:08 -0800)]
[PATCH] mm: split highorder pages
Have an explicit mm call to split higher order pages into individual pages.
Should help to avoid bugs and be more explicit about the code's intention.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:04 +0000 (00:08 -0800)]
[PATCH] xtensa: pgtable fixes
- Don't return uninitialised stack values in case of allocation failure
- Don't bother clearing PageCompound because __GFP_COMP wasn't specified
- Increment over the pte page rather than one pte entry in
  pte_alloc_one_kernel
- Actually increment the page pointer in pte_alloc_one
- Compile fixes, typos.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:03 +0000 (00:08 -0800)]
[PATCH] mm: de-skew page refcounting
atomic_add_unless (atomic_inc_not_zero) no longer requires an offset refcount
to function correctly.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:03 +0000 (00:08 -0800)]
[PATCH] mm: simplify vmscan vs release refcounting
The VM has an interesting race where a page refcount can drop to zero, but it
is still on the LRU lists for a short time. This was solved by testing a 0->1
refcount transition when picking up pages from the LRU, and dropping the
refcount in that case.
Instead, use atomic_add_unless to ensure we never pick up a 0 refcount page
from the LRU, thus a 0 refcount page will never have its refcount elevated
until it is allocated again.
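The "increment unless zero" operation can be sketched in userspace with C11
atomics. This is an illustration of the technique rather than the kernel's
atomic_add_unless(); the struct and function names are hypothetical, and the
count here is a plain refcount where 0 means the page is being freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical cut-down page with only a reference count. */
struct page_sketch {
	atomic_int count;
};

/*
 * Take a reference only if the count is not already zero. Returns true
 * if the reference was taken, false if the page was already released.
 */
static bool get_page_unless_zero_sketch(struct page_sketch *page)
{
	int old = atomic_load(&page->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&page->count, &old, old + 1))
			return true;
	}
	return false;
}
```

Because the increment and the zero test happen as one atomic step, a page
whose count has reached zero is never briefly resurrected by the scanner.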
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:02 +0000 (00:08 -0800)]
[PATCH] mm: slab less atomics
Atomic operation removal from slab
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:01 +0000 (00:08 -0800)]
[PATCH] mm: page_alloc less atomics
More atomic operation removal from page allocator
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:00 +0000 (00:08 -0800)]
[PATCH] mm: less atomic ops
In the page release paths, we can be sure that nobody will mess with our
page->flags because the refcount has dropped to 0. So no need for atomic
operations here.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:08:00 +0000 (00:08 -0800)]
[PATCH] mm: PageActive no testset
PG_active is protected by zone->lru_lock, it does not need TestSet/TestClear
operations.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:07:59 +0000 (00:07 -0800)]
[PATCH] mm: PageLRU no testset
PG_lru is protected by zone->lru_lock. It does not need TestSet/TestClear
operations.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 22 Mar 2006 08:07:58 +0000 (00:07 -0800)]
[PATCH] mm: never ClearPageLRU released pages
If vmscan finds a zero refcount page on the lru list, never ClearPageLRU
it. This means the release code need not hold ->lru_lock to stabilise
PageLRU, so that lock may be skipped entirely when releasing !PageLRU pages
(because we know PageLRU won't have been temporarily cleared by vmscan,
which was previously guaranteed by holding the lock to synchronise against
vmscan).
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Christoph Hellwig [Wed, 22 Mar 2006 08:07:57 +0000 (00:07 -0800)]
[PATCH] mm: remove set_pgdir leftovers
set_pgdir hasn't been needed for a very long time. Remove the leftover
implementation on sh64 and the stub on s390.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Richard Curnow <rc@rc0.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dmitry Torokhov [Wed, 22 Mar 2006 08:07:56 +0000 (00:07 -0800)]
[PATCH] dcdbas: convert to the new platform device interface
Do not use platform_device_register_simple() as it is going away, define
dcdbas_driver and implement ->probe() and ->remove() functions so manual
binding and unbinding will work with this driver.
Also switch to using attribute_group when creating sysfs attributes and
make sure to check and handle errors; explicitly remove attributes when
detaching driver.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dmitry Torokhov [Wed, 22 Mar 2006 08:07:55 +0000 (00:07 -0800)]
[PATCH] tb0219: convert to the new platform device interface
Do not use platform_device_register_simple() as it is going away.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dmitry Torokhov [Wed, 22 Mar 2006 08:07:54 +0000 (00:07 -0800)]
[PATCH] mv64x600_wdt: convert to the new platform device interface
Do not use platform_device_register_simple() as it is going away.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dmitry Torokhov [Wed, 22 Mar 2006 08:07:53 +0000 (00:07 -0800)]
[PATCH] vr41xx: convert to the new platform device interface
The patch does the following for vr41xx series drivers:
- stop using platform_device_register_simple() as it is going away
- mark ->probe() and ->remove() methods as __devinit and __devexit
respectively
- initialize "owner" field in driver structure so there is a link
from /sys/modules to the driver
- mark *_init() and *_exit() functions as __init and __exit
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 22 Mar 2006 08:07:46 +0000 (00:07 -0800)]
[PATCH] multiple exports of strpbrk
Sam's tree includes a new check, which found that we're exporting strpbrk()
multiple times.
It seems that the convention is that this is exported from the arch files, so
remove the lib/string.c export.
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: David Howells <dhowells@redhat.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Atsushi Nemoto [Wed, 22 Mar 2006 08:07:45 +0000 (00:07 -0800)]
[PATCH] serial: serial_txx9 driver update
Update the serial_txx9 driver.
* More strict check in verify_port. Cleanup.
* Do not insert a char caused by a previous overrun.
* Fix some spin_locks.
* Do not call uart_add_one_port for absent ports.
Also, this patch removes a BROKEN tag from Kconfig. This driver was
marked as BROKEN after the removal of uart_register_port, but that problem
was already solved in Sep 2005.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Joe Korty [Wed, 22 Mar 2006 08:07:43 +0000 (00:07 -0800)]
[PATCH] rtc.h broke strace(1) builds
Git patch
52dfa9a64cfb3dd01fa1ee1150d589481e54e28e
[PATCH] move rtc_interrupt() prototype to rtc.h
broke strace(1) builds. The below moves the kernel-only additions lower,
under the already provided #ifdef __KERNEL__ statement.
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alasdair G Kergon [Wed, 22 Mar 2006 08:07:42 +0000 (00:07 -0800)]
[PATCH] dm: bio split bvec fix
The code that handles bios that span table target boundaries by breaking
them up into smaller bios will not split an individual struct bio_vec into
more than two pieces. Sometimes more than that are required.
This patch adds a loop to break the second piece up into as many pieces as
are necessary.
Cc: "Abhishek Gupta" <abhishekgupt@gmail.com>
Cc: Dan Smith <danms@us.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
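The loop described in the dm commit above can be sketched in plain C. This is only an illustration under simplifying assumptions: boundaries fall at fixed multiples of 'boundary', and the names (piece, split_range) are hypothetical, not device-mapper APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical piece descriptor: a sub-range of one segment. */
struct piece {
	size_t start;
	size_t len;
};

/*
 * Split the byte range [start, start + len) at every multiple of
 * 'boundary'. Stores at most 'max' pieces in 'out' and returns the
 * number of pieces the range needs.
 */
static size_t split_range(size_t start, size_t len, size_t boundary,
			  struct piece *out, size_t max)
{
	size_t n = 0;

	assert(boundary > 0);
	while (len > 0) {
		/* Bytes left before the next boundary. */
		size_t room = boundary - (start % boundary);
		size_t take = len < room ? len : room;

		if (n < max) {
			out[n].start = start;
			out[n].len = take;
		}
		n++;
		start += take;
		len -= take;
	}
	return n;
}
```

For example, a 10-byte range starting at offset 6 with a boundary every 4 bytes needs three pieces, which a split into at most two pieces cannot express.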
Eric W. Biederman [Wed, 22 Mar 2006 08:07:40 +0000 (00:07 -0800)]
[PATCH] unshare: Error if passed unsupported flags
A bare bones trivial patch to ensure we always get -EINVAL on the
unsupported cases for sys_unshare. If this goes in before 2.6.16 it allows
us to be forward compatible with future applications using sys_unshare.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: JANAK DESAI <janak@us.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
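A minimal sketch of the kind of check the unshare commit above describes: reject any flag bit outside the supported set with -EINVAL. The flag names and values here are hypothetical placeholders, not the real CLONE_* constants, and check_flags is not the kernel function.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag bits; the real flags live in the kernel headers. */
#define SK_FLAG_A	0x1UL
#define SK_FLAG_B	0x2UL
#define SK_SUPPORTED	(SK_FLAG_A | SK_FLAG_B)

/* Return 0 if every requested bit is supported, -EINVAL otherwise. */
static int check_flags(unsigned long flags)
{
	if (flags & ~SK_SUPPORTED)
		return -EINVAL;
	return 0;
}
```

Failing on unknown bits today means an application that passes a future flag to an old kernel gets a clear error instead of a silent no-op.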
Andrew Morton [Wed, 22 Mar 2006 08:07:39 +0000 (00:07 -0800)]
[PATCH] __get_page_state() cpumask cleanup and fix
__get_page_state() has an open-coded for_each_cpu_mask() loop in it.
Tidy that up, then notice that the code was buggy:
	while (cpu < NR_CPUS) {
		unsigned long *in, *out, off;
		if (!cpu_isset(cpu, *cpumask))
			continue;
an obvious infinite loop. I guess we just never call it with a holey cpu
mask.
Even after my cpumask size-reduction work, this patch increases code size :(
Cc: Paul Jackson <pj@sgi.com>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
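The quoted snippet in the commit above is truncated, so the position of the increment is not visible; presumably it sat after the `continue`, which would never advance `cpu` for a CPU missing from the mask. The sketch below shows the safe shape with a plain bitmask. NR_CPUS_SKETCH and sum_masked are hypothetical names, and this is not the kernel's for_each_cpu_mask() implementation.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 8	/* hypothetical CPU count for the sketch */

/*
 * Sum the per-CPU counters of the CPUs set in 'mask'. Keeping the
 * increment in the for-loop header means 'continue' cannot skip it.
 */
static unsigned long sum_masked(const unsigned long *percpu,
				unsigned long mask)
{
	unsigned long total = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (!(mask & (1UL << cpu)))
			continue;	/* safe: cpu++ still runs */
		total += percpu[cpu];
	}
	return total;
}
```

A mask with holes, such as 0x05, now terminates normally and only counts CPUs 0 and 2.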
Ravikiran G Thirumalai [Wed, 22 Mar 2006 08:07:38 +0000 (00:07 -0800)]
[PATCH] x86: mark cyc2ns_scale readmostly
This variable is rarely written to. Mark the variable accordingly.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Latchesar Ionkov [Wed, 22 Mar 2006 08:07:37 +0000 (00:07 -0800)]
[PATCH] v9fs: assign dentry ops to negative dentries
If a file is not found in v9fs_vfs_lookup, the function creates negative
dentry, but doesn't assign any dentry ops. This leaves the negative entry
in the cache (there is no d_delete to mark it for removal). If the file is
created outside of the mounted v9fs filesystem, the file shows up in the
directory with weird permissions.
This patch assigns the default v9fs dentry ops to the negative dentry.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Antonino A. Daplas [Wed, 22 Mar 2006 08:07:36 +0000 (00:07 -0800)]
[PATCH] i810fb_cursor(): use GFP_ATOMIC
The console cursor can be called in atomic context. Change memory
allocation to use the GFP_ATOMIC flag in i810fb_cursor().
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 22 Mar 2006 08:07:35 +0000 (00:07 -0800)]
[PATCH] efi_call_phys_epilog() warning fix
arch/i386/kernel/efi.c: In function `efi_call_phys_epilog':
arch/i386/kernel/efi.c:118: warning: assignment makes integer from pointer without a cast
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Herbert Poetzl [Wed, 22 Mar 2006 08:07:34 +0000 (00:07 -0800)]
[PATCH] don't call check_acpi_pci() on x86 with ACPI disabled
check_acpi_pci() is called from arch/i386/kernel/setup.c even if
CONFIG_ACPI is not defined, but the code in include/asm/acpi.h doesn't
provide it in this case.
Signed-off-by: Herbert Pötzl <herbert@13thfloor.at>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Mike Galbraith [Wed, 22 Mar 2006 08:07:33 +0000 (00:07 -0800)]
[PATCH] sched: remove sleep_avg multiplier
Remove the sleep_avg multiplier. This multiplier was necessary back when
we had 10 seconds of dynamic range in sleep_avg, but now that we only have
one second, it causes that one second to be compressed down to 100ms in
some cases. This is particularly noticeable when compiling a kernel in a
slow NFS mount, and I believe it to be a very likely candidate for other
recently reported network related interactivity problems.
In testing, I can detect no negative impact of this removal.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Tue, 21 Mar 2006 23:58:17 +0000 (15:58 -0800)]
Merge branch 'release' of git://git./linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64-SGI] SN2-XP reduce kmalloc wrapper inlining
[IA64] MCA: remove obsolete ifdef
[IA64] MCA: update MCA comm field for user space tasks
[IA64] MCA: print messages in MCA handler
[IA64-SGI] - Eliminate SN pio_phys_xxx macros. Move to assembly
[IA64] use icc defined constant
[IA64] add __builtin_trap definition for icc build
[IA64] clean up asm/intel_intrin.h
[IA64] map ia64_hint definition to intel compiler intrinsic
[IA64] hooks to wait for mmio writes to drain when migrating processes
[IA64-SGI] driver bugfixes and hardware workarounds for CE1.0 asic
[IA64-SGI] Handle SC env. powerdown events
[IA64] Delete MCA/INIT sigdelayed code
[IA64-SGI] sem2mutex ioc4.c
[IA64] implement ia64 specific mutex primitives
[IA64] Fix UP build with BSP removal support.
[IA64] support for cpu0 removal
Linus Torvalds [Tue, 21 Mar 2006 22:51:37 +0000 (14:51 -0800)]
Revert "V4L/DVB (3543): Fix Makefile to adapt to bt8xx/ conversion"
This reverts commit
08f1d0b99f4e2203935d86640a7fec5c233b777c
The "bt8xx/ conversion" for drivers/video/ hasn't actually percolated
all the way to this tree, so the Makefile change escaped too soon.
Build breakage noticed by Jeff Garzik <jeff@garzik.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Garzik [Tue, 21 Mar 2006 21:40:51 +0000 (16:40 -0500)]
Merge branch 'e1000-fixes' of git://198.78.49.142/~jbrandeb/linux-2.6
Jeff Garzik [Tue, 21 Mar 2006 21:22:47 +0000 (16:22 -0500)]
[netdrvr] pcnet32: other source formatting cleanups
- undo some Lindent damage by indenting member names
- remove history at top of .c file, this is stored in the kernel
repo changelog (in greater detail, even).
Jeff Garzik [Tue, 21 Mar 2006 21:15:44 +0000 (16:15 -0500)]
[netdrvr] pcnet32: Lindent
Andrew Morton [Fri, 17 Mar 2006 07:58:44 +0000 (23:58 -0800)]
[PATCH] skfp warning fixes
drivers/net/skfp/fplustm.c: In function `enable_formac':
drivers/net/skfp/fplustm.c:552: warning: large integer implicitly truncated to unsigned type
drivers/net/skfp/fplustm.c:555: warning: large integer implicitly truncated to unsigned type
These arguments were changed to `const', so the compiler can now see that it's
doing an outw(..., 0xffffnnnn). Cast the arg to ushort.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Nicolas Pitre [Mon, 20 Mar 2006 16:54:27 +0000 (11:54 -0500)]
[PATCH] smc91x: allow for dynamic bus access configs
All of the accessors' different methods are now selected with C code and unused
ones statically optimized away at compile time instead of being selected
with #if's and #ifdef's. This has many advantages such as allowing the
compiler to validate the syntax of the whole code, making it cleaner and
easier to understand, and ultimately allowing people to define
configuration symbols in terms of variables if they really want to
dynamically support multiple bus configurations at the same time (with
the unavoidable performance cost).
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
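The smc91x commit above replaces preprocessor selection with ordinary C conditionals on constant symbols. A small sketch of that pattern follows; the SK_CAN_USE_* symbols and sk_bus_accesses are hypothetical and are not taken from the driver.

```c
#include <assert.h>

/*
 * Hypothetical configuration symbols. With constant values the compiler
 * can discard the untaken branches, yet every branch is still parsed
 * and type-checked. Defining a symbol as a variable instead would turn
 * the choice into a run-time decision.
 */
#define SK_CAN_USE_32BIT 0
#define SK_CAN_USE_16BIT 1

/* Number of bus accesses needed to move 'bytes' bytes. */
static unsigned int sk_bus_accesses(unsigned int bytes)
{
	if (SK_CAN_USE_32BIT)
		return (bytes + 3) / 4;
	else if (SK_CAN_USE_16BIT)
		return (bytes + 1) / 2;
	else
		return bytes;
}
```

With #if/#ifdef, a syntax error in a disabled branch goes unnoticed until some configuration enables it; here it fails the build for everyone.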
Don Fry [Mon, 20 Mar 2006 23:26:03 +0000 (15:26 -0800)]
[PATCH] pcnet32: support boards with multiple phys
Boards with multiple PHYs were not being handled properly by the pcnet32
driver. This patch by Thomas Bogendoerfer with changes by me will allow
Allied Telesyn 2700FTX and 2701FTX boards to use either the copper or
the fiber interfaces. It has been tested on ia32 and ppc64 hardware.
Philippe Seewer also tested and improved the patch.
ethtool for pcnet32 already supports multiple phys.
See also bugzilla bug 4219.
Please apply to 2.6.16
Signed-off-by: Don Fry <brazilnut@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Mon, 20 Mar 2006 23:48:23 +0000 (15:48 -0800)]
[PATCH] sky2 version 1.1
Set version to 1.1
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Mon, 20 Mar 2006 23:48:22 +0000 (15:48 -0800)]
[PATCH] sky2: handle all error irqs
The hardware has additional error trap interrupt bits. I have never seen
them trigger, but if they do, it looks like this might be useful.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Mon, 20 Mar 2006 23:48:21 +0000 (15:48 -0800)]
[PATCH] sky2: transmit recovery
This patch decodes state and recovers from any races in the transmit
timeout and NAPI logic. It should never trigger, but if it does then
do the right thing.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Mon, 20 Mar 2006 23:48:20 +0000 (15:48 -0800)]
[PATCH] sky2: whitespace fixes
Small whitespace fixes.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Mon, 20 Mar 2006 23:48:19 +0000 (15:48 -0800)]
[PATCH] sky2: add MSI support
Add MSI support to sky2 driver.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Mon, 20 Mar 2006 23:48:18 +0000 (15:48 -0800)]
[PATCH] sky2: coalescing parameters
Change default coalescing parameters slightly, and allow wider
range of values.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Mon, 20 Mar 2006 23:48:17 +0000 (15:48 -0800)]
[PATCH] sky2: rework of NAPI and IRQ management
Redo the interrupt handling of sky2 driver based on the IRQ management
documentation. All interrupts are handled by the device0 NAPI poll
routine.
Don't need to adjust interrupt mask in IRQ context, done only when
changing device under RTNL. Therefore don't need hwlock anymore.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Mon, 20 Mar 2006 23:48:16 +0000 (15:48 -0800)]
[PATCH] sky2: drop broken wake on lan support
Remove wake on lan support for now. It doesn't work right, and I
don't have a machine with working suspend/resume to test or fix it.
It will be re-enabled later.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Mon, 20 Mar 2006 23:48:15 +0000 (15:48 -0800)]
[PATCH] sky2: remove support for untested Yukon EC/rev 0
The Yukon EC/rev0 (A1) chipset requires a bunch of workarounds. I copied these
from sk98lin, but they never got tested and add more cruft to the code;
any attempt at using the driver as is on this version will probably fail.
It looks like this was an early engineering sample chip revision. If it ever
shows up on a real system, produce an error message.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Dale Farnsworth [Tue, 21 Mar 2006 18:44:35 +0000 (11:44 -0700)]
[PATCH] mv643xx_eth: Cache align skb->data if CONFIG_NOT_COHERENT_CACHE
When I/O is non-cache-coherent, we need to ensure that the I/O buffers
we use don't share cache lines with other data.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
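One common way to keep a buffer from sharing cache lines is to start it on a line boundary. The sketch below only shows the padding arithmetic; the 32-byte line size and the name sk_align_pad are assumptions for illustration and do not come from the mv643xx_eth driver.

```c
#include <assert.h>
#include <stdint.h>

#define SK_CACHE_LINE 32u	/* hypothetical line size; power of two */

/* Bytes to skip so that 'addr' lands on a cache-line boundary. */
static uintptr_t sk_align_pad(uintptr_t addr)
{
	uintptr_t mask = SK_CACHE_LINE - 1;

	/* The outer mask turns a full-line pad into zero. */
	return (SK_CACHE_LINE - (addr & mask)) & mask;
}
```

An address that is already aligned needs no padding, while one that is 4 bytes into a line needs 28.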
Stephen Hemminger [Tue, 21 Mar 2006 18:57:07 +0000 (10:57 -0800)]
[PATCH] skge: version 1.4
Update version number
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Tue, 21 Mar 2006 18:57:06 +0000 (10:57 -0800)]
[PATCH] skge: handle pci errors better
When a PCI error occurs, try and report more info.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Tue, 21 Mar 2006 18:57:05 +0000 (10:57 -0800)]
[PATCH] skge: formatting and whitespace cleanup
Reformat some code to make it easier to read. And whitespace
fixes.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Tue, 21 Mar 2006 18:57:04 +0000 (10:57 -0800)]
[PATCH] skge: use mmiowb
Add mmio barriers at the appropriate places. I don't have a platform
that needs them, but this is where the documentation of the patch
says to add them.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Tue, 21 Mar 2006 18:57:03 +0000 (10:57 -0800)]
[PATCH] skge: use kcalloc
Use kcalloc when allocating ring data structure.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Tue, 21 Mar 2006 18:57:02 +0000 (10:57 -0800)]
[PATCH] skge: dma configuration cleanup
Cleanup of the part of the code that sets up DMA configuration.
Should cause no real change in operation, just clearer.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Tue, 21 Mar 2006 18:57:01 +0000 (10:57 -0800)]
[PATCH] skge: check the allocation of ring buffer
The SysKonnect Genesis and Yukon chip sets have restrictions on the possible
control block area. The memory needs to not cross 4 Gig boundary, and it needs
to be 8 byte aligned. This patch checks and fails to bring the device up
if region is unacceptable.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
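The two restrictions named in the commit above (8-byte alignment, no crossing of a 4 Gig boundary) can be checked with simple address arithmetic. This is a sketch of one way to express the test, with the hypothetical name sk_ring_ok; it is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if a region of 'size' bytes at bus address 'base' is
 * 8-byte aligned and lies entirely within one 4 GiB window.
 */
static bool sk_ring_ok(uint64_t base, uint64_t size)
{
	assert(size > 0);

	if (base & 7)
		return false;	/* not 8-byte aligned */

	/* First and last byte must share the same upper 32 bits. */
	return (base >> 32) == ((base + size - 1) >> 32);
}
```

Comparing the upper 32 bits of the first and last byte avoids an off-by-one when a region ends exactly at a 4 GiB boundary.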
Stephen Hemminger [Tue, 21 Mar 2006 18:57:00 +0000 (10:57 -0800)]
[PATCH] skge: use auto masking of irqs
Improve performance of skge driver by not touching irq mask
register as much. Since the interrupt source auto-masks, the driver
can just leave it disabled until the end of the soft irq.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Tue, 21 Mar 2006 18:56:59 +0000 (10:56 -0800)]
[PATCH] skge: use NAPI for tx cleanup.
Cleanup transmit buffers using NAPI. This allows the transmit routine
to leave interrupts enabled, and that improves performance.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Linus Torvalds [Tue, 21 Mar 2006 17:33:19 +0000 (09:33 -0800)]
Merge /pub/scm/linux/kernel/git/herbert/crypto-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6:
[CRYPTO] aes: Fixed array boundary violation
[CRYPTO] tcrypt: Fix key alignment
[CRYPTO] all: Add missing cra_alignmask
[CRYPTO] all: Use kzalloc where possible
[CRYPTO] api: Align tfm context as wide as possible
[CRYPTO] twofish: Use rol32/ror32 where appropriate
Linus Torvalds [Tue, 21 Mar 2006 17:31:48 +0000 (09:31 -0800)]
Merge /pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (235 commits)
[NETFILTER]: Add H.323 conntrack/NAT helper
[TG3]: Don't mark tg3_test_registers() as returning const.
[IPV6]: Cleanups for net/ipv6/addrconf.c (kzalloc, early exit) v2
[IPV6]: Nearly complete kzalloc cleanup for net/ipv6
[IPV6]: Cleanup of net/ipv6/reassembly.c
[BRIDGE]: Remove duplicate const from is_link_local() argument type.
[DECNET]: net/decnet/dn_route.c: fix inconsequent NULL checking
[TG3]: make drivers/net/tg3.c:tg3_request_irq() static
[BRIDGE]: use LLC to send STP
[LLC]: llc_mac_hdr_init const arguments
[BRIDGE]: allow show/store of group multicast address
[BRIDGE]: use llc for receiving STP packets
[BRIDGE]: stp timer to jiffies cleanup
[BRIDGE]: forwarding remove unneeded preempt and bh disables
[BRIDGE]: netfilter inline cleanup
[BRIDGE]: netfilter VLAN macro cleanup
[BRIDGE]: netfilter dont use __constant_htons
[BRIDGE]: netfilter whitespace
[BRIDGE]: optimize frame pass up
[BRIDGE]: use kzalloc
...