GitHub/MotorolaMobilityLLC/kernel-slsi.git
18 years ago[PATCH] out of memory notifier
Martin Schwidefsky [Tue, 26 Sep 2006 06:31:20 +0000 (23:31 -0700)]
[PATCH] out of memory notifier

Add a notifier chain to the out of memory killer.  If one of the registered
callbacks could release some memory, do not kill the process but return and
retry the allocation that forced the oom killer to run.

The purpose of the notifier is to add a safety net in the presence of
memory ballooners.  If the resource manager inflated the balloon to a size
where memory allocations cannot be satisfied anymore, it is better to
deflate the balloon a bit instead of killing processes.

The implementation for the s390 ballooner is included.
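
A minimal userspace sketch of the idea, not the kernel code: every
registered callback adds the number of pages it released to a counter, and
a process is only killed if that counter stays at zero.  All names below
(oom_callback, register_oom_callback, balloon_deflate,
out_of_memory_sketch) are hypothetical and chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical callback type: adds the number of pages it freed to *freed. */
typedef void (*oom_callback)(unsigned long *freed);

#define MAX_OOM_CALLBACKS 8
static oom_callback oom_chain[MAX_OOM_CALLBACKS];
static size_t oom_chain_len;

/* Register a callback; returns 0 on success, -1 if the chain is full. */
static int register_oom_callback(oom_callback cb)
{
	if (oom_chain_len == MAX_OOM_CALLBACKS)
		return -1;
	oom_chain[oom_chain_len++] = cb;
	return 0;
}

/* Stand-in for a ballooner: pretend the balloon currently holds 256 pages. */
static unsigned long balloon_pages = 256;

/* Deflate the balloon by up to 64 pages and report what was released. */
static void balloon_deflate(unsigned long *freed)
{
	unsigned long n = balloon_pages < 64 ? balloon_pages : 64;

	balloon_pages -= n;
	*freed += n;
}

/*
 * Returns 1 if a process would have to be killed, 0 if some callback
 * released memory and the caller should retry the allocation instead.
 */
static int out_of_memory_sketch(void)
{
	unsigned long freed = 0;
	size_t i;

	for (i = 0; i < oom_chain_len; i++)
		oom_chain[i](&freed);

	return freed == 0;
}
```

With balloon_deflate registered, the sketch asks for a retry until the
balloon is empty and only then falls back to killing.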

[akpm@osdl.org: cleanups]
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] linearly index zone->node_zonelists[]
Christoph Lameter [Tue, 26 Sep 2006 06:31:19 +0000 (23:31 -0700)]
[PATCH] linearly index zone->node_zonelists[]

I wonder why we need this bitmask indexing into zone->node_zonelists[]?

We always start with the highest zone and then include all lower zones
if we build zonelists.

Are there really cases where we need allocation from ZONE_DMA or
ZONE_HIGHMEM but not ZONE_NORMAL? It seems that the current implementation
of highest_zone() makes that already impossible.

If we go linear on the index then gfp_zone() == highest_zone() and a lot
of definitions fall by the wayside.

We can now revert back to the use of gfp_zone() in mempolicy.c ;-)

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Apply type enum zone_type
Christoph Lameter [Tue, 26 Sep 2006 06:31:18 +0000 (23:31 -0700)]
[PATCH] Apply type enum zone_type

After we have done this we can now do some typing cleanup.

The memory policy layer keeps a policy_zone that specifies
the zone that gets memory policies applied. This variable
can now be of type enum zone_type.

The check_highest_zone function and the build_zonelists function must
then also take an enum zone_type parameter.

Plus there are a number of loops over zones that also should use
zone_type.

We run into some troubles at some points with functions that need a
zone_type variable to become -1. Fix that up.

[pj@sgi.com: fix set_mempolicy() crash]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mempolicies: fix policy_zone check
Christoph Lameter [Tue, 26 Sep 2006 06:31:17 +0000 (23:31 -0700)]
[PATCH] mempolicies: fix policy_zone check

There is a check in zonelist_policy that compares pieces of the bitmap
obtained from a gfp mask via GFP_ZONETYPES with a zone number in function
zonelist_policy().

The bitmap is an ORed mask of __GFP_DMA, __GFP_DMA32 and __GFP_HIGHMEM.
The policy_zone is a zone number with the possible values of ZONE_DMA,
ZONE_DMA32, ZONE_HIGHMEM and ZONE_NORMAL. These are two different domains
of values.

For some reason this seemed to work before the zone reduction patchset (It
definitely works on SGI boxes since we just have one zone and the check
cannot fail).

With the zone reduction patchset this check definitely fails on systems
with two zones if the system actually has memory in both zones.

This is because ZONE_NORMAL is selected using no __GFP flag at
all and thus gfp_zone(gfpmask) == 0. ZONE_DMA is selected when __GFP_DMA
is set. __GFP_DMA is 0x01.  So gfp_zone(gfpmask) == 1.

policy_zone is set to ZONE_NORMAL (==1) if ZONE_NORMAL and ZONE_DMA are
populated.

For ZONE_NORMAL gfp_zone(<no __GFP_DMA>) yields 0 which is <
policy_zone(ZONE_NORMAL) and so policy is not applied to regular memory
allocations!

Instead gfp_zone(__GFP_DMA) == 1 which results in policy being applied
to DMA allocations!

What we really want in that place is to establish the highest allowable
zone for a given gfp_mask. If the highest zone is higher or equal to the
policy_zone then memory policies need to be applied. We have such
a highest_zone() function in page_alloc.c.

So move the highest_zone() function from mm/page_alloc.c into
include/linux/gfp.h.  On the way we simplify the function and use the new
zone_type that was also introduced with the zone reduction patchset plus we
also specify the right type for the gfp flags parameter.
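
The mismatch can be reproduced in a small userspace sketch of the two-zone
case described above.  The constants mirror the values quoted in this
message; the function bodies and the names policy_applies_old and
policy_applies_new are assumptions made for illustration, not the kernel
source.

```c
/* Two-zone configuration: ZONE_DMA == 0, ZONE_NORMAL == 1. */
enum zone_type { ZONE_DMA = 0, ZONE_NORMAL = 1 };

typedef unsigned int gfp_t;

#define __GFP_DMA	((gfp_t)0x01u)
#define GFP_ZONEMASK	((gfp_t)0x01u)

/* Old-style lookup: yields the raw zone bits of the mask, not a zone number. */
static int gfp_zone(gfp_t flags)
{
	return (int)(flags & GFP_ZONEMASK);
}

/* Highest zone that an allocation with these flags is allowed to use. */
static enum zone_type highest_zone(gfp_t flags)
{
	if (flags & __GFP_DMA)
		return ZONE_DMA;
	return ZONE_NORMAL;
}

/* Both zones are populated, so policies apply from ZONE_NORMAL upwards. */
static const enum zone_type policy_zone = ZONE_NORMAL;

/* Broken check: compares flag bits against a zone number. */
static int policy_applies_old(gfp_t flags)
{
	return gfp_zone(flags) >= (int)policy_zone;
}

/* Fixed check: compares a zone number against a zone number. */
static int policy_applies_new(gfp_t flags)
{
	return highest_zone(flags) >= policy_zone;
}
```

In this sketch the old check applies policy to __GFP_DMA allocations and
skips ordinary ones, while the new check does the opposite.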

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] reduce MAX_NR_ZONES: fix i386 SRAT check for MAX_NR_ZONES
Christoph Lameter [Tue, 26 Sep 2006 06:31:16 +0000 (23:31 -0700)]
[PATCH] reduce MAX_NR_ZONES: fix i386 SRAT check for MAX_NR_ZONES

We cannot check MAX_NR_ZONES since it is not defined in the preprocessor
anymore.

So remove the check.

The maximum number of zones per node for i386 is 3 since i386 does not
support ZONE_DMA32.
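
A short sketch of why a preprocessor test stops working, assuming
MAX_NR_ZONES is now the last enumerator of the zone enum rather than a
#define.  The enum layout below is an assumption for i386, and the
_Static_assert is only one possible alternative (C11); the patch itself
simply removes the check.

```c
/* Assumed layout for i386: no ZONE_DMA32, so at most three zones. */
enum zone_type {
	ZONE_DMA,
	ZONE_NORMAL,
	ZONE_HIGHMEM,
	MAX_NR_ZONES	/* an enumerator, invisible to the preprocessor */
};

/*
 * In #if, an identifier that is not a macro is replaced by 0, so this
 * branch is never taken no matter how many zones the enum defines.
 */
#if MAX_NR_ZONES > 3
#error "never reached: the preprocessor sees MAX_NR_ZONES as 0"
#endif

/* A check done by the compiler proper still sees the real value. */
_Static_assert(MAX_NR_ZONES <= 3, "i386 has at most three zones per node");

/* Helper so the value can be inspected at run time. */
static int nr_zones(void)
{
	return MAX_NR_ZONES;
}
```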

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] reduce MAX_NR_ZONES: remove display of counters for unconfigured zones
Christoph Lameter [Tue, 26 Sep 2006 06:31:15 +0000 (23:31 -0700)]
[PATCH] reduce MAX_NR_ZONES: remove display of counters for unconfigured zones

eventcounters: Do not display counters for zones that are not available on an
arch

Do not define or display counters for the DMA32 and the HIGHMEM zone if such
zones were not configured.

[akpm@osdl.org: s390 fix]
[heiko.carstens@de.ibm.com: s390 fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] reduce MAX_NR_ZONES: make ZONE_HIGHMEM optional
Christoph Lameter [Tue, 26 Sep 2006 06:31:14 +0000 (23:31 -0700)]
[PATCH] reduce MAX_NR_ZONES: make ZONE_HIGHMEM optional

Make ZONE_HIGHMEM optional

- ifdef out code and definitions related to CONFIG_HIGHMEM

- __GFP_HIGHMEM falls back to normal allocations if there is no
  ZONE_HIGHMEM

- GFP_ZONEMASK becomes 0x01 if there is no DMA32 and no HIGHMEM
  zone.

[jdike@addtoit.com: build fix]
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] reduce MAX_NR_ZONES: make ZONE_DMA32 optional
Christoph Lameter [Tue, 26 Sep 2006 06:31:13 +0000 (23:31 -0700)]
[PATCH] reduce MAX_NR_ZONES: make ZONE_DMA32 optional

Make ZONE_DMA32 optional

- Add #ifdefs around ZONE_DMA32 specific code and definitions.

- Add CONFIG_ZONE_DMA32 config option and use that for x86_64
  that alone needs this zone.

- Remove the use of CONFIG_DMA_IS_DMA32 and CONFIG_DMA_IS_NORMAL
  for ia64 and fix up the way per node ZVCs are calculated.

- Fall back to prior GFP_ZONEMASK of 0x03 if there is no
  DMA32 zone.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] reduce MAX_NR_ZONES: use enum to define zones, reformat and comment
Christoph Lameter [Tue, 26 Sep 2006 06:31:13 +0000 (23:31 -0700)]
[PATCH] reduce MAX_NR_ZONES: use enum to define zones, reformat and comment

Use enum for zones and reformat zones dependent information

Add comments explaining the use of zones and add a zones_t type for zone
numbers.

Line up information that will be #ifdefd by the following patches.

[akpm@osdl.org: comment cleanups]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] reduce MAX_NR_ZONES: page allocator ZONE_HIGHMEM cleanup
Christoph Lameter [Tue, 26 Sep 2006 06:31:12 +0000 (23:31 -0700)]
[PATCH] reduce MAX_NR_ZONES: page allocator ZONE_HIGHMEM cleanup

page allocator ZONE_HIGHMEM fixups

1. We do not need to do an #ifdef in si_meminfo since both counters
   in use are zero if !CONFIG_HIGHMEM.

2. Add #ifdef in si_meminfo_node instead to avoid referencing zone
   information for ZONE_HIGHMEM if we do not have HIGHMEM
   (may not be there after the following patches).

3. Replace the use of ZONE_HIGHMEM with MAX_NR_ZONES in build_zonelists_node

4. build_zonelists_node: Remove BUG_ON for ZONE_HIGHMEM. Zone will
   be optional soon and thus BUG_ON cannot be triggered anymore.

5. init_free_area_core: Replace a use of ZONE_HIGHMEM with MAX_NR_ZONES.

[akpm@osdl.org: cleanups]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] reduce MAX_NR_ZONES: move HIGHMEM counters into highmem.c/.h
Christoph Lameter [Tue, 26 Sep 2006 06:31:11 +0000 (23:31 -0700)]
[PATCH] reduce MAX_NR_ZONES: move HIGHMEM counters into highmem.c/.h

Move totalhigh_pages and nr_free_highpages() into highmem.c/.h

Move the totalhigh_pages definition into highmem.c/.h.  Move the
nr_free_highpages function into highmem.c

[yoichi_yuasa@tripeaks.co.jp: build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] reduce MAX_NR_ZONES: make display of highmem counters conditional on CONFIG_H...
Christoph Lameter [Tue, 26 Sep 2006 06:31:10 +0000 (23:31 -0700)]
[PATCH] reduce MAX_NR_ZONES: make display of highmem counters conditional on CONFIG_HIGHMEM

Do not display HIGHMEM memory sizes if CONFIG_HIGHMEM is not set.

Make HIGHMEM dependent texts and make display of highmem counters optional

Some texts are depending on CONFIG_HIGHMEM.

Remove those strings and remove the display of highmem counter values if
CONFIG_HIGHMEM is not set.

[akpm@osdl.org: remove some ifdefs]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] reduce MAX_NR_ZONES: fix MAX_NR_ZONES array initializations
Christoph Lameter [Tue, 26 Sep 2006 06:31:10 +0000 (23:31 -0700)]
[PATCH] reduce MAX_NR_ZONES: fix MAX_NR_ZONES array initializations

Fix array initialization in lots of arches

The number of zones may now be reduced from 4 to 2 for many arches.  Fix the
array initialization for the zones array for all architectures so that it is
not initializing a fixed number of elements.
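
A sketch of the kind of change this implies, assuming the old code listed
one literal per zone.  The two-zone enum and the helper total_pages are
hypothetical; the point is only that "{0, }" zeroes the whole array
whatever MAX_NR_ZONES happens to be.

```c
/* Example configuration with only two zones. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, MAX_NR_ZONES };

/*
 * Before (assumed shape), tied to a zone count of four:
 *	unsigned long zones_size[MAX_NR_ZONES] = {0, 0, 0, 0};
 */

/* Build a zone size table without assuming how many zones exist. */
static unsigned long total_pages(unsigned long dma_pages,
				 unsigned long normal_pages)
{
	/* "{0, }" zero-initializes every element regardless of the count. */
	unsigned long zones_size[MAX_NR_ZONES] = {0, };
	unsigned long total = 0;
	int i;

	zones_size[ZONE_DMA] = dma_pages;
	zones_size[ZONE_NORMAL] = normal_pages;

	for (i = 0; i < MAX_NR_ZONES; i++)
		total += zones_size[i];

	return total;
}
```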

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] reduce MAX_NR_ZONES: remove two strange uses of MAX_NR_ZONES
Christoph Lameter [Tue, 26 Sep 2006 06:31:09 +0000 (23:31 -0700)]
[PATCH] reduce MAX_NR_ZONES: remove two strange uses of MAX_NR_ZONES

I keep seeing zones on various platforms that are never used and wonder why we
compile support for them into the kernel.  Counters show up for HIGHMEM and
DMA32 that are always zero.

This patch allows the removal of ZONE_DMA32 for non x86_64 architectures and
it will get rid of ZONE_HIGHMEM for arches not using highmem (like 64 bit
architectures).  If an arch does not define CONFIG_HIGHMEM then ZONE_HIGHMEM
will not be defined.  Similarly if an arch does not define CONFIG_ZONE_DMA32
then ZONE_DMA32 will not be defined.

No current architecture uses all the 4 zones (DMA,DMA32,NORMAL,HIGH) that we
have now.  The patchset will reduce the number of zones for all platforms.

On many platforms that do not have DMA32 or HIGHMEM this will reduce the
number of zones by 50%.  F.e.  ia64 only uses DMA and NORMAL.

Large amounts of memory can be saved for larger systems that may have a few
hundred NUMA nodes.

With ZONE_DMA32 and ZONE_HIGHMEM support optional MAX_NR_ZONES will be 2 for
many non i386 platforms and even for i386 without CONFIG_HIGHMEM set.

Tested on ia64, x86_64 and on i386 with and without highmem.

The patchset consists of 11 patches that are following this message.

One could go even further than this patchset and also make ZONE_DMA optional
because some platforms do not need a separate DMA zone and can do DMA to all
of memory.  This could reduce MAX_NR_ZONES to 1.  Such a patchset will
hopefully follow soon.

This patch:

Fix strange uses of MAX_NR_ZONES

Sometimes we use MAX_NR_ZONES - x to refer to a zone.  Make that explicit.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] bootmem: miscellaneous coding style fixes
Franck Bui-Huu [Tue, 26 Sep 2006 06:31:08 +0000 (23:31 -0700)]
[PATCH] bootmem: miscellaneous coding style fixes

It fixes various coding style issues, especially where spaces are unnecessary.
For example, '*' goes next to the function name.

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] bootmem: use pfn/page conversion macros
Franck Bui-Huu [Tue, 26 Sep 2006 06:31:07 +0000 (23:31 -0700)]
[PATCH] bootmem: use pfn/page conversion macros

It also creates a get_mapsize() helper in order to make the code more readable
when it calculates the boot bitmap size.
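
A sketch of what such a helper computes, assuming the boot bitmap holds one
bit per page frame and is rounded up to a whole number of longs.  The name
get_mapsize_sketch and its two-argument signature are hypothetical; the
real helper works on the bootmem data structure.

```c
/*
 * Bytes needed for a bitmap with one bit per page frame in
 * [start_pfn, end_pfn), rounded up to a multiple of sizeof(long).
 */
static unsigned long get_mapsize_sketch(unsigned long start_pfn,
					unsigned long end_pfn)
{
	unsigned long mapsize = ((end_pfn - start_pfn) + 7) / 8;
	unsigned long align = sizeof(long);

	/* Round up to the next multiple of align (a power of two). */
	return (mapsize + align - 1) & ~(align - 1);
}
```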

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] bootmem: remove useless headers inclusions
Franck Bui-Huu [Tue, 26 Sep 2006 06:31:06 +0000 (23:31 -0700)]
[PATCH] bootmem: remove useless headers inclusions

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] bootmem: limit to 80 columns width
Franck Bui-Huu [Tue, 26 Sep 2006 06:31:05 +0000 (23:31 -0700)]
[PATCH] bootmem: limit to 80 columns width

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] bootmem: remove useless parentheses in bootmem header file
Franck Bui-Huu [Tue, 26 Sep 2006 06:31:05 +0000 (23:31 -0700)]
[PATCH] bootmem: remove useless parentheses in bootmem header file

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] bootmem: mark link_bootmem() as part of the __init section
Franck Bui-Huu [Tue, 26 Sep 2006 06:31:04 +0000 (23:31 -0700)]
[PATCH] bootmem: mark link_bootmem() as part of the __init section

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] bootmem: remove useless __init in header file
Franck Bui-Huu [Tue, 26 Sep 2006 06:31:03 +0000 (23:31 -0700)]
[PATCH] bootmem: remove useless __init in header file

__init in headers is pretty useless because the compiler doesn't check it, and
they get out of sync relatively frequently.  So if you see an __init in a
header file, it's quite unreliable and you need to check the definition
anyway.

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] convert i386 NUMA KVA space to bootmem
keith mannthey [Tue, 26 Sep 2006 06:31:03 +0000 (23:31 -0700)]
[PATCH] convert i386 NUMA KVA space to bootmem

Address a long standing issue of booting with an initrd on an i386 numa
system.  Currently (and always) the numa kva area is mapped into low memory
by finding the end of low memory and moving that mark down (thus creating
space for the kva).  The issue with this is that Grub loads initrds into
a similar space, so when the kernel checks the initrd it finds it outside
max_low_pfn and disables it (it thinks the initrd is not mapped into usable
memory).  Thus initrd enabled kernels can't boot i386 numa :(

My solution to the problem just converts the numa kva area to use the
bootmem allocator to reserve its area (instead of moving the end of low
memory).  Using bootmem allows the kva area to be mapped into more diverse
addresses (not just the end of low memory) and enables the kva area to be
mapped below the initrd if present.

I have tested this patch on numaq(no initrd) and summit(initrd) i386 numa
based systems.

[akpm@osdl.org: cleanups]
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mm/: make functions static
Adrian Bunk [Tue, 26 Sep 2006 06:31:02 +0000 (23:31 -0700)]
[PATCH] mm/: make functions static

This patch makes the following needlessly global functions static:
 - slab.c: kmem_find_general_cachep()
 - swap.c: __page_cache_release()
 - vmalloc.c: __vmalloc_node()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mm: msync() cleanup
Peter Zijlstra [Tue, 26 Sep 2006 06:31:01 +0000 (23:31 -0700)]
[PATCH] mm: msync() cleanup

With the tracking of dirty pages properly done now, msync doesn't need to scan
the PTEs anymore to determine the dirty status.

From: Hugh Dickins <hugh@veritas.com>

In looking to do that, I made some other tidyups: can remove several
#includes, and the sys_msync loop termination was not quite right.

Most of those points are criticisms of the existing sys_msync, not of your
patch.  In particular, the loop termination errors were introduced in 2.6.17:
I did notice this shortly before it came out, but decided I was more likely to
get it wrong myself, and make matters worse if I tried to rush a last-minute
fix in.  And it's not terribly likely to go wrong, nor disastrous if it does
go wrong (may miss reporting an unmapped area; may also fsync file of a
following vma).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mm: fixup do_wp_page()
Peter Zijlstra [Tue, 26 Sep 2006 06:31:00 +0000 (23:31 -0700)]
[PATCH] mm: fixup do_wp_page()

Wrt. the recent modifications in do_wp_page() Hugh Dickins pointed out:

  "I now realize it's right to the first order (normal case) and to the
   second order (ptrace poke), but not to the third order (ptrace poke
   anon page here to be COWed - perhaps can't occur without intervening
   mprotects)."

This patch restores the old COW behaviour for anonymous pages.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mm: small cleanup of install_page()
Peter Zijlstra [Tue, 26 Sep 2006 06:30:59 +0000 (23:30 -0700)]
[PATCH] mm: small cleanup of install_page()

Smallish cleanup to install_page(), could save a memory read (haven't checked
the asm output) and sure looks nicer.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mm: optimize the new mprotect() code a bit
Peter Zijlstra [Tue, 26 Sep 2006 06:30:59 +0000 (23:30 -0700)]
[PATCH] mm: optimize the new mprotect() code a bit

mprotect() resets the page protections, which could result in extra write
faults for those pages whose dirty state we track using write faults and are
dirty already.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mm: balance dirty pages
Peter Zijlstra [Tue, 26 Sep 2006 06:30:58 +0000 (23:30 -0700)]
[PATCH] mm: balance dirty pages

Now that we can detect writers of shared mappings, throttle them.  Avoids OOM
by surprise.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mm: tracking shared dirty pages
Peter Zijlstra [Tue, 26 Sep 2006 06:30:57 +0000 (23:30 -0700)]
[PATCH] mm: tracking shared dirty pages

Tracking of dirty pages in shared writeable mmap()s.

The idea is simple: write protect clean shared writeable pages, catch the
write-fault, make writeable and set dirty.  On page write-back clean all the
PTE dirty bits and write protect them once again.

The implementation is a tad harder, mainly because the default
backing_dev_info capabilities were too loosely maintained.  Hence it is not
enough to test the backing_dev_info for cap_account_dirty.

The current heuristic is as follows, a VMA is eligible when:
 - it is shared writeable
    (vm_flags & (VM_WRITE|VM_SHARED)) == (VM_WRITE|VM_SHARED)
 - it is not a 'special' mapping
    (vm_flags & (VM_PFNMAP|VM_INSERTPAGE)) == 0
 - the backing_dev_info is cap_account_dirty
    mapping_cap_account_dirty(vma->vm_file->f_mapping)
 - f_op->mmap() didn't change the default page protection
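
The heuristic above can be written down as a single predicate.  This is a
userspace sketch under stated assumptions: the flag values are
illustrative, and struct vma_sketch is a hypothetical flattened view that
replaces the real lookups (mapping_cap_account_dirty() and the page
protection comparison) with two booleans.

```c
#include <stdbool.h>

/* Illustrative flag values; the real ones live in the kernel headers. */
#define VM_WRITE	0x0002UL
#define VM_SHARED	0x0008UL
#define VM_PFNMAP	0x0400UL
#define VM_INSERTPAGE	0x0800UL

/* Hypothetical, flattened view of the facts the heuristic looks at. */
struct vma_sketch {
	unsigned long vm_flags;
	bool bdi_accounts_dirty;	/* backing_dev_info is cap_account_dirty */
	bool prot_is_default;		/* f_op->mmap() left the protection alone */
};

/* True if writes to this mapping should be caught via write faults. */
static bool vma_tracks_dirty(const struct vma_sketch *vma)
{
	/* Must be both writeable and shared. */
	if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) != (VM_WRITE | VM_SHARED))
		return false;

	/* 'Special' mappings are excluded. */
	if (vma->vm_flags & (VM_PFNMAP | VM_INSERTPAGE))
		return false;

	/* The backing device must account dirty pages. */
	if (!vma->bdi_accounts_dirty)
		return false;

	/* The driver must not have changed the default page protection. */
	return vma->prot_is_default;
}
```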

Pages from remap_pfn_range() are explicitly excluded because their COW
semantics are already horrid enough (see vm_normal_page() in do_wp_page()) and
because they don't have a backing store anyway.

mprotect() is taught about the new behaviour as well.  However it overrides
the last condition.

Cleaning the pages on write-back is done with page_mkclean(), a new rmap call.
It can be called on any page, but is currently only implemented for mapped
pages.  If the page is found to be of a VMA that accounts dirty pages it will
also wrprotect the PTE.

Finally, in fs/buffer.c:try_to_free_buffers(), remove clear_page_dirty() from
under ->private_lock.  This seems to be safe, since ->private_lock is used to
serialize access to the buffers, not the page itself.  This is needed because
clear_page_dirty() will call into page_mkclean() and would thereby violate
locking order.

[dhowells@redhat.com: Provide a page_mkclean() implementation for NOMMU]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mm: VM_BUG_ON
Nick Piggin [Tue, 26 Sep 2006 06:30:55 +0000 (23:30 -0700)]
[PATCH] mm: VM_BUG_ON

Introduce a VM_BUG_ON, which is turned on with CONFIG_DEBUG_VM.  Use this
in the lightweight, inline refcounting functions; PageLRU and PageActive
checks in vmscan, because they're pretty well confined to vmscan.  And in
page allocate/free fastpaths which can be the hottest parts of the kernel
for kbuilds.

Unlike BUG_ON, VM_BUG_ON must not be used to execute statements with
side-effects, and should not be used outside core mm code.
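
A userspace sketch of why side-effects are a problem, assuming VM_BUG_ON
expands to nothing when CONFIG_DEBUG_VM is not set.  BUG_ON is modelled
with abort(), and the two put_ref helpers are hypothetical.

```c
#include <stdlib.h>

/* Userspace stand-in for the kernel's BUG_ON(): abort if cond is true. */
#define BUG_ON(cond) do { if (cond) abort(); } while (0)

/* CONFIG_DEBUG_VM is deliberately left undefined in this sketch. */
#ifdef CONFIG_DEBUG_VM
#define VM_BUG_ON(cond) BUG_ON(cond)
#else
#define VM_BUG_ON(cond) do { } while (0)
#endif

static int refcount = 1;

/* Wrong: the decrement lives inside the check and vanishes with it. */
static int put_ref_wrong(void)
{
	VM_BUG_ON(--refcount < 0);
	return refcount;
}

/* Right: do the work unconditionally, then check the result. */
static int put_ref_right(void)
{
	--refcount;
	VM_BUG_ON(refcount < 0);
	return refcount;
}
```

Without CONFIG_DEBUG_VM, put_ref_wrong() leaves the count untouched while
put_ref_right() still decrements it.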

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] update to the kernel kmap/kunmap API
James Bottomley [Tue, 26 Sep 2006 06:30:55 +0000 (23:30 -0700)]
[PATCH] update to the kernel kmap/kunmap API

Give non-highmem architectures access to the kmap API for the purposes of
overriding (this is what the attached patch does).

The proposal is that we should now require all architectures with coherence
issues to manage data coherence via the kmap/kunmap API.  Thus driver
writers never have to write code like

    kmap(page)
    modify data in page
    flush_kernel_dcache_page(page)
    kunmap(page)

instead, kmap/kunmap will manage the coherence and driver (and filesystem)
writers don't need to worry about how to flush between kmap and kunmap.

For most architectures, the page only needs to be flushed if it was
actually written to *and* there are user mappings of it, so the best
implementation looks to be: clear the page dirty pte bit in the kernel page
tables on kmap and on kunmap, check page->mappings for user maps, and then
the dirty bit, and only flush if it both has user mappings and is dirty.
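
The proposed flush rule can be modelled in a few lines.  This is a sketch
of the decision only: struct page_sketch and all three helpers are
hypothetical, and the flush is represented by a counter rather than a real
cache operation.

```c
#include <stdbool.h>

/* Hypothetical model of the state the proposal looks at. */
struct page_sketch {
	bool kernel_pte_dirty;		/* dirty bit of the kernel mapping */
	bool has_user_mappings;		/* page is also mapped by user space */
	unsigned int flushes;		/* how often the cache was flushed */
};

/* On kmap: clear the dirty bit so later writes can be detected. */
static void kmap_sketch(struct page_sketch *page)
{
	page->kernel_pte_dirty = false;
}

/* A write through the kernel mapping sets the dirty bit. */
static void write_through_kmap(struct page_sketch *page)
{
	page->kernel_pte_dirty = true;
}

/* On kunmap: flush only if user mappings exist and the page was written. */
static void kunmap_sketch(struct page_sketch *page)
{
	if (page->has_user_mappings && page->kernel_pte_dirty)
		page->flushes++;	/* stands in for the dcache flush */
}
```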

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] jbd: fix commit of ordered data buffers
Jan Kara [Tue, 26 Sep 2006 06:30:53 +0000 (23:30 -0700)]
[PATCH] jbd: fix commit of ordered data buffers

Original commit code assumes that when a buffer on BJ_SyncData list is
locked, it is being written to disk.  But this is not true and hence it can
lead to a potential data loss on crash.  Also the code didn't account for
the fact that journal_dirty_data() can steal buffers from the committing
transaction and hence could write buffers that no longer belong to the
committing transaction.  Finally it could possibly happen that we tried
writing out one buffer several times.

The patch below tries to solve these problems by a complete rewrite of the
data commit code.  We go through buffers on t_sync_datalist, lock buffers
needing write out and store them in an array.  Buffers are also immediately
refiled to BJ_Locked list or unfiled (if the write out is completed).  When
the array is full or we have to block on buffer lock, we submit all
accumulated buffers for IO.

[suitable for 2.6.18.x around the 2.6.19-rc2 timeframe]

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] trigger a syntax error if percpu macros are incorrectly used
Jan Blunck [Tue, 26 Sep 2006 06:30:53 +0000 (23:30 -0700)]
[PATCH] trigger a syntax error if percpu macros are incorrectly used

get_cpu_var()/per_cpu()/__get_cpu_var() arguments must be simple
identifiers.  Otherwise the arch dependent implementations might break.

This patch enforces the correct usage of the macros by producing a syntax
error if the variable is not a simple identifier.
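
One way to get such a syntax error is to paste the macro argument into a
declaration.  The sketch below models per-CPU storage as a plain array and
relies on GNU statement expressions (gcc, clang); the simple_identifier_
prefix and the array model are assumptions for illustration.

```c
/* Model per-CPU storage as one array slot per CPU. */
#define NR_CPUS 4
#define DEFINE_PER_CPU(type, name) type per_cpu__##name[NR_CPUS]

/*
 * The block-scope declaration pastes the argument into an identifier.
 * per_cpu(stats.hits, 0) would expand to
 *	extern int simple_identifier_stats.hits(void);
 * which does not parse, so the misuse is caught at compile time.
 * The declared function is never defined or called.
 */
#define per_cpu(var, cpu) (*({				\
	extern int simple_identifier_##var(void);	\
	&per_cpu__##var[(cpu)]; }))

static DEFINE_PER_CPU(int, hits);

/* Bump and return the hit counter of the given cpu. */
static int count_hit(int cpu)
{
	return ++per_cpu(hits, cpu);
}
```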

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Fix longstanding load balancing bug in the scheduler
Christoph Lameter [Tue, 26 Sep 2006 06:30:51 +0000 (23:30 -0700)]
[PATCH] Fix longstanding load balancing bug in the scheduler

The scheduler will stop load balancing if the most busy processor contains
processes pinned via processor affinity.

The scheduler currently only does one search for busiest cpu.  If it cannot
pull any tasks away from the busiest cpu because they were pinned then the
scheduler goes into a corner and sulks leaving the idle processors idle.

F.e.  If you have processor 0 busy running four tasks pinned via taskset,
there are none on processor 1 and one just started two processes on
processor 2 then the scheduler will not move one of the two processes away
from processor 2.

This patch fixes that issue by forcing the scheduler to come out of its
corner and retrying the load balancing by considering other processors for
load balancing.
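
The retry can be sketched in userspace as follows.  This is a simplified
model, not the scheduler code: struct rq_sketch, the bitmask of candidate
cpus and the "difference of at least two tasks" rule are assumptions made
for illustration.

```c
#define NR_CPUS 3

struct rq_sketch {
	int nr_running;	/* tasks queued on this cpu */
	int nr_pinned;	/* of those, tasks that may not migrate */
};

/* Busiest cpu among those whose bit is set in mask, or -1 if none. */
static int find_busiest(const struct rq_sketch *rq, unsigned int mask)
{
	int cpu, busiest = -1;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(mask & (1u << cpu)))
			continue;
		if (busiest < 0 || rq[cpu].nr_running > rq[busiest].nr_running)
			busiest = cpu;
	}
	return busiest;
}

/*
 * Try to pull one task to this_cpu.  Returns the cpu the task came
 * from, or -1 if nothing could be moved.
 */
static int load_balance_sketch(struct rq_sketch *rq, int this_cpu)
{
	/* Candidate source cpus: everyone except ourselves. */
	unsigned int mask = ((1u << NR_CPUS) - 1) & ~(1u << this_cpu);

	while (mask) {
		int busiest = find_busiest(rq, mask);

		/* Not enough imbalance left to be worth a migration. */
		if (rq[busiest].nr_running - rq[this_cpu].nr_running < 2)
			return -1;

		if (rq[busiest].nr_running > rq[busiest].nr_pinned) {
			rq[busiest].nr_running--;
			rq[this_cpu].nr_running++;
			return busiest;
		}

		/* Everything there is pinned: exclude it and look again. */
		mask &= ~(1u << busiest);
	}
	return -1;
}
```

With the example above (four pinned tasks on processor 0, none on
processor 1, two on processor 2), balancing for processor 1 skips
processor 0 and pulls one task from processor 2.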

This patch was originally developed by John Hawkes and discussed at

    http://marc.theaimsgroup.com/?l=linux-kernel&m=113901368523205&w=2.

I have removed extraneous material and gone back to equipping struct rq
with the cpu the queue is associated with since this makes the patch much
easier and it is likely that others in the future will have the same
difficulty of figuring out which processor owns which runqueue.

The overhead added through these patches is a single word on the stack if
the kernel is configured to support 32 cpus or less (32 bit).  For 32 bit
environments the maximum number of cpus that can be configured is 255 which
would result in the use of 32 additional bytes on the stack.  On IA64 up to
1k cpus can be configured which will result in the use of 128 additional
bytes on the stack.  The maximum additional cache footprint is one
cacheline.  Typically memory use will be much less than a cacheline and the
additional cpumask will be placed on the stack in a cacheline that already
contains other local variables.
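
The byte counts quoted above follow from a cpumask being a bitmap with one
bit per configurable cpu, stored in whole machine words.  A small sketch of
that arithmetic (the helper name cpumask_bytes is hypothetical):

```c
/*
 * Bytes occupied by a cpumask covering nr_cpus cpus on a machine whose
 * unsigned long has bits_per_long bits.
 */
static unsigned int cpumask_bytes(unsigned int nr_cpus,
				  unsigned int bits_per_long)
{
	/* Number of longs needed, rounded up. */
	unsigned int longs = (nr_cpus + bits_per_long - 1) / bits_per_long;

	return longs * (bits_per_long / 8);
}
```

For 32 cpus on 32 bit this gives 4 bytes (one word), for 255 cpus on 32
bit 32 bytes, and for 1024 cpus on a 64 bit machine 128 bytes.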

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: John Hawkes <hawkes@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Peter Williams <pwil3058@bigpond.net.au>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoMerge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
Linus Torvalds [Tue, 26 Sep 2006 02:32:02 +0000 (19:32 -0700)]
Merge branch 'upstream-linus' of /linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] Fix oops introduced in non-uniform port handling fix
  [PATCH] ata-piix: fixes kerneldoc error

18 years ago[libata] Fix oops introduced in non-uniform port handling fix
Jeff Garzik [Tue, 26 Sep 2006 01:56:33 +0000 (21:56 -0400)]
[libata] Fix oops introduced in non-uniform port handling fix

Noticed by several people.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Tue, 26 Sep 2006 00:39:55 +0000 (17:39 -0700)]
Merge /pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NetLabel]: update docs with website information
  [NetLabel]: rework the Netlink attribute handling (part 2)
  [NetLabel]: rework the Netlink attribute handling (part 1)
  [Netlink]: add nla_validate_nested()
  [NETLINK]: add nla_for_each_nested() to the interface list
  [NetLabel]: change the SELinux permissions
  [NetLabel]: make the CIPSOv4 cache spinlocks bottom half safe
  [NetLabel]: correct improper handling of non-NetLabel peer contexts
  [TCP]: make cubic the default
  [TCP]: default congestion control menu
  [ATM] he: Fix __init/__devinit conflict
  [NETFILTER]: Add dscp,DSCP headers to header-y
  [DCCP]: Introduce dccp_probe
  [DCCP]: Use constants for CCIDs
  [DCCP]: Introduce constants for CCID numbers
  [DCCP]: Allow default/fallback service code.

18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Tue, 26 Sep 2006 00:39:04 +0000 (17:39 -0700)]
Merge /pub/scm/linux/kernel/git/davem/sparc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SOUND] sparc/amd7930: Use __devinit and __devinitdata as needed.
  [SUNLANCE]: Mark sparc_lance_probe_one as __devinit.
  [SPARC64]: Fix section-mismatch errors in solaris emul module.

18 years ago[PATCH] VIDIOC_ENUMSTD bug
Jonathan Corbet [Mon, 25 Sep 2006 23:25:37 +0000 (16:25 -0700)]
[PATCH] VIDIOC_ENUMSTD bug

The v4l2 API documentation for VIDIOC_ENUMSTD says:

To enumerate all standards applications shall begin at index
zero, incrementing by one until the driver returns EINVAL.

The actual code, however, tests the index this way:

               if (index<=0 || index >= vfd->tvnormsize) {
                        ret=-EINVAL;

So any application which passes in index=0 gets EINVAL right off the bat
- and, in fact, this is what happens to mplayer.  So I think the
following patch is called for, and maybe even appropriate for a 2.6.18.x
stable release.
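
The off-by-one described above can be sketched in isolation. This is a minimal model, not the driver code: `struct norm_table` and the helper names are hypothetical, and the corrected condition is one plausible form of the fix (the patch itself is not reproduced in this log).

```c
#include <errno.h>

/* Hypothetical stand-in for the driver's table of TV norms. */
struct norm_table {
    int tvnormsize;   /* number of entries in the table */
};

/* The check quoted above: index 0 is rejected along with negative values. */
static int check_index_buggy(const struct norm_table *t, int index)
{
    if (index <= 0 || index >= t->tvnormsize)
        return -EINVAL;
    return 0;
}

/* A check consistent with the documented contract: index 0 is valid. */
static int check_index_fixed(const struct norm_table *t, int index)
{
    if (index < 0 || index >= t->tvnormsize)
        return -EINVAL;
    return 0;
}

/*
 * Enumerate the way the v4l2 documentation tells applications to:
 * begin at index zero and stop at the first EINVAL.
 */
static int count_standards(const struct norm_table *t,
                           int (*check)(const struct norm_table *, int))
{
    int n = 0;

    while (check(t, n) == 0)
        n++;
    return n;
}
```

With a three-entry table, enumerating through `check_index_buggy` stops immediately at index 0, which matches the mplayer symptom described above, while `check_index_fixed` lets the loop reach all three entries.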

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] load_module: no BUG if module_subsys uninitialized
Ed Swierk [Mon, 25 Sep 2006 23:25:36 +0000 (16:25 -0700)]
[PATCH] load_module: no BUG if module_subsys uninitialized

Invoking load_module() before param_sysfs_init() is called crashes in
mod_sysfs_setup(), since the kset in module_subsys is not initialized yet.

In my case, net-pf-1 is getting modprobed as a result of hotplug trying to
create a UNIX socket.  Calls to hotplug begin after the topology_init
initcall.

Another patch for the same symptom (module_subsys-initialize-earlier.patch)
moves param_sysfs_init() to the subsys initcalls, but this is still not
early enough in the boot process in some cases.  In particular,
topology_init() causes /sbin/hotplug to run, which requests net-pf-1 (the
UNIX socket protocol) which can be compiled as a module.  Moving
param_sysfs_init() to the postcore initcalls fixes this particular race,
but there might well be other cases where a usermodehelper causes a module
to load earlier still.

The patch makes load_module() return an error rather than crashing the
kernel if invoked before module_subsys is initialized.

Cc: Mark Huang <mlhuang@cs.princeton.edu>
Cc: Greg KH <greg@kroah.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] i386: fix flat mode numa on a real numa system
keith mannthey [Mon, 25 Sep 2006 23:25:35 +0000 (16:25 -0700)]
[PATCH] i386: fix flat mode numa on a real numa system

If there is only 1 node in the system, cpus should not think they are a part
of some other node.

In cases where a real numa system boots the flat numa option, make sure the
cpus don't claim to be a part of a non-existent node.

Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] cpu to node relationship fixup: map cpu to node
KAMEZAWA Hiroyuki [Mon, 25 Sep 2006 23:25:31 +0000 (16:25 -0700)]
[PATCH] cpu to node relationship fixup: map cpu to node

Assume that a cpu is *physically* offlined at boot time...

Because smpboot.c::smp_boot_cpu_map() cannot find the cpu's sapicid,
numa.c::build_cpu_to_node_map() cannot build cpu<->node map for
offlined cpu.

For such cpus, cpu_to_node map should be fixed at cpu-hot-add.
This mapping should be done before cpu onlining.

This patch also handles cpu hotremove case.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] cpu to node relationship fixup: acpi_map_cpu2node
KAMEZAWA Hiroyuki [Mon, 25 Sep 2006 23:25:21 +0000 (16:25 -0700)]
[PATCH] cpu to node relationship fixup: acpi_map_cpu2node

Problem description:

  We have the additional_cpus= option for allocating possible_cpus.  But the
  nid for possible cpus is not fixed at boot time.  cpus which are offlined at
  boot or cpus which are not in the SRAT are not tied to their node.  This will
  cause a panic at cpu onlining.

Usually, pxm_to_nid() mapping is fixed at boot time by SRAT.

But, unfortunately, some systems (my system!) do not include
a full SRAT table for possible cpus.  (That is why I use the
additional_cpus= option.)

For such possible cpus, pxm<->nid should be fixed at
hot-add.  We now have acpi_map_pxm_to_node() which is also
used at boot.  It's suitable here.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] backlight: fix oops in __mutex_lock_slowpath during head /sys/class/graphics...
Michael Hanselmann [Mon, 25 Sep 2006 23:25:07 +0000 (16:25 -0700)]
[PATCH] backlight: fix oops in __mutex_lock_slowpath during head /sys/class/graphics/fb0/*

Seems like not all drivers use the framebuffer_alloc() function and won't
have an initialized mutex.  But those don't have a backlight, anyway.

Signed-off-by: Michael Hanselmann <linux-kernel@hansmi.ch>
Cc: Olaf Hering <olaf@aepfle.de>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Daniel R Thompson <daniel.thompson@st.com>
Cc: Jon Smirl <jonsmirl@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] do not free non slab allocated per_cpu_pageset
David Rientjes [Mon, 25 Sep 2006 23:24:57 +0000 (16:24 -0700)]
[PATCH] do not free non slab allocated per_cpu_pageset

Stops panic associated with attempting to free a non slab-allocated
per_cpu_pageset.

Signed-off-by: David Rientjes <rientjes@cs.washington.edu>
Acked-by: Christoph Lameter <clameter@sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] i386 bootioremap / kexec fix
keith mannthey [Mon, 25 Sep 2006 23:24:39 +0000 (16:24 -0700)]
[PATCH] i386 bootioremap / kexec fix

With CONFIG_PHYSICAL_START set to a non-default value, the i386
boot_ioremap code calculated its pte index wrong and users of boot_ioremap
had their areas incorrectly mapped (for me, the SRAT table was not mapped
during early boot).  This patch removes the addr < BOOT_PTE_PTRS constraint.

[ Keith says this is applicable to 2.6.16 and 2.6.17 as well ]

Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: <stable@kernel.org>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] rtc: lockdep fix/workaround
Peter Zijlstra [Mon, 25 Sep 2006 23:24:23 +0000 (16:24 -0700)]
[PATCH] rtc: lockdep fix/workaround

BUG: warning at kernel/lockdep.c:1816/trace_hardirqs_on() (Not tainted)
 [<c04051ee>] show_trace_log_lvl+0x58/0x171
 [<c0405802>] show_trace+0xd/0x10
 [<c040591b>] dump_stack+0x19/0x1b
 [<c043abee>] trace_hardirqs_on+0xa2/0x11e
 [<c06143c3>] _spin_unlock_irq+0x22/0x26
 [<c0541540>] rtc_get_rtc_time+0x32/0x176
 [<c0419ba4>] hpet_rtc_interrupt+0x92/0x14d
 [<c0450f94>] handle_IRQ_event+0x20/0x4d
 [<c0451055>] __do_IRQ+0x94/0xef
 [<c040678d>] do_IRQ+0x9e/0xbd
 [<c0404a49>] common_interrupt+0x25/0x2c
DWARF2 unwinder stuck at common_interrupt+0x25/0x2c

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] autofs4: zero timeout prevents shutdown
Ian Kent [Mon, 25 Sep 2006 23:24:16 +0000 (16:24 -0700)]
[PATCH] autofs4: zero timeout prevents shutdown

If the timeout of an autofs mount is set to zero then umounts are disabled.
This works fine; however, the kernel module checks the expire timeout and
goes no further if it is zero.  This is not the right thing to do at
shutdown, as the module is passed an option to expire mounts regardless of
their timeout setting.

This patch allows autofs to honor the force expire option.
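
The decision described above can be modelled roughly as follows. This is a sketch under stated assumptions: `EXPIRE_FORCE`, `struct mount_info` and both helpers are hypothetical names for illustration, not the identifiers used in the autofs4 module.

```c
/* Hypothetical flag: "expire mounts regardless of their timeout setting". */
#define EXPIRE_FORCE 0x1u

/* Hypothetical per-mount state; a timeout of 0 disables automatic umounts. */
struct mount_info {
    unsigned long timeout;
};

/* Behaviour before the patch: a zero timeout always stops the expire run. */
static int should_try_expire_old(const struct mount_info *m, unsigned int how)
{
    (void)how;               /* the force option is never consulted */
    return m->timeout != 0;
}

/* Behaviour after the patch: the force option overrides a zero timeout. */
static int should_try_expire_new(const struct mount_info *m, unsigned int how)
{
    if (how & EXPIRE_FORCE)
        return 1;
    return m->timeout != 0;
}
```

In this model a mount with `timeout == 0` is never expired by the old check, even at shutdown, whereas the new check proceeds whenever the force flag is set.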

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] ata-piix: fixes kerneldoc error
Henne [Mon, 25 Sep 2006 20:00:46 +0000 (22:00 +0200)]
[PATCH] ata-piix: fixes kerneldoc error

Fixes an error in kerneldoc of ata_piix.c.
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[SOUND] sparc/amd7930: Use __devinit and __devinitdata as needed.
David S. Miller [Mon, 25 Sep 2006 21:08:37 +0000 (14:08 -0700)]
[SOUND] sparc/amd7930: Use __devinit and __devinitdata as needed.

Fixes section-mismatch errors.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SUNLANCE]: Mark sparc_lance_probe_one as __devinit.
David S. Miller [Mon, 25 Sep 2006 21:04:49 +0000 (14:04 -0700)]
[SUNLANCE]: Mark sparc_lance_probe_one as __devinit.

Fixes section mismatch warnings when built as a module.

Also, mark find_ledma and sun4 init function as __devinit
too.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SPARC64]: Fix section-mismatch errors in solaris emul module.
David S. Miller [Mon, 25 Sep 2006 21:00:45 +0000 (14:00 -0700)]
[SPARC64]: Fix section-mismatch errors in solaris emul module.

init_socksys() was marked __init but invoked from a
non-__init function.

Use the correct module_{init,exit}() facilities while we're
here and eliminate some seriously bogus ifdefs.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: update docs with website information
Paul Moore [Mon, 25 Sep 2006 22:57:13 +0000 (15:57 -0700)]
[NetLabel]: update docs with website information

Now that all of the supporting pieces of NetLabel have a home at SourceForge
update the Kconfig help text and add an entry to the MAINTAINERS file.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: rework the Netlink attribute handling (part 2)
Paul Moore [Mon, 25 Sep 2006 22:56:37 +0000 (15:56 -0700)]
[NetLabel]: rework the Netlink attribute handling (part 2)

At the suggestion of Thomas Graf, rewrite NetLabel's use of Netlink attributes
to better follow the common Netlink attribute usage.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: rework the Netlink attribute handling (part 1)
Paul Moore [Mon, 25 Sep 2006 22:56:09 +0000 (15:56 -0700)]
[NetLabel]: rework the Netlink attribute handling (part 1)

At the suggestion of Thomas Graf, rewrite NetLabel's use of Netlink attributes
to better follow the common Netlink attribute usage.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[Netlink]: add nla_validate_nested()
Paul Moore [Mon, 25 Sep 2006 22:54:03 +0000 (15:54 -0700)]
[Netlink]: add nla_validate_nested()

Add a new function, nla_validate_nested(), to validate nested Netlink
attributes.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETLINK]: add nla_for_each_nested() to the interface list
Paul Moore [Mon, 25 Sep 2006 22:53:37 +0000 (15:53 -0700)]
[NETLINK]: add nla_for_each_nested() to the interface list

At the top of include/net/netlink.h is a list of Netlink interfaces, however,
the nla_for_each_nested() macro was not listed.  This patch adds this interface
to the list at the top of the header file.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: change the SELinux permissions
Paul Moore [Mon, 25 Sep 2006 22:53:13 +0000 (15:53 -0700)]
[NetLabel]: change the SELinux permissions

Change NetLabel to use the 'recvfrom' socket permission and the
SECINITSID_NETMSG SELinux SID as the NetLabel base SID for incoming packets.
This patch effectively makes the old, and currently unused, SELinux NETMSG
permissions NetLabel permissions.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: make the CIPSOv4 cache spinlocks bottom half safe
Paul Moore [Mon, 25 Sep 2006 22:52:37 +0000 (15:52 -0700)]
[NetLabel]: make the CIPSOv4 cache spinlocks bottom half safe

The CIPSOv4 cache traversal routines are triggered by both userspace events
(cache invalidation due to DOI removal or updated SELinux policy) and network
packet processing events.  As a result there is a problem with the existing
CIPSOv4 cache spinlocks as they are not bottom-half/softirq safe.  This patch
converts the CIPSOv4 cache spin_[un]lock() calls into spin_[un]lock_bh() calls
to address this problem.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: correct improper handling of non-NetLabel peer contexts
Paul Moore [Mon, 25 Sep 2006 22:52:01 +0000 (15:52 -0700)]
[NetLabel]: correct improper handling of non-NetLabel peer contexts

Fix a problem where NetLabel would always set the value of
sk_security_struct->peer_sid in selinux_netlbl_sock_graft() to the context of
the socket, causing problems when users would query the context of the
connection.  This patch fixes this so that the value in
sk_security_struct->peer_sid is only set when the connection is NetLabel based,
otherwise the value is untouched.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[libata] No need for all those arch libata-portmap.h headers
Jeff Garzik [Mon, 25 Sep 2006 19:33:09 +0000 (15:33 -0400)]
[libata] No need for all those arch libata-portmap.h headers

They all contain the same thing.  Instead, have a single generic one in
include/asm-generic, and permit an arch to override as needed.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
18 years ago[TCP]: make cubic the default
Stephen Hemminger [Mon, 25 Sep 2006 03:13:03 +0000 (20:13 -0700)]
[TCP]: make cubic the default

Change the default congestion control from BIC to the newer CUBIC,
which is the successor to BIC but has better properties over long delay links.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: default congestion control menu
Stephen Hemminger [Mon, 25 Sep 2006 03:11:58 +0000 (20:11 -0700)]
[TCP]: default congestion control menu

Change how the default TCP congestion control is chosen. Don't just use
the last installed module; instead allow selection during configuration,
and make sure to use the default regardless of load order.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[ATM] he: Fix __init/__devinit conflict
Roland Dreier [Mon, 25 Sep 2006 03:09:33 +0000 (20:09 -0700)]
[ATM] he: Fix __init/__devinit conflict

he_init_one() is declared __devinit, but calls lots of init functions
that are marked __init.  However, if CONFIG_HOTPLUG is enabled,
__devinit functions go into normal .text, which leads to

    WARNING: drivers/atm/he.o - Section mismatch: reference to .init.text: from .text between 'he_start' (at offset 0x2130) and 'he_service_tbrq'

Fix this by changing the __init functions to __devinit.

Signed-off-by: Roland Dreier <roland@digitalvampire.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[PATCH] pata_pdc2027x iomem annotations
Al Viro [Mon, 25 Sep 2006 01:57:57 +0000 (02:57 +0100)]
[PATCH] pata_pdc2027x iomem annotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix idiocy in asd_init_lseq_mdp()
Al Viro [Mon, 25 Sep 2006 01:57:22 +0000 (02:57 +0100)]
[PATCH] fix idiocy in asd_init_lseq_mdp()

To whoever had written that code:

 a) priority of >> is higher than that of &
 b) priority of typecast is higher than that of any binary operator
 c) learn the fscking C
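
Points (a) and (b) can be illustrated with a small standalone example. The expressions below are made up for illustration and are not taken from asd_init_lseq_mdp(); all function names are hypothetical.

```c
#include <stdint.h>

/*
 * (a) `>>` binds tighter than `&`, so `x & 0xFF00 >> 8` is parsed as
 *     `x & (0xFF00 >> 8)`, i.e. `x & 0xFF`.
 */
static uint32_t second_byte_wrong(uint32_t x)
{
    return x & 0xFF00 >> 8;
}

static uint32_t second_byte_right(uint32_t x)
{
    return (x & 0xFF00) >> 8;
}

/*
 * (b) A cast binds tighter than any binary operator, so `(uint16_t)x >> 16`
 *     truncates x to 16 bits first and then shifts every remaining bit out.
 */
static uint32_t upper_half_wrong(uint32_t x)
{
    return (uint16_t)x >> 16;
}

static uint32_t upper_half_right(uint32_t x)
{
    return (uint16_t)(x >> 16);
}
```

For `x = 0x12345678`, the unparenthesised mask yields the low byte `0x78` instead of `0x56`, and the misplaced cast yields `0` instead of `0x1234`.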

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] SCSI gfp_t annotations
Al Viro [Mon, 25 Sep 2006 01:55:40 +0000 (02:55 +0100)]
[PATCH] SCSI gfp_t annotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] wrong thing iounmapped (qla3xxx)
Al Viro [Mon, 25 Sep 2006 01:54:46 +0000 (02:54 +0100)]
[PATCH] wrong thing iounmapped (qla3xxx)

ql3xxx_probe() does ioremap and stores result in ->mem_map_registers.
On failure exit it does iounmap() of the same thing.

OTOH, ql3xxx_remove() does iounmap() of ->mmap_virt_base which is
 (a) never assigned and
 (b) never used other than in that iounmap() call.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] qla3xxx iomem annotations
Al Viro [Mon, 25 Sep 2006 01:53:53 +0000 (02:53 +0100)]
[PATCH] qla3xxx iomem annotations

the driver is still shite, though...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Revert ABI-breaking change in /proc
Matthew Wilcox [Mon, 25 Sep 2006 02:22:52 +0000 (20:22 -0600)]
[PATCH] Revert ABI-breaking change in /proc

Some user tools parse /proc/scsi/scsi, so we can't yet change the names.
Change the existing ones back to their old names, and add an admonition
to not make the same mistake that I did.

Andrew Morton reports that this was breaking YDL 4.1 userspace.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6
David S. Miller [Mon, 25 Sep 2006 02:29:57 +0000 (19:29 -0700)]
Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6

18 years ago[NETFILTER]: Add dscp,DSCP headers to header-y
Yasuyuki Kozakai [Mon, 25 Sep 2006 02:28:47 +0000 (19:28 -0700)]
[NETFILTER]: Add dscp,DSCP headers to header-y

This patch adds xt_dscp.h and xt_DSCP.h to the kernel headers which are
exported via 'make headers_install'. These are necessary for userspace
to add rules using dscp match and DSCP target.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years agoMerge git://git.infradead.org/~dwmw2/khdrs-2.6
Linus Torvalds [Sun, 24 Sep 2006 22:55:22 +0000 (15:55 -0700)]
Merge git://git.infradead.org/~dwmw2/khdrs-2.6

* git://git.infradead.org/~dwmw2/khdrs-2.6:
  Don't remove $(INSTALL_HDR_PATH)/install before headers_install.

18 years ago[PATCH] missing include (free_irq() use)
Al Viro [Sun, 24 Sep 2006 22:45:29 +0000 (23:45 +0100)]
[PATCH] missing include (free_irq() use)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] NULL noise removal
Al Viro [Sun, 24 Sep 2006 22:42:57 +0000 (23:42 +0100)]
[PATCH] NULL noise removal

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix iptables __user misannotations
Al Viro [Sun, 24 Sep 2006 22:42:20 +0000 (23:42 +0100)]
[PATCH] fix iptables __user misannotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] netlabel gfp annotations
Al Viro [Sun, 24 Sep 2006 22:41:42 +0000 (23:41 +0100)]
[PATCH] netlabel gfp annotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] libata won't build on SUN4
Al Viro [Sun, 24 Sep 2006 22:41:00 +0000 (23:41 +0100)]
[PATCH] libata won't build on SUN4

marked as such...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] libata won't build on m68k and m32r
Al Viro [Sun, 24 Sep 2006 22:40:00 +0000 (23:40 +0100)]
[PATCH] libata won't build on m68k and m32r

no ioread*(), for one thing

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] restore libata build on frv
Al Viro [Sun, 24 Sep 2006 22:39:25 +0000 (23:39 +0100)]
[PATCH] restore libata build on frv

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoDon't remove $(INSTALL_HDR_PATH)/install before headers_install.
David Woodhouse [Sun, 24 Sep 2006 22:44:57 +0000 (23:44 +0100)]
Don't remove $(INSTALL_HDR_PATH)/install before headers_install.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
18 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfashe...
Linus Torvalds [Sun, 24 Sep 2006 22:28:50 +0000 (15:28 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (28 commits)
  ocfs2: Teach ocfs2_drop_lock() to use ->set_lvb() callback
  ocfs2: Remove ->unblock lockres operation
  ocfs2: move downconvert worker to lockres ops
  ocfs2: Remove unused dlmglue functions
  ocfs2: Have the metadata lock use generic dlmglue functions
  ocfs2: Add ->set_lvb callback in dlmglue
  ocfs2: Add ->check_downconvert callback in dlmglue
  ocfs2: Check for refreshing locks in generic unblock function
  ocfs2: don't unconditionally pass LVB flags
  ocfs2: combine inode and generic blocking AST functions
  ocfs2: Add ->get_osb() dlmglue locking operation
  ocfs2: remove ->unlock_ast() callback from ocfs2_lock_res_ops
  ocfs2: combine inode and generic AST functions
  ocfs2: Clean up lock resource refresh flags
  ocfs2: Remove i_generation from inode lock names
  ocfs2: Encode i_generation in the meta data lvb
  ocfs2: Free up some space in the lvb
  ocfs2: Remove special casing for inode creation in ocfs2_dentry_attach_lock()
  ocfs2: manually d_move() during ocfs2_rename()
  [PATCH] Allow file systems to manually d_move() inside of ->rename()
  ...

18 years agoMerge git://git.infradead.org/~dwmw2/khdrs-2.6
Linus Torvalds [Sun, 24 Sep 2006 21:55:52 +0000 (14:55 -0700)]
Merge git://git.infradead.org/~dwmw2/khdrs-2.6

* git://git.infradead.org/~dwmw2/khdrs-2.6:
  New 'make headers_install_all' target.
  Use dependencies for 'make headers_install'.
  [S390] Unexport <asm/z90crypt.h>, export <asm/zcrypt.h> in its place.
  Remove dead netfilter_logging.h from include/linux/Kbuild
  Remove offsetof() from user-visible <linux/stddef.h>
  Clean up exported headers on CRIS
  Fix v850 exported headers
  Don't advertise (or allow) headers_{install,check} where inappropriate.
  Remove UML header export
  Remove ARM26 header export.
  Fix H8300 exported headers.
  Fix m68knommu exported headers
  Fix exported headers for SPARC, SPARC64
  Fix 'make headers_check' on m32r
  Fix 'make headers_check' on sh64
  Fix 'make headers_check' on sh
  [HEADERS] Fix ARM 'make headers_check'

Initial pass of manual conflict resolution in top-level Makefile over
conflicting build rule and headers_install changes.

18 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild
Linus Torvalds [Sun, 24 Sep 2006 21:24:14 +0000 (14:24 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild

* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (28 commits)
  kbuild: add distclean info to 'make help' and more details for 'clean'
  dontdiff: add utsrelease.h
  kbuild: fix "mkdir -p" usage in scripts/package/mkspec
  kbuild: correct and clarify versioning info in Makefile
  kbuild: fixup Documentation/kbuild/modules.txt
  kbuild: Extend kbuild/defconfig tags support to exuberant ctags
  kbuild: fix for some typos in Documentation/makefiles.txt
  kbuild: clarify "make C=" build option
  Documentaion: update Documentation/Changes with minimum versions
  kbuild: update help in top level Makefile
  kbuild: fail kernel compilation in case of unresolved module symbols
  kbuild: remove debug left-over from Makefile.host
  kbuild: create output directory for hostprogs with O=.. build
  kbuild: add missing return statement in modpost.c:secref_whitelist()
  kbuild: preperly align SYSMAP output
  kbuild: make -rR is now default
  kbuild: make V=2 tell why a target is rebuild
  kbuild: modpost on vmlinux regardless of CONFIG_MODULES
  kbuild: ignore references from ".pci_fixup" to ".init.text"
  kbuild: linguistic fixes for Documentation/kbuild/makefiles.txt
  ...

18 years agokbuild: add distclean info to 'make help' and more details for 'clean'
Jesper Juhl [Sun, 24 Sep 2006 12:01:08 +0000 (14:01 +0200)]
kbuild: add distclean info to 'make help' and more details for 'clean'

Add distclean info, that was previously missing, to 'make help'.
Also add a few more details to the 'make clean' help text.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agodontdiff: add utsrelease.h
Randy Dunlap [Fri, 22 Sep 2006 19:37:56 +0000 (12:37 -0700)]
dontdiff: add utsrelease.h

Add auto-generated utsrelease.h to dontdiff file.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: fix "mkdir -p" usage in scripts/package/mkspec
Rolf Eike Beer [Mon, 14 Aug 2006 06:16:47 +0000 (08:16 +0200)]
kbuild: fix "mkdir -p" usage in scripts/package/mkspec

"mkdir -p" does not only mean not to complain if the directory already
exists, but also to create the parent directories if needed. This patch
removes "lib" from the list of directories to create as we will also create
"lib/modules".

Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: correct and clarify versioning info in Makefile
Robert P. J. Day [Tue, 12 Sep 2006 16:38:19 +0000 (12:38 -0400)]
kbuild: correct and clarify versioning info in Makefile

The attached patch clarifies the creation of KERNELRELEASE and
corrects an error regarding the use of $(LOCALVERSION).

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: fixup Documentation/kbuild/modules.txt
Robert P. J. Day [Thu, 21 Sep 2006 13:39:41 +0000 (09:39 -0400)]
kbuild: fixup Documentation/kbuild/modules.txt

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: Extend kbuild/defconfig tags support to exuberant ctags
Aron Griffis [Thu, 21 Sep 2006 04:27:02 +0000 (00:27 -0400)]
kbuild: Extend kbuild/defconfig tags support to exuberant ctags

The following patch extends kbuild/defconfig tags support to exuberant
ctags.  The previous support is only for emacs ctags/etags programs.

This patch also corrects the kconfig regex for the emacs invocation.
Previously it would miss some instances because it assumed /^config
instead of /^[ \t]*config
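
The difference between the two anchors can be checked with POSIX regcomp()/regexec(). This is only a sketch of the matching behaviour: the helper `line_matches` is hypothetical, and the full expression passed to etags in the Makefile is not reproduced here.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if `line` matches the POSIX extended regex `pattern`,
 * 0 if it does not, and -1 if the pattern fails to compile.
 */
static int line_matches(const char *pattern, const char *line)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Calling `line_matches("^config", "\tconfig FOO")` reports no match because of the leading tab, while `line_matches("^[ \t]*config", "\tconfig FOO")` matches; both patterns match an unindented `config FOO` line.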

Signed-off-by: Aron Griffis <aron@hp.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: fix for some typos in Documentation/makefiles.txt
Bryce Harrington [Wed, 20 Sep 2006 19:43:37 +0000 (12:43 -0700)]
kbuild: fix for some typos in Documentation/makefiles.txt

I noticed a few typos while reading makefiles.txt to learn about the
kbuild system.  Attached is a patch against 2.6.18 to fix them.
Remove trailing whitespace while we are there.

Signed-off-by: Bryce Harrington <bryce@osdl.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: clarify "make C=" build option
Robert P. J. Day [Wed, 13 Sep 2006 11:57:50 +0000 (07:57 -0400)]
kbuild: clarify "make C=" build option

Clarify the use of "make C=" in the top-level Makefile, and fix a
typo in the Documentation file.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agoDocumentaion: update Documentation/Changes with minimum versions
Robert P. J. Day [Mon, 11 Sep 2006 16:39:19 +0000 (12:39 -0400)]
Documentaion: update Documentation/Changes with minimum versions

Based on conversations with greg kh (and noticing a simple typo),
these are the actual minimal versions for 2.6.18.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: update help in top level Makefile
Robert P. J. Day [Mon, 11 Sep 2006 16:09:42 +0000 (12:09 -0400)]
kbuild: update help in top level Makefile

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: fail kernel compilation in case of unresolved module symbols
Kirill Korotaev [Thu, 7 Sep 2006 20:08:54 +0000 (13:08 -0700)]
kbuild: fail kernel compilation in case of unresolved module symbols

At stage 2 the modpost utility is used to check modules.  In case of
unresolved symbols modpost only prints a warning.

IMHO it is a good idea to fail the compilation process in case of unresolved
symbols (at least in modules coming with the kernel), since usually such
errors go unnoticed, but the kernel modules are broken.

- new option '-w' is added to modpost:
  if option is specified, modpost only warns about unresolved symbols

- modpost is called with '-w' for external modules in Makefile.modpost
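
The resulting policy can be summarised in a few lines. This is a hypothetical model of the exit-status decision only; `modpost_exit_status` and its parameters are illustrative names, not code from scripts/mod/modpost.c.

```c
/*
 * Hypothetical model: unresolved symbols fail the build unless the
 * caller asked for warnings only (the '-w' case described above).
 */
static int modpost_exit_status(int unresolved_symbols, int warn_only)
{
    if (unresolved_symbols > 0 && !warn_only)
        return 1;   /* non-zero exit status stops the build */
    return 0;       /* clean, or warnings only */
}
```

Under this model in-tree modules with unresolved symbols stop the build, while external modules, for which Makefile.modpost passes '-w', still only produce warnings.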

Signed-off-by: Andrey Mirkin <amirkin@sw.ru>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: remove debug left-over from Makefile.host
Sam Ravnborg [Tue, 8 Aug 2006 14:45:41 +0000 (16:45 +0200)]
kbuild: remove debug left-over from Makefile.host

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: create output directory for hostprogs with O=.. build
Sam Ravnborg [Mon, 7 Aug 2006 19:55:33 +0000 (21:55 +0200)]
kbuild: create output directory for hostprogs with O=.. build

hostprogs-y only supported creating the output directory for the final
program. Extend this to also cover the situation where a .o
file (used when a host program is made from composite objects) is
located in another directory.
The first user of this is the built-in lxdialog.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: add missing return statement in modpost.c:secref_whitelist()
Sam Ravnborg [Wed, 9 Aug 2006 06:23:55 +0000 (08:23 +0200)]
kbuild: add missing return statement in modpost.c:secref_whitelist()

Noticed by: Magnus Damm <magnus@valinux.co.jp>

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: preperly align SYSMAP output
Sam Ravnborg [Tue, 8 Aug 2006 19:41:18 +0000 (21:41 +0200)]
kbuild: preperly align SYSMAP output

Align filenames for SYSMAP with other filenames

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
18 years agokbuild: make -rR is now default
Sam Ravnborg [Tue, 8 Aug 2006 19:36:08 +0000 (21:36 +0200)]
kbuild: make -rR is now default

Do not specify -rR anymore - it is default.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>