GitHub/MotorolaMobilityLLC/kernel-slsi.git
14 years agopowerpc/cpumask: Convert NUMA code to new cpumask API
Anton Blanchard [Mon, 26 Apr 2010 15:32:43 +0000 (15:32 +0000)]
powerpc/cpumask: Convert NUMA code to new cpumask API

Convert NUMA code to new cpumask API. We shift the node to cpumask
setup code until after we complete bootmem allocation so we can
dynamically allocate the cpumasks.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/cpumask: Convert hotplug-cpu code to new cpumask API
Anton Blanchard [Mon, 26 Apr 2010 15:32:42 +0000 (15:32 +0000)]
powerpc/cpumask: Convert hotplug-cpu code to new cpumask API

Convert hotplug-cpu code to new cpumask API.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/cpumask: Dynamically allocate cpu_sibling_map and cpu_core_map cpumasks
Anton Blanchard [Mon, 26 Apr 2010 15:32:41 +0000 (15:32 +0000)]
powerpc/cpumask: Dynamically allocate cpu_sibling_map and cpu_core_map cpumasks

Dynamically allocate cpu_sibling_map and cpu_core_map cpumasks.

We don't need to set_cpu_online() the boot cpu in smp_prepare_boot_cpu,
init/main.c does it for us.

We also postpone setting of the boot cpu in cpu_sibling_map and cpu_core_map
until when the memory allocator is available (smp_prepare_cpus), similar
to x86.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/cpumask: Convert /proc/cpuinfo to new cpumask API
Anton Blanchard [Mon, 26 Apr 2010 15:32:40 +0000 (15:32 +0000)]
powerpc/cpumask: Convert /proc/cpuinfo to new cpumask API

Use new cpumask API in /proc/cpuinfo code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/cpumask: Refactor /proc/cpuinfo code
Anton Blanchard [Mon, 26 Apr 2010 15:32:39 +0000 (15:32 +0000)]
powerpc/cpumask: Refactor /proc/cpuinfo code

This separates the per cpu output from the summary output at the end of the
file, making it easier to convert to the new cpumask API in a subsequent
patch.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/cpumask: Convert xics driver to new cpumask API
Anton Blanchard [Mon, 26 Apr 2010 15:32:38 +0000 (15:32 +0000)]
powerpc/cpumask: Convert xics driver to new cpumask API

Use the new cpumask API and add some comments to clarify how get_irq_server
works.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/cpumask: Convert pseries SMP code to new cpumask API
Anton Blanchard [Mon, 26 Apr 2010 15:32:37 +0000 (15:32 +0000)]
powerpc/cpumask: Convert pseries SMP code to new cpumask API

Use new cpumask functions in pseries SMP startup code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/cpumask: Convert iseries SMP code to new cpumask API
Anton Blanchard [Mon, 26 Apr 2010 15:32:36 +0000 (15:32 +0000)]
powerpc/cpumask: Convert iseries SMP code to new cpumask API

Use new cpumask functions in iseries SMP startup code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/cpumask: Convert fixup_irqs to new cpumask API
Anton Blanchard [Mon, 26 Apr 2010 15:32:35 +0000 (15:32 +0000)]
powerpc/cpumask: Convert fixup_irqs to new cpumask API

Use new cpumask_* functions, and dynamically allocate cpumask in fixup_irqs.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/cpumask: Convert smp_cpus_done to new cpumask API
Anton Blanchard [Mon, 26 Apr 2010 15:32:34 +0000 (15:32 +0000)]
powerpc/cpumask: Convert smp_cpus_done to new cpumask API

Use the new cpumask_* functions and dynamically allocate the cpumask in
smp_cpus_done.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/cpumask: Convert rtasd to new cpumask API
Anton Blanchard [Mon, 26 Apr 2010 15:32:33 +0000 (15:32 +0000)]
powerpc/cpumask: Convert rtasd to new cpumask API

Use cpumask_first, cpumask_next in rtasd code.
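
The first/next iteration pattern can be illustrated with a user-space analogue. This is only a sketch: `mask_first()`, `mask_next()` and `next_cpu_wrap()` are hypothetical stand-ins operating on a 32-bit word, not the kernel's cpumask functions, and the wrap-around walk is an assumption about how such helpers are typically combined.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 32u  /* assumed size of the mask in this sketch */

/* Next set bit strictly after n, or NR_CPUS if there is none. */
static unsigned int mask_next(int n, uint32_t mask)
{
    unsigned int cpu;

    assert(n >= -1);
    for (cpu = (unsigned int)(n + 1); cpu < NR_CPUS; cpu++)
        if (mask & ((uint32_t)1 << cpu))
            return cpu;
    return NR_CPUS;
}

/* First set bit, or NR_CPUS if the mask is empty. */
static unsigned int mask_first(uint32_t mask)
{
    return mask_next(-1, mask);
}

/* Walk to the following online CPU, wrapping to the first one at the end. */
static unsigned int next_cpu_wrap(unsigned int cpu, uint32_t online)
{
    unsigned int next = mask_next((int)cpu, online);

    if (next >= NR_CPUS)
        next = mask_first(online);
    return next;
}
```

With CPUs 0, 2 and 5 online (mask `0x25`), stepping from CPU 5 wraps back to CPU 0.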

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/cpumask: Use cpu_online_mask
Anton Blanchard [Mon, 26 Apr 2010 15:32:32 +0000 (15:32 +0000)]
powerpc/cpumask: Use cpu_online_mask

Change &cpu_online_map to cpu_online_mask.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Invoke oom-killer from page fault
Benjamin Herrenschmidt [Thu, 6 May 2010 07:15:58 +0000 (17:15 +1000)]
powerpc: Invoke oom-killer from page fault

As explained in commit 1c0fe6e3bd, we want to call the architecture independent
oom killer when getting an unexplained OOM from handle_mm_fault, rather than
simply killing current.

Cc: linuxppc-dev@ozlabs.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/mm: Track backing pages allocated by vmemmap_populate()
Mark Nelson [Wed, 21 Apr 2010 16:21:03 +0000 (16:21 +0000)]
powerpc/mm: Track backing pages allocated by vmemmap_populate()

We need to keep track of the backing pages that get allocated by
vmemmap_populate() so that when we use kdump, the dump-capture kernel knows
where these pages are.

We use a simple linked list of structures that contain the physical address
of the backing page and corresponding virtual address to track the backing
pages.
To save space, we just use a pointer to the next struct vmemmap_backing. We
can also do this because we never remove nodes.  We call the pointer "list"
to be compatible with changes made to the crash utility.

vmemmap_populate() is called either at boot-time or on a memory hotplug
operation. We don't have to worry about the boot-time calls because they
will be inherently single-threaded, and for a memory hotplug operation
vmemmap_populate() is called through:
sparse_add_one_section()
            |
            V
kmalloc_section_memmap()
            |
            V
sparse_mem_map_populate()
            |
            V
vmemmap_populate()
and in sparse_add_one_section() we're protected by pgdat_resize_lock().
So, we don't need a spinlock to protect the vmemmap_list.

We allocate space for the vmemmap_backing structs by allocating whole pages
in vmemmap_list_alloc() and then handing out chunks of this to
vmemmap_list_populate().

This means that we waste at most just under one page, but this keeps the code
simple.
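
The scheme described above can be sketched in user-space C. This is a minimal illustration rather than the kernel code: the field names other than `list`, the `PAGE_SIZE` value, and the use of `malloc()` in place of a whole-page allocator are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE 4096  /* assumed page size for this sketch */

/* One record per backing page; "list" points to the next record. */
struct vmemmap_backing {
    struct vmemmap_backing *list;
    unsigned long phys;       /* physical address of the backing page */
    unsigned long virt_addr;  /* virtual address it is mapped at */
};

static struct vmemmap_backing *vmemmap_list;  /* head of the list */
static struct vmemmap_backing *next_chunk;    /* next free record in the current page */
static size_t chunks_left;                    /* records left in the current page */

/* Hand out one record, grabbing a fresh page when the current one is used up. */
static struct vmemmap_backing *vmemmap_list_alloc(void)
{
    assert(sizeof(struct vmemmap_backing) <= PAGE_SIZE);
    if (chunks_left == 0) {
        next_chunk = malloc(PAGE_SIZE);  /* stand-in for a whole-page allocation */
        if (next_chunk == NULL)
            return NULL;
        chunks_left = PAGE_SIZE / sizeof(struct vmemmap_backing);
    }
    chunks_left--;
    return next_chunk++;
}

/* Record one backing page; entries are only ever added, never removed. */
static int vmemmap_list_populate(unsigned long phys, unsigned long start)
{
    struct vmemmap_backing *vmem_back = vmemmap_list_alloc();

    if (vmem_back == NULL)
        return -1;
    vmem_back->phys = phys;
    vmem_back->virt_addr = start;
    vmem_back->list = vmemmap_list;  /* push onto the head of the list */
    vmemmap_list = vmem_back;
    return 0;
}
```

Because records are never freed individually, a bare next pointer is enough and no doubly-linked list is needed.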

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agoi2c/ibm-iic: Drop NO_IRQ
Wolfram Sang [Thu, 1 Apr 2010 14:17:01 +0000 (14:17 +0000)]
i2c/ibm-iic: Drop NO_IRQ

Drop NO_IRQ as 0 is the preferred way to describe 'no irq'
(http://lkml.org/lkml/2005/11/21/221). This change is safe, as the driver is
only used on powerpc, where NO_IRQ is 0 anyhow.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Sean MacLennan <smaclennan@pikatech.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Acked-by: Sean MacLennan <smaclennan@pikatech.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agoi2c/cpm: Drop NO_IRQ
Wolfram Sang [Thu, 1 Apr 2010 14:17:00 +0000 (14:17 +0000)]
i2c/cpm: Drop NO_IRQ

Drop NO_IRQ as 0 is the preferred way to describe 'no irq'
(http://lkml.org/lkml/2005/11/21/221). This change is safe, as the driver is
only used on powerpc, where NO_IRQ is 0 anyhow.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Jochen Friedrich <jochen@scram.de>
Cc: Ben Dooks <ben-linux@fluff.org>
Acked-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agoi2c/mpc: Drop NO_IRQ
Wolfram Sang [Thu, 1 Apr 2010 14:16:59 +0000 (14:16 +0000)]
i2c/mpc: Drop NO_IRQ

Drop NO_IRQ as 0 is the preferred way to describe 'no irq'
(http://lkml.org/lkml/2005/11/21/221). This change is safe, as the driver is
only used on powerpc, where NO_IRQ is 0 anyhow.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Correct parport interrupt parsing
Martyn Welch [Mon, 26 Apr 2010 22:50:21 +0000 (22:50 +0000)]
powerpc: Correct parport interrupt parsing

Currently the parsing of the device tree in
arch/powerpc/include/asm/parport.h assumes that the interrupt provided in
the parallel port node is a valid virtual irq. The values for the
interrupts provided in the device tree should have meaning in the context
of the driver for the specific interrupt controller to which the interrupt
is connected and irq_of_parse_and_map() should be used to determine the
correct virtual irq.

Signed-off-by: Martyn Welch <martyn.welch@ge.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Fix CONFIG_DEBUG_PAGEALLOC on 603/e300
Benjamin Herrenschmidt [Tue, 27 Apr 2010 21:22:55 +0000 (21:22 +0000)]
powerpc: Fix CONFIG_DEBUG_PAGEALLOC on 603/e300

So we tried to speed things up a bit using flush_hash_pages() directly
but that falls over on 603 of course meaning we fail to flush the TLB
properly and we may even end up having it corrupt memory randomly by
accessing a hash table that doesn't exist.

This removes the "optimization" by always going through flush_tlb_page()
for now at least.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pseries: Only call start-cpu when a CPU is stopped
Michael Neuling [Wed, 28 Apr 2010 13:39:41 +0000 (13:39 +0000)]
powerpc/pseries: Only call start-cpu when a CPU is stopped

Currently we always call start-cpu regardless of whether the CPU is
stopped or not. Unfortunately, on POWER7 firmware does not seem to like
start-cpu being called when a CPU has already been started.  This was not
the case on POWER6 and earlier.

This patch checks to see if the CPU is stopped or not via an
query-cpu-stopped-state call, and only calls start-cpu on CPUs which
are stopped.

This fixes a bug with kexec on POWER7 on PHYP where only the primary
thread would make it to the second kernel.

Reported-by: Ankita Garg <ankita@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pseries: Make query_cpu_stopped callable outside hotplug cpu
Michael Neuling [Wed, 28 Apr 2010 13:39:41 +0000 (13:39 +0000)]
powerpc/pseries: Make query_cpu_stopped callable outside hotplug cpu

This moves query_cpu_stopped() out of the hotplug cpu code and into
smp.c so it can be called in other places, and renames it to
smp_query_cpu_stopped().

It also cleans up the return values by adding some #defines.

Cc: <stable@kernel.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/4xx: Add optional "reset_type" property to control reboot via dts
Stefan Roese [Tue, 27 Apr 2010 22:13:34 +0000 (22:13 +0000)]
powerpc/4xx: Add optional "reset_type" property to control reboot via dts

By setting "reset_type" to one of the following values, the default
software reset mechanism may be overridden. Here are the possible values of
"reset_type":

  1 - PPC4xx core reset
  2 - PPC4xx chip reset
  3 - PPC4xx system reset (default)

This will be used by a new PPC440SPe board port, which needs a "chip
reset" instead of the default "system reset" to be asserted.

Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
14 years agopowerpc/47x: defconfig for 476 on the iss 4xx simulator
Dave Kleikamp [Fri, 5 Mar 2010 10:43:35 +0000 (10:43 +0000)]
powerpc/47x: defconfig for 476 on the iss 4xx simulator

A defconfig for the IBM ISS 476 simulator

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
14 years agopowerpc/4xx: Simple platform for the ISS 4xx simulator
Torez Smith [Fri, 5 Mar 2010 10:45:54 +0000 (10:45 +0000)]
powerpc/4xx: Simple platform for the ISS 4xx simulator

This is a trivial 4xx platform that uses the new simple bsp from
Josh and is handy to use in simulators such as ISS or even Mambo,
which don't properly implement most of the actual devices in the
SoC but really only the core.

Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
14 years agopowerpc/476: Add isync after loading mmu and debug spr's
Dave Kleikamp [Fri, 5 Mar 2010 10:43:24 +0000 (10:43 +0000)]
powerpc/476: Add isync after loading mmu and debug spr's

476 requires an isync after loading MMU and debug related SPR's.  Some of
these are in performance-critical paths and may need to be optimized, but
initially, we're playing it safe.

Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
14 years agopowerpc/476: add machine check handler for 47x core
Dave Kleikamp [Fri, 5 Mar 2010 03:43:18 +0000 (03:43 +0000)]
powerpc/476: add machine check handler for 47x core

The 47x core's MCSR varies from 44x, so it needs its own machine check
handler.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
14 years agopowerpc/47x: Base ppc476 support
Dave Kleikamp [Fri, 5 Mar 2010 10:43:12 +0000 (10:43 +0000)]
powerpc/47x: Base ppc476 support

This patch adds the base support for the 476 processor.  The code was
primarily written by Ben Herrenschmidt and Torez Smith, but I've been
maintaining it for a while.

The goal is to have a single binary that will run on 44x and 47x, but
we still have some details to work out.  The biggest is that the L1 cache
line size differs on the two platforms, but it's currently a compile-time
option.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
14 years agopowerpc/44x: break out cpu init code into stand-alone function
Dave Kleikamp [Fri, 5 Mar 2010 10:43:07 +0000 (10:43 +0000)]
powerpc/44x: break out cpu init code into stand-alone function

The 47x platform supports multiple cores and shares code with 44x.
Break out code that is common for initializing the primary and secondary
cpus into a function which can be called for both.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
14 years agopowerpc/booke: Add Stack Marking support to Booke Exception Prolog
Torez Smith [Fri, 5 Mar 2010 10:43:01 +0000 (10:43 +0000)]
powerpc/booke: Add Stack Marking support to Booke Exception Prolog

This patch adds a marker to the exception stack frame to aid in debugging.
It's already inserted on other platforms and xmon recognizes it and
identifies exception frames when showing stack traces.

Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
14 years agoRevert "powerpc/mm: Bump SECTION_SIZE_BITS from 16MB to 256MB"
Benjamin Herrenschmidt [Tue, 13 Apr 2010 03:54:39 +0000 (13:54 +1000)]
Revert "powerpc/mm: Bump SECTION_SIZE_BITS from 16MB to 256MB"

This reverts commit 7545ba6f82924d4523f8f8a2baf2e517a750265d.

It breaks eHEA among other issues

14 years agopowerpc: Add kprobe-based event tracer
Mahesh Salgaonkar [Wed, 7 Apr 2010 08:10:20 +0000 (18:10 +1000)]
powerpc: Add kprobe-based event tracer

This patch ports the kprobe-based event tracer to powerpc. It is based
on the x86 port and brings powerpc on par with x86.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/vio: Add power management support
Benjamin Herrenschmidt [Wed, 7 Apr 2010 08:09:15 +0000 (18:09 +1000)]
powerpc/vio: Add power management support

Adds support for suspend/resume for VIO devices. This is needed to
support HMC-initiated hibernation.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/iseries/pci: Use __ratelimit
Akinobu Mita [Sun, 28 Feb 2010 00:58:16 +0000 (00:58 +0000)]
powerpc/iseries/pci: Use __ratelimit

Replace open-coded rate limiting logic with __ratelimit().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/mm: Bump SECTION_SIZE_BITS from 16MB to 256MB
Anton Blanchard [Thu, 25 Feb 2010 20:18:46 +0000 (20:18 +0000)]
powerpc/mm: Bump SECTION_SIZE_BITS from 16MB to 256MB

The current setting for SECTION_SIZE_BITS is quite small compared to
everyone else:

arch/powerpc/include/asm/sparsemem.h:#define SECTION_SIZE_BITS  24

arch/sparc/include/asm/sparsemem.h:#define SECTION_SIZE_BITS    30
arch/ia64/include/asm/sparsemem.h:#define SECTION_SIZE_BITS     (30)
arch/s390/include/asm/sparsemem.h:#define SECTION_SIZE_BITS     28
arch/x86/include/asm/sparsemem.h:# define SECTION_SIZE_BITS     27

And it has proven to be an issue during boot on very large machines.
If hotplug memory is enabled, drivers/base/memory.c does this:

       for (i = 0; i < NR_MEM_SECTIONS; i++) {
                if (!present_section_nr(i))
                        continue;
                err = add_memory_block(0, __nr_to_section(i), MEM_ONLINE,
                                        0, BOOT);
                if (!ret)
                        ret = err;
        }

Which creates a sysfs directory for every 16MB of memory. As a result
I'm seeing up to 30 minutes spent here during boot:

c000000000248ee0 .__sysfs_add_one+0x28/0x128
c0000000002492a8 .sysfs_add_one+0x38/0x188
c000000000249c88 .create_dir+0x70/0x138
c000000000249d98 .sysfs_create_dir+0x48/0x78
c00000000032bad8 .kobject_add_internal+0x140/0x308
c00000000032beb4 .kobject_init_and_add+0x4c/0x68
c00000000046c2c0 .sysdev_register+0xa0/0x220
c00000000047b1dc .add_memory_block+0x124/0x1e8
c0000000008d1f28 .memory_dev_init+0xf4/0x168
c0000000008d1b64 .driver_init+0x50/0x64
c000000000890378 .do_basic_setup+0x40/0xd4

I assume there are some O(n^2) issues in sysfs as we add all the memory
nodes. Bumping SECTION_SIZE_BITS to 256 MB drops the time to about 10
seconds and results in a much smaller /sys.
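
The effect on the number of sections, and so on the number of sysfs directories, is simple arithmetic. The helper below is a hypothetical illustration, and the 1 TB memory size used in the note after it is an assumed example, not a figure from this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Number of sparsemem sections needed to cover mem_bytes of memory
 * when each section spans 2^section_size_bits bytes (rounded up). */
static uint64_t nr_sections(uint64_t mem_bytes, unsigned int section_size_bits)
{
    uint64_t section_size;

    assert(section_size_bits < 64);
    section_size = (uint64_t)1 << section_size_bits;
    return (mem_bytes + section_size - 1) / section_size;
}
```

For 1 TB of memory, `nr_sections((uint64_t)1 << 40, 24)` gives 65536 sections at 16MB each, while `nr_sections((uint64_t)1 << 40, 28)` gives 4096 at 256MB each, a factor of 16 fewer objects to register.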

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pseries: Call ibm,os-term if the ibm,extended-os-term is present
Anton Blanchard [Thu, 18 Feb 2010 12:11:51 +0000 (12:11 +0000)]
powerpc/pseries: Call ibm,os-term if the ibm,extended-os-term is present

We have had issues in the past with ibm,os-term initiating shutdown of a
partition. This is confusing to the user, especially if panic_timeout is
non zero.

The temporary fix was to avoid calling ibm,os-term if a panic_timeout was set
and since we set it on every boot we basically never call ibm,os-term.

An extended version of ibm,os-term has since been implemented which gives us
the behaviour we want:

  "When the platform supports extended ibm,os-term behavior, the return to the
  RTAS will always occur unless there is a kernel assisted dump active as
  initiated by an ibm,configure-kernel-dump call."

This patch checks for the ibm,extended-os-term property and calls ibm,os-term
if it exists.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/numa: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim
Anton Blanchard [Thu, 18 Feb 2010 12:29:23 +0000 (12:29 +0000)]
powerpc/numa: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim

I noticed /proc/sys/vm/zone_reclaim_mode was 0 on a ppc64 NUMA box. It gets
enabled via this:

        /*
         * If another node is sufficiently far away then it is better
         * to reclaim pages in a zone before going off node.
         */
        if (distance > RECLAIM_DISTANCE)
                zone_reclaim_mode = 1;

Since we use the default value of 20 for REMOTE_DISTANCE and 20 for
RECLAIM_DISTANCE it never kicks in.

The local to remote bandwidth ratios can be quite large on System p
machines so it makes sense for us to reclaim clean pagecache locally before
going off node.

The patch below sets a smaller value for RECLAIM_DISTANCE and thus enables
zone reclaim.
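
The comparison can be checked in isolation. In this sketch `zone_reclaim_enabled()` is a hypothetical helper that mirrors the quoted test, and the smaller value of 10 used in the note after it is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define REMOTE_DISTANCE 20  /* default distance to a remote node, as stated above */

/* Mirrors the quoted check: zone reclaim turns on only if some node
 * is strictly farther away than reclaim_distance. */
static bool zone_reclaim_enabled(int distance, int reclaim_distance)
{
    assert(distance > 0 && reclaim_distance > 0);
    return distance > reclaim_distance;
}
```

`zone_reclaim_enabled(REMOTE_DISTANCE, 20)` is false because the test is a strict greater-than, whereas `zone_reclaim_enabled(REMOTE_DISTANCE, 10)` is true.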

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pmac: Fix dangling pointers
Wolfram Sang [Sat, 20 Mar 2010 04:12:50 +0000 (04:12 +0000)]
powerpc/pmac: Fix dangling pointers

Fix I2C-drivers which missed setting clientdata to NULL before freeing the
structure it points to. Also fix drivers which do this _after_ the structure
was freed already.
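
The ordering this fix enforces can be shown with a user-space sketch. `struct client` and `client_release()` are hypothetical stand-ins for the I2C client and its teardown path, and `free()` stands in for `kfree()`.

```c
#include <assert.h>
#include <stdlib.h>

struct client {
    void *clientdata;  /* driver-private state attached to the client */
};

/* Detach and free the private state, clearing the pointer first so that
 * nothing can reach the structure through the client once it is freed. */
static void client_release(struct client *c)
{
    void *priv;

    assert(c != NULL);
    priv = c->clientdata;
    c->clientdata = NULL;
    free(priv);  /* free(NULL) is a no-op, so an empty client is fine */
}
```

Calling `client_release()` twice on the same client is harmless here, because the second call sees a NULL pointer.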

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Colin Leroy <colin@colino.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Use set_cpus_allowed_ptr
Julia Lawall [Fri, 26 Mar 2010 12:03:29 +0000 (12:03 +0000)]
powerpc: Use set_cpus_allowed_ptr

Use set_cpus_allowed_ptr rather than set_cpus_allowed.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression E1,E2;
@@

- set_cpus_allowed(E1, cpumask_of_cpu(E2))
+ set_cpus_allowed_ptr(E1, cpumask_of(E2))

@@
expression E;
identifier I;
@@

- set_cpus_allowed(E, I)
+ set_cpus_allowed_ptr(E, &I)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pmac: Add missing unlocks in error path
Julia Lawall [Mon, 29 Mar 2010 05:34:46 +0000 (05:34 +0000)]
powerpc/pmac: Add missing unlocks in error path

In some error handling cases the lock is not unlocked.

A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
expression E1;
identifier f;
@@

f (...) { <+...
* spin_lock_irqsave (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/vio: Add missing unlock in error path
Julia Lawall [Mon, 29 Mar 2010 05:33:34 +0000 (05:33 +0000)]
powerpc/vio: Add missing unlock in error path

Add an unlock before exiting the function.

A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
expression E1;
identifier f;
@@

f (...) { <+...
* spin_lock_irq (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pseries/dlpar: Use kasprintf
Julia Lawall [Wed, 10 Mar 2010 11:15:01 +0000 (11:15 +0000)]
powerpc/pseries/dlpar: Use kasprintf

kasprintf combines kmalloc and sprintf, and takes care of the size
calculation itself.
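
What "takes care of the size calculation itself" means can be seen in a user-space analogue. `xasprintf()` is a hypothetical name; it is built on `vsnprintf()` and `malloc()` and is only a sketch of the idea, not the kernel's implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* User-space analogue of kasprintf(): measure, allocate, then format.
 * Returns a malloc()ed string the caller must free(), or NULL on failure. */
static char *xasprintf(const char *fmt, ...)
{
    va_list ap, ap2;
    char *buf;
    int len;

    assert(fmt != NULL);
    va_start(ap, fmt);
    va_copy(ap2, ap);
    len = vsnprintf(NULL, 0, fmt, ap);  /* first pass: compute the length */
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return NULL;
    }
    buf = malloc((size_t)len + 1);
    if (buf != NULL)
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);  /* second pass: format */
    va_end(ap2);
    return buf;
}
```

The caller no longer has to work out a buffer size by hand, which removes the mismatch risk between a separate allocation and the later `sprintf()`.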

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression a,flag;
expression list args;
statement S;
@@

  a =
-  \(kmalloc\|kzalloc\)(...,flag)
+  kasprintf(flag,args)
  <... when != a
  if (a == NULL || ...) S
  ...>
- sprintf(a,args);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pseries/dlpar: Eliminate use after free
Julia Lawall [Fri, 2 Apr 2010 02:47:13 +0000 (02:47 +0000)]
powerpc/pseries/dlpar: Eliminate use after free

dlpar_free_cc_nodes frees its argument, so dlpar_online_cpu should not be
called on the same value.  Skip over the call to dlpar_online_cpu by
jumping directly to out.

A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression E,E2;
@@

dlpar_free_cc_nodes(E)
...
(
  E = E2
|
* E
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pmac/windfarm: Correct potential double free
Julia Lawall [Sun, 28 Mar 2010 23:39:22 +0000 (23:39 +0000)]
powerpc/pmac/windfarm: Correct potential double free

The conditionals were testing different values, but then all freeing the
same one, which could result in a double free.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression x,e;
identifier f;
iterator I;
statement S;
@@

*kfree(x);
... when != &x
    when != x = e
    when != I(x,...) S
*x
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Fix handling of strncmp with zero len
Jeff Mahoney [Wed, 17 Mar 2010 10:55:51 +0000 (10:55 +0000)]
powerpc: Fix handling of strncmp with zero len

Commit 0119536c, which added the assembly version of strncmp to
powerpc, mentions that it adds two instructions to the version from
boot/string.S to allow it to handle len=0. Unfortunately, it doesn't
always return 0 when that is the case. The length is passed in r5, but
the return value is passed back in r3. In certain cases, this will
happen to work. Otherwise it will pass back the address of the first
string as the return value.

This patch lifts the len <= 0 handling code from memcpy to handle that
case.
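
For reference, the expected behaviour for a zero length can be stated in C. `ref_strncmp()` is a hypothetical reference version written for this illustration, not the assembly routine being patched.

```c
#include <assert.h>
#include <stddef.h>

/* Reference strncmp in C: the n == 0 case returns 0 before any byte is read. */
static int ref_strncmp(const char *a, const char *b, size_t n)
{
    assert(n == 0 || (a != NULL && b != NULL));
    while (n-- > 0) {
        unsigned char ca = (unsigned char)*a++;
        unsigned char cb = (unsigned char)*b++;

        if (ca != cb)
            return ca - cb;
        if (ca == '\0')
            break;
    }
    return 0;  /* also reached immediately when n == 0 */
}
```

The result never depends on the value of the first argument when the length is zero, which is the property the buggy version lacked.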

Reported-by: Christian_Sellars@symantec.com
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
CC: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pmac/windfarm: Don't test pointers before kfree()
d binderman [Thu, 18 Mar 2010 23:01:42 +0000 (23:01 +0000)]
powerpc/pmac/windfarm: Don't test pointers before kfree()

Fix minor nits found by cppcheck

[./macintosh/windfarm_pm81.c:760]: (style) Redundant condition. It is safe to deallocate a NULL pointer
[./macintosh/windfarm_pm81.c:762]: (style) Redundant condition. It is safe to deallocate a NULL pointer

Signed-off-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pmac/low_i2c.c: three minor problems
d binderman [Sat, 6 Feb 2010 02:13:29 +0000 (02:13 +0000)]
powerpc/pmac/low_i2c.c: three minor problems

Fix minor nits found by cppcheck

[./arch/powerpc/platforms/powermac/low_i2c.c:594]: (style) The scope of the variable chans can be reduced
[./arch/powerpc/platforms/powermac/low_i2c.c:594]: (style) The scope of the variable i can be reduced
[./arch/powerpc/platforms/powermac/low_i2c.c:1260]: (style) Redundant condition. It is safe to deallocate a NULL pointer

Signed-off-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/aoa: gpio-pmf.c: 3 * redundant code
d binderman [Fri, 19 Mar 2010 00:12:22 +0000 (00:12 +0000)]
powerpc/aoa: gpio-pmf.c: 3 * redundant code

Signed-off-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/8xx: Use SPRG2 and DAR registers to stash r11 and cr.
Joakim Tjernlund [Tue, 2 Mar 2010 05:37:12 +0000 (05:37 +0000)]
powerpc/8xx: Use SPRG2 and DAR registers to stash r11 and cr.

This avoids storing these registers in memory.
CPU6 errata will still use the old way.
Remove some G2 leftover accesses from 2.4

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/8xx: Don't touch ACCESSED when no SWAP.
Joakim Tjernlund [Tue, 2 Mar 2010 05:37:11 +0000 (05:37 +0000)]
powerpc/8xx: Don't touch ACCESSED when no SWAP.

Only the swap function cares about the ACCESSED bit in
the pte. Do not waste cycles updating ACCESSED when swap
is not compiled into the kernel.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/8xx: Avoid testing for kernel space in ITLB Miss.
Joakim Tjernlund [Tue, 2 Mar 2010 05:37:10 +0000 (05:37 +0000)]
powerpc/8xx: Avoid testing for kernel space in ITLB Miss.

Only modules will cause ITLB Misses as we always pin
the first 8MB of kernel memory.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/8xx: Optimze TLB Miss handlers
Joakim Tjernlund [Tue, 2 Mar 2010 05:37:09 +0000 (05:37 +0000)]
powerpc/8xx: Optimze TLB Miss handlers

This removes a couple of insns from the TLB Miss
handlers without changing functionality.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/ppc32: Fixup pmd_page to work when ARCH_PFN_OFFSET is non-zero
Jason Gunthorpe [Tue, 9 Mar 2010 09:35:00 +0000 (09:35 +0000)]
powerpc/ppc32: Fixup pmd_page to work when ARCH_PFN_OFFSET is non-zero

Instead of referencing mem_map directly, use pfn_to_page. Otherwise
the kernel crashes when trying to start userspace if ARCH_PFN_OFFSET is
non-zero and CONFIG_BOOKE is not defined
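
The difference between indexing mem_map directly and going through pfn_to_page can be sketched as follows. The offset value, the array size and the simplified flat-memory `pfn_to_page()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct page { int dummy; };

#define ARCH_PFN_OFFSET 0x100UL  /* hypothetical: first valid page frame number */
#define NR_PAGES        16UL     /* hypothetical number of page frames */

static struct page mem_map[NR_PAGES];  /* mem_map[0] describes pfn ARCH_PFN_OFFSET */

/* Flat-memory style translation: subtract the offset before indexing. */
static struct page *pfn_to_page(unsigned long pfn)
{
    assert(pfn >= ARCH_PFN_OFFSET && pfn - ARCH_PFN_OFFSET < NR_PAGES);
    return mem_map + (pfn - ARCH_PFN_OFFSET);
}
```

With these values, `mem_map + pfn` would point 0x100 entries past the intended struct page, which is the kind of bad pointer that crashes the kernel once userspace starts.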

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pseries: Export data from new hcall H_EM_GET_PARMS
Vaidyanathan Srinivasan [Wed, 31 Mar 2010 21:39:24 +0000 (21:39 +0000)]
powerpc/pseries: Export data from new hcall H_EM_GET_PARMS

Add support for H_EM_GET_PARMS hcall that will return data
related to power modes from the platform.  Export the data
directly to user space for administrative tools to interpret
and use.

cat /proc/powerpc/lparcfg will export power mode data

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Disable interrupts for data breakpoint exceptions
K.Prasad [Mon, 29 Mar 2010 23:59:25 +0000 (23:59 +0000)]
powerpc: Disable interrupts for data breakpoint exceptions

Data address breakpoint exceptions are currently handled along with page
faults, which require interrupts to remain enabled. Since exception handling
for data breakpoints isn't preempt-safe, we handle them separately.

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/vio: Add modalias support
Benjamin Herrenschmidt [Wed, 7 Apr 2010 04:44:28 +0000 (14:44 +1000)]
powerpc/vio: Add modalias support

BenH: Added to vio_cmo_dev_attrs as well

Provide a modalias entry for VIO devices in sysfs.  I believe
this was another initrd generation bugfix for anaconda.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Fix ioremap_flags() with book3e pte definition
Benjamin Herrenschmidt [Wed, 7 Apr 2010 04:39:36 +0000 (14:39 +1000)]
powerpc: Fix ioremap_flags() with book3e pte definition

We can't just clear the user read permission in book3e pte, because
that will also clear supervisor read permission.  This surely isn't
desired.  Fix the problem by adding the supervisor read back.

BenH: Slightly simplified the ifdef and applied to ppc64 too

Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/mpsc: Set the port device in the mpsc serial driver
Corey Minyard [Mon, 1 Feb 2010 09:37:46 +0000 (09:37 +0000)]
powerpc/mpsc: Set the port device in the mpsc serial driver

The mpsc serial driver needs to set the port's device tree element
to register properly.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Add a new zImage for maple using a different link address
Corey Minyard [Fri, 29 Jan 2010 14:18:20 +0000 (14:18 +0000)]
powerpc: Add a new zImage for maple using a different link address

The maple platform failed to load because its firmware could not take a
link address of 0x4000000.  A new platform type with a link address of
0x400000 had to be created for the maple.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agoRemove unused HDPU driver
Benjamin Herrenschmidt [Wed, 7 Apr 2010 00:08:49 +0000 (10:08 +1000)]
Remove unused HDPU driver

This driver seems to be specific to a "Sky CPU" board for which we
don't appear to have upstream support (or not any more). No Kconfig
file in the kernel ever enables it. So remove it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agoMerge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6
Linus Torvalds [Tue, 6 Apr 2010 20:03:52 +0000 (13:03 -0700)]
Merge branch 'urgent' of git://git./linux/kernel/git/brodo/pcmcia-2.6

* 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
  pcmcia: fix up alignf issues

14 years agoMerge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 6 Apr 2010 20:03:22 +0000 (13:03 -0700)]
Merge branch 'irq-core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Force MSI irq handlers to run with interrupts disabled

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog
Linus Torvalds [Tue, 6 Apr 2010 16:56:40 +0000 (09:56 -0700)]
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  [WATCHDOG] hpwdt - fix lower timeout limit
  [WATCHDOG] iTCO_wdt: TCO Watchdog patch for additional Intel Cougar Point DeviceIDs
  [WATCHDOG] doc: Fix use of WDIOC_SETOPTIONS ioctl.
  [WATCHDOG] doc: watchdog simple example: don't fail on fsync()
  [WATCHDOG] set max63xx driver as ARM only
  [WATCHDOG] powerpc: pika_wdt ident cannot be const

14 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzi...
Linus Torvalds [Tue, 6 Apr 2010 15:36:31 +0000 (08:36 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: unlock HPA if device shrunk
  libata: disable NCQ on Crucial C300 SSD
  libata: don't whine on spurious IRQ

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Tue, 6 Apr 2010 15:34:06 +0000 (08:34 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits)
  smc91c92_cs: fix the problem of "Unable to find hardware address"
  r8169: clean up my printk uglyness
  net: Hook up cxgb4 to Kconfig and Makefile
  cxgb4: Add main driver file and driver Makefile
  cxgb4: Add remaining driver headers and L2T management
  cxgb4: Add packet queues and packet DMA code
  cxgb4: Add HW and FW support code
  cxgb4: Add register, message, and FW definitions
  netlabel: Fix several rcu_dereference() calls used without RCU read locks
  bonding: fix potential deadlock in bond_uninit()
  net: check the length of the socket address passed to connect(2)
  stmmac: add documentation for the driver.
  stmmac: fix kconfig for crc32 build error
  be2net: fix bug in vlan rx path for big endian architecture
  be2net: fix flashing on big endian architectures
  be2net: fix a bug in flashing the redboot section
  bonding: bond_xmit_roundrobin() fix
  drivers/net: Add missing unlock
  net: gianfar - align BD ring size console messages
  net: gianfar - initialize per-queue statistics
  ...

14 years agoproc: copy_to_user() returns unsigned
Dan Carpenter [Tue, 6 Apr 2010 10:45:39 +0000 (13:45 +0300)]
proc: copy_to_user() returns unsigned

copy_to_user() returns the number of bytes left to be copied.
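
A minimal sketch of the consequence, using a hypothetical userspace stand-in (fake_copy_to_user) rather than the real kernel helper: because the return value is an unsigned byte count, a comparison of the form "< 0" can never be true, so the caller has to treat any nonzero remainder as a fault.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical stand-in for copy_to_user(): copies at most `writable`
 * bytes and returns how many bytes could NOT be copied. The return
 * type is unsigned, so it can never be negative.
 */
static unsigned long fake_copy_to_user(void *dst, const void *src,
				       unsigned long n, unsigned long writable)
{
	unsigned long done = n < writable ? n : writable;

	memcpy(dst, src, done);
	return n - done;
}

/* Correct check: any nonzero remainder is reported as -EFAULT. */
static int copy_out(void *dst, const void *src, unsigned long n,
		    unsigned long writable)
{
	if (fake_copy_to_user(dst, src, n, writable))
		return -EFAULT;
	return 0;
}
```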

This was a typo from: d82ef020cf31 "proc: pagemap: Hold mmap_sem during
page walk".

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agolibata: unlock HPA if device shrunk
Tejun Heo [Mon, 5 Apr 2010 01:33:13 +0000 (10:33 +0900)]
libata: unlock HPA if device shrunk

Some BIOSes don't configure HPA during boot but do so while resuming.
This causes hard drives to shrink during resume, making libata detach
and reattach them.  This can be worked around by unlocking HPA if old
size equals native size.

Add ATA_DFLAG_UNLOCK_HPA so that HPA unlocking can be controlled
per-device and update ata_dev_revalidate() such that it sets
ATA_DFLAG_UNLOCK_HPA and fails with -EIO when the above condition is
detected.

This patch fixes the following bug.

  https://bugzilla.kernel.org/show_bug.cgi?id=15396

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Oleksandr Yermolenko <yaa.bta@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
14 years agolibata: disable NCQ on Crucial C300 SSD
Tejun Heo [Mon, 5 Apr 2010 01:51:26 +0000 (10:51 +0900)]
libata: disable NCQ on Crucial C300 SSD

Crucial said,

  Thank you for contacting us. We know that with our M225 line of SSDs
  you sometimes need to disable NCQ (native command queuing) to avoid
  just the type of errors you're seeing. Our recommendation for the
  M225 is to add libata.force=noncq to your Linux kernel boot options,
  under the kernel ATA library option.

  I have sent your feedback to the engineers working on the C300, and
  asked them to please pass it on to the firmware team. I have been
  notified that they are in the process of testing and finalizing a
  new firmware version, that you can expect to see released around the
  end of April. We’ll keep you posted as to when it will be available
  for download.

So, turn off NCQ on the drive w/ the current firmware revision.

Reported in the following bug.

  https://bugzilla.kernel.org/show_bug.cgi?id=15573

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: lethalwp@scarlet.be
Reported-by: Luke Macken <lmacken@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
14 years agolibata: don't whine on spurious IRQ
Tejun Heo [Wed, 31 Mar 2010 07:41:18 +0000 (16:41 +0900)]
libata: don't whine on spurious IRQ

On configurations where IRQ line is shared with a different
controller, spurious IRQs may happen continuously.  The message was
put there primarily for debugging anyway.  Kill it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
14 years ago[WATCHDOG] hpwdt - fix lower timeout limit
Thomas Mingarelli [Wed, 17 Mar 2010 15:33:31 +0000 (15:33 +0000)]
[WATCHDOG] hpwdt - fix lower timeout limit

[Novell Bug 581103] HP Watchdog driver has arbitrary (wrong) timeout limits.
Fix the lower timeout limit to a more appropriate value.

Signed-off-by: Thomas Mingarelli <Thomas.Mingarelli@hp.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: stable <stable@kernel.org>
14 years ago[WATCHDOG] iTCO_wdt: TCO Watchdog patch for additional Intel Cougar Point DeviceIDs
Seth Heasley [Thu, 25 Mar 2010 23:14:41 +0000 (16:14 -0700)]
[WATCHDOG] iTCO_wdt: TCO Watchdog patch for additional Intel Cougar Point DeviceIDs

This patch adds the Intel Cougar Point PCH LPC Controller DeviceIDs for iTCO Watchdog.

Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: stable <stable@kernel.org>
14 years ago[WATCHDOG] doc: Fix use of WDIOC_SETOPTIONS ioctl.
James Hogan [Mon, 5 Apr 2010 10:31:29 +0000 (11:31 +0100)]
[WATCHDOG] doc: Fix use of WDIOC_SETOPTIONS ioctl.

In the watchdog-test program and watchdog-api.txt, pass the values to
the WDIOC_SETOPTIONS ioctl as a pointer to an integer containing the
values instead of directly in the third ioctl argument. The actual
watchdog drivers in drivers/watchdog don't read the options directly
from the argument but use get_user and copy_from_user.
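
The calling convention described above can be sketched as follows. This is an illustration, not the text of the patched watchdog-test program; the helper name watchdog_set_enabled() is hypothetical, and it assumes the caller has already opened the watchdog device.

```c
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/*
 * Ask the driver to enable or disable the watchdog on an already-open fd.
 * The option flags are passed by address, not by value, because drivers
 * read them from user memory.
 * Returns the ioctl() result: 0 on success, -1 with errno set on failure.
 */
static int watchdog_set_enabled(int fd, int enable)
{
	int flags = enable ? WDIOS_ENABLECARD : WDIOS_DISABLECARD;

	return ioctl(fd, WDIOC_SETOPTIONS, &flags);
}
```

Passing the flag value itself as the third argument would make the driver treat a small integer as a user pointer, so the read of the options would be expected to fail.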

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
14 years agoFix up possibly racy module refcounting
Nick Piggin [Thu, 1 Apr 2010 08:09:40 +0000 (19:09 +1100)]
Fix up possibly racy module refcounting

Module refcounting is implemented with a per-cpu counter for speed.
However there is a race when tallying the counter where a reference may
be taken by one CPU and released by another.  Reference count summation
may then see the decrement without having seen the previous increment,
leading to lower than expected count.  A module which never has its
actual reference drop below 1 may return a reference count of 0 due to
this race.

Module removal generally runs under stop_machine, which prevents this
race causing bugs due to removal of in-use modules.  However there are
other real bugs in module.c code and driver code (module_refcount is
exported) where the callers do not run under stop_machine.

Fix this by maintaining running per-cpu counters for the number of
module refcount increments and the number of refcount decrements.  The
increments are tallied after the decrements, so any decrement seen will
always have its corresponding increment counted.  The final refcount is
the difference of the total increments and decrements, preventing a
low-refcount from being returned.
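
The counting scheme can be sketched in plain C as below. This is a single-threaded illustration under stated assumptions: the struct, the fixed NR_CPUS value, and the function names are hypothetical, and a concurrent version would additionally need a read memory barrier between the two summation loops.

```c
#define NR_CPUS 4

/* Hypothetical per-CPU counters; each CPU only ever bumps its own slot. */
struct ref_counters {
	unsigned long incs[NR_CPUS];
	unsigned long decs[NR_CPUS];
};

static void ref_get(struct ref_counters *r, int cpu) { r->incs[cpu]++; }
static void ref_put(struct ref_counters *r, int cpu) { r->decs[cpu]++; }

/*
 * Sum the decrements first, then the increments. Any decrement that was
 * counted must have had its increment happen earlier, so reading the
 * increments afterwards can only overestimate the count, never
 * underestimate it.
 */
static unsigned long ref_count(const struct ref_counters *r)
{
	unsigned long incs = 0, decs = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		decs += r->decs[cpu];
	/* A concurrent implementation needs a read barrier here. */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		incs += r->incs[cpu];

	return incs - decs;
}
```

A reference taken on one CPU and released on another still nets out correctly, since the two totals are kept independently of which CPU touched them.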

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
Linus Torvalds [Mon, 5 Apr 2010 22:37:12 +0000 (15:37 -0700)]
Merge git://git./linux/kernel/git/jejb/scsi-rc-fixes-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] qla1280: retain firmware for error recovery
  [SCSI] attirbute_container: Initialize sysfs attributes with sysfs_attr_init
  [SCSI] advansys: fix regression with request_firmware change
  [SCSI] qla2xxx: Updated version number to 8.03.02-k2.
  [SCSI] qla2xxx: Prevent sending mbx commands from sysfs during isp reset.
  [SCSI] qla2xxx: Disable MSI on qla24xx chips other than QLA2432.
  [SCSI] qla2xxx: Check to make sure multique and CPU affinity support is not enabled at the same time.
  [SCSI] qla2xxx: Correct vp_idx checking during PORT_UPDATE processing.
  [SCSI] qla2xxx: Honour "Extended BB credits" bit for CNAs.
  [SCSI] scsi_transport_fc: Make sure commands are completed when rport is offline
  [SCSI] libiscsi: Fix recovery slowdown regression

14 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh...
Linus Torvalds [Mon, 5 Apr 2010 20:42:54 +0000 (13:42 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ericvh/v9fs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: saving negative to unsigned char
  9p: return on mutex_lock_interruptible()
  9p: Creating files with names too long should fail with ENAMETOOLONG.
  9p: Make sure we are able to clunk the cached fid on umount
  9p: drop nlink remove
  fs/9p: Clunk the fid resulting from partial walk of the name
  9p: documentation update
  9p: Fix setting of protocol flags in v9fs_session_info structure.

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
Linus Torvalds [Mon, 5 Apr 2010 20:21:15 +0000 (13:21 -0700)]
Merge git://git./linux/kernel/git/mason/btrfs-unstable

* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: add check for changed leaves in setup_leaf_for_split
  Btrfs: create snapshot references in same commit as snapshot
  Btrfs: fix small race with delalloc flushing waitqueue's
  Btrfs: use add_to_page_cache_lru, use __page_cache_alloc
  Btrfs: fix chunk allocate size calculation
  Btrfs: kill max_extent mount option
  Btrfs: fail to mount if we have problems reading the block groups
  Btrfs: check btrfs_get_extent return for IS_ERR()
  Btrfs: handle kmalloc() failure in inode lookup ioctl
  Btrfs: dereferencing freed memory
  Btrfs: Simplify num_stripes's calculation logical for __btrfs_alloc_chunk()
  Btrfs: Add error handle for btrfs_search_slot() in btrfs_read_chunk_tree()
  Btrfs: Remove unnecessary finish_wait() in wait_current_trans()
  Btrfs: add NULL check for do_walk_down()
  Btrfs: remove duplicate include in ioctl.c

Fix trivial conflict in fs/btrfs/compression.c due to slab.h include
cleanups.

14 years agoaudit: preface audit printk with audit
Eric Paris [Mon, 5 Apr 2010 20:16:26 +0000 (16:16 -0400)]
audit: preface audit printk with audit

There have been a number of reports of people seeing the message:
"name_count maxed, losing inode data: dev=00:05, inode=3185"
in dmesg.  These usually lead to people reporting problems to the filesystem
group who are in turn clueless what they mean.

Eventually someone finds me and I explain what is going on and that
these come from the audit system.  The basic problem is that the
audit subsystem never expects a single syscall to 'interact' (for some
wishy-washy meaning of interact) with more than 20 inodes.  But in fact
some operations like loading kernel modules can cause changes to lots of
inodes in debugfs.

There are a couple real fixes being bandied about including removing the
fixed compile time limit of 20 or not auditing changes in debugfs (or
both) but neither are small and obvious so I am not sending them for
immediate inclusion (I hope Al forwards a real solution next devel
window).

In the meantime this patch simply adds 'audit' to the beginning of the
crap message so if a user sees it, they come blame me first and we can
talk about what it means and make sure we understand all of the reasons
it can happen and make sure this gets solved correctly in the long run.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years ago9p: saving negative to unsigned char
Dan Carpenter [Mon, 5 Apr 2010 19:37:28 +0000 (14:37 -0500)]
9p: saving negative to unsigned char

Saving -EINVAL as unsigned char truncates the high bits and changes it
into 234 instead of -22.  This breaks the test for "if (ret == -EINVAL)"
in parse_opts().
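
The truncation is easy to reproduce in userspace. The sketch below assumes an 8-bit unsigned char and EINVAL == 22 (as on Linux); the two helper functions are hypothetical and only mirror the storage type, not the 9p code itself.

```c
#include <errno.h>

/* Store an error code in an unsigned char, as the buggy code did. */
static int roundtrip_unsigned(int err)
{
	unsigned char ret = (unsigned char)err;

	return ret;	/* promoted back to int: always in 0..255 */
}

/* Store it in an int, which preserves the sign. */
static int roundtrip_int(int err)
{
	int ret = err;

	return ret;
}
```

With these helpers, roundtrip_unsigned(-EINVAL) yields 234 while roundtrip_int(-EINVAL) yields -22, so only the latter can ever compare equal to -EINVAL.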

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: return on mutex_lock_interruptible()
Dan Carpenter [Tue, 30 Mar 2010 09:41:25 +0000 (09:41 +0000)]
9p: return on mutex_lock_interruptible()

If "err" is -EINTR here the original code calls mutex_unlock() and then
returns, but it should just return directly.
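
The intended control flow can be sketched with a hypothetical stand-in for the kernel mutex (struct fake_mutex and its helpers are illustrative only). The point is that a failed interruptible lock attempt leaves the lock untaken, so the error path must not unlock.

```c
#include <errno.h>

/* Hypothetical stand-in for a kernel mutex: tracks whether it is held. */
struct fake_mutex {
	int held;
	int interrupt_next;	/* simulate a signal arriving during the wait */
};

/*
 * Mimics the contract of mutex_lock_interruptible(): returns 0 with the
 * lock held, or -EINTR with the lock NOT held.
 */
static int fake_lock_interruptible(struct fake_mutex *m)
{
	if (m->interrupt_next)
		return -EINTR;
	m->held = 1;
	return 0;
}

static void fake_unlock(struct fake_mutex *m)
{
	m->held = 0;
}

static int do_work(struct fake_mutex *m)
{
	int err = fake_lock_interruptible(m);

	if (err)
		return err;	/* lock was never acquired: return directly */

	/* ... critical section would go here ... */
	fake_unlock(m);
	return 0;
}
```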

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agoBtrfs: add check for changed leaves in setup_leaf_for_split
Chris Mason [Fri, 2 Apr 2010 13:20:18 +0000 (09:20 -0400)]
Btrfs: add check for changed leaves in setup_leaf_for_split

setup_leaf_for_split needs to drop the path and search again, and has
checks to see if the item we want to split changed size.  But, it misses
the case where the leaf changed and now has enough room for the item
we want to insert.

This adds an extra check to make sure the leaf really needs splitting
before we call btrfs_split_leaf(), which keeps us from trying to split
a leaf with a single item.

btrfs_split_leaf() will blindly split the single item leaf, leaving us
with one good leaf and one empty leaf and then a crash.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
14 years agoBtrfs: create snapshot references in same commit as snapshot
Sage Weil [Mon, 15 Mar 2010 17:27:13 +0000 (17:27 +0000)]
Btrfs: create snapshot references in same commit as snapshot

This creates the reference to a new snapshot in the same commit as the
snapshot itself.  This avoids the need for a second commit in order for a
snapshot to be persistent, and also avoids the problem of "leaking" a
new snapshot tree root if the host crashes before the second commit takes
place.

It is not at all clear to me why it wasn't always done this way.  If there
is still a reason for the two-stage {create,finish}_pending_snapshots()
approach I'm missing something!  :)

I've been running this for a couple weeks under pretty heavy usage (a few
snapshots per minute) without obvious problems.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
14 years agoBtrfs: fix small race with delalloc flushing waitqueue's
Josef Bacik [Fri, 12 Mar 2010 19:28:18 +0000 (19:28 +0000)]
Btrfs: fix small race with delalloc flushing waitqueue's

Every time we start a new flushing thread, we init the waitqueue if there isn't a
flushing thread running.  The problem with this is we check
space_info->flushing, which we clear right before doing a wake_up on the
flushing waitqueue, which causes problems if we init the waitqueue in the middle
of clearing the flushing flag and calling wake_up.  This is hard to hit, but
the code is wrong anyway, so init the flushing/allocating waitqueue when
creating the space info and let it be.  I haven't seen the panic since I've been
using this patch.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
14 years agoBtrfs: use add_to_page_cache_lru, use __page_cache_alloc
Nick Piggin [Wed, 17 Mar 2010 13:31:04 +0000 (13:31 +0000)]
Btrfs: use add_to_page_cache_lru, use __page_cache_alloc

Pagecache pages should be allocated with __page_cache_alloc, so they
obey pagecache memory policies.

add_to_page_cache_lru is exported, so it should be used. Benefits over
using a private pagevec: neater code, 128 bytes fewer stack used, percpu
lru ordering is preserved, and finally don't need to flush pagevec
before returning so batching may be shared with other LRU insertions.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
14 years agoMerge branch 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc
Linus Torvalds [Mon, 5 Apr 2010 16:39:11 +0000 (09:39 -0700)]
Merge branch 'slabh' of git://git./linux/kernel/git/tj/misc

* 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc:
  eeepc-wmi: include slab.h
  staging/otus: include slab.h from usbdrv.h
  percpu: don't implicitly include slab.h from percpu.h
  kmemcheck: Fix build errors due to missing slab.h
  include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
  iwlwifi: don't include iwl-dev.h from iwl-devtrace.h
  x86: don't include slab.h from arch/x86/include/asm/pgtable_32.h

Fix up trivial conflicts in include/linux/percpu.h due to
is_kernel_percpu_address() having been introduced since the slab.h
cleanup with the percpu_up.c splitup.

14 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Linus Torvalds [Mon, 5 Apr 2010 16:16:37 +0000 (09:16 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tj/percpu

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  module: add stub for is_module_percpu_address
  percpu, module: implement and use is_kernel/module_percpu_address()
  module: encapsulate percpu handling better and record percpu_size

14 years agormap: fix anon_vma_fork() memory leak
Rik van Riel [Mon, 5 Apr 2010 16:13:33 +0000 (12:13 -0400)]
rmap: fix anon_vma_fork() memory leak

Fix a memory leak in anon_vma_fork(), where we fail to tear down the
anon_vmas attached to the new VMA in case setting up the new anon_vma
fails.

This bug also has the potential to leave behind anon_vma_chain structs
with pointers to invalid memory.

Reported-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years ago9p: Creating files with names too long should fail with ENAMETOOLONG.
Sripathi Kodi [Mon, 29 Mar 2010 23:13:59 +0000 (18:13 -0500)]
9p: Creating files with names too long should fail with ENAMETOOLONG.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: Make sure we are able to clunk the cached fid on umount
Aneesh Kumar K.V [Mon, 29 Mar 2010 23:13:59 +0000 (18:13 -0500)]
9p: Make sure we are able to clunk the cached fid on umount

Dcache pruning happens on umount, so we cannot mark the client
status as disconnected; that would prevent a 9p call to the server.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: drop nlink remove
Aneesh Kumar K.V [Mon, 29 Mar 2010 23:14:50 +0000 (18:14 -0500)]
9p: drop nlink remove

We need to drop the link count on the inode after a successful remove.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agofs/9p: Clunk the fid resulting from partial walk of the name
Aneesh Kumar K.V [Fri, 19 Mar 2010 12:47:26 +0000 (12:47 +0000)]
fs/9p: Clunk the fid resulting from partial walk of the name

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: documentation update
Sripathi Kodi [Thu, 18 Mar 2010 08:01:33 +0000 (08:01 +0000)]
9p: documentation update

This patch adds documentation for new 9P options introduced in
2.6.34.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: Fix setting of protocol flags in v9fs_session_info structure.
Sripathi Kodi [Wed, 17 Mar 2010 17:02:38 +0000 (17:02 +0000)]
9p: Fix setting of protocol flags in v9fs_session_info structure.

This patch fixes a simple bug I left behind in my earlier protocol
negotiation patch.

Thanks,
Sripathi.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agoeeepc-wmi: include slab.h
Tejun Heo [Mon, 5 Apr 2010 02:37:59 +0000 (11:37 +0900)]
eeepc-wmi: include slab.h

eeepc-wmi uses kfree() but doesn't include slab.h.  Include it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Yong Wang <yong.y.wang@intel.com>
14 years agoMerge branch 'master' into export-slabh
Tejun Heo [Mon, 5 Apr 2010 02:37:28 +0000 (11:37 +0900)]
Merge branch 'master' into export-slabh

14 years agostaging/otus: include slab.h from usbdrv.h
Tejun Heo [Mon, 5 Apr 2010 02:23:16 +0000 (11:23 +0900)]
staging/otus: include slab.h from usbdrv.h

drivers/staging/otus/usbdrv.h users use slab facilities.  Include
linux/slab.h from usbdrv.h.

Signed-off-by: Tejun Heo <tj@kernel.org>
14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Sun, 4 Apr 2010 19:14:44 +0000 (12:14 -0700)]
Merge git://git./linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sunxvr500: Ignore secondary output PCI devices.
  sparc64: Implement perf_arch_fetch_caller_regs
  sparc64: Update defconfig.
  sparc64: Fix array size reported by vmemmap_populate()
  sparc: Fix regset register window handling.
  drivers/serial/sunsu.c: Correct use after free

14 years agoMerge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 4 Apr 2010 19:13:10 +0000 (12:13 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Always build the powerpc perf_arch_fetch_caller_regs version
  perf: Always build the stub perf_arch_fetch_caller_regs version
  perf, probe-finder: Build fix on Debian
  perf/scripts: Tuple was set from long in both branches in python_process_event()
  perf: Fix 'perf sched record' deadlock
  perf, x86: Fix callgraphs of 32-bit processes on 64-bit kernels
  perf, x86: Fix AMD hotplug & constraint initialization
  x86: Move notify_cpu_starting() callback to a later stage
  x86,kgdb: Always initialize the hw breakpoint attribute
  perf: Use hot regs with software sched switch/migrate events
  perf: Correctly align perf event tracing buffer

14 years agoMerge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 4 Apr 2010 19:12:31 +0000 (12:12 -0700)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: set_cpus_allowed_ptr(): Don't use rq->migration_thread after unlock
  sched: Fix proc_sched_set_task()

14 years agoMerge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 4 Apr 2010 19:12:19 +0000 (12:12 -0700)]
Merge branch 'tracing-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ring-buffer: Add missing unlock
  tracing: Fix lockdep warning in global_clock()

14 years agoproc: pagemap: Hold mmap_sem during page walk
KAMEZAWA Hiroyuki [Fri, 2 Apr 2010 00:11:29 +0000 (09:11 +0900)]
proc: pagemap: Hold mmap_sem during page walk

In the initial design, walk_page_range() was designed just for walking page
tables and it didn't require mmap_sem.  Now, find_vma() etc. are used
in walk_page_range() and we need mmap_sem around it.

This patch adds mmap_sem around walk_page_range().

Because /proc/<pid>/pagemap's callback routine uses put_user(), we have
to get rid of it to do a sane fix.

Changelog: 2010/Apr/2
 - fixed start_vaddr and end overflow
Changelog: 2010/Apr/1
 - fixed start_vaddr calculation
 - removed unnecessary cast.
 - removed unnecessary change in smaps.
 - use GFP_TEMPORARY instead of GFP_KERNEL

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: San Mehat <san@google.com>
Cc: Brian Swetland <swetland@google.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
[ Fixed kmalloc failure return code as per Matt ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agopcmcia: fix up alignf issues
Dominik Brodowski [Sun, 4 Apr 2010 16:10:35 +0000 (18:10 +0200)]
pcmcia: fix up alignf issues

- pcmcia_align() used a "start" variable twice. That's obviously a bad
  idea.

- pcmcia_common_resource() needs the current "start" parameter being
  passed, instead of res->start.

- pcmcia_common_resource() doesn't use the size and align parameters,
  so get rid of those.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>