GitHub/LineageOS/android_kernel_samsung_universal7580.git
15 years agox86, mm: move tlb.c to arch/x86/mm/
Ingo Molnar [Wed, 21 Jan 2009 09:08:53 +0000 (10:08 +0100)]
x86, mm: move tlb.c to arch/x86/mm/

Impact: cleanup

Now that it's unified, move the (SMP) TLB flushing code from arch/x86/kernel/
to arch/x86/mm/, where it belongs logically.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'cpus4096' into core/percpu
Ingo Molnar [Wed, 21 Jan 2009 09:14:17 +0000 (10:14 +0100)]
Merge branch 'cpus4096' into core/percpu

Conflicts:
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
arch/x86/kernel/tlb_32.c

Merge it here because both the cpumask changes and the ongoing percpu
work is touching the TLB code. The percpu changes take precedence, as
they eliminate tlb_32.c altogether.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc...
Ingo Molnar [Wed, 21 Jan 2009 09:04:52 +0000 (10:04 +0100)]
Merge branch 'tj-percpu' of git://git./linux/kernel/git/tj/misc into core/percpu

15 years agox86: rename tlb_64.c to tlb.c
Tejun Heo [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: rename tlb_64.c to tlb.c

Impact: file rename

tlb_64.c is now the tlb code for both 32 and 64.  Rename it to tlb.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: make x86_32 use tlb_64.c
Tejun Heo [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: make x86_32 use tlb_64.c

Impact: less contention when issuing invalidate IPI, cleanup

Make x86_32 use the same tlb code as 64bit.  The 64bit code uses
multiple IPI vectors for tlb shootdown to reduce contention.  This
patch makes x86_32 allocate the same 8 IPIs as x86_64 and share the
code paths.

Note that the usage of asmlinkage is inconsistent for x86_32 and 64
and calls for further cleanup.  This has been noted with a FIXME
comment in tlb_64.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: prepare for tlb merge
Tejun Heo [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: prepare for tlb merge

Impact: clean up, ipi vector number reordering for x86_32

Make the following changes to prepare for tlb merge.

* reorder x86_32 ip vectors

* adjust tlb_32.c and tlb_64.c such that their logics coincide exactly
- on spurious invalidate ipi, tlb_32 acks the irq
- tlb_64 now has proper memory barriers around clearing
          flush_cpumask (no change in generated code)

* unexport flush_tlb_page from tlb_32.c, there's no user

* use unsigned int for cpu id

* drop unnecessary includes from tlb_64.c

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: uv cleanup
Tejun Heo [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: uv cleanup

Impact: cleanup

Make the following uv related cleanups.

* collect visible uv related definitions and interfaces into uv/uv.h
  and use it.  this cleans up the messy situation where on 64bit, uv
  is defined properly, on 32bit generic it's dummy and on the rest
  undefined.  after this clean up, uv is defined on 64 and dummy on
  32.

* update uv_flush_tlb_others() such that it takes cpumask of
  to-be-flushed cpus as argument, instead of that minus self, and
  returns yet-to-be-flushed cpumask, instead of modifying the passed
  in parameter.  this interface change will ease dummy implementation
  of uv_flush_tlb_others() and makes uv tlb flush related stuff
  defined in tlb_uv proper.

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: merge irq_regs.h
Brian Gerst [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: merge irq_regs.h

Impact: cleanup, better irq_regs code generation for x86_64

Make 64-bit use the same optimizations as 32-bit.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: merge mmu_context.h
Brian Gerst [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: merge mmu_context.h

Impact: cleanup

tj: * changed cpu to unsigned as was done on mmu_context_64.h as cpu
      id is officially unsigned int
    * added missing ';' to 32bit version of deactivate_mm()

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: set %fs to __KERNEL_PERCPU unconditionally for x86_32
Brian Gerst [Wed, 21 Jan 2009 08:26:05 +0000 (17:26 +0900)]
x86: set %fs to __KERNEL_PERCPU unconditionally for x86_32

Impact: cleanup

%fs is currently set to __KERNEL_DS at boot, and conditionally
switched to __KERNEL_PERCPU for secondary cpus.  Instead, initialize
GDT_ENTRY_PERCPU to the same attributes as GDT_ENTRY_KERNEL_DS and
set %fs to __KERNEL_PERCPU unconditionally.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: fix percpu_write with 64-bit constants
Brian Gerst [Wed, 21 Jan 2009 08:26:05 +0000 (17:26 +0900)]
x86: fix percpu_write with 64-bit constants

Impact: slightly better code generation for percpu_to_op()

The processor will sign-extend 32-bit immediate values in 64-bit
operations.  Use the 'e' constraint ("32-bit signed integer constant,
or a symbolic reference known to fit that range") for 64-bit constants.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: clean up gdt_page definition
Brian Gerst [Wed, 21 Jan 2009 08:26:05 +0000 (17:26 +0900)]
x86: clean up gdt_page definition

Impact: cleanup && more compact percpu area layout with future changes

Move 64-bit GDT to page-aligned section and clean up comment
formatting.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: update canary handling during switch
Tejun Heo [Wed, 21 Jan 2009 08:26:05 +0000 (17:26 +0900)]
x86: update canary handling during switch

Impact: cleanup

In switch_to(), instead of taking offset to irq_stack_union.stack,
make it a proper percpu access using __percpu_arg() and per_cpu_var().

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86, cpumask: fix tlb flush race
Ingo Molnar [Tue, 20 Jan 2009 08:13:15 +0000 (09:13 +0100)]
x86, cpumask: fix tlb flush race

Impact: fix bootup crash

The cpumask is now passed in as a reference to mm->cpu_vm_mask, not on
the stack - hence it is not constant anymore during the TLB flush.

That way it could race and some static sanity checks would trigger:

[  238.154287] ------------[ cut here ]------------
[  238.156039] kernel BUG at arch/x86/kernel/tlb_32.c:130!
[  238.156039] invalid opcode: 0000 [#1] SMP
[  238.156039] last sysfs file: /sys/class/net/eth2/address
[  238.156039] Modules linked in:
[  238.156039]
[  238.156039] Pid: 6493, comm: ifup-eth Not tainted (2.6.29-rc2-tip #1) P4DC6
[  238.156039] EIP: 0060:[<c0118f87>] EFLAGS: 00010202 CPU: 2
[  238.156039] EIP is at native_flush_tlb_others+0x35/0x158
[  238.156039] EAX: c0ef972c EBX: f6143301 ECX: 00000000 EDX: 00000000
[  238.156039] ESI: f61433a8 EDI: f6143200 EBP: f34f3e00 ESP: f34f3df0
[  238.156039]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[  238.156039] Process ifup-eth (pid: 6493, ti=f34f2000 task=f399ab00 task.ti=f34f2000)
[  238.156039] Stack:
[  238.156039]  ffffffff f61433a8 ffffffff f6143200 f34f3e18 c0118e9c 00000000 f6143200
[  238.156039]  f61433a8 f5bec738 f34f3e28 c0119435 c2b5b830 f6143200 f34f3e34 c01c2dc3
[  238.156039]  bffd9000 f34f3e60 c01c3051 00000000 ffffffff f34f3e4c 00000000 00000071
[  238.156039] Call Trace:
[  238.156039]  [<c0118e9c>] ? flush_tlb_others+0x52/0x5b
[  238.156039]  [<c0119435>] ? flush_tlb_mm+0x7f/0x8b
[  238.156039]  [<c01c2dc3>] ? tlb_finish_mmu+0x2d/0x55
[  238.156039]  [<c01c3051>] ? exit_mmap+0x124/0x170
[  238.156039]  [<c013e965>] ? mmput+0x40/0xf5
[  238.156039]  [<c01e4788>] ? flush_old_exec+0x640/0x94b
[  238.156039]  [<c01ddb4e>] ? fsnotify_access+0x37/0x39
[  238.156039]  [<c01e3435>] ? kernel_read+0x39/0x4b
[  238.156039]  [<c021bc8a>] ? load_elf_binary+0x4a1/0x11bb
[  238.156039]  [<c01c0af9>] ? might_fault+0x51/0x9c
[  238.156039]  [<c010a2cc>] ? paravirt_read_tsc+0x20/0x4f
[  238.156039]  [<c010a406>] ? native_sched_clock+0x5d/0x60
[  238.156039]  [<c01e2fda>] ? search_binary_handler+0xab/0x2c4
[  238.156039]  [<c021b7e9>] ? load_elf_binary+0x0/0x11bb
[  238.156039]  [<c04ae9a5>] ? _raw_read_unlock+0x21/0x46
[  238.156039]  [<c021b7e9>] ? load_elf_binary+0x0/0x11bb
[  238.156039]  [<c01e2fe1>] ? search_binary_handler+0xb2/0x2c4
[  238.156039]  [<c01e4076>] ? do_execve+0x21c/0x2ee
[  238.156039]  [<c01029b7>] ? sys_execve+0x51/0x8c
[  238.156039]  [<c0103eaf>] ? sysenter_do_call+0x12/0x43

Fix it by not assuming that the cpumask is constant.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc...
Ingo Molnar [Tue, 20 Jan 2009 07:23:45 +0000 (08:23 +0100)]
Merge branch 'tj-percpu' of git://git./linux/kernel/git/tj/misc into core/percpu

15 years agolinker script: kill PERCPU_VADDR_PREALLOC()
Tejun Heo [Mon, 19 Jan 2009 03:21:28 +0000 (12:21 +0900)]
linker script: kill PERCPU_VADDR_PREALLOC()

Impact: cleanup

With .data.percpu.first in place, PERCPU_VADDR_PREALLOC() is no longer
necessary.  Kill it.

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: remove pda.h
Brian Gerst [Mon, 19 Jan 2009 00:52:25 +0000 (19:52 -0500)]
x86: remove pda.h

Impact: cleanup

Signed-off-by: Brian Gerst <brgerst@gmail.com>
15 years agox86: move stack_canary into irq_stack
Brian Gerst [Mon, 19 Jan 2009 03:21:28 +0000 (12:21 +0900)]
x86: move stack_canary into irq_stack

Impact: x86_64 percpu area layout change, irq_stack now at the beginning

Now that the PDA is empty except for the stack canary, it can be removed.
The irqstack is moved to the start of the per-cpu section.  If the stack
protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack.

tj: * updated subject
    * dropped asm relocation of irq_stack_ptr
    * updated comments a bit
    * rebased on top of stack canary changes

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: rework __per_cpu_load adjustments
Brian Gerst [Mon, 19 Jan 2009 03:21:28 +0000 (12:21 +0900)]
x86: rework __per_cpu_load adjustments

Impact: cleanup

Use cpu_number to determine if the adjustment is necessary.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agopercpu: refactor percpu.h
Brian Gerst [Mon, 19 Jan 2009 03:21:27 +0000 (12:21 +0900)]
percpu: refactor percpu.h

Impact: cleanup

Refactor the DEFINE_PER_CPU_* macros and add .data.percpu.first
section.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: remove pda_init()
Brian Gerst [Mon, 19 Jan 2009 03:21:27 +0000 (12:21 +0900)]
x86: remove pda_init()

Impact: cleanup

Copy the code to cpu_init() to satisfy the requirement that the cpu
be reinitialized.  Remove all other calls, since the segments are
already initialized in head_64.S.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: conditionalize stack canary handling in hot path
Tejun Heo [Tue, 20 Jan 2009 03:29:19 +0000 (12:29 +0900)]
x86: conditionalize stack canary handling in hot path

Impact: no unnecessary stack canary swapping during context switch

There's no point in moving stack_canary around during context switch
if it's not enabled.  Conditionalize it.

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: cleanup stack protector
Tejun Heo [Tue, 20 Jan 2009 03:29:19 +0000 (12:29 +0900)]
x86: cleanup stack protector

Impact: cleanup

Make the following cleanups.

* remove duplicate comment from boot_init_stack_canary() which fits
  better in the other place - cpu_idle().

* move stack_canary offset check from __switch_to() to
  boot_init_stack_canary().

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: fully honor "nolapic", fix
Ingo Molnar [Mon, 19 Jan 2009 19:49:37 +0000 (20:49 +0100)]
x86: fully honor "nolapic", fix

Impact: build fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/travis/linux...
Ingo Molnar [Mon, 19 Jan 2009 16:12:20 +0000 (17:12 +0100)]
Merge branch 'master' of git://git./linux/kernel/git/travis/linux-2.6-cpus4096-for-ingo into cpus4096

15 years agoMerge branch 'stackprotector' into core/percpu
Ingo Molnar [Mon, 19 Jan 2009 11:36:09 +0000 (12:36 +0100)]
Merge branch 'stackprotector' into core/percpu

15 years agoMerge branch 'core/percpu' into stackprotector
Ingo Molnar [Sun, 18 Jan 2009 17:37:14 +0000 (18:37 +0100)]
Merge branch 'core/percpu' into stackprotector

Conflicts:
arch/x86/include/asm/pda.h
arch/x86/include/asm/system.h

Also, moved include/asm-x86/stackprotector.h to arch/x86/include/asm.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc...
Ingo Molnar [Sun, 18 Jan 2009 16:41:32 +0000 (17:41 +0100)]
Merge branch 'tj-percpu' of git://git./linux/kernel/git/tj/misc into core/percpu

15 years agox86-64: Use absolute displacements for per-cpu accesses.
Brian Gerst [Sun, 18 Jan 2009 15:38:59 +0000 (00:38 +0900)]
x86-64: Use absolute displacements for per-cpu accesses.

Accessing memory through %gs should not use rip-relative addressing.
Adding a P prefix for the argument tells gcc to not add (%rip) to
the memory references.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86-64: Move isidle from PDA to per-cpu.
Brian Gerst [Sun, 18 Jan 2009 15:38:59 +0000 (00:38 +0900)]
x86-64: Move isidle from PDA to per-cpu.

tj: s/isidle/is_idle/

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86-64: Move nodenumber from PDA to per-cpu.
Brian Gerst [Sun, 18 Jan 2009 15:38:59 +0000 (00:38 +0900)]
x86-64: Move nodenumber from PDA to per-cpu.

tj: * s/nodenumber/node_number/
    * removed now unused pda variable from pda_init()

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86-64: Move irqcount from PDA to per-cpu.
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Move irqcount from PDA to per-cpu.

tj: s/irqcount/irq_count/

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86-64: Move oldrsp from PDA to per-cpu.
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Move oldrsp from PDA to per-cpu.

tj: * in asm-offsets_64.c, pda.h inclusion shouldn't be removed as pda
      is still referenced in the file
    * s/oldrsp/old_rsp/

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86-64: Move kernelstack from PDA to per-cpu.
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Move kernelstack from PDA to per-cpu.

Also clean up PER_CPU_VAR usage in xen-asm_64.S

tj: * remove now unused stack_thread_info()
    * s/kernelstack/kernel_stack/
    * added FIXME comment in xen-asm_64.S

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86-64: Move current task from PDA to per-cpu and consolidate with 32-bit.
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Move current task from PDA to per-cpu and consolidate with 32-bit.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit.
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit.

tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA
    for voyager.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86-64: Convert exception stacks to per-cpu
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Convert exception stacks to per-cpu

Move the exception stacks to per-cpu, removing specific allocation code.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86-64: Convert irqstacks to per-cpu
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Convert irqstacks to per-cpu

Move the irqstackptr variable from the PDA to per-cpu.  Make the
stacks themselves per-cpu, removing some specific allocation code.
Add a seperate flag (is_boot_cpu) to simplify the per-cpu boot
adjustments.

tj: * sprinkle some underbars around.

    * irq_stack_ptr is not used till traps_init(), no reason to
      initialize it early.  On SMP, just leaving it NULL till proper
      initialization in setup_per_cpu_areas() works.  Dropped
      is_boot_cpu and early irq_stack_ptr initialization.

    * do DECLARE/DEFINE_PER_CPU(char[IRQ_STACK_SIZE], irq_stack)
      instead of (char, irq_stack[IRQ_STACK_SIZE]).

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86-64: Move TLB state from PDA to per-cpu and consolidate with 32-bit.
Brian Gerst [Sun, 18 Jan 2009 15:38:57 +0000 (00:38 +0900)]
x86-64: Move TLB state from PDA to per-cpu and consolidate with 32-bit.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86-64: Move irq stats from PDA to per-cpu and consolidate with 32-bit.
Brian Gerst [Sun, 18 Jan 2009 15:38:57 +0000 (00:38 +0900)]
x86-64: Move irq stats from PDA to per-cpu and consolidate with 32-bit.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agolinker script: add missing .data.percpu.page_aligned
Tejun Heo [Sat, 17 Jan 2009 06:26:32 +0000 (15:26 +0900)]
linker script: add missing .data.percpu.page_aligned

arm, arm/mach-integrator and powerpc were missing
.data.percpu.page_aligned in their percpu output section definitions.
Add it.

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agolinker script: add missing VMLINUX_SYMBOL
Tejun Heo [Sat, 17 Jan 2009 05:42:50 +0000 (14:42 +0900)]
linker script: add missing VMLINUX_SYMBOL

The newly added PERCPU_*() macros define and use __per_cpu_load but
VMLINUX_SYMBOL() was missing from usages causing build failures on
archs where linker visible symbol is different from C symbols
(e.g. blackfin).

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: put trigger in to detect mismatched apic versions.
Mike Travis [Fri, 16 Jan 2009 23:58:13 +0000 (15:58 -0800)]
x86: put trigger in to detect mismatched apic versions.

Fire off one message if two apic's discovered with different
apic versions.

Signed-off-by: Mike Travis <travis@sgi.com>
15 years agocpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
Mike Travis [Fri, 16 Jan 2009 23:31:15 +0000 (15:31 -0800)]
cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write

Impact: use new work_on_cpu function to reduce stack usage

Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with
a work_on_cpu function for drv_read() and drv_write().

Basically converts do_drv_{read,write} into "work_on_cpu" functions that
are now called by drv_read and drv_write.

Note: This patch basically reverts 50c668d6 which reverted 7503bfba, now
that the work_on_cpu() function is more stable.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Dieter Ries <clip2@gmx.de>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <cpufreq@vger.kernel.org>
15 years agowork_on_cpu: Use our own workqueue.
Rusty Russell [Fri, 16 Jan 2009 23:31:15 +0000 (15:31 -0800)]
work_on_cpu: Use our own workqueue.

Impact: remove potential clashes with generic kevent workqueue

Annoyingly, some places we want to use work_on_cpu are already in
workqueues.  As per Ingo's suggestion, we create a different workqueue
for work_on_cpu.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
15 years agowork_on_cpu: don't try to get_online_cpus() in work_on_cpu.
Rusty Russell [Fri, 16 Jan 2009 23:31:15 +0000 (15:31 -0800)]
work_on_cpu: don't try to get_online_cpus() in work_on_cpu.

Impact: remove potential circular lock dependency with cpu hotplug lock

This has caused more problems than it solved, with a pile of cpu
hotplug locking issues.

Followup patches will get_online_cpus() in callers that need it, but
if they don't do it they're no worse than before when they were using
set_cpus_allowed without locking.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
15 years agox86_64: initialize this_cpu_off to __per_cpu_load
Tejun Heo [Fri, 16 Jan 2009 03:11:43 +0000 (12:11 +0900)]
x86_64: initialize this_cpu_off to __per_cpu_load

On x86_64, if get_per_cpu_var() is used before per cpu area is setup
(if lockdep is turned on, it happens), it needs this_cpu_off to point
to __per_cpu_load.  Initialize accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: fix build bug introduced during merge
Tejun Heo [Fri, 16 Jan 2009 02:19:03 +0000 (11:19 +0900)]
x86: fix build bug introduced during merge

EXPORT_PER_CPU_SYMBOL() got misplaced during merge leading to build
failure.  Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agopercpu: add optimized generic percpu accessors
Ingo Molnar [Thu, 15 Jan 2009 13:15:53 +0000 (22:15 +0900)]
percpu: add optimized generic percpu accessors

It is an optimization and a cleanup, and adds the following new
generic percpu methods:

  percpu_read()
  percpu_write()
  percpu_add()
  percpu_sub()
  percpu_and()
  percpu_or()
  percpu_xor()

and implements support for them on x86. (other architectures will fall
back to a default implementation)

The advantage is that for example to read a local percpu variable,
instead of this sequence:

 return __get_cpu_var(var);

 ffffffff8102ca2b: 48 8b 14 fd 80 09 74  mov    -0x7e8bf680(,%rdi,8),%rdx
 ffffffff8102ca32: 81
 ffffffff8102ca33: 48 c7 c0 d8 59 00 00  mov    $0x59d8,%rax
 ffffffff8102ca3a: 48 8b 04 10           mov    (%rax,%rdx,1),%rax

We can get a single instruction by using the optimized variants:

 return percpu_read(var);

 ffffffff8102ca3f: 65 48 8b 05 91 8f fd  mov    %gs:0x7efd8f91(%rip),%rax

I also cleaned up the x86-specific APIs and made the x86 code use
these new generic percpu primitives.

tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
    * added percpu_and() for completeness's sake
    * made generic percpu ops atomic against preemption

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: misc clean up after the percpu update
Tejun Heo [Tue, 13 Jan 2009 11:41:35 +0000 (20:41 +0900)]
x86: misc clean up after the percpu update

Do the following cleanups:

* kill x86_64_init_pda() which now is equivalent to pda_init()

* use per_cpu_offset() instead of cpu_pda() when initializing
  initial_gs

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: convert pda ops to wrappers around x86 percpu accessors
Tejun Heo [Tue, 13 Jan 2009 11:41:35 +0000 (20:41 +0900)]
x86: convert pda ops to wrappers around x86 percpu accessors

pda is now a percpu variable and there's no reason it can't use plain
x86 percpu accessors.  Add x86_test_and_clear_bit_percpu() and replace
pda op implementations with wrappers around x86 percpu accessors.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: make pda a percpu variable
Tejun Heo [Tue, 13 Jan 2009 11:41:35 +0000 (20:41 +0900)]
x86: make pda a percpu variable

[ Based on original patch from Christoph Lameter and Mike Travis. ]

As pda is now allocated in percpu area, it can easily be made a proper
percpu variable.  Make it so by defining per cpu symbol from linker
script and declaring it in C code for SMP and simply defining it for
UP.  This change cleans up code and brings SMP and UP closer a bit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: merge 64 and 32 SMP percpu handling
Tejun Heo [Tue, 13 Jan 2009 11:41:35 +0000 (20:41 +0900)]
x86: merge 64 and 32 SMP percpu handling

Now that pda is allocated as part of percpu, percpu doesn't need to be
accessed through pda.  Unify x86_64 SMP percpu access with x86_32 SMP
one.  Other than the segment register, operand size and the base of
percpu symbols, they behave identical now.

This patch replaces now unnecessary pda->data_offset with a dummy
field which is necessary to keep stack_canary at its place.  This
patch also moves per_cpu_offset initialization out of init_gdt() into
setup_per_cpu_areas().  Note that this change also necessitates
explicit per_cpu_offset initializations in voyager_smp.c.

With this change, x86_OP_percpu()'s are as efficient on x86_64 as on
x86_32 and also x86_64 can use assembly PER_CPU macros.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: fold pda into percpu area on SMP
Tejun Heo [Tue, 13 Jan 2009 11:41:35 +0000 (20:41 +0900)]
x86: fold pda into percpu area on SMP

[ Based on original patch from Christoph Lameter and Mike Travis. ]

Currently pdas and percpu areas are allocated separately.  %gs points
to local pda and percpu area can be reached using pda->data_offset.
This patch folds pda into percpu area.

Due to strange gcc requirement, pda needs to be at the beginning of
the percpu area so that pda->stack_canary is at %gs:40.  To achieve
this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is
added and used to reserve pda sized chunk at the start of the percpu
area.

After this change, for boot cpu, %gs first points to pda in the
data.init area and later during setup_per_cpu_areas() gets updated to
point to the actual pda.  This means that setup_per_cpu_areas() need
to reload %gs for CPU0 while clearing pda area for other cpus as cpu0
already has modified it when control reaches setup_per_cpu_areas().

This patch also removes now unnecessary get_local_pda() and its call
sites.

A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: use static _cpu_pda array
Tejun Heo [Tue, 13 Jan 2009 11:41:35 +0000 (20:41 +0900)]
x86: use static _cpu_pda array

_cpu_pda array first uses statically allocated storage in data.init
and then switches to allocated bootmem to conserve space.  However,
after folding pda area into percpu area, _cpu_pda array will be
removed completely.  Drop the reallocation part to simplify the code
for soon-to-follow changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: load pointer to pda into %gs while brining up a CPU
Tejun Heo [Tue, 13 Jan 2009 11:41:35 +0000 (20:41 +0900)]
x86: load pointer to pda into %gs while brining up a CPU

[ Based on original patch from Christoph Lameter and Mike Travis. ]

CPU startup code in head_64.S loaded address of a zero page into %gs
for temporary use till pda is loaded but address to the actual pda is
available at the point.  Load the real address directly instead.

This will help unifying percpu and pda handling later on.

This patch is mostly taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
15 years agox86: make percpu symbols zerobased on SMP
Tejun Heo [Tue, 13 Jan 2009 11:41:35 +0000 (20:41 +0900)]
x86: make percpu symbols zerobased on SMP

[ Based on original patch from Christoph Lameter and Mike Travis. ]

This patch makes percpu symbols zerobased on x86_64 SMP by adding
PERCPU_VADDR() to vmlinux.lds.h which helps setting explicit vaddr on
the percpu output section and using it in vmlinux_64.lds.S.  A new
PHDR is added as existing ones cannot contain sections near address
zero.  PERCPU_VADDR() also adds a new symbol __per_cpu_load which
always points to the vaddr of the loaded percpu data.init region.

The following adjustments have been made to accomodate the address
change.

* code to locate percpu gdt_page in head_64.S is updated to add the
  load address to the gdt_page offset.

* __per_cpu_load is used in places where access to the init data area
  is necessary.

* pda->data_offset is initialized soon after C code is entered as zero
  value doesn't work anymore.

This patch is mostly taken from Mike Travis' "x86_64: Base percpu
variables at zero" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: make vmlinux_32.lds.S use PERCPU() macro
Tejun Heo [Tue, 13 Jan 2009 11:41:35 +0000 (20:41 +0900)]
x86: make vmlinux_32.lds.S use PERCPU() macro

Make vmlinux_32.lds.S use the generic PERCPU() macro instead of open
coding it.  This will ease future changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: cleanup early setup_percpu references
Mike Travis [Tue, 13 Jan 2009 11:41:34 +0000 (20:41 +0900)]
x86: cleanup early setup_percpu references

[ Based on original patch from Christoph Lameter and Mike Travis. ]

  * Ruggedize some calls in setup_percpu.c to prevent mishaps
    in early calls, particularly for non-critical functions.

  * Cleanup DEBUG_PER_CPU_MAPS usages and some comments.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: make early_per_cpu() a lvalue and use it
Tejun Heo [Tue, 13 Jan 2009 11:41:34 +0000 (20:41 +0900)]
x86: make early_per_cpu() a lvalue and use it

Make early_per_cpu() a lvalue as per_cpu() is and use it where
applicable.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: fix pda_to_op()
Tejun Heo [Tue, 13 Jan 2009 11:41:34 +0000 (20:41 +0900)]
x86: fix pda_to_op()

There's no instruction to move a 64bit immediate into memory location.
Drop "i".

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agosched: fix warning on ia64
Mike Travis [Thu, 15 Jan 2009 20:09:44 +0000 (12:09 -0800)]
sched: fix warning on ia64

Andrew Morton reported this warning on ia64:

  kernel/sched.c: In function `sd_init_NODE':
  kernel/sched.c:7449: warning: comparison of distinct pointer types lacks a cast

Using the untyped min() function produces such warnings.
Fix: type the constant 32 as unsigned int to match typeof(num_online_cpus).

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/travis...
Ingo Molnar [Thu, 15 Jan 2009 17:37:07 +0000 (18:37 +0100)]
Merge branch 'master' of ssh:///linux/kernel/git/travis/linux-2.6-cpus4096-for-ingo into cpus4096

15 years agox86: fix build warning when CONFIG_NUMA not defined.
Mike Travis [Thu, 15 Jan 2009 17:19:32 +0000 (09:19 -0800)]
x86: fix build warning when CONFIG_NUMA not defined.

Impact: fix build warning

The macro cpu_to_node did not reference it's argument, and instead
simply returned a 0.  This causes a "unused variable" warning if
it's the only reference in a function (show_cache_disable).

Replace it with the more correct inline function.

Signed-off-by: Mike Travis <travis@sgi.com>
15 years agofix: crash: IP: __bitmap_intersects+0x48/0x73
Ingo Molnar [Thu, 15 Jan 2009 14:46:08 +0000 (15:46 +0100)]
fix: crash: IP: __bitmap_intersects+0x48/0x73

-tip testing found this crash:

> [   35.258515] calling  acpi_cpufreq_init+0x0/0x127 @ 1
> [   35.264127] BUG: unable to handle kernel NULL pointer dereference at (null)
> [   35.267554] IP: [<ffffffff80478092>] __bitmap_intersects+0x48/0x73
> [   35.267554] PGD 0
> [   35.267554] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC

arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c is still broken: there's no
allocation of the variable mask, so we pass in an uninitialized cmd.mask
field to drv_read(), which then passes it to the scheduler which then
crashes ...

Switch it over to the much simpler constant-cpumask-pointers approach.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'linus' into cpus4096
Ingo Molnar [Thu, 15 Jan 2009 14:45:31 +0000 (15:45 +0100)]
Merge branch 'linus' into cpus4096

15 years agoMerge branches 'cpus4096', 'x86/cleanups' and 'x86/urgent' into x86/percpu
Ingo Molnar [Thu, 15 Jan 2009 12:18:57 +0000 (13:18 +0100)]
Merge branches 'cpus4096', 'x86/cleanups' and 'x86/urgent' into x86/percpu

15 years agox86: fix broken flush_tlb_others_ipi(), fix
Ingo Molnar [Thu, 15 Jan 2009 12:04:58 +0000 (13:04 +0100)]
x86: fix broken flush_tlb_others_ipi(), fix

Impact: cleanup

Use the proper type.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: avoid early crash in disable_local_APIC()
Jan Beulich [Wed, 14 Jan 2009 12:28:51 +0000 (12:28 +0000)]
x86: avoid early crash in disable_local_APIC()

E.g. when called due to an early panic.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: fully honor "nolapic"
Jan Beulich [Wed, 14 Jan 2009 12:27:35 +0000 (12:27 +0000)]
x86: fully honor "nolapic"

Impact: widen the effect of the 'nolapic' boot parameter

"nolapic" should not only suppress SMP and use of the LAPIC, but it
also ought to have the effect of disabling all IO-APIC related activity
as well as PCI MSI and HT-IRQs.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoirq: update all arches for new irq_desc, fix
Mike Travis [Wed, 14 Jan 2009 23:43:54 +0000 (15:43 -0800)]
irq: update all arches for new irq_desc, fix

Impact: fix build errors

Since the SPARSE IRQS changes redefined how the kstat irqs are
organized, arch's must use the new accessor function:

kstat_incr_irqs_this_cpu(irq, DESC);

If CONFIG_SPARSE_IRQS is set, then DESC is a pointer to the
irq_desc which has a pointer to the kstat_irqs.  If not, then
the .irqs field of struct kernel_stat is used instead.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86, pat: fix reserve_memtype() for legacy 1MB range
Suresh Siddha [Fri, 9 Jan 2009 22:35:20 +0000 (14:35 -0800)]
x86, pat: fix reserve_memtype() for legacy 1MB range

Thierry Vignaud reported:
> http://bugzilla.kernel.org/show_bug.cgi?id=12372
>
> On P4 with an SiS motherboard (video card is a SiS 651)
> X server fails to start with error:
> xf86MapVidMem: Could not mmap framebuffer (0x00000000,0x2000) (Invalid
> argument)

Here X is trying to map first 8KB of memory using /dev/mem. Existing
code treats first 0-4KB of memory as non-RAM and 4KB-8KB as RAM. Recent
code changes don't allow to map memory with different attributes
at the same time.

Fix this by treating the first 1MB legacy region as special and always
track the attribute requests with in this region using linear linked
list (and don't bother if the range is RAM or non-RAM or mixed)

Reported-and-tested-by: Thierry Vignaud <tvignaud@mandriva.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: make 32bit MAX_HARDIRQS_PER_CPU to be NR_VECTORS
Yinghai Lu [Mon, 12 Jan 2009 19:54:27 +0000 (11:54 -0800)]
x86: make 32bit MAX_HARDIRQS_PER_CPU to be NR_VECTORS

Impact: clean up to be same as 64bit

32-bit is using per-cpu vector too, so don't use default NR_IRQS.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/travis...
Ingo Molnar [Wed, 14 Jan 2009 11:13:45 +0000 (12:13 +0100)]
Merge branch 'master' of ssh:///linux/kernel/git/travis/linux-2.6-cpus4096-for-ingo into cpus4096

15 years agox86: replacing mp_config_intsrc with mpc_intsrc
Jaswinder Singh Rajput [Mon, 12 Jan 2009 12:17:22 +0000 (17:47 +0530)]
x86: replacing mp_config_intsrc with mpc_intsrc

Impact: cleanup, solve 80 columns wrap problems

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: replacing mp_config_ioapic with mpc_ioapic
Jaswinder Singh Rajput [Mon, 12 Jan 2009 12:16:17 +0000 (17:46 +0530)]
x86: replacing mp_config_ioapic with mpc_ioapic

Impact: cleanup, solve 80 columns wrap problems

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86: fix broken flush_tlb_others_ipi()
Suresh Siddha [Wed, 14 Jan 2009 02:43:03 +0000 (18:43 -0800)]
x86: fix broken flush_tlb_others_ipi()

This commit broke flush_tlb_others_ipi() causing boot hangs on a
16 logical cpu system:

> commit 4595f9620cda8a1e973588e743cf5f8436dd20c6
> Author: Rusty Russell <rusty@rustcorp.com.au>
> Date:   Sat Jan 10 21:58:09 2009 -0800
>
>     x86: change flush_tlb_others to take a const struct cpumask

This change resulted in sending the invalidate tlb vector to the
sender itself causing the hang. flush_tlb_others_ipi() should exclude
the sender itself from the destination list.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 13 Jan 2009 22:53:16 +0000 (14:53 -0800)]
Merge branch 'x86-pat-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86 PAT: remove CPA WARN_ON for zero pte
  x86 PAT: return compatible mapping to remap_pfn_range callers
  x86 PAT: change track_pfn_vma_new to take pgprot_t pointer param
  x86 PAT: consolidate old memtype new memtype check into a function
  x86 PAT: remove PFNMAP type on track_pfn_vma_new() error

15 years agoMerge master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Tue, 13 Jan 2009 22:52:35 +0000 (14:52 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  TWL4030: fix clk API usage
  [ARM] 5364/1: allow flush_ioremap_region() to be used from modules
  [ARM] w90x900: fix build errors and warnings
  [ARM] i.MX add missing include
  [ARM] i.MX: fix breakage from commit 278892736e99330195c8ae5861bcd9d791bbf19e
  [ARM] i.MX: remove LCDC controller register definitions from imx-regs.h

15 years agoFix timeouts in sys_pselect7
Bernd Schmidt [Tue, 13 Jan 2009 21:14:48 +0000 (22:14 +0100)]
Fix timeouts in sys_pselect7

Since we (Analog Devices) updated our Blackfin kernel to 2.6.28, we've
seen occasional 5-second hangs from telnet.  telnetd calls select with a
NULL timeout, but with the new kernel, the system call occasionally
returns 0, which causes telnet to call sleep (5).  This did not happen
with earlier kernels.

The code in sys_pselect7 looks a bit strange, in particular the variable
"to" is initialized to NULL, then changed if a non-null timeout was
passed in, but not used further.  It needs to be passed to
core_sys_select instead of &end_time.

This bug was introduced by 8ff3e8e85fa6c312051134b3953e397feb639f51
("select: switch select() and poll() over to hrtimers").

Signed-off-by: Bernd Schmidt <bernd.schmidt@analog.com>
Reviewed-by: Ulrich Drepper <drepper@redhat.com>
Tested-by: Robin Getz <rgetz@blackfin.uclinux.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agofix early_serial_setup() regression
Helge Deller [Tue, 13 Jan 2009 21:51:07 +0000 (22:51 +0100)]
fix early_serial_setup() regression

Commit b430428a188e8a434325e251d0704af4b88b4711 ("8250: Don't clobber
spinlocks.") introduced a regression on the parisc architecture, which
broke the handover to the serial port at boottime.

early_serial_setup() was changed to only copy a subset of the uart_port
fields, and sadly the "type" and "line" fields were forgotten and thus
the serial port was not initialized and could not be used for a
handover.  This patch fixes this by copying the missing fields.

As this change to early_serial_setup() doesn't need an initialized
spinlock in the uart_port struct any longer, we can drop the spinlock
initialization in the superio driver.

Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Tomaso Paoletti <tpaoletti@caviumnetworks.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoTWL4030: fix clk API usage
Russell King [Sat, 10 Jan 2009 10:40:42 +0000 (10:40 +0000)]
TWL4030: fix clk API usage

Always pass a struct device if one is available; and there's really
no reason for the processor specific stuff in this file if only
people would follow the API usage properly by using the struct device.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
15 years agox86 PAT: remove CPA WARN_ON for zero pte
venkatesh.pallipadi@intel.com [Sat, 10 Jan 2009 00:13:14 +0000 (16:13 -0800)]
x86 PAT: remove CPA WARN_ON for zero pte

Impact: reduce scope of debug check - avoid warnings

The logic to find whether identity map exists or not using
high_memory or max_low_pfn_mapped/max_pfn_mapped are not complete
as the memory withing the range may not be mapped if there is a
unusable hole in e820.

Specifically, on my test system I started seeing these warnings with
tools like hwinfo, acpidump trying to map ACPI region.

[   27.400018] ------------[ cut here ]------------
[   27.400344] WARNING: at /home/venkip/src/linus/linux-2.6/arch/x86/mm/pageattr.c:560 __change_page_attr_set_clr+0xf3/0x8b8()
[   27.400821] Hardware name: X7DB8
[   27.401070] CPA: called for zero pte. vaddr = ffff8800cff6a000 cpa->vaddr = ffff8800cff6a000
[   27.401569] Modules linked in:
[   27.401882] Pid: 4913, comm: dmidecode Not tainted 2.6.28-05716-gfe0bdec #586
[   27.402141] Call Trace:
[   27.402488]  [<ffffffff80237c21>] warn_slowpath+0xd3/0x10f
[   27.402749]  [<ffffffff80274ade>] ? find_get_page+0xb3/0xc9
[   27.403028]  [<ffffffff80274a2b>] ? find_get_page+0x0/0xc9
[   27.403333]  [<ffffffff80226425>] __change_page_attr_set_clr+0xf3/0x8b8
[   27.403628]  [<ffffffff8028ec99>] ? __purge_vmap_area_lazy+0x192/0x1a1
[   27.403883]  [<ffffffff8028eb52>] ? __purge_vmap_area_lazy+0x4b/0x1a1
[   27.404172]  [<ffffffff80290268>] ? vm_unmap_aliases+0x1ab/0x1bb
[   27.404512]  [<ffffffff80290105>] ? vm_unmap_aliases+0x48/0x1bb
[   27.404766]  [<ffffffff80226d28>] change_page_attr_set_clr+0x13e/0x2e6
[   27.405026]  [<ffffffff80698fa7>] ? _spin_unlock+0x26/0x2a
[   27.405292]  [<ffffffff80227e6a>] ? reserve_memtype+0x19b/0x4e3
[   27.405590]  [<ffffffff80226ffd>] _set_memory_wb+0x22/0x24
[   27.405844]  [<ffffffff80225d28>] ioremap_change_attr+0x26/0x28
[   27.406097]  [<ffffffff80228355>] reserve_pfn_range+0x1a3/0x235
[   27.406427]  [<ffffffff80228430>] track_pfn_vma_new+0x49/0xb3
[   27.406686]  [<ffffffff80286c46>] remap_pfn_range+0x94/0x32c
[   27.406940]  [<ffffffff8022878d>] ? phys_mem_access_prot_allowed+0xb5/0x1a8
[   27.407209]  [<ffffffff803e9bf4>] mmap_mem+0x75/0x9d
[   27.407523]  [<ffffffff8028b3b4>] mmap_region+0x2cf/0x53e
[   27.407776]  [<ffffffff8028b8cc>] do_mmap_pgoff+0x2a9/0x30d
[   27.408034]  [<ffffffff8020f4a4>] sys_mmap+0x92/0xce
[   27.408339]  [<ffffffff8020b65b>] system_call_fastpath+0x16/0x1b
[   27.408614] ---[ end trace 4b16ad70c09a602d ]---
[   27.408871] dmidecode:4913 reserve_pfn_range ioremap_change_attr failed write-back for cff6a000-cff6b000

This is wih track_pfn_vma_new trying to keep identity map in sync.
The address cff6a000 is the ACPI region according to e820.

[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
[    0.000000]  BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000cc000 - 00000000000d0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 00000000cff60000 (usable)
[    0.000000]  BIOS-e820: 00000000cff60000 - 00000000cff69000 (ACPI data)
[    0.000000]  BIOS-e820: 00000000cff69000 - 00000000cff80000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000cff80000 - 00000000d0000000 (reserved)
[    0.000000]  BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
[    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
[    0.000000]  BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
[    0.000000]  BIOS-e820: 0000000100000000 - 0000000230000000 (usable)

And is not mapped as per init_memory_mapping.

[    0.000000] init_memory_mapping: 0000000000000000-00000000cff60000
[    0.000000] init_memory_mapping: 0000000100000000-0000000230000000

We can add logic to check for this. But, there can also be other holes in
identity map when we have 1GB of aligned reserved space in e820.

This patch handles it by removing the WARN_ON and returning a specific
error value (EFAULT) to indicate that the address does not have any
identity mapping.

The code that tries to keep identity map in sync can ignore
this error, with other callers of cpa still getting error here.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86 PAT: return compatible mapping to remap_pfn_range callers
venkatesh.pallipadi@intel.com [Sat, 10 Jan 2009 00:13:12 +0000 (16:13 -0800)]
x86 PAT: return compatible mapping to remap_pfn_range callers

Impact: avoid warning message, potentially solve 3D performance regression

Change x86 PAT code to return compatible memtype if the exact memtype that
was requested in remap_pfn_rage and friends is not available due to some
conflict.

This is done by returning the compatible type in pgprot parameter of
track_pfn_vma_new(), and the caller uses that memtype for page table.

Note that track_pfn_vma_copy() which is basically called during fork gets the
prot from existing page table and should not have any conflict. Hence we use
strict memtype check there and do not allow compatible memtypes.

This patch fixes the bug reported here:

  http://marc.info/?l=linux-kernel&m=123108883716357&w=2

Specifically the error message:

  X:5010 map pfn expected mapping type write-back for d0000000-d0101000,
  got write-combining

Should go away.

Reported-and-bisected-by: Kevin Winchester <kjwinchester@gmail.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86 PAT: change track_pfn_vma_new to take pgprot_t pointer param
venkatesh.pallipadi@intel.com [Sat, 10 Jan 2009 00:13:11 +0000 (16:13 -0800)]
x86 PAT: change track_pfn_vma_new to take pgprot_t pointer param

Impact: cleanup

Change the protection parameter for track_pfn_vma_new() into a pgprot_t pointer.
Subsequent patch changes the x86 PAT handling to return a compatible
memtype in pgprot_t, if what was requested cannot be allowed due to conflicts.
No fuctionality change in this patch.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86 PAT: consolidate old memtype new memtype check into a function
venkatesh.pallipadi@intel.com [Sat, 10 Jan 2009 00:13:10 +0000 (16:13 -0800)]
x86 PAT: consolidate old memtype new memtype check into a function

Impact: cleanup

Move the new memtype old memtype allowed check to header so that is can be
shared by other users. Subsequent patch uses this in pat.c in remap_pfn_range()
code path. No functionality change in this patch.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86 PAT: remove PFNMAP type on track_pfn_vma_new() error
venkatesh.pallipadi@intel.com [Sat, 10 Jan 2009 00:13:09 +0000 (16:13 -0800)]
x86 PAT: remove PFNMAP type on track_pfn_vma_new() error

Impact: fix (harmless) double-free of memtype entries and avoid warning

On track_pfn_vma_new() failure, reset the vm_flags so that there will be
no second cleanup happening when upper level routines call unmap_vmas().

This patch fixes part of the bug reported here:

  http://marc.info/?l=linux-kernel&m=123108883716357&w=2

Specifically the error message:

  X:5010 freeing invalid memtype d0000000-d0101000

Is due to multiple frees on error path, will not happen with the patch below.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86, generic: mark complex bitops.h inlines as __always_inline
Andi Kleen [Mon, 12 Jan 2009 22:01:15 +0000 (23:01 +0100)]
x86, generic: mark complex bitops.h inlines as __always_inline

Impact: reduce kernel image size

Hugh Dickins noticed that older gcc versions when the kernel
is built for code size didn't inline some of the bitops.

Mark all complex x86 bitops that have more than a single
asm statement or two as always inline to avoid this problem.

Probably should be done for other architectures too.

Ingo then found a better fix that only requires
a single line change, but it unfortunately only
works on gcc 4.3.

On older gccs the original patch still makes a ~0.3% defconfig
difference with CONFIG_OPTIMIZE_INLINING=y.

With gcc 4.1 and a defconfig like build:

    6116998 1138540  883788 8139326  7c323e vmlinux-oi-with-patch
    6137043 1138540  883788 8159371  7c808b vmlinux-optimize-inlining

~20k / 0.3% difference.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 13 Jan 2009 17:03:02 +0000 (09:03 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  Revert "i386: add TRACE_IRQS_OFF for the nmi"

15 years agoMerge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 13 Jan 2009 17:02:21 +0000 (09:02 -0800)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  smp_call_function_single(): be slightly less stupid, fix #2
  lockdep, mm: fix might_fault() annotation

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland...
Linus Torvalds [Tue, 13 Jan 2009 16:19:42 +0000 (08:19 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/iser: Add dependency on INFINIBAND_ADDR_TRANS
  IPoIB: Do not join broadcast group if interface is brought down
  RDMA/nes: Fix for NIPQUAD removal
  IPoIB: Fix loss of connectivity after bonding failover on both sides
  IB/mlx4: Don't register IB device for adapters with no IB ports
  mlx4_core: Fix warning from min()
  IB/ehca: spin_lock_irqsave() takes an unsigned long

15 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzi...
Linus Torvalds [Tue, 13 Jan 2009 16:17:41 +0000 (08:17 -0800)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  pata_it821x: Update RDC UDMA handling
  ata: fix wrong WARN_ON_ONCE

15 years agoPrevent oops at boot with VT-d
Dirk Hohndel [Sun, 11 Jan 2009 15:33:51 +0000 (15:33 +0000)]
Prevent oops at boot with VT-d

With some broken BIOSs when VT-d is enabled, the data structures are
filled incorrectly. This can cause a NULL pointer dereference in very
early boot.

Signed-off-by: Dirk Hohndel <hohndel@linux.intel.com>
Acked-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agopata_it821x: Update RDC UDMA handling
Alan Cox [Sun, 11 Jan 2009 19:51:08 +0000 (19:51 +0000)]
pata_it821x: Update RDC UDMA handling

The UDMA affliction is apparently specific to revision 0x11. Keeps us in sync
with drivers/ide current.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoata: fix wrong WARN_ON_ONCE
Christian Borntraeger [Tue, 13 Jan 2009 09:38:36 +0000 (10:38 +0100)]
ata: fix wrong WARN_ON_ONCE

This patch fixes a wrong WARN_ON that was triggered by 32bit PIO support:
WARNING: at drivers/ata/libata-sff.c:1017 ata_sff_hsm_move+0x45e/0x750()

__atapi_pio_bytes simply doesnt know enough to decide if there is a bug.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agox86, cpufreq: remove leftover copymask_copy()
Ingo Molnar [Tue, 13 Jan 2009 15:11:00 +0000 (16:11 +0100)]
x86, cpufreq: remove leftover copymask_copy()

Impact: fix potential boot crash on MAXSMP

Remove code left over by:

  50c668d: Revert "cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read

That cmd.cpumask is not allocated anymore. No impact on default !MAXSMP
kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branches 'ehca', 'ipoib', 'iser', 'mlx4' and 'nes' into for-next
Roland Dreier [Tue, 13 Jan 2009 03:37:31 +0000 (19:37 -0800)]
Merge branches 'ehca', 'ipoib', 'iser', 'mlx4' and 'nes' into for-next

15 years agoIB/iser: Add dependency on INFINIBAND_ADDR_TRANS
Randy Dunlap [Tue, 13 Jan 2009 03:30:41 +0000 (19:30 -0800)]
IB/iser: Add dependency on INFINIBAND_ADDR_TRANS

Fix ib_iser build to depend on INFINIBAND_ADDR_TRANS; if INET=y but
IPV6=n, then the RDMA CM is not built but INFINIBAND_ISER can be
enabled, leading to:

    ERROR: "rdma_destroy_id" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
    ERROR: "rdma_connect" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
    ERROR: "rdma_destroy_qp" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
    ERROR: "rdma_create_id" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
    ERROR: "rdma_create_qp" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
    ERROR: "rdma_resolve_route" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
    ERROR: "rdma_disconnect" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
    ERROR: "rdma_resolve_addr" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
15 years agoIPoIB: Do not join broadcast group if interface is brought down
Yossi Etigin [Tue, 13 Jan 2009 03:28:42 +0000 (19:28 -0800)]
IPoIB: Do not join broadcast group if interface is brought down

Because the ipoib_workqueue is not flushed when ipoib interface is
brought down, ipoib_mcast_join() may trigger a join to the broadcast
group after priv->broadcast was set to NULL (during cleanup).  This
will cause the system to be a member of the broadcast group when
interface is down.  As a side effect, this breaks the optimization of
setting the Q_key only when joining the broadcast group.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
15 years agox86: arch_probe_nr_irqs
Yinghai Lu [Tue, 13 Jan 2009 01:39:24 +0000 (17:39 -0800)]
x86: arch_probe_nr_irqs

Impact: save RAM with large NR_CPUS, get smaller nr_irqs

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Mike Travis <travis@sgi.com>