Linus Torvalds [Tue, 4 Mar 2008 17:23:28 +0000 (09:23 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mingo/linux-2.6-sched-devel
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel:
sched: revert load_balance_monitor() changes
Linus Torvalds [Tue, 4 Mar 2008 17:22:32 +0000 (09:22 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/x86/linux-2.6-x86
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
x86/xen: fix DomU boot problem
x86: not set node to cpu_to_node if the node is not online
x86, i387: fix ptrace leakage using init_fpu()
Linus Torvalds [Tue, 4 Mar 2008 17:22:05 +0000 (09:22 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/avi/kvm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
x86: disable KVM for Voyager and friends
KVM: VMX: Avoid rearranging switched guest msrs while they are loaded
KVM: MMU: Fix race when instantiating a shadow pte
KVM: Route irq 0 to vcpu 0 exclusively
KVM: Avoid infinite-frequency local apic timer
KVM: make MMU_DEBUG compile again
KVM: move alloc_apic_access_page() outside of non-preemptable region
KVM: SVM: fix Windows XP 64 bit installation crash
KVM: remove the usage of the mmap_sem for the protection of the memory slots.
KVM: emulate access to MSR_IA32_MCG_CTL
KVM: Make the supported cpuid list a host property rather than a vm property
KVM: Fix kvm_arch_vcpu_ioctl_set_sregs so that set_cr0 works properly
KVM: SVM: set NM intercept when enabling CR0.TS in the guest
KVM: SVM: Fix lazy FPU switching
Peter Zijlstra [Mon, 25 Feb 2008 16:34:02 +0000 (17:34 +0100)]
sched: revert load_balance_monitor() changes
The following commits cause a number of regressions:
commit 58e2d4ca581167c2a079f4ee02be2f0bc52e8729
Author: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Date: Fri Jan 25 21:08:00 2008 +0100
sched: group scheduling, change how cpu load is calculated
commit 6b2d7700266b9402e12824e11e0099ae6a4a6a79
Author: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Date: Fri Jan 25 21:08:00 2008 +0100
sched: group scheduler, fix fairness of cpu bandwidth allocation for task groups
Namely:
- very frequent wakeups on SMP, reported by PowerTop users.
- cacheline thrashing on (large) SMP
- some latencies larger than 500ms
While there is a mergeable patch to fix the latter, the former issues
are not fixable in a manner suitable for .25 (we're at -rc3 now).
Hence we revert them and try again in v2.6.26.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Tested-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ian Campbell [Thu, 28 Feb 2008 23:16:49 +0000 (23:16 +0000)]
x86/xen: fix DomU boot problem
Construct Xen guest e820 map with a hole between 640K-1M.
It's pure luck that Xen kernels have gotten away with it in the past.
The patch below seems like the right thing to do. It certainly boots in
a domU without the DMI problem (without any of the other related patches
such as Alexander's).
Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Tested-by: Mark McLoughlin <markmc@redhat.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
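As a rough illustration of the e820 change described above, here is a minimal sketch of a memory map that leaves 640K-1M unclaimed. The struct, the constants, and build_guest_map() are hypothetical stand-ins, not the kernel's e820 code, and the sketch assumes the guest has more than 1M of memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an e820 entry; not the kernel's struct. */
enum region_type { REGION_RAM = 1 };

struct region {
	uint64_t addr;
	uint64_t size;
	enum region_type type;
};

#define LOW_RAM_END    (640ULL * 1024)   /* 640K: end of conventional memory */
#define HIGH_RAM_START (1024ULL * 1024)  /* 1M: start of extended memory */

/*
 * Fill `map` (room for at least two entries) with RAM regions covering
 * [0, 640K) and [1M, mem_end), leaving 640K-1M out of the map.
 * Returns the number of entries written.
 */
size_t build_guest_map(struct region *map, uint64_t mem_end)
{
	size_t n = 0;

	map[n++] = (struct region){ 0, LOW_RAM_END, REGION_RAM };
	if (mem_end > HIGH_RAM_START)
		map[n++] = (struct region){ HIGH_RAM_START,
					    mem_end - HIGH_RAM_START,
					    REGION_RAM };
	return n;
}
```

With a 64M guest this yields two RAM entries, and nothing in the map claims the legacy 640K-1M range.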
Yinghai Lu [Tue, 19 Feb 2008 23:35:54 +0000 (15:35 -0800)]
x86: not set node to cpu_to_node if the node is not online
resolve boot problem reported by Mel Gorman:
http://lkml.org/lkml/2008/2/13/404
init_cpu_to_node will use cpu->apic (from MADT or mptable) and
apic->node(from SRAT or AMD config space with k8_bus_64.c) to have
cpu->node mapping, and later identify_cpu will overwrite them
again...(with nearby_node...)
This patch checks whether the node is online; if it is not, the
cpu_node map is not updated. This keeps the cpu_node map pointing at
online nodes before identify_cpu runs, to prevent possible errors.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Suresh Siddha [Mon, 3 Mar 2008 21:01:08 +0000 (13:01 -0800)]
x86, i387: fix ptrace leakage using init_fpu()
This bug got introduced by the recent i387 merge:
commit 4421011120b2304e5c248ae4165a2704588aedf1
Author: Roland McGrath <roland@redhat.com>
Date: Wed Jan 30 13:31:50 2008 +0100
x86: x86 i387 user_regset
Current usage of unlazy_fpu() in ptrace specific routines is wrong.
unlazy_fpu() will not init fpu if the task never used math. So the
ptrace calls can expose the parent task's FPU data in some cases.
Replace it with the init_fpu() which will init the math state, if the
task never used math before.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Tue, 4 Mar 2008 16:08:05 +0000 (08:08 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: fix blkdev_issue_flush() not detecting and passing EOPNOTSUPP back
block: fix shadowed variable warning in blk-map.c
block: remove extern on function definition
cciss: remove READ_AHEAD define and use block layer defaults
make cdrom.c:check_for_audio_disc() static
block/genhd.c: proper externs
unexport blk_rq_map_user_iov
unexport blk_{get,put}_queue
block/genhd.c: cleanups
proper prototype for blk_dev_init()
block/blk-tag.c should #include "blk.h"
Fix DMA access of block device in 64-bit kernel on some non-x86 systems with 4GB or upper 4GB memory
block: separate out padding from alignment
block: restore the meaning of rq->data_len to the true data length
resubmit: cciss: procfs updates to display info about many
splice: only return -EAGAIN if there's hope of more data
block: fix kernel-docbook parameters and files
Geert Uytterhoeven [Tue, 4 Mar 2008 08:18:16 +0000 (09:18 +0100)]
m68k{,nommu}: Wire up new timerfd syscalls
m68k{,nommu}: Wire up the new timerfd syscalls, which were introduced in
commit 4d672e7ac79b5ec5cdc90e450823441e20464691 ("timerfd: new timerfd API").
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Greg Ungerer [Tue, 4 Mar 2008 06:52:01 +0000 (16:52 +1000)]
m68knommu: fix fec driver interrupt races
The FEC driver has a common interrupt handler for all interrupt event
types. It is raised on a number of distinct interrupt vectors.
This handler can't be re-entered while processing an interrupt, so
make sure all requested vectors are flagged as IRQF_DISABLED.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Greg Ungerer [Tue, 4 Mar 2008 06:35:04 +0000 (16:35 +1000)]
m68knommu: declare do_IRQ()
Need a declaration of do_IRQ for the 68328 interrupt handling code.
It is common to all m68knommu targets, so a common declaration makes
sense.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Greg Ungerer [Tue, 4 Mar 2008 06:24:17 +0000 (16:24 +1000)]
m68knommu: remove duplicate hw_tick() code
Remove duplicate hw_tick() function from 68328 timers code.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Greg Ungerer [Tue, 4 Mar 2008 05:44:23 +0000 (15:44 +1000)]
m68knommu: update defconfig
Update the m68knommu defconfig.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tony Breeds [Tue, 4 Mar 2008 05:05:06 +0000 (16:05 +1100)]
Build fix for drivers/s390/char/defkeymap.c
Commit 5ce2087ed0eb424e0889bdc9102727f65d2ecdde (Fix default compose
table initialization) left a trailing quote.
CC drivers/s390/char/defkeymap.o
drivers/s390/char/defkeymap.c:155: error: missing terminating ' character
drivers/s390/char/defkeymap.c:156: error: syntax error before ';' token
make[3]: *** [drivers/s390/char/defkeymap.o] Error 1
Fix that.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 4 Mar 2008 16:00:34 +0000 (08:00 -0800)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] remove unused variable
[CIFS] consolidate duplicate code in posix/unix inode handling
[CIFS] fix build break when proc disabled
[CIFS] factoring out common code in get_inode_info functions
[CIFS] fix prepath conversion when server supports posix paths
[CIFS] Only convert / when server does not support posix paths
[CIFS] Fix mixed case name in structure dfs_info3_param
[CIFS] fixup prefixpaths which contain multiple path components
[CIFS] fix typo
[CIFS] patch to fix incorrect encoding of number of aces on set mode
[CIFS] Fix typo in quota operations
[CIFS] clean up some hard to read ifdefs
[CIFS] reduce checkpatch warnings
[CIFS] fix warning in cifs_spnego.c
Roland McGrath [Tue, 4 Mar 2008 04:22:05 +0000 (20:22 -0800)]
freezer vs stopped or traced
This changes the "freezer" code used by suspend/hibernate in its treatment
of tasks in TASK_STOPPED (job control stop) and TASK_TRACED (ptrace) states.
As I understand it, the intent of the "freezer" is to hold all tasks
from doing anything significant. For this purpose, TASK_STOPPED and
TASK_TRACED are "frozen enough". It's possible the tasks might resume
from ptrace calls (if the tracer were unfrozen) or from signals
(including ones that could come via timer interrupts, etc). But this
doesn't matter as long as they quickly block again while "freezing" is
in effect. Some minor adjustments to the signal.c code make sure that
try_to_freeze() very shortly follows all wakeups from both kinds of
stop. This lets the freezer code safely leave stopped tasks unmolested.
Changing this fixes the longstanding bug of seeing after resuming from
suspend/hibernate your shell report "[1] Stopped" and the like for all
your jobs stopped by ^Z et al, as if you had freshly fg'd and ^Z'd them.
It also removes from the freezer the arcane special case treatment for
ptrace'd tasks, which relied on intimate knowledge of ptrace internals.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Wed, 20 Feb 2008 17:20:08 +0000 (09:20 -0800)]
x86: disable KVM for Voyager and friends
Most classic Pentiums don't have hardware virtualization extensions,
and building kvm with Voyager, Visual Workstation, or NUMAQ
generates spurious failures.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Avi Kivity [Wed, 27 Feb 2008 14:06:57 +0000 (16:06 +0200)]
KVM: VMX: Avoid rearranging switched guest msrs while they are loaded
KVM tries to run as much as possible with the guest msrs loaded instead of
host msrs, since switching msrs is very expensive. It also tries to minimize
the number of msrs switched according to the guest mode; for example,
MSR_LSTAR is needed only by long mode guests. This optimization is done by
setup_msrs().
However, we must not change which msrs are switched while we are running with
guest msr state:
- switch to guest msr state
- call setup_msrs(), removing some msrs from the list
- switch to host msr state, leaving a few guest msrs loaded
An easy way to trigger this is to kexec an x86_64 linux guest. Early during
setup, the guest will switch EFER to not include SCE. KVM will stop saving
MSR_LSTAR, and on the next msr switch it will leave the guest LSTAR loaded.
The next host syscall will end up in a random location in the kernel.
Fix by reloading the host msrs before changing the msr list.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Avi Kivity [Tue, 26 Feb 2008 20:12:10 +0000 (22:12 +0200)]
KVM: MMU: Fix race when instantiating a shadow pte
For improved concurrency, the guest walk is performed concurrently with other
vcpus. This means that we need to revalidate the guest ptes once we have
write-protected the guest page tables, at which point they can no longer be
modified.
The current code attempts to avoid this check if the shadow page table is not
new, on the assumption that if it has existed before, the guest could not have
modified the pte without the shadow lock. However the assumption is incorrect,
as the racing vcpu could have modified the pte, then instantiated the shadow
page, before our vcpu regains control:
    vcpu0                        vcpu1

    fault
    walk pte
                                 modify pte
                                 fault in same pagetable
                                 instantiate shadow page
    lookup shadow page
    conclude it is old
    instantiate spte based on stale guest pte
We could do something clever with generation counters, but a test run by
Marcelo suggests this is unnecessary and we can just do the revalidation
unconditionally. The pte will be in the processor cache and the check can
be quite fast.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Avi Kivity [Mon, 25 Feb 2008 08:28:31 +0000 (10:28 +0200)]
KVM: Route irq 0 to vcpu 0 exclusively
Some Linux versions allow the timer interrupt to be processed by more than
one cpu, leading to hangs due to tsc instability. Work around the issue
by only dispatching the interrupt to vcpu 0.
Problem analyzed (and patch tested) by Sheng Yang.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Avi Kivity [Sun, 24 Feb 2008 12:37:50 +0000 (14:37 +0200)]
KVM: Avoid infinite-frequency local apic timer
If the local apic initial count is zero, don't start an hrtimer with infinite
frequency, locking up the host.
Signed-off-by: Avi Kivity <avi@qumranet.com>
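The guard described above can be sketched roughly as follows. This is a simplified model, not KVM's lapic code: timer_period_ns(), timer_should_start(), and their parameters are hypothetical names. The point is only that a zero initial count gives a zero period, and a zero-period periodic timer would fire continuously.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: timer period in nanoseconds for a given initial
 * count, bus cycle length, and divide factor.
 */
uint64_t timer_period_ns(uint32_t initial_count, uint32_t bus_cycle_ns,
			 uint32_t divide)
{
	return (uint64_t)initial_count * bus_cycle_ns * divide;
}

/* Refuse to arm the timer when the computed period is zero. */
bool timer_should_start(uint32_t initial_count, uint32_t bus_cycle_ns,
			uint32_t divide)
{
	return timer_period_ns(initial_count, bus_cycle_ns, divide) != 0;
}
```

A caller would check timer_should_start() before arming any periodic timer and simply return when it is false.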
Marcelo Tosatti [Thu, 14 Feb 2008 23:25:39 +0000 (21:25 -0200)]
KVM: make MMU_DEBUG compile again
the cr3 variable is now inside the vcpu->arch structure.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Marcelo Tosatti [Thu, 14 Feb 2008 23:21:43 +0000 (21:21 -0200)]
KVM: move alloc_apic_access_page() outside of non-preemptable region
alloc_apic_access_page() can sleep, while vmx_vcpu_setup is called
inside a non preemptable region. Move it after put_cpu().
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Joerg Roedel [Wed, 13 Feb 2008 15:30:28 +0000 (16:30 +0100)]
KVM: SVM: fix Windows XP 64 bit installation crash
While installing, Windows XP 64 bit wants to access the DEBUGCTL and the last
branch record (LBR) MSRs. Not allowing this in KVM causes the installation to
crash. This patch allows access to these MSRs and fixes the issue.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Izik Eidus [Sun, 10 Feb 2008 16:04:15 +0000 (18:04 +0200)]
KVM: remove the usage of the mmap_sem for the protection of the memory slots.
This patch replaces the mmap_sem lock for the memory slots with a new
kvm private lock. It is needed because until now there were cases where
kvm accesses user memory while holding the mmap semaphore.
Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Jens Axboe [Tue, 4 Mar 2008 10:47:46 +0000 (11:47 +0100)]
block: fix blkdev_issue_flush() not detecting and passing EOPNOTSUPP back
This is important to e.g. dm, which tries to decide whether to stop using
barriers or not.
Tested as working by Anders Henke <anders.henke@1und1.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Harvey Harrison [Tue, 4 Mar 2008 10:31:22 +0000 (11:31 +0100)]
block: fix shadowed variable warning in blk-map.c
Introduced between 2.6.25-rc2 and -rc3
block/blk-map.c:154:14: warning: symbol 'bio' shadows an earlier one
block/blk-map.c:110:13: originally declared here
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Harvey Harrison [Tue, 4 Mar 2008 10:30:18 +0000 (11:30 +0100)]
block: remove extern on function definition
Introduced between 2.6.25-rc2 and -rc3
block/blk-settings.c:319:12: warning: function 'blk_queue_dma_drain' with external linkage has definition
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Mike Miller [Tue, 4 Mar 2008 10:25:15 +0000 (11:25 +0100)]
cciss: remove READ_AHEAD define and use block layer defaults
This patch removes the #define READ_AHEAD 1024 from the driver and uses the
block layer defaults, instead. We have found that under certain workloads
the setting can cause a disk connected to the e200 controller to go offline.
If the disk hiccups the link may try to downshift but the controller is
never notified that the link successfully completed the renegotiation.
We've also found that performance using the block layer default of 32 pages
was on par with the 1024 setting. We tried setting it to zero at one time
based on info from our firmware guys but that killed performance. Turns out
we were talking about 2 different read ahead settings.
Please consider this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Adrian Bunk [Tue, 4 Mar 2008 10:23:51 +0000 (11:23 +0100)]
make cdrom.c:check_for_audio_disc() static
This patch makes the needlessly global check_for_audio_disc() static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Adrian Bunk [Tue, 4 Mar 2008 10:23:50 +0000 (11:23 +0100)]
block/genhd.c: proper externs
This patch adds proper externs for two structs in include/linux/genhd.h
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Adrian Bunk [Tue, 4 Mar 2008 10:23:48 +0000 (11:23 +0100)]
unexport blk_rq_map_user_iov
This patch removes the unused export of blk_rq_map_user_iov.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Adrian Bunk [Tue, 4 Mar 2008 10:23:47 +0000 (11:23 +0100)]
unexport blk_{get,put}_queue
This patch removes the unused exports of blk_{get,put}_queue.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Adrian Bunk [Tue, 4 Mar 2008 10:23:46 +0000 (11:23 +0100)]
block/genhd.c: cleanups
This patch contains the following cleanups:
- make the needlessly global struct disk_type static
- #if 0 the unused genhd_media_change_notify()
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Adrian Bunk [Tue, 4 Mar 2008 10:23:45 +0000 (11:23 +0100)]
proper prototype for blk_dev_init()
This patch adds a proper prototype for blk_dev_init() in block/blk.h
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Adrian Bunk [Tue, 4 Mar 2008 10:23:44 +0000 (11:23 +0100)]
block/blk-tag.c should #include "blk.h"
Every file should include the headers containing the externs for its
global functions (in this case for __blk_queue_free_tags()).
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Yang Shi [Tue, 4 Mar 2008 10:20:51 +0000 (11:20 +0100)]
Fix DMA access of block device in 64-bit kernel on some non-x86 systems with 4GB or upper 4GB memory
For some non-x86 systems with 4GB or more of memory,
we need to increase the range of addresses that can be
used for direct DMA in a 64-bit kernel.
Signed-off-by: Yang Shi <yang.shi@windriver.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Tue, 4 Mar 2008 10:18:17 +0000 (11:18 +0100)]
block: separate out padding from alignment
Block layer alignment was used for two different purposes - memory
alignment and padding. This causes problems in lower layers because
drivers which only require memory alignment end up with adjusted
rq->data_len. Separate out padding such that padding occurs iff
driver explicitly requests it.
Tomo: restore the code to update bio in blk_rq_map_user
introduced by commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa
according to padding alignment.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
FUJITA Tomonori [Tue, 4 Mar 2008 10:17:11 +0000 (11:17 +0100)]
block: restore the meaning of rq->data_len to the true data length
The meaning of rq->data_len was changed to the length of an allocated
buffer from the true data length. It breaks SG_IO friends and
bsg. This patch restores the meaning of rq->data_len to the true data
length and adds rq->extra_len to store an extended length (due to
drain buffer and padding).
This patch also removes the code to update bio in blk_rq_map_user
introduced by commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa.
The commit adjusts bio according to memory alignment
(queue_dma_alignment). However, memory alignment is NOT padding
alignment. This adjustment also breaks SG_IO friends and bsg. Padding
alignment needs to be fixed in a proper way (by a separate patch).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
Mike Miller [Thu, 21 Feb 2008 07:54:03 +0000 (08:54 +0100)]
resubmit: cciss: procfs updates to display info about many volumes
This patch allows us to display information about all of the logical volumes
configured on a particular controller without stepping on memory even when
there are many volumes (128 or more) configured.
Please consider this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Wed, 20 Feb 2008 09:34:51 +0000 (10:34 +0100)]
splice: only return -EAGAIN if there's hope of more data
sys_tee() currently is a bit eager in returning -EAGAIN; it may do so
even if we don't have a chance of any more data becoming available. So
improve the logic and only return -EAGAIN if we have an attached writer
to the input pipe.
Reported by Johann Felix Soden <johfel@gmx.de> and
Patrick McManus <mcmanus@ducksong.com>.
Tested-by: Johann Felix Soden <johfel@users.sourceforge.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
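The decision described in the splice fix can be modelled in a few lines. This is a simplified sketch, not the kernel's fs/splice.c logic: struct pipe_model and tee_nonblock_result() are hypothetical, and only the "empty input pipe" case for a non-blocking call is covered.

```c
#include <errno.h>

/* Simplified, hypothetical model of the input pipe's state. */
struct pipe_model {
	unsigned int nrbufs;   /* buffers currently holding data */
	unsigned int writers;  /* processes with the write end open */
};

/*
 * Result of a non-blocking tee on the input pipe in this model:
 *   > 0      data is available (here: the number of buffers)
 *   0        empty and no writer attached, so no more data can arrive
 *   -EAGAIN  empty, but an attached writer could still produce data
 */
int tee_nonblock_result(const struct pipe_model *in)
{
	if (in->nrbufs)
		return (int)in->nrbufs;
	return in->writers ? -EAGAIN : 0;
}
```

Returning 0 rather than -EAGAIN when no writer is attached lets a caller treat the condition as end of input instead of retrying forever.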
Randy Dunlap [Wed, 20 Feb 2008 08:01:22 +0000 (09:01 +0100)]
block: fix kernel-docbook parameters and files
kernel-doc for block/:
- add missing parameters
- fix one function's parameter list (remove blank line)
- add 2 source files to docbook for non-exported kernel-doc functions
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Steve French [Tue, 4 Mar 2008 04:22:00 +0000 (04:22 +0000)]
Merge branch 'master' of /linux/kernel/git/torvalds/linux-2.6
Linus Torvalds [Mon, 3 Mar 2008 23:00:09 +0000 (15:00 -0800)]
Merge branch 'slab-linus' of git://git./linux/kernel/git/christoph/vm
* 'slab-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/christoph/vm:
slub: fix possible NULL pointer dereference
slub: Add kmalloc_large_node() to support kmalloc_node fallback
slub: look up object from the freelist once
slub: Fix up comments
slub: Rearrange #ifdef CONFIG_SLUB_DEBUG in calculate_sizes()
slub: Remove BUG_ON() from ksize and omit checks for !SLUB_DEBUG
slub: Use the objsize from the kmem_cache_cpu structure
slub: Remove useless checks in alloc_debug_processing
slub: Remove objsize check in kmem_cache_flags()
slub: rename slab_objects to show_slab_objects
Revert "unique end pointer" patch
slab: avoid double initialization & do initialization in 1 place
Oleg Nesterov [Sun, 2 Mar 2008 18:44:44 +0000 (21:44 +0300)]
exit_notify: fix kill_orphaned_pgrp() usage with mt exit
1. exit_notify() always calls kill_orphaned_pgrp(). This is wrong, we
should do this only when the whole process exits.
2. exit_notify() uses "current" as "ignored_task", obviously wrong.
Use ->group_leader instead.
Test case:
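#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

void hup(int sig)
{
	printf("HUP received\n");
}

void *tfunc(void *arg)
{
	sleep(2);
	printf("sub-thread exited\n");
	return NULL;
}

int main(int argc, char *argv[])
{
	/* Child: stop itself so the process group contains a stopped job. */
	if (!fork()) {
		signal(SIGHUP, hup);
		kill(getpid(), SIGSTOP);
		exit(0);
	}

	pthread_t thr;
	pthread_create(&thr, NULL, tfunc, NULL);

	sleep(1);
	printf("main thread exited\n");
	/* Exit only the main thread; the sub-thread keeps the process alive. */
	syscall(__NR_exit, 0);

	return 0;
}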
output:
main thread exited
HUP received
Hangup
With this patch the output is:
main thread exited
sub-thread exited
HUP received
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Sun, 2 Mar 2008 18:44:42 +0000 (21:44 +0300)]
will_become_orphaned_pgrp: partially fix insufficient ->exit_state check
p->exit_state != 0 doesn't mean this process is dead, it may have
sub-threads. Change the code to use "p->exit_state && thread_group_empty(p)"
instead.
Without this patch, ^Z doesn't deliver SIGTSTP to the foreground process
if the main thread has exited.
However, the new check is not perfect either. There is a window after
exit_notify() drops tasklist and before release_task(). Suppose that
the last (non-leader) thread exits. This means that entire group exits,
but thread_group_empty() is not true yet.
As Eric pointed out, is_global_init() is wrong as well, but I did not
dare to do other changes.
Just for the record, has_stopped_jobs() is absolutely wrong too. But we
can't fix it now, we should first fix SIGNAL_STOP_STOPPED issues.
Even with this patch ^Z doesn't play well with the dead main thread.
The task is stopped correctly but do_wait(WSTOPPED) won't see it. This
is another unrelated issue, will be (hopefully) fixed separately.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
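The difference between the old and new checks can be shown with a toy model. struct task_model and group_empty() are hypothetical stand-ins for the kernel's task_struct and thread_group_empty(); this sketch only mirrors the boolean logic quoted in the commit message.

```c
#include <stdbool.h>

/* Hypothetical, simplified task model; not the kernel's task_struct. */
struct task_model {
	int exit_state;              /* non-zero once this thread has exited */
	unsigned int other_threads;  /* live threads besides this one */
};

/* Stand-in for thread_group_empty(): true when no other threads remain. */
static bool group_empty(const struct task_model *p)
{
	return p->other_threads == 0;
}

/* Old check: treats the process as dead as soon as the leader exits. */
bool process_dead_old(const struct task_model *p)
{
	return p->exit_state != 0;
}

/* New check: the leader has exited and no sub-threads are left. */
bool process_dead_new(const struct task_model *p)
{
	return p->exit_state && group_empty(p);
}
```

For a leader that has exited while one sub-thread is still running, the old check reports the process as dead and the new one does not.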
Oleg Nesterov [Sun, 2 Mar 2008 18:44:40 +0000 (21:44 +0300)]
introduce kill_orphaned_pgrp() helper
Factor out the common code in reparent_thread() and exit_notify().
No functional changes.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Samuel Thibault [Mon, 3 Mar 2008 01:23:49 +0000 (01:23 +0000)]
Fix default compose table initialization
Oddly enough, unsigned int c = '\300'; puts a "negative" value in c, not
0300... This fixes the default unicode compose table by using integers
instead of character constants.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
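The surprise described above comes from the type of a character constant: in C, '\300' is an int whose value is that of a char holding 0300, so where plain char is signed it is -64 before the conversion to unsigned int. A minimal sketch follows; the function names are illustrative only, and the result depends on the target's char signedness.

```c
#include <limits.h>

/*
 * Where plain char is signed, '\300' is the int -64, which converts to a
 * very large unsigned value rather than 0300 (192).
 */
unsigned int from_char_constant(void)
{
	unsigned int c = '\300';
	return c;
}

/* An integer constant has no such dependence: this is always 192. */
unsigned int from_integer(void)
{
	unsigned int c = 0300;
	return c;
}

/* Non-zero when plain char is signed on this target. */
int char_is_signed(void)
{
	return CHAR_MIN < 0;
}
```

Using integer constants in the table, as the patch does, makes the stored values independent of whether char is signed.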
Cyrill Gorcunov [Sun, 2 Mar 2008 20:28:24 +0000 (23:28 +0300)]
slub: fix possible NULL pointer dereference
This patch fixes a possible NULL pointer dereference if kzalloc
fails. To be able to return a proper error code, the function
return type is changed to ssize_t (according to callees and
sysfs definitions).
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
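The pattern behind this fix, checking the allocation and reporting the failure through a signed return type, can be sketched in userspace C. This is an analogy under stated assumptions, not the slub code: show_objects_model(), alloc_fn, and failing_alloc() are hypothetical, and calloc stands in for kzalloc.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

/* Allocator signature, so that a failing allocator can be injected. */
typedef void *(*alloc_fn)(size_t nmemb, size_t size);

/* Helper that always fails, simulating memory exhaustion. */
void *failing_alloc(size_t nmemb, size_t size)
{
	(void)nmemb;
	(void)size;
	return NULL;
}

/*
 * Hypothetical show-style callback: copies per-node counts into a scratch
 * array and sums them. Returns the total, or -ENOMEM if allocation fails.
 */
ssize_t show_objects_model(const unsigned long *per_node, size_t nodes,
			   alloc_fn alloc)
{
	unsigned long *scratch = alloc(nodes, sizeof(*scratch));
	ssize_t total = 0;
	size_t i;

	if (!scratch)
		return -ENOMEM;  /* bail out before touching scratch */

	for (i = 0; i < nodes; i++) {
		scratch[i] = per_node[i];
		total += (ssize_t)scratch[i];
	}
	free(scratch);
	return total;
}
```

Passing calloc exercises the normal path, while passing failing_alloc shows the error return; an unsigned return type could not carry the negative error code.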
Christoph Lameter [Sat, 1 Mar 2008 21:56:40 +0000 (13:56 -0800)]
slub: Add kmalloc_large_node() to support kmalloc_node fallback
Slub is missing some NUMA support for large kmallocs. Provide that.
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Pekka J Enberg [Sat, 1 Mar 2008 21:43:54 +0000 (13:43 -0800)]
slub: look up object from the freelist once
We only need to look up object from c->page->freelist once in
__slab_alloc().
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Christoph Lameter [Sat, 16 Feb 2008 07:45:26 +0000 (23:45 -0800)]
slub: Fix up comments
Provide comments and fix up various spelling / style issues.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Christoph Lameter [Sat, 16 Feb 2008 07:45:25 +0000 (23:45 -0800)]
slub: Rearrange #ifdef CONFIG_SLUB_DEBUG in calculate_sizes()
Group SLUB_DEBUG code together to reduce the number of #ifdefs. Move some
debug checks under the #ifdef.
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Christoph Lameter [Sat, 16 Feb 2008 07:45:25 +0000 (23:45 -0800)]
slub: Remove BUG_ON() from ksize and omit checks for !SLUB_DEBUG
The BUG_ONs are useless since the pointer derefs will lead to
NULL deref errors anyways. Some of the checks are not necessary
if no debugging is possible.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Christoph Lameter [Sat, 16 Feb 2008 07:45:25 +0000 (23:45 -0800)]
slub: Use the objsize from the kmem_cache_cpu structure
No need to access the kmem_cache structure. We have the same value
in kmem_cache_cpu.
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Christoph Lameter [Sat, 16 Feb 2008 07:45:24 +0000 (23:45 -0800)]
slub: Remove useless checks in alloc_debug_processing
Alloc debug processing is never called with a NULL object pointer.
No reason to check for NULL.
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Christoph Lameter [Sat, 16 Feb 2008 07:45:24 +0000 (23:45 -0800)]
slub: Remove objsize check in kmem_cache_flags()
There is no page->offset anymore and also no associated limit on the number
of objects. The page->offset field was removed for 2.6.24. So the check
in kmem_cache_flags() is now also obsolete (should have been dropped
earlier, somehow a hunk vanished).
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Christoph Lameter [Fri, 15 Feb 2008 23:22:21 +0000 (15:22 -0800)]
slub: rename slab_objects to show_slab_objects
The sysfs callback is better named show_slab_objects since it is always
called from the xxx_show callbacks. We need the name for other purposes
later.
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Christoph Lameter [Sat, 1 Mar 2008 21:40:44 +0000 (13:40 -0800)]
Revert "unique end pointer" patch
This only made sense for the alternate fastpath which was reverted last week.
Mathieu is working on a new version that addresses the fastpath issues but that
new code first needs to go through mm and it is not clear if we need the
unique end pointers with his new scheme.
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Christoph Lameter [Mon, 3 Mar 2008 19:18:08 +0000 (11:18 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/torvalds/linux-2.6
Linus Torvalds [Mon, 3 Mar 2008 18:47:52 +0000 (10:47 -0800)]
Merge branch 'for-linus' of /home/rmk/linux-2.6-arm
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] Fix freeing of page tables for ARM in free_pgd_slow
Randy Dunlap [Sat, 1 Mar 2008 06:03:27 +0000 (22:03 -0800)]
docbook: fix fusion source files
Fix docbook problems in fusion source files.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Sat, 1 Mar 2008 06:03:15 +0000 (22:03 -0800)]
docbook: fix kernel-api source files
Fix docbook problems in kernel-api.tmpl.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Sat, 1 Mar 2008 06:03:07 +0000 (22:03 -0800)]
docbook: fix usb source files
Fix docbook problems in USB source files.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Sat, 1 Mar 2008 06:02:50 +0000 (22:02 -0800)]
docbook: fix scsi source file
Fix docbook problem in SCSI source files.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Sat, 1 Mar 2008 06:02:40 +0000 (22:02 -0800)]
docbook: fix rapidio source files
Fix docbook problems in rapidio source files.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Sat, 1 Mar 2008 06:02:31 +0000 (22:02 -0800)]
docbook: fix filesystems.tmpl source files
Fix docbook problems in filesystems.tmpl.
These cause the generated docbook to be incorrect.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 3 Mar 2008 18:36:50 +0000 (10:36 -0800)]
Merge git://git./linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
x86: revert "x86: fix pmd_bad and pud_bad to support huge pages"
x86: revert "x86: CPA: avoid split of alias mappings"
Linus Torvalds [Mon, 3 Mar 2008 18:35:38 +0000 (10:35 -0800)]
Merge branch 'merge' of git://git./linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (24 commits)
[POWERPC] Convert the cell IOMMU fixed mapping to 16M IOMMU pages
[POWERPC] Allow for different IOMMU page sizes in cell IOMMU code
[POWERPC] Cell IOMMU: n_pte_pages is in 4K page units, not IOMMU_PAGE_SIZE
[POWERPC] Split setup of IOMMU stab and ptab, allocate dynamic/fixed ptabs separately
[POWERPC] Move allocation of cell IOMMU pad page
[POWERPC] Remove unused pte_offset variable
[POWERPC] Use it_offset not pte_offset in cell IOMMU code
[POWERPC] Clearup cell IOMMU fixed mapping terminology
[POWERPC] enable hardware watchpoints on cell blades
[POWERPC] move celleb DABRX definitions
[POWERPC] OProfile: enable callgraph support for Cell
[POWERPC] spufs: fix use time accounting on SPE-overcommit
[POWERPC] spufs: serialize SLB invalidation against SLB loading
[POWERPC] spufs: invalidate SLB translation before adding a new entry
[POWERPC] spufs: synchronize IRQ when disabling
[POWERPC] spufs: fix order of sputrace thread IDs
[POWERPC] Xilinx: hwicap cleanup
[POWERPC] 4xx: Use correct board info structure in cuboot wrappers
[POWERPC] spufs: fix invalid scheduling of forgotten contexts
[POWERPC] 44x: add missing define TARGET_4xx and TARGET_440GX to cuboot-taishan
...
Linus Torvalds [Mon, 3 Mar 2008 18:12:14 +0000 (10:12 -0800)]
Allow ARG_MAX execve string space even with a small stack limit
The new code that removed the limitation on the execve string size
(which was historically 32 pages) replaced it with a much softer limit
based on RLIMIT_STACK which is usually much larger than the traditional
limit. See commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba ("mm:
variable length argument support") for details.
However, if you have a small stack limit (perhaps because you need lots
of stacks in a threaded environment), the new heuristic of allowing up
to 1/4th of RLIMIT_STACK to be used for argument and environment strings
could actually be smaller than the old limit.
So just say that it's ok to have up to ARG_MAX strings regardless of the
value of RLIMIT_STACK, and check the rlimit only when going over that
traditional limit.
(Of course, if you actually have a *really* small stack limit, the whole
stack itself will be limited before you hit ARG_MAX, but that has always
been true and is clearly the right behaviour anyway).
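The rule described above can be sketched as a small user-space model. This is only an illustration of the stated policy, not the code in fs/exec.c: the helper name `arg_size_allowed`, the constant `SKETCH_ARG_MAX`, and the 4 KiB page size are assumptions made for the example.

```c
#include <stdbool.h>

/* Assumption: the historical limit of 32 pages, with 4 KiB pages. */
#define SKETCH_ARG_MAX (32UL * 4096UL)

/*
 * Hypothetical helper: decide whether `size` bytes of argument and
 * environment strings are acceptable for a given stack rlimit.
 */
bool arg_size_allowed(unsigned long size, unsigned long stack_rlimit)
{
    /* Up to the traditional limit is always fine, whatever the rlimit. */
    if (size <= SKETCH_ARG_MAX)
        return true;

    /* Only beyond that does the 1/4-of-RLIMIT_STACK heuristic apply. */
    return size <= stack_rlimit / 4;
}
```

With a 64 KiB stack limit, a quarter of the stack is only 16 KiB, yet strings up to the traditional limit are still accepted; larger requests are checked against the rlimit as before.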
Acked-by: Carlos O'Donell <carlos@codesourcery.com>
Cc: Michael Kerrisk <michael.kerrisk@googlemail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ollie Wild <aaw@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 3 Mar 2008 18:02:44 +0000 (10:02 -0800)]
Revert "x86: fix pmd_bad and pud_bad to support huge pages"
This reverts commit cded932b75ab0a5f9181ee3da34a0a488d1a14fd.
Arjan bisected down a boot-time hang to this, saying:
".. it prevents the kernel to finish booting on my (Penryn based)
laptop. The boot stops right after freeing the init memory."
and while it's not clear exactly what triggers it, at this stage we're
better off just reverting it while Ingo tries to figure out what went
wrong.
Requested-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Nish Aravamudan <nish.aravamudan@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ingo Molnar [Mon, 3 Mar 2008 12:53:58 +0000 (13:53 +0100)]
x86: revert "x86: fix pmd_bad and pud_bad to support huge pages"
Revert commit cded932b75ab0a5f9181ee3da34a0a488d1a14fd,
"x86: fix pmd_bad and pud_bad to support huge pages", it causes
a bootup hang, as reported and bisected by Arjan van de Ven.
Bisected-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rafael J. Wysocki [Mon, 3 Mar 2008 00:17:37 +0000 (01:17 +0100)]
x86: revert "x86: CPA: avoid split of alias mappings"
Revert:
commit 8be8f54bae3453588011cad06363813a5293af53
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Sat Feb 23 20:43:21 2008 +0100
x86: CPA: avoid split of alias mappings
because it clearly mishandles the case when __change_page_attr(), called
from __change_page_attr_set_clr(), changes cpa->processed to 1 and
cpa_process_alias(cpa) is executed right after that.
This crashes my x86-64 test box early in the boot process
(ref. http://bugzilla.kernel.org/show_bug.cgi?id=10140#c4).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul Mackerras [Mon, 3 Mar 2008 10:31:09 +0000 (21:31 +1100)]
Merge branch 'for-2.6.25' of /linux/kernel/git/arnd/cell-2.6 into merge
Joerg Roedel [Mon, 11 Feb 2008 19:28:27 +0000 (20:28 +0100)]
KVM: emulate access to MSR_IA32_MCG_CTL
Injecting a GP when this MSR is accessed causes Windows to crash when running
some stress test tools in KVM. So this patch emulates access to this MSR.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Avi Kivity [Mon, 11 Feb 2008 16:37:23 +0000 (18:37 +0200)]
KVM: Make the supported cpuid list a host property rather than a vm property
One of the use cases for the supported cpuid list is to create a "greatest
common denominator" of cpu capabilities in a server farm. As such, it is
useful to be able to get the list without creating a virtual machine first.
Since the code does not depend on the vm in any way, all that is needed is
to move it to the device ioctl handler. The capability identifier is also
changed so that binaries made against -rc1 will fail gracefully.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Paul Knowles [Wed, 6 Feb 2008 11:02:35 +0000 (11:02 +0000)]
KVM: Fix kvm_arch_vcpu_ioctl_set_sregs so that set_cr0 works properly
Whilst working on getting a VM to initialize into IA32e mode I found
this issue. set_cr0 relies on comparing the old cr0 to the new one to
work correctly. Move the assignment below so the compare can work.
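The ordering problem can be illustrated with a small stand-alone sketch. It is not the KVM code: `struct sketch_vcpu`, `sketch_set_cr0` and the two `set_sregs_*` helpers are hypothetical names, and only the paging bit (CR0.PG, bit 31) is modelled.

```c
#define CR0_PG (1UL << 31) /* x86 CR0 paging-enable bit */

/* Hypothetical, heavily simplified vcpu state. */
struct sketch_vcpu {
    unsigned long cr0;   /* cached guest CR0 */
    int mode_switches;   /* how often the "CR0 changed" path ran */
};

/* Stand-in for set_cr0: acts only if PG differs from the cached value. */
void sketch_set_cr0(struct sketch_vcpu *v, unsigned long cr0)
{
    if ((v->cr0 ^ cr0) & CR0_PG)
        v->mode_switches++;
    v->cr0 = cr0;
}

/* Buggy order: the cache is overwritten first, so the compare sees no change. */
void set_sregs_buggy(struct sketch_vcpu *v, unsigned long cr0)
{
    v->cr0 = cr0;
    sketch_set_cr0(v, cr0);
}

/* Fixed order: compare against the old value, then store the new one. */
void set_sregs_fixed(struct sketch_vcpu *v, unsigned long cr0)
{
    sketch_set_cr0(v, cr0);
    v->cr0 = cr0;
}
```

In the buggy variant the transition to paging is never noticed, which is the kind of failure one would expect when trying to enter IA32e mode.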
Signed-off-by: Paul Knowles <paul@transitive.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Joerg Roedel [Tue, 29 Jan 2008 12:01:27 +0000 (13:01 +0100)]
KVM: SVM: set NM intercept when enabling CR0.TS in the guest
Explicitly enable the NM intercept in svm_set_cr0 if we enable TS in the guest
copy of CR0 for lazy FPU switching. This fixes guest SMP with Linux under SVM.
Without that patch Linux deadlocks or panics right after trying to boot the
other CPUs.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Joerg Roedel [Mon, 21 Jan 2008 12:09:33 +0000 (13:09 +0100)]
KVM: SVM: Fix lazy FPU switching
If the guest writes to cr0 and leaves the TS flag at 0 while vcpu->fpu_active
is also 0, the TS flag in the guest's cr0 gets lost. This leads to corrupt FPU
state and causes Windows Vista 64bit to crash very soon after boot. This patch
fixes this bug.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Michael Ellerman [Fri, 29 Feb 2008 07:33:29 +0000 (18:33 +1100)]
[POWERPC] Convert the cell IOMMU fixed mapping to 16M IOMMU pages
The only tricky part is we need to adjust the PTE insertion loop to
cater for holes in the page table. The PTEs for each segment start on
a 4K boundary, so with 16M pages we have 16 PTEs per segment and then
a gap to the next 4K page boundary.
It might be possible to allocate the PTEs for each segment separately,
saving the memory currently filling the gaps. However we'd need to
check that's OK with the hardware, and that it actually saves memory.
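The holes in the page table can be made concrete with a short index calculation. This is a sketch under stated assumptions, not the cell IOMMU code: the helper `pte_slot_16m` is hypothetical, PTEs are assumed to be 8 bytes, and the 256 MiB segment size is inferred from "16 PTEs per segment" with 16M pages.

```c
#include <stdint.h>

#define SEGMENT_SHIFT  28  /* 256 MiB segments: 16 PTEs x 16 MiB each */
#define PAGE_SHIFT_16M 24  /* 16 MiB IOMMU pages */
/* Assumption: 8-byte PTEs, so one 4K page of page table holds 512 slots. */
#define PTES_PER_4K    (4096 / sizeof(uint64_t))

/*
 * Hypothetical helper: slot in a flat PTE array for a DMA offset when each
 * segment's PTEs must start on a 4K boundary.
 */
unsigned long pte_slot_16m(uint64_t offset)
{
    uint64_t segment = offset >> SEGMENT_SHIFT;
    uint64_t page_in_segment = (offset >> PAGE_SHIFT_16M) & 15; /* 0..15 */

    /* Only 16 of every 512 slots are used; the rest form the gap. */
    return (unsigned long)(segment * PTES_PER_4K + page_in_segment);
}
```

Under these assumptions the last page of segment 0 lands in slot 15 and the first page of segment 1 in slot 512, so an insertion loop has to skip slots 16 to 511.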
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Michael Ellerman [Fri, 29 Feb 2008 07:33:27 +0000 (18:33 +1100)]
[POWERPC] Allow for different IOMMU page sizes in cell IOMMU code
Make some preliminary changes to cell_iommu_alloc_ptab() to allow it to
take the page size as a parameter rather than assuming IOMMU_PAGE_SIZE.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Michael Ellerman [Fri, 29 Feb 2008 07:33:26 +0000 (18:33 +1100)]
[POWERPC] Cell IOMMU: n_pte_pages is in 4K page units, not IOMMU_PAGE_SIZE
We use n_pte_pages to calculate the stride through the page tables, but
we also use it to set the NPPT value in the segment table entry. That is
defined as the number of 4K pages per segment, so we should calculate
it as such regardless of the IOMMU page size.
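As a worked example of counting in 4K units, the following sketch computes the number of 4K pages of page table needed per segment for a given IOMMU page size. The function name `n_pte_pages_4k`, the 256 MiB segment size, the 8-byte PTE size, and the rounding up to a whole page are all assumptions made for illustration.

```c
#define SKETCH_SEGMENT_SHIFT 28 /* assumption: 256 MiB IOMMU segments */
#define SKETCH_PTE_BYTES     8  /* assumption: 8-byte PTEs */

/*
 * Hypothetical helper: 4K pages of page table per segment, independent of
 * the IOMMU page size given by page_shift (which must not exceed 28).
 */
unsigned long n_pte_pages_4k(unsigned int page_shift)
{
    unsigned long ptes_per_segment = 1UL << (SKETCH_SEGMENT_SHIFT - page_shift);
    unsigned long bytes = ptes_per_segment * SKETCH_PTE_BYTES;

    /* Count in 4K units, rounding up so a partial page still counts. */
    return (bytes + 4095) >> 12;
}
```

Under these assumptions 4K IOMMU pages need 128 page-table pages per segment, 64K pages need 8, and 16M pages need 1.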
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Michael Ellerman [Fri, 29 Feb 2008 07:33:25 +0000 (18:33 +1100)]
[POWERPC] Split setup of IOMMU stab and ptab, allocate dynamic/fixed ptabs separately
Currently the cell IOMMU code allocates the entire IOMMU page table in a
contiguous chunk. This is nice and tidy, but for machines with larger
amounts of RAM the page table allocation can fail due to it simply being
too large.
So split the segment table and page table setup routine, and arrange to
have the dynamic and fixed page tables allocated separately.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Michael Ellerman [Fri, 29 Feb 2008 07:33:24 +0000 (18:33 +1100)]
[POWERPC] Move allocation of cell IOMMU pad page
There's no need to allocate the pad page unless we're going to actually
use it - so move the allocation to where we know we're going to use it.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Michael Ellerman [Fri, 29 Feb 2008 07:33:23 +0000 (18:33 +1100)]
[POWERPC] Remove unused pte_offset variable
The cell IOMMU code no longer needs to save the pte_offset variable
separately; it is incorporated into tbl->it_offset.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Michael Ellerman [Fri, 29 Feb 2008 07:33:23 +0000 (18:33 +1100)]
[POWERPC] Use it_offset not pte_offset in cell IOMMU code
The cell IOMMU tce build and free routines use pte_offset to convert
the index passed from the generic IOMMU code into a page table offset.
This takes into account the SPIDER_DMA_OFFSET which sets the top bit
of every DMA address.
However it doesn't cater for the IOMMU window starting at a non-zero
address, as the base of the window is not incorporated into pte_offset
at all.
As it turns out, tbl->it_offset already contains the value we need: it
takes into account the base of the window and also pte_offset. So use
it instead!
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Michael Ellerman [Fri, 29 Feb 2008 07:33:22 +0000 (18:33 +1100)]
[POWERPC] Clearup cell IOMMU fixed mapping terminology
It's called the fixed mapping, not the static mapping.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Jens Osterkamp [Thu, 28 Feb 2008 10:27:31 +0000 (11:27 +0100)]
[POWERPC] enable hardware watchpoints on cell blades
Ulrich Weigand found back in November that the hardware watchpoints on
cell were not working:
http://ozlabs.org/pipermail/linuxppc-dev/2007-November/046135.html
This patch sets them during initialization.
Signed-off-by: Jens Osterkamp <jens@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Jens Osterkamp [Thu, 28 Feb 2008 10:26:21 +0000 (11:26 +0100)]
[POWERPC] move celleb DABRX definitions
This moves the private DABRX definitions for celleb from beat.h to
reg.h to make them usable for all.
Signed-off-by: Jens Osterkamp <jens@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Bob Nelson [Wed, 20 Feb 2008 04:00:56 +0000 (05:00 +0100)]
[POWERPC] OProfile: enable callgraph support for Cell
This patch enables OProfile callgraph support for the Cell processor. The
original code just called a function to add the PC value; now it calls a
function that first checks the callgraph depth. Callgraph is already
enabled on the other Power platforms.
Signed-off-by: Bob Nelson <rrnelson@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Paul Mackerras [Mon, 3 Mar 2008 06:44:06 +0000 (17:44 +1100)]
Merge branch 'master' of git://git./linux/kernel/git/jk/spufs into merge
Paul Mackerras [Mon, 3 Mar 2008 06:38:23 +0000 (17:38 +1100)]
Merge branch 'for-2.6.25' of /linux/kernel/git/jwboyer/powerpc-4xx into merge
Steve French [Mon, 3 Mar 2008 01:53:49 +0000 (01:53 +0000)]
Merge branch 'master' of /linux/kernel/git/torvalds/linux-2.6
Linus Torvalds [Sun, 2 Mar 2008 20:38:17 +0000 (12:38 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ieee1394/linux1394-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: fix crash in automatic module unloading
firewire: potentially invalid pointers used in fw_card_bm_work
firewire: fw-sbp2: better fix for NULL pointer dereference in scsi_remove_device
Stefan Richter [Wed, 27 Feb 2008 21:14:27 +0000 (22:14 +0100)]
firewire: fix crash in automatic module unloading
"modprobe firewire-ohci; sleep .1; modprobe -r firewire-ohci" used to
result in crashes like this:
BUG: unable to handle kernel paging request at ffffffff8807b455
IP: [<ffffffff8807b455>]
PGD 203067 PUD 207063 PMD 7c170067 PTE 0
Oops: 0010 [1] PREEMPT SMP
CPU 0
Modules linked in: i915 drm cpufreq_ondemand acpi_cpufreq freq_table applesmc input_polldev led_class coretemp hwmon eeprom snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss button thermal processor sg snd_hda_intel snd_pcm snd_timer snd snd_page_alloc sky2 i2c_i801 rtc [last unloaded: crc_itu_t]
Pid: 9, comm: events/0 Not tainted 2.6.25-rc2 #3
RIP: 0010:[<ffffffff8807b455>] [<ffffffff8807b455>]
RSP: 0018:ffff81007dcdde88 EFLAGS: 00010246
RAX: ffff81007dc95040 RBX: ffff81007dee5390 RCX: 0000000000005e13
RDX: 0000000000008c8b RSI: 0000000000000001 RDI: ffff81007dee5388
RBP: ffff81007dc5eb40 R08: 0000000000000002 R09: ffffffff8022d05c
R10: ffffffff8023b34c R11: ffffffff8041a353 R12: ffff81007dee5388
R13: ffffffff8807b455 R14: ffffffff80593bc0 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff8055a000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffffffff8807b455 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process events/0 (pid: 9, threadinfo ffff81007dcdc000, task ffff81007dc95040)
Stack: ffffffff8023b396 ffffffff88082524 0000000000000000 ffffffff8807d9ae
ffff81007dc5eb40 ffff81007dc9dce0 ffff81007dc5eb40 ffff81007dc5eb80
ffff81007dc9dce0 ffffffffffffffff ffffffff8023be87 0000000000000000
Call Trace:
[<ffffffff8023b396>] ? run_workqueue+0xdf/0x1df
[<ffffffff8023be87>] ? worker_thread+0xd8/0xe3
[<ffffffff8023e917>] ? autoremove_wake_function+0x0/0x2e
[<ffffffff8023bdaf>] ? worker_thread+0x0/0xe3
[<ffffffff8023e813>] ? kthread+0x47/0x74
[<ffffffff804198e0>] ? trace_hardirqs_on_thunk+0x35/0x3a
[<ffffffff8020c008>] ? child_rip+0xa/0x12
[<ffffffff8020b6e3>] ? restore_args+0x0/0x3d
[<ffffffff8023e68a>] ? kthreadd+0x14c/0x171
[<ffffffff8023e68a>] ? kthreadd+0x14c/0x171
[<ffffffff8023e7cc>] ? kthread+0x0/0x74
[<ffffffff8020bffe>] ? child_rip+0x0/0x12
Code: Bad RIP value.
RIP [<ffffffff8807b455>]
RSP <ffff81007dcdde88>
CR2: ffffffff8807b455
---[ end trace c7366c6657fe5bed ]---
Note that this crash happened _after_ firewire-core was unloaded. The
shared workqueue tried to run firewire-core's device initialization jobs
or similar jobs.
The fix makes sure that firewire-ohci and hence firewire-core is not
unloaded before all device shutdown jobs have been completed. This is
determined by the count of device initializations minus device releases.
Also skip useless retries in the node initialization job if the node is
to be shut down.
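The "initializations minus releases" bookkeeping can be sketched as a simple counter. This is a minimal user-space model, not the firewire-core implementation: `struct sketch_card` and all function names are hypothetical, and a real driver would additionally have to wait (sleep or block) until the count reaches zero rather than just test it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bookkeeping; not the firewire-core structures. */
struct sketch_card {
    atomic_int live_devices; /* device initializations minus releases */
};

void sketch_card_setup(struct sketch_card *c)
{
    atomic_init(&c->live_devices, 0);
}

/* Called when a device initialization job is queued. */
void sketch_device_init(struct sketch_card *c)
{
    atomic_fetch_add(&c->live_devices, 1);
}

/* Called when the device's shutdown job has completed. */
void sketch_device_release(struct sketch_card *c)
{
    atomic_fetch_sub(&c->live_devices, 1);
}

/* Unloading is safe only once every initialized device was released. */
bool sketch_may_unload(struct sketch_card *c)
{
    return atomic_load(&c->live_devices) == 0;
}
```

As long as one shutdown job is still pending, `sketch_may_unload()` stays false, so no queued job can run after the code that implements it has gone away.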
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Stefan Richter [Sun, 24 Feb 2008 17:57:23 +0000 (18:57 +0100)]
firewire: potentially invalid pointers used in fw_card_bm_work
The bus management workqueue job was in danger of dereferencing NULL
pointers. Also, after having temporarily lifted card->lock, a few node
pointers and a device pointer may have become invalid.
Add NULL pointer checks and get the necessary references. Also, move
card->local_node out of fw_card_bm_work's sight during shutdown of the
card.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Stefan Richter [Tue, 26 Feb 2008 22:30:02 +0000 (23:30 +0100)]
firewire: fw-sbp2: better fix for NULL pointer dereference in scsi_remove_device
Patch "firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device"
had the unintended effect that firewire-sbp2 could not be unloaded
anymore until all SBP-2 devices were unplugged.
We now fix the NULL pointer bug by reacquiring a reference to the sdev
instead of holding a reference to the sdev (and to the module) all the
time.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: Jarod Wilson <jwilson@redhat.com>
Uwe Kleine-König [Wed, 27 Feb 2008 12:44:59 +0000 (13:44 +0100)]
[ARM] Fix freeing of page tables for ARM in free_pgd_slow
Since 2f569af (CONFIG_HIGHPTE vs. sub-page page tables.) pte_free() calls
pte_lock_deinit() and dec_zone_page_state(). So free_pgd_slow must not call
the latter two when calling pte_free().
Signed-off-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Steve French [Sat, 1 Mar 2008 18:29:55 +0000 (18:29 +0000)]
Merge branch 'master' of /linux/kernel/git/torvalds/linux-2.6
Steve Grubb [Thu, 21 Feb 2008 21:59:22 +0000 (16:59 -0500)]
[PATCH] drop EOE records from printk
Hi,
While we are looking at the printk issue, I see that it's printk'ing the EOE
(end of event) records, which is really not something that we need in syslog.
It's really intended for the realtime audit event stream handled by the audit
daemon. So, let's avoid printk'ing that record type.
Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>