Eric Sandeen [Sat, 22 Dec 2007 22:03:26 +0000 (14:03 -0800)]
ecryptfs: redo dget,mntget on dentry_open failure
Thanks to Jeff Moyer for pointing this out.
If the RDWR dentry_open() in ecryptfs_init_persistent_file fails,
it will do a dput/mntput. Need to re-take references if we
retry as RDONLY.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: Mike Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Sandeen [Sat, 22 Dec 2007 22:03:26 +0000 (14:03 -0800)]
ecryptfs: fix unlocking in error paths
Thanks to Josef Bacik for finding these.
A couple of ecryptfs error paths don't properly unlock things they locked.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: Josef Bacik <jbacik@redhat.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Sat, 22 Dec 2007 22:03:25 +0000 (14:03 -0800)]
Don't send quota messages repeatedly when hardlimit reached
We should send quota message to netlink only once when hardlimit is
reached. Otherwise user could easily make the system busy by trying to
exceed the hardlimit (and also the messages could be anoying if you cannot
stop writing just now).
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Sat, 22 Dec 2007 22:03:25 +0000 (14:03 -0800)]
Fix computation of SKB size for quota messages
Fix computation of size of skb needed for quota message. We should use
netlink provided functions and not just an ad-hoc number. Also don't print
the return value from nla_put_foo() as it is always -1.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Sandeen [Sat, 22 Dec 2007 22:03:24 +0000 (14:03 -0800)]
ecryptfs: fix string overflow on long cipher names
Passing a cipher name > 32 chars on mount results in an overflow when the
cipher name is printed, because the last character in the struct
ecryptfs_key_tfm's cipher_name string was never zeroed.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Lameter [Sat, 22 Dec 2007 22:03:23 +0000 (14:03 -0800)]
quicklists: do not release off node pages early
quicklists must keep even off node pages on the quicklists until the TLB
flush has been completed.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 21 Dec 2007 23:52:24 +0000 (15:52 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (23 commits)
[IPV4]: OOPS with NETLINK_FIB_LOOKUP netlink socket
[NET]: Fix function put_cmsg() which may cause usr application memory overflow
[ATM]: Spelling fixes
[NETFILTER] ipv4: Spelling fixes
[NETFILTER]: Spelling fixes
[SCTP]: Spelling fixes
[NETLABEL]: Spelling fixes
[PKT_SCHED]: Spelling fixes
[NET] net/core/: Spelling fixes
[IPV6]: Spelling fixes
[IRDA]: Spelling fixes
[DCCP]: Spelling fixes
[NET] include/net/: Spelling fixes
[NET]: Correct two mistaken skb_reset_mac_header() conversions.
[IPV4] ip_gre: set mac_header correctly in receive path
[XFRM]: Audit function arguments misordered
[IPSEC]: Avoid undefined shift operation when testing algorithm ID
[IPV4] ARP: Remove not used code
[TG3]: Endianness bugfix.
[TG3]: Endianness annotations.
...
Linus Torvalds [Fri, 21 Dec 2007 23:52:01 +0000 (15:52 -0800)]
Merge git://git./linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC32]: Spelling fixes
[SPARC64]: Spelling fixes
[SPARC64]: Fix OOPS in dma_sync_*_for_device()
Christoph Lameter [Fri, 21 Dec 2007 22:37:37 +0000 (14:37 -0800)]
SLUB: Improve hackbench speed
Increase the mininum number of partial slabs to keep around and put
partial slabs to the end of the partial queue so that they can add
more objects.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Denis V. Lunev [Fri, 21 Dec 2007 10:01:53 +0000 (02:01 -0800)]
[IPV4]: OOPS with NETLINK_FIB_LOOKUP netlink socket
[ Regression added by changeset:
cd40b7d3983c708aabe3d3008ec64ffce56d33b0
[NET]: make netlink user -> kernel interface synchronious
-DaveM ]
nl_fib_input re-reuses incoming skb to send the reply. This means that this
packet will be freed twice, namely in:
- netlink_unicast_kernel
- on receive path
Use clone to send as a cure, the caller is responsible for kfree_skb on error.
Thanks to Alexey Dobryan, who originally found the problem.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 21 Dec 2007 01:25:48 +0000 (17:25 -0800)]
Linux 2.6.24-rc6
Linus Torvalds [Fri, 21 Dec 2007 01:02:37 +0000 (17:02 -0800)]
Merge git://git./linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
x86: intel_cacheinfo.c: cpu cache info entry for Intel Tolapai
x86: fix die() to not be preemptible
Linus Torvalds [Fri, 21 Dec 2007 01:02:22 +0000 (17:02 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
[XFS] Initialise current offset in xfs_file_readdir correctly
[XFS] Fix mknod regression
Lachlan McIlroy [Fri, 21 Dec 2007 00:00:23 +0000 (11:00 +1100)]
[XFS] Initialise current offset in xfs_file_readdir correctly
After reading the directory contents into the temporary buffer, we grab
each dirent and pass it to filldir witht eh current offset of the dirent.
The current offset was not being set for the first dirent in the temporary
buffer, which coul dresult in bad offsets being set in the f_pos field
result in looping and duplicate entries being returned from readdir.
SGI-PV: 974905
SGI-Modid: xfs-linux-melb:xfs-kern:30282a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Christoph Hellwig [Thu, 20 Dec 2007 23:58:56 +0000 (10:58 +1100)]
[XFS] Fix mknod regression
This was broken by my '[XFS] simplify xfs_create/mknod/symlink prototype',
which assigned the re-shuffled ondisk dev_t back to the rdev variable in
xfs_vn_mknod. Because of that i_rdev is set to the ondisk dev_t instead of
the linux dev_t later down the function.
Fortunately the fix for it is trivial: we can just remove the assignment
because xfs_revalidate_inode has done the proper job before unlocking the
inode.
SGI-PV: 974873
SGI-Modid: xfs-linux-melb:xfs-kern:30273a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Jason Gaston [Fri, 21 Dec 2007 00:27:19 +0000 (01:27 +0100)]
x86: intel_cacheinfo.c: cpu cache info entry for Intel Tolapai
This patch adds a cpu cache info entry for the Intel Tolapai cpu.
Signed-off-by: Jason Gaston <jason.d.gaston@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Fri, 21 Dec 2007 00:27:19 +0000 (01:27 +0100)]
x86: fix die() to not be preemptible
Andrew "Eagle Eye" Morton noticed that we use raw_local_save_flags()
instead of raw_local_irq_save(flags) in die(). This allows the
preemption of oopsing contexts - which is highly undesirable. It also
causes CONFIG_DEBUG_PREEMPT to complain, as reported by Miles Lane.
this bug was introduced via:
commit
39743c9ef717fd4f2b5583f010115c5f2482b8ae
Author: Andi Kleen <ak@suse.de>
Date: Fri Oct 19 20:35:03 2007 +0200
x86: use raw locks during oopses
- spin_lock_irqsave(&die.lock, flags);
+ __raw_spin_lock(&die.lock);
+ raw_local_save_flags(flags);
that is not a correct open-coding of spin_lock_irqsave(): both the
ordering is wrong (irqs should be disabled _first_), and the wrong
flags-saving API was used.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Wei Yongjun [Thu, 20 Dec 2007 22:36:44 +0000 (14:36 -0800)]
[NET]: Fix function put_cmsg() which may cause usr application memory overflow
When used function put_cmsg() to copy kernel information to user
application memory, if the memory length given by user application is
not enough, by the bad length calculate of msg.msg_controllen,
put_cmsg() function may cause the msg.msg_controllen to be a large
value, such as 0xFFFFFFF0, so the following put_cmsg() can also write
data to usr application memory even usr has no valid memory to store
this. This may cause usr application memory overflow.
int put_cmsg(struct msghdr * msg, int level, int type, int len, void *data)
{
struct cmsghdr __user *cm
= (__force struct cmsghdr __user *)msg->msg_control;
struct cmsghdr cmhdr;
int cmlen = CMSG_LEN(len);
~~~~~~~~~~~~~~~~~~~~~
int err;
if (MSG_CMSG_COMPAT & msg->msg_flags)
return put_cmsg_compat(msg, level, type, len, data);
if (cm==NULL || msg->msg_controllen < sizeof(*cm)) {
msg->msg_flags |= MSG_CTRUNC;
return 0; /* XXX: return error? check spec. */
}
if (msg->msg_controllen < cmlen) {
~~~~~~~~~~~~~~~~~~~~~~~~
msg->msg_flags |= MSG_CTRUNC;
cmlen = msg->msg_controllen;
}
cmhdr.cmsg_level = level;
cmhdr.cmsg_type = type;
cmhdr.cmsg_len = cmlen;
err = -EFAULT;
if (copy_to_user(cm, &cmhdr, sizeof cmhdr))
goto out;
if (copy_to_user(CMSG_DATA(cm), data, cmlen - sizeof(struct cmsghdr)))
goto out;
cmlen = CMSG_SPACE(len);
~~~~~~~~~~~~~~~~~~~~~~~~~~~
If MSG_CTRUNC flags is set, msg->msg_controllen is less than
CMSG_SPACE(len), "msg->msg_controllen -= cmlen" will cause unsinged int
type msg->msg_controllen to be a large value.
~~~~~~~~~~~~~~~~~~~~~~~~~~~
msg->msg_control += cmlen;
msg->msg_controllen -= cmlen;
~~~~~~~~~~~~~~~~~~~~~
err = 0;
out:
return err;
}
The same promble exists in put_cmsg_compat(). This patch can fix this
problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 22:05:37 +0000 (14:05 -0800)]
[ATM]: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 22:05:03 +0000 (14:05 -0800)]
[NETFILTER] ipv4: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 22:04:24 +0000 (14:04 -0800)]
[NETFILTER]: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 22:03:52 +0000 (14:03 -0800)]
[SCTP]: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 22:03:11 +0000 (14:03 -0800)]
[NETLABEL]: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 22:02:40 +0000 (14:02 -0800)]
[PKT_SCHED]: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 22:02:06 +0000 (14:02 -0800)]
[NET] net/core/: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 22:01:35 +0000 (14:01 -0800)]
[IPV6]: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 22:00:51 +0000 (14:00 -0800)]
[IRDA]: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 21:59:39 +0000 (13:59 -0800)]
[DCCP]: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 21:56:32 +0000 (13:56 -0800)]
[NET] include/net/: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 21:55:45 +0000 (13:55 -0800)]
[SPARC32]: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 20 Dec 2007 21:55:10 +0000 (13:55 -0800)]
[SPARC64]: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 20 Dec 2007 19:53:41 +0000 (11:53 -0800)]
Merge git://git./linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
debug: add end-of-oops marker
sched: rt: account the cpu time during the tick
Linus Torvalds [Thu, 20 Dec 2007 19:25:22 +0000 (11:25 -0800)]
Merge git://git./linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
dm crypt: use bio_add_page
dm: merge max_hw_sector
dm: trigger change uevent on rename
dm crypt: fix write endio
dm mpath: hp requires scsi
dm: table detect io beyond device
Milan Broz [Thu, 13 Dec 2007 14:16:10 +0000 (14:16 +0000)]
dm crypt: use bio_add_page
Fix possible max_phys_segments violation in cloned dm-crypt bio.
In write operation dm-crypt needs to allocate new bio request
and run crypto operation on this clone. Cloned request has always
the same size, but number of physical segments can be increased
and violate max_phys_segments restriction.
This can lead to data corruption and serious hardware malfunction.
This was observed when using XFS over dm-crypt and at least
two HBA controller drivers (arcmsr, cciss) recently.
Fix it by using bio_add_page() call (which tests for other
restrictions too) instead of constructing own biovec.
All versions of dm-crypt are affected by this bug.
Cc: stable@kernel.org
Cc: dm-crypt@saout.de
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Neil Brown [Thu, 13 Dec 2007 14:16:04 +0000 (14:16 +0000)]
dm: merge max_hw_sector
Make sure dm honours max_hw_sectors of underlying devices
We still have no firm testing evidence in support of this patch but
believe it may help to resolve some bug reports. - agk
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Alasdair G Kergon [Thu, 13 Dec 2007 14:15:57 +0000 (14:15 +0000)]
dm: trigger change uevent on rename
Insert a missing KOBJ_CHANGE notification when a device is renamed.
Cc: Scott James Remnant <scott@ubuntu.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Milan Broz [Thu, 13 Dec 2007 14:15:51 +0000 (14:15 +0000)]
dm crypt: fix write endio
Fix BIO_UPTODATE test for write io.
Cc: stable@kernel.org
Cc: dm-crypt@saout.de
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Paul Mundt [Thu, 13 Dec 2007 14:15:43 +0000 (14:15 +0000)]
dm mpath: hp requires scsi
With CONFIG_SCSI=n __scsi_print_sense() is never linked in.
drivers/built-in.o: In function `hp_sw_end_io':
dm-mpath-hp-sw.c:(.text+0x914f8): undefined reference to `__scsi_print_sense'
Caught with a randconfig on current git.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Jun'ichi Nomura [Thu, 13 Dec 2007 14:15:25 +0000 (14:15 +0000)]
dm: table detect io beyond device
This patch fixes a panic on shrinking a DM device if there is
outstanding I/O to the part of the device that is being removed.
(Normally this doesn't happen - a filesystem would be resized first,
for example.)
The bug is that __clone_and_map() assumes dm_table_find_target()
always returns a valid pointer. It may fail if a bio arrives from the
block layer but its target sector is no longer included in the DM
btree.
This patch appends an empty entry to table->targets[] which will
be returned by a lookup beyond the end of the device.
After calling dm_table_find_target(), __clone_and_map() and target_message()
check for this condition using
dm_target_is_valid().
Sample test script to trigger oops:
Ivan Kokshaysky [Thu, 20 Dec 2007 08:47:07 +0000 (11:47 +0300)]
mm: fix exit_mmap BUG() on a.out binary exit
The problem was introduced by commit "mm: variable length argument
support" (
b6a2fea39318e43fee84fa7b0b90d68bed92d2ba)
as it didn't update fs/binfmt_aout.c like other binfmt's.
I noticed that on alpha when accidentally launched old OSF/1
Acrobat Reader binary. Obviously, other architectures are affected
as well.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Ollie Wild <aaw@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjan van de Ven [Thu, 20 Dec 2007 14:01:17 +0000 (15:01 +0100)]
debug: add end-of-oops marker
Right now it's nearly impossible for parsers that collect kernel crashes
from logs or emails (such as www.kerneloops.org) to detect the
end-of-oops condition. In addition, it's not currently possible to
detect whether or not 2 oopses that look alike are actually the same
oops reported twice, or are truly two unique oopses.
This patch adds an end-of-oops marker, and makes the end marker include
a very simple 64-bit random ID to be able to detect duplicate reports.
Normally, this ID is calculated as a late_initcall() (in the hope that
at that time there is enough entropy to get a unique enough ID); however
for early oopses the oops_exit() function needs to generate the ID on
the fly.
We do this all at the _end_ of an oops printout, so this does not impact
our ability to get the most important portions of a crash out to the
console first.
[ Sidenote: the already existing oopses-since-bootup counter we print
during crashes serves as the differentiator between multiple oopses
that trigger during the same bootup. ]
Tested on 32-bit and 64-bit x86. Artificially injected very early
crashes as well, as expected they result in this constant ID after
multiple bootups:
---[ end trace
ca143223eefdc828 ]---
---[ end trace
ca143223eefdc828 ]---
because the random pools are still all zero. But it all still works
fine and causes no additional problems (which is the main goal of
instrumentation code).
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Thu, 20 Dec 2007 14:01:17 +0000 (15:01 +0100)]
sched: rt: account the cpu time during the tick
Realtime tasks would not account their runtime during ticks. Which would lead
to:
struct sched_param param = { .sched_priority = 10 };
pthread_setschedparam(pthread_self(), SCHED_FIFO, ¶m);
while (1) ;
Not showing up in top.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
David S. Miller [Thu, 20 Dec 2007 09:29:45 +0000 (01:29 -0800)]
[SPARC64]: Fix OOPS in dma_sync_*_for_device()
I included these operations vector cases for situations
where we never need to do anything, the entries aren't
filled in by any implementation, so we OOPS trying to
invoke NULL pointer functions.
Really make them NOPs, to fix the bug.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 20 Dec 2007 08:25:54 +0000 (00:25 -0800)]
[NET]: Correct two mistaken skb_reset_mac_header() conversions.
This operation helper abstracts:
skb->mac_header = skb->data;
but it was done in two more places which were actually:
skb->mac_header = skb->network_header;
and those are corrected here.
Signed-off-by: David S. Miller <davem@davemloft.net>
Timo Teras [Thu, 20 Dec 2007 08:10:33 +0000 (00:10 -0800)]
[IPV4] ip_gre: set mac_header correctly in receive path
mac_header update in ipgre_recv() was incorrectly changed to
skb_reset_mac_header() when it was introduced.
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Moore [Thu, 20 Dec 2007 08:00:45 +0000 (00:00 -0800)]
[XFRM]: Audit function arguments misordered
In several places the arguments to the xfrm_audit_start() function are
in the wrong order resulting in incorrect user information being
reported. This patch corrects this by pacing the arguments in the
correct order.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 20 Dec 2007 07:44:29 +0000 (23:44 -0800)]
[IPSEC]: Avoid undefined shift operation when testing algorithm ID
The aalgos/ealgos fields are only 32 bits wide. However, af_key tries
to test them with the expression 1 << id where id can be as large as
253. This produces different behaviour on different architectures.
The following patch explicitly checks whether ID is greater than 31
and fails the check if that's the case.
We cannot easily extend the mask to be longer than 32 bits due to
exposure to user-space. Besides, this whole interface is obsolete
anyway in favour of the xfrm_user interface which doesn't use this
bit mask in templates (well not within the kernel anyway).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Ryden [Thu, 20 Dec 2007 07:38:11 +0000 (23:38 -0800)]
[IPV4] ARP: Remove not used code
In arp_process() (net/ipv4/arp.c), there is unused code: definition
and assignment of tha (target hw address ).
Signed-off-by: Mark Ryden <markryde@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Tue, 18 Dec 2007 07:00:31 +0000 (23:00 -0800)]
[TG3]: Endianness bugfix.
tg3_nvram_write_block_unbuffered() is reading data from nvram into
allocated buffer before overwriting a part of it with user-supplied
data. Then it feeds the entire page back to nvram. It should be
storing the words it had read as little-endian, not as host-endian.
Note that tg3_set_eeprom() does exactly that for padding the same
data to full words before it gets passed down to tg3_nvram_write_block()
and then to tg3_nvram_write_block_unbuffered().
Moreover, when we get to sending the entire thing back to nvram, we
go through it word-by-word, doing essentially
writel(swab32(le32_to_cpu(word)), ...)
so if we want them to reach the card in host-independent endianness,
we'd better really have all that buffer filled with fixed-endian.
For user-supplied part we obviously do have that (it's an array of
octets memcpy'd in), ditto for padding of user-supplied part to word
boundaries (taken care of in tg3_set_eeprom()). The rest of the
buffer gets filled by tg3_nvram_write_block_unbuffered() and it would
damn better be consistent with that (and with tg3_get_eeprom(), while
we are at it - there we also convert the words read from nvram to
little-endian before returning the buffer to user).
The bug should get triggered on big-endian boxen when set_eeprom is done
for less than entire page. Then the words that should've been unaffected
at all will actually get byteswapped in place in nvram.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Tue, 18 Dec 2007 06:59:57 +0000 (22:59 -0800)]
[TG3]: Endianness annotations.
Fixed misannotations, introduced a new helper - tg3_nvram_read_le().
It gets __le32 * instead of u32 * and puts there the value converted
to little-endian. A lot of callers of tg3_nvram_read() were doing
that; converted them to tg3_nvram_read_le().
At that point the driver is practically endian-clean; the only remaining
place is an actual bug, AFAICS; will be dealt with in the next patch.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cyrill Gorcunov [Fri, 14 Dec 2007 00:17:03 +0000 (16:17 -0800)]
NET: mac80211: fix inappropriate memory freeing
Fix inappropriate memory freeing in case of requested rate_control_ops was
not found. In this case the list head entity is going to be accidentally
wasted.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Wed, 12 Dec 2007 15:31:52 +0000 (16:31 +0100)]
mac80211: fix header ops
When using recvfrom() on a SOCK_DGRAM packet socket, I noticed that the MAC
address passed back for wireless frames was always completely wrong. The
reason for this is that the header parse function assigned to our virtual
interfaces is a function parsing an 802.11 rather than 802.3 header. This
patch fixes it by keeping the default ethernet header operations assigned.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michael Wu [Sat, 10 Nov 2007 05:15:25 +0000 (00:15 -0500)]
mac80211: Drop out of associated state if link is lost
There is no point in staying in IEEE80211_ASSOCIATED if there is no
sta_info entry to receive frames with.
Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Linus Torvalds [Wed, 19 Dec 2007 22:29:23 +0000 (14:29 -0800)]
Merge branch 'release' of git://git./linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Adjust CMCI mask on CPU hotplug
[IA64] make flush_tlb_kernel_range() an inline function
[IA64] Guard elfcorehdr_addr with #if CONFIG_PROC_FS
[IA64] Fix Altix BTE error return status
[IA64] Remove assembler warnings on head.S
[IA64] Remove compiler warinings about uninitialized variable in irq_ia64.c
[IA64] set_thread_area fails in IA32 chroot
[IA64] print kernel release in OOPS to make kerneloops.org happy
[IA64] Two trivial spelling fixes
[IA64] Avoid unnecessary TLB flushes when allocating memory
[IA64] ia32 nopage
[IA64] signal: remove redundant code in setup_sigcontext()
IA64: Slim down __clear_bit_unlock
Alan Cox [Wed, 19 Dec 2007 17:50:32 +0000 (17:50 +0000)]
pata_hpt37x: Fix HPT374 detection
Bug #9261
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Geoff Levand [Wed, 19 Dec 2007 10:17:31 +0000 (11:17 +0100)]
ps3fb: Fix ps3fb free_irq() dev_id
The dev_id arg passed to free_irq() must match that passed to
request_irq().
Fixes this PS3 error message:
Trying to free already-free IRQ 44
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Geert Uytterhoeven [Wed, 19 Dec 2007 10:16:41 +0000 (11:16 +0100)]
ps3fb: Update for firmware 2.10
ps3fb: Update for firmware 2.10
As of PS3 firmware version 2.10, the GPU command buffer size must be at least 2
MiB large. Since we use only a small part of the GPU command buffer and don't
want to waste precious XDR memory, move the GPU command buffer back to the
start of the XDR memory reserved for ps3fb and let the unused part overlap with
the actual frame buffer.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 19 Dec 2007 22:25:56 +0000 (14:25 -0800)]
Merge git://git./linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] initio: bugfix for accessors patch
[SCSI] st: fix kernel BUG at include/linux/scatterlist.h:59!
[SCSI] initio: fix conflict when loading driver
[SCSI] sym53c8xx: fix "irq X: nobody cared" regression
[SCSI] dpt_i2o: driver is only 32 bit so don't set 64 bit DMA mask
[SCSI] sym53c8xx: fix free_irq() regression
Mike Travis [Wed, 19 Dec 2007 22:20:19 +0000 (23:20 +0100)]
x86: fix show cpuinfo cpu number always zero
when called by setup_arch) after smp_store_cpu_info() had set it to the
correct value.
The error shows up in 'cat /proc/cpuinfo' will all cpus = 0.
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Suresh B Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Adrian Bunk [Wed, 19 Dec 2007 22:20:19 +0000 (23:20 +0100)]
x86_32: disable_pse must be __cpuinitdata
CONFIG_HOTPLUG_CPU=y:
WARNING: vmlinux.o(.text+0xfa52): Section mismatch: reference to .init.data:disable_pse (between 'identify_cpu' and 'identify_secondary_cpu')
[ akpm@linux-foundation.org: initializer fix. ]
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Adrian Bunk [Wed, 19 Dec 2007 22:20:18 +0000 (23:20 +0100)]
x86_32: select_idle_routine() must be __cpuinit
CONFIG_HOTPLUG_CPU=y:
WARNING: vmlinux.o(.text+0x1199a): Section mismatch: reference to .init.text.5:select_idle_routine (between 'init_intel' and 'init_nexgen')
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Adrian Bunk [Wed, 19 Dec 2007 22:20:18 +0000 (23:20 +0100)]
x86 smpboot_32.c section fixes
CONFIG_HOTPLUG_CPU=y:
WARNING: vmlinux.o(.text+0x22c60): Section mismatch: reference to .init.data:cpu_idle_tasks (between 'do_boot_cpu' and 'do_warm_boot_cpu')
WARNING: vmlinux.o(.text+0x22c99): Section mismatch: reference to .init.data:cpu_idle_tasks (between 'do_boot_cpu' and 'do_warm_boot_cpu')
WARNING: vmlinux.o(.text+0x2359b): Section mismatch: reference to .init.data:smp_b_stepping (between 'smp_store_cpu_info' and 'cpu_exit_clear')
WARNING: vmlinux.o(.text+0x235a0): Section mismatch: reference to .init.data:smp_b_stepping (between 'smp_store_cpu_info' and 'cpu_exit_clear')
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Adrian Bunk [Wed, 19 Dec 2007 22:20:18 +0000 (23:20 +0100)]
x86 apic_32.c section fix
CONFIG_HOTPLUG_CPU=y:
WARNING: vmlinux.o(.text+0x2390d): Section mismatch: reference to .init.text.5:setup_local_APIC (between 'start_secondary' and 'check_tsc_warp')
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Wed, 19 Dec 2007 22:05:13 +0000 (14:05 -0800)]
Do dirty page accounting when removing a page from the page cache
Krzysztof Oledzki noticed a dirty page accounting leak on some of his
machines, causing the machine to eventually lock up when the kernel
decided that there was too much dirty data, but nobody could actually
write anything out to fix it.
The culprit turns out to be filesystems (cough ext3 with data=journal
cough) that re-dirty the page when the "->invalidatepage()" callback is
called.
Fix it up by doing a final dirty page accounting check when we actually
remove the page from the page cache.
This fixes bugzilla entry 9182:
http://bugzilla.kernel.org/show_bug.cgi?id=9182
Tested-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Krzysztof Oledzki <olel@ans.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hidetoshi Seto [Wed, 19 Dec 2007 19:42:02 +0000 (11:42 -0800)]
[IA64] Adjust CMCI mask on CPU hotplug
Currently CMCI mask of hot-added CPU is always disabled after CPU hotplug.
We should adjust this mask depending on CMC polling state.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Jan Beulich [Wed, 19 Dec 2007 20:30:30 +0000 (12:30 -0800)]
[IA64] make flush_tlb_kernel_range() an inline function
This fixes an unused variable warning in mm/vmalloc.c.
Tony: also fix resulting fallout in uncached.c with a
typo in args to flush_tlb_kernel_range().
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Simon Horman [Mon, 12 Nov 2007 04:55:21 +0000 (13:55 +0900)]
[IA64] Guard elfcorehdr_addr with #if CONFIG_PROC_FS
Access to elfcorehdr_addr needs to be guarded by #if CONFIG_PROC_FS
as well as the existing #if guards.
Fixes the following build problem:
arch/ia64/hp/common/built-in.o: In function
`sba_init':arch/ia64/hp/common/sba_iommu.c:2043: undefined reference to `elfcorehdr_addr'
:arch/ia64/hp/common/sba_iommu.c:2043: undefined reference to `elfcorehdr_addr'
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Russ Anderson [Tue, 21 Aug 2007 21:45:12 +0000 (16:45 -0500)]
[IA64] Fix Altix BTE error return status
The Altix shub2 BTE error detail bits are in a different location
than on shub1. The current code does not take this into account
resulting in all shub2 BTE failures mapping to "unknown".
This patch reads the error detail bits from the proper location,
so the correct BTE failure reason is returned for both shub1
and shub2.
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Hidetoshi Seto [Wed, 12 Dec 2007 07:28:52 +0000 (16:28 +0900)]
[IA64] Remove assembler warnings on head.S
This patch removes the following assembler warning messages.
AS arch/ia64/kernel/head.o
arch/ia64/kernel/head.S: Assembler messages:
arch/ia64/kernel/head.S:1179: Warning: Use of 'ld8' violates RAW dependency 'CR[PTA]' (data)
arch/ia64/kernel/head.S:1179: Warning: Only the first path encountering the conflict is reported
arch/ia64/kernel/head.S:1178: Warning: This is the location of the conflicting usage
arch/ia64/kernel/head.S:1180: Warning: Use of 'ld8' violates RAW dependency 'CR[PTA]' (data)
arch/ia64/kernel/head.S:1180: Warning: Only the first path encountering the conflict is reported
arch/ia64/kernel/head.S:1178: Warning: This is the location of the conflicting usage
:
arch/ia64/kernel/head.S:1213: Warning: Use of 'ldf.fill.nta' violates RAW dependency 'CR[PTA]' (data)
arch/ia64/kernel/head.S:1213: Warning: Only the first path encountering the conflict is reported
arch/ia64/kernel/head.S:1178: Warning: This is the location of the conflicting usage
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Kenji Kaneshige [Wed, 22 Aug 2007 10:28:36 +0000 (19:28 +0900)]
[IA64] Remove compiler warinings about uninitialized variable in irq_ia64.c
This patch removes the following compiler warning messages.
CC arch/ia64/kernel/irq_ia64.o
arch/ia64/kernel/irq_ia64.c: In function 'create_irq':
arch/ia64/kernel/irq_ia64.c:343: warning: 'domain.bits[0u]' may be used uninitialized in this function
arch/ia64/kernel/irq_ia64.c: In function 'assign_irq_vector':
arch/ia64/kernel/irq_ia64.c:203: warning: 'domain.bits[0u]' may be used uninitialized in this function
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Ian Wienand [Tue, 20 Nov 2007 03:12:46 +0000 (14:12 +1100)]
[IA64] set_thread_area fails in IA32 chroot
I tried to upgrade an IA32 chroot on my IA64 to a new glibc with TLS.
It kept dying because set_thread_area was returning -ESRCH
(bugs.debian.org/451939).
I instrumented arch/ia64/ia32/sys_ia32.c:get_free_idx() and ended up
seeing output like
[pid] idx desc->a desc->b
-----------------------------
[2710] 0 ->
c6b0ffff 40dff31b
[2710] 1 -> 0 0
[2710] 2 -> 0 0
[2710] 0 ->
c6b0ffff 40dff31b
[2710] 1 ->
c6b0ffff 40dff31b
[2710] 2 -> 0 0
[2711] 0 ->
c6b0ffff 40dff31b
[2711] 1 ->
c6b0ffff 40dff31b
[2711] 2 ->
48c0ffff 40dff317
which suggested to me that TLS pointers were surviving exec() calls,
leading to GDT pointers filling up and the eventual failure of
get_free_idx().
I think the solution is flushing the tls array on exec.
Signed-Off-By: Ian Wienand <ianw@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Luck, Tony [Tue, 18 Dec 2007 19:46:38 +0000 (11:46 -0800)]
[IA64] print kernel release in OOPS to make kerneloops.org happy
The ia64 oops message doesn't include the kernel version, which
makes it hard to automatically categorize oops messages scraped
from mailing lists and bug databases.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Joe Perches [Wed, 19 Dec 2007 01:02:21 +0000 (17:02 -0800)]
[IA64] Two trivial spelling fixes
s/addres/address/
s/performanc/performance/
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
de Dinechin, Christophe (Integrity VM) [Thu, 13 Dec 2007 15:03:07 +0000 (15:03 +0000)]
[IA64] Avoid unnecessary TLB flushes when allocating memory
Improve performance of memory allocations on ia64 by avoiding a global TLB
purge to purge a single page from the file cache. This happens whenever we
evict a page from the buffer cache to make room for some other allocation.
Test case: Run 'find /usr -type f | xargs cat > /dev/null' in the
background to fill the buffer cache, then run something that uses memory,
e.g. 'gmake -j50 install'. Instrumentation showed that the number of
global TLB purges went from a few millions down to about 170 over a 12
hours run of the above.
The performance impact is particularly noticeable under virtualization,
because a virtual TLB is generally both larger and slower to purge than
a physical one.
Signed-off-by: Christophe de Dinechin <ddd@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Nick Piggin [Thu, 13 Dec 2007 23:58:27 +0000 (15:58 -0800)]
[IA64] ia32 nopage
Convert ia64's ia32 support from nopage to fault.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Shi Weihua [Thu, 13 Dec 2007 23:58:26 +0000 (15:58 -0800)]
[IA64] signal: remove redundant code in setup_sigcontext()
This patch removes some redundant code in the function setup_sigcontext().
The registers ar.ccv,b7,r14,ar.csd,ar.ssd,r2-r3 and r16-r31 are not
restored in restore_sigcontext() when (flags & IA64_SC_FLAG_IN_SYSCALL) is
true. So we don't need to zero those variables in setup_sigcontext().
Signed-off-by: Shi Weihua <shiwh@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Christoph Lameter [Wed, 19 Dec 2007 00:22:46 +0000 (16:22 -0800)]
IA64: Slim down __clear_bit_unlock
__clear_bit_unlock does not need to perform atomic operations on the
variable. Avoid a cmpxchg and simply do a store with release semantics.
Add a barrier to be safe that the compiler does not do funky things.
Tony: Use intrinsic rather than inline assembler
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Boaz Harrosh [Mon, 17 Dec 2007 16:08:59 +0000 (18:08 +0200)]
[SCSI] initio: bugfix for accessors patch
patch: [SCSI] initio: convert to use the data buffer accessors had a
small but fatal bug in that it didn't increment the pointer into the
initio scatterlist descriptors as it looped over the block generated
ones. Fixed here.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
FUJITA Tomonori [Sat, 15 Dec 2007 06:51:55 +0000 (15:51 +0900)]
[SCSI] st: fix kernel BUG at include/linux/scatterlist.h:59!
This is caused by a missing scatterlist initialisation (it only shows
up when sg list handling debugging is turned on).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Alan Cox [Fri, 14 Dec 2007 00:14:05 +0000 (16:14 -0800)]
[SCSI] initio: fix conflict when loading driver
> I have a scanner connected to a Initio INI-950 SCSI card and I recently
> upgraded from SuSE 10.2 to 10.3. The new kernel doesn't see any of my
> devices. I get the following in /var/log/messages:
>
> ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 16
> initio: I/O port range 0x0 is busy.
> ACPI: PCI interrupt for device 0000:00:0a.0 disabled
Humm not a collision - thats a bug in the driver updating. Looks like the
changes I made and combined with Christoph's lost a line somewhere when I
was merging it all.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Tony Battersby [Fri, 14 Dec 2007 20:45:16 +0000 (15:45 -0500)]
[SCSI] sym53c8xx: fix "irq X: nobody cared" regression
The patch described by the following excerpt from ChangeLog-2.6.24-rc1
eventually causes a "irq X: nobody cared" error after a while:
commit
99c9e0a1d6cfe1ba1169a7a81435ee85bc00e4a1
Author: Matthew Wilcox <matthew@wil.cx>
Date: Fri Oct 5 15:55:12 2007 -0400
[SCSI] sym53c8xx: Make interrupt handler capable of returning IRQ_NONE
After this happens, the kernel disables the IRQ, causing the SCSI card
to stop working until the next reboot. The problem is caused by the
interrupt handler returning IRQ_NONE instead of IRQ_HANDLED after
handling an interrupt-on-the-fly (INTF) condition. The following patch
fixes the problem.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
James Bottomley [Wed, 12 Dec 2007 20:06:21 +0000 (15:06 -0500)]
[SCSI] dpt_i2o: driver is only 32 bit so don't set 64 bit DMA mask
This fixes a potential corruption bug where the truncation would cause
reading or writing to the wrong memory area on machines with >4GB of
main memory.
Cc: Stable Kernel Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Tony Battersby [Tue, 6 Nov 2007 19:40:54 +0000 (14:40 -0500)]
[SCSI] sym53c8xx: fix free_irq() regression
The following commit changed the pointer passed to request_irq(), but
failed to change the pointer passed to free_irq():
commit
99c9e0a1d6cfe1ba1169a7a81435ee85bc00e4a1
Author: Matthew Wilcox <matthew@wil.cx>
Date: Fri Oct 5 15:55:12 2007 -0400
[SCSI] sym53c8xx: Make interrupt handler capable of returning IRQ_NONE
...
The result is that free_irq() doesn't actually take any action. This
patch fixes it.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Linus Torvalds [Tue, 18 Dec 2007 17:42:44 +0000 (09:42 -0800)]
Merge git://git./linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!"
genirq: revert lazy irq disable for simple irqs
x86: also define AT_VECTOR_SIZE_ARCH
x86: kprobes bugfix
x86: jprobe bugfix
timer: kernel/timer.c section fixes
genirq: add unlocked version of set_irq_handler()
clockevents: fix reprogramming decision in oneshot broadcast
oprofile: op_model_athlon.c support for AMD family 10h barcelona performance counters
Ingo Molnar [Tue, 18 Dec 2007 17:05:58 +0000 (18:05 +0100)]
x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!"
this is the tale of a full day spent debugging an ancient but elusive bug.
after booting up thousands of random .config kernels, i finally happened
to generate a .config that produced the following rare bootup failure
on 32-bit x86:
| ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
| ..MP-BIOS bug: 8254 timer not connected to IO-APIC
| ...trying to set up timer (IRQ0) through the 8259A ... failed.
| ...trying to set up timer as Virtual Wire IRQ... failed.
| ...trying to set up timer as ExtINT IRQ... failed :(.
| Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug
| and send a report. Then try booting with the 'noapic' option
this bug has been reported many times during the years, but it was never
reproduced nor fixed.
the bug that i hit was extremely sensitive to .config details.
First i did a .config-bisection - suspecting some .config detail.
That led to CONFIG_X86_MCE: enabling X86_MCE magically made the bug disappear
and the system would boot up just fine.
Debugging my way through the MCE code ended up identifying two unlikely
candidates: the thing that made a real difference to the hang was that
X86_MCE did two printks:
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
Adding the same printks to a !CONFIG_X86_MCE kernel made the bug go away!
this left timing as the main suspect: i experimented with adding various
udelay()s to the arch/x86/kernel/io_apic_32.c:check_timer() function, and
the race window turned out to be narrower than 30 microseconds (!).
That made debugging especially funny, debugging without having printk
ability before the bug hits is ... interesting ;-)
eventually i started suspecting IRQ activities - those are pretty much the
only thing that happen this early during bootup and have the timescale of
a few dozen microseconds. Also, check_timer() changes the IRQ hardware
in various creative ways, so the main candidate became IRQ0 interaction.
i've added a counter to track timer irqs (on which core they arrived, at
what exact time, etc.) and found that no timer IRQ would arrive after the
bug condition hits - even if we re-enable IRQ0 and re-initialize the i8259A,
but that we'd get a small number of timer irqs right around the time when we
call the check_timer() function.
Eventually i got the following backtrace triggered from debug code in the
timer interrupt:
...trying to set up timer as Virtual Wire IRQ... failed.
...trying to set up timer as ExtINT IRQ...
Pid: 1, comm: swapper Not tainted (2.6.24-rc5 #57)
EIP: 0060:[<
c044d57e>] EFLAGS:
00000246 CPU: 0
EIP is at _spin_unlock_irqrestore+0x5/0x1c
EAX:
c0634178 EBX:
00000000 ECX:
c4947d63 EDX:
00000246
ESI:
00000002 EDI:
00010031 EBP:
c04e0f2e ESP:
f7c41df4
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0:
8005003b CR2:
ffe04000 CR3:
00630000 CR4:
000006d0
DR0:
00000000 DR1:
00000000 DR2:
00000000 DR3:
00000000
DR6:
ffff0ff0 DR7:
00000400
[<
c05f5784>] setup_IO_APIC+0x9c3/0xc5c
the spin_unlock() was called from init_8259A(). Wait ... we have an IRQ0
entry while we are in the middle of setting up the local APIC, the i8259A
and the PIT??
That is certainly not how it's supposed to work! check_timer() was supposed
to be called with irqs turned off - but this eroded away sometime in the
past. This code would still work most of the time because this code runs
very quickly, but just the right timing conditions are present and IRQ0
hits in this small, ~30 usecs window, timer irqs stop and the system does
not boot up. Also, given how early this is during bootup, the hang is
very deterministic - but it would only occur on certain machines (and
certain configs).
The fix was quite simple: disable/restore interrupts properly in this
function. With that in place the test-system now boots up just fine.
(64-bit x86 io_apic_64.c had the same bug.)
Phew! One down, only 1500 other kernel bugs are left ;-)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Tue, 18 Dec 2007 17:05:58 +0000 (18:05 +0100)]
genirq: revert lazy irq disable for simple irqs
In commit
76d2160147f43f982dfe881404cfde9fd0a9da21 lazy irq disabling
was implemented, and the simple irq handler had a masking set to it.
Remy Bohmer discovered that some devices in the ARM architecture
would trigger the mask, but never unmask it. His patch to do the
unmasking was questioned by Russell King about masking simple irqs
to begin with. Looking further, it was discovered that the problems
Remy was seeing was due to improper use of the simple handler by
devices, and he later submitted patches to fix those. But the issue
that was uncovered was that the simple handler should never mask.
This patch reverts the masking in the simple handler.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Jan Beulich [Tue, 18 Dec 2007 17:05:58 +0000 (18:05 +0100)]
x86: also define AT_VECTOR_SIZE_ARCH
The patch introducing this left out 64-bit x86 despite it also having
extra entries.
this solves Xen guest troubles.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Masami Hiramatsu [Tue, 18 Dec 2007 17:05:58 +0000 (18:05 +0100)]
x86: kprobes bugfix
Kprobes for x86-64 may cause a kernel crash if it inserted on "iret"
instruction. "call absolute" is invalid on x86-64, so we don't need
treat it.
- Change the processing order as same as x86-32.
- Add "iret"(0xcf) case.
- Remove next_rip local variable.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Masami Hiramatsu [Tue, 18 Dec 2007 17:05:58 +0000 (18:05 +0100)]
x86: jprobe bugfix
jprobe for x86-64 may cause kernel page fault when the jprobe_return()
is called from incorrect function.
- Use jprobe_saved_regs instead getting it from stack.
(Especially on x86-64, it may get incorrect data, because
pt_regs can not be get by using container_of(rsp))
- Change the type of stack pointer to unsigned long *.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Adrian Bunk [Tue, 18 Dec 2007 17:05:58 +0000 (18:05 +0100)]
timer: kernel/timer.c section fixes
This patch fixes the following section mismatches with CONFIG_HOTPLUG=n,
CONFIG_HOTPLUG_CPU=y:
...
WARNING: vmlinux.o(.text+0x41cd3): Section mismatch: reference to .init.data:tvec_base_done.22610 (between 'timer_cpu_notify' and 'run_timer_softirq')
WARNING: vmlinux.o(.text+0x41d67): Section mismatch: reference to .init.data:tvec_base_done.22610 (between 'timer_cpu_notify' and 'run_timer_softirq')
...
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Kevin Hilman [Tue, 18 Dec 2007 17:05:58 +0000 (18:05 +0100)]
genirq: add unlocked version of set_irq_handler()
Add unlocked version for use by irq_chip.set_type handlers which may
wish to change handler to level or edge handler when IRQ type is
changed.
The normal set_irq_handler() call cannot be used because it tries to
take irq_desc.lock which is already held when the irq_chip.set_type
hook is called.
Signed-off-by: Kevin Hilman <khilman@mvista.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Tue, 18 Dec 2007 17:05:58 +0000 (18:05 +0100)]
clockevents: fix reprogramming decision in oneshot broadcast
Resolve the following regression of a choppy, almost unusable laptop:
http://lkml.org/lkml/2007/12/7/299
http://bugzilla.kernel.org/show_bug.cgi?id=9525
A previous version of the code did the reprogramming of the broadcast
device in the return from idle code. This was removed, but the logic in
tick_handle_oneshot_broadcast() was kept the same.
When a broadcast interrupt happens we signal the expiry to all CPUs
which have an expired event. If none of the CPUs has an expired event,
which can happen in dyntick mode, then we reprogram the broadcast
device. We do not reprogram otherwise, but this is only correct if all
CPUs, which are in the idle broadcast state have been woken up.
The code ignores, that there might be pending not yet expired events on
other CPUs, which are in the idle broadcast state. So the delivery of
those events can be delayed for quite a time.
Change the tick_handle_oneshot_broadcast() function to check for CPUs,
which are in broadcast state and are not woken up by the current event,
and enforce the rearming of the broadcast device for those CPUs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Barry Kasindorf [Tue, 18 Dec 2007 17:05:58 +0000 (18:05 +0100)]
oprofile: op_model_athlon.c support for AMD family 10h barcelona performance counters
This patch is for controlling the upper 32bits of the event ctrl msrs.
This includes the upper 4 bits of the event select and the Guest Only and
Host Only bits
This patch is necessary to make Event Based Profiling work reliably on a
Family 10h processor
[akpm@linux-foundation.org: checkpatch.pl fixes]
Signed-off-by: Barry Kasindorf <barry.kasindorf@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Tue, 18 Dec 2007 16:11:01 +0000 (08:11 -0800)]
Merge git://git./linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
sched: do not hurt SCHED_BATCH on wakeup
sched: touch softlockup watchdog after idling
sched: sysctl, proc_dointvec_minmax() expects int values for
sched: mark rwsem functions as __sched for wchan/profiling
sched: fix crash on ia64, introduce task_current()
Linus Torvalds [Tue, 18 Dec 2007 16:04:24 +0000 (08:04 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
Cleanup umem driver: fix most checkpatch warnings, conform to kernel
block: let elv_register() return void
as-iosched: fix write batch start point
as-iosched: fix incorrect comments
block: use jiffies conversion functions in scsi_ioctl.c
Linus Torvalds [Tue, 18 Dec 2007 16:03:32 +0000 (08:03 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
[XFS] Put the correct offset in dirent d_off
[XFS] Don't wait for pending I/Os when purging blocks beyond eof.
Linus Torvalds [Tue, 18 Dec 2007 16:03:01 +0000 (08:03 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
mmc: remove unused 'mode' from the mmc_host structure
sdhci: support JMicron JMB38x chips
sdhci: use PIO when DMA can't satisfy the request
sdhci: don't warn about sdhci 2.0 controllers
sdhci: describe quirks
Ingo Molnar [Tue, 18 Dec 2007 14:21:13 +0000 (15:21 +0100)]
sched: do not hurt SCHED_BATCH on wakeup
measurements by Yanmin Zhang have shown that SCHED_BATCH tasks benefit
if they run the same place_entity() logic as SCHED_OTHER tasks - so
uniformize behavior in this area.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 18 Dec 2007 14:21:13 +0000 (15:21 +0100)]
sched: touch softlockup watchdog after idling
touch softlockup watchdog after idling.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Eric Dumazet [Tue, 18 Dec 2007 14:21:13 +0000 (15:21 +0100)]
sched: sysctl, proc_dointvec_minmax() expects int values for
min_sched_granularity_ns, max_sched_granularity_ns,
min_wakeup_granularity_ns and max_wakeup_granularity_ns are declared
"unsigned long".
This is incorrect since proc_dointvec_minmax() expects plain "int" guard
values.
This bug only triggers on big endian 64 bit arches.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>