Huang Ying [Thu, 19 Feb 2009 06:42:19 +0000 (14:42 +0800)]
crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq
The original cryptd thread implementation has a scalability issue; this
patch solves it with a per-CPU thread implementation.
struct cryptd_queue is defined to be a per-CPU queue, which holds one
struct cryptd_cpu_queue for each CPU. In struct cryptd_cpu_queue, a
struct crypto_queue holds all requests for the CPU, and a struct
work_struct is used to run all requests for the CPU.
Testing based on dm-crypt on an Intel Core 2 E6400 (two cores) machine
shows a 19.2% performance gain. The testing script is as follows:
-------------------- script begin ---------------------------
#!/bin/sh
dmc_create()
{
# Create a crypt device using dmsetup
dmsetup create $2 --table "0 `blockdev --getsize $1` crypt cbc(aes-asm)?cryptd?plain:plain
babebabebabebabebabebabebabebabe 0 $1 0"
}
dmsetup remove crypt0
dmsetup remove crypt1
dd if=/dev/zero of=/dev/ram0 bs=1M count=4 >& /dev/null
dd if=/dev/zero of=/dev/ram1 bs=1M count=4 >& /dev/null
dmc_create /dev/ram0 crypt0
dmc_create /dev/ram1 crypt1
cat >tr.sh <<EOF
#!/bin/sh
for n in \$(seq 10); do
dd if=/dev/dm-0 of=/dev/null >& /dev/null &
dd if=/dev/dm-1 of=/dev/null >& /dev/null &
done
wait
EOF
for n in $(seq 10); do
/usr/bin/time sh tr.sh
done
rm tr.sh
-------------------- script end ---------------------------
The separator of the dm-crypt parameter is changed from "-" to "?", because
"-" is used in some cipher driver names too, and cryptd needs to specify the
cipher driver name instead of the cipher name.
The test result on an Intel Core2 E6400 (two cores) is as follows:
without patch:
-----------------wo begin --------------------------
0.04user 0.38system 0:00.39elapsed 107%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6566minor)pagefaults 0swaps
0.07user 0.35system 0:00.35elapsed 121%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6567minor)pagefaults 0swaps
0.06user 0.34system 0:00.30elapsed 135%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.05user 0.37system 0:00.36elapsed 119%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6607minor)pagefaults 0swaps
0.06user 0.36system 0:00.35elapsed 120%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.05user 0.37system 0:00.31elapsed 136%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6594minor)pagefaults 0swaps
0.04user 0.34system 0:00.30elapsed 126%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6597minor)pagefaults 0swaps
0.06user 0.32system 0:00.31elapsed 125%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6571minor)pagefaults 0swaps
0.06user 0.34system 0:00.31elapsed 134%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6581minor)pagefaults 0swaps
0.05user 0.38system 0:00.31elapsed 138%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6600minor)pagefaults 0swaps
-----------------wo end --------------------------
with patch:
------------------w begin --------------------------
0.02user 0.31system 0:00.24elapsed 141%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6554minor)pagefaults 0swaps
0.05user 0.34system 0:00.31elapsed 127%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6606minor)pagefaults 0swaps
0.07user 0.33system 0:00.26elapsed 155%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6559minor)pagefaults 0swaps
0.07user 0.32system 0:00.26elapsed 151%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.05user 0.34system 0:00.26elapsed 150%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6603minor)pagefaults 0swaps
0.03user 0.36system 0:00.31elapsed 124%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.04user 0.35system 0:00.26elapsed 147%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6586minor)pagefaults 0swaps
0.03user 0.37system 0:00.27elapsed 146%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.04user 0.36system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6594minor)pagefaults 0swaps
0.04user 0.35system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6557minor)pagefaults 0swaps
------------------w end --------------------------
The median elapsed time is:
wo cryptwq: 0.31
w cryptwq: 0.26
The performance gain is about (0.31-0.26)/0.26 = 0.192.
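The arithmetic above can be re-derived with a small user-space sketch. The helper names (median, relative_gain) are illustrative only and are not part of the patch; the sample arrays hold the elapsed times from the two listings above, in hundredths of a second.

```c
#include <stdlib.h>

/* Elapsed times in hundredths of a second (39 == 0:00.39elapsed),
 * copied from the "without patch" and "with patch" listings. */
static const int wo_patch[10] = { 39, 35, 30, 36, 35, 31, 30, 31, 31, 31 };
static const int w_patch[10]  = { 24, 31, 26, 26, 26, 31, 26, 27, 26, 26 };

static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Median of n samples (n <= 16); the input array is left untouched. */
static double median(const int *v, size_t n)
{
	int tmp[16];
	size_t i;

	for (i = 0; i < n; i++)
		tmp[i] = v[i];
	qsort(tmp, n, sizeof(tmp[0]), cmp_int);
	return (n % 2) ? tmp[n / 2] : (tmp[n / 2 - 1] + tmp[n / 2]) / 2.0;
}

/* Gain expressed relative to the patched time, as in the formula above. */
static double relative_gain(double before, double after)
{
	return (before - after) / after;
}
```

Calling relative_gain(median(wo_patch, 10), median(w_patch, 10)) reproduces the (0.31-0.26)/0.26 figure quoted above.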
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Huang Ying [Thu, 19 Feb 2009 06:33:40 +0000 (14:33 +0800)]
crypto: api - Use dedicated workqueue for crypto subsystem
A dedicated workqueue named kcrypto_wq is created to be used by the crypto
subsystem. The system shared keventd_wq is not suitable for
encryption/decryption, because of a potential starvation problem.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Wed, 18 Feb 2009 13:41:29 +0000 (21:41 +0800)]
crypto: testmgr - Test skciphers with no IVs
As it is, an skcipher with no IV escapes testing altogether because
we only test givcipher objects. This patch fixes the bypass logic
to test these algorithms.
Conversely, we're currently testing nivaead algorithms with IVs,
which would have deadlocked had it not been for the fact that no
nivaead algorithms have any test vectors. This patch also fixes
that case.
Both fixes are ugly as hell, but this ugliness should hopefully
disappear once we move them into the per-type code (i.e., the
AEAD test would live in aead.c and the skcipher stuff in ablkcipher.c).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Wed, 18 Feb 2009 12:41:47 +0000 (20:41 +0800)]
crypto: aead - Avoid infinite loop when nivaead fails selftest
When an aead constructed through crypto_nivaead_default fails
its selftest, we'll loop forever trying to construct new aead
objects but failing because it already exists.
The crux of the issue is that once an aead fails the selftest,
we'll ignore it on the next run through crypto_aead_lookup and
attempt to construct a new aead.
We should instead return an error to the caller if we find an
aead that has failed the test.
This bug hasn't manifested itself yet because we don't have any
test vectors for the existing nivaead algorithms. They're tested
through the underlying algorithms only.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Wed, 18 Feb 2009 12:33:55 +0000 (20:33 +0800)]
crypto: skcipher - Avoid infinite loop when cipher fails selftest
When an skcipher constructed through crypto_givcipher_default fails
its selftest, we'll loop forever trying to construct new skcipher
objects but failing because it already exists.
The crux of the issue is that once a givcipher fails the selftest,
we'll ignore it on the next run through crypto_skcipher_lookup and
attempt to construct a new givcipher.
We should instead return an error to the caller if we find a
givcipher that has failed the test.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Wed, 18 Feb 2009 08:56:59 +0000 (16:56 +0800)]
crypto: api - Fix crypto_alloc_tfm/crypto_create_tfm return convention
This is based on a report and patch by Geert Uytterhoeven.
The functions crypto_alloc_tfm and crypto_create_tfm return a
pointer that needs to be adjusted by the caller when successful
and otherwise an error value. This means that the caller has
to check for the error and only perform the adjustment if the
pointer returned is valid.
Since all callers want to make the adjustment and we know how
to adjust it ourselves, it's much easier to just return the adjusted
pointer directly.
The only caveat is that we have to return a void * instead of
struct crypto_tfm *. However, this isn't that bad because both
of these functions are for internal use only (by types code like
shash.c, not even algorithms code).
This patch also moves crypto_alloc_tfm into crypto/internal.h
(crypto_create_tfm is already there) to reflect this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Tue, 17 Feb 2009 12:18:34 +0000 (20:18 +0800)]
crypto: api - crypto_alg_mod_lookup either tested or untested
As it stands crypto_alg_mod_lookup will search either tested or
untested algorithms, but never both at the same time. However,
we need exactly that when constructing givcipher and aead so
this patch adds support for that by setting the tested bit in
type but clearing it in mask. This combination is currently
unused.
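A rough user-space sketch of the type/mask convention described above. It assumes the usual flag test ((flags ^ type) & mask) == 0 and a default that restricts lookups to tested algorithms when the caller mentions the bit in neither type nor mask. ALG_TESTED, its value, struct lookup, normalize() and matches() are stand-ins for illustration, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's "tested" flag; the value is an assumption. */
#define ALG_TESTED 0x0400u

struct lookup {
	uint32_t type;
	uint32_t mask;
};

/* Assumed default: a caller that sets the bit in neither type nor mask
 * only wants algorithms that have passed their self test. */
static struct lookup normalize(uint32_t type, uint32_t mask)
{
	struct lookup l = { type, mask };

	if (!((type | mask) & ALG_TESTED)) {
		l.type |= ALG_TESTED;
		l.mask |= ALG_TESTED;
	}
	return l;
}

/* An algorithm matches when every bit selected by mask agrees with type. */
static bool matches(uint32_t alg_flags, uint32_t type, uint32_t mask)
{
	struct lookup l = normalize(type, mask);

	return ((alg_flags ^ l.type) & l.mask) == 0;
}
```

Under these assumptions, type=0/mask=0 finds tested algorithms only, type=0/mask=ALG_TESTED finds untested ones only, and type=ALG_TESTED/mask=0 (the previously unused combination) finds both.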
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
James Hsiao [Thu, 5 Feb 2009 05:18:13 +0000 (16:18 +1100)]
crypto: amcc - Add crypt4xx driver
This patch adds the AMCC ppc4xx security device driver. This is the
initial release that includes the driver framework with AES and SHA1 algorithm
support.
The remaining algorithms will be released in the near future.
Signed-off-by: James Hsiao <jhsiao@amcc.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Neil Horman [Thu, 5 Feb 2009 05:03:04 +0000 (16:03 +1100)]
crypto: ansi_cprng - Add maintainer
Add myself as the maintainer for the CPRNG. Herbert shouldn't deal with it
alone if (when?) it breaks :)
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Neil Horman [Thu, 5 Feb 2009 05:01:38 +0000 (16:01 +1100)]
crypto: ansi_cprng - Panic on CPRNG test failure when in FIPS mode
FIPS 140-2 specifies that all access to various cryptographic modules be
prevented in the event that any of the provided self tests fail on the various
implemented algorithms. We already panic when any of the tests in testmgr.c
fail when we are operating in fips mode. The continuous test in the cprng here
was missed when that was implemented. This code simply checks for the
fips_enabled flag if the test fails, and warns us via syslog or panics the box
accordingly.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Sun, 18 Jan 2009 09:33:33 +0000 (20:33 +1100)]
crypto: sha-s390 - Switch to shash
This patch converts the S390 sha algorithms to the new shash interface.
With fixes by Jan Glauber.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Tue, 3 Feb 2009 01:47:44 +0000 (12:47 +1100)]
crypto: shash - Add crypto_shash_blocksize
This function is needed by algorithms that don't know their own
block size, e.g., in s390 where the code is common between multiple
versions of SHA.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Neil Horman [Wed, 28 Jan 2009 04:20:51 +0000 (15:20 +1100)]
crypto: ansi_cprng - Force reset on allocation
Pseudo RNGs provide predictable outputs based on input parameters {key, V, DT}.
The idea behind them is that only the user should know what the inputs are.
While it's nice to have default known values for testing purposes, it seems
dangerous to allow the use of those default values without some sort of safety
measure in place, lest an attacker easily guess the output of the cprng. This
patch forces the NEED_RESET flag on when allocating a cprng context, so that any
user is forced to reseed it before use. The defaults can still be used for
testing, but this will prevent their inadvertent use, and be more secure.
Signed-off-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Huang Ying [Sun, 18 Jan 2009 05:28:34 +0000 (16:28 +1100)]
crypto: aes-ni - Add support to Intel AES-NI instructions for x86_64 platform
Intel AES-NI is a new set of Single Instruction Multiple Data (SIMD)
instructions that are going to be introduced in the next generation of
Intel processors, as of 2009. These instructions enable fast and secure
data encryption and decryption, using the Advanced Encryption Standard
(AES), defined by FIPS Publication number 197. The architecture
introduces six instructions that offer full hardware support for
AES. Four of them support high performance data encryption and
decryption, and the other two instructions support the AES key
expansion procedure.
The white paper can be downloaded from:
http://softwarecommunity.intel.com/isn/downloads/intelavx/AES-Instructions-Set_WP.pdf
AES may be used in soft_irq context, but MMX/SSE context cannot be
touched safely in soft_irq context. So in_interrupt() is checked; if
in IRQ or soft_irq context, the general x86_64 implementation is used
instead.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Huang Ying [Sun, 18 Jan 2009 05:19:46 +0000 (16:19 +1100)]
crypto: cryptd - Add support to access underlying blkcipher
cryptd_alloc_ablkcipher() will allocate a cryptd-ed ablkcipher for the
specified algorithm name. The newly allocated one is guaranteed to be a
cryptd-ed ablkcipher, so the underlying blkcipher can be obtained via
cryptd_ablkcipher_child().
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Wed, 14 Jan 2009 02:34:48 +0000 (13:34 +1100)]
crypto: shash - Remove superfluous check in init_tfm
We're currently checking the frontend type in init_tfm. This is
completely pointless because the fact that we're called at all
means that the frontend is ours so the type must match as well.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Huang Ying [Fri, 9 Jan 2009 06:25:50 +0000 (17:25 +1100)]
crypto: aes - Export x86 AES encrypt/decrypt functions
The Intel AES-NI AES acceleration instructions touch XMM state, which
cannot be used in soft_irq context, so the general x86 AES implementation
is used as a fallback there. The first parameter is changed from struct crypto_tfm * to
struct crypto_aes_ctx * to make it easier to deal with the 16-byte
alignment requirement of the AES-NI implementation.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Huang Ying [Fri, 9 Jan 2009 05:49:30 +0000 (16:49 +1100)]
crypto: aes - Move key_length in struct crypto_aes_ctx to be the last field
The Intel AES-NI AES acceleration instructions need key_enc and key_dec
in struct crypto_aes_ctx to be 16-byte aligned. Moving key_length to be
the last field makes this easier.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Tue, 17 Feb 2009 12:00:11 +0000 (20:00 +0800)]
crypto: lrw - Fix big endian support
It turns out that LRW has never worked properly on big endian.
This was never discussed because nobody actually used it that
way. In fact, it was only discovered when Geert Uytterhoeven
loaded it through tcrypt which failed the test on it.
The fix is straightforward: on big endian, to find the nth
bit we should be grouping bits by words instead of bytes. So
setbit128_bbe should xor with 128 - BITS_PER_LONG instead of
128 - BITS_PER_BYTE == 0x78.
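The index arithmetic can be checked with a portable model. This is only a sketch: it assumes __set_bit(nr, addr) addresses bit nr % BITS_PER_LONG of long number nr / BITS_PER_LONG, and it simulates the in-memory byte order instead of depending on the host. All function names below are illustrative.

```c
/* Byte offset touched by a kernel-style __set_bit(nr, addr), where addr
 * points to an array of longs that are bits_per_long bits wide. */
static int set_bit_byte_offset(int nr, int bits_per_long, int big_endian)
{
	int bytes_per_long = bits_per_long / 8;
	int word = nr / bits_per_long;
	int byte_in_word = (nr % bits_per_long) / 8;	/* 0 = least significant */

	if (big_endian)
		byte_in_word = bytes_per_long - 1 - byte_in_word;
	return word * bytes_per_long + byte_in_word;
}

/* Byte offset reached for logical bit `bit` when the index is first
 * XORed with (128 - group), as described for setbit128_bbe. */
static int bbe_byte_offset(int bit, int group, int bits_per_long, int big_endian)
{
	return set_bit_byte_offset(bit ^ (128 - group), bits_per_long, big_endian);
}

/* Returns 1 if every bit 0..127 lands in byte 15 - bit / 8, i.e. the
 * 16-byte block is treated as one big-endian 128-bit number. */
static int mapping_is_correct(int group, int bits_per_long, int big_endian)
{
	int bit;

	for (bit = 0; bit < 128; bit++)
		if (bbe_byte_offset(bit, group, bits_per_long, big_endian) != 15 - bit / 8)
			return 0;
	return 1;
}
```

In this model, grouping by bytes (group = 8) is correct on little endian, grouping by words (group = BITS_PER_LONG) is correct on big endian for both 32-bit and 64-bit longs, and grouping by bytes on a 64-bit big-endian machine sends bit 0 to byte 8 instead of byte 15.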
Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Linus Torvalds [Fri, 13 Feb 2009 23:31:30 +0000 (15:31 -0800)]
Linux 2.6.29-rc5
Linus Torvalds [Fri, 13 Feb 2009 16:19:11 +0000 (08:19 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ASoC: Only register AC97 bus if it's not done already
ALSA: hda - Add snd_hda_multi_out_dig_cleanup()
ALSA: hda - Add missing terminator in slave dig-out array
ALSA: hda - Change HP dv7 (103c:30f4) quirk from hp-m4 to hp-dv5 model
ALSA: hda - Register (new) devices at reconfig
ALSA: mtpav - Fix initial value for input hwport
ALSA: hda - add id for Intel IbexPeak integrated HDMI codec
ALSA: hda - compute checksum in HDMI audio infoframe
ALSA: hda - enable HDMI audio pin out at module loading time
ALSA: hda - allow multi-channel HDMI audio playback when ELD is not present
ASoC: Update SDP3430 machine driver for snd_soc_card
ALSA: hda - Add quirk for Asus z37e (1043:8284)
sound: Remove OSSlib stuff from linux/soundcard.h
ASoC: WM8990: Fix kcontrol's private value use in put callback
ASoC: TLV320AIC3X: Fix kcontrol's private value use in put callback
Serge E. Hallyn [Fri, 13 Feb 2009 14:04:21 +0000 (14:04 +0000)]
User namespaces: Only put the userns when we unhash the uid
uids in namespaces other than init don't get a sysfs entry.
For those in the init namespace, while we're waiting to remove
the sysfs entry for the uid the uid is still hashed, and
alloc_uid() may re-grab that uid without getting a new
reference to the user_ns, which we've already put in free_user
before scheduling remove_user_sysfs_dir().
Reported-and-tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Takashi Iwai [Fri, 13 Feb 2009 14:06:04 +0000 (15:06 +0100)]
Merge branch 'fix/asoc' into for-linus
Takashi Iwai [Fri, 13 Feb 2009 14:05:59 +0000 (15:05 +0100)]
Merge branch 'fix/hda' into for-linus
Takashi Iwai [Fri, 13 Feb 2009 14:05:56 +0000 (15:05 +0100)]
Merge branch 'fix/misc' into for-linus
Takashi Iwai [Fri, 13 Feb 2009 14:05:51 +0000 (15:05 +0100)]
Merge branch 'fix/oss-header-fix' into for-linus
Mark Brown [Thu, 12 Feb 2009 19:33:19 +0000 (19:33 +0000)]
ASoC: Only register AC97 bus if it's not done already
ASoC supports both explicit codec drivers for AC97 devices and a simple
driver which uses the standard ALSA AC97 framework for codec support.
When used with the generic AC97 codec support that will provide the
ad hoc AC97 device for drivers like touchscreens to attach to so the
core shouldn't do so.
Reported-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Takashi Iwai [Fri, 13 Feb 2009 10:32:28 +0000 (11:32 +0100)]
ALSA: hda - Add snd_hda_multi_out_dig_cleanup()
Added the helper function snd_hda_multi_out_dig_cleanup() to clean up
the digital outputs with multi setup. This call is needed in cases where
the codec supports multiple digital outputs as slaves. Otherwise the
slave widgets aren't properly cleaned up.
For a single digital output (e.g. in patch_conexant.c), this call isn't
needed.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Fri, 13 Feb 2009 10:37:08 +0000 (11:37 +0100)]
ALSA: hda - Add missing terminator in slave dig-out array
Added the missing terminator for ad1989b_slave_dig_outs[].
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Herton Ronaldo Krzesinski [Thu, 12 Feb 2009 19:27:27 +0000 (17:27 -0200)]
ALSA: hda - Change HP dv7 (103c:30f4) quirk from hp-m4 to hp-dv5 model
Change HP dv7 quirk: although reported to work with hp-m4 model
(https://bugzilla.novell.com/show_bug.cgi?id=445321), the original
report doesn't contain info about testing of internal microphone.
Recently I received a report about the internal mic not working
(https://qa.mandriva.com/show_bug.cgi?id=44855#c193); this must be
related to the forced line in on pin 0x0e done with the hp-m4 model. Thus
change the current quirk from STAC_HP_M4 to STAC_HP_DV5, which was later reported
to fix the problem on a provided kernel with this change
(https://qa.mandriva.com/show_bug.cgi?id=44855#c196).
Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Linus Torvalds [Fri, 13 Feb 2009 01:47:15 +0000 (17:47 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits)
wimax: fix oops in wimax_dev_get_by_genl_info() when looking up non-wimax iface
net: 4 bytes kernel memory disclosure in SO_BSDCOMPAT gsopt try #2
netxen: fix compile warning "label ‘set_32_bit_mask’ defined but not used" on IA64 platform
bnx2: Update version to 1.9.2 and copyright.
bnx2: Fix jumbo frames error handling.
bnx2: Update 5709 firmware.
bnx2: Update 5706/5708 firmware.
3c505: do not set pcb->data.raw beyond its size
Documentation/connector/cn_test.c: don't use gfp_any()
net: don't use in_atomic() in gfp_any()
IRDA: cnt is off by 1
netxen: remove pcie workaround
sun3: print when lance_open() fails
qlge: bugfix: Add missing rx buf clean index on early exit.
qlge: bugfix: Fix RX scaling values.
qlge: bugfix: Fix TSO breakage.
qlge: bugfix: Add missing dev_kfree_skb_any() call.
qlge: bugfix: Add missing put_page() call.
qlge: bugfix: Fix fatal error recovery hang.
qlge: bugfix: Use netif_receive_skb() and vlan_hwaccel_receive_skb().
...
Inaky Perez-Gonzalez [Fri, 13 Feb 2009 01:00:20 +0000 (17:00 -0800)]
wimax: fix oops in wimax_dev_get_by_genl_info() when looking up non-wimax iface
When a non-wimax interface is looked up by the stack, a bad pointer is
returned when the looked-up interface is not found in the list (of
registered WiMAX interfaces). This causes an oops in the caller when
trying to use the pointer.
Fix by properly setting the pointer to NULL if we don't exit from the
list_for_each() with a found entry.
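The pattern behind this fix can be sketched in user space. The list below is a simplified circular list with a sentinel head, standing in for the kernel's list_head; struct iface, iface_of() and iface_lookup() are hypothetical names, not the wimax code. The point is that a cursor left over from a loop that ran to completion is derived from the list head and is not NULL, so "not found" has to be reported explicitly.

```c
#include <stddef.h>

struct node {
	struct node *next;
};

struct iface {
	int ifindex;
	struct node link;
};

/* Recover the containing struct iface from its embedded list node. */
#define iface_of(n) ((struct iface *)((char *)(n) - offsetof(struct iface, link)))

/* Walk the circular list and return the entry with the given ifindex,
 * or NULL when the loop ends without a match. */
static struct iface *iface_lookup(struct node *head, int ifindex)
{
	struct node *n;

	for (n = head->next; n != head; n = n->next) {
		struct iface *i = iface_of(n);

		if (i->ifindex == ifindex)
			return i;
	}
	return NULL;	/* never hand back a pointer derived from head */
}
```

Callers can then test the result against NULL before dereferencing it.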
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clément Lecigne [Fri, 13 Feb 2009 00:59:09 +0000 (16:59 -0800)]
net: 4 bytes kernel memory disclosure in SO_BSDCOMPAT gsopt try #2
In function sock_getsockopt() located in net/core/sock.c, optval v.val
is not correctly initialized and is returned directly to userland in case
we have the SO_BSDCOMPAT option set.
This dummy code should trigger the bug:
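#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>

int main(void)
{
	unsigned char buf[4] = { 0, 0, 0, 0 };
	socklen_t len = sizeof(buf);	/* must be set before getsockopt() */
	int sock;

	sock = socket(33, 2, 2);
	getsockopt(sock, 1, SO_BSDCOMPAT, &buf, &len);
	printf("%x%x%x%x\n", buf[0], buf[1], buf[2], buf[3]);
	close(sock);
	return 0;
}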
Here is a patch that fixes this bug by initializing v.val just after its
declaration.
Signed-off-by: Clément Lecigne <clement.lecigne@netasq.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Hongyang [Fri, 13 Feb 2009 00:57:12 +0000 (16:57 -0800)]
netxen: fix compile warning "label ‘set_32_bit_mask’ defined but not used" on IA64 platform
When compiling the latest kernel on the IA64 platform, I got a warning:
drivers/net/netxen/netxen_nic_main.c:203: warning: label ‘set_32_bit_mask’
defined but not used
We do not need the label ‘set_32_bit_mask’ on the IA64 platform, so move it to #else.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 13 Feb 2009 00:54:48 +0000 (16:54 -0800)]
bnx2: Update version to 1.9.2 and copyright.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 13 Feb 2009 00:54:13 +0000 (16:54 -0800)]
bnx2: Fix jumbo frames error handling.
If errors are reported on a frame descriptor, we need to
account for the buffer pages that may have been used for this
error packet and recycle them. Otherwise, we may get the wrong
pages for the next packet.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 13 Feb 2009 00:53:48 +0000 (16:53 -0800)]
bnx2: Update 5709 firmware.
New firmware fixes a data corruption issue when receiving and
placing jumbo frames into host buffers. In some cases, the
buffer descriptor is not updated correctly and this will lead
to the driver linking the wrong number of pages into the SKB.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 13 Feb 2009 00:53:22 +0000 (16:53 -0800)]
bnx2: Update 5706/5708 firmware.
New firmware fixes a data corruption issue when receiving and
placing jumbo frames into host buffers. In some cases, the
buffer descriptor is not updated correctly and this will lead
to the driver linking the wrong number of pages into the SKB.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roel Kluin [Fri, 13 Feb 2009 00:52:31 +0000 (16:52 -0800)]
3c505: do not set pcb->data.raw beyond its size
Ensure that we do not set pcb->data.raw beyond its size, print an error message
and return false if we attempt to. A timeout message was printed one iteration too early.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Fri, 13 Feb 2009 00:47:01 +0000 (16:47 -0800)]
Documentation/connector/cn_test.c: don't use gfp_any()
cn_test_timer_func() is a timer handler and can never use GFP_KERNEL -
there's no point in using gfp_any() here.
Also, use setup_timer().
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Fri, 13 Feb 2009 00:43:17 +0000 (16:43 -0800)]
net: don't use in_atomic() in gfp_any()
The problem is that in_atomic() will return false inside spinlocks if
CONFIG_PREEMPT=n. This will lead to deadlockable GFP_KERNEL allocations
from spinlocked regions.
Secondly, if CONFIG_PREEMPT=y, this bug solves itself because networking
will instead use GFP_ATOMIC from this callsite. Hence we won't get the
might_sleep() debugging warnings which would have informed us of the buggy
callsites.
Solve both these problems by switching to in_interrupt(). Now, if someone
runs a gfp_any() allocation from inside spinlock we will get the warning
if CONFIG_PREEMPT=y.
I reviewed all callsites and most of them were too complex for my little
brain and none of them documented their interface requirements. I have no
idea what this patch will do.
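The two problems and the effect of the switch can be illustrated with a toy model. This is not kernel code: it assumes a single counter whose low 8 bits count held spinlocks (only when preemption is configured in) and whose higher bits count interrupt nesting. Every name below is made up for the sketch.

```c
#include <stdbool.h>

enum gfp_model { GFP_KERNEL_MODEL, GFP_ATOMIC_MODEL };

struct cpu_state {
	bool config_preempt;	/* models CONFIG_PREEMPT=y */
	unsigned preempt_count;	/* low 8 bits: locks held; higher bits: irq nesting */
};

#define SOFTIRQ_UNIT 0x100u

/* Taking a spinlock only bumps the counter when preemption is configured. */
static void model_spin_lock(struct cpu_state *s)
{
	if (s->config_preempt)
		s->preempt_count += 1;
}

static void model_softirq_enter(struct cpu_state *s)
{
	s->preempt_count += SOFTIRQ_UNIT;
}

static bool model_in_atomic(const struct cpu_state *s)
{
	return s->preempt_count != 0;
}

static bool model_in_interrupt(const struct cpu_state *s)
{
	return (s->preempt_count & ~0xffu) != 0;
}

/* Behaviour before the patch: decision based on in_atomic(). */
static enum gfp_model gfp_any_old(const struct cpu_state *s)
{
	return model_in_atomic(s) ? GFP_ATOMIC_MODEL : GFP_KERNEL_MODEL;
}

/* Behaviour after the patch: decision based on in_interrupt(). */
static enum gfp_model gfp_any_new(const struct cpu_state *s)
{
	return model_in_interrupt(s) ? GFP_ATOMIC_MODEL : GFP_KERNEL_MODEL;
}
```

In this model a caller holding a spinlock gets the sleeping allocation from gfp_any_old() only when config_preempt is false, which is the silent case; gfp_any_new() returns it in both configurations, so a might_sleep() style check on a preemptible build has a chance to flag the callsite. Softirq context yields the atomic allocation with either variant.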
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roel Kluin [Fri, 13 Feb 2009 00:42:31 +0000 (16:42 -0800)]
IRDA: cnt is off by 1
If no prior break occurs, cnt reaches 101 after the loop, so we are still able
to change speed when cnt has become 100.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Fri, 13 Feb 2009 00:41:14 +0000 (16:41 -0800)]
netxen: remove pcie workaround
Remove workaround for pcie bug in early revisions of NX3031
(rev 41 or earlier). This is taken care of during firmware init.
The workaround required writing pcie config reg of every
pcie function on a card, not all of which are enabled.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roel Kluin [Fri, 13 Feb 2009 00:40:20 +0000 (16:40 -0800)]
sun3: print when lance_open() fails
With while (--i > 0) { ... } i reaches 0; print when lance_open() fails
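A minimal sketch of the loop shape named above, with an empty body standing in for the driver's polling code. The function name is illustrative.

```c
/* Runs `while (--i > 0)` with a body that never breaks and returns the
 * value of i that a check placed after the loop would see. */
static int final_after_predecrement(int i)
{
	while (--i > 0)
		;	/* poll the hardware here */
	return i;
}
```

For any positive starting value the function returns 0, never a negative number, so a failure test written after such a loop has to compare against 0.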
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Fri, 13 Feb 2009 00:38:34 +0000 (16:38 -0800)]
qlge: bugfix: Add missing rx buf clean index on early exit.
The large receive buffer queue is not properly tracking the current
index in the case where an early exit occurs. This can happen when a
page alloc or dma mapping fails. If this occurs the queue will get
out of sync and invalid indexes can be written to the hardware.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Fri, 13 Feb 2009 00:38:18 +0000 (16:38 -0800)]
qlge: bugfix: Fix RX scaling values.
Receive packets were only scaling across 2 of the receive queues. The
value was hardcoded to 2 instead of being based on how many rx queues
were running.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Fri, 13 Feb 2009 00:38:03 +0000 (16:38 -0800)]
qlge: bugfix: Fix TSO breakage.
Moved the buffer mapping to a point after TSO logic has modified the
iph->check field. We were seeing stale data on the PCIe bus.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Fri, 13 Feb 2009 00:37:48 +0000 (16:37 -0800)]
qlge: bugfix: Add missing dev_kfree_skb_any() call.
We put the skb back if we can't get mapping for it. We don't
want unmapped buffers on our receive buffer queue.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Fri, 13 Feb 2009 00:37:32 +0000 (16:37 -0800)]
qlge: bugfix: Add missing put_page() call.
We put the page back if we can't get mapping for it. We don't
want unmapped buffers on our receive buffer queue.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Fri, 13 Feb 2009 00:37:13 +0000 (16:37 -0800)]
qlge: bugfix: Fix fatal error recovery hang.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Fri, 13 Feb 2009 00:36:50 +0000 (16:36 -0800)]
qlge: bugfix: Use netif_receive_skb() and vlan_hwaccel_receive_skb().
Replace calls to vlan_hwaccel_rx() and netif_rx().
Thanks to Dave Miller for pointing out that the driver was making
the wrong upcall for passing packets into the stack.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roel Kluin [Fri, 13 Feb 2009 00:33:27 +0000 (16:33 -0800)]
TG3: limit reaches -1
With while (limit--) { ... } limit reaches -1, so 0 means success.
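The post-decrement case can be sketched the same way. The parameter succeed_on_pass is a hypothetical stand-in for the hardware condition that makes the real loop break; passing 0 means the condition never becomes true.

```c
/* Runs `while (limit--)` and breaks on the given pass (1-based).
 * Returns the value of limit seen after the loop. */
static int final_after_postdecrement(int limit, int succeed_on_pass)
{
	int pass = 0;

	while (limit--) {
		if (++pass == succeed_on_pass)
			break;	/* condition met on this pass */
	}
	return limit;
}
```

With limit = 3, a loop that never succeeds ends with limit == -1, while success on the third and last pass ends with limit == 0. A post-loop test of limit == 0 therefore flags a success as a timeout; the timeout value is -1.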
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 12 Feb 2009 17:56:14 +0000 (09:56 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/penberg/slab-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
mm: Export symbol ksize()
Nick Piggin [Thu, 12 Feb 2009 03:34:23 +0000 (04:34 +0100)]
Fix page writeback thinko, causing Berkeley DB slowdown
A bug was introduced into write_cache_pages cyclic writeout by commit
31a12666d8f0c22235297e1c1575f82061480029 ("mm: write_cache_pages cyclic
fix"). The intention (and comments) is that we should cycle back and
look for more dirty pages at the beginning of the file if there is no
more work to be done.
But the !done condition was dropped from the test. This means that any
time the page writeout loop breaks (eg. due to nr_to_write == 0), we
will set index to 0, then goto again. This will set done_index to
index, then find done is set, so will proceed to the end of the
function. When updating mapping->writeback_index for cyclic writeout,
we now use done_index == 0, so we're always cycling back to 0.
This seemed to be causing random mmap writes (slapadd and iozone) to
start writing more pages from the LRU and writeout would slow down, and
caused bugzilla entry
http://bugzilla.kernel.org/show_bug.cgi?id=12604
about Berkeley DB slowing down dramatically.
With this patch, iozone random write performance is increased nearly
5x on my system (iozone -B -r 4k -s 64k -s 512m -s 1200m on ext2).
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reported-and-tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kirill A. Shutemov [Tue, 10 Feb 2009 13:21:44 +0000 (15:21 +0200)]
mm: Export symbol ksize()
Commit
7b2cd92adc5430b0c1adeb120971852b4ea1ab08 ("crypto: api - Fix
zeroing on free") added a modular user of ksize(). Export that to fix
crypto.ko compilation.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Linus Torvalds [Thu, 12 Feb 2009 00:28:08 +0000 (16:28 -0800)]
Merge git://git.infradead.org/users/cbou/battery-2.6.29
* git://git.infradead.org/users/cbou/battery-2.6.29:
pcf50633_charger: Fix typo
Takashi Iwai [Wed, 11 Feb 2009 23:13:19 +0000 (00:13 +0100)]
ALSA: hda - Register (new) devices at reconfig
The devices that have been newly added during reconfig must be
registered. Otherwise they won't be visible to user-space.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Wed, 11 Feb 2009 23:06:42 +0000 (00:06 +0100)]
ALSA: mtpav - Fix initial value for input hwport
Fix the initial value for input hwport. The old value (-1) may cause
an Oops when a realtime MIDI byte is received before the input port is
explicitly given.
Instead, it is now set to broadcast by default.
Tested-by: Holger Dehnhardt <dehnhardt@ahdehnhardt.de>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Ian Dall [Wed, 11 Feb 2009 21:04:46 +0000 (13:04 -0800)]
w1: w1 temp calculation overflow fix
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12646
When the temperature exceeds 32767 millidegrees the temperature overflows
to -32768 millidegrees. These are both well within the -55 to +125 degree
range for the sensor.
Fix overflow in left-shift of a u8.
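A minimal sketch of this class of overflow, assuming a 16-bit two's-complement raw reading in 1/16 degree units with the low byte first. The function names and the byte layout are illustrative assumptions, not the driver's code; the point is only that the millidegree result must not be held in a 16-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Millidegrees computed in a plain int: no overflow for sensor values. */
int millideg_wide(uint8_t lsb, uint8_t msb)
{
    int16_t raw = (int16_t)(((uint16_t)msb << 8) | lsb);

    return (int)raw * 1000 / 16;
}

/* Same arithmetic, but the result is squeezed back into 16 bits. */
int millideg_narrow(uint8_t lsb, uint8_t msb)
{
    int16_t raw = (int16_t)(((uint16_t)msb << 8) | lsb);
    int16_t t = (int16_t)((int)raw * 1000 / 16);   /* truncates above 32767 */

    return t;
}
```

For a reading of 33 degrees (raw value 0x0210), the wide version yields 33000 millidegrees, while the narrow version cannot represent that value.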
Signed-off-by: Ian Dall <ian@beware.dropbear.id.au>
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Clements [Wed, 11 Feb 2009 21:04:45 +0000 (13:04 -0800)]
nbd: fix I/O hang on disconnected nbds
Fix a problem that causes I/O to a disconnected (or partially initialized)
nbd device to hang indefinitely. To reproduce:
# ioctl NBD_SET_SIZE_BLOCKS /dev/nbd23 514048
# dd if=/dev/nbd23 of=/dev/null bs=4096 count=1
...hangs...
This can also occur when an nbd device loses its nbd-client/server
connection. Although we clear the queue of any outstanding I/Os after the
client/server connection fails, any additional I/Os that get queued later
will hang.
This bug may also be the problem reported in this bug report:
http://bugzilla.kernel.org/show_bug.cgi?id=12277
Testing would need to be performed to determine if the two issues are the
same.
This problem was introduced by the new request handling thread code ("NBD:
allow nbd to be used locally", 3/2008), which entered into mainline around
2.6.25.
The fix, which is fairly simple, is to restore the check for lo->sock
being NULL in do_nbd_request. This causes I/O to an uninitialized nbd to
immediately fail with an I/O error, as it did prior to the introduction of
this bug.
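The shape of the restored check can be sketched in userspace as follows. The struct and function names here are hypothetical stand-ins, not the nbd driver's types; only the idea of failing a request immediately when no socket is attached comes from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct fake_sock { int fd; };            /* hypothetical stand-in for a socket */

struct fake_nbd {
    struct fake_sock *sock;              /* NULL while disconnected */
    int queued;                          /* requests handed to the worker */
};

/* Returns 0 if the request was queued, -EIO if the device has no socket. */
int fake_nbd_submit(struct fake_nbd *lo)
{
    if (lo->sock == NULL)
        return -EIO;                     /* fail fast instead of queueing forever */
    lo->queued++;
    return 0;
}
```

With this check a request against a disconnected device returns an error right away and nothing is left on the queue.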
Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Reported-by: Jon Nelson <jnelson-kernel-bugzilla@jamponi.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: <stable@kernel.org> [2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeremy Fitzhardinge [Wed, 11 Feb 2009 21:04:41 +0000 (13:04 -0800)]
mm: rearrange exit_mmap() to unlock before arch_exit_mmap
Christophe Saout reported [in precursor to:
http://marc.info/?l=linux-kernel&m=123209902707347&w=4]:
> Note that I also some a different issue with CONFIG_UNEVICTABLE_LRU.
> Seems like Xen tears down current->mm early on process termination, so
> that __get_user_pages in exit_mmap causes nasty messages when the
> process had any mlocked pages. (in fact, it somehow manages to get into
> the swapping code and produces a null pointer dereference trying to get
> a swap token)
Jeremy explained:
Yes. In the normal case under Xen, an in-use pagetable is "pinned",
meaning that it is RO to the kernel, and all updates must go via hypercall
(or writes are trapped and emulated, which is much the same thing). An
unpinned pagetable is not currently in use by any process, and can be
directly accessed as normal RW pages.
As an optimisation at process exit time, we unpin the pagetable as early
as possible (switching the process to init_mm), so that all the normal
pagetable teardown can happen with direct memory accesses.
This happens in exit_mmap() -> arch_exit_mmap(). The munlocking happens
a few lines below. The obvious thing to do would be to move
arch_exit_mmap() to below the munlock code, but I think we'd want to
call it even if mm->mmap is NULL, just to be on the safe side.
Thus, this patch:
exit_mmap() needs to unlock any locked vmas before calling arch_exit_mmap,
as the latter may switch the current mm to init_mm, which would cause the
former to fail.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Christophe Saout <christophe@saout.de>
Cc: Keir Fraser <keir.fraser@eu.citrix.com>
Cc: Alex Williamson <alex.williamson@hp.com>
Cc: <stable@kernel.org> [2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Slaby [Wed, 11 Feb 2009 21:04:40 +0000 (13:04 -0800)]
parport: parport_serial, don't bind netmos ibm 0299
Since netmos 9835 with subids 0x1014(IBM):0x0299 is now bound with
serial/8250_pci, because it has no parallel ports and subdevice id isn't
in the expected form, return -ENODEV from probe function.
This is performed in netmos preinit_hook.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Federico Cuello [Wed, 11 Feb 2009 21:04:39 +0000 (13:04 -0800)]
writeback: fix break condition
Commit dcf6a79dda5cc2a2bec183e50d829030c0972aaa ("write-back: fix
nr_to_write counter") fixed nr_to_write counter, but didn't set the break
condition properly.
If nr_to_write == 0 after being decremented it will loop one more time
before setting done = 1 and breaking the loop.
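The off-by-one can be seen in a small model of the loop. This is a sketch under simplifying assumptions: pages_written() and its parameters are invented names, the sync mode is ignored, and the two branches only approximate the code before and after the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns how many of 'dirty' pages get written with a given budget. */
int pages_written(int dirty, long nr_to_write, bool fixed)
{
    int written = 0;

    for (int i = 0; i < dirty; i++) {
        written++;                      /* write one page */
        if (fixed) {
            if (nr_to_write > 0) {
                nr_to_write--;
                if (nr_to_write == 0)
                    break;              /* budget used up: stop now */
            }
        } else {
            if (nr_to_write > 0)
                nr_to_write--;
            else
                break;                  /* noticed one iteration too late */
        }
    }
    return written;
}
```

With ten dirty pages and a budget of three, the first variant writes four pages and the second writes exactly three.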
[akpm@linux-foundation.org: coding-style fixes]
Cc: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Heiko Carstens [Wed, 11 Feb 2009 21:04:38 +0000 (13:04 -0800)]
syscall define: fix uml compile bug
With the new system call defines we get this on uml:
arch/um/sys-i386/built-in.o: In function `sys_call_table':
(.rodata+0x308): undefined reference to `sys_sigprocmask'
Reason for this is that uml passes the preprocessor option
-Dsigprocmask=kernel_sigprocmask to gcc when compiling the kernel.
This causes SYSCALL_DEFINE3(sigprocmask, ...) to be expanded to
SYSCALL_DEFINEx(3, kernel_sigprocmask, ...) and finally to a system
call named sys_kernel_sigprocmask. However sys_sigprocmask is missing
because of this.
To avoid macro expansion of the system call name, just concatenate the
name at the first define instead of carrying it through several levels.
This was pointed out by Al Viro.
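The preprocessor behaviour behind this can be demonstrated in isolation. The macros below are reduced stand-ins for the real SYSCALL_DEFINE* family (the OLD_/NEW_ names and the STR helper are made up for this sketch); they only stringify the resulting symbol name so the difference is visible.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the -Dsigprocmask=kernel_sigprocmask compiler option. */
#define sigprocmask kernel_sigprocmask

#define STR(x) #x

/* Old shape: the name travels through one more macro level as a plain
 * argument, so it is macro-expanded before it is pasted. */
#define OLD_DEFINEx(x, name) STR(sys_##name)
#define OLD_DEFINE3(name)    OLD_DEFINEx(3, name)

/* New shape: the name is an operand of ## at the first define, which
 * suppresses its expansion. */
#define NEW_DEFINEx(x, sname) STR(sys##sname)
#define NEW_DEFINE3(name)     NEW_DEFINEx(3, _##name)

const char *old_symbol(void) { return OLD_DEFINE3(sigprocmask); }
const char *new_symbol(void) { return NEW_DEFINE3(sigprocmask); }
```

old_symbol() yields "sys_kernel_sigprocmask" and new_symbol() yields "sys_sigprocmask", because an argument that is an operand of ## is not expanded before pasting.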
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Carsten Otte [Wed, 11 Feb 2009 21:04:37 +0000 (13:04 -0800)]
ext2/xip: refuse to change xip flag during remount with busy inodes
For a reason that I was unable to understand in three months of debugging,
mount ext2 -o remount stopped working properly when remounting from
regular operation to xip, or the other way around. According to a git
bisect search, the problem was introduced with the VM_MIXEDMAP/PTE_SPECIAL
rework in the vm:
commit 70688e4dd1647f0ceb502bbd5964fa344c5eb411
Author: Nick Piggin <npiggin@suse.de>
Date: Mon Apr 28 02:13:02 2008 -0700
xip: support non-struct page backed memory
In the failing scenario, the filesystem is mounted read only via root=
kernel parameter on s390x. During remount (in rc.sysinit), the inodes of
the bash binary and its libraries are busy and cannot be invalidated (the
bash which is running rc.sysinit resides on subject filesystem).
Afterwards, another bash process (running ifup-eth) recurses into a
subshell, runs dup_mm (via fork). Some of the mappings in this bash
process were created from inodes that could not be invalidated during
remount.
Both parent and child process crash some time later due to inconsistencies
in their address spaces. The issue seems to be timing sensitive; various
attempts to recreate it have failed.
This patch refuses to change the xip flag during remount in case some
inodes cannot be invalidated. This patch keeps users from running into
that issue.
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Jared Hulbert <jaredeh@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Li Zefan [Wed, 11 Feb 2009 21:04:36 +0000 (13:04 -0800)]
cgroups: fix lockdep subclasses overflow
I enabled all cgroup subsystems when compiling kernel, and then:
# mount -t cgroup -o net_cls xxx /mnt
# mkdir /mnt/0
This showed up immediately:
BUG: MAX_LOCKDEP_SUBCLASSES too low!
turning off the locking correctness validator.
It's caused by the cgroup hierarchy lock:
	for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
		struct cgroup_subsys *ss = subsys[i];
		if (ss->root == root)
			mutex_lock_nested(&ss->hierarchy_mutex, i);
	}
Now we have 9 cgroup subsystems, and the above 'i' for net_cls is 8, but
MAX_LOCKDEP_SUBCLASSES is 8.
This patch uses different lockdep keys for different subsystems.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Wed, 11 Feb 2009 21:04:35 +0000 (13:04 -0800)]
cgroups: add Li Zefan as a maintainer
Add Li Zefan as co-maintainer.
Acked-by: Paul Menage <menage@google.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roel Kluin [Wed, 11 Feb 2009 21:04:34 +0000 (13:04 -0800)]
rtc: t reaches -1, tested 0
With a postfix decrement t will reach -1 rather than 0, so neither the
warning nor the `goto error_out' will occur.
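A short sketch of the two decrement forms, using a condition that never becomes true so that the loop always times out. The function names are invented for illustration and this is not the driver's loop.

```c
#include <assert.h>

/* Models hardware that never signals readiness. */
static int never_ready(void) { return 0; }

/* Postfix decrement: the loop exits after t has already gone to -1. */
int left_after_postfix(int t)
{
    while (!never_ready() && t--)
        ;
    return t;
}

/* Prefix decrement: the loop exits with t == 0. */
int left_after_prefix(int t)
{
    while (!never_ready() && --t)
        ;
    return t;
}
```

After a timeout the postfix form leaves t at -1, so a following "if (t == 0)" test never fires; either the test has to compare against -1 or the loop has to use the prefix form.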
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Wed, 11 Feb 2009 21:04:33 +0000 (13:04 -0800)]
kernel-doc: fix syscall wrapper processing
Fix kernel-doc processing of SYSCALL wrappers.
The SYSCALL wrapper patches played havoc with kernel-doc for
syscalls. Syscalls that were scanned for DocBook processing
reported warnings like this one, for sys_tgkill:
Warning(kernel/signal.c:2285): No description found for parameter 'tgkill'
Warning(kernel/signal.c:2285): No description found for parameter 'pid_t'
Warning(kernel/signal.c:2285): No description found for parameter 'int'
because the macro parameters all "look like" function parameters,
although they are not:
/**
 * sys_tgkill - send signal to one specific thread
 * @tgid: the thread group ID of the thread
 * @pid: the PID of the thread
 * @sig: signal to be sent
 *
 * This syscall also checks the @tgid and returns -ESRCH even if the PID
 * exists but it's not belonging to the target process anymore. This
 * method solves the problem of threads exiting and PIDs getting reused.
 */
SYSCALL_DEFINE3(tgkill, pid_t, tgid, pid_t, pid, int, sig)
{
	...
This patch special-cases the handling of SYSCALL_DEFINE* function
prototypes by expanding them to
long sys_foobar(type1 arg1, type2 arg2, ...)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Wed, 11 Feb 2009 21:04:31 +0000 (13:04 -0800)]
kernel-doc: preferred ending marker and examples
Fix kernel-doc-nano-HOWTO.txt to use */ as the ending marker in kernel-doc
examples and state that */ is the preferred ending marker.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Wed, 11 Feb 2009 21:04:29 +0000 (13:04 -0800)]
memcg: use __GFP_NOWARN in page cgroup allocation
page_cgroup's page allocation at init/memory hotplug uses kmalloc() and
vmalloc(). If kmalloc() fails, vmalloc() is used.
This is because vmalloc() space is a very limited resource on 32bit
systems, so we want to try kmalloc() first.
But in this kind of call, where a fallback exists, __GFP_NOWARN should
be specified.
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Uwe Kleine-Koenig [Wed, 11 Feb 2009 21:04:28 +0000 (13:04 -0800)]
video/framebuffer: move the probe func into .devinit.text in Blackfin LCD driver
Signed-off-by: Uwe Kleine-Koenig <ukleinek@strlen.de>
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marcel Selhorst [Wed, 11 Feb 2009 21:04:27 +0000 (13:04 -0800)]
tpm: correct email address for tpm_infineon-driver
Update my email address.
Signed-off-by: Marcel Selhorst <m.selhorst@sirrix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
MinChan Kim [Wed, 11 Feb 2009 21:04:27 +0000 (13:04 -0800)]
mm: fix mlocked page counter mismatch
When I tested the following program, I found that the mlocked counter
is strange. It cannot free some mlocked pages.
It is because try_to_unmap_file() doesn't check real
page mappings in vmas.
That is because the goal of an address_space for a file is to find all
processes into which the file's specific interval is mapped. It is
related to the file's interval, not to pages.
Even if the page isn't really mapped by the vma, it returns SWAP_MLOCK
since the vma has VM_LOCKED, then calls try_to_mlock_page. After this the
mlocked counter is increased again.
A COWed anon page in a file-backed vma could be such a case. This patch
resolves it.
-- my test program --
#include <sys/mman.h>

int main()
{
	mlockall(MCL_CURRENT);
	return 0;
}
-- before --
root@barrios-target-linux:~# cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable: 0 kB
Mlocked: 0 kB
-- after --
root@barrios-target-linux:~# cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable: 8 kB
Mlocked: 8 kB
Signed-off-by: MinChan Kim <minchan.kim@gmail.com>
Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tested-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Wed, 11 Feb 2009 21:04:26 +0000 (13:04 -0800)]
ext3: revert "ext3: wait on all pending commits in ext3_sync_fs"
This reverts commit c87591b719737b4e91eb1a9fa8fd55a4ff1886d6.
Since journal_start_commit() is now fixed to return 1 when we started a
transaction commit, there's some transaction waiting to be committed or
there's a transaction already committing, we don't need to call
ext3_force_commit() in ext3_sync_fs(). Furthermore ext3_force_commit()
can unnecessarily create a sync transaction, which is expensive, so it's
worthwhile to remove it when we can.
Cc: Eric Sandeen <sandeen@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Wed, 11 Feb 2009 21:04:25 +0000 (13:04 -0800)]
jbd: fix return value of journal_start_commit()
journal_start_commit() returns 1 if either a transaction is committing or
the function has queued a transaction commit. But it returns 0 if we
raced with somebody queueing the transaction commit as well. This
resulted in ext3_sync_fs() not functioning correctly (description from
Arthur Jones): In the case of a data=ordered umount with pending long
symlinks which are delayed due to a long list of other I/O on the backing
block device, this causes the buffer associated with the long symlinks to
not be moved to the inode dirty list in the second phase of fsync_super.
Then, before they can be dirtied again, kjournald exits, seeing the UMOUNT
flag and the dirty pages are never written to the backing block device,
causing long symlink corruption and exposing new or previously freed block
data to userspace.
This can be reproduced with a script created by Eric Sandeen
<sandeen@redhat.com>:
#!/bin/bash
umount /mnt/test2
mount /dev/sdb4 /mnt/test2
rm -f /mnt/test2/*
dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512
touch /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
ln -s /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename /mnt/test2/link
umount /mnt/test2
mount /dev/sdb4 /mnt/test2
ls /mnt/test2/
This patch fixes journal_start_commit() to always return 1 when there's
a transaction committing or queued for commit.
Cc: Eric Sandeen <sandeen@redhat.com>
Cc: Mike Snitzer <snitzer@gmail.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sven Wegener [Wed, 11 Feb 2009 21:04:23 +0000 (13:04 -0800)]
mm: fix dirty_bytes/dirty_background_bytes sysctls on 64bit arches
We need to pass an unsigned long as the minimum, because it gets cast
to an unsigned long in the sysctl handler. If we pass an int, we'll
access four more bytes on 64bit arches, resulting in a random minimum
value.
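The width mismatch can be modelled safely in userspace with fixed-width types. This is a sketch, not the sysctl handler: the function names are invented, and the second array element stands in for whatever happens to follow the int in memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read 64 bits from storage in which only the first 32 bits hold the
 * intended minimum; 'neighbour' models the four unrelated bytes behind it.
 */
uint64_t read_min_as_wide(uint32_t min, uint32_t neighbour)
{
    uint32_t storage[2] = { min, neighbour };
    uint64_t wide;

    memcpy(&wide, storage, sizeof(wide));
    return wide;
}

/* Storing the minimum at full width gives the reader what it expects. */
uint64_t read_min_stored_wide(uint64_t min)
{
    uint64_t storage = min;
    uint64_t wide;

    memcpy(&wide, &storage, sizeof(wide));
    return wide;
}
```

With a minimum of 1 and non-zero neighbouring bytes, the first function returns something other than 1 on both little-endian and big-endian machines, which corresponds to the "random minimum value" described above.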
[rientjes@google.com: fix type of `old_bytes']
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andres Salomon [Wed, 11 Feb 2009 21:04:23 +0000 (13:04 -0800)]
gx1fb: properly alloc cmap and plug cmap leak
We weren't properly allocating the cmap for depths greater than 8bpp,
which caused pain for things like DirectFB. Also, we never freed the cmap
memory upon module unload.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Cc: Marco La Porta <marco-laporta@tiscali.it>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andres Salomon [Wed, 11 Feb 2009 21:04:22 +0000 (13:04 -0800)]
gxfb: properly alloc cmap and plug cmap leak
We weren't properly allocating the cmap for depths greater than 8bpp,
which caused pain for things like DirectFB. Also, we never freed the cmap
memory upon module unload.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Cc: Marco La Porta <marco-laporta@tiscali.it>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marco La Porta [Wed, 11 Feb 2009 21:04:20 +0000 (13:04 -0800)]
lxfb: properly alloc cmap in all cases and don't leak the memory
We weren't properly allocating the cmap for depths greater than 8bpp,
which caused pain for things like DirectFB. Also, we never freed the cmap
memory upon module unload.
[dilinger@debian.org: dropped unnecessary code and clean up patch]
[dilinger@debian.org: add error checking and handling]
Signed-off-by: Andres Salomon <dilinger@debian.org>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robert Jarzmik [Wed, 11 Feb 2009 21:04:19 +0000 (13:04 -0800)]
rtc: update maintainership of pxa rtc driver
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daisuke Nishimura [Wed, 11 Feb 2009 21:04:18 +0000 (13:04 -0800)]
migration: migrate_vmas should check "vma"
migrate_vmas() should check "vma", not "vma->vm_next", as the for-loop condition.
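The difference between the two loop conditions shows up on any singly linked list. The struct and function names below are hypothetical and only mimic the vm_next chaining; this is a sketch, not the migration code.

```c
#include <assert.h>
#include <stddef.h>

struct fake_vma {
    struct fake_vma *vm_next;
};

/* Condition on vma->vm_next: stops before the last element, and would
 * dereference NULL if the list were empty. */
int visit_checking_next(struct fake_vma *head)
{
    int visited = 0;
    struct fake_vma *vma;

    for (vma = head; vma->vm_next; vma = vma->vm_next)
        visited++;
    return visited;
}

/* Condition on vma: visits every element and copes with an empty list. */
int visit_checking_vma(struct fake_vma *head)
{
    int visited = 0;
    struct fake_vma *vma;

    for (vma = head; vma; vma = vma->vm_next)
        visited++;
    return visited;
}
```

On a three-element list the first loop visits two elements and the second visits all three.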
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Wed, 11 Feb 2009 16:34:16 +0000 (16:34 +0000)]
Do not account for hugetlbfs quota at mmap() time if mapping [SHM|MAP]_NORESERVE
Commit 5a6fe125950676015f5108fb71b2a67441755003 brought hugetlbfs more
in line with the core VM by obeying VM_NORESERVE and not reserving
hugepages for both shared and private mappings when [SHM|MAP]_NORESERVE
are specified. However, it is still taking filesystem quota
unconditionally.
At fault time, if there are no reserves, an attempt is made to allocate
the page and account for filesystem quota. If either fails, the fault
fails. The impact is that quota is getting accounted for twice. This
patch partially reverts 5a6fe125950676015f5108fb71b2a67441755003. To
help prevent this mistake happening again, it improves the documentation
of hugetlb_reserve_pages().
Reported-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reinette Chatre [Tue, 10 Feb 2009 20:02:49 +0000 (12:02 -0800)]
iwlwifi: fix suspend/resume and its usage of pci saved state
Here we do two things:
First, revert "iwlwifi: save PCI state before suspend, restore after
resume". That misguided patch led to being unable to use iwlwifi
devices after resume.
Next, indicate to PCI driver that the saved PCI state is valid during suspend.
We restore PCI state and enable the device when network interface is created,
similarly PCI state is saved and the device is disabled when network interface
is removed. Thus, when .suspend is called the PCI state is saved and device
is disabled. This is the case even if an interface is never created as PCI
state is saved and device disabled during .probe.
PCI driver assumes PCI state is saved in .suspend. Saving the state at this
time will save state of disabled device and thus cause problems during
resume (resuming a disabled device). We thus indicate directly to PCI
driver that current PCI saved state is valid.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Alex Riesen <fork0@users.sf.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Hin-Tak Leung [Wed, 4 Feb 2009 23:40:43 +0000 (23:40 +0000)]
zd1211rw: treat MAXIM_NEW_RF(0x08) as UW2453_RF(0x09) for TP-Link WN322/422G
Three people (Petr Mensik <pihhan@cipis.net>
["si" should be U+0161 U+00ED], Stephen Ho <stephenhoinhk@gmail.com>
on zd1211-devs and Ismael Ojeda Perez <iojedaperez@gmail.com>
on linux-wireless) reported success in getting TP-Link WN322G/WN422G
working by treating MAXIM_NEW_RF(0x08) as UW2453_RF(0x09) for rf
chip hardware initialization.
Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Tested-by: Petr Mensik <pihhan@cipis.net>
Tested-by: Stephen Ho <stephenhoinhk@gmail.com>
Tested-by: Ismael Ojeda Perez <iojedaperez@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Hin-Tak Leung [Sun, 8 Feb 2009 02:13:56 +0000 (02:13 +0000)]
zd1211rw: adding 0ace:0xa211 as a ZD1211 device
Christoph Biedl <sourceforge.bnwi@manchmal.in-ulm.de> reported success
in the sourceforge zd1211 mailing list on this addition. This product ID
was supported by the vendor driver ZD1211LnxDrv 2.22.0.0 (and possibly
earlier) and it probably should have been added earlier.
Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Tested-by: Christoph Biedl <sourceforge.bnwi@manchmal.in-ulm.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Thu, 5 Feb 2009 23:27:32 +0000 (00:27 +0100)]
mac80211: restrict to AP in outgoing interface heuristic
We try to find the correct outgoing interface for injected frames
based on the TA, but since this is a hack for hostapd 11w, restrict
the heuristic to AP mode interfaces. At some point we'll add the
ability to give an interface index in radiotap or so and just
remove this heuristic again.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.28.x]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bob Copeland [Sat, 10 Jan 2009 19:42:54 +0000 (14:42 -0500)]
ath5k: fix bf->skb==NULL panic in ath5k_tasklet_rx
Under memory pressure, we may not be able to allocate a new skb for
new packets. If the allocation fails, ath5k_tasklet_rx will exit but
will leave a buffer in the list with a NULL skb, eventually triggering
a BUG_ON.
Extract the skb allocation from ath5k_rxbuf_setup() and change the
tasklet to allocate the next skb before accepting a packet.
Changes-licensed-under: 3-Clause-BSD
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Linus Torvalds [Wed, 11 Feb 2009 16:25:06 +0000 (08:25 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: revert recent sync wakeup changes
Linus Torvalds [Wed, 11 Feb 2009 16:24:32 +0000 (08:24 -0800)]
Merge branch 'timers-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
timers: fix TIMER_ABSTIME for process wide cpu timers
timers: split process wide cpu clocks/timers, fix
x86: clean up hpet timer reinit
timers: split process wide cpu clocks/timers, remove spurious warning
timers: split process wide cpu clocks/timers
signal: re-add dead task accumulation stats.
x86: fix hpet timer reinit for x86_64
sched: fix nohz load balancer on cpu offline
Linus Torvalds [Wed, 11 Feb 2009 16:23:22 +0000 (08:23 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
ptrace, x86: fix the usage of ptrace_fork()
i8327: fix outb() parameter order
x86: fix math_emu register frame access
x86: math_emu info cleanup
x86: include correct %gs in a.out core dump
x86, vmi: put a missing paravirt_release_pmd in pgd_dtor
x86: find nr_irqs_gsi with mp_ioapic_routing
x86: add clflush before monitor for Intel 7400 series
x86: disable intel_iommu support by default
x86: don't apply __supported_pte_mask to non-present ptes
x86: fix grammar in user-visible BIOS warning
x86/Kconfig.cpu: make Kconfig help readable in the console
x86, 64-bit: print DMI info in the oops trace
Linus Torvalds [Wed, 11 Feb 2009 16:22:26 +0000 (08:22 -0800)]
Merge branch 'tracing-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing, x86: fix constraint for parent variable
tracing, x86: fix fixup section to return to original code
profiling: fix broken profiling regression
Linus Torvalds [Wed, 11 Feb 2009 16:21:29 +0000 (08:21 -0800)]
Merge branch 'for-linus' of git://git390.marist.edu/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] Update default configuration.
[S390] dasd: fix race in dasd timer handling
[S390] dasd: bus_id -> dev_name() conversion.
[S390] Fix init irq proc build break.
[S390] vdso: fix per cpu vdso pointer in lowcore
Linus Torvalds [Wed, 11 Feb 2009 16:21:11 +0000 (08:21 -0800)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/mm: Fix _PAGE_COHERENT support on classic ppc32 HW
Peter Zijlstra [Wed, 11 Feb 2009 13:27:17 +0000 (14:27 +0100)]
sched: revert recent sync wakeup changes
Intel reported a 10% regression (mysql+sysbench) on a 16-way machine
with these patches:
1596e29: sched: symmetric sync vs avg_overlap
d942fb6: sched: fix sync wakeups
Revert them.
Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Bisected-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 11 Feb 2009 10:30:27 +0000 (11:30 +0100)]
timers: fix TIMER_ABSTIME for process wide cpu timers
The POSIX timer interface allows for absolute time expiry values through the
TIMER_ABSTIME flag, therefore we have to synchronize the timer to the clock
every time we start it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Tue, 10 Feb 2009 15:37:31 +0000 (16:37 +0100)]
timers: split process wide cpu clocks/timers, fix
To decrease the chance of a missed enable, always enable the timer when we
sample it. We'll always disable it when we find that there are no active
timers in the jiffy tick.
This fixes a flood of warnings reported by Mike Galbraith.
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Martin Schwidefsky [Wed, 11 Feb 2009 09:37:32 +0000 (10:37 +0100)]
[S390] Update default configuration.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Stefan Weinhuber [Wed, 11 Feb 2009 09:37:31 +0000 (10:37 +0100)]
[S390] dasd: fix race in dasd timer handling
In dasd_device_set_timer and dasd_block_set_timer we interpret the
return value of mod_timer incorrectly. If the timer expires in
the small window between our check of timer_pending and the call to
mod_timer, then the timer will be set, mod_timer returns zero and
we will call add_timer for a timer that is already pending.
As del_timer and mod_timer do all the necessary checking themselves,
we can simplify our code and remove the race at the same time.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cornelia Huck [Wed, 11 Feb 2009 09:37:30 +0000 (10:37 +0100)]
[S390] dasd: bus_id -> dev_name() conversion.
bus_id usage crept in again; fix it.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>