GitHub/LineageOS/G12/android_kernel_amlogic_linux-4.9.git
16 years agosctp: Correctly cleanup procfs entries upon failure.
Wei Yongjun [Mon, 16 Jun 2008 23:59:55 +0000 (16:59 -0700)]
sctp: Correctly cleanup procfs entries upon failure.

This patch remove the proc fs entry which has been created if fail to
set up proc fs entry for the SCTP protocol.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agotcp: Revert reset of deferred accept changes in 2.6.26
David S. Miller [Mon, 16 Jun 2008 23:57:40 +0000 (16:57 -0700)]
tcp: Revert reset of deferred accept changes in 2.6.26

Ingo's system is still seeing strange behavior, and he
reports that is goes away if the rest of the deferred
accept changes are reverted too.

Therefore this reverts e4c78840284f3f51b1896cf3936d60a6033c4d2c
("[TCP]: TCP_DEFER_ACCEPT updates - dont retxmt synack") and
539fae89bebd16ebeafd57a87169bc56eb530d76 ("[TCP]: TCP_DEFER_ACCEPT
updates - defer timeout conflicts with max_thresh").

Just like the other revert, these ideas can be revisited for
2.6.27

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoipv6 sit: Avoid extra need for compat layer in PRL management.
YOSHIFUJI Hideaki [Mon, 16 Jun 2008 23:48:20 +0000 (16:48 -0700)]
ipv6 sit: Avoid extra need for compat layer in PRL management.

We've introduced extra need of compat layer for ip_tunnel_prl{}
for PRL (Potential Router List) management.  Though compat_ioctl
is still missing in ipv4/ipv6, let's make the interface more
straight-forward and eliminate extra need for nasty compat layer
anyway since the interface is new for 2.6.26.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agopkt_sched: Change HTB_HYSTERESIS to a runtime parameter htb_hysteresis.
Jesper Dangaard Brouer [Mon, 16 Jun 2008 23:39:32 +0000 (16:39 -0700)]
pkt_sched: Change HTB_HYSTERESIS to a runtime parameter htb_hysteresis.

Add a htb_hysteresis parameter to htb_sch.ko and by sysfs magic make
it runtime adjustable via
/sys/module/sch_htb/parameters/htb_hysteresis mode 640.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Martin Devera <devik@cdi.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agopkt_sched: HTB scheduler, change default hysteresis mode to off.
Jesper Dangaard Brouer [Mon, 16 Jun 2008 23:38:33 +0000 (16:38 -0700)]
pkt_sched: HTB scheduler, change default hysteresis mode to off.

The HTB hysteresis mode reduce the CPU load, but at the
cost of scheduling accuracy.

On ADSL links (512 kbit/s upstream), this inaccuracy introduce
significant jitter, enought to disturbe VoIP.  For details see my
masters thesis (http://www.adsl-optimizer.dk/thesis/), chapter 7,
section 7.3.1, pp 69-70.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Martin Devera <devik@cdi.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireles...
David S. Miller [Sun, 15 Jun 2008 00:33:38 +0000 (17:33 -0700)]
Merge branch 'master' of /linux/kernel/git/linville/wireless-2.6

16 years agort2x00: Add D-link DWA111 support
Ivo van Doorn [Thu, 12 Jun 2008 18:47:17 +0000 (20:47 +0200)]
rt2x00: Add D-link DWA111 support

Add new rt73usb USB ID for D-Link DWA111

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agomac80211: add missing new line in debug print HT_DEBUG
Tomas Winkler [Thu, 12 Jun 2008 17:08:19 +0000 (20:08 +0300)]
mac80211: add missing new line in debug print HT_DEBUG

This patch adds '\n' in debug printk (wme.c HT DEBUG)

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agomac80211 : fix for iwconfig in ad-hoc mode
Abhijeet Kolekar [Thu, 12 Jun 2008 01:47:16 +0000 (09:47 +0800)]
mac80211 : fix for iwconfig in ad-hoc mode

The patch checks interface status, if it is in IBSS_JOINED mode
show cell id it is associated with.

Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agossb: Fix coherent DMA mask for PCI devices
Michael Buesch [Thu, 12 Jun 2008 13:33:13 +0000 (15:33 +0200)]
ssb: Fix coherent DMA mask for PCI devices

This fixes setting the coherent DMA mask for PCI devices.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agort2x00: LEDS build failure
Randy Dunlap [Wed, 11 Jun 2008 20:32:22 +0000 (13:32 -0700)]
rt2x00: LEDS build failure

Config symbols that select LEDS_CLASS need to depend on NEW_LEDS so that
undefined symbols are not used in the build.

The alternative is to select NEW_LEDS, which some drivers do.

This patch fixes the led_* symbols build errors.

(.text+0x174cdc): undefined reference to `input_unregister_device'
(.text+0x174d9f): undefined reference to `input_allocate_device'
(.text+0x174e2d): undefined reference to `input_register_device'
(.text+0x174e53): undefined reference to `input_free_device'
rt2x00rfkill.c:(.text+0x176dc4): undefined reference to `input_allocate_polled_device'
rt2x00rfkill.c:(.text+0x176e8b): undefined reference to `input_event'
rt2x00rfkill.c:(.text+0x176e9f): undefined reference to `input_event'
(.text+0x176eca): undefined reference to `input_unregister_polled_device'
(.text+0x176efc): undefined reference to `input_free_polled_device'
(.text+0x176f37): undefined reference to `input_free_polled_device'
(.text+0x176fd8): undefined reference to `input_register_polled_device'
(.text+0x1772c0): undefined reference to `led_classdev_resume'
(.text+0x1772d4): undefined reference to `led_classdev_resume'
(.text+0x1772e8): undefined reference to `led_classdev_resume'
(.text+0x17730a): undefined reference to `led_classdev_suspend'
(.text+0x17731e): undefined reference to `led_classdev_suspend'
(.text+0x17732f): undefined reference to `led_classdev_suspend'
rt2x00leds.c:(.text+0x177348): undefined reference to `led_classdev_unregister'
rt2x00leds.c:(.text+0x1773c0): undefined reference to `led_classdev_register'
rfkill-input.c:(.text+0x209e4c): undefined reference to `input_close_device'
rfkill-input.c:(.text+0x209e53): undefined reference to `input_unregister_handle'
rfkill-input.c:(.text+0x209ea1): undefined reference to `input_register_handle'
rfkill-input.c:(.text+0x209eae): undefined reference to `input_open_device'
rfkill-input.c:(.text+0x209ebb): undefined reference to `input_unregister_handle'
rfkill-input.c:(.init.text+0x17405): undefined reference to `input_register_handler'
rfkill-input.c:(.exit.text+0x194f): undefined reference to `input_unregister_handler'
make[1]: *** [vmlinux] Error 1

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agort2x00: INPUT build failure
Randy Dunlap [Wed, 11 Jun 2008 19:57:58 +0000 (12:57 -0700)]
rt2x00: INPUT build failure

Config symbols that select RFKILL need to depend on INPUT so that
undefined symbols are not used in the build.

This patch fixes the input_* symbols build errors.

(.text+0x174cdc): undefined reference to `input_unregister_device'
(.text+0x174d9f): undefined reference to `input_allocate_device'
(.text+0x174e2d): undefined reference to `input_register_device'
(.text+0x174e53): undefined reference to `input_free_device'
rt2x00rfkill.c:(.text+0x176dc4): undefined reference to `input_allocate_polled_device'
rt2x00rfkill.c:(.text+0x176e8b): undefined reference to `input_event'
rt2x00rfkill.c:(.text+0x176e9f): undefined reference to `input_event'
(.text+0x176eca): undefined reference to `input_unregister_polled_device'
(.text+0x176efc): undefined reference to `input_free_polled_device'
(.text+0x176f37): undefined reference to `input_free_polled_device'
(.text+0x176fd8): undefined reference to `input_register_polled_device'
(.text+0x1772c0): undefined reference to `led_classdev_resume'
(.text+0x1772d4): undefined reference to `led_classdev_resume'
(.text+0x1772e8): undefined reference to `led_classdev_resume'
(.text+0x17730a): undefined reference to `led_classdev_suspend'
(.text+0x17731e): undefined reference to `led_classdev_suspend'
(.text+0x17732f): undefined reference to `led_classdev_suspend'
rt2x00leds.c:(.text+0x177348): undefined reference to `led_classdev_unregister'
rt2x00leds.c:(.text+0x1773c0): undefined reference to `led_classdev_register'
rfkill-input.c:(.text+0x209e4c): undefined reference to `input_close_device'
rfkill-input.c:(.text+0x209e53): undefined reference to `input_unregister_handle'
rfkill-input.c:(.text+0x209ea1): undefined reference to `input_register_handle'
rfkill-input.c:(.text+0x209eae): undefined reference to `input_open_device'
rfkill-input.c:(.text+0x209ebb): undefined reference to `input_unregister_handle'
rfkill-input.c:(.init.text+0x17405): undefined reference to `input_register_handler'
rfkill-input.c:(.exit.text+0x194f): undefined reference to `input_unregister_handler'
make[1]: *** [vmlinux] Error 1

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agob43: Fix noise calculation WARN_ON
Michael Buesch [Thu, 12 Jun 2008 10:36:29 +0000 (12:36 +0200)]
b43: Fix noise calculation WARN_ON

This removes a WARN_ON that is responsible for the following koops:
http://www.kerneloops.org/searchweek.php?search=b43_generate_noise_sample

The comment in the patch describes why it's safe to simply remove
the check.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agob43: Fix possible NULL pointer dereference in DMA code
Michael Buesch [Thu, 12 Jun 2008 09:58:56 +0000 (11:58 +0200)]
b43: Fix possible NULL pointer dereference in DMA code

This fixes a possible NULL pointer dereference in an error path of the
DMA allocation error checking code. This is also necessary for a future
DMA API change that is on its way into the mainline kernel that adds
an additional dev parameter to dma_mapping_error().

This patch moves the whole struct b43_dmaring struct initialization
right before any DMA allocation operation.

Reported-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agort2x00: Restrict DMA to 32-bit addresses.
Gertjan van Wingerde [Tue, 3 Jun 2008 18:29:47 +0000 (20:29 +0200)]
rt2x00: Restrict DMA to 32-bit addresses.

None of the rt2x00 PCI devices support 64-bit DMA addresses (they all
only accept 32-bit buffer addresses). Hence it makes no sense to try to
enable 64-bit DMA addresses. Only try to enable 32-bit DMA addresses.

Signed-off-by: Gertjan van Wingerde <gwingerde@kpnplanet.nl>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agort2x00: Don't kill guardian_urb when it wasn't created
Ivo van Doorn [Tue, 3 Jun 2008 18:29:50 +0000 (20:29 +0200)]
rt2x00: Don't kill guardian_urb when it wasn't created

This fixes a "BUG: unable to handle kernel paging request"
bug in rt73usb which was caused by killing the guardian_urb
while it had never been allocated for rt73usb.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6
Linus Torvalds [Fri, 13 Jun 2008 16:47:07 +0000 (09:47 -0700)]
Merge git://git./linux/kernel/git/kyle/parisc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
  parisc: update my email address
  parisc: fix miscompilation of ip_fast_csum with gcc >= 4.3
  parisc: fix off by one in setup_sigcontext32
  parisc: export empty_zero_page
  parisc: export copy_user_page_asm
  parisc: move head.S to head.text section
  Revert "parisc: fix trivial section name warnings"

16 years agoparisc: update my email address
Kyle McMartin [Fri, 6 Jun 2008 21:16:17 +0000 (17:16 -0400)]
parisc: update my email address

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
16 years agoparisc: fix miscompilation of ip_fast_csum with gcc >= 4.3
Kyle McMartin [Sat, 31 May 2008 16:15:42 +0000 (12:15 -0400)]
parisc: fix miscompilation of ip_fast_csum with gcc >= 4.3

ip_fast_csum needs an asm "memory" clobber, otherwise the aggressive
optimizations in gcc-4.3 cause it to be miscompiled.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
16 years agoparisc: fix off by one in setup_sigcontext32
Kyle McMartin [Tue, 27 May 2008 05:56:29 +0000 (01:56 -0400)]
parisc: fix off by one in setup_sigcontext32

Thankfully, the values were irrelevant... Spotted by
newer gcc.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
16 years agoparisc: export empty_zero_page
Kyle McMartin [Mon, 26 May 2008 05:54:35 +0000 (01:54 -0400)]
parisc: export empty_zero_page

Needed by ext4 when built as a module.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
16 years agoparisc: export copy_user_page_asm
Kyle McMartin [Mon, 26 May 2008 05:49:01 +0000 (01:49 -0400)]
parisc: export copy_user_page_asm

Needed by fuse (via copy_highpage).

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
16 years agoparisc: move head.S to head.text section
Kyle McMartin [Thu, 22 May 2008 18:38:26 +0000 (14:38 -0400)]
parisc: move head.S to head.text section

And explicitly list it in vmlinux.lds...

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
16 years agoRevert "parisc: fix trivial section name warnings"
Kyle McMartin [Thu, 22 May 2008 18:36:31 +0000 (14:36 -0400)]
Revert "parisc: fix trivial section name warnings"

This reverts commit bd3bb8c15b9a80dbddfb7905b237a4a11a4725b4.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
16 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzi...
Linus Torvalds [Fri, 13 Jun 2008 14:40:57 +0000 (07:40 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  ahci: Workaround HW bug for SB600/700 SATA controller PMP support
  ahci: workarounds for mcp65

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Fri, 13 Jun 2008 14:34:47 +0000 (07:34 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  tcp: Revert 'process defer accept as established' changes.
  ipv6: Fix duplicate initialization of rawv6_prot.destroy
  bnx2x: Updating the Maintainer
  net: Eliminate flush_scheduled_work() calls while RTNL is held.
  drivers/net/r6040.c: correct bad use of round_jiffies()
  fec_mpc52xx: MPC52xx_MESSAGES_DEFAULT: 2nd NETIF_MSG_IFDOWN => IFUP
  ipg: fix receivemode IPG_RM_RECEIVEMULTICAST{,HASH} in ipg_nic_set_multicast_list()
  netfilter: nf_conntrack: fix ctnetlink related crash in nf_nat_setup_info()
  netfilter: Make nflog quiet when no one listen in userspace.
  ipv6: Fail with appropriate error code when setting not-applicable sockopt.
  ipv6: Check IPV6_MULTICAST_LOOP option value.
  ipv6: Check the hop limit setting in ancillary data.
  ipv6 route: Fix route lifetime in netlink message.
  ipv6 mcast: Check address family of gf_group in getsockopt(MS_FILTER).
  dccp: Bug in initial acknowledgment number assignment
  dccp ccid-3: X truncated due to type conversion
  dccp ccid-3: TFRC reverse-lookup Bug-Fix
  dccp ccid-2: Bug-Fix - Ack Vectors need to be ignored on request sockets
  dccp: Fix sparse warnings
  dccp ccid-3: Bug-Fix - Zero RTT is possible

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Fri, 13 Jun 2008 14:34:01 +0000 (07:34 -0700)]
Merge git://git./linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc: get leo framebuffer working

16 years agoahci: Workaround HW bug for SB600/700 SATA controller PMP support
Shane Huang [Tue, 10 Jun 2008 07:52:04 +0000 (15:52 +0800)]
ahci: Workaround HW bug for SB600/700 SATA controller PMP support

There is one bug in ATI SATA PMP of SB600 and SB700 old revision, which leads
to soft reset failure. This patch can fix the bug.

Signed-off-by: Shane Huang <shane.huang@amd.com>
Acked-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agoahci: workarounds for mcp65
Tejun Heo [Mon, 9 Jun 2008 15:13:04 +0000 (00:13 +0900)]
ahci: workarounds for mcp65

MCP65 ahci can do NCQ but doesn't set the CAP bit and rev A0 and A1
can't do MSI but have MSI capability.  Implement AHCI_HFLAG_YES_NCQ
and apply appropriate workarounds.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Peer Chen <pchen@nvidia.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agoMerge master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Fri, 13 Jun 2008 02:37:29 +0000 (19:37 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 5091/1: Add missing bitfield include to regs-lcd.h
  [ARM] 5090/1: Correct pxafb palette typo error
  [ARM] 5077/1: spi: fix list scan success verification in PXA ssp driver

16 years agortc: Ramtron FM3130 RTC support
Sergey Lapin [Thu, 12 Jun 2008 22:21:55 +0000 (15:21 -0700)]
rtc: Ramtron FM3130 RTC support

Ramtron FM3130 is a chip with two separate devices inside, RTC clock and
FRAM.  This driver provides only RTC functionality.

This chip is met in lots of custom boards with AT91SAMXXXX CPU I work
with, is cheap and in no way better or worse than any other RTC on market.
 While it is mostly met on much smaller devices, I think it is great to
have it supported in Linux.

Signed-off-by: Sergey Lapin <slapin@ossfans.org>
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agortc: make HPET_RTC_IRQ track HPET_EMULATE_RTC
David Brownell [Thu, 12 Jun 2008 22:21:55 +0000 (15:21 -0700)]
rtc: make HPET_RTC_IRQ track HPET_EMULATE_RTC

More Kconfig tweaks related to the legacy PC RTC code:

 - Describe the legacy PC RTC driver as such ... it's never quite
   been clear that this driver is for PC RTCs, and now it's fair
   to call this the "legacy" driver.

 - Force it to understand about HPET stealing its IRQs ... kernel
   code does this always when HPET is in use, there should be no
   option for users to goof up the config.

This seems to fix kernel bugzilla #10729.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoprovide rtc_cmos platform device
Stas Sergeev [Thu, 12 Jun 2008 22:21:54 +0000 (15:21 -0700)]
provide rtc_cmos platform device

Recently (around 2.6.25) I've noticed that RTC no longer works for me.  It
turned out this is because I use pnpacpi=off kernel option to work around
the parport_pc bugs.  I always did so, but RTC used to work fine in the
past, and now it have regressed.

The patch fixes the problem by creating the platform device for the RTC
when PNP is disabled.  This may also help running the PNP-enabled kernel
on an older PCs.

Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Cc: David Brownell <david-b@pacbell.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Adam Belay <ambx1@neo.rr.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoradix-tree: fix small lockless radix-tree bug
Nick Piggin [Thu, 12 Jun 2008 22:21:52 +0000 (15:21 -0700)]
radix-tree: fix small lockless radix-tree bug

We shrink a radix tree when its root node has only one child, in the left
most slot.  The child becomes the new root node.  To perform this
operation in a manner compatible with concurrent lockless lookups, we
atomically switch the root pointer from the parent to its child.

However a concurrent lockless lookup may now have loaded a pointer to the
parent (and is presently deciding what to do next).  For this reason, we
also have to keep the parent node in a valid state after shrinking the
tree, until the next RCU grace period -- otherwise this lookup with the
parent pointer may not do the right thing.  Notably, we need to keep the
child in the left most slot there in case that is requested by the lookup.

This is all pretty standard RCU stuff.  It is worth repeating because in
my eagerness to obey the radix tree node constructor scheme, I had broken
it by zeroing the radix tree node before the grace period.

What could happen is that a lookup can load the parent pointer, then
decide it wants to follow the left most child slot, only to find the slot
contained NULL due to the concurrent shrinker having zeroed the parent
node before waiting for a grace period.  The lookup would return a false
negative as a result.

Fix it by doing that clearing in the RCU callback.  I would normally want
to rip out the constructor entirely, but radix tree nodes are one of those
places where they make sense (only few cachelines will be touched soon
after allocation).

This was never actually found in any lockless pagecache testing or by the
test harness, but by seeing the odd problem with my scalable vmap rewrite.
 I have not tickled the test harness into reproducing it yet, but I'll
keep working at it.

Fortunately, it is not a problem anywhere lockless pagecache is used in
mainline kernels (pagecache probe is not a guarantee, and brd does not
have concurrent lookups and deletes).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoconsole keyboard mapping broken by 04c71976
Jiri Bohac [Thu, 12 Jun 2008 22:21:51 +0000 (15:21 -0700)]
console keyboard mapping broken by 04c71976

Several console keyboard maps are broken since

commit 04c71976500352d02f60616d2b960267d8c5fe24
Author: Samuel Thibault <samuel.thibault@ens-lyon.org>
Date:   Tue Oct 16 23:27:04 2007 -0700

    unicode diacritics support

because that changeset made k_self consider the value as a latin1
character when in Unicode mode, which is wrong; k_self should still take
the console map into account.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years ago/proc/sysvipc/shm: fix 32-bit truncation of segment sizes
Paul Menage [Thu, 12 Jun 2008 22:21:49 +0000 (15:21 -0700)]
/proc/sysvipc/shm: fix 32-bit truncation of segment sizes

sysvipc_shm_proc_show() picks between format strings (based on the
expected maximum length of a SHM segment) in a way that prevents gcc from
performing format checks on the seq_printf() parameters.  This hid two
format errors - shp->shm_segsz and shp->shm_nattach are both unsigned
long, but were being printed as unsigned int and signed int respectively.
This leads to 32-bit truncation of SHM segment sizes reported in
/proc/sysvipc/shm.  (And for nattach, but that's less of a problem for
most users).

This patch makes the format string directly visible to gcc's format
specifier checker, and fixes the two broken format specifiers.

Signed-off-by: Paul Menage <menage@google.com>
Cc: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agopagemap: fix large pages in pagemap
Dave Hansen [Thu, 12 Jun 2008 22:21:48 +0000 (15:21 -0700)]
pagemap: fix large pages in pagemap

We were walking right into huge page areas in the pagemap walker, and
calling the pmds pmd_bad() and clearing them.

That leaked huge pages.  Bad.

This patch at least works around that for now.  It ignores huge pages in
the pagemap walker for the time being, and won't leak those pages.

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agopagemap: pass mm into pagewalkers
Dave Hansen [Thu, 12 Jun 2008 22:21:47 +0000 (15:21 -0700)]
pagemap: pass mm into pagewalkers

We need this at least for huge page detection for now, because powerpc
needs the vm_area_struct to be able to determine whether a virtual address
is referring to a huge page (its pmd_huge() doesn't work).

It might also come in handy for some of the other users.

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agodriver/char/generic_nvram: fix banner
Philippe De Muyter [Thu, 12 Jun 2008 22:21:46 +0000 (15:21 -0700)]
driver/char/generic_nvram: fix banner

The generic nvram driver announces itself as
'Macintosh non-volatile memory driver'
instead of 'Generic non-volatile memory driver'.  Fix that.

Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agodrivers/video/cirrusfb: fix RAM address printk
Philippe De Muyter [Thu, 12 Jun 2008 22:21:45 +0000 (15:21 -0700)]
drivers/video/cirrusfb: fix RAM address printk

In the cirrusfb driver, the RAM address printk has a superfluous 'x' that
could be interpreted as "don't care", while it is actually a typo.  Fix
that.

[akpm@linux-foundation.org: join the two printk strings to make it atomic]
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agointel_rng: make device not found a warning
Stephen Hemminger [Thu, 12 Jun 2008 22:21:45 +0000 (15:21 -0700)]
intel_rng: make device not found a warning

Since many distros load this driver by default (throw it against the wall
and see what sticks method).  Change the error message severity level to
avoid alarming users.  Isn't it annoying when users actually read the
error logs...

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: Michael Buesch <mb@bu3sch.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agodrivers/isdn/sc/ioctl.c: add missing kfree
Julia Lawall [Thu, 12 Jun 2008 22:21:43 +0000 (15:21 -0700)]
drivers/isdn/sc/ioctl.c: add missing kfree

spid has been allocated in this function and so should be freed before
leaving it, as in the other error handling cases.

The semantic match that finds the problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

@r exists@
expression E,E1;
statement S;
position p1,p2,p3;
@@

E =@p1 \(kmalloc\|kcalloc\|kzalloc\)(...)
... when != E = E1
if (E == NULL || ...) S
... when != E = E1
if@p2 (...) {
 ... when != kfree(E)
 }
... when != E = E1
kfree@p3(E);

@forall@
position r.p2;
expression r.E;
int E1 != 0;
@@

* if@p2 (...) {
 ... when != kfree(E)
     when strict
return E1; }

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agommc: wbsd: initialize tasklets before requesting interrupt
Chuck Ebbert [Thu, 12 Jun 2008 22:21:42 +0000 (15:21 -0700)]
mmc: wbsd: initialize tasklets before requesting interrupt

With CONFIG_DEBUG_SHIRQ set we will get an interrupt as soon as we
allocate one.  Tasklets may be scheduled in the interrupt handler but they
will be initialized after the handler returns, causing a BUG() in
kernel/softirq.c when they run.

Should fix this Fedora bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=449817

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Acked-by: Pierre Ossman <drzeus@drzeus.cx>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMAINTAINERS: update maintainership of pxa2xx/pxa3xx
Eric Miao [Thu, 12 Jun 2008 22:21:41 +0000 (15:21 -0700)]
MAINTAINERS: update maintainership of pxa2xx/pxa3xx

Signed-off-by: Eric Miao <eric.miao@marvell.com>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Acked-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agouml: work around broken host PTRACE_SYSEMU
Jeff Dike [Thu, 12 Jun 2008 22:21:41 +0000 (15:21 -0700)]
uml: work around broken host PTRACE_SYSEMU

Fedora broke PTRACE_SYSEMU again, and UML crashes as a result when it
doesn't need to.  This patch makes the PTRACE_SYSEMU check fail gracefully
and makes UML fall back to PTRACE_SYSCALL.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agouml: remove include of asm/user.h
Jeff Dike [Thu, 12 Jun 2008 22:21:40 +0000 (15:21 -0700)]
uml: remove include of asm/user.h

I allowed an include of asm/user.h to sneak back in.  This patch replaces
it with sys/user.h.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agortc-at32ap700x: fix bug in at32_rtc_readalarm()
Haavard Skinnemoen [Thu, 12 Jun 2008 22:21:38 +0000 (15:21 -0700)]
rtc-at32ap700x: fix bug in at32_rtc_readalarm()

alarm->pending indicates whether there's an alarm that has actually been
triggered, not whether we're waiting for it.  alarm->enabled indicates
that.

Also add missing locking around reading the RTC registers.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agom68knommu: init coldfire timer TRR with n - 1, not n
Philippe De Muyter [Thu, 12 Jun 2008 22:21:36 +0000 (15:21 -0700)]
m68knommu: init coldfire timer TRR with n - 1, not n

The coldfire timer must be initialised to n - 1 if we want it to count n
cycles between each tick interrupt.  This was already fixed, but has been
lost with the conversion to GENERIC_TIMER.

Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokprobes: fix error checking of batch registration
Masami Hiramatsu [Thu, 12 Jun 2008 22:21:35 +0000 (15:21 -0700)]
kprobes: fix error checking of batch registration

Fix error checking routine to catch an error which occurs in first
__register_*probe().

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: David Miller <davem@davemloft.net>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agocciss: add new hardware support
Mike Miller [Thu, 12 Jun 2008 22:21:34 +0000 (15:21 -0700)]
cciss: add new hardware support

Add support for the next generation of HP Smart Array SAS/SATA
controllers.  Shipping date is late Fall 2008.

Bump the driver version to 3.6.20 to reflect the new hardware support from
patch 1 of this set.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agocapabilities: add (back) dummy support for KEEPCAPS
Andrew G. Morgan [Thu, 12 Jun 2008 22:21:33 +0000 (15:21 -0700)]
capabilities: add (back) dummy support for KEEPCAPS

The dummy module is used by folk that run security conscious code(!?).  A
feature of such code (for example, dhclient) is that it tries to operate
with minimum privilege (dropping unneeded capabilities).  While the dummy
module doesn't restrict code execution based on capability state, the user
code expects the kernel to appear to support it.  This patch adds back
faked support for the PR_SET_KEEPCAPS etc., calls - making the kernel
behave as before 2.6.26.

For details see: http://bugzilla.kernel.org/show_bug.cgi?id=10748

Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc_fs.h: move struct mm_struct forward-declaration
Ben Nizette [Thu, 12 Jun 2008 22:21:31 +0000 (15:21 -0700)]
proc_fs.h: move struct mm_struct forward-declaration

Move the forward-declaration of struct mm_struct a little way up
proc_fs.h.  This fixes a bunch of "'struct mm_struct' declared inside
parameter list" warnings with CONFIG_PROC_FS=n

Signed-off-by: Ben Nizette <bn@niasdigital.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agocpusets: provide another web page URL in MAINTAINERS file
Paul Jackson [Thu, 12 Jun 2008 22:21:31 +0000 (15:21 -0700)]
cpusets: provide another web page URL in MAINTAINERS file

Add URL for another CPUSETS web page to the MAINTAINERS file.

This URL provides links to major LGPL user level C libraries supporting
cpuset usage and user level cpu and node masks.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agohgafb: resource management fix
Krzysztof Helt [Thu, 12 Jun 2008 22:21:29 +0000 (15:21 -0700)]
hgafb: resource management fix

Release ports which are requested during detection which are not freed if
there is no hga card.  Otherwise there is a crash during cat /proc/ioports
command.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agom68k: Add ext2_find_{first,next}_bit() for ext4
Aneesh Kumar K.V [Thu, 12 Jun 2008 22:21:29 +0000 (15:21 -0700)]
m68k: Add ext2_find_{first,next}_bit() for ext4

Add ext2_find_{first,next}_bit(), which are needed for ext4.  They're
derived out of the ext2_find_next_zero_bit found in the same file.
Compile tested with crosstools

[Reworked to preserve all symmetry with ext2_find_{first,next}_zero_bit()]

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=10393

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agofat: relax the permission check of fat_setattr()
OGAWA Hirofumi [Thu, 12 Jun 2008 22:21:28 +0000 (15:21 -0700)]
fat: relax the permission check of fat_setattr()

New chmod() allows only acceptable permission, and if not acceptable, it
returns -EPERM.  Old one allows even if it can't store permission to on
disk inode.  But it seems too strict for users.

E.g.  https://bugzilla.redhat.com/show_bug.cgi?id=449080: With new one,
rsync couldn't create the temporary file.

So, this patch allows like old one, but now it doesn't change the
permission if it can't store, and it returns 0.

Also, this patch fixes missing check.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agomm: fix incorrect variable type in do_try_to_free_pages()
kosaki.motohiro@jp.fujitsu.com [Thu, 12 Jun 2008 22:21:27 +0000 (15:21 -0700)]
mm: fix incorrect variable type in do_try_to_free_pages()

"Smarter retry of costly-order allocations" patch series change behaver of
do_try_to_free_pages().  But unfortunately ret variable type was
unchanged.

Thus an overflow is possible.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoagp: add support for Radeon Mobility 9000 chipset
Amit Kucheria [Thu, 12 Jun 2008 22:21:26 +0000 (15:21 -0700)]
agp: add support for Radeon Mobility 9000 chipset

Addresses https://bugs.edge.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/178634

Signed-off-by: Amit Kucheria <amit.kucheria@ubuntu.com>
Signed-off-by: maximilian attems <max@stro.at>
Acked-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agotcp: Revert 'process defer accept as established' changes.
David S. Miller [Thu, 12 Jun 2008 23:31:35 +0000 (16:31 -0700)]
tcp: Revert 'process defer accept as established' changes.

This reverts two changesets, ec3c0982a2dd1e671bad8e9d26c28dcba0039d87
("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
the follow-on bug fix 9ae27e0adbf471c7a6b80102e38e1d5a346b3b38
("tcp: Fix slab corruption with ipv6 and tcp6fuzz").

This change causes several problems, first reported by Ingo Molnar
as a distcc-over-loopback regression where connections were getting
stuck.

Ilpo Järvinen first spotted the locking problems.  The new function
added by this code, tcp_defer_accept_check(), only has the
child socket locked, yet it is modifying state of the parent
listening socket.

Fixing that is non-trivial at best, because we can't simply just grab
the parent listening socket lock at this point, because it would
create an ABBA deadlock.  The normal ordering is parent listening
socket --> child socket, but this code path would require the
reverse lock ordering.

Next is a problem noticed by Vitaliy Gusev, he noted:

----------------------------------------
>--- a/net/ipv4/tcp_timer.c
>+++ b/net/ipv4/tcp_timer.c
>@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
>  goto death;
>  }
>
>+ if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) {
>+ tcp_send_active_reset(sk, GFP_ATOMIC);
>+ goto death;

Here socket sk is not attached to listening socket's request queue. tcp_done()
will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
release this sk) as socket is not DEAD. Therefore socket sk will be lost for
freeing.
----------------------------------------

Finally, Alexey Kuznetsov argues that there might not even be any
real value or advantage to these new semantics even if we fix all
of the bugs:

----------------------------------------
Hiding from accept() sockets with only out-of-order data only
is the only thing which is impossible with old approach. Is this really
so valuable? My opinion: no, this is nothing but a new loophole
to consume memory without control.
----------------------------------------

So revert this thing for now.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoipv6: Fix duplicate initialization of rawv6_prot.destroy
David S. Miller [Thu, 12 Jun 2008 21:47:58 +0000 (14:47 -0700)]
ipv6: Fix duplicate initialization of rawv6_prot.destroy

In changeset 22dd485022f3d0b162ceb5e67d85de7c3806aa20
("raw: Raw socket leak.") code was added so that we
flush pending frames on raw sockets to avoid leaks.

The ipv4 part was fine, but the ipv6 part was not
done correctly.  Unlike the ipv4 side, the ipv6 code
already has a .destroy method for rawv6_prot.

So now there were two assignments to this member, and
what the compiler does is use the last one, effectively
making the ipv6 parts of that changeset a NOP.

Fix this by removing the:

.destroy    = inet6_destroy_sock,

line, and adding an inet6_destroy_sock() call to the
end of raw6_destroy().

Noticed by Al Viro.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
16 years agobnx2x: Updating the Maintainer
Eilon Greenstein [Thu, 12 Jun 2008 21:30:28 +0000 (14:30 -0700)]
bnx2x: Updating the Maintainer

I would like to thank Eliezer Tamir for writing and maintaining the
driver for the past two years. I will take over maintaining the bnx2x
driver from now on.

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: Eliezer Tamir <eliezert@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoLinux 2.6.26-rc6
Linus Torvalds [Thu, 12 Jun 2008 21:22:24 +0000 (14:22 -0700)]
Linux 2.6.26-rc6

.. and a new name, courtesy of Alan.

16 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 12 Jun 2008 19:55:32 +0000 (12:55 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix pointer type warning in arch/x86/mm/init_64.c:early_memtest
  x86, lockdep: fix "WARNING: at kernel/lockdep.c:2658 check_flags+0x4c/0x128()"
  x86: fix an incompatible pointer type warning on 64-bit compilations
  x86: fix lockdep warning during suspend-to-ram
  x86: fix unused variable 'loops' warning in arch/x86/boot/a20.c
  Revert "x86: fix ioapic bug again"
  x86: fix asm warning in head_32.S
  x86: fix endless page faults in mount_block_root for Linux 2.6
  geode: fix modular build

16 years agoMerge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 12 Jun 2008 19:55:18 +0000 (12:55 -0700)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: 64-bit: fix arithmetics overflow
  sched: fair group: fix overflow(was: fix divide by zero)
  sched: fix TASK_WAKEKILL vs SIGKILL race

16 years ago[ARM] 5091/1: Add missing bitfield include to regs-lcd.h
Stefan Schmidt [Thu, 12 Jun 2008 06:07:22 +0000 (07:07 +0100)]
[ARM] 5091/1: Add missing bitfield include to regs-lcd.h

Macros like Fld() or FShft used in regs-lcd.h are defined in bitfield.h, but
the latter is not included.
Also fix one whitespace issue while being there.

Signed-off-by: Antonio Ospite <ao2@openezx.org>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Acked-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16 years agox86: fix pointer type warning in arch/x86/mm/init_64.c:early_memtest
Kevin Winchester [Fri, 30 May 2008 00:14:35 +0000 (21:14 -0300)]
x86: fix pointer type warning in arch/x86/mm/init_64.c:early_memtest

Changed the call to find_e820_area_size to pass u64 instead of unsigned long.

Signed-off-by: Kevin Winchester <kjwinchester@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, lockdep: fix "WARNING: at kernel/lockdep.c:2658 check_flags+0x4c/0x128()"
Vegard Nossum [Thu, 12 Jun 2008 06:55:59 +0000 (08:55 +0200)]
x86, lockdep: fix "WARNING: at kernel/lockdep.c:2658 check_flags+0x4c/0x128()"

Alessandro Suardi reported:
> Recently upgraded my FC6 desktop to Fedora 9; with the
>  latest nautilus RPM updates my VNC session went nuts
>  with nautilus pegging the CPU for everything that breathed.
>
> I now reverted to an earlier nautilus package, but during
>  the peak CPU period my kernel spat this:
>
> [314185.623294] ------------[ cut here ]------------
> [314185.623414] WARNING: at kernel/lockdep.c:2658 check_flags+0x4c/0x128()
> [314185.623514] Modules linked in: iptable_filter ip_tables x_tables
> sunrpc ipv6 fuse snd_via82xx snd_ac97_codec ac97_bus snd_mpu401_uart
> snd_rawmidi via686a hwmon parport_pc sg parport uhci_hcd ehci_hcd
> [314185.623924] Pid: 12314, comm: nautilus Not tainted 2.6.26-rc5-git2 #4
> [314185.624021]  [<c0115b95>] warn_on_slowpath+0x41/0x7b
> [314185.624021]  [<c010de70>] ? do_page_fault+0x2c1/0x5fd
> [314185.624021]  [<c0128396>] ? up_read+0x16/0x28
> [314185.624021]  [<c010de70>] ? do_page_fault+0x2c1/0x5fd
> [314185.624021]  [<c012fa33>] ? __lock_acquire+0xbb4/0xbc3
> [314185.624021]  [<c012d0a0>] check_flags+0x4c/0x128
> [314185.624021]  [<c012fa73>] lock_acquire+0x31/0x7d
> [314185.624021]  [<c0128cf6>] __atomic_notifier_call_chain+0x30/0x80
> [314185.624021]  [<c0128cc6>] ? __atomic_notifier_call_chain+0x0/0x80
> [314185.624021]  [<c0128d52>] atomic_notifier_call_chain+0xc/0xe
> [314185.624021]  [<c0128d81>] notify_die+0x2d/0x2f
> [314185.624021]  [<c01043b0>] do_int3+0x1f/0x4d
> [314185.624021]  [<c02f2d3b>] int3+0x27/0x2c
> [314185.624021]  =======================
> [314185.624021] ---[ end trace 1923f65a2d7bb246 ]---
> [314185.624021] possible reason: unannotated irqs-off.
> [314185.624021] irq event stamp: 488879
> [314185.624021] hardirqs last  enabled at (488879): [<c0102d67>]
> restore_nocheck+0x12/0x15
> [314185.624021] hardirqs last disabled at (488878): [<c0102dca>]
> work_resched+0x19/0x30
> [314185.624021] softirqs last  enabled at (488876): [<c011a1ba>]
> __do_softirq+0xa6/0xac
> [314185.624021] softirqs last disabled at (488865): [<c010476e>]
> do_softirq+0x57/0xa6
>
> I didn't seem to find it with some googling, so here it is.
>
> I was incidentally ltracing that process to try and find out
>  what was gulping down that much CPU (sorry, no idea
>  whether ltrace and the WARNING happened at the same
>  time or which came first) and:

Yeah, this is extremely likely to be the source of the warning.

The warning should be harmless, however.

> Box is my trusty noname K7-800, 512MB RAM; if there's
>  anything else useful I might be able to provide, just ask.

It would be interesting to see where the int3 comes from.  Too bad,
lockdep doesn't provide the register dump. The stacktrace also doesn't
go further than the int3(), I wonder if this int3 came from userspace?
The ltrace readme says "software breakpoints, like gdb", so I guess
this is the case. Yep, seems like it.

This looks relevant:

| commit fb1dac909d94ff807cd833d340c6827c3a957159
| Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
| Date:   Wed Jan 16 09:51:59 2008 +0100
|
|     lockdep: more hardirq annotations for notify_die()

I'm attaching a similarly-looking patch for this case (DO_VM86_ERROR),
though I suspect it might be missing for the other cases
(DO_ERROR/DO_ERROR_INFO) as well.

Reported-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix an incompatible pointer type warning on 64-bit compilations
David Howells [Sat, 7 Jun 2008 16:18:40 +0000 (17:18 +0100)]
x86: fix an incompatible pointer type warning on 64-bit compilations

Fix an incompatible pointer type warning on x86_64 compilations.
early_memtest() is passing a u64* to find_e820_area_size() which is expecting
an unsigned long.  Change t_start and t_size to unsigned long as those are
also 64-bit types on x88_64.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix lockdep warning during suspend-to-ram
Peter Zijlstra [Fri, 6 Jun 2008 08:14:08 +0000 (10:14 +0200)]
x86: fix lockdep warning during suspend-to-ram

Andrew Morton wrote:

> I've been seeing the below for a long time during suspend-to-ram on the Vaio.
>
>
> PM: Syncing filesystems ... done.
> PM: Preparing system for mem sleep
> Freezing user space processes ... <4>------------[ cut here ]------------
> WARNING: at kernel/lockdep.c:2658 check_flags+0x4c/0x127()
> Modules linked in: i915 drm ipw2200 sonypi ipv6 autofs4 hidp l2cap bluetooth sunrpc nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables acpi_cpufreq nvram ohci1394 ieee1394 ehci_hcd uhci_hcd sg joydev snd_hda_intel snd_seq_dummy sr_mod snd_seq_oss cdrom snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ieee80211 pcspkr ieee80211_crypt snd_pcm i2c_i801 snd_timer i2c_core ide_pci_generic piix snd soundcore snd_page_alloc button ext3 jbd ide_disk ide_core [last unloaded: ipw2200]
> Pid: 3250, comm: zsh Not tainted 2.6.26-rc5 #1
>  [<c011c5f5>] warn_on_slowpath+0x41/0x6d
>  [<c01080e6>] ? native_sched_clock+0x82/0x96
>  [<c013789c>] ? mark_held_locks+0x41/0x5c
>  [<c0315688>] ? _spin_unlock_irqrestore+0x36/0x58
>  [<c0137a29>] ? trace_hardirqs_on+0xe6/0x10d
>  [<c0138637>] ? __lock_acquire+0xae3/0xb2b
>  [<c0313413>] ? schedule+0x39b/0x3b4
>  [<c0135596>] check_flags+0x4c/0x127
>  [<c01386b9>] lock_acquire+0x3a/0x86
>  [<c0315075>] _spin_lock+0x26/0x53
>  [<c0140660>] ? refrigerator+0x13/0xc3
>  [<c0140660>] refrigerator+0x13/0xc3
>  [<c012684a>] get_signal_to_deliver+0x3c/0x31e
>  [<c0102fe7>] do_notify_resume+0x91/0x6ee
>  [<c01359fd>] ? lock_release_holdtime+0x50/0x56
>  [<c0315688>] ? _spin_unlock_irqrestore+0x36/0x58
>  [<c0235d24>] ? read_chan+0x0/0x58c
>  [<c0137a29>] ? trace_hardirqs_on+0xe6/0x10d
>  [<c0315694>] ? _spin_unlock_irqrestore+0x42/0x58
>  [<c0230afa>] ? tty_ldisc_deref+0x5c/0x63
>  [<c0233104>] ? tty_read+0x66/0x98
>  [<c014b3f0>] ? audit_syscall_exit+0x2aa/0x2c5
>  [<c0109430>] ? do_syscall_trace+0x6b/0x16f
>  [<c0103a9c>] work_notifysig+0x13/0x1b
>  =======================
> ---[ end trace 25b49fe59a25afa5 ]---
> possible reason: unannotated irqs-off.
> irq event stamp: 58919
> hardirqs last  enabled at (58919): [<c0103afd>] syscall_exit_work+0x11/0x26

Joy - I so love entry.S

Best I can make of it:

syscall_exit_work
  resume_userspace
    DISABLE_INTERRUPTS
    (no TRACE_IRQS_OFF)
      work_pending
        work_notifysig
          do_notify_resume()
            do_signal()
              get_signal_to_deliver()
                try_to_freeze()
                  refrigerator()
                    task_lock() -> check_flags() -> BANG

The normal path is:

syscall_exit_work
  resume_userspace
    DISABLE_INTERRUPTS
    restore_all
      TRACE_IRQS_IRET
      iret

No idea why that would not warn..

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix unused variable 'loops' warning in arch/x86/boot/a20.c
Manish Katiyar [Thu, 5 Jun 2008 13:44:15 +0000 (19:14 +0530)]
x86: fix unused variable 'loops' warning in arch/x86/boot/a20.c

Following patch fixes the below warning message :
arch/x86/boot/a20.c:118: warning: unused variable 'loops'

Signed-off-by : Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoRevert "x86: fix ioapic bug again"
Ingo Molnar [Mon, 9 Jun 2008 11:29:43 +0000 (13:29 +0200)]
Revert "x86: fix ioapic bug again"

This reverts commit 6e908947b4995bc0e551a8257c586d5c3e428201.

Németh Márton reported:

| there is a problem in 2.6.26-rc3 which was not there in case of
| 2.6.25: the CPU wakes up ~90,000 times per sec instead of ~60 per sec.
|
| I also "git bisected" the problem, the result is:
|
6e908947b4995bc0e551a8257c586d5c3e428201 is first bad commit
| commit 6e908947b4995bc0e551a8257c586d5c3e428201
| Author: Ingo Molnar <mingo@elte.hu>
| Date:   Fri Mar 21 14:32:36 2008 +0100
|
|     x86: fix ioapic bug again

the original problem is fixed by Maciej W. Rozycki in the tip/x86/apic
branch (confirmed by Márton), but those changes are too intrusive for
v2.6.26 so we'll go for the less intrusive (repeated) revert now.

Reported-and-bisected-by: Németh Márton <nm127@freemail.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix asm warning in head_32.S
Joe Korty [Mon, 2 Jun 2008 21:21:06 +0000 (17:21 -0400)]
x86: fix asm warning in head_32.S

On Mon, May 19, 2008 at 04:10:02PM -0700, Linus Torvalds wrote:
> It also causes these warnings on 32-bit PAE:
>
>    AS      arch/x86/kernel/head_32.o
>  arch/x86/kernel/head_32.S: Assembler messages:
>  arch/x86/kernel/head_32.S:225: Warning: left operand is a bignum; integer 0 assumed
>  arch/x86/kernel/head_32.S:609: Warning: left operand is a bignum; integer 0 assumed
>
> and I do not see why (the end result seems to be identical).

Fix head_32.S gcc bignum warnings when CONFIG_PAE=y.

    arch/x86/kernel/head_32.S: Assembler messages:
    arch/x86/kernel/head_32.S:225: Warning: left operand is a bignum; integer 0 assumed
    arch/x86/kernel/head_32.S:609: Warning: left operand is a bignum; integer 0 assumed

The assembler was stumbling over the 64-bit constant 0x100000000 in the
KPMDS #define.

Testing: a cmp(1) on head_32.o before and after shows the binary is unchanged.

Signed-off-by: Joe Korty <joe.korty@ccur.com
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Theodore Tso <tytso@mit.edu>
Cc: Gabriel C <nix.or.die@googlemail.com>
Cc: Keith Packard <keithp@keithp.com>
Cc: "Pallipadi Venkatesh" <venkatesh.pallipadi@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: "Siddha Suresh B" <suresh.b.siddha@intel.com>
Cc: bugme-daemon@bugzilla.kernel.org
Cc: airlied@linux.ie
Cc: "Barnes Jesse" <jesse.barnes@intel.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix endless page faults in mount_block_root for Linux 2.6
Henry Nestler [Mon, 12 May 2008 13:44:39 +0000 (15:44 +0200)]
x86: fix endless page faults in mount_block_root for Linux 2.6

Page faults in kernel address space between PAGE_OFFSET up to
VMALLOC_START should not try to map as vmalloc.

Fix rarely endless page faults inside mount_block_root for root
filesystem at boot time.

All 32bit kernels up to 2.6.25 can fail into this hole.
I can not present this under native linux kernel. I see, that the 64bit
has fixed the problem. I copied the same lines into 32bit part.

Recorded debugs are from coLinux kernel 2.6.22.18 (virtualisation):
http://www.henrynestler.com/colinux/testing/pfn-check-0.7.3/20080410-antinx/bug16-recursive-page-fault-endless.txt
The physicaly memory was trimmed down to 192MB to better catch the bug.
More memory gets the bug more rarely.

Details, how every x86 32bit system can fail:

Start from "mount_block_root",
http://lxr.linux.no/linux/init/do_mounts.c#L297
There the variable "fs_names" got one memory page with 4096 bytes.
Variable "p" walks through the existing file system types. The first
string is no problem.
But, with the second loop in mount_block_root the offset of "p" is not
at beginning of page, the offset is for example +9, if "reiserfs" is the
first in list.
Than calls do_mount_root, and lands in sys_mount.
Remember: Variable "type_page" contains now "fs_type+9" and not contains
a full page.
The sys_mount copies 4096 bytes with function "exact_copy_from_user()":
http://lxr.linux.no/linux/fs/namespace.c#L1540

Mostly exist pages after the buffer "fs_names+4096+9" and the page fault
handler was not called. No problem.

In the case, if the page after "fs_names+4096" is not mapped, the page
fault handler was called from http://lxr.linux.no/linux/fs/namespace.c#L1320

The do_page_fault gots an address 0xc03b4000.
It's kernel address, address >= TASK_SIZE, but not from vmalloc! It's
from "__getname()" alias "kmem_cache_alloc".
The "error_code" is 0. "vmalloc_fault" will be call:
http://lxr.linux.no/linux/arch/i386/mm/fault.c#L332

"vmalloc_fault" tryed to find the physical page for a non existing
virtual memory area. The macro "pte_present" in vmalloc_fault()
got a next page fault for 0xc0000ed0 at:
http://lxr.linux.no/linux/arch/i386/mm/fault.c#L282

No PTE exist for such virtual address. The page fault handler was trying
to sync the physical page for the PTE lockup.

This called vmalloc_fault() again for address 0xc000000, and that also
was not existing. The endless began...

In normal case the cpu would still loop with disabled interrrupts. Under
coLinux this was catched by a stack overflow inside printk debugs.

Signed-off-by: Henry Nestler <henry.nestler@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agogeode: fix modular build
Ingo Molnar [Wed, 4 Jun 2008 16:13:37 +0000 (18:13 +0200)]
geode: fix modular build

-tip testing found this build bug:

 MODPOST 331 modules
 ERROR: "geode_mfgpt_toggle_event" [drivers/watchdog/geodewdt.ko] undefined!
 ERROR: "geode_mfgpt_alloc_timer" [drivers/watchdog/geodewdt.ko] undefined!
 make[1]: *** [__modpost] Error 1
 make: *** [modules] Error 2

with this config:

  http://redhat.com/~mingo/misc/config-Wed_Jun__4_18_01_59_CEST_2008.bad

export those symbols.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoblock: disable IRQs until data is written to relay channel
Carl Henrik Lunde [Thu, 12 Jun 2008 18:13:58 +0000 (20:13 +0200)]
block: disable IRQs until data is written to relay channel

As we may run relay_reserve from interrupt context we must always disable
IRQs.  This is because a call to relay_reserve may expose previously written
data to use space.

Updated new message code and an old but related comment.

Signed-off-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes
Linus Torvalds [Thu, 12 Jun 2008 14:56:39 +0000 (07:56 -0700)]
Merge git://git./linux/kernel/git/sam/kbuild-fixes

* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes:
  kbuild: ignore powerpc specific symbols in modpost

16 years agonommu: Correct kobjsize() page validity checks.
Paul Mundt [Thu, 12 Jun 2008 07:29:55 +0000 (16:29 +0900)]
nommu: Correct kobjsize() page validity checks.

This implements a few changes on top of the recent kobjsize() refactoring
introduced by commit 6cfd53fc03670c7a544a56d441eb1a6cc800d72b.

As Christoph points out:

virt_to_head_page cannot return NULL. virt_to_page also
does not return NULL. pfn_valid() needs to be used to
figure out if a page is valid.  Otherwise the page struct
reference that was returned may have PageReserved() set
to indicate that it is not a valid page.

As discussed further in the thread, virt_addr_valid() is the preferable
way to validate the object pointer in this case. In addition to fixing
up the reserved page case, it also has the benefit of encapsulating the
hack introduced by commit 4016a1390d07f15b267eecb20e76a48fd5c524ef on
the impacted platforms, allowing us to get rid of the extra checking in
kobjsize() for the platforms that don't perform this type of bizarre
memory_end abuse (every nommu platform that isn't blackfin). If blackfin
decides to get in line with every other platform and use PageReserved
for the DMA pages in question, kobjsize() will also continue to work
fine.

It also turns out that compound_order() will give us back 0-order for
non-head pages, so we can get rid of the PageCompound check and just
use compound_order() directly. Clean that up while we're at it.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agofsl-diu-db: compile fix
Jeff Mahoney [Thu, 12 Jun 2008 06:05:26 +0000 (02:05 -0400)]
fsl-diu-db: compile fix

This patch fixes a compile failure in 2.6.26-rc5-git5.

The variable is expected to be called ofdev.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'core/iter-div' of git://git.kernel.org/pub/scm/linux/kernel/git/tip...
Linus Torvalds [Thu, 12 Jun 2008 14:47:44 +0000 (07:47 -0700)]
Merge branch 'core/iter-div' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core/iter-div' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  always_inline timespec_add_ns
  add an inlined version of iter_div_u64_rem
  common implementation of iterative div/mod

16 years agokbuild: ignore powerpc specific symbols in modpost
Sam Ravnborg [Thu, 12 Jun 2008 13:02:55 +0000 (15:02 +0200)]
kbuild: ignore powerpc specific symbols in modpost

Kumar Gala <galak@kernel.crashing.org> wrote:
We have a case in powerpc in which we want to link some library
routines with all module objects.  The routines are intended for
handling out-of-line function call register save/restore so having
them as EXPORT_SYMBOL() is counter productive (we do also need to
link the same "library" code into the kernel).

Without this patch a powerpc build would error out and fail
to build modules with the added register save/restore module.

There were two obvious solutions:
1) To link the .o file before the modpost stage
2) To ignore the symbols in modpost

Option 1) was ruled out because we do not have any separate
linking stage for single file modules.

This patch implements option 2 - and do so only for powerpc.

The symbols we ignore are all undefined symbols named:
_restgpr_*, _savegpr_*, _rest32gpr_*, _save32gpr_*

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
16 years agosched: 64-bit: fix arithmetics overflow
Lai Jiangshan [Thu, 12 Jun 2008 08:43:07 +0000 (16:43 +0800)]
sched: 64-bit: fix arithmetics overflow

(overflow means weight >= 2^32 here, because inv_weigh = 2^32/weight)

A weight of a cfs_rq is the sum of weights of which entities
are queued on this cfs_rq, so it will overflow when there are
too many entities.

Although, overflow occurs very rarely, but it break fairness when
it occurs. 64-bits systems have more memory than 32-bit systems
and 64-bit systems can create more process usually, so overflow may
occur more frequently.

This patch guarantees fairness when overflow happens on 64-bit systems.
Thanks to the optimization of compiler, it changes nothing on 32-bit.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: fair group: fix overflow(was: fix divide by zero)
Lai Jiangshan [Thu, 12 Jun 2008 08:42:58 +0000 (16:42 +0800)]
sched: fair group: fix overflow(was: fix divide by zero)

I found a bug which can be reproduced by this way:(linux-2.6.26-rc5, x86-64)
(use 2^32, 2^33, ...., 2^63 as shares value)

# mkdir /dev/cpuctl
# mount -t cgroup -o cpu cpuctl /dev/cpuctl
# cd /dev/cpuctl
# mkdir sub
# echo 0x8000000000000000 > sub/cpu.shares
# echo $$ > sub/tasks
oops here! divide by zero.

This is because do_div() expects the 2th parameter to be 32 bits,
but unsigned long is 64 bits in x86_64.

Peter Zijstra pointed it out that the sane thing to do is limit the
shares value to something smaller instead of using an even more
expensive divide.

Also, I found another bug about "the shares value is too large":

pid1 and pid2 are set affinity to cpu#0
pid1 is attached to cg1 and pid2 is attached to cg2

if cg1/cpu.shares = 1024 cg2/cpu.shares = 2000000000
then pid2 got 100% usage of cpu, and pid1 0%

if cg1/cpu.shares = 1024 cg2/cpu.shares = 20000000000
then pid2 got 0% usage of cpu, and pid1 100%

And a weight of a cfs_rq is the sum of weights of which entities
are queued on this cfs_rq, so the shares value should be limited
to a smaller value.

I think that (1UL << 18) is a good limited value:

1) it's not too large, we can create a lot of group before overflow
2) it's several times the weight value for nice=-19 (not too small)

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agonet: Eliminate flush_scheduled_work() calls while RTNL is held.
David S. Miller [Thu, 12 Jun 2008 09:22:02 +0000 (02:22 -0700)]
net: Eliminate flush_scheduled_work() calls while RTNL is held.

If the RTNL is held when we invoke flush_scheduled_work() we could
deadlock.  One such case is linkwatch, it is a work struct which tries
to grab the RTNL semaphore.

The most common case are net driver ->stop() methods.  The
simplest conversion is to instead use cancel_{delayed_}work_sync()
explicitly on the various work struct the driver uses.

This is an OK transformation because these work structs are doing
things like resetting the chip, restarting link negotiation, and so
forth.  And if we're bringing down the device, we're about to turn the
chip off and reset it anways.  So if we cancel a pending work event,
that's fine here.

Some drivers were working around this deadlock by using a msleep()
polling loop of some sort, and those cases are converted to instead
use cancel_{delayed_}work_sync() as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoalways_inline timespec_add_ns
Jeremy Fitzhardinge [Thu, 12 Jun 2008 08:48:00 +0000 (10:48 +0200)]
always_inline timespec_add_ns

timespec_add_ns is used from the x86-64 vdso, which cannot call out to
other kernel code.  Make sure that timespec_add_ns is always inlined
(and only uses always_inlined functions) to make sure there are no
unexpected calls.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoadd an inlined version of iter_div_u64_rem
Jeremy Fitzhardinge [Thu, 12 Jun 2008 08:47:58 +0000 (10:47 +0200)]
add an inlined version of iter_div_u64_rem

iter_div_u64_rem is used in the x86-64 vdso, which cannot call other
kernel code.  For this case, provide the always_inlined version,
__iter_div_u64_rem.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agocommon implementation of iterative div/mod
Jeremy Fitzhardinge [Thu, 12 Jun 2008 08:47:56 +0000 (10:47 +0200)]
common implementation of iterative div/mod

We have a few instances of the open-coded iterative div/mod loop, used
when we don't expcet the dividend to be much bigger than the divisor.
Unfortunately modern gcc's have the tendency to strength "reduce" this
into a full mod operation, which isn't necessarily any faster, and
even if it were, doesn't exist if gcc implements it in libgcc.

The workaround is to put a dummy asm statement in the loop to prevent
gcc from performing the transformation.

This patch creates a single implementation of this loop, and uses it
to replace the open-coded versions I know about.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
David S. Miller [Thu, 12 Jun 2008 03:27:51 +0000 (20:27 -0700)]
Merge branch 'davem-fixes' of /linux/kernel/git/jgarzik/netdev-2.6

16 years agodrivers/net/r6040.c: correct bad use of round_jiffies()
Christophe Jaillet [Thu, 15 May 2008 21:26:22 +0000 (23:26 +0200)]
drivers/net/r6040.c: correct bad use of round_jiffies()

Compared to other places in the kernel, I think that this driver misuses
the function round_jiffies.

Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agofec_mpc52xx: MPC52xx_MESSAGES_DEFAULT: 2nd NETIF_MSG_IFDOWN => IFUP
Roel Kluin [Mon, 9 Jun 2008 23:33:51 +0000 (16:33 -0700)]
fec_mpc52xx: MPC52xx_MESSAGES_DEFAULT: 2nd NETIF_MSG_IFDOWN => IFUP

Duplicate NETIF_MSG_IFDOWN, 2nd should be NETIF_MSG_IFUP

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Sylvain Munaut <tnt@246tNt.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agoipg: fix receivemode IPG_RM_RECEIVEMULTICAST{,HASH} in ipg_nic_set_multicast_list()
Roel Kluin [Mon, 9 Jun 2008 23:33:50 +0000 (16:33 -0700)]
ipg: fix receivemode IPG_RM_RECEIVEMULTICAST{,HASH} in ipg_nic_set_multicast_list()

The branches are dead code.  even when dev->flag IFF_MULTICAST (defined
0x1000) is set, dev->flags & IFF_MULTICAST & [boolean] always evaluates to
0.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agoMerge branch 'net-2.6-misc-20080611a' of git://git.linux-ipv6.org/gitroot/yoshfuji...
David S. Miller [Thu, 12 Jun 2008 01:11:16 +0000 (18:11 -0700)]
Merge branch 'net-2.6-misc-20080611a' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-fix

16 years agoMerge branch 'master' of git://eden-feed.erg.abdn.ac.uk/net-2.6
David S. Miller [Thu, 12 Jun 2008 00:53:04 +0000 (17:53 -0700)]
Merge branch 'master' of git://eden-feed.erg.abdn.ac.uk/net-2.6

16 years agonetfilter: nf_conntrack: fix ctnetlink related crash in nf_nat_setup_info()
Patrick McHardy [Thu, 12 Jun 2008 00:51:10 +0000 (17:51 -0700)]
netfilter: nf_conntrack: fix ctnetlink related crash in nf_nat_setup_info()

When creation of a new conntrack entry in ctnetlink fails after having
set up the NAT mappings, the conntrack has an extension area allocated
that is not getting properly destroyed when freeing the conntrack again.
This means the NAT extension is still in the bysource hash, causing a
crash when walking over the hash chain the next time:

BUG: unable to handle kernel paging request at 00120fbd
IP: [<c03d394b>] nf_nat_setup_info+0x221/0x58a
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP

Pid: 2795, comm: conntrackd Not tainted (2.6.26-rc5 #1)
EIP: 0060:[<c03d394b>] EFLAGS: 00010206 CPU: 1
EIP is at nf_nat_setup_info+0x221/0x58a
EAX: 00120fbd EBX: 00120fbd ECX: 00000001 EDX: 00000000
ESI: 0000019e EDI: e853bbb4 EBP: e853bbc8 ESP: e853bb78
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process conntrackd (pid: 2795, ti=e853a000 task=f7de10f0 task.ti=e853a000)
Stack: 00000000 e853bc2c e85672ec 00000008 c0561084 63c1db4a 00000000 00000000
       00000000 0002e109 61d2b1c3 00000000 00000000 00000000 01114e22 61d2b1c3
       00000000 00000000 f7444674 e853bc04 00000008 c038e728 0000000a f7444674
Call Trace:
 [<c038e728>] nla_parse+0x5c/0xb0
 [<c0397c1b>] ctnetlink_change_status+0x190/0x1c6
 [<c0397eec>] ctnetlink_new_conntrack+0x189/0x61f
 [<c0119aee>] update_curr+0x3d/0x52
 [<c03902d1>] nfnetlink_rcv_msg+0xc1/0xd8
 [<c0390228>] nfnetlink_rcv_msg+0x18/0xd8
 [<c0390210>] nfnetlink_rcv_msg+0x0/0xd8
 [<c038d2ce>] netlink_rcv_skb+0x2d/0x71
 [<c0390205>] nfnetlink_rcv+0x19/0x24
 [<c038d0f5>] netlink_unicast+0x1b3/0x216
 ...

Move invocation of the extension destructors to nf_conntrack_free()
to fix this problem.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10875

Reported-and-Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetfilter: Make nflog quiet when no one listen in userspace.
Eric Leblond [Thu, 12 Jun 2008 00:50:27 +0000 (17:50 -0700)]
netfilter: Make nflog quiet when no one listen in userspace.

The message "nf_log_packet: can't log since no backend logging module loaded
in! Please either load one, or disable logging explicitly" was displayed for
each logged packet when no userspace application is listening to nflog events.
The message seems to warn for a problem with a kernel module missing but as
said before this is not the case. I thus propose to suppress the message (I
don't see any reason to flood the log because a user application has crashed.)

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
Linus Torvalds [Thu, 12 Jun 2008 00:29:32 +0000 (17:29 -0700)]
Merge git://git./linux/kernel/git/gregkh/usb-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: don't use reset-resume if drivers don't support it
  USB: isp1760: Assign resource fields before adding hcd
  isight_firmware: Avoid crash on loading invalid firmware
  USB: fix build bug in USB_ISIGHTFW

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6
Linus Torvalds [Thu, 12 Jun 2008 00:29:06 +0000 (17:29 -0700)]
Merge git://git./linux/kernel/git/gregkh/driver-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6:
  kobject: Documentation Spelling Patch
  dev_set_name: fix missing kernel-doc

16 years agoipv6: Fail with appropriate error code when setting not-applicable sockopt.
YOSHIFUJI Hideaki [Wed, 11 Jun 2008 18:27:26 +0000 (03:27 +0900)]
ipv6: Fail with appropriate error code when setting not-applicable sockopt.

IPV6_MULTICAST_HOPS, for example, is not valid for stream sockets.
Since they are virtually unavailable for stream sockets,
we should return ENOPROTOOPT instead of EINVAL.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
16 years agoipv6: Check IPV6_MULTICAST_LOOP option value.
YOSHIFUJI Hideaki [Wed, 11 Jun 2008 18:14:51 +0000 (03:14 +0900)]
ipv6: Check IPV6_MULTICAST_LOOP option value.

Only 0 and 1 are valid for IPV6_MULTICAST_LOOP socket option,
and we should return an error of EINVAL otherwise, per RFC3493.

Based on patch from Shan Wei <shanwei@cn.fujitsu.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
16 years agoipv6: Check the hop limit setting in ancillary data.
Shan Wei [Tue, 10 Jun 2008 07:50:55 +0000 (15:50 +0800)]
ipv6: Check the hop limit setting in ancillary data.

When specifing the outgoing hop limit as ancillary data for sendmsg(),
the kernel doesn't check the integer hop limit value as specified in
[RFC-3542] section 6.3.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
16 years agoipv6 route: Fix route lifetime in netlink message.
YOSHIFUJI Hideaki [Mon, 12 May 2008 17:52:55 +0000 (02:52 +0900)]
ipv6 route: Fix route lifetime in netlink message.

1) We may have route lifetime larger than INT_MAX.
In that case we had wired value in lifetime.
Use INT_MAX if lifetime does not fit in s32.

2) Lifetime is valid iif RTF_EXPIRES is set.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>