GitHub/LineageOS/G12/android_kernel_amlogic_linux-4.9.git
11 years ago  rtl8192e: don't use create_proc_entry() for directories
Al Viro [Fri, 29 Mar 2013 23:30:06 +0000 (19:30 -0400)]
rtl8192e: don't use create_proc_entry() for directories

proc_mkdir() is there for purpose...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  procfs: switch /proc/self away from proc_dir_entry
Al Viro [Fri, 29 Mar 2013 23:27:05 +0000 (19:27 -0400)]
procfs: switch /proc/self away from proc_dir_entry

Just have it pinned in dcache all along and let procfs ->kill_sb()
drop it before kill_anon_super().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  mode_t, whack-a-mole at 11...
Al Viro [Fri, 29 Mar 2013 16:23:28 +0000 (12:23 -0400)]
mode_t, whack-a-mole at 11...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  snd_info_register: switch to proc_create_data/proc_mkdir_mode
Al Viro [Fri, 29 Mar 2013 03:01:34 +0000 (23:01 -0400)]
snd_info_register: switch to proc_create_data/proc_mkdir_mode

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  hysdn: stash pointer to card into proc_dir_entry->data
Al Viro [Thu, 28 Mar 2013 22:56:21 +0000 (18:56 -0400)]
hysdn: stash pointer to card into proc_dir_entry->data

no need to search later - we know the card when we are
creating procfs entries

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  atags_proc: switch to proc_create_data()
Al Viro [Thu, 28 Mar 2013 22:11:13 +0000 (18:11 -0400)]
atags_proc: switch to proc_create_data()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  gadgetfs: don't bother with fops->owner
Al Viro [Thu, 28 Mar 2013 17:12:32 +0000 (13:12 -0400)]
gadgetfs: don't bother with fops->owner

filesystem module as whole is pinned down by its superblock, no need
to have opened files on it to add anything to that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  ccg: don't bother with fops->owner
Al Viro [Thu, 28 Mar 2013 17:00:31 +0000 (13:00 -0400)]
ccg: don't bother with fops->owner

filesystem module as whole is pinned down by its superblock, no need
to have opened files on it to add anything to that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  spufs: don't bother with fops->owner
Al Viro [Thu, 28 Mar 2013 16:45:40 +0000 (12:45 -0400)]
spufs: don't bother with fops->owner

filesystem module as whole is pinned down by its superblock, no need
to have opened files on it to add anything to that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  get rid of the last free_pipe_info() callers
Al Viro [Thu, 21 Mar 2013 15:06:46 +0000 (11:06 -0400)]
get rid of the last free_pipe_info() callers

and rename __free_pipe_info() to free_pipe_info()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  get rid of alloc_pipe_info() argument
Al Viro [Thu, 21 Mar 2013 15:04:15 +0000 (11:04 -0400)]
get rid of alloc_pipe_info() argument

not used anymore

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  get rid of pipe->inode
Al Viro [Thu, 21 Mar 2013 15:01:38 +0000 (11:01 -0400)]
get rid of pipe->inode

it's used only as a flag to distinguish normal pipes/FIFOs from the
internal per-task one used by file-to-file splice.  And pipe->files
would work just as well for that purpose...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  introduce variants of pipe_lock/pipe_unlock for real pipes/FIFOs
Al Viro [Thu, 21 Mar 2013 16:24:01 +0000 (12:24 -0400)]
introduce variants of pipe_lock/pipe_unlock for real pipes/FIFOs

fs/pipe.c file_operations methods *know* that pipe is not an internal one;
no need to check pipe->inode for those callers.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  pipe: set file->private_data to ->i_pipe
Al Viro [Thu, 21 Mar 2013 15:16:56 +0000 (11:16 -0400)]
pipe: set file->private_data to ->i_pipe

simplify get_pipe_info(), while we are at it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  pipe: don't use ->i_mutex
Al Viro [Thu, 21 Mar 2013 06:32:24 +0000 (02:32 -0400)]
pipe: don't use ->i_mutex

now it can be done - put mutex into pipe_inode_info, use it instead
of ->i_mutex

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  pipe: take allocation and freeing of pipe_inode_info out of ->i_mutex
Al Viro [Thu, 21 Mar 2013 06:21:19 +0000 (02:21 -0400)]
pipe: take allocation and freeing of pipe_inode_info out of ->i_mutex

* new field - pipe->files; number of struct file over that pipe (all
  sharing the same inode, of course); protected by inode->i_lock.
* pipe_release() decrements pipe->files, clears inode->i_pipe when
  the counter has reached 0 (all under ->i_lock) and, in that case,
  frees pipe after having done pipe_unlock()
* fifo_open() starts with grabbing ->i_lock, and either bumps pipe->files
  if ->i_pipe was non-NULL or allocates a new pipe (dropping and regaining
  ->i_lock) and rechecks ->i_pipe; if it's still NULL, inserts new pipe
  there, otherwise bumps ->i_pipe->files and frees the one we'd allocated.
  At that point we know that ->i_pipe is non-NULL and won't go away, so
  we can do pipe_lock() on it and proceed as we used to.  If we end up
  failing, decrement pipe->files and if it reaches 0 clear ->i_pipe and
  free the sucker after pipe_unlock().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
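
The counting rules described in the commit above can be modelled in
userspace C.  This is only a sketch under stated assumptions, not the
fs/pipe.c code: pipe_model, inode_model, pipe_model_open() and
pipe_model_release() are hypothetical names, a pthread mutex stands in
for inode->i_lock, and the separate pipe_lock()/pipe_unlock() step is
left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for pipe_inode_info and struct inode. */
struct pipe_model {
	unsigned int files;          /* number of open files over this pipe */
};

struct inode_model {
	pthread_mutex_t i_lock;      /* plays the role of inode->i_lock */
	struct pipe_model *i_pipe;   /* NULL while nobody has it open */
};

/* Open: bump ->files on the existing pipe, or install a fresh one. */
struct pipe_model *pipe_model_open(struct inode_model *inode)
{
	struct pipe_model *pipe, *fresh;

	pthread_mutex_lock(&inode->i_lock);
	pipe = inode->i_pipe;
	if (pipe) {
		pipe->files++;
		pthread_mutex_unlock(&inode->i_lock);
		return pipe;
	}
	/* Allocation may block, so drop the lock around it. */
	pthread_mutex_unlock(&inode->i_lock);

	fresh = calloc(1, sizeof(*fresh));
	if (!fresh)
		return NULL;
	fresh->files = 1;

	pthread_mutex_lock(&inode->i_lock);
	pipe = inode->i_pipe;        /* recheck: somebody may have raced us */
	if (pipe) {
		pipe->files++;
		pthread_mutex_unlock(&inode->i_lock);
		free(fresh);         /* lost the race, discard our copy */
		return pipe;
	}
	inode->i_pipe = fresh;
	pthread_mutex_unlock(&inode->i_lock);
	return fresh;
}

/* Release: drop one reference; the last one clears ->i_pipe and frees. */
void pipe_model_release(struct inode_model *inode)
{
	struct pipe_model *to_free = NULL;

	pthread_mutex_lock(&inode->i_lock);
	assert(inode->i_pipe && inode->i_pipe->files > 0);
	if (--inode->i_pipe->files == 0) {
		to_free = inode->i_pipe;
		inode->i_pipe = NULL;
	}
	pthread_mutex_unlock(&inode->i_lock);

	free(to_free);               /* free(NULL) is a no-op */
}
```

Two opens on the same inode_model share one pipe_model with files == 2;
the second release is the one that clears i_pipe and frees the object.
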
11 years ago  pipe: preparation to new locking rules
Al Viro [Thu, 21 Mar 2013 06:16:30 +0000 (02:16 -0400)]
pipe: preparation to new locking rules

* use the fact that file_inode(file)->i_pipe doesn't change
  while the file is opened - no locks needed to access that.
* switch to pipe_lock/pipe_unlock where it's easy to do

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  pipe: switch wait_for_partner() and wake_up_partner() to pipe_inode_info
Al Viro [Thu, 21 Mar 2013 06:07:59 +0000 (02:07 -0400)]
pipe: switch wait_for_partner() and wake_up_partner() to pipe_inode_info

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  pipe: fold file_operations instances in one
Al Viro [Tue, 12 Mar 2013 13:58:10 +0000 (09:58 -0400)]
pipe: fold file_operations instances in one

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  fold fifo.c into pipe.c
Al Viro [Tue, 12 Mar 2013 13:46:27 +0000 (09:46 -0400)]
fold fifo.c into pipe.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  lift sb_start_write out of ->splice_write()
Al Viro [Wed, 20 Mar 2013 17:21:32 +0000 (13:21 -0400)]
lift sb_start_write out of ->splice_write()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  lift sb_start_write into default_file_splice_write()
Al Viro [Wed, 20 Mar 2013 17:19:30 +0000 (13:19 -0400)]
lift sb_start_write into default_file_splice_write()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  lift sb_start_write() out of ->write()
Al Viro [Wed, 20 Mar 2013 17:04:20 +0000 (13:04 -0400)]
lift sb_start_write() out of ->write()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  switch compat readv/writev variants to COMPAT_SYSCALL_DEFINE
Al Viro [Wed, 20 Mar 2013 14:42:10 +0000 (10:42 -0400)]
switch compat readv/writev variants to COMPAT_SYSCALL_DEFINE

... and take to fs/read_write.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  f2fs: use mnt_want_write_file() in ioctl
Al Viro [Wed, 20 Mar 2013 13:33:23 +0000 (09:33 -0400)]
f2fs: use mnt_want_write_file() in ioctl

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  lift sb_start_write/sb_end_write out of ->aio_write()
Al Viro [Wed, 20 Mar 2013 01:01:03 +0000 (21:01 -0400)]
lift sb_start_write/sb_end_write out of ->aio_write()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  hpfs: move setting hpfs-private i_dirty to ->write_end()
Al Viro [Wed, 20 Mar 2013 00:35:00 +0000 (20:35 -0400)]
hpfs: move setting hpfs-private i_dirty to ->write_end()

... so that writev(2) doesn't miss it.  Get rid of hpfs_file_write().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  reiserfs: don't wank with EFBIG before calling do_sync_write()
Al Viro [Tue, 19 Mar 2013 23:46:45 +0000 (19:46 -0400)]
reiserfs: don't wank with EFBIG before calling do_sync_write()

look for file_capable() in there...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  fold release_mounts() into namespace_unlock()
Al Viro [Sat, 16 Mar 2013 19:12:40 +0000 (15:12 -0400)]
fold release_mounts() into namespace_unlock()

... and provide namespace_lock() as a trivial wrapper;
switch to those two consistently.

Result is patterned after rtnl_lock/rtnl_unlock pair.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  switch unlock_mount() to namespace_unlock(), convert all umount_tree() callers
Al Viro [Sat, 16 Mar 2013 18:49:45 +0000 (14:49 -0400)]
switch unlock_mount() to namespace_unlock(), convert all umount_tree() callers

which allows to kill the last argument of umount_tree() and make release_mounts()
static.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  more conversions to namespace_unlock()
Al Viro [Sat, 16 Mar 2013 18:42:19 +0000 (14:42 -0400)]
more conversions to namespace_unlock()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  get rid of the second argument of shrink_submounts()
Al Viro [Sat, 16 Mar 2013 18:39:34 +0000 (14:39 -0400)]
get rid of the second argument of shrink_submounts()

... it's always &unmounted.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  saner umount_tree()/release_mounts(), part 1
Al Viro [Sat, 16 Mar 2013 18:35:16 +0000 (14:35 -0400)]
saner umount_tree()/release_mounts(), part 1

global list of release_mounts() fodder, protected by namespace_sem;
eventually, all umount_tree() callers will use it as kill list.
Helper picking the contents of that list, releasing namespace_sem
and doing release_mounts() on what it got.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  get rid of full-hash scan on detaching vfsmounts
Al Viro [Fri, 15 Mar 2013 14:53:28 +0000 (10:53 -0400)]
get rid of full-hash scan on detaching vfsmounts

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  mnt: release locks on error path in do_loopback
Andrey Vagin [Tue, 9 Apr 2013 13:33:29 +0000 (17:33 +0400)]
mnt: release locks on error path in do_loopback

do_loopback calls lock_mount(path) and forgets to call unlock_mount
if clone_mnt or copy_mnt fails.

[   77.661566] ================================================
[   77.662939] [ BUG: lock held when returning to user space! ]
[   77.664104] 3.9.0-rc5+ #17 Not tainted
[   77.664982] ------------------------------------------------
[   77.666488] mount/514 is leaving the kernel with locks still held!
[   77.668027] 2 locks held by mount/514:
[   77.668817]  #0:  (&sb->s_type->i_mutex_key#7){+.+.+.}, at: [<ffffffff811cca22>] lock_mount+0x32/0xe0
[   77.671755]  #1:  (&namespace_sem){+++++.}, at: [<ffffffff811cca3a>] lock_mount+0x4a/0xe0

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  palinfo fixes
Al Viro [Mon, 1 Apr 2013 02:34:37 +0000 (22:34 -0400)]
palinfo fixes

* check for proc_mkdir() failures
* fix buffer overrun - sizeof(format string) is *not* enough to
  hold sprintf() result.
* use proc_remove_subtree(); life's much easier with it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
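
The buffer overrun item above comes down to sizing: the formatted
result can be longer than the format string itself.  A minimal sketch,
assuming a hypothetical "cpu%u" format and helper name (neither is
taken from the palinfo code), lets snprintf() report the needed length
instead of relying on sizeof(format):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical format string; the real palinfo strings are not shown above. */
#define CPU_FMT "cpu%u"

/*
 * Return a heap-allocated name for the given CPU, or NULL on failure.
 * The buffer size comes from snprintf() itself, not from sizeof(CPU_FMT).
 */
char *format_cpu_name(unsigned int cpu)
{
	int len = snprintf(NULL, 0, CPU_FMT, cpu);  /* length without the NUL */
	int written;
	char *buf;

	if (len < 0)
		return NULL;

	buf = malloc((size_t)len + 1);
	if (!buf)
		return NULL;

	written = snprintf(buf, (size_t)len + 1, CPU_FMT, cpu);
	assert(written == len);   /* same arguments, so same length */
	return buf;
}
```

Here sizeof(CPU_FMT) is 6 bytes, while the result for CPU 1000 needs 8
bytes including the terminator, which is the kind of mismatch the
commit message warns about.
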
11 years ago  procfs: add proc_remove_subtree()
Al Viro [Sun, 31 Mar 2013 00:13:46 +0000 (20:13 -0400)]
procfs: add proc_remove_subtree()

just what it sounds like; do that only to procfs subtrees you've
created - doing that to something shared with another driver is
not only antisocial, but might cause interesting races with
proc_create() and its ilk.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  ecryptfs: close rmmod race
Al Viro [Thu, 28 Mar 2013 17:30:23 +0000 (13:30 -0400)]
ecryptfs: close rmmod race

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Wed, 27 Mar 2013 00:42:55 +0000 (17:42 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
 "stable fodder; assorted deadlock fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vt: synchronize_rcu() under spinlock is not nice...
  Nest rename_lock inside vfsmount_lock
  Don't bother with redoing rw_verify_area() from default_file_splice_from()

11 years ago  vt: synchronize_rcu() under spinlock is not nice...
Al Viro [Wed, 27 Mar 2013 00:30:17 +0000 (20:30 -0400)]
vt: synchronize_rcu() under spinlock is not nice...

vcs_poll_data_free() calls unregister_vt_notifier(), which calls
atomic_notifier_chain_unregister(), which calls synchronize_rcu().
Do it *after* we'd dropped ->f_lock.

Cc: stable@vger.kernel.org (all kernels since 2.6.37)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
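
The general shape of that fix (detach under the lock, do the
potentially sleeping teardown after dropping it) can be sketched in
userspace C.  Everything here is a hypothetical stand-in rather than
the vt code: a pthread mutex replaces ->f_lock, blocking_teardown()
replaces the synchronize_rcu() path, and f_lock_held exists only so the
sketch can check its own rule.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct poll_data {
	int registered;
};

struct file_model {
	pthread_mutex_t f_lock;          /* stands in for the ->f_lock spinlock */
	int f_lock_held;                 /* bookkeeping for the self-check below */
	struct poll_data *private_data;
};

/* Stand-in for a teardown that may sleep; must never run under f_lock. */
void blocking_teardown(struct file_model *f, struct poll_data *pd)
{
	assert(!f->f_lock_held);
	pd->registered = 0;
}

void vcs_model_release(struct file_model *f)
{
	struct poll_data *pd;

	pthread_mutex_lock(&f->f_lock);
	f->f_lock_held = 1;
	pd = f->private_data;            /* detach while holding the lock */
	f->private_data = NULL;
	f->f_lock_held = 0;
	pthread_mutex_unlock(&f->f_lock);

	if (pd)
		blocking_teardown(f, pd);    /* tear down only after unlocking */
}
```

Calling vcs_model_release() twice is harmless in this model: the second
call finds private_data already NULL and skips the teardown.
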
11 years ago  Nest rename_lock inside vfsmount_lock
Al Viro [Tue, 26 Mar 2013 22:25:57 +0000 (18:25 -0400)]
Nest rename_lock inside vfsmount_lock

... lest we get livelocks between path_is_under() and d_path() and friends.

The thing is, wrt fairness lglocks are more similar to rwsems than to rwlocks;
it is possible to have thread B spin on attempt to take lock shared while thread
A is already holding it shared, if B is on lower-numbered CPU than A and there's
a thread C spinning on attempt to take the same lock exclusive.

As the result, we need consistent ordering between vfsmount_lock (lglock) and
rename_lock (seq_lock), even though everything that takes both is going to take
vfsmount_lock only shared.

Spotted-by: Brad Spengler <spender@grsecurity.net>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years ago  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Tue, 26 Mar 2013 21:24:29 +0000 (14:24 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Always increment IPV4 ID field in encapsulated GSO packets, even
    when DF is set.  Regression fix from Pravin B Shelar.

 2) Fix per-net subsystem initialization in netfilter conntrack,
    otherwise we may access dynamically allocated memory before it is
    actually allocated.  From Gao Feng.

 3) Fix DMA buffer lengths in iwl3945 driver, from Stanislaw Gruszka.

 4) Fix race between submission of sync vs async commands in mwifiex
    driver, from Amitkumar Karwar.

 5) Add missing cancel of command timer in mwifiex driver, from Bing
    Zhao.

 6) Missing SKB free in rtlwifi USB driver, from Jussi Kivilinna.

 7) Thermal layer tries to use a genetlink multicast string that is
    longer than the 16 character limit.  Fix it and add a BUG check to
    prevent this kind of thing from happening in the future.  From
    Masatake YAMATO.

 8) Fix many bugs in the handling of the teardown of L2TP connections,
    UDP encapsulation instances, and sockets.  From Tom Parkin.

 9) Missing socket release in IRDA, from Kees Cook.

10) Fix fec driver modular build, from Fabio Estevam.

11) Erroneous use of kfree() instead of free_netdev() in lantiq_etop,
    from Wei Yongjun.

12) Fix bugs in handling of queue numbers and steering rules in mlx4
    driver, from Moshe Lazer, Hadar Hen Zion, and Or Gerlitz.

13) Some FOO_DIAG_MAX constants were defined off by one, fix from Andrey
    Vagin.

14) TCP segmentation deferral is unintentionally done too strongly,
    breaking ACK clocking.  Fix from Eric Dumazet.

15) net_enable_timestamp() can legitimately be invoked from software
    interrupts, and in a way that is safe, so remove the WARN_ON().
    Also from Eric Dumazet.

16) Fix use after free in VLANs, from Cong Wang.

17) Fix TCP slow start retransmit storms after SACK reneging, from
    Yuchung Cheng.

18) Unix socket release should mark a socket dead before NULL'ing out
    sock->sk, otherwise we can race.  Fix from Paul Moore.

19) IPV6 addrconf code can try to free static memory, from Hong Zhiguo.

20) Fix register mis-programming, NULL pointer derefs, and wrong PHC
    clock frequency in IGB driver.  From Lior Levy, Alex Williamson, Jiri
    Benc, and Jeff Kirsher.

21) skb->ip_summed logic in pch_gbe driver is reversed, breaking packet
    forwarding.  Fix from Veaceslav Falico.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
  ipv4: Fix ip-header identification for gso packets.
  bonding: remove already created master sysfs link on failure
  af_unix: dont send SCM_CREDENTIAL when dest socket is NULL
  pch_gbe: fix ip_summed checksum reporting on rx
  igb: fix PHC stopping on max freq
  igb: make sensor info static
  igb: SR-IOV init reordering
  igb: Fix null pointer dereference
  igb: fix i350 anti spoofing config
  ixgbevf: don't release the soft entries
  ipv6: fix bad free of addrconf_init_net
  unix: fix a race condition in unix_release()
  tcp: undo spurious timeout after SACK reneging
  bnx2x: fix assignment of signed expression to unsigned variable
  bridge: fix crash when set mac address of br interface
  8021q: fix a potential use-after-free
  net: remove a WARN_ON() in net_enable_timestamp()
  tcp: preserve ACK clocking in TSO
  net: fix *_DIAG_MAX constants
  net/mlx4_core: Disallow releasing VF QPs which have steering rules
  ...

11 years ago  Merge tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Tue, 26 Mar 2013 21:23:45 +0000 (14:23 -0700)]
Merge tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 - Fix an NFSv4 idmapper regression
 - Fix an Oops in the pNFS blocks client
 - Fix up various issues with pNFS layoutcommit
 - Ensure correct read ordering of variables in
   rpc_wake_up_task_queue_locked

* tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked
  NFSv4.1: Add a helper pnfs_commit_and_return_layout
  NFSv4.1: Always clear the NFS_INO_LAYOUTCOMMIT in layoutreturn
  NFSv4.1: Fix a race in pNFS layoutcommit
  pnfs-block: removing DM device maybe cause oops when call dev_remove
  NFSv4: Fix the string length returned by the idmapper

11 years ago  ipv4: Fix ip-header identification for gso packets.
Pravin B Shelar [Sun, 24 Mar 2013 17:36:29 +0000 (17:36 +0000)]
ipv4: Fix ip-header identification for gso packets.

ip-header id needs to be incremented even if IP_DF flag is set.
This behaviour was changed in commit 490ab08127cebc25e3a26
(IP_GRE: Fix IP-Identification).

Following patch fixes it so that identification is always
incremented.

Reported-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  bonding: remove already created master sysfs link on failure
Veaceslav Falico [Tue, 26 Mar 2013 16:43:28 +0000 (17:43 +0100)]
bonding: remove already created master sysfs link on failure

If the slave sysfs symlink fails to be created, we end up without removing
the master sysfs symlink. Remove it in case of failure.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  af_unix: dont send SCM_CREDENTIAL when dest socket is NULL
dingtianhong [Mon, 25 Mar 2013 17:02:04 +0000 (17:02 +0000)]
af_unix: dont send SCM_CREDENTIAL when dest socket is NULL

SCM_CREDENTIALS should apply to write() syscalls only if either the source or the
destination socket asserted SOCK_PASSCRED. The original implementation in maybe_add_creds
is wrong, and breaks several LSB testcases (e.g. /tset/LSB.os/network/recvfrom/T.recvfrom).

Originally-authored-by: Karel Srot <ksrot@redhat.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net
David S. Miller [Tue, 26 Mar 2013 16:21:31 +0000 (12:21 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net

Jeff Kirsher says:

====================
This series contains updates to ixgbevf and igb.

The ixgbevf calls to pci_disable_msix() and to free the msix_entries
memory should not occur if device open fails.  Instead they should be
called during device driver removal to balance with the call to
pci_enable_msix() and the call to allocate msix_entries memory
during the device probe and driver load.

The remaining 4 of 5 igb patches are simple 1-3 line patches to fix
several issues such as possible null pointer dereference, PHC stopping
on max frequency, make sensor info static and SR-IOV initialization
reordering.

The remaining igb patch to fix anti-spoofing config fixes a problem
in i350 where anti spoofing configuration was written into a wrong
register.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  pch_gbe: fix ip_summed checksum reporting on rx
Veaceslav Falico [Mon, 25 Mar 2013 22:26:21 +0000 (22:26 +0000)]
pch_gbe: fix ip_summed checksum reporting on rx

skb->ip_summed should be CHECKSUM_UNNECESSARY when the driver reports that
checksums were correct and CHECKSUM_NONE in any other case. They're
currently placed vice versa, which breaks the forwarding scenario. Fix it
by placing them as described above.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  igb: fix PHC stopping on max freq
Jiri Benc [Wed, 20 Mar 2013 09:06:34 +0000 (09:06 +0000)]
igb: fix PHC stopping on max freq

For 82576 MAC type, max_adj is reported as 1000000000 ppb. However, if
this value is passed to igb_ptp_adjfreq_82576, incvalue overflows out of
INCVALUE_82576_MASK, resulting in setting of zero TIMINCA.incvalue, stopping
the PHC (instead of going at twice the nominal speed).

Fix the advertised max_adj value to the largest value hardware can handle.
As there is no min_adj value available (-max_adj is used instead), this will
also prevent stopping the clock intentionally. It's probably not a big deal,
other igb MAC types don't support stopping the clock, either.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
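
The overflow described above can be illustrated with a little integer
arithmetic.  The constants below are assumptions made for this sketch
and are not quoted from the commit: a 24-bit increment field and a
nominal increment of 2^23, picked so that doubling the nominal rate
lands exactly one past the mask.  incvalue_for() is a hypothetical
helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; see the assumptions stated above. */
#define INCVALUE_MASK  ((UINT64_C(1) << 24) - 1)  /* assumed field width */
#define INCVALUE_NOM   (UINT64_C(1) << 23)        /* assumed nominal increment */
#define PPB_SCALE      UINT64_C(1000000000)

/* Increment the field would hold after speeding the clock up by ppb. */
uint64_t incvalue_for(uint64_t ppb)
{
	uint64_t rate;

	assert(ppb <= PPB_SCALE);             /* keeps the product within 64 bits */
	rate = INCVALUE_NOM * ppb / PPB_SCALE;

	/* Bits above the mask are silently dropped, as in a register write. */
	return (INCVALUE_NOM + rate) & INCVALUE_MASK;
}
```

Under these assumptions an adjustment of 1000000000 ppb yields an
increment of zero, which would stop the clock rather than double its
speed, while 999999999 ppb still fits in the field.
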
11 years ago  igb: make sensor info static
Stephen Hemminger [Wed, 20 Mar 2013 09:06:29 +0000 (09:06 +0000)]
igb: make sensor info static

Trivial sparse warning.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years ago  igb: SR-IOV init reordering
Alex Williamson [Wed, 13 Mar 2013 15:50:29 +0000 (15:50 +0000)]
igb: SR-IOV init reordering

igb is ineffective at setting a lower total VFs because:

int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs)
{
        ...
        /* Shouldn't change if VFs already enabled */
        if (dev->sriov->ctrl & PCI_SRIOV_CTRL_VFE)
                return -EBUSY;

Swap init ordering.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years ago  igb: Fix null pointer dereference
Alex Williamson [Wed, 13 Mar 2013 15:50:24 +0000 (15:50 +0000)]
igb: Fix null pointer dereference

The max_vfs= option has always been self limiting to the number of VFs
supported by the device.  fa44f2f1 added SR-IOV configuration via
sysfs, but in the process broke this self correction factor.  The
failing path is:

igb_probe
  igb_sw_init
    if (max_vfs > 7) {
        adapter->vfs_allocated_count = 7;
    ...
    igb_probe_vfs
    igb_enable_sriov(, max_vfs)
      if (num_vfs > 7) {
        err = -EPERM;
        ...

This leaves vfs_allocated_count = 7 and vf_data = NULL, so we bomb out
when igb_probe finally calls igb_reset.  It seems like a really bad
idea, and somewhat pointless, to set vfs_allocated_count separate from
vf_data, but limiting max_vfs is enough to avoid the null pointer.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
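
The failure mode above is a count that claims 7 VFs while the backing
array is NULL.  A minimal sketch of the safer shape, using hypothetical
types and a hypothetical model_enable_vfs() helper rather than the igb
code, clamps the request once and sets the count only after the data
exists:

```c
#include <assert.h>
#include <stdlib.h>

#define VF_LIMIT 7   /* the limit that appears in the failing path above */

struct vf_model {
	int id;
};

struct adapter_model {
	unsigned int vfs_allocated_count;
	struct vf_model *vf_data;
};

/* Returns 0 on success, -1 on allocation failure. */
int model_enable_vfs(struct adapter_model *ad, unsigned int max_vfs)
{
	/* Clamp once, before anything else looks at the value. */
	unsigned int n = max_vfs > VF_LIMIT ? VF_LIMIT : max_vfs;

	ad->vfs_allocated_count = 0;
	ad->vf_data = NULL;
	if (n == 0)
		return 0;

	ad->vf_data = calloc(n, sizeof(*ad->vf_data));
	if (!ad->vf_data)
		return -1;

	ad->vfs_allocated_count = n;   /* count is set only once data exists */
	assert(ad->vfs_allocated_count <= VF_LIMIT);
	return 0;
}
```

Note that this goes one step further than the commit, which only limits
max_vfs; tying the count to the allocation is the cleanup the message
hints at, shown here as an illustration.
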
11 years ago  igb: fix i350 anti spoofing config
Lior Levy [Tue, 12 Mar 2013 15:49:32 +0000 (15:49 +0000)]
igb: fix i350 anti spoofing config

Fix a problem in i350 where anti spoofing configuration was written into a
wrong register.

Signed-off-by: Lior Levy <lior.levy@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years ago  ixgbevf: don't release the soft entries
xunleer [Tue, 5 Mar 2013 07:44:20 +0000 (07:44 +0000)]
ixgbevf: don't release the soft entries

When the ixgbevf driver is opened the request to allocate MSIX irq
vectors may fail.  In that case the driver will call ixgbevf_down()
which will call ixgbevf_irq_disable() to clear the HW interrupt
registers and calls synchronize_irq() using the msix_entries pointer in
the adapter structure.  However, when the function to request the MSIX
irq vectors failed it had already freed the msix_entries which causes
an OOPs from using the NULL pointer in synchronize_irq().

The calls to pci_disable_msix() and to free the msix_entries memory
should not occur if device open fails.  Instead they should be called
during device driver removal to balance with the call to
pci_enable_msix() and the call to allocate msix_entries memory
during the device probe and driver load.

Signed-off-by: Li Xun <xunleer.li@huawei.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years ago  Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 26 Mar 2013 01:03:34 +0000 (18:03 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
 "A single bugfix which prevents a non-functional timer device from
  being selected as the fallback device, which is supposed to serve
  timer interrupts on behalf of non-functional devices ..."

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clockevents: Don't allow dummy broadcast timers

11 years ago  ipv6: fix bad free of addrconf_init_net
Hong Zhiguo [Mon, 25 Mar 2013 17:52:45 +0000 (01:52 +0800)]
ipv6: fix bad free of addrconf_init_net

Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  unix: fix a race condition in unix_release()
Paul Moore [Mon, 25 Mar 2013 03:18:33 +0000 (03:18 +0000)]
unix: fix a race condition in unix_release()

As reported by Jan, and others over the past few years, there is a
race condition caused by unix_release setting the sock->sk pointer
to NULL before properly marking the socket as dead/orphaned.  This
can cause a problem with the LSM hook security_unix_may_send() if
there is another socket attempting to write to this partially
released socket in between when sock->sk is set to NULL and it is
marked as dead/orphaned.  This patch fixes this by only setting
sock->sk to NULL after the socket has been marked as dead; I also
take the opportunity to make unix_release_sock() a void function
as it only ever returned 0/success.

Dave, I think this one should go on the -stable pile.

Special thanks to Jan for coming up with a reproducer for this
problem.

Reported-by: Jan Stancek <jan.stancek@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago  Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland...
Linus Torvalds [Mon, 25 Mar 2013 16:44:39 +0000 (09:44 -0700)]
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband/rdma fixes from Roland Dreier:
 "Small batch of InfiniBand/RDMA fixes for 3.9:

   - Fix for TX lockup in IPoIB
   - QLogic -> Intel update for qib driver
   - Small static checker fix for qib
   - Fix error path return value in cxgb4"

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/qib: change QLogic to Intel
  IB/ipath: Silence a static checker warning
  IPoIB: Fix send lockup due to missed TX completion
  RDMA/cxgb4: Fix error return code in create_qp()

11 years ago  Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds [Mon, 25 Mar 2013 16:26:10 +0000 (09:26 -0700)]
Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC bug fixes from Arnd Bergmann:
 "Four patches for arm-soc this week:

   - Kevin Hilman is no longer reachable under his previous email
     address.  He submitted the patch earlier, but nobody felt
     responsible to pick it up.

   - One Tegra fix for an incorrect register address in device tree.

   - IMX multiplatform support exposes a configuration option that leads
     to unbootable kernels on all other machines and that needs to
     depend on that platform.

   - A nontrivial bug fix for the setup of the mxs video output."

* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  MAINTAINERS: update email address for Kevin Hilman
  ARM: tegra: fix register address of slink controller
  ARM: imx: add dependency check for DEBUG_IMX_UART_PORT
  ARM: video: mxs: Fix mxsfb misconfiguring VDCTRL0

11 years ago  Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Mon, 25 Mar 2013 16:25:12 +0000 (09:25 -0700)]
Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfixes from J Bruce Fields:
 "Fixes for a couple mistakes in the new DRC code.  And thanks to Kent
  Overstreet for noticing we've been sync'ing the wrong range on stable
  writes since 3.8."

* 'for-3.9' of git://linux-nfs.org/~bfields/linux:
  nfsd: fix bad offset use
  nfsd: fix startup order in nfsd_reply_cache_init
  nfsd: only unhash DRC entries that are in the hashtable

11 years ago  SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked
Trond Myklebust [Mon, 25 Mar 2013 15:23:40 +0000 (11:23 -0400)]
SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked

We need to be careful when testing task->tk_waitqueue in
rpc_wake_up_task_queue_locked, because it can be changed while we
are holding the queue->lock.
By adding appropriate memory barriers, we can ensure that it is safe to
test task->tk_waitqueue for equality if the RPC_TASK_QUEUED bit is set.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
11 years ago  Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Mon, 25 Mar 2013 09:57:32 +0000 (02:57 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Exynos and Intel fixes.

  The intel fixes are fairly straightforward, mostly reverts due to bugs
  found.  The exynos one is a bit larger since they found some issues
  with the G2D engine and iommu interaction, and needed to verify the
  operations a lot better than they were previously, otherwise a user
  app can just crash the kernel with an iommu fault."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  Revert "drm/i915: write backlight harder"
  drm/i915: don't disable the power well yet
  Revert "drm/i915: set TRANSCODER_EDP even earlier"
  drm/exynos: Check g2d cmd list for g2d restrictions
  drm/exynos: Add a new function to get gem buffer size
  drm/exynos: Deal with g2d buffer info more efficiently
  drm/exynos: Clean up some G2D codes for readability
  drm/exynos: Fix G2D core malfunctioning issue
  drm/exynos: clear node object type at gem unmap
  drm/exynos: Fix error routine to getting dma addr.
  drm/exynos: Replaced kzalloc & memcpy with kmemdup
  drm/exynos: fimd: calculate the correct address offset
  drm/exynos: Make mixer_check_timing static
  drm/exynos: modify the compatible string for exynos fimd

11 years agoMerge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel...
Dave Airlie [Mon, 25 Mar 2013 02:20:00 +0000 (12:20 +1000)]
Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into HEAD

Daniel writes:
"Just three revert/disable by default patches, one of them cc: stable
(since the offending commit was cc: stable, too)."

* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
  Revert "drm/i915: write backlight harder"
  drm/i915: don't disable the power well yet
  Revert "drm/i915: set TRANSCODER_EDP even earlier"

11 years agoMerge branch 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Dave Airlie [Mon, 25 Mar 2013 02:19:10 +0000 (12:19 +1000)]
Merge branch 'exynos-drm-fixes' of git://git./linux/kernel/git/daeinki/drm-exynos into HEAD

Inki writes:
Includes bug fixes and code cleanups.
It also takes some restrictions of the G2D hardware into account.
With this, the malfunction and page fault issues in the g2d driver
should be fixed.

* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
  drm/exynos: Check g2d cmd list for g2d restrictions
  drm/exynos: Add a new function to get gem buffer size
  drm/exynos: Deal with g2d buffer info more efficiently
  drm/exynos: Clean up some G2D codes for readability
  drm/exynos: Fix G2D core malfunctioning issue
  drm/exynos: clear node object type at gem unmap
  drm/exynos: Fix error routine to getting dma addr.
  drm/exynos: Replaced kzalloc & memcpy with kmemdup
  drm/exynos: fimd: calculate the correct address offset
  drm/exynos: Make mixer_check_timing static
  drm/exynos: modify the compatible string for exynos fimd

11 years agotcp: undo spurious timeout after SACK reneging
Yuchung Cheng [Sun, 24 Mar 2013 10:42:25 +0000 (10:42 +0000)]
tcp: undo spurious timeout after SACK reneging

On SACK reneging the sender immediately retransmits and forces a
timeout but disables Eifel (undo). If the (buggy) receiver does not
drop any packet this can trigger a false slow-start retransmit storm
driven by the ACKs of the original packets. This can be detected with
undo and TCP timestamps.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: fix assignment of signed expression to unsigned variable
Kumar Amit Mehta [Sat, 23 Mar 2013 20:10:25 +0000 (20:10 +0000)]
bnx2x: fix assignment of signed expression to unsigned variable

fix for incorrect assignment of signed expression to unsigned variable.

Signed-off-by: Kumar Amit Mehta <gmate.amit@gmail.com>
Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobridge: fix crash when set mac address of br interface
Hong zhi guo [Sat, 23 Mar 2013 02:27:50 +0000 (02:27 +0000)]
bridge: fix crash when set mac address of br interface

When I tried to set the mac address of a bridge interface to a mac
address which was already learned on this bridge, I got a system hang.

The cause is straightforward: function br_fdb_change_mac_address
calls fdb_insert with NULL source nbp. Then an fdb lookup is
performed. If an fdb entry is found and it's local, it's OK. But
if it's not local, source is dereferenced for printk without NULL
check.

Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago8021q: fix a potential use-after-free
Cong Wang [Fri, 22 Mar 2013 19:14:07 +0000 (19:14 +0000)]
8021q: fix a potential use-after-free

vlan_vid_del() could possibly free ->vlan_info after a RCU grace
period, however, we may still refer to the freed memory area
by 'grp' pointer. Found by code inspection.

This patch moves vlan_vid_del() as late as possible.

Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: remove a WARN_ON() in net_enable_timestamp()
Eric Dumazet [Fri, 22 Mar 2013 14:38:28 +0000 (14:38 +0000)]
net: remove a WARN_ON() in net_enable_timestamp()

The WARN_ON(in_interrupt()) in net_enable_timestamp() can give a false
positive in the socket clone path, which runs from softirq context:

[ 3641.624425] WARNING: at net/core/dev.c:1532 net_enable_timestamp+0x7b/0x80()
[ 3641.668811] Call Trace:
[ 3641.671254]  <IRQ>  [<ffffffff80286817>] warn_slowpath_common+0x87/0xc0
[ 3641.677871]  [<ffffffff8028686a>] warn_slowpath_null+0x1a/0x20
[ 3641.683683]  [<ffffffff80742f8b>] net_enable_timestamp+0x7b/0x80
[ 3641.689668]  [<ffffffff80732ce5>] sk_clone_lock+0x425/0x450
[ 3641.695222]  [<ffffffff8078db36>] inet_csk_clone_lock+0x16/0x170
[ 3641.701213]  [<ffffffff807ae449>] tcp_create_openreq_child+0x29/0x820
[ 3641.707663]  [<ffffffff807d62e2>] ? ipt_do_table+0x222/0x670
[ 3641.713354]  [<ffffffff807aaf5b>] tcp_v4_syn_recv_sock+0xab/0x3d0
[ 3641.719425]  [<ffffffff807af63a>] tcp_check_req+0x3da/0x530
[ 3641.724979]  [<ffffffff8078b400>] ? inet_hashinfo_init+0x60/0x80
[ 3641.730964]  [<ffffffff807ade6f>] ? tcp_v4_rcv+0x79f/0xbe0
[ 3641.736430]  [<ffffffff807ab9bd>] tcp_v4_do_rcv+0x38d/0x4f0
[ 3641.741985]  [<ffffffff807ae14a>] tcp_v4_rcv+0xa7a/0xbe0

It's safe at this point because the parent socket owns a reference
on the netstamp_needed, so we can't have a 0 -> 1 transition, which
requires locking a mutex.

Instead of refining the check, let's remove it, as all known callers
are safe. If it ever changes in the future, static_key_slow_inc()
will complain anyway.

Reported-by: Laurent Chavey <chavey@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge tag 'pinctrl-fixes-for-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 24 Mar 2013 17:11:29 +0000 (10:11 -0700)]
Merge tag 'pinctrl-fixes-for-v3.9' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pinctrl fixes from Linus Walleij:
 "Here are a few pinctrl fixes for the v3.9 rc series:
   - Usecount bounds checking so we do not go below zero on mux
     usecounts.
   - Loop range checking in GPIO ranges in the DT range parser.
   - Proper print in debugfs for pinconf state.
   - Fix compilation bug in generic pinconf code.
   - Minor bugfixes to abx500 and mvebu drivers."

* tag 'pinctrl-fixes-for-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinmux: forbid mux_usecount to be set at UINT_MAX
  pinctrl: mvebu: fix checking for SoC specific controls
  pinctrl: generic: Fix compilation error
  pinctrl: Print the correct information in debugfs pinconf-state file
  pinctrl: abx500: Fix checking if pin use AlternateFunction register
  gpio: fix wrong checking condition for gpio range

11 years agoMerge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Sun, 24 Mar 2013 17:10:34 +0000 (10:10 -0700)]
Merge branch 'x86/urgent' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Peter Anvin:
 "A collection of minor fixes, more EFI variables paranoia
  (anti-bricking) plus the ability to disable the pstore either as a
  runtime default or completely, due to bricking concerns."

* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efivars: Fix check for CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE
  x86, microcode_intel_early: Mark apply_microcode_early() as cpuinit
  efivars: Handle duplicate names from get_next_variable()
  efivars: explicitly calculate length of VariableName
  efivars: Add module parameter to disable use as a pstore backend
  efivars: Allow disabling use as a pstore backend
  x86-32, microcode_intel_early: Fix crash with CONFIG_DEBUG_VIRTUAL
  x86-64: Fix the failure case in copy_user_handle_tail()

11 years agoRevert "drm/i915: write backlight harder"
Daniel Vetter [Fri, 22 Mar 2013 14:44:46 +0000 (15:44 +0100)]
Revert "drm/i915: write backlight harder"

This reverts commit cf0a6584aa6d382f802f2c3cacac23ccbccde0cd.

Turns out that cargo-culting breaks systems. Note that we can't revert
further, since

commit 770c12312ad617172b1a65b911d3e6564fc5aca8
Author: Takashi Iwai <tiwai@suse.de>
Date:   Sat Aug 11 08:56:42 2012 +0200

    drm/i915: Fix blank panel at reopening lid

fixed a regression in 3.6-rc kernels for which we've never figured out
the exact root cause. But some further inspection of the backlight
code reveals that it's seriously lacking locking. And especially the
asle backlight update is known to get fired (through some smm magic)
when writing specific backlight control registers. So the possibility
of suffering from races is rather real.

Until those races are fixed I don't think it makes sense to try
further hacks. Which sucks a bit, but sometimes that's how it is :(

References: http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg18788.html
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=47941
Tested-by: Takashi Iwai <tiwai@suse.de>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: stable@vger.kernel.org (the reverted commit was cc: stable, too)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
11 years agodrm/i915: don't disable the power well yet
Paulo Zanoni [Fri, 22 Mar 2013 17:07:23 +0000 (14:07 -0300)]
drm/i915: don't disable the power well yet

We're still not 100% ready to disable the power well, so don't disable
it for now. When we disable it we break the audio driver (because some
of the audio registers are on the power well) and machines with eDP on
port D (because it doesn't use TRANSCODER_EDP).

Also, instead of just reverting the code, add a Kernel option to let
us disable it if we want. This will allow us to keep developing and
testing the feature while it's not enabled.

This fixes problems caused by the following commit:
  commit d6dd9eb1d96d2b7345fe4664066c2b7ed86da898
  Author: Daniel Vetter <daniel.vetter@ffwll.ch>
  Date:   Tue Jan 29 16:35:20 2013 -0200
       drm/i915: dynamic Haswell display power well support

References: http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg18788.html
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
11 years agoRevert "drm/i915: set TRANSCODER_EDP even earlier"
Daniel Vetter [Fri, 22 Mar 2013 09:53:40 +0000 (10:53 +0100)]
Revert "drm/i915: set TRANSCODER_EDP even earlier"

This reverts commit cc464b2a17c59adedbdc02cc54341d630354edc3.

The reason is that Takashi Iwai reported a regression bisected to this
commit:

http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg18788.html

His machine has eDP on port D (usual desktop all-in-one setup), which
intel_dp.c identifies as an eDP panel, but the hsw ddi code
mishandles.

Closer inspection of the code reveals that haswell_crtc_mode_set also
checks intel_encoder_is_pch_edp when setting is_cpu_edp. On haswell
that doesn't make much sense (since there's no edp on the pch), but
what this function _really_ checks is whether that edp connector is on
port A or port D. It's just that on ilk-ivb port D was on the pch ...

So that explains why this seemingly innocent change killed eDP on port
D. Furthermore it looks like everything else accidentally works, since
we've never enabled eDP on port D support for hsw intentionally (e.g.
we still register the HDMI output for port D in that case).

But in retrospect I also don't like that this leaks highly platform
specific details into common code, and the reason is that the drm
vblank layer sucks. So instead I think we should:
- move the cpu_transcoder into the dynamic pipe_config tracking (once
  that's merged).
- fix up the drm vblank layer to finally deal with kms crtc objects
  instead of int pipes.

v2: Pimp commit message with the better diagnosis as discussed with
Paulo on irc.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
11 years agoMerge tag 'efi-for-3.9-rc4' into x86/urgent
H. Peter Anvin [Sun, 24 Mar 2013 04:49:51 +0000 (21:49 -0700)]
Merge tag 'efi-for-3.9-rc4' into x86/urgent

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
11 years agoLinux 3.9-rc4
Linus Torvalds [Sat, 23 Mar 2013 23:52:44 +0000 (16:52 -0700)]
Linux 3.9-rc4

11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Linus Torvalds [Sat, 23 Mar 2013 23:51:55 +0000 (16:51 -0700)]
Merge git://git./linux/kernel/git/nab/target-pending

Pull SCSI target fixes from Nicholas Bellinger:
 "These are mostly minor fixes this time around.  The iscsi-target CHAP
  big-endian bugfix and bump FD_MAX_SECTORS=2048 default patch to allow
  1MB sized I/Os for FILEIO backends on >= v3.5 code are both CC'ed to
  stable.

  Also, there is a persistent reservations regression that has recently
  been reported for >= v3.8.x code, that is currently being tracked down
  for v3.9."

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target/pscsi: Reject cross page boundary case in pscsi_map_sg
  target/file: Bump FD_MAX_SECTORS to 2048 to handle 1M sized I/Os
  tcm_vhost: Flush vhost_work in vhost_scsi_flush()
  tcm_vhost: Add missed lock in vhost_scsi_clear_endpoint()
  target: fix possible memory leak in core_tpg_register()
  target/iscsi: Fix mutual CHAP auth on big-endian arches
  target_core_sbc: use noop for SYNCHRONIZE_CACHE

11 years agoMerge tag 'md-3.9-fixes' of git://neil.brown.name/md
Linus Torvalds [Sat, 23 Mar 2013 22:49:49 +0000 (15:49 -0700)]
Merge tag 'md-3.9-fixes' of git://neil.brown.name/md

Pull md fixes from NeilBrown:
 "A few bugfixes for md

   - recent regressions in raid5
   - recent regressions in dmraid
   - a few instances of CONFIG_MULTICORE_RAID456 linger

  Several tagged for -stable"

* tag 'md-3.9-fixes' of git://neil.brown.name/md:
  md: remove CONFIG_MULTICORE_RAID456 entirely
  md/raid5: ensure sync and DISCARD don't happen at the same time.
  MD: Prevent sysfs operations on uninitialized kobjects
  MD RAID5: Avoid accessing gendisk or queue structs when not available
  md/raid5: schedule_construction should abort if nothing to do.

11 years agoMerge tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik...
Linus Torvalds [Sat, 23 Mar 2013 19:33:36 +0000 (12:33 -0700)]
Merge tag 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

Pull libata updates from Jeff Garzik:
 "Simple stuff.  See one-line summaries."

* tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  pata_samsung_cf: use module_platform_driver_probe()
  [libata] Avoid specialized TLA's in ZPODD's Kconfig
  libata-acpi.c: fix copy and paste mistake in ata_acpi_register_power_resource
  sata_fsl: Remove redundant NULL check before kfree
  ahci: Add Device IDs for Intel Wellsburg PCH
  ata_piix: Add MODULE_PARM_DESC to prefer_ms_hyperv

11 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sat, 23 Mar 2013 19:32:14 +0000 (12:32 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "One bugfix for the tegra driver.  Two updates regarding email
  addresses and MAINTAINERS which I like to have up-to-date so people
  can be reached immediately.  While we are here, there is one PCI_ID
  addition."

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  MAINTAINERS: add maintainer entry for atmel i2c driver
  i2c: Fix my e-mail address in drivers and documentation
  i2c: iSMT: add Intel Avoton DeviceIDs
  i2c: tegra: check the clk_prepare_enable() return value

11 years agoMerge git://www.linux-watchdog.org/linux-watchdog
Linus Torvalds [Sat, 23 Mar 2013 19:30:39 +0000 (12:30 -0700)]
Merge git://www.linux-watchdog.org/linux-watchdog

Pull watchdog fixes from Wim Van Sebroeck:
 "Fix a boot issue and correct the AcpiMmioSel bitmask in the
  sp5100_tco watchdog device driver"

* git://www.linux-watchdog.org/linux-watchdog:
  watchdog: sp5100_tco: Set the AcpiMmioSel bitmask value to 1 instead of 2
  watchdog: sp5100_tco: Remove code that may cause a boot failure

11 years agoKMS: fix EDID detailed timing frame rate
Torsten Duwe [Sat, 23 Mar 2013 14:39:34 +0000 (15:39 +0100)]
KMS: fix EDID detailed timing frame rate

When KMS has parsed an EDID "detailed timing", it leaves the frame rate
zeroed.  Consecutive (debug-) output of that mode thus yields 0 for
vsync.  This simple fix also speeds up future invocations of
drm_mode_vrefresh().

While it is debatable whether this qualifies as a -stable fix I'd apply
it for consistency's sake; drm_helper_probe_single_connector_modes()
does the same thing already for all probed modes.

Cc: stable@vger.kernel.org
Signed-off-by: Torsten Duwe <duwe@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoKMS: fix EDID detailed timing vsync parsing
Torsten Duwe [Sat, 23 Mar 2013 14:38:22 +0000 (15:38 +0100)]
KMS: fix EDID detailed timing vsync parsing

EDID spreads some values across multiple bytes; bit-fiddling is needed
to retrieve these.  The current code to parse "detailed timings" has a
cut&paste error that results in a vsync offset of at most 15 lines
instead of 63.

See

   http://en.wikipedia.org/wiki/EDID

and in the "EDID Detailed Timing Descriptor" section, bytes 10+11 show
why that needs to be a left shift.
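
The bit-fiddling can be sketched as follows. This is a minimal
illustration, not the kernel's parser: the function names are
hypothetical, and the byte numbers are assumed to be relative to the
start of the 18-byte descriptor, as in the Wikipedia table. The low four
bits of the vsync offset sit in the upper nibble of byte 10, and the high
two bits sit in bits 3-2 of byte 11, so they must be shifted left to land
in bits 5-4 of the result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: combine the 6-bit vsync offset from bytes 10 and 11. */
unsigned int edid_vsync_offset(uint8_t byte10, uint8_t byte11)
{
	/* Byte 10, bits 7-4: low four bits of the offset. */
	unsigned int lo = byte10 >> 4;
	/* Byte 11, bits 3-2: high two bits, moved up to bits 5-4. */
	unsigned int hi = (byte11 & 0x0cu) << 2;

	return hi | lo;
}

/* What a right shift in place of the left shift computes instead. */
unsigned int edid_vsync_offset_wrong(uint8_t byte10, uint8_t byte11)
{
	return ((byte11 & 0x0cu) >> 2) | (byte10 >> 4);
}

/* Usage: all six offset bits set. */
void edid_vsync_offset_demo(void)
{
	assert(edid_vsync_offset(0xf0, 0x0c) == 63);
	/* The high bits collapse into the low nibble, capping the value at 15. */
	assert(edid_vsync_offset_wrong(0xf0, 0x0c) == 15);
}
```

With the right shift the two high bits overlap the low nibble, which is
why the result can never exceed 15.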

Cc: stable@vger.kernel.org
Signed-off-by: Torsten Duwe <duwe@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoMerge branches 'cxgb4', 'ipoib' and 'qib' into for-next
Roland Dreier [Sat, 23 Mar 2013 01:08:03 +0000 (18:08 -0700)]
Merge branches 'cxgb4', 'ipoib' and 'qib' into for-next

11 years agoIB/qib: change QLogic to Intel
Vinit Agnihotri [Thu, 14 Mar 2013 18:13:41 +0000 (18:13 +0000)]
IB/qib: change QLogic to Intel

These changes modify the qib driver as part of acquiring
the InfiniBand assets of QLogic.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
11 years agoIB/ipath: Silence a static checker warning
Dan Carpenter [Mon, 18 Mar 2013 20:25:26 +0000 (20:25 +0000)]
IB/ipath: Silence a static checker warning

I have a static checker which complains that 0x255 is too high for
the "dev->opstats[opcode]" array.  It turns out that the hardware
has already validated the opcode at this point so it can't actually
overflow.

However, silencing the warning is good and this matches how the
opcode is treated in qib_ib_rcv() as well.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
11 years agoIPoIB: Fix send lockup due to missed TX completion
Mike Marciniszyn [Tue, 26 Feb 2013 15:46:27 +0000 (15:46 +0000)]
IPoIB: Fix send lockup due to missed TX completion

Commit f0dc117abdfa ("IPoIB: Fix TX queue lockup with mixed UD/CM
traffic") attempts to solve an issue where unprocessed UD send
completions can deadlock the netdev.

The patch doesn't fully resolve the issue because if more than half
the tx_outstanding's were UD and all of the destinations are RC
reachable, arming the CQ doesn't solve the issue.

This patch uses the IB_CQ_REPORT_MISSED_EVENTS on the
ib_req_notify_cq().  If the rc is above 0, the UD send cq completion
callback is called directly to re-arm the send completion timer.

This issue is seen in very large parallel filesystem deployments
and the patch has been shown to correct the issue.

Cc: <stable@vger.kernel.org>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
11 years agoRDMA/cxgb4: Fix error return code in create_qp()
Wei Yongjun [Fri, 15 Mar 2013 09:42:12 +0000 (09:42 +0000)]
RDMA/cxgb4: Fix error return code in create_qp()

Fix to return a negative error code from the error handling case
instead of 0, as returned elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
11 years agoMerge git://git.infradead.org/users/willy/linux-nvme
Linus Torvalds [Fri, 22 Mar 2013 23:43:53 +0000 (16:43 -0700)]
Merge git://git.infradead.org/users/willy/linux-nvme

Pull NVMe driver update from Matthew Wilcox:
 "These patches have mostly been baking for a few months; sorry I didn't
  get them in during the merge window.  They're all bug fixes, except
  for the addition of the SMART log and the addition to MAINTAINERS."

* git://git.infradead.org/users/willy/linux-nvme:
  NVMe: Add namespaces with no LBA range feature
  MAINTAINERS: Add entry for the NVMe driver
  NVMe: Initialize iod nents to 0
  NVMe: Define SMART log
  NVMe: Add result to nvme_get_features
  NVMe: Set result from user admin command
  NVMe: End queued bio requests when freeing queue
  NVMe: Free cmdid on nvme_submit_bio error

11 years agoMerge branch 'akpm' (fixes from Andrew)
Linus Torvalds [Fri, 22 Mar 2013 23:41:44 +0000 (16:41 -0700)]
Merge branch 'akpm' (fixes from Andrew)

Merge misc fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mqueue: sys_mq_open: do not call mnt_drop_write() if read-only
  mm/hotplug: only free wait_table if it's allocated by vmalloc
  dma-debug: update DMA debug API to better handle multiple mappings of a buffer
  dma-debug: fix locking bug in check_unmap()
  drivers/rtc/rtc-at91rm9200.c: use a variable for storing IMR
  drivers/video/ep93xx-fb.c: include <linux/io.h> for devm_ioremap()
  drivers/rtc/rtc-da9052.c: fix for rtc device registration
  mm: zone_end_pfn is too small
  poweroff: change orderly_poweroff() to use schedule_work()
  mm/hugetlb: fix total hugetlbfs pages count when using memory overcommit accouting
  printk: Provide a wake_up_klogd() off-case
  irq_work.h: fix warning when CONFIG_IRQ_WORK=n

11 years agomqueue: sys_mq_open: do not call mnt_drop_write() if read-only
Vladimir Davydov [Fri, 22 Mar 2013 22:04:51 +0000 (15:04 -0700)]
mqueue: sys_mq_open: do not call mnt_drop_write() if read-only

mnt_drop_write() must be called only if mnt_want_write() succeeded,
otherwise the mnt_writers counter will diverge.
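
The balance rule can be sketched with a toy counter. Everything below is
hypothetical (struct demo_mount and the demo_* functions are stand-ins,
not the VFS API); it only shows that the drop has to be conditional on
the earlier want having succeeded.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a mount; "writers" plays the role of mnt_writers. */
struct demo_mount {
	bool read_only;
	int writers;
};

int demo_want_write(struct demo_mount *m)
{
	if (m->read_only)
		return -EROFS;	/* refused: the counter is not taken */
	m->writers++;
	return 0;
}

void demo_drop_write(struct demo_mount *m)
{
	m->writers--;
}

/* Balanced pattern: drop only what was successfully taken. */
void demo_open(struct demo_mount *m)
{
	int ro = demo_want_write(m);

	/* ... the open itself would happen here ... */

	if (!ro)
		demo_drop_write(m);
}

void demo_open_check(void)
{
	struct demo_mount rw = { false, 0 };
	struct demo_mount ro = { true, 0 };

	demo_open(&rw);
	demo_open(&ro);
	assert(rw.writers == 0);
	assert(ro.writers == 0);	/* an unconditional drop would leave -1 */
}
```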

mnt_writers counters are used to check if remounting FS as read-only is
OK, so after an extra mnt_drop_write() call, it would be impossible to
remount mqueue FS as read-only.  Besides, on umount a warning would be
printed like this one:

  =====================================
  [ BUG: bad unlock balance detected! ]
  3.9.0-rc3 #5 Not tainted
  -------------------------------------
  a.out/12486 is trying to release lock (sb_writers) at:
  mnt_drop_write+0x1f/0x30
  but there are no more locks to release!

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomm/hotplug: only free wait_table if it's allocated by vmalloc
Jianguo Wu [Fri, 22 Mar 2013 22:04:50 +0000 (15:04 -0700)]
mm/hotplug: only free wait_table if it's allocated by vmalloc

zone->wait_table may be allocated from bootmem, it can not be freed.

Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agodma-debug: update DMA debug API to better handle multiple mappings of a buffer
Alexander Duyck [Fri, 22 Mar 2013 22:04:49 +0000 (15:04 -0700)]
dma-debug: update DMA debug API to better handle multiple mappings of a buffer

There were reports of the igb driver unmapping buffers without calling
dma_mapping_error.  On closer inspection issues were found in the DMA
debug API and how it handled multiple mappings of the same buffer.

The issue I found is the fact that the debug_dma_mapping_error would
only set the map_err_type to MAP_ERR_CHECKED in the case that there was
only one match for device and device address.  However in the case of
non-IOMMU, multiple addresses existed and as a result it was not setting
this field once a second mapping was instantiated.  I have resolved this
by changing the search so that it instead will now set MAP_ERR_CHECKED
on the first buffer that matches the device and DMA address that is
currently in the state MAP_ERR_NOT_CHECKED.

A secondary side effect of this patch is that in the case of multiple
buffers using the same address only the last mapping will have a valid
map_err_type.  The previous mappings will all end up with map_err_type
set to MAP_ERR_CHECKED because of the dma_mapping_error call in
debug_dma_map_page.  However this behavior may be preferable as it means
you will likely only see one real error per multi-mapped buffer, versus
the current behavior of multiple false errors per multi-mapped buffer.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Reviewed-by: Shuah Khan <shuah.khan@hp.com>
Tested-by: Shuah Khan <shuah.khan@hp.com>
Cc: Jakub Kicinski <kubakici@wp.pl>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agodma-debug: fix locking bug in check_unmap()
Alexander Duyck [Fri, 22 Mar 2013 22:04:48 +0000 (15:04 -0700)]
dma-debug: fix locking bug in check_unmap()

In check_unmap() it is possible to get into a dead-locked state if
dma_mapping_error is called.  The problem is that the bucket is locked in
check_unmap, and locked again by debug_dma_mapping_error which is called
by dma_mapping_error.  To resolve that we must release the lock on the
bucket before making the call to dma_mapping_error.

[akpm@linux-foundation.org: restore 80-col trickery to be consistent with the rest of the file]
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Reviewed-by: Shuah Khan <shuah.khan@hp.com>
Tested-by: Shuah Khan <shuah.khan@hp.com>
Cc: Jakub Kicinski <kubakici@wp.pl>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agodrivers/rtc/rtc-at91rm9200.c: use a variable for storing IMR
Nicolas Ferre [Fri, 22 Mar 2013 22:04:47 +0000 (15:04 -0700)]
drivers/rtc/rtc-at91rm9200.c: use a variable for storing IMR

On some revisions of AT91 SoCs, the RTC IMR register is not working.
Instead of elaborating a workaround for that specific SoC or IP version,
we simply use a software variable to store the Interrupt Mask Register
and modify it for each enabling/disabling of an interrupt.  The overhead
of this is negligible anyway.
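
A minimal sketch of the shadow-variable idea, assuming hypothetical bit
names and helper functions (none of these identifiers come from the
driver): the mask is tracked in software on every enable and disable, so
the unreliable hardware IMR never has to be read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical interrupt bits, for illustration only. */
#define DEMO_IRQ_ALARM  (1u << 1)
#define DEMO_IRQ_SECOND (1u << 2)

/* Software copy of the interrupt mask; the hardware IMR is never read. */
static uint32_t demo_imr_shadow;

void demo_irq_enable(uint32_t bits)
{
	/* A real driver would also write "bits" to the enable register here. */
	demo_imr_shadow |= bits;
}

void demo_irq_disable(uint32_t bits)
{
	/* A real driver would also write "bits" to the disable register here. */
	demo_imr_shadow &= ~bits;
}

/* Replaces every read of the hardware IMR. */
uint32_t demo_irq_mask(void)
{
	return demo_imr_shadow;
}

void demo_irq_check(void)
{
	demo_irq_disable(~0u);	/* start from a known, fully masked state */
	demo_irq_enable(DEMO_IRQ_ALARM | DEMO_IRQ_SECOND);
	demo_irq_disable(DEMO_IRQ_SECOND);
	assert(demo_irq_mask() == DEMO_IRQ_ALARM);
}
```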

The interrupt mask register (IMR) for the RTC is broken on the AT91SAM9x5
sub-family of SoCs (good overview of the members here:
http://www.eewiki.net/display/linuxonarm/AT91SAM9x5 ).  The "user visible
effect" is the RTC doesn't work.

That sub-family is less than two years old and only has devicetree (DT)
support and came online circa lk 3.7 .  The dust is yet to settle on the
DT stuff at least for AT91 SoCs (translation: lots of stuff is still
broken, so much that it is hard to know where to start).

The fix in the patch is pretty simple: just shadow the silicon IMR
register with a variable in the driver.  Some older SoCs (pre-DT) use the
rtc-at91rm9200 driver (e.g. obviously the AT91RM9200) and they should
not be impacted by the change.  There shouldn't be a large volume of
interrupts associated with a RTC.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Reported-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agodrivers/video/ep93xx-fb.c: include <linux/io.h> for devm_ioremap()
H Hartley Sweeten [Fri, 22 Mar 2013 22:04:45 +0000 (15:04 -0700)]
drivers/video/ep93xx-fb.c: include <linux/io.h> for devm_ioremap()

Commit be8678149701 ("drivers/video/ep93xx-fb.c: use devm_ functions")
introduced a build error:

  drivers/video/ep93xx-fb.c: In function 'ep93xxfb_probe':
  drivers/video/ep93xx-fb.c:532: error: implicit declaration of function 'devm_ioremap'
  drivers/video/ep93xx-fb.c:533: warning: assignment makes pointer from integer without a cast

Include <linux/io.h> to pickup the declaration of 'devm_ioremap'.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Acked-by: Ryan Mallon <rmallon@gmail.com>
Cc: Damien Cassou <damien.cassou@lifl.fr>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agodrivers/rtc/rtc-da9052.c: fix for rtc device registration
Ashish Jangam [Fri, 22 Mar 2013 22:04:44 +0000 (15:04 -0700)]
drivers/rtc/rtc-da9052.c: fix for rtc device registration

Add support for the virtual irq, since MFD now only handles virtual irqs.
Without this patch the rtc device will fail in registration.

(akpm: Ashish has a different version which will be needed for 3.8.x and
earlier kernels)

Signed-off-by: Ashish <ashish.jangam@kpitcummins.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomm: zone_end_pfn is too small
Russ Anderson [Fri, 22 Mar 2013 22:04:43 +0000 (15:04 -0700)]
mm: zone_end_pfn is too small

Booting with 32 TBytes memory hits BUG at mm/page_alloc.c:552! (output
below).

The key hint is "page 4294967296 outside zone".
4294967296 = 0x100000000 (bit 32 is set).

The problem is in include/linux/mmzone.h:

  530 static inline unsigned zone_end_pfn(const struct zone *zone)
  531 {
  532         return zone->zone_start_pfn + zone->spanned_pages;
  533 }

zone_end_pfn is "unsigned" (32 bits).  Changing it to "unsigned long"
(64 bits) fixes the problem.
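
The truncation can be reproduced in user space. This is a sketch under
stated assumptions: struct zone_span is a simplified stand-in for struct
zone, fixed-width types model the 32-bit "unsigned" and the 64-bit
"unsigned long" of an x86_64 kernel, and the bracketed range in the
failure output is read as the zone's start and end PFN.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the two struct zone fields involved. */
struct zone_span {
	uint64_t zone_start_pfn;
	uint64_t spanned_pages;
};

/* Models the "unsigned" return type: the sum is truncated to 32 bits. */
uint32_t zone_end_pfn_32(const struct zone_span *z)
{
	return (uint32_t)(z->zone_start_pfn + z->spanned_pages);
}

/* Models the "unsigned long" return type on a 64-bit kernel. */
uint64_t zone_end_pfn_64(const struct zone_span *z)
{
	return z->zone_start_pfn + z->spanned_pages;
}

/* Zone taken from the failure output: [ 4294967296 - 4327469056 ]. */
void zone_end_pfn_demo(void)
{
	struct zone_span z = { 4294967296ULL, 4327469056ULL - 4294967296ULL };

	assert(zone_end_pfn_64(&z) == 4327469056ULL);
	/* Bit 32 is lost, so the end appears to lie below the start. */
	assert(zone_end_pfn_32(&z) == 32501760U);
	assert(zone_end_pfn_32(&z) < z.zone_start_pfn);
}
```

With the truncated end, the first page of the zone already looks like it
is outside the zone, which matches the "page 4294967296 outside zone"
hint.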

zone_end_pfn() was added recently in commit 108bcc96ef70 ("mm: add & use
zone_end_pfn() and zone_spans_pfn()")

Output from the failure.

  No AGP bridge found
  page 4294967296 outside zone [ 4294967296 - 4327469056 ]
  ------------[ cut here ]------------
  kernel BUG at mm/page_alloc.c:552!
  invalid opcode: 0000 [#1] SMP
  Modules linked in:
  CPU 0
  Pid: 0, comm: swapper Not tainted 3.9.0-rc2.dtp+ #10
  RIP: free_one_page+0x382/0x430
  Process swapper (pid: 0, threadinfo ffffffff81942000, task ffffffff81955420)
  Call Trace:
    __free_pages_ok+0x96/0xb0
    __free_pages+0x25/0x50
    __free_pages_bootmem+0x8a/0x8c
    __free_memory_core+0xea/0x131
    free_low_memory_core_early+0x4a/0x98
    free_all_bootmem+0x45/0x47
    mem_init+0x7b/0x14c
    start_kernel+0x216/0x433
    x86_64_start_reservations+0x2a/0x2c
    x86_64_start_kernel+0x144/0x153
  Code: 89 f1 ba 01 00 00 00 31 f6 d3 e2 4c 89 ef e8 66 a4 01 00 e9 2c fe ff ff 0f 0b eb fe 0f 0b 66 66 2e 0f 1f 84 00 00 00 00 00 eb f3 <0f> 0b eb fe 0f 0b 0f 1f 84 00 00 00 00 00 eb f6 0f 0b eb fe 49

Signed-off-by: Russ Anderson <rja@sgi.com>
Reported-by: George Beshers <gbeshers@sgi.com>
Acked-by: Hedi Berriche <hedi@sgi.com>
Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years ago  poweroff: change orderly_poweroff() to use schedule_work()
Oleg Nesterov [Fri, 22 Mar 2013 22:04:41 +0000 (15:04 -0700)]
poweroff: change orderly_poweroff() to use schedule_work()

David said:

    Commit 6c0c0d4d1080 ("poweroff: fix bug in orderly_poweroff()")
    apparently fixes one bug in orderly_poweroff(), but introduces
    another.  The comments on orderly_poweroff() claim it can be called
    from any context - and indeed we call it from interrupt context in
    arch/powerpc/platforms/pseries/ras.c for example.  But since that
    commit this is no longer safe, since call_usermodehelper_fns() is not
    safe in interrupt context without the UMH_NO_WAIT option.

orderly_poweroff() can be used from any context, but UMH_WAIT_EXEC can
sleep.  Move the "force" logic into __orderly_poweroff() and change
orderly_poweroff() to use the global poweroff_work, which simply calls
__orderly_poweroff().

While at it, remove the unneeded "int argc" and change argv_split() to
use GFP_KERNEL.

We use the global "bool poweroff_force" to pass the argument.  This can
obviously affect a previous request that is still pending or running, so
we only allow the "false => true" transition, assuming that the pending
"true" should succeed anyway.  If schedule_work() fails after that, we
know that work->func() has not been called yet, so it must see the new
value.

This means that orderly_poweroff() becomes async even if we do not run
the command, and that it always succeeds: schedule_work() can only fail
if the work is already pending.  We can export __orderly_poweroff() and
change the non-atomic callers which want the old semantics.
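
The flag handling described above can be modelled in userspace.  This is
a rough sketch of the idea, not the kernel implementation: every name
ending in _sketch is hypothetical, work_pending stands in for the
pending state of poweroff_work, and locking and memory ordering are
ignored.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of this is kernel API. */
static bool poweroff_force;     /* models the global "bool poweroff_force" */
static bool work_pending;       /* models the pending state of poweroff_work */

/* Models schedule_work(): fails only if the work is already pending. */
static bool schedule_work_sketch(void)
{
        if (work_pending)
                return false;
        work_pending = true;
        return true;
}

/* Models orderly_poweroff(): never sleeps, it only queues the work. */
static void orderly_poweroff_sketch(bool force)
{
        if (force)              /* only the false => true transition */
                poweroff_force = true;
        (void)schedule_work_sketch();
}

/*
 * Models the worker later calling __orderly_poweroff(poweroff_force).
 * Returns the force value the worker observed.
 */
static bool run_poweroff_work_sketch(void)
{
        work_pending = false;
        return poweroff_force;
}
```

In this model a non-forced request followed by a forced one leaves a
single pending work item that observes force == true, and a later
non-forced request cannot clear the flag again.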

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: Feng Hong <hongfeng@marvell.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years ago  mm/hugetlb: fix total hugetlbfs pages count when using memory overcommit accounting
Wanpeng Li [Fri, 22 Mar 2013 22:04:40 +0000 (15:04 -0700)]
mm/hugetlb: fix total hugetlbfs pages count when using memory overcommit accounting

hugetlb_total_pages is used for overcommit calculations but the current
implementation considers only the default hugetlb page size (which is
either the first defined hugepage size or the one specified by
default_hugepagesz kernel boot parameter).

If the system is configured for more than one hugepage size, which is
possible since commit a137e1cc6d6e ("hugetlbfs: per mount huge page
sizes"), then the overcommit estimation done by __vm_enough_memory()
(and shown by meminfo_proc_show()) is not precise: it gives the
impression of more available/allowed memory than there is.  This can
lead to an unexpected ENOMEM/EFAULT or SIGSEGV when memory is accounted.

Testcase:
  boot: hugepagesz=1G hugepages=1
  the default overcommit ratio is 50
  before patch:

    egrep 'CommitLimit' /proc/meminfo
    CommitLimit:     55434168 kB

  after patch:

    egrep 'CommitLimit' /proc/meminfo
    CommitLimit:     54909880 kB
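
The two CommitLimit values above differ by 524288 kB, which is half of
the single 1G huge page at an overcommit ratio of 50.  The sketch below
reproduces that arithmetic in userspace.  It is an illustration under
stated assumptions: struct hstate_sketch and all function names are
hypothetical, base pages are assumed to be 4 kB, and the limit is
assumed to be (total RAM - hugetlb pages) * ratio / 100 + swap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one hugepage pool (one size). */
struct hstate_sketch {
        uint64_t nr_huge_pages; /* huge pages currently in this pool */
        unsigned int order;     /* log2(base pages per huge page) */
};

static uint64_t pages_per_huge_page_sketch(const struct hstate_sketch *h)
{
        return (uint64_t)1 << h->order;
}

/* Behaviour before the patch: only the default pool is counted. */
static uint64_t total_pages_default_only(const struct hstate_sketch *hs,
                                         size_t def)
{
        return hs[def].nr_huge_pages * pages_per_huge_page_sketch(&hs[def]);
}

/* Behaviour after the patch: every configured pool is counted. */
static uint64_t total_pages_all(const struct hstate_sketch *hs, size_t n)
{
        uint64_t total = 0;
        size_t i;

        for (i = 0; i < n; i++)
                total += hs[i].nr_huge_pages *
                         pages_per_huge_page_sketch(&hs[i]);
        return total;
}

/* Assumed limit formula; the result is in kB for 4 kB base pages. */
static uint64_t commit_limit_kb(uint64_t totalram_pages,
                                uint64_t hugetlb_pages,
                                unsigned int ratio, uint64_t swap_pages)
{
        uint64_t allowed = (totalram_pages - hugetlb_pages) * ratio / 100;

        return (allowed + swap_pages) * 4;
}
```

With an empty default 2M pool (order 9) and one page in the 1G pool
(order 18), the default-only count is 0 pages and the full count is
262144 pages, so the limit computed from the full count is 524288 kB
lower at a ratio of 50.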

[akpm@linux-foundation.org: coding-style tweak]
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: <stable@vger.kernel.org> [3.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>