Trond Myklebust [Fri, 21 Sep 2012 00:46:49 +0000 (20:46 -0400)]
NFSv4.1: Clean up pnfs_put_lseg()
There is no longer a need to use pnfs_free_lseg_list(). Just call
pnfs_free_lseg() directly.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 20 Sep 2012 21:31:43 +0000 (17:31 -0400)]
NFSv4.1: Clean up the removal of pnfs_layout_hdr from the server list
Move the code into pnfs_free_layout_hdr(), and add checks to
get_layout_by_fh_locked to ensure that they don't reference a layout
that is being freed.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 20 Sep 2012 21:23:11 +0000 (17:23 -0400)]
NFSv4.1: Free the pnfs_layout_hdr outside the inode->i_lock
None of the existing pNFS layout drivers seem to require the inode
to be locked while they free the layout header.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 20 Sep 2012 21:02:32 +0000 (17:02 -0400)]
NFSv4.1: Remove redundant reference to the pnfs_layout_hdr
Each layout segment already holds a reference to the pnfs_layout_hdr,
so there is no need to hold an extra reference that is released once
the last layout segment is freed.
Ensure that pnfs_find_alloc_layout() always returns a reference
to the pnfs_layout_hdr, which will be matched by the final call to
pnfs_put_layout_hdr() in pnfs_update_layout().
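The reference counting rule described above can be sketched in userspace C. This is only an illustrative model: `layout_hdr`, `layout_seg` and the helper names are hypothetical, the counter is not atomic, and "freeing" is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the layout header and a layout segment. */
struct layout_hdr { int refcount; bool freed; };
struct layout_seg { struct layout_hdr *hdr; };

static void hdr_get(struct layout_hdr *h)
{
	h->refcount++;
}

static void hdr_put(struct layout_hdr *h)
{
	assert(h->refcount > 0);
	if (--h->refcount == 0)
		h->freed = true;	/* stand-in for actually freeing the header */
}

/* Every segment pins the header for as long as the segment exists. */
static void seg_attach(struct layout_seg *s, struct layout_hdr *h)
{
	hdr_get(h);
	s->hdr = h;
}

static void seg_release(struct layout_seg *s)
{
	hdr_put(s->hdr);
	s->hdr = NULL;
}
```

Because each segment holds its own reference, the header survives until the last segment is released, and the caller's lookup reference can be dropped independently without any extra "last segment" reference.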
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 20 Sep 2012 20:33:30 +0000 (16:33 -0400)]
NFSv4.1: Rename the pnfs_put_lseg_common to pnfs_layout_remove_lseg
The latter name is more descriptive of the actual function.
Also rename pnfs_insert_layout to pnfs_layout_insert_lseg.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 20 Sep 2012 19:52:13 +0000 (15:52 -0400)]
NFSv4.1: reset the inode MDS threshold counters on layout destruction
Instead of resetting the inode MDS threshold counters when we mark
the layout for destruction, do it as part of freeing the layout.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 20 Sep 2012 19:07:45 +0000 (15:07 -0400)]
NFSv4.1: Get rid of pNFS layout state "NFS_LAYOUT_INVALID"
In all cases where we set NFS_LAYOUT_INVALID, we also set NFS_LAYOUT_DESTROYED.
Furthermore, in all cases where we test for NFS_LAYOUT_INVALID, we should
also be testing for NFS_LAYOUT_DESTROYED, since the latter means that
we hold no valid layout segments.
Ergo the two are redundant.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 21 Sep 2012 00:31:51 +0000 (20:31 -0400)]
NFSv4.1: Simplify the pNFS return-on-close code
Confine it to the nfs4_do_close() code.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 21 Sep 2012 00:15:57 +0000 (20:15 -0400)]
NFSv4.1: Fix a race in the pNFS return-on-close code
If we sleep after dropping the inode->i_lock, then we are no longer
atomic with respect to the rpc_wake_up() call in pnfs_layout_remove_lseg().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 21 Sep 2012 01:19:43 +0000 (21:19 -0400)]
NFSv4.1: pnfs_layout_io_set_failed must clear invalid lsegs
If pnfs_layout_io_test_failed() authorises a retry of the failed layoutgets,
we should clear the existing layout segments so that we start afresh. Do
this in pnfs_layout_io_set_failed().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 24 Sep 2012 17:07:16 +0000 (13:07 -0400)]
NFSv4.1: Don't drop the pnfs_layout_hdr after a layoutget failure
We want to cache the pnfs_layout_hdr after a layoutget or i/o
failure so that pnfs_update_layout() can find it and know when
it is time to retry.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 21 Sep 2012 01:25:19 +0000 (21:25 -0400)]
NFSv4.1: Fix a reference leak in pnfs_update_layout
If we exit after the call to pnfs_find_alloc_layout(), we have to ensure
that we put the struct pnfs_layout_hdr.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 18 Sep 2012 23:51:12 +0000 (19:51 -0400)]
NFSv4.1: pNFS data servers may be temporarily offline
In cases where the pNFS data server is just temporarily out of service,
we want to mark it as such, and then try again later. Typically that will
be in cases of network connection errors etc.
This patch allows us to mark the devices as being "unavailable" for such
transient errors, and will make them available for retries after a
2 minute timeout period.
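A minimal userspace sketch of the "mark unavailable, retry after a timeout" idea follows. The structure and function names are hypothetical and the clock is passed in explicitly; the kernel code uses its own device structures and time source.

```c
#include <assert.h>
#include <stdbool.h>

#define RETRY_TIMEOUT_SECS 120	/* the 2 minute period from the text */

/* Hypothetical device record; the real kernel structure differs. */
struct ds_device {
	bool unavailable;
	long marked_at;		/* time in seconds of the transient failure */
};

static void ds_mark_unavailable(struct ds_device *ds, long now)
{
	ds->unavailable = true;
	ds->marked_at = now;
}

/* Returns true if I/O should avoid this device at time 'now'. */
static bool ds_test_unavailable(struct ds_device *ds, long now)
{
	if (!ds->unavailable)
		return false;
	assert(now >= ds->marked_at);
	if (now - ds->marked_at >= RETRY_TIMEOUT_SECS) {
		ds->unavailable = false;	/* timeout expired: allow a retry */
		return false;
	}
	return true;
}
```

Clearing the flag lazily in the test function means no timer is needed: the first caller to ask after the timeout simply gets to retry.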
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 18 Sep 2012 21:01:12 +0000 (17:01 -0400)]
NFSv4.1: Retry pNFS after a 2 minute timeout
If we had to fall back to read/write through MDS, then assume that we should
retry pNFS after a suitable timeout period.
The following patch sets a timeout of 2 minutes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 18 Sep 2012 20:41:18 +0000 (16:41 -0400)]
NFSv4.1: Add helpers for setting/reading the I/O fail bit
...and make them local to the pnfs.c file.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 26 Sep 2012 15:21:40 +0000 (11:21 -0400)]
NFSv4.1: Replace dprintk() in pnfs_update_layout with something less buggy
Dereferencing nfsi->layout in order to read plh_flags without holding
a spin lock is bug prone. Furthermore, the dprintk() tells you nothing
about whether or not the call succeeded.
Replace it with something that tells you about whether or not a valid
layout segment was returned for the inode in question.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 19 Sep 2012 01:02:29 +0000 (21:02 -0400)]
NFSv4.1: Replace get_device_info() with filelayout_get_device_info()
Fix the namespace pollution issue.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 19 Sep 2012 00:57:08 +0000 (20:57 -0400)]
NFSv4.1: Cleanup; add "pnfs_" prefix to put_lseg() and get_lseg()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 19 Sep 2012 00:51:13 +0000 (20:51 -0400)]
NFSv4.1: Cleanup; add "pnfs_" prefix to get_layout_hdr() and put_layout_hdr()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 19 Sep 2012 00:43:31 +0000 (20:43 -0400)]
NFSv4.1: Cleanup add a "pnfs_" prefix to mark_matching_lsegs_invalid
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 17 Sep 2012 21:12:15 +0000 (17:12 -0400)]
NFS: Clean up the pNFS layoutget interface
Ensure that we do return errors from nfs4_proc_layoutget() and that we
don't mark the layout as having failed if the error was due to a
signal or resource problem on the client side.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 11 Sep 2012 21:21:25 +0000 (17:21 -0400)]
SUNRPC: Get rid of the redundant xprt->shutdown bit field
It is only set after everyone has dereferenced the transport,
and serves no useful purpose: setting it is racy, so all the
socket code, etc still needs to be able to cope with the cases
where they miss reading it.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 11 Sep 2012 20:19:38 +0000 (16:19 -0400)]
NFS: Write the entire file if a server reboot occurs during fsync()
This is to ensure that we don't clear the NFS_CONTEXT_RESEND_WRITES
flag while there are still writes that haven't been resent.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 11 Sep 2012 20:01:22 +0000 (16:01 -0400)]
NFS: Fix fdatasync/fsync() when confronted with a server reboot
If the server reboots before it can commit the unstable writes to disk,
then nfs_commit_release_pages() will detect this when it compares the
verifier returned by COMMIT to the one returned by WRITE. When this
happens, the client needs to resend those writes in order to guarantee
that they make it to stable storage.
This patch adds a signalling mechanism to notify fsync() that it
needs to retry all writes before it can exit.
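The verifier check and the signalling can be modelled roughly as below. This is a simplified userspace sketch, not the kernel code: `file_ctx`, `commit_check()` and `sync_must_retry()` are hypothetical names, and the flag is cleared unconditionally on read, which the real client has to be more careful about.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define VERF_SIZE 8	/* NFS write verifiers are 8 opaque bytes */

struct write_verf { unsigned char data[VERF_SIZE]; };

/* Hypothetical per-open-file state; only the flag matters here. */
struct file_ctx { bool resend_writes; };

/*
 * Compare the verifier from COMMIT with the one saved from WRITE.
 * Returns true if the data is safely on stable storage; otherwise
 * flags the context so that the sync path resends the writes.
 */
static bool commit_check(struct file_ctx *ctx,
			 const struct write_verf *from_write,
			 const struct write_verf *from_commit)
{
	assert(ctx && from_write && from_commit);
	if (memcmp(from_write->data, from_commit->data, VERF_SIZE) == 0)
		return true;
	ctx->resend_writes = true;
	return false;
}

/* fsync-like loop condition: retry while a resend was requested. */
static bool sync_must_retry(struct file_ctx *ctx)
{
	bool retry = ctx->resend_writes;

	ctx->resend_writes = false;
	return retry;
}
```

A sync routine built on this would loop: flush and commit, then repeat for as long as `sync_must_retry()` reports that a verifier mismatch was seen.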
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 10 Sep 2012 17:26:49 +0000 (13:26 -0400)]
NFSv4: Convert the nfs4_lock_state->ls_flags to a bit field
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 13 Aug 2012 22:54:45 +0000 (18:54 -0400)]
NFS: Clean up helper function nfs4_select_rw_stateid()
We want to be able to pass on the information that the page was not
dirtied under a lock. Instead of adding a flag parameter, do this
by passing a pointer to a 'struct nfs_lock_owner' that may be NULL.
Also reuse this structure in struct nfs_lock_context to carry the
fl_owner_t and pid_t.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 13 Aug 2012 21:15:50 +0000 (17:15 -0400)]
NFS: Convert nfs_get_lock_context to return an ERR_PTR on failure
We want to be able to distinguish between allocation failures, and
the case where the lock context is not needed (because there are no
locks).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 2 Aug 2012 17:21:43 +0000 (13:21 -0400)]
SUNRPC: Optimise away unnecessary data moves in xdr_align_pages
We only have to call xdr_shrink_pagelen() if the remaining RPC
message does not fit in the page buffer length that we supplied
to xdr_align_pages().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 1 Aug 2012 18:21:12 +0000 (14:21 -0400)]
NFSv4.1: decode_getdeviceinfo should check xdr_read_pages() return value
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 1 Aug 2012 18:32:13 +0000 (14:32 -0400)]
SUNRPC: Fix the return value of xdr_align_pages()
The callers of xdr_align_pages() expect it to return the number of bytes
of actual XDR data remaining in the pages.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NeilBrown [Mon, 17 Sep 2012 06:46:34 +0000 (16:46 +1000)]
NFS4: avoid underflow when converting error to pointer.
In nfs4_create_sec_client, 'flavor' can hold a negative error
code (returned from nfs4_negotiate_security), even though it
is an 'enum' and hence unsigned.
The code is careful to cast it to an (int) before testing if it
is negative, however it doesn't cast to an (int) before calling
ERR_PTR.
On a machine where "void*" is larger than "int", this results in
the unsigned equivalent of -1 (e.g. 0xffffffff) being converted
to a pointer. Subsequent code determines that this is not
negative, and so dereferences it with predictable results.
So: cast 'flavor' to a (signed) int before passing to ERR_PTR.
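The effect can be reproduced in userspace with stand-ins for ERR_PTR() and IS_ERR(). In this sketch the enum is modelled as a plain 'unsigned int' (an assumption; enum signedness is compiler-dependent), and err_ptr()/is_err() are simplified local helpers rather than the kernel macros.

```c
#include <assert.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR() helpers. */
static void *err_ptr(long error)
{
	return (void *)error;
}

static int is_err(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Buggy: widens the unsigned value directly, so the sign is lost. */
static void *flavor_to_ptr_buggy(unsigned int flavor)
{
	assert((int)flavor < 0);	/* callers only convert error codes */
	return err_ptr(flavor);
}

/* Fixed: reinterpret as a signed int first, so sign extension happens. */
static void *flavor_to_ptr_fixed(unsigned int flavor)
{
	assert((int)flavor < 0);	/* callers only convert error codes */
	return err_ptr((int)flavor);
}
```

Where 'long' and pointers are wider than 'int', the buggy variant yields a small positive address that is_err() does not recognise, while the fixed variant yields a proper error pointer.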
cc: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Wei Yongjun [Fri, 21 Sep 2012 04:27:41 +0000 (12:27 +0800)]
NFS: fix the return value check by using IS_ERR
In case of error, the function rpcauth_create() returns ERR_PTR()
and never returns NULL pointer. The NULL test in the return value
check should be replaced with IS_ERR().
The dpatch engine was used to auto-generate this patch.
(https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Mon, 24 Sep 2012 17:39:01 +0000 (13:39 -0400)]
SUNRPC: Set alloc_slot for backchannel tcp ops
f39c1bfb5a03e2d255451bff05be0d7255298fa4 (SUNRPC: Fix a UDP transport
regression) introduced the "alloc_slot" function for xprt operations,
but never created one for the backchannel operations. This patch fixes
a null pointer dereference when mounting NFS over v4.1.
Call Trace:
[<ffffffffa0207957>] ? xprt_reserve+0x47/0x50 [sunrpc]
[<ffffffffa02023a4>] call_reserve+0x34/0x60 [sunrpc]
[<ffffffffa020e280>] __rpc_execute+0x90/0x400 [sunrpc]
[<ffffffffa020e61a>] rpc_async_schedule+0x2a/0x40 [sunrpc]
[<ffffffff81073589>] process_one_work+0x139/0x500
[<ffffffff81070e70>] ? alloc_worker+0x70/0x70
[<ffffffffa020e5f0>] ? __rpc_execute+0x400/0x400 [sunrpc]
[<ffffffff81073d1e>] worker_thread+0x15e/0x460
[<ffffffff8145c839>] ? preempt_schedule+0x49/0x70
[<ffffffff81073bc0>] ? rescuer_thread+0x230/0x230
[<ffffffff81079603>] kthread+0x93/0xa0
[<ffffffff81465d04>] kernel_thread_helper+0x4/0x10
[<ffffffff81079570>] ? kthread_freezable_should_stop+0x70/0x70
[<ffffffff81465d00>] ? gs_change+0x13/0x13
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 12 Sep 2012 20:49:15 +0000 (16:49 -0400)]
SUNRPC: Ensure that the TCP socket is closed when in CLOSE_WAIT
Instead of doing a shutdown() call, we need to do an actual close().
Ditto if/when the server is sending us junk RPC headers.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Simon Kirby <sim@hostway.ca>
Cc: stable@vger.kernel.org
Linus Torvalds [Wed, 19 Sep 2012 18:04:34 +0000 (11:04 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A small collection of driver fixes/updates and a core fix for 3.6. It
contains:
- Bug fixes for mtip32xx, and support for new hardware (just addition
of IDs). They have been queued up for 3.7 for a few weeks as well.
- rate-limit a failing command error message in block core.
- A fix for an old cciss bug from Stephen.
- Prevent overflow of partition count from Alan."
* 'for-linus' of git://git.kernel.dk/linux-block:
cciss: fix handling of protocol error
blk: add an upper sanity check on partition adding
mtip32xx: fix user_buffer check in exec_drive_command
mtip32xx: Remove dead code
mtip32xx: Change printk to pr_xxxx
mtip32xx: Proper reporting of write protect status on big-endian
mtip32xx: Increase timeout for standby command
mtip32xx: Handle NCQ commands during the security locked state
mtip32xx: Add support for new devices
block: rate-limit the error message from failing commands
Linus Torvalds [Wed, 19 Sep 2012 18:03:55 +0000 (11:03 -0700)]
Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh
Pull SuperH fixes from Paul Mundt.
* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
sh: Fix up TIF_NOTIFY_RESUME sans TIF_SIGPENDING handling.
sh: pfc: Release spinlock in sh_pfc_gpio_request_enable() error path
sh: intc: Fix up multi-evt irq association.
Linus Torvalds [Wed, 19 Sep 2012 18:03:13 +0000 (11:03 -0700)]
Merge tag 'rpmsg-3.6-fix' of git://git./linux/kernel/git/ohad/rpmsg
Pull rpmsg fix from Ohad Ben-Cohen:
"A quick rpmsg fix from Fernando, fixing two buggy invocations of
dma_free_coherent"
* tag 'rpmsg-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg:
rpmsg: fix dma_free_coherent dev parameter
Linus Torvalds [Wed, 19 Sep 2012 18:01:38 +0000 (11:01 -0700)]
Merge tag 'md-3.6-fixes' of git://neil.brown.name/md
Pull md fixes from NeilBrown:
"3 fixes for md in 3.6.
One reverts a recent patch which turns out to not be such a good idea.
Other two fix minor bugs with the new (since 3.3) 'replacement' code
and have been tagged for -stable."
* tag 'md-3.6-fixes' of git://neil.brown.name/md:
md: make sure metadata is updated when spares are activated or removed.
md/raid5: fix calculate of 'degraded' when a replacement becomes active.
Revert "md/raid5: For odirect-write performance, do not set STRIPE_PREREAD_ACTIVE."
Linus Torvalds [Wed, 19 Sep 2012 18:00:07 +0000 (11:00 -0700)]
Merge branch 'for-3.6-fixes' of git://git./linux/kernel/git/tj/wq
Pull workqueue / powernow-k8 fix from Tejun Heo:
"This is the fix for the bug where cpufreq/powernow-k8 was tripping
BUG_ON() in try_to_wake_up_local() by migrating workqueue worker to a
different CPU.
https://bugzilla.kernel.org/show_bug.cgi?id=47301
As discussed, the fix is now two parts - one to reimplement
work_on_cpu() so that it doesn't create a new kthread each time and
the actual fix which makes powernow-k8 use work_on_cpu() instead of
performing manual migration.
While pretty late in the merge cycle, both changes are on the safer
side. Jiri and I verified two existing users of work_on_cpu() and
Duncan confirmed that the powernow-k8 fix survived about 18 hours of
testing."
* 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
cpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU
workqueue: reimplement work_on_cpu() using system_wq
Tejun Heo [Tue, 18 Sep 2012 21:24:59 +0000 (14:24 -0700)]
cpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU
powernowk8_target() runs off a per-cpu work item and if the
cpufreq_policy->cpu is different from the current one, it migrates the
kworker to the target CPU by manipulating current->cpus_allowed. The
function migrates the kworker back to the original CPU but this is
still broken. Workqueue concurrency management requires the kworkers
to stay on the same CPU and powernowk8_target() ends up triggering
BUG_ON(rq != this_rq()) in try_to_wake_up_local() if it contends on
fidvid_mutex and sleeps.
It is unclear why this bug is being reported now. Duncan says it
appeared to be a regression of 3.6-rc1 and couldn't reproduce it on
3.5. Bisection seemed to point to 63d95a91 "workqueue: use @pool
instead of @gcwq or @cpu where applicable" which is a non-functional
change. Given that the reproduce case sometimes took up to days to
trigger, it's easy to be misled while bisecting. Maybe something made
contention on fidvid_mutex more likely? I don't know.
This patch fixes the bug by using work_on_cpu() instead if @pol->cpu
isn't the same as the current one. The code assumes that
cpufreq_policy->cpu is kept online by the caller, which Rafael tells
me is the case.
stable: ed48ece27c ("workqueue: reimplement work_on_cpu() using
system_wq") should be applied before this; otherwise, the
behavior could be horrible.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Duncan <1i5t5.duncan@cox.net>
Tested-by: Duncan <1i5t5.duncan@cox.net>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: stable@vger.kernel.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=47301
Tejun Heo [Tue, 18 Sep 2012 19:48:43 +0000 (12:48 -0700)]
workqueue: reimplement work_on_cpu() using system_wq
The existing work_on_cpu() implementation is hugely inefficient. It
creates a new kthread, executes that single function and then lets the
kthread die on each invocation.
Now that system_wq can handle concurrent executions, there's no
advantage in doing this. Reimplement work_on_cpu() using system_wq
which makes it simpler and way more efficient.
stable: While this isn't a fix in itself, it's needed to fix a
workqueue related bug in cpufreq/powernow-k8. AFAICS, this
shouldn't break other existing users.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@vger.kernel.org
NeilBrown [Wed, 19 Sep 2012 02:54:22 +0000 (12:54 +1000)]
md: make sure metadata is updated when spares are activated or removed.
It isn't always necessary to update the metadata when spares are
removed as the presence-or-not of a spare isn't really important to
the integrity of an array.
Also activating a spare doesn't always require updating the metadata
as the update on 'recovery-completed' is usually sufficient.
However the introduction of 'replacement' devices has made these
transitions sometimes more important. For example the 'Replacement'
flag isn't cleared until the original device is removed, so we need
to ensure a metadata update after that 'spare' is removed.
So set MD_CHANGE_DEVS whenever a spare is activated or removed, to
complement the current situation where it is set when a spare is added
or a device is failed (or a number of other less common situations).
This is suitable for -stable as out-of-date metadata could lead
to data corruption.
This is only relevant for 3.3 and later (when 'replacement' was
introduced).
Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 19 Sep 2012 02:52:30 +0000 (12:52 +1000)]
md/raid5: fix calculate of 'degraded' when a replacement becomes active.
When a replacement device becomes active, we mark the device that it
replaces as 'faulty' so that it can subsequently get removed.
However 'calc_degraded' only pays attention to the primary device, not
the replacement, so the array appears to become degraded, which is
wrong.
So teach 'calc_degraded' to consider any replacement if a primary
device is faulty.
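The intended counting rule can be illustrated with a small userspace model. The types and the count_degraded() helper below are hypothetical and much simpler than md's real calc_degraded(), which also has to deal with reshape and locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one array member. */
struct member { bool present; bool faulty; bool in_sync; };

/* Each slot has a primary device and an optional replacement. */
struct slot { struct member primary; struct member replacement; };

static bool member_usable(const struct member *m)
{
	return m->present && !m->faulty && m->in_sync;
}

static int count_degraded(const struct slot *slots, size_t n)
{
	int degraded = 0;

	assert(slots != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		const struct member *m = &slots[i].primary;

		/* If the primary is missing or faulty, consider the replacement. */
		if (!m->present || m->faulty)
			m = &slots[i].replacement;
		if (!member_usable(m))
			degraded++;
	}
	return degraded;
}
```

With this rule, a slot whose primary has just been marked faulty but whose replacement is active and in sync does not count towards the degraded total.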
This is suitable for -stable as an incorrect 'degraded' value can
confuse md and could lead to data corruption.
This is only relevant for 3.3 and later.
Cc: stable@vger.kernel.org
Reported-by: Robin Hill <robin@robinhill.me.uk>
Reported-by: John Drescher <drescherjm@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 19 Sep 2012 02:48:30 +0000 (12:48 +1000)]
Revert "md/raid5: For odirect-write performance, do not set STRIPE_PREREAD_ACTIVE."
This reverts commit 895e3c5c58a80bb9e4e05d9ac38b4f30e0f97d80.
While this patch seemed like a good idea and did help some workloads,
it hurts other workloads.
Large sequential O_DIRECT writes were faster,
Small random O_DIRECT writes were slower.
Other changes (batching RAID5 writes) have improved the sequential
writes using a different mechanism, so the net result of this patch
is definitely negative. So revert it.
Reported-by: Shaohua Li <shli@kernel.org>
Tested-by: Jianpeng Ma <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Linus Torvalds [Tue, 18 Sep 2012 18:58:54 +0000 (11:58 -0700)]
Merge tag 'hwspinlock-3.6-fix' of git://git./linux/kernel/git/ohad/hwspinlock
Pull hwspinlock fix from Ohad Ben-Cohen:
"A single hwspinlock fix by Wei Yongjun, which prevents potential NULL
dereferences"
* tag 'hwspinlock-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock:
hwspinlock/core: move the dereference below the NULL test
Miklos Szeredi [Mon, 17 Sep 2012 20:31:38 +0000 (22:31 +0200)]
vfs: dcache: use DCACHE_DENTRY_KILLED instead of DCACHE_DISCONNECTED in d_kill()
IBM reported a soft lockup after applying the fix for the rename_lock
deadlock. Commit c83ce989cb5f ("VFS: Fix the nfs sillyrename regression
in kernel 2.6.38") was found to be the culprit.
The nfs sillyrename fix used DCACHE_DISCONNECTED to indicate that the
dentry was killed. This flag can be set on non-killed dentries too,
which results in infinite retries when trying to traverse the dentry
tree.
This patch introduces a separate flag: DCACHE_DENTRY_KILLED, which is
only set in d_kill() and makes try_to_ascend() test only this flag.
IBM reported successful test results with this patch.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stephen M. Cameron [Fri, 14 Sep 2012 21:35:10 +0000 (16:35 -0500)]
cciss: fix handling of protocol error
If a command completes with a status of CMD_PROTOCOL_ERR, this
information should be conveyed to the SCSI mid layer, not dropped
on the floor. Unlike a similar bug in the hpsa driver, this bug
only affects tape drives and CD and DVD ROM drives in the cciss
driver, and to induce it, you have to disconnect (or damage) a
cable, so it is not a very likely scenario (which would explain
why the bug has gone undetected for the last 10 years.)
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Alan Cox [Mon, 17 Sep 2012 10:47:13 +0000 (11:47 +0100)]
blk: add an upper sanity check on partition adding
65536 should be ludicrous anyway but without it we overflow the
memory computation doing the allocation and badness occurs.
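The kind of check being described might look like the following userspace sketch. The table type, the function name and the exact comparison against 65536 are assumptions for illustration; only the idea of bounding the count before it feeds a size computation comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_PARTS 65536	/* bound from the commit text; exact comparison is assumed */

struct part_entry { void *p; };	/* hypothetical table entry */

static struct part_entry *alloc_part_table(size_t count)
{
	/* Reject absurd counts before they reach the size computation. */
	if (count == 0 || count >= MAX_PARTS)
		return NULL;

	/* With count bounded, count * sizeof(entry) cannot wrap size_t. */
	assert(MAX_PARTS <= SIZE_MAX / sizeof(struct part_entry));

	return calloc(count, sizeof(struct part_entry));
}
```

The point of checking first is that the multiplication is only ever performed on values already known to be small enough.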
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Al Viro [Tue, 18 Sep 2012 08:04:37 +0000 (17:04 +0900)]
sh: Fix up TIF_NOTIFY_RESUME sans TIF_SIGPENDING handling.
As Al notes, we missed a TIF_NOTIFY_RESUME check which caused any
handlers without TIF_SIGPENDING also set to skip the notification:
Looks like while it is in the relevant masks *and* checked in
do_notify_resume() both on 32bit and 64bit variants since commit
ab99c733ae73cce31f2a2434f7099564e5a73d95 ("sh: Make syscall tracer
use tracehook notifiers, add TIF_NOTIFY_RESUME.") they are
actually *not* reached without simultaneous SIGPENDING, since
the actual glue in the callers had not been updated back then and
still checks for _TIF_SIGPENDING alone when deciding whether to
hit do_notify_resume() or not.
Reported-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Laurent Pinchart [Fri, 14 Sep 2012 18:25:48 +0000 (20:25 +0200)]
sh: pfc: Release spinlock in sh_pfc_gpio_request_enable() error path
The sh_pfc_gpio_request_enable() function acquires a spinlock but fails
to release it before returning if the requested mux type is not
supported. Fix this.
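The usual shape of such a fix is a single exit label that always drops the lock. The sketch below uses a fake lock and a hypothetical mux_type enum so that it runs in userspace; it is not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Fake lock that only records whether it is currently held. */
struct fake_lock { bool held; };

static void fake_lock_acquire(struct fake_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void fake_lock_release(struct fake_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical mux types; the last one stands for anything unsupported. */
enum mux_type { MUX_FUNCTION, MUX_GPIO, MUX_UNSUPPORTED };

static int request_enable(struct fake_lock *lock, enum mux_type type)
{
	int ret = -1;

	fake_lock_acquire(lock);
	switch (type) {
	case MUX_FUNCTION:
	case MUX_GPIO:
		break;
	default:
		goto out;	/* the error path must still drop the lock */
	}
	/* ... the pin configuration would happen here ... */
	ret = 0;
out:
	fake_lock_release(lock);
	return ret;
}
```

Routing both the success and the error path through the same label makes it hard to add a new early return that forgets the unlock.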
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Linus Torvalds [Mon, 17 Sep 2012 23:05:23 +0000 (16:05 -0700)]
Merge branch 'for-3.6-fixes' of git://git./linux/kernel/git/tj/wq
Pull another workqueue fix from Tejun Heo:
"Unfortunately, yet another late fix. This too is discovered and fixed
by Lai. This bug was introduced during this merge window by commit
25511a477657 ("workqueue: reimplement CPU online rebinding to handle
idle workers") which started using WORKER_REBIND flag for idle rebind
too.
The bug is relatively easy to trigger if the CPU rapidly goes through
off, on and then off (and stay off). The fix is on the safer side.
This hasn't been on linux-next yet but I'm pushing early so that it
can get more exposure before v3.6 release."
* 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: always clear WORKER_REBIND in busy_worker_rebind_fn()
Lai Jiangshan [Mon, 17 Sep 2012 22:42:31 +0000 (15:42 -0700)]
workqueue: always clear WORKER_REBIND in busy_worker_rebind_fn()
busy_worker_rebind_fn() didn't clear WORKER_REBIND if rebinding failed
(CPU is down again). This used to be okay because the flag wasn't
used for anything else.
However, after 25511a477 "workqueue: reimplement CPU online rebinding
to handle idle workers", WORKER_REBIND is also used to command idle
workers to rebind. If not cleared, the worker may confuse the next
CPU_UP cycle by having REBIND spuriously set or oops / get stuck by
prematurely calling idle_worker_rebind().
WARNING: at /work/os/wq/kernel/workqueue.c:1323 worker_thread+0x4cd/0x500()
Hardware name: Bochs
Modules linked in: test_wq(O-)
Pid: 33, comm: kworker/1:1 Tainted: G O 3.6.0-rc1-work+ #3
Call Trace:
[<ffffffff8109039f>] warn_slowpath_common+0x7f/0xc0
[<ffffffff810903fa>] warn_slowpath_null+0x1a/0x20
[<ffffffff810b3f1d>] worker_thread+0x4cd/0x500
[<ffffffff810bc16e>] kthread+0xbe/0xd0
[<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
---[ end trace e977cf20f4661968 ]---
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff810b3db0>] worker_thread+0x360/0x500
PGD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: test_wq(O-)
CPU 0
Pid: 33, comm: kworker/1:1 Tainted: G W O 3.6.0-rc1-work+ #3 Bochs Bochs
RIP: 0010:[<ffffffff810b3db0>] [<ffffffff810b3db0>] worker_thread+0x360/0x500
RSP: 0018:ffff88001e1c9de0 EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffff88001e633e00 RCX: 0000000000004140
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
RBP: ffff88001e1c9ea0 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000002 R11: 0000000000000000 R12: ffff88001fc8d580
R13: ffff88001fc8d590 R14: ffff88001e633e20 R15: ffff88001e1c6900
FS: 0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000130e8000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/1:1 (pid: 33, threadinfo ffff88001e1c8000, task ffff88001e1c6900)
Stack:
ffff880000000000 ffff88001e1c9e40 0000000000000001 ffff88001e1c8010
ffff88001e519c78 ffff88001e1c9e58 ffff88001e1c6900 ffff88001e1c6900
ffff88001e1c6900 ffff88001e1c6900 ffff88001fc8d340 ffff88001fc8d340
Call Trace:
[<ffffffff810bc16e>] kthread+0xbe/0xd0
[<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
Code: b1 00 f6 43 48 02 0f 85 91 01 00 00 48 8b 43 38 48 89 df 48 8b 00 48 89 45 90 e8 ac f0 ff ff 3c 01 0f 85 60 01 00 00 48 8b 53 50 <8b> 02 83 e8 01 85 c0 89 02 0f 84 3b 01 00 00 48 8b 43 38 48 8b
RIP [<ffffffff810b3db0>] worker_thread+0x360/0x500
RSP <ffff88001e1c9de0>
CR2: 0000000000000000
There was no reason to keep WORKER_REBIND on failure in the first
place - WORKER_UNBOUND is guaranteed to be set in such cases
preventing incorrectly activating concurrency management. Always
clear WORKER_REBIND.
tj: Updated comment and description.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Linus Torvalds [Mon, 17 Sep 2012 22:01:14 +0000 (15:01 -0700)]
Merge branch 'akpm' (Andrew's patch-bomb)
Merge fixes from Andrew Morton:
"13 patches. 12 are fixes and one is a little preparatory thing for
Andi."
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (13 commits)
memory hotplug: fix section info double registration bug
mm/page_alloc: fix the page address of higher page's buddy calculation
drivers/rtc/rtc-twl.c: ensure all interrupts are disabled during probe
compiler.h: add __visible
pid-namespace: limit value of ns_last_pid to (0, max_pid)
include/net/sock.h: squelch compiler warning in sk_rmem_schedule()
slub: consider pfmemalloc_match() in get_partial_node()
slab: fix starting index for finding another object
slab: do ClearSlabPfmemalloc() for all pages of slab
nbd: clear waiting_queue on shutdown
MAINTAINERS: fix TXT maintainer list and source repo path
mm/ia64: fix a memory block size bug
memory hotplug: reset pgdat->kswapd to NULL if creating kernel thread fails
qiuxishi [Mon, 17 Sep 2012 21:09:24 +0000 (14:09 -0700)]
memory hotplug: fix section info double registration bug
There may be a bug when registering section info. For example, on my
Itanium platform, the pfn range of node0 includes the other nodes, so
other nodes' section info will be double registered, and memmap's page
count will be equal to 3.
node0: start_pfn=0x100, spanned_pfn=0x20fb00, present_pfn=0x7f8a3, => 0x000100-0x20fc00
node1: start_pfn=0x80000, spanned_pfn=0x80000, present_pfn=0x80000, => 0x080000-0x100000
node2: start_pfn=0x100000, spanned_pfn=0x80000, present_pfn=0x80000, => 0x100000-0x180000
node3: start_pfn=0x180000, spanned_pfn=0x80000, present_pfn=0x80000, => 0x180000-0x200000
free_all_bootmem_node()
register_page_bootmem_info_node()
register_page_bootmem_info_section()
When hot-removing memory, we can't free the memmap's page because
page_count() is 2 after put_page_bootmem().
sparse_remove_one_section()
free_section_usemap()
free_map_bootmem()
put_page_bootmem()
[akpm@linux-foundation.org: add code comment]
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Li Haifeng [Mon, 17 Sep 2012 21:09:21 +0000 (14:09 -0700)]
mm/page_alloc: fix the page address of higher page's buddy calculation
The heuristic method for buddy has been introduced since commit
43506fad21ca ("mm/page_alloc.c: simplify calculation of combined index
of adjacent buddy lists"). But the page address of higher page's buddy
was wrongly calculated, which causes page_is_buddy to fail forever.
IOW, the heuristic method would be disabled with the wrong page address
of higher page's buddy.
Calculating the page address of higher page's buddy should be based on
higher_page with the offset between index of higher page and index of
higher page's buddy.
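The index arithmetic behind this fix can be sketched in userspace as follows. This is a minimal illustration only: it works on plain page indices instead of struct page pointers, all function names are hypothetical, and the "old" expression is an assumption reconstructed from the description above, not copied from the kernel source.

```c
#include <assert.h>

/* Index of the buddy of the block that starts at idx, at the given order. */
static long find_buddy_index(long idx, unsigned int order)
{
	assert(idx >= 0 && order < 8 * sizeof(long) - 2);
	return idx ^ (1L << order);
}

/*
 * Buddy of the merged (order + 1) block, computed relative to higher_page,
 * as the commit message describes.
 */
static long higher_buddy_from_higher_page(long page_idx, unsigned int order)
{
	long buddy_idx = find_buddy_index(page_idx, order);
	long combined_idx = buddy_idx & page_idx;
	long higher_page = page_idx + (combined_idx - page_idx);
	long higher_buddy_idx = find_buddy_index(combined_idx, order + 1);

	return higher_page + (higher_buddy_idx - combined_idx);
}

/* Assumed form of the old calculation: same offset, but applied to page. */
static long higher_buddy_from_page(long page_idx, unsigned int order)
{
	long buddy_idx = find_buddy_index(page_idx, order);
	long combined_idx = buddy_idx & page_idx;
	long higher_buddy_idx = find_buddy_index(combined_idx, order + 1);

	return page_idx + (higher_buddy_idx - combined_idx);
}
```

In this sketch the two variants agree whenever the page is the lower half of its pair (page index equal to the combined index) and differ otherwise, which is consistent with the heuristic failing only for some pages.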
Signed-off-by: Haifeng Li <omycle@gmail.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: KyongHo Cho <pullip.cho@samsung.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: <stable@vger.kernel.org> [2.6.38+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kevin Hilman [Mon, 17 Sep 2012 21:09:17 +0000 (14:09 -0700)]
drivers/rtc/rtc-twl.c: ensure all interrupts are disabled during probe
On some platforms, bootloaders are known to do some interesting RTC
programming. Without going into the obscurities as to why this may be
the case, suffice it to say that the driver should not make any
assumptions about the state of the RTC when the driver loads. In
particular, the driver probe should be sure that all interrupts are
disabled until otherwise programmed.
This was discovered when finding bursty I2C traffic every second on
Overo platforms. This I2C overhead was keeping the SoC from hitting
deep power states. The cause was found to be the RTC firing every
second on the I2C-connected TWL PMIC.
Special thanks to Felipe Balbi for suggesting to look for a rogue driver
as the source of the I2C traffic rather than the I2C driver itself.
Special thanks to Steve Sakoman for helping track down the source of the
continuous RTC interrupts on the Overo boards.
Signed-off-by: Kevin Hilman <khilman@ti.com>
Cc: Felipe Balbi <balbi@ti.com>
Tested-by: Steve Sakoman <steve@sakoman.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Tested-by: Shubhrajyoti Datta <omaplinuxkernel@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andi Kleen [Mon, 17 Sep 2012 21:09:15 +0000 (14:09 -0700)]
compiler.h: add __visible
gcc 4.6+ has support for an externally_visible attribute that prevents the
optimizer from optimizing unused symbols away. Add a __visible macro to
use it with that compiler version or later.
This is used (at least) by the "Link Time Optimization" patchset.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Vagin [Mon, 17 Sep 2012 21:09:12 +0000 (14:09 -0700)]
pid-namespace: limit value of ns_last_pid to (0, max_pid)
The kernel doesn't check the pid for negative values, so if you try to
write -2 to /proc/sys/kernel/ns_last_pid, you will get a kernel panic.
The crash happens because the next pid is -1, and alloc_pidmap() will
try to access a nonexistent pidmap.
map = &pid_ns->pidmap[pid/BITS_PER_PAGE];
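A rough userspace sketch of why an unchecked negative value is dangerous in an index expression of this shape. The constant and both helper names are hypothetical, the inclusive upper bound is an assumption, and the sketch does not reproduce the exact crash path in alloc_pidmap(); it only shows that a sufficiently negative pid yields a negative array index and that a range check rejects it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for BITS_PER_PAGE (4096-byte page, 8 bits per byte). */
#define BITS_PER_PAGE_SKETCH 32768

/* Index of the pidmap page that the expression above would select. */
static int pidmap_index(int pid)
{
	/* C integer division truncates toward zero, so negative pids stay <= 0. */
	return pid / BITS_PER_PAGE_SKETCH;
}

/* Range check in the spirit of the fix: reject anything outside [0, max_pid]. */
static bool ns_last_pid_valid(int last_pid, int max_pid)
{
	assert(max_pid > 0);
	return last_pid >= 0 && last_pid <= max_pid;
}
```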
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chuck Lever [Mon, 17 Sep 2012 21:09:11 +0000 (14:09 -0700)]
include/net/sock.h: squelch compiler warning in sk_rmem_schedule()
This warning:
In file included from linux/include/linux/tcp.h:227:0,
from linux/include/linux/ipv6.h:221,
from linux/include/net/ipv6.h:16,
from linux/include/linux/sunrpc/clnt.h:26,
from linux/net/sunrpc/stats.c:22:
linux/include/net/sock.h: In function `sk_rmem_schedule':
linux/nfs-2.6/include/net/sock.h:1339:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
is seen with gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2) using the
-Wextra option.
Commit c76562b6709f ("netvm: prevent a stream-specific deadlock")
accidentally replaced the "size" parameter of sk_rmem_schedule() with an
unsigned int. This changes the semantics of the comparison in the
return statement.
In sk_wmem_schedule we have syntactically the same comparison, but
"size" is a signed integer. In addition, __sk_mem_schedule() takes a
signed integer for its "size" parameter, so there is an implicit type
conversion in sk_rmem_schedule() anyway.
Revert the "size" parameter back to a signed integer so that the
semantics of the expressions in both sk_[rw]mem_schedule() are exactly
the same.
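The change in comparison semantics can be illustrated with a small userspace sketch. The function names are hypothetical and the negative value is only an illustrative input; the commit message does not say which values occur in practice.

```c
#include <assert.h>
#include <stdbool.h>

/* Both operands signed: an ordinary signed comparison. */
static bool fits_signed(int size, int forward_alloc)
{
	assert(size >= 0);
	return size <= forward_alloc;
}

/*
 * "size" unsigned: the usual arithmetic conversions turn the signed operand
 * into unsigned int (written out explicitly here), so a negative value
 * becomes a very large one.
 */
static bool fits_unsigned(unsigned int size, int forward_alloc)
{
	return size <= (unsigned int)forward_alloc;
}
```

With a size of 100 and a hypothetical negative second operand, the signed variant reports that the request does not fit, while the unsigned variant reports that it does.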
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Mon, 17 Sep 2012 21:09:09 +0000 (14:09 -0700)]
slub: consider pfmemalloc_match() in get_partial_node()
get_partial() is currently not checking pfmemalloc_match() meaning that
it is possible for pfmemalloc pages to leak to non-pfmemalloc users.
This is a problem in the following situation. Assume that there is a
request from normal allocation and there are no objects in the per-cpu
cache and no node-partial slab.
In this case, slab_alloc enters the slow path and new_slab_objects() is
called which may return a PFMEMALLOC page. As the current user is not
allowed to access PFMEMALLOC page, deactivate_slab() is called
([5091b74a: mm: slub: optimise the SLUB fast path to avoid pfmemalloc
checks]) and returns an object from PFMEMALLOC page.
Next time, when we get another request from normal allocation,
slab_alloc() enters the slow-path and calls new_slab_objects(). In
new_slab_objects(), we call get_partial() and get a partial slab which
was just deactivated but is a pfmemalloc page. We extract one object
from it and re-deactivate.
"deactivate -> re-get in get_partial -> re-deactivate" occurs repeatedly.
As a result, access to PFMEMALLOC page is not properly restricted and it
can cause a performance degradation due to frequent deactivation.
This patch changes get_partial_node() to take pfmemalloc_match() into
account and prevents the "deactivate -> re-get in get_partial()"
scenario. Instead, new_slab() is called.
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Mon, 17 Sep 2012 21:09:06 +0000 (14:09 -0700)]
slab: fix starting index for finding another object
In array cache, there is an object at index 0, check it.
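A hedged sketch of the off-by-one being described: a search over an array that starts at index 1 never considers the entry at index 0. The function name and the "reserved" flag array are hypothetical stand-ins for the array cache and its per-object check.

```c
#include <assert.h>

/*
 * Return the index of the first entry that is not flagged as reserved,
 * scanning from "start", or -1 if there is none.
 */
static int find_usable_from(const int *reserved, int avail, int start)
{
	assert(start >= 0 && avail >= 0);
	for (int i = start; i < avail; i++) {
		if (!reserved[i])
			return i;
	}
	return -1;
}
```

If only the entry at index 0 is usable, a scan starting at 1 reports nothing, while a scan starting at 0 finds it.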
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Mon, 17 Sep 2012 21:09:03 +0000 (14:09 -0700)]
slab: do ClearSlabPfmemalloc() for all pages of slab
Right now, we call ClearSlabPfmemalloc() for first page of slab when we
clear SlabPfmemalloc flag. This is fine for most swap-over-network use
cases as it is expected that order-0 pages are in use. Unfortunately it
is possible that __ac_put_obj() checks SlabPfmemalloc on a tail
page and while this is harmless, it is sloppy. This patch ensures that
the head page is always used.
This problem was originally identified by Joonsoo Kim.
[js1304@gmail.com: Original implementation and problem identification]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Clements [Mon, 17 Sep 2012 21:09:02 +0000 (14:09 -0700)]
nbd: clear waiting_queue on shutdown
Fix a serious but uncommon bug in nbd which occurs when there is heavy
I/O going to the nbd device while, at the same time, a failure (server,
network) or manual disconnect of the nbd connection occurs.
There is a small window between the time that the nbd_thread is stopped
and the socket is shutdown where requests can continue to be queued to
nbd's internal waiting_queue. When this happens, those requests are
never completed or freed.
The fix is to clear the waiting_queue on shutdown of the nbd device, in
the same way that the nbd request queue (queue_head) is already being
cleared.
Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gang Wei [Mon, 17 Sep 2012 21:08:59 +0000 (14:08 -0700)]
MAINTAINERS: fix TXT maintainer list and source repo path
Signed-off-by: Gang Wei <gang.wei@intel.com>
Cc: Richard L Maliszewski <richard.l.maliszewski@intel.com>
Cc: Gang Wei <gang.wei@intel.com>
Cc: Shane Wang <shane.wang@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jianguo Wu [Mon, 17 Sep 2012 21:08:56 +0000 (14:08 -0700)]
mm/ia64: fix a memory block size bug
I found the following definition in include/linux/memory.h. On my IA64
platform, SECTION_SIZE_BITS is equal to 32, and MIN_MEMORY_BLOCK_SIZE
will be 0.
#define MIN_MEMORY_BLOCK_SIZE (1 << SECTION_SIZE_BITS)
Because MIN_MEMORY_BLOCK_SIZE is of type int, which is 32 bits long,
MIN_MEMORY_BLOCK_SIZE (1 << 32) will be equal to 0.
Actually when SECTION_SIZE_BITS >= 31, MIN_MEMORY_BLOCK_SIZE will be wrong.
This will cause wrong system memory information in sysfs.
I think it should be:
#define MIN_MEMORY_BLOCK_SIZE (1UL << SECTION_SIZE_BITS)
And "echo offline > memory0/state" will cause following call trace:
kernel BUG at mm/memory_hotplug.c:885!
sh[6455]: bugcheck! 0 [1]
Pid: 6455, CPU 0, comm: sh
psr : 0000101008526030 ifs : 8000000000000fa4 ip : [<a0000001008c40f0>] Not tainted (3.6.0-rc1)
ip is at offline_pages+0x210/0xee0
Call Trace:
show_stack+0x80/0xa0
show_regs+0x640/0x920
die+0x190/0x2c0
die_if_kernel+0x50/0x80
ia64_bad_break+0x3d0/0x6e0
ia64_native_leave_kernel+0x0/0x270
offline_pages+0x210/0xee0
alloc_pages_current+0x180/0x2a0
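The MIN_MEMORY_BLOCK_SIZE problem described above can be sketched in portable C. Shifting a 32-bit int by 32 is undefined behaviour in standard C, so this sketch does not perform that shift; it models the reported result (0) by truncating a 64-bit value to 32 bits. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Shift performed in a 64-bit type, analogous to using 1UL on a 64-bit kernel. */
static uint64_t block_size_wide(unsigned int section_size_bits)
{
	assert(section_size_bits < 64);
	return UINT64_C(1) << section_size_bits;
}

/* What survives if the result only has 32 bits available. */
static uint32_t block_size_truncated(unsigned int section_size_bits)
{
	return (uint32_t)block_size_wide(section_size_bits);
}
```

For SECTION_SIZE_BITS equal to 32, the wide variant yields 4 GiB while the truncated variant yields 0, matching the value reported in the message.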
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wen Congyang [Mon, 17 Sep 2012 21:08:55 +0000 (14:08 -0700)]
memory hotplug: reset pgdat->kswapd to NULL if creating kernel thread fails
If kthread_run() fails, pgdat->kswapd contains errno. When we stop this
thread, we only check whether pgdat->kswapd is NULL and access it. If
it contains errno, it will cause a page fault. Resetting pgdat->kswapd to
NULL when creating the kernel thread fails avoids this problem.
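The pattern can be sketched in userspace as follows. The helpers below imitate the kernel's error-pointer convention (an errno value encoded in a pointer) but are hypothetical re-implementations, as are the struct and function names; this is not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in a pointer, and detect such pointers. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

struct node_sketch {
	void *kswapd;	/* thread handle, or NULL if no thread is running */
};

/* Record the result of thread creation; never leave an error pointer behind. */
static int record_thread(struct node_sketch *node, void *thread)
{
	assert(node != NULL);
	node->kswapd = thread;
	if (is_err(thread)) {
		node->kswapd = NULL;	/* the reset the commit message describes */
		return -1;
	}
	return 0;
}

/* The stop path only tests for NULL, so the field must be NULL on failure. */
static bool can_stop(const struct node_sketch *node)
{
	return node->kswapd != NULL;
}
```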
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 17 Sep 2012 20:21:02 +0000 (13:21 -0700)]
Merge tag 'rdma-for-linus' of git://git./linux/kernel/git/roland/infiniband
Pull InfiniBand/RDMA fixes from Roland Dreier:
- A couple more IPoIB fixes for regressions introduced by path database
conversion
- Minor other fixes to low-level drivers (cxgb4, mlx4, qib, ocrdma)
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/qib: Fix failure of compliance test C14-024#06_LocalPortNum
RDMA/ocrdma: Fix CQE expansion of unsignaled WQE
mlx4_core: Fix integer overflows so 8TBs of memory registration works
IPoIB: Fix AB-BA deadlock when deleting neighbours
IPoIB: Fix memory leak in the neigh table deletion flow
RDMA/cxgb4: Move dereference below NULL test
Francesco Ruggeri [Thu, 13 Sep 2012 22:03:37 +0000 (15:03 -0700)]
fs/proc: fix potential unregister_sysctl_table hang
The unregister_sysctl_table() function hangs if all references to its
ctl_table_header structure are not dropped.
This can happen sometimes because of a leak in proc_sys_lookup():
proc_sys_lookup() gets a reference to the table via lookup_entry(), but
it does not release it when a subsequent call to sysctl_follow_link()
fails.
This patch fixes this leak by making sure the reference is always
dropped on return.
See also commit 076c3eed2c31 ("sysctl: Rewrite proc_sys_lookup
introducing find_entry and lookup_entry") which reorganized this code in
3.4.
Tested in Linux 3.4.4.
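The shape of the fix, dropping the reference on every return path, can be sketched with a single exit label. All names here are hypothetical; the follow_ok flag merely simulates whether the link-following step succeeds.

```c
#include <assert.h>

struct header_sketch {
	int refs;	/* number of outstanding references */
};

static void get_ref(struct header_sketch *h)
{
	h->refs++;
}

static void put_ref(struct header_sketch *h)
{
	h->refs--;
	assert(h->refs >= 0);
}

/* Take a reference, attempt the follow step, and always drop the reference. */
static int lookup_sketch(struct header_sketch *h, int follow_ok)
{
	int err = 0;

	get_ref(h);
	if (!follow_ok) {
		err = -1;
		goto out;	/* an early "return err" here would leak the reference */
	}
	/* ... the entry would be used here ... */
out:
	put_ref(h);
	return err;
}
```

After either outcome the count returns to its starting value, so a later wait for the count to reach zero cannot hang on this path.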
Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 16 Sep 2012 21:58:51 +0000 (14:58 -0700)]
Linux 3.6-rc6
Linus Torvalds [Sun, 16 Sep 2012 20:22:21 +0000 (13:22 -0700)]
Merge tag 'mfd-for-linus-3.6-2' of git://git./linux/kernel/git/sameo/mfd-2.6
Pull mfd fixes from Samuel Ortiz:
"This is the remaining MFD fixes for 3.6, with 5 pending fixes:
- A tps65217 build error fix.
- An lpc_ich regression fix caused by the MFD driver failing to
initialize the watchdog sub device due to ACPI conflicts.
- 2 MAX77693 interrupt handling bug fixes.
- An MFD core fix, adding an IRQ domain argument to the MFD device
addition API in order to prevent silent and potentially harmful
remapping behaviour changes for drivers supporting non-DT
platforms."
* tag 'mfd-for-linus-3.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
mfd: MAX77693: Fix NULL pointer error when initializing irqs
mfd: MAX77693: Fix interrupt handling bug
mfd: core: Push irqdomain mapping out into devices
mfd: lpc_ich: Fix a 3.5 kernel regression for iTCO_wdt driver
mfd: Move tps65217 regulator plat data handling to regulator
Linus Torvalds [Sun, 16 Sep 2012 20:20:43 +0000 (13:20 -0700)]
Merge tag 'for-3.6-rc6' of git://gitorious.org/linux-pwm/linux-pwm
Pull pwm fixes from Thierry Reding:
"While this comes a bit later than I had wished, both patches are
rather minor and touch only new drivers so I think these are still
safe for merging."
* tag 'for-3.6-rc6' of git://gitorious.org/linux-pwm/linux-pwm:
pwm: pwm-tiehrpwm: Fix conflicting channel period setting
pwm: pwm-tiecap: Disable APWM mode after configure
Linus Torvalds [Sun, 16 Sep 2012 20:00:36 +0000 (13:00 -0700)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull scsi target fixes from Nicholas Bellinger:
"Here is the current set of target-pending fixes headed for v3.6-final
The main parts of this series include bug-fixes from Paolo Bonzini to
address a use-after-free bug in pSCSI sense exception handling, along
with addressing some long-standing bugs wrt the handling of zero-
length SCSI CDB payloads also specific to pSCSI pass-through device
backends."
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: go through normal processing for zero-length REQUEST_SENSE
target: support zero allocation length in REQUEST SENSE
target: support zero-size allocation lengths in transport_kmap_data_sg
target: fail REPORT LUNS with less than 16 bytes of payload
target: report too-small parameter lists everywhere
target: go through normal processing for zero-length PSCSI commands
target: fix use-after-free with PSCSI sense data
target: simplify code around transport_get_sense_data
target: move transport_get_sense_data
target: Check idr_get_new return value in iscsi_login_zero_tsih_s1
target: Fix ->data_length re-assignment bug with SCSI overflow
Linus Torvalds [Sun, 16 Sep 2012 19:59:42 +0000 (12:59 -0700)]
Merge tag 'pm-for-3.6-rc6' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael J. Wysocki:
"Three ACPI device power management fixes related to checking and
setting device power states."
* tag 'pm-for-3.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / PM: Use KERN_DEBUG when no power resources are found
ACPI / PM: Fix resource_lock dead lock in acpi_power_on_device
ACPI / PM: Infer parent power state from child if unknown, v2
Linus Torvalds [Sun, 16 Sep 2012 19:58:44 +0000 (12:58 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull a btrfs revert from Chris Mason:
"My for-linus branch has one revert in the new quota code.
We're building up more fixes at etc for the next merge window, but I'm
keeping them out unless they are bigger regressions or have a huge
impact."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Revert "Btrfs: fix some error codes in btrfs_qgroup_inherit()"
Linus Torvalds [Sun, 16 Sep 2012 19:57:59 +0000 (12:57 -0700)]
Merge tag 'sound-3.6' of git://git./linux/kernel/git/tiwai/sound
Pull more sound fixes from Takashi Iwai:
"Yet more (a bunch of) small fixes that slipped from the previous pull
request. Most of commits are pending ASoC fixes, all of which are
fairly trivial commits."
* tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: wm8904: correct the index
ALSA: hda - Yet another position_fix quirk for ASUS machines
ASoC: tegra: fix maxburst settings in dmaengine code
ASoC: samsung dma - Don't indicate support for pause/resume.
ASoC: mc13783: Remove mono support
ASoC: arizona: Fix typo in 44.1kHz rates
ASoC: spear: correct the check for NULL dma_buffer pointer
sound: tegra_alc5632: remove HP detect GPIO inversion
ASoC: atmel-ssc: include linux/io.h for raw io
ASoC: dapm: Don't force card bias level to be updated
ASoC: dapm: Make sure we update the bias level for CODECs with no op
ASoC: am3517evm: fix error return code
ASoC: ux500_msp_i2s: better use devm functions and fix error return code
ASoC: imx-sgtl5000: fix error return code
Linus Torvalds [Sun, 16 Sep 2012 19:29:43 +0000 (12:29 -0700)]
Revert "sched: Improve scalability via 'CPU buddies', which withstand random perturbations"
This reverts commit 970e178985cadbca660feb02f4d2ee3a09f7fdda.
Nikolay Ulyanitsky reported that the 3.6-rc5 kernel has a 15-20%
performance drop on PostgreSQL 9.2 on his machine (running "pgbench").
Borislav Petkov was able to reproduce this, and bisected it to this
commit 970e178985ca ("sched: Improve scalability via 'CPU buddies' ...")
apparently because the new single-idle-buddy model simply doesn't find
idle CPU's to reschedule on aggressively enough.
Mike Galbraith suspects that it is likely due to the user-mode spinlocks
in PostgreSQL not reacting well to preemption, but we don't really know
the details - I'll just revert the commit for now.
There are hopefully other approaches to improve scheduler scalability
without it causing these kinds of downsides.
Reported-by: Nikolay Ulyanitsky <lystor@gmail.com>
Bisected-by: Borislav Petkov <bp@alien8.de>
Acked-by: Mike Galbraith <efault@gmx.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chanwoo Choi [Tue, 21 Aug 2012 06:16:23 +0000 (15:16 +0900)]
mfd: MAX77693: Fix NULL pointer error when initializing irqs
This patch initializes the register map of the MUIC device, because the
MFD driver of the Maxim MAX77693 uses the regmap-muic instance of the MUIC
device when the MAX77693 irqs are initialized, which happens before the
max77693-muic probe() function is called.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reported-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Chanwoo Choi [Tue, 21 Aug 2012 06:15:52 +0000 (15:15 +0900)]
mfd: MAX77693: Fix interrupt handling bug
This patch fixes bugs related to interrupt handling for MAX77693 devices.
- Unmask the interrupt masking bit for charger/flash/muic to resolve the
problem that no interrupt occurs when an external connector is attached.
- Fix the wrong regmap instance used when a muic interrupt occurs.
This patch was discussed and confirmed at the URL below:
http://lkml.org/lkml/2012/7/16/118
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Mark Brown [Tue, 11 Sep 2012 07:16:36 +0000 (15:16 +0800)]
mfd: core: Push irqdomain mapping out into devices
Currently the MFD core supports remapping MFD cell interrupts using an
irqdomain but only if the MFD is being instantiated using device tree
and only if the device tree bindings use the pattern of registering IPs
in the device tree with compatible properties. This will be actively
harmful for drivers which support non-DT platforms and use this pattern
for their DT bindings as it will mean that the core will silently change
remapping behaviour and it is also limiting for drivers which don't do
DT with this particular pattern. There is also a potential fragility if
there are interrupts not associated with MFD cells and all the cells are
omitted from the device tree for some reason.
Instead change the code to take an IRQ domain as an optional argument,
allowing drivers to take the decision about the parent domain for their
interrupts. The one current user of this feature is ab8500-core, it has
the domain lookup pushed out into the driver.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Takashi Iwai [Sat, 15 Sep 2012 06:24:42 +0000 (08:24 +0200)]
Merge tag 'asoc-3.6' of git://git./linux/kernel/git/broonie/sound into for-linus
ASoC: Updates for 3.6
A bigger set of updates than I'm entirely comfortable with - things
backed up a bit due to travel. As ever the majority of these are small,
focused updates for specific drivers though there are a couple of core
changes. There's been good exposure in -next.
The AT91 patch fixes a build break.
Linus Torvalds [Sat, 15 Sep 2012 01:05:14 +0000 (18:05 -0700)]
Merge git://git./linux/kernel/git/steve/gfs2-3.0-fixes
Pull GFS2 fixes from Steven Whitehouse:
"Here are three GFS2 fixes for the current kernel tree. These are all
related to the block reservation code which was added at the merge
window. That code will be getting an update at the forthcoming merge
window too. In the mean time though there are a few smaller issues
which should be fixed.
The first patch resolves an issue with write sizes of greater than 32
bits with the size hinting code. The second ensures that the
allocation data structure is initialised when using xattrs and the
third takes into account allocations which may have been made by other
nodes which affect a reservation on the local node."
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
GFS2: Take account of blockages when using reserved blocks
GFS2: Fix missing allocation data for set/remove xattr
GFS2: Make write size hinting code common
Linus Torvalds [Sat, 15 Sep 2012 00:59:35 +0000 (17:59 -0700)]
Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86
Pull x86 platform driver updates from Matthew Garrett:
"A few small updates for 3.6 - a trivial regression fix and a couple of
conformance updates for the gmux driver, plus some tiny fixes for
asus-wmi, eeepc-laptop and thinkpad_acpi."
* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86:
thinkpad_acpi: buffer overflow in fan_get_status()
eeepc-laptop: fix device reference count leakage in eeepc_rfkill_hotplug()
platform/x86: fix asus_laptop.wled_type description
asus-laptop: HRWS/HWRS typo
drivers-platform-x86: remove useless #ifdef CONFIG_ACPI_VIDEO
apple-gmux: Fix port address calculation in gmux_pio_write32()
apple-gmux: Fix index read functions
apple-gmux: Obtain version info from indexed gmux
Linus Torvalds [Sat, 15 Sep 2012 00:55:57 +0000 (17:55 -0700)]
Merge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux
Pull i2c embedded fixes from Wolfram Sang:
"The last bunch of (typical) i2c-embedded driver fixes for 3.6.
Also update the MAINTAINERS file to point to my tree since people keep
asking where to find their patches."
* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
i2c: algo: pca: Fix mode selection for PCA9665
MAINTAINERS: fix tree for current i2c-embedded development
i2c: mxs: correctly setup speed for non devicetree
i2c: pnx: Fix read transactions of >= 2 bytes
i2c: pnx: Fix bit definitions
Linus Torvalds [Sat, 15 Sep 2012 00:53:55 +0000 (17:53 -0700)]
Merge tag 'ecryptfs-3.6-rc6-fixes' of git://git./linux/kernel/git/tyhicks/ecryptfs
Pull ecryptfs fixes from Tyler Hicks:
- Fixes a regression, introduced in 3.6-rc1, when a file is closed
before its shared memory mapping is dirtied and unmapped. The lower
file was being released when the eCryptfs file was closed and the
dirtied pages could not be written out.
- Adds a call to the lower filesystem's ->flush() from
ecryptfs_flush().
- Fixes a regression, introduced in 2.6.39, when a file is renamed on
top of another file. The target file's inode was not being evicted
and the space taken by the file was not reclaimed until eCryptfs was
unmounted.
* tag 'ecryptfs-3.6-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
eCryptfs: Copy up attributes of the lower target inode after rename
eCryptfs: Call lower ->flush() from ecryptfs_flush()
eCryptfs: Write out all dirty pages just before releasing the lower file
Linus Torvalds [Sat, 15 Sep 2012 00:53:11 +0000 (17:53 -0700)]
Merge branch 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull one more DMA-mapping fix from Marek Szyprowski:
"This patch fixes very subtle bug (typical off-by-one error) which
might appear in very rare circumstances."
* 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
arm: mm: fix DMA pool affiliation check
Linus Torvalds [Sat, 15 Sep 2012 00:52:29 +0000 (17:52 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"Fix word size register read and write operations in ina2xx driver, and
initialize uninitialized structure elements in twl4030-madc-hwmon
driver."
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (ina2xx) Fix word size register read and write operations
hwmon: (twl4030-madc-hwmon) Initialize uninitialized structure elements
Linus Torvalds [Sat, 15 Sep 2012 00:51:10 +0000 (17:51 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"I realise this a bit bigger than I would want at this point.
Exynos is a large chunk, I got them to half what they wanted already,
and hey its ARM based, so not going to hurt many people.
Radeon has only two fixes, but the PLL fixes were a bit bigger, but
required for a lot of scenarios, the fence fix is really urgent.
vmwgfx: I've pulled in a dumb ioctl support patch that I was going to
shove in later and cc stable, but we need it asap, its mainly to stop
mesa growing a really ugly dependency in userspace to run stuff on
vmware, and if I don't stick it in the kernel now, everyone will have
to ship ugly userspace libs to workaround it.
nouveau: single urgent fix found in F18 testing, causes X to not start
properly when f18 plymouth is used
i915: smattering of fixes and debug quieting
gma500: single regression fix
So as I said a bit large, but its fairly well scattered and its all
stuff I'll be shipping in F18's 3.6 kernel."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (26 commits)
drm/nouveau: fix booting with plymouth + dumb support
drm/radeon: make 64bit fences more robust v3
drm/radeon: rework pll selection (v3)
drm: Drop the NV12M and YUV420M formats
drm/exynos: remove DRM_FORMAT_NV12M from plane module
drm/exynos: fix double call of drm_prime_(init/destroy)_file_private
drm/exynos: add dummy support for dmabuf-mmap
drm/exynos: Add missing braces around sizeof in exynos_mixer.c
drm/exynos: Add missing braces around sizeof in exynos_hdmi.c
drm/exynos: Make g2d_pm_ops static
drm/exynos: Add dependency for G2D in Kconfig
drm/exynos: fixed page align bug.
drm/exynos: Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(.. [1]
drm/exynos: Use devm_* functions in exynos_drm_g2d.c file
drm/exynos: Use devm_kzalloc in exynos_drm_hdmi.c file
drm/exynos: Use devm_kzalloc in exynos_drm_vidi.c file
drm/exynos: Remove redundant check in exynos_drm_fimd.c file
drm/exynos: Remove redundant check in exynos_hdmi.c file
vmwgfx: add dumb ioctl support
gma500: Fix regression on Oaktrail devices
...
Linus Torvalds [Sat, 15 Sep 2012 00:44:52 +0000 (17:44 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Smaller fixlets"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix kernel-doc warnings in kernel/sched/fair.c
sched: Unthrottle rt runqueues in __disable_runtime()
sched: Add missing call to calc_load_exit_idle()
sched: Fix load avg vs cpu-hotplug
Linus Torvalds [Sat, 15 Sep 2012 00:43:45 +0000 (17:43 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"This tree includes various fixes"
Ingo really needs to improve on the whole "explain git pull" part.
"Various fixes" indeed.
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/hwpb: Invoke __perf_event_disable() if interrupts are already disabled
perf/x86: Enable Intel Cedarview Atom suppport
perf_event: Switch to internal refcount, fix race with close()
oprofile, s390: Fix uninitialized memory access when writing to oprofilefs
perf/x86: Fix microcode revision check for SNB-PEBS
Linus Torvalds [Sat, 15 Sep 2012 00:43:14 +0000 (17:43 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull a core sparse warning fix from Ingo Molnar
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
mm/memblock: Use NULL instead of 0 for pointers
Chris Mason [Sat, 15 Sep 2012 00:06:30 +0000 (20:06 -0400)]
Revert "Btrfs: fix some error codes in btrfs_qgroup_inherit()"
This reverts commit 5986802c2fcc754040bb7ed95f30bb16c4a843b7.
Both paths are not error paths but regular cases where non-qgroup
subvols are involved.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Linus Torvalds [Fri, 14 Sep 2012 22:34:07 +0000 (15:34 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Use after free and new device IDs in bluetooth from Andre Guedes,
Yevgeniy Melnichuk, Gustavo Padovan, and Henrik Rydberg.
2) Fix crashes with short packet lengths and VLAN in pktgen, from
Nishank Trivedi.
3) mISDN calls flush_work_sync() with locks held, fix from Karsten
Keil.
4) Packet scheduler gred parameters are reported to userspace
improperly scaled, and WRED idling is not performed correctly. All
from David Ward.
5) Fix TCP socket refcount problem in ipv6, from Julian Anastasov.
6) ibmveth device has RX queue alignment requirements which are not
being explicitly met resulting in sporadic failures, fix from
Santiago Leon.
7) Netfilter needs to take care when interpreting sockets attached to
socket buffers, they could be time-wait minisockets. Fix from Eric
Dumazet.
8) sock_edemux() has the same issue as netfilter did in #7 above, fix
from Eric Dumazet.
9) Avoid infinite loops in CBQ scheduler with some configurations, from
Eric Dumazet.
10) Deal with "Reflection scan: an Off-Path Attack on TCP", from Jozsef
Kadlecsik.
11) SCTP overcharges socket for TX packets, fix from Thomas Graf.
12) CODEL packet scheduler should not reset its state every time it
builds a new flow, fix from Eric Dumazet.
13) Fix memory leak in nl80211, from Wei Yongjun.
14) NETROM doesn't check skb_copy_datagram_iovec() return values, from
Alan Cox.
15) l2tp ethernet was using sizeof(ETH_HLEN) instead of plain ETH_HLEN,
oops. From Eric Dumazet.
16) Fix selection of ath9k chips on which PA linearization and AM2PM
predistortion are used, from Felix Fietkau.
17) Flow steering settings in mlx4 driver need to be validated properly,
from Hadar Hen Zion.
18) bnx2x doesn't show the correct link duplex setting, from Yaniv
Rosner.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
pktgen: fix crash with vlan and packet size less than 46
bnx2x: Add missing afex code
bnx2x: fix registers dumped
bnx2x: correct advertisement of pause capabilities
bnx2x: display the correct duplex value
bnx2x: prevent timeouts when using PFC
bnx2x: fix stats copying logic
bnx2x: Avoid sending multiple statistics queries
net: qmi_wwan: call subdriver with control intf only
net_sched: gred: actually perform idling in WRED mode
net_sched: gred: fix qave reporting via netlink
net_sched: gred: eliminate redundant DP prio comparisons
net_sched: gred: correct comment about qavg calculation in RIO mode
mISDN: Fix wrong usage of flush_work_sync while holding locks
netfilter: log: Fix log-level processing
net-sched: sch_cbq: avoid infinite loop
net: qmi_wwan: fix Gobi device probing for un2430
net: fix net/core/sock.c build error
ixp4xx_hss: fix build failure due to missing linux/module.h inclusion
caif: move the dereference below the NULL test
...
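Item 15 in the summary above (sizeof(ETH_HLEN) used where plain ETH_HLEN
was meant) can be shown in isolation. This is a minimal sketch, not the
l2tp driver code: ETH_HLEN is redefined locally with the value from
<linux/if_ether.h>, and both helper functions are hypothetical names.

```c
#include <stddef.h>

/* Ethernet header length in octets; same value as <linux/if_ether.h>. */
#define ETH_HLEN 14

/*
 * Hypothetical helper showing the mistake: sizeof is applied to the
 * integer constant 14, so the result is sizeof(int), not 14.
 */
size_t wrong_header_len(void)
{
    return sizeof(ETH_HLEN);
}

/* Hypothetical helper showing the intended value. */
size_t right_header_len(void)
{
    return ETH_HLEN;
}
```

On common ABIs sizeof(int) is 4, so any length computed from the first
form would come out 10 octets short of a full Ethernet header.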
Linus Torvalds [Fri, 14 Sep 2012 21:54:57 +0000 (14:54 -0700)]
Merge tag 'usb-3.6-rc6' of git://git./linux/kernel/git/gregkh/usb
Pull USB patches from Greg Kroah-Hartman:
"Here are a number of USB patches, a bit more than I normally like this
late in the -rc series, but given people's vacations (myself
included), and the kernel summit, it seems to have happened this way.
All are tiny, but they add up. A number of gadget and xhci fixes, and
a few new device ids. All have been tested in linux-next.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'usb-3.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (33 commits)
usb: chipidea: udc: don't stall endpoint if request list is empty in isr_tr_complete_low
usb: chipidea: cleanup dma_pool if udc_start() fails
usb: chipidea: udc: fix error path in udc_start()
usb: chipidea: udc: add pullup function, needed by the uvc gadget
usb: chipidea: udc: fix setup of endpoint maxpacket size
USB: option: replace ZTE K5006-Z entry with vendor class rule
EHCI: Update qTD next pointer in QH overlay region during unlink
USB: cdc-wdm: fix wdm_find_device* return value
USB: ftdi_sio: do not claim CDC ACM function
usb: dwc3: gadget: fix pending isoc handling
usb: renesas_usbhs: fixup DMA transport data alignment
usb: gadget: at91udc: Don't check for ep->ep.desc
usb: gadget: at91udc: don't overwrite driver data
usb: dwc3: core: fix incorrect usage of resource pointer
usb: musb: musbhsdma: fix IRQ check
usb: musb: tusb6010: fix error path in tusb_probe()
usb: musb: host: fix for musb_start_urb Oops
usb: gadget: dummy_hcd: add support for USB_DT_BOS on rh
usb: gadget: dummy_hcd: fixup error probe path
usb: gadget: s3c-hsotg.c: fix error return code
...
Linus Torvalds [Fri, 14 Sep 2012 21:54:29 +0000 (14:54 -0700)]
Merge tag 'tty-3.6-rc6' of git://git./linux/kernel/git/gregkh/tty
Pull TTY fixes from Greg Kroah-Hartman:
"Here are 2 tiny patches for a serial driver to resolve issues that
people have reported with the 3.6-rc tree.
Both of these have been in the linux-next tree for a while now.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'tty-3.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: serial: imx: don't reinit clock in imx_setup_ufcr()
tty: serial: imx: console write routing is unsafe on SMP
Linus Torvalds [Fri, 14 Sep 2012 21:53:51 +0000 (14:53 -0700)]
Merge tag 'staging-3.6-rc6' of git://git./linux/kernel/git/gregkh/staging
Pull staging tree fixes from Greg Kroah-Hartman:
"Here are a few staging tree fixes for problems that have been
reported.
Nothing major, just a number of tiny driver fixes. All of these have
been in the linux-next tree for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'staging-3.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
drm/omap: add more new timings fields
drm/omap: update for interlaced
staging: r8712u: fix bug in r8712_recv_indicatepkt()
staging: zcache: fix cleancache race condition with shrinker
Staging: Android alarm: IOCTL command encoding fix
staging: vt6656: [BUG] - Failed connection, incorrect endian.
staging: ozwpan: fix memcmp() test in oz_set_active_pd()
staging: wlan-ng: Fix problem with wrong arguments
staging: comedi: das08: Correct AO output for das08jr-16-ao
staging: comedi: das08: Correct AI encoding for das08jr-16-ao
staging: comedi: das08: Fix PCI ref count
staging: comedi: amplc_pci230: Fix PCI ref count
staging: comedi: amplc_pc263: Fix PCI ref count
staging: comedi: amplc_pc236: Fix PCI ref count
staging: comedi: amplc_dio200: Fix PCI ref count
staging: comedi: amplc_pci224: Fix PCI ref count
drivers/iio/adc/at91_adc.c: adjust inconsistent IS_ERR and PTR_ERR
staging iio: fix potential memory leak in lis3l02dq_ring.c
staging:iio: prevent divide by zero bugs
Linus Torvalds [Fri, 14 Sep 2012 21:53:22 +0000 (14:53 -0700)]
Merge tag 'driver-core-3.6-rc6' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg Kroah-Hartman:
"Here is one fix for 3.6-rc6 for the kobject.h file.
It fixes a reported oops if CONFIG_HOTPLUG is disabled. It's been in
the linux-next tree for a while now.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'driver-core-3.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
kobject: fix oops with "input0: bad kobj_uevent_env content in show_uevent()"
Linus Torvalds [Fri, 14 Sep 2012 21:48:21 +0000 (14:48 -0700)]
vfs: make O_PATH file descriptors usable for 'fstat()'
We already use them for openat() and friends, but fstat() also wants to
be able to use O_PATH file descriptors. This should make it more
directly comparable to the O_SEARCH of Solaris.
Note that you could already do the same thing with "fstatat()" and an
empty path, but just doing "fstat()" directly is simpler and faster, so
there is no reason not to just allow it directly.
See also commit 332a2e1244bd, which did the same thing for fchdir, for
the same reasons.
Reported-by: ольга крыжановская <olga.kryzhanovska@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org # O_PATH introduced in 3.0+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
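From userspace, the behaviour described in this commit can be exercised
roughly as follows. This is a sketch under assumptions: stat_via_opath()
is a hypothetical helper name, and the fallback O_PATH value is the one
from the generic Linux fcntl header, used only if libc does not expose
the flag.

```c
#define _GNU_SOURCE            /* asks glibc to expose O_PATH */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000       /* assumption: generic Linux value */
#endif

/*
 * Hypothetical helper: open `path` with O_PATH only, then call fstat()
 * on the resulting descriptor. Returns 0 on success, -1 on failure.
 */
int stat_via_opath(const char *path, struct stat *st)
{
    int fd = open(path, O_PATH);
    if (fd < 0)
        return -1;

    /* On kernels without this change, fstat() on an O_PATH fd fails. */
    int rc = fstat(fd, st);

    close(fd);
    return rc;
}
```

The older equivalent mentioned in the message is fstatat(fd, "", st,
AT_EMPTY_PATH), which needs the extra flag and an empty path argument.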
Aaron Lu [Fri, 14 Sep 2012 18:54:44 +0000 (20:54 +0200)]
ACPI / PM: Use KERN_DEBUG when no power resources are found
Commit a606dac368eed5696fb38e16b1394f1d049c09e9 adds support for linking
devices which have _PRx. If a device does not have _PRx, a warning
message is printed.
That commit targets ZPODD on Intel ZPODD-capable platforms. On other
platforms it is not a problem if the device has no power resource,
so a warning is not appropriate there; change it to debug.
Reported-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Roland Dreier [Fri, 14 Sep 2012 17:42:52 +0000 (10:42 -0700)]
Merge branches 'cxgb4', 'ipoib', 'mlx4', 'ocrdma' and 'qib' into for-next
Mike Marciniszyn [Wed, 12 Sep 2012 13:01:29 +0000 (13:01 +0000)]
IB/qib: Fix failure of compliance test C14-024#06_LocalPortNum
Commit 3236b2d469db ("IB/qib: MADs with misset M_Keys should return
failure") introduced a return code assignment that unfortunately
introduced an unconditional exit for the routine due to missing braces.
This patch adds the braces to correct the original patch.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
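The class of bug described above (an assignment added under an if
without braces, which leaves the following exit unconditional) looks
roughly like this. It is an illustrative sketch only: the function
names, the key_ok parameter and the REPLY_* values are hypothetical and
are not taken from the qib driver.

```c
/* Hypothetical result codes, standing in for the driver's own. */
enum { REPLY_OK = 0, REPLY_FAIL = 1, REPLY_CONTINUE = 2 };

/* Buggy shape: only the assignment is governed by the if. */
int check_buggy(int key_ok)
{
    int ret = REPLY_OK;

    if (!key_ok)
        ret = REPLY_FAIL;
        goto bail;              /* always taken, despite the indentation */

    ret = REPLY_CONTINUE;       /* never reached */
bail:
    return ret;
}

/* Fixed shape: braces make the early exit conditional again. */
int check_fixed(int key_ok)
{
    int ret = REPLY_OK;

    if (!key_ok) {
        ret = REPLY_FAIL;
        goto bail;
    }

    ret = REPLY_CONTINUE;       /* reached whenever the key check passes */
bail:
    return ret;
}
```

Recent GCC versions flag the first form with -Wmisleading-indentation,
which is part of -Wall.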