GitHub/LineageOS/android_kernel_motorola_exynos9610.git
13 years agox86, nmi: Track NMI usage stats
Don Zickus [Fri, 30 Sep 2011 19:06:23 +0000 (15:06 -0400)]
x86, nmi: Track NMI usage stats

Now that the NMI handlers are broken into lists, increment the appropriate
stats for each list.  This allows us to see what is going on when they
get printed out in the next patch.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317409584-23662-6-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agox86, nmi: Add in logic to handle multiple events and unknown NMIs
Don Zickus [Fri, 30 Sep 2011 19:06:22 +0000 (15:06 -0400)]
x86, nmi: Add in logic to handle multiple events and unknown NMIs

Previous patches allow the NMI subsystem to process multiple NMI events
in one NMI.  As previously discussed this can cause issues when an event
triggered another NMI but is processed in the current NMI.  This causes the
next NMI to go unprocessed and become an 'unknown' NMI.

To handle this, we first have to flag whether the NMI handler handled
more than one event.  If it did, then there exists a chance that
the next NMI might be already processed.  Once the NMI is flagged as a
candidate to be swallowed, we next look for a back-to-back NMI condition.

This is determined by looking at the %rip from pt_regs.  If it is the same
as the previous NMI, it is assumed the cpu did not have a chance to jump
back into a non-NMI context and execute code and instead handled another NMI.

If both of those conditions are true then we will swallow any unknown NMI.

There still exists a chance that we accidentally swallow a real unknown NMI,
but for now things seem better.
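
The two-condition check described above can be modelled in user space roughly
as follows.  This is only a simplified sketch, not the kernel code: struct
nmi_state, nmi_model() and the MODEL_* outcomes are hypothetical names chosen
for illustration, and the real implementation would keep this state per CPU.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping; the kernel would keep one copy per CPU. */
struct nmi_state {
	unsigned long last_rip;   /* %rip seen at the previous NMI */
	bool swallow_next;        /* previous NMI handled more than one event */
};

enum nmi_outcome { MODEL_HANDLED, MODEL_SWALLOWED, MODEL_UNKNOWN };

/*
 * rip:     interrupted instruction pointer taken from pt_regs
 * handled: number of events the registered handlers claimed
 */
static enum nmi_outcome nmi_model(struct nmi_state *s, unsigned long rip,
				  int handled)
{
	/* Same %rip as last time: assume no normal code ran in between. */
	bool b2b = (rip == s->last_rip);

	if (!b2b)
		s->swallow_next = false;
	s->last_rip = rip;

	if (handled > 0) {
		/* More than one event: the next NMI may already be processed. */
		if (handled > 1)
			s->swallow_next = true;
		return MODEL_HANDLED;
	}

	/* Nobody claimed it: swallow only if both conditions hold. */
	if (b2b && s->swallow_next)
		return MODEL_SWALLOWED;

	return MODEL_UNKNOWN;
}
```

An NMI that handled two events followed by an unclaimed NMI at the same %rip
is swallowed; an unclaimed NMI at a different %rip is still reported as
unknown.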

An optimization has also been added to the nmi notifier routine.  Because x86
can latch up to one NMI while currently processing an NMI, we don't have to
worry about executing _all_ the handlers in a standalone NMI.  The idea is
if multiple NMIs come in, the second NMI will represent them.  For those
back-to-back NMI cases, we have the potential to drop NMIs.  Therefore only
execute all the handlers in the second half of a detected back-to-back NMI.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317409584-23662-5-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agox86, nmi: Wire up NMI handlers to new routines
Don Zickus [Fri, 30 Sep 2011 19:06:21 +0000 (15:06 -0400)]
x86, nmi: Wire up NMI handlers to new routines

Just convert all the files that have an nmi handler to the new routines.
Most of it is straightforward conversion.  A couple of places needed some
tweaking, such as kgdb, which separates the debug notifier from the nmi
handler, and mce, which removes a call to notify_die.

[Thanks to Ying for finding out the history behind that mce call

https://lkml.org/lkml/2010/5/27/114

And Boris responding that he would like to remove that call because of it

https://lkml.org/lkml/2011/9/21/163]

The things that get converted are the registration/unregistration routines,
and the nmi handler itself has its args changed along with removal of the
code that checked which list it is on (most are on one NMI list except for
kgdb which has both an NMI routine and an NMI Unknown routine).

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Corey Minyard <minyard@acm.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: Jack Steiner <steiner@sgi.com>
Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agox86, nmi: Create new NMI handler routines
Don Zickus [Fri, 30 Sep 2011 19:06:20 +0000 (15:06 -0400)]
x86, nmi: Create new NMI handler routines

The NMI handlers used to rely on the notifier infrastructure.  This worked
great until we wanted to support handling multiple events better.

One of the key ideas to the nmi handling is to process _all_ the handlers for
each NMI.  The reason behind this switch is because NMIs are edge triggered.
If enough NMIs are triggered, then they could be lost because the cpu can
only latch at most one NMI (besides the one currently being processed).

In order to deal with this we have decided to process all the NMI handlers
for each NMI.  This allows the handlers to determine if they received an
event or not (the ones that can not determine this will be left to fend
for themselves on the unknown NMI list).
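
As a rough illustration of "process all the handlers for each NMI", the walk
below never stops at the first handler that claims an event and instead sums
what every handler reports.  It is a user-space sketch under assumed names
(struct nmi_action_model, run_all_handlers() and the two sample handlers are
hypothetical), not the infrastructure this patch adds.

```c
#include <stddef.h>

/* A handler returns the number of events it handled (0 if none). */
typedef int (*nmi_handler_fn)(void *ctx);

/* Hypothetical singly linked list of registered handlers. */
struct nmi_action_model {
	nmi_handler_fn handler;
	void *ctx;
	struct nmi_action_model *next;
};

/* Call every handler on the list and return the total events handled. */
static int run_all_handlers(const struct nmi_action_model *head)
{
	const struct nmi_action_model *a;
	int handled = 0;

	for (a = head; a != NULL; a = a->next)
		handled += a->handler(a->ctx);

	return handled;
}

/* Sample handler that never has an event pending. */
static int no_events(void *ctx)
{
	(void)ctx;
	return 0;
}

/* Sample handler that reports the event count stored in *ctx. */
static int count_events(void *ctx)
{
	return *(int *)ctx;
}
```

A total of zero means no handler recognized the NMI; a total above one is the
case where a later NMI may turn out to be already processed.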

As a result of this change it is now possible to have an extra NMI that
was destined to be received for an already processed event.  Because the
event was processed in the previous NMI, this NMI gets dropped and becomes
an 'unknown' NMI.  This of course will cause printks that scare people.

However, we prefer to have extra NMIs as opposed to losing NMIs and as such
have developed a basic mechanism to catch most of them.  That will be
a later patch.

To accomplish this idea, I unhooked the nmi handlers from the notifier
routines and created a new mechanism loosely based on doIRQ.  The reason
for this is the notifier routines have a couple of shortcomings.  First, we
couldn't guarantee all future NMI handlers used NOTIFY_OK instead of
NOTIFY_STOP.  Second, we couldn't keep track of the number of events being
handled in each routine (most only handle one, perf can handle more than one).
Third, I wanted to eventually display which nmi handlers are registered in
the system in /proc/interrupts to help see who is generating NMIs.

The patch below just implements the new infrastructure but doesn't wire it up
yet (that is the next patch).  Its design is based on doIRQ structs and the
atomic notifier routines.  So the rcu stuff in the patch isn't entirely untested
(as the notifier routines have soaked it) but it should be double checked in
case I copied the code wrong.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317409584-23662-3-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agox86, nmi: Split out nmi from traps.c
Don Zickus [Fri, 30 Sep 2011 19:06:19 +0000 (15:06 -0400)]
x86, nmi: Split out nmi from traps.c

The nmi stuff is changing a lot and adding more functionality.  Split it
out from the traps.c file so it doesn't continue to pollute that file.

This makes it easier to find and expand all the future nmi related work.

No real functional changes here.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317409584-23662-2-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agoperf, intel: Use GO/HO bits in perf-ctr
Gleb Natapov [Wed, 5 Oct 2011 12:01:21 +0000 (14:01 +0200)]
perf, intel: Use GO/HO bits in perf-ctr

Intel does not have guest/host-only bit in perf counters like AMD
does.  To support GO/HO bits KVM needs to switch EVENTSELn values
(or PERF_GLOBAL_CTRL if available) at a guest entry. If a counter is
configured to count only in a guest mode it stays disabled in a host,
but VMX is configured to switch it to enabled value during guest entry.

This patch adds GO/HO tracking to Intel perf code and provides interface
for KVM to get a list of MSRs that need to be switched on a guest entry.

Only cpus with architectural PMU (v1 or later) are supported with this
patch.  To my knowledge there are no p6 models with VMX but without
architectural PMU and p4 with VMX are rare and the interface is general
enough to support them if the need arises.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317816084-18026-7-git-send-email-gleb@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agoperf, amd: Use GO/HO bits in perf-ctr
Joerg Roedel [Wed, 5 Oct 2011 12:01:17 +0000 (14:01 +0200)]
perf, amd: Use GO/HO bits in perf-ctr

The AMD perf-counters support counting in guest or host-mode
only. Make use of that feature when user-space specified
guest/host-mode only counting.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317816084-18026-3-git-send-email-gleb@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agoperf, core: Introduce attrs to count in either host or guest mode
Joerg Roedel [Wed, 5 Oct 2011 12:01:16 +0000 (14:01 +0200)]
perf, core: Introduce attrs to count in either host or guest mode

The two new attributes exclude_guest and exclude_host can
be used by user-space to tell the kernel to set up a
performance counter to count only while the CPU is in
guest or in host mode.

An additional check is also introduced to make sure
user-space does not try to exclude both guest and host mode from
counting.
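
The sanity check can be pictured as below.  This is a minimal sketch, not the
perf core code: struct counter_attr_model and validate_exclude_bits() are
hypothetical names, and returning -EINVAL is an assumption about how the
rejection is reported.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two new attribute bits. */
struct counter_attr_model {
	bool exclude_host;   /* do not count while in host mode */
	bool exclude_guest;  /* do not count while in guest mode */
};

/* Reject a request that could never count anything. */
static int validate_exclude_bits(const struct counter_attr_model *attr)
{
	if (attr->exclude_host && attr->exclude_guest)
		return -EINVAL;  /* assumed error code */
	return 0;
}
```

Setting only exclude_host asks for guest-only counting, and setting only
exclude_guest asks for host-only counting; both pass the check.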

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317816084-18026-2-git-send-email-gleb@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agoMerge branch 'ras' of git://amd64.org/linux/bp into perf/core
Ingo Molnar [Thu, 6 Oct 2011 10:54:36 +0000 (12:54 +0200)]
Merge branch 'ras' of git://amd64.org/linux/bp into perf/core

13 years agoMerge commit 'v3.1-rc9' into perf/core
Ingo Molnar [Thu, 6 Oct 2011 10:48:57 +0000 (12:48 +0200)]
Merge commit 'v3.1-rc9' into perf/core

Merge reason: pick up latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agoLinux 3.1-rc9
Linus Torvalds [Wed, 5 Oct 2011 01:11:50 +0000 (18:11 -0700)]
Linux 3.1-rc9

13 years agoMerge git://github.com/davem330/net
Linus Torvalds [Tue, 4 Oct 2011 17:37:06 +0000 (10:37 -0700)]
Merge git://github.com/davem330/net

* git://github.com/davem330/net:
  pch_gbe: Fixed the issue on which a network freezes
  pch_gbe: Fixed the issue on which PC was frozen when link was downed.
  make PACKET_STATISTICS getsockopt report consistently between ring and non-ring
  net: xen-netback: correctly restart Tx after a VM restore/migrate
  bonding: properly stop queuing work when requested
  can bcm: fix incomplete tx_setup fix
  RDSRDMA: Fix cleanup of rds_iw_mr_pool
  net: Documentation: Fix type of variables
  ibmveth: Fix oops on request_irq failure
  ipv6: nullify ipv6_ac_list and ipv6_fl_list when creating new socket
  cxgb4: Fix EEH on IBM P7IOC
  can bcm: fix tx_setup off-by-one errors
  MAINTAINERS: tehuti: Alexander Indenbaum's address bounces
  dp83640: reduce driver noise
  ptp: fix L2 event message recognition

13 years agoMerge branch 'fix/asoc' of git://github.com/tiwai/sound
Linus Torvalds [Tue, 4 Oct 2011 16:59:22 +0000 (09:59 -0700)]
Merge branch 'fix/asoc' of git://github.com/tiwai/sound

* 'fix/asoc' of git://github.com/tiwai/sound:
  ASoC: omap_mcpdm_remove cannot be __devexit
  ASoC: Fix setting update bits for WM8753_LADC and WM8753_RADC
  ASoC: use a valid device for dev_err() in Zylonite

13 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Tue, 4 Oct 2011 16:54:18 +0000 (09:54 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon/kms: fix channel_remap setup (v2)
  drm/radeon: Set cursor x/y to 0 when x/yorigin > 0.
  drm/radeon: Update AVIVO cursor coordinate origin before x/yorigin calculation.
  drm/radeon: Simplify cursor x/yorigin calculation.
  drm/radeon/kms: fix cursor image off-by-one error
  drm/radeon/kms: Fix logic error in DP HPD handler
  drm/radeon/kms: add retry limits for native DP aux defer
  drm/radeon/kms: fix regression in DP aux defer handling

13 years agoMerge branch 'spi/merge' of git://git.secretlab.ca/git/linux-2.6
Linus Torvalds [Tue, 4 Oct 2011 16:52:56 +0000 (09:52 -0700)]
Merge branch 'spi/merge' of git://git.secretlab.ca/git/linux-2.6

* 'spi/merge' of git://git.secretlab.ca/git/linux-2.6:
  spi-topcliff-pch: Fix overrun issue
  spi-topcliff-pch: Add recovery processing in case FIFO overrun error occurs
  spi-topcliff-pch: Fix CPU read complete condition issue
  spi-topcliff-pch: Fix SSN Control issue
  spi-topcliff-pch: add tx-memory clear after complete transmitting

13 years agoPCI: Disable MPS configuration by default
Jon Mason [Mon, 3 Oct 2011 14:50:20 +0000 (09:50 -0500)]
PCI: Disable MPS configuration by default

Add the ability to disable PCI-E MPS tuning and use the BIOS
configured MPS defaults.  Due to the number of issues recently
discovered on some x86 chipsets, make this the default behavior.

Also, add the option for peer to peer DMA MPS configuration.  Peer to
peer DMA is outside the scope of this patch, but MPS configuration could
prevent it from working by having the MPS on one root port different
than the MPS on another.  To work around this, simply make the system
wide MPS the smallest possible value (128B).

Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agodrm/radeon/kms: fix channel_remap setup (v2)
Alex Deucher [Tue, 4 Oct 2011 14:46:34 +0000 (10:46 -0400)]
drm/radeon/kms: fix channel_remap setup (v2)

Most asics just use the hw default value which requires
no explicit programming.  For those that need a different
value, the vbios will program it properly.  As such,
there's no need to program these registers explicitly
in the driver.  Changing MC_SHARED_CHREMAP requires a reload
of all data in vram otherwise its contents will be scrambled.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=40103

v2: drop now unused channel_remap functions.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agospi-topcliff-pch: Fix overrun issue
Tomoya MORINAGA [Tue, 6 Sep 2011 08:16:38 +0000 (17:16 +0900)]
spi-topcliff-pch: Fix overrun issue

We found that under load, Rx data is sometimes dropped (in DMA transfer mode).
The cause is that Tx-DMA processing starts before Rx-DMA processing starts.
This causes a FIFO overrun.

This patch fixes the issue by modifying FIFO tx-threshold and DMA descriptor
size like below.

                      Current                   this patch
Rx-descriptor   4Byte+12Byte*341    -->    12Byte*340+4Byte+12Byte
Rx-threshold                   (Not modified)
Tx-descriptor   4Byte+12Byte*341    -->    16Byte+12Byte*340
Tx-threshold    12Byte              -->    2Byte

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
13 years agospi-topcliff-pch: Add recovery processing in case FIFO overrun error occurs
Tomoya MORINAGA [Tue, 6 Sep 2011 08:16:37 +0000 (17:16 +0900)]
spi-topcliff-pch: Add recovery processing in case FIFO overrun error occurs

Add recovery processing in case FIFO overrun error occurs with DMA transfer mode.

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
13 years agospi-topcliff-pch: Fix CPU read complete condition issue
Tomoya MORINAGA [Tue, 6 Sep 2011 08:16:36 +0000 (17:16 +0900)]
spi-topcliff-pch: Fix CPU read complete condition issue

We found that Rx data is sometimes dropped (in non-DMA transfer mode).
The cause is that the read complete condition is not correct.

This patch fixes the issue.

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
13 years agospi-topcliff-pch: Fix SSN Control issue
Tomoya MORINAGA [Tue, 6 Sep 2011 08:16:35 +0000 (17:16 +0900)]
spi-topcliff-pch: Fix SSN Control issue

While processing one command/data series,
SSN should stay LOW.
However, currently, SSN goes HIGH.
This patch fixes the issue.

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
13 years agospi-topcliff-pch: add tx-memory clear after complete transmitting
Tomoya MORINAGA [Tue, 6 Sep 2011 08:16:34 +0000 (17:16 +0900)]
spi-topcliff-pch: add tx-memory clear after complete transmitting

Currently, when reading data from SPI flash,
the command is sent twice.
The cause is that tx-memory clear processing is missing.
This patch adds the tx-memory clear processing.

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
13 years agoperf: Fix counter of ftrace events
Andrew Vagin [Mon, 26 Sep 2011 15:55:32 +0000 (19:55 +0400)]
perf: Fix counter of ftrace events

Each event adds some points to its counters. By default it adds 1,
and a number of points may be transmitted in the event's parameters.

E.g. sched:sched_stat_runtime adds how long process has been running.

But this functionality was broken by v2.6.31-rc5-392-gf413cdb
and now the event's parameters no longer affect the number of points.

TP_perf_assign isn't defined, so __perf_count(c) isn't executed and
__count is always equal to 1.

Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317052535-1765247-2-git-send-email-avagin@openvz.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agolis3: fix regression of HP DriveGuard with 8bit chip
Takashi Iwai [Tue, 4 Oct 2011 01:09:14 +0000 (18:09 -0700)]
lis3: fix regression of HP DriveGuard with 8bit chip

Commit 2a7fade7e03 ("hwmon: lis3: Power on corrections") caused a
regression on HP laptops with 8bit chip.  Writing CTRL2_BOOT_8B bit seems
to clear the BIOS setup, and no proper interrupt for DriveGuard will be
triggered any more.

Since the init code there is basically only for embedded devices, put a
pdata check so that the problematic initialization will be skipped for
hp_accel stuff.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: Eric Piel <eric.piel@tremplin-utc.net>
Cc: Samu Onkalo <samu.p.onkalo@nokia.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge branch 'hwmon-for-linus' of git://github.com/groeck/linux
Linus Torvalds [Mon, 3 Oct 2011 19:54:56 +0000 (12:54 -0700)]
Merge branch 'hwmon-for-linus' of git://github.com/groeck/linux

* 'hwmon-for-linus' of git://github.com/groeck/linux:
  hwmon: (coretemp) Avoid leaving around dangling pointer
  hwmon: (coretemp) Fixup platform device ID change

13 years agoMerge git://github.com/davem330/ide
Linus Torvalds [Mon, 3 Oct 2011 19:53:43 +0000 (12:53 -0700)]
Merge git://github.com/davem330/ide

* git://github.com/davem330/ide:
  ide-disk: Fix request requeuing

13 years agoMerge branch 'btrfs-3.0' of git://github.com/chrismason/linux
Linus Torvalds [Mon, 3 Oct 2011 19:17:44 +0000 (12:17 -0700)]
Merge branch 'btrfs-3.0' of git://github.com/chrismason/linux

* 'btrfs-3.0' of git://github.com/chrismason/linux:
  Btrfs: force a page fault if we have a shorty copy on a page boundary

13 years agoide-disk: Fix request requeuing
Borislav Petkov [Mon, 3 Oct 2011 18:28:18 +0000 (14:28 -0400)]
ide-disk: Fix request requeuing

Simon Kirby reported that on his RAID setup with idedisk underneath
the box OOMs after a couple of days of runtime. Running with
CONFIG_DEBUG_KMEMLEAK pointed to idedisk_prep_fn() which unconditionally
allocates an ide_cmd struct. However, ide_requeue_and_plug() can be
called more than once per request, either from the request issue or the
IRQ handler path, so blk_peek_request() ends up in idedisk_prep_fn()
repeatedly, allocating a struct ide_cmd every time and "forgetting" the
previous pointer.

Make sure the code reuses the old allocated chunk.
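
The reuse pattern can be sketched in user space as follows.  This is an
illustration only: struct cmd_model, struct request_model and prep_request()
are hypothetical stand-ins, and calloc() merely plays the role of the kernel
allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the command and the request that owns it. */
struct cmd_model {
	int opcode;
	unsigned char payload[16];
};

struct request_model {
	struct cmd_model *special;  /* command attached by an earlier prep */
};

/*
 * May be called several times for the same request.  Allocate only on
 * the first call; on later calls clear and reuse the existing command
 * instead of leaking it.
 */
static struct cmd_model *prep_request(struct request_model *rq)
{
	struct cmd_model *cmd = rq->special;

	if (cmd) {
		memset(cmd, 0, sizeof(*cmd));
	} else {
		cmd = calloc(1, sizeof(*cmd));
		rq->special = cmd;  /* stays NULL if the allocation failed */
	}

	return cmd;
}
```

Calling prep_request() twice on the same request returns the same pointer, so
repeated requeues no longer grow memory usage.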

Reported-and-tested-by: Simon Kirby <sim@hostway.ca>
Cc: <stable@kernel.org> [ 39.x, 3.0.x ]
Link: http://marc.info/?l=linux-kernel&m=131667641517919
Link: http://lkml.kernel.org/r/20110922072643.GA27232@hostway.ca
Signed-off-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agopch_gbe: Fixed the issue on which a network freezes
Toshiharu Okada [Sun, 25 Sep 2011 21:27:43 +0000 (21:27 +0000)]
pch_gbe: Fixed the issue on which a network freezes

The pch_gbe driver has an issue where the network stops
when receive traffic is high.
When this happens, the link must be taken down and brought up again to
restore the network.

This patch fixed this issue.

Signed-off-by: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agopch_gbe: Fixed the issue on which PC was frozen when link was downed.
Toshiharu Okada [Sun, 25 Sep 2011 21:27:42 +0000 (21:27 +0000)]
pch_gbe: Fixed the issue on which PC was frozen when link was downed.

When a link goes down during network use,
the PC freezes.

This patch fixed this issue.

Signed-off-by: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agomake PACKET_STATISTICS getsockopt report consistently between ring and non-ring
Willem de Bruijn [Fri, 30 Sep 2011 10:38:28 +0000 (10:38 +0000)]
make PACKET_STATISTICS getsockopt report consistently between ring and non-ring

This is a minor change.

Up until kernel 2.6.32, getsockopt(fd, SOL_PACKET, PACKET_STATISTICS,
...) would return total and dropped packets since its last invocation. The
introduction of socket queue overflow reporting [1] changed drop
rate calculation in the normal packet socket path, but not when using a
packet ring. As a result, the getsockopt now returns different statistics
depending on the reception method used. With a ring, it still returns the
count since the last call, as counts are incremented in tpacket_rcv and
reset in getsockopt. Without a ring, it returns 0 if no drops occurred
since the last getsockopt and the total drops over the lifespan of
the socket otherwise. The culprit is this line in packet_rcv, executed
on a drop:

drop_n_acct:
        po->stats.tp_drops = atomic_inc_return(&sk->sk_drops);

As it shows, the new drop number is taken from the socket drop counter,
which is not reset at getsockopt. I put together a small example
that demonstrates the issue [2]. It runs for 10 seconds and overflows
the queue/ring on every odd second. The reported drop rates are:
ring: 16, 0, 16, 0, 16, ...
non-ring: 0, 15, 0, 30, 0, 46, 0, 60, 0, 74.

Note how the nonzero non-ring counts monotonically increase. Because the
getsockopt adds tp_drops to tp_packets, total counts are similarly
reported cumulatively. Long story short, reinstating the original code, as
the below patch does, fixes the issue at the cost of additional per-packet
cycles. Another solution that does not introduce per-packet overhead
would be to keep the current data path, record the value of sk_drops at
getsockopt() at call N in a new field in struct packetsock and subtract
that when reporting at call N+1. I'll be happy to code that, instead,
it's just more messy.
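
For comparison, the alternative described above (snapshot the cumulative drop
counter at each read and report only the difference) could look roughly like
this.  It is a sketch of the idea, not the patch: struct drop_stats_model, its
fields and read_drops_since_last() are hypothetical names.

```c
/* Hypothetical model of a socket's drop accounting. */
struct drop_stats_model {
	unsigned int sk_drops;            /* cumulative, never reset */
	unsigned int drops_at_last_read;  /* snapshot taken at the last read */
};

/*
 * Report the drops since the previous call and remember the current
 * total for the next one.  Unsigned subtraction stays correct even if
 * the cumulative counter wraps around.
 */
static unsigned int read_drops_since_last(struct drop_stats_model *s)
{
	unsigned int delta = s->sk_drops - s->drops_at_last_read;

	s->drops_at_last_read = s->sk_drops;
	return delta;
}
```

The per-packet path then only increments sk_drops, and the extra bookkeeping
is paid once per getsockopt() call.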

[1] http://patchwork.ozlabs.org/patch/35665/
[2] http://kernel.googlecode.com/files/test-packetsock-getstatistics.c

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: xen-netback: correctly restart Tx after a VM restore/migrate
David Vrabel [Fri, 30 Sep 2011 06:37:51 +0000 (06:37 +0000)]
net: xen-netback: correctly restart Tx after a VM restore/migrate

If a VM is saved and restored (or migrated) the netback driver will no
longer process any Tx packets from the frontend.  xenvif_up() does not
schedule the processing of any pending Tx requests from the front end
because the carrier is off.  Without this initial kick the frontend
just adds Tx requests to the ring without raising an event (until the
ring is full).

This was caused by 47103041e91794acdbc6165da0ae288d844c820b (net:
xen-netback: convert to hw_features) which reordered the calls to
xenvif_up() and netif_carrier_on() in xenvif_connect().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobonding: properly stop queuing work when requested
Andy Gospodarek [Fri, 23 Sep 2011 10:53:34 +0000 (10:53 +0000)]
bonding: properly stop queuing work when requested

During a test where a pair of bonding interfaces using ARP monitoring
were both brought up and torn down (with an rmmod) repeatedly, a panic
in the timer code was noticed.  I tracked this down and determined that
any of the bonding functions that ran as workqueue handlers and requeued
more work might not properly exit when the module was removed.

There was a flag protected by the bond lock called kill_timers that is
set when the interface goes down or the module is removed, but many of
the functions that monitor link status now unlock the bond lock to take
rtnl first.  There is a chance that another CPU running the rmmod could
get the lock and set kill_timers after the first check has passed.

This patch does not allow any function to queue work that will make
itself run unless kill_timers is not set.  I also noticed while doing
this work that bond_resend_igmp_join_requests did not have a check for
kill_timers, so I added the needed call there as well.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Reported-by: Liang Zheng <lzheng@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agodrm/radeon: Set cursor x/y to 0 when x/yorigin > 0.
Michel Dänzer [Fri, 30 Sep 2011 15:16:53 +0000 (17:16 +0200)]
drm/radeon: Set cursor x/y to 0 when x/yorigin > 0.

Apart from the obvious cleanup, this should make the line

cursor_end = x - xorigin + w;

correct now.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm/radeon: Update AVIVO cursor coordinate origin before x/yorigin calculation.
Michel Dänzer [Fri, 30 Sep 2011 15:16:52 +0000 (17:16 +0200)]
drm/radeon: Update AVIVO cursor coordinate origin before x/yorigin calculation.

Fixes cursor disappearing prematurely when moving off a top/left edge which
is not located at the desktop top/left edge.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: stable@kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm/radeon: Simplify cursor x/yorigin calculation.
Michel Dänzer [Fri, 30 Sep 2011 15:16:51 +0000 (17:16 +0200)]
drm/radeon: Simplify cursor x/yorigin calculation.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm/radeon/kms: fix cursor image off-by-one error
Nicholas Miell [Fri, 30 Sep 2011 02:07:14 +0000 (19:07 -0700)]
drm/radeon/kms: fix cursor image off-by-one error

The mouse cursor hotspot calculation when the cursor is partially off the
top or left side of the screen was off by one.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=41158

Signed-off-by: Nicholas Miell <nmiell@gmail.com>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm/radeon/kms: Fix logic error in DP HPD handler
Alex Deucher [Mon, 3 Oct 2011 12:37:33 +0000 (08:37 -0400)]
drm/radeon/kms: Fix logic error in DP HPD handler

Only disable the pipe if the monitor is physically
disconnected.  The previous logic also disabled the
pipe if the link was trained.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=41248

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm/radeon/kms: add retry limits for native DP aux defer
Alex Deucher [Mon, 3 Oct 2011 13:13:46 +0000 (09:13 -0400)]
drm/radeon/kms: add retry limits for native DP aux defer

The previous code could potentially loop forever.  Limit
the number of DP aux defer retries to 4 for native aux
transactions, same as i2c over aux transactions.
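
A bounded retry loop of this kind might look like the sketch below.  The names
(aux_xfer_with_retries(), enum aux_reply_model, defer_n_times()) are
hypothetical, -EIO is an assumed error code, and treating the limit of 4 as
the total number of attempts is also an assumption.

```c
#include <errno.h>

#define AUX_DEFER_RETRY_LIMIT 4  /* limit named in the commit message */

enum aux_reply_model { AUX_MODEL_ACK, AUX_MODEL_DEFER };

/* Performs one aux transaction and reports the sink's reply. */
typedef enum aux_reply_model (*aux_xfer_fn)(void *ctx);

/* Try the transaction a bounded number of times instead of forever. */
static int aux_xfer_with_retries(aux_xfer_fn xfer, void *ctx)
{
	int retry;

	for (retry = 0; retry < AUX_DEFER_RETRY_LIMIT; retry++) {
		if (xfer(ctx) == AUX_MODEL_ACK)
			return 0;
		/* A real driver would wait briefly before retrying. */
	}

	return -EIO;  /* assumed error code once the limit is reached */
}

/* Sample transaction: defers *ctx more times, then acknowledges. */
static enum aux_reply_model defer_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return AUX_MODEL_DEFER;
	}
	return AUX_MODEL_ACK;
}
```

A sink that defers twice and then acknowledges succeeds, while one that keeps
deferring makes the call give up after four attempts.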

Noticed by: Brad Campbell <lists2009@fnarfbargle.com>

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Brad Campbell <lists2009@fnarfbargle.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm/radeon/kms: fix regression in DP aux defer handling
Alex Deucher [Mon, 3 Oct 2011 13:13:45 +0000 (09:13 -0400)]
drm/radeon/kms: fix regression in DP aux defer handling

An incorrect ordering in the error checking code led
to DP aux defer being skipped in the aux native write
path.  Move the bytes transferred check (ret == 0)
below the defer check.

Tracked down by: Brad Campbell <brad@fnarfbargle.com>

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=41121

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Brad Campbell <brad@fnarfbargle.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agoMerge branch 'for-linus' of git://git.infradead.org/users/sameo/mfd-2.6
Linus Torvalds [Mon, 3 Oct 2011 02:23:44 +0000 (19:23 -0700)]
Merge branch 'for-linus' of git://git.infradead.org/users/sameo/mfd-2.6

* 'for-linus' of git://git.infradead.org/users/sameo/mfd-2.6:
  mfd: Fix generic irq chip ack function name for jz4740-adc

13 years agoMerge branch 'for-linus' of git://github.com/tiwai/sound
Linus Torvalds [Mon, 3 Oct 2011 02:22:44 +0000 (19:22 -0700)]
Merge branch 'for-linus' of git://github.com/tiwai/sound

* 'for-linus' of git://github.com/tiwai/sound:
  ALSA: hda - Fix a regression of the position-buffer check

13 years agoASoC: omap_mcpdm_remove cannot be __devexit
Arnd Bergmann [Sun, 2 Oct 2011 14:45:31 +0000 (16:45 +0200)]
ASoC: omap_mcpdm_remove cannot be __devexit

omap_mcpdm_remove is used from asoc_mcpdm_probe, which is an
initcall, and must not be discarded when HOTPLUG is disabled.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
13 years agoASoC: Fix setting update bits for WM8753_LADC and WM8753_RADC
Axel Lin [Sun, 2 Oct 2011 12:41:04 +0000 (20:41 +0800)]
ASoC: Fix setting update bits for WM8753_LADC and WM8753_RADC

The current code sets update bits for WM8753_LDAC and WM8753_RDAC twice,
but missed setting update bits for WM8753_LADC and WM8753_RADC.

I think it is a copy-paste bug in commit 776065
"ASoC: codecs: wm8753: Fix register cache incoherency".

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@kernel.org
13 years agoASoC: use a valid device for dev_err() in Zylonite
Arnd Bergmann [Sat, 1 Oct 2011 20:03:34 +0000 (22:03 +0200)]
ASoC: use a valid device for dev_err() in Zylonite

A recent conversion has introduced references to &pdev->dev, which does
not actually exist in all the contexts it's used in.

Replace this with card->dev where necessary, in order to let
the driver build again.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@kernel.org
13 years agoMerge branch 'perf-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip
Linus Torvalds [Sun, 2 Oct 2011 00:46:13 +0000 (17:46 -0700)]
Merge branch 'perf-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip

* 'perf-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  perf tools: Fix raw sample reading

13 years agoMerge branches 'irq-urgent-for-linus', 'x86-urgent-for-linus' and 'sched-urgent-for...
Linus Torvalds [Sat, 1 Oct 2011 15:37:25 +0000 (08:37 -0700)]
Merge branches 'irq-urgent-for-linus', 'x86-urgent-for-linus' and 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip

* 'irq-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  irq: Fix check for already initialized irq_domain in irq_domain_add
  irq: Add declaration of irq_domain_simple_ops to irqdomain.h

* 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  x86/rtc: Don't recursively acquire rtc_lock

* 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  posix-cpu-timers: Cure SMP wobbles
  sched: Fix up wchan borkage
  sched/rt: Migrate equal priority tasks to available CPUs

13 years agoBtrfs: force a page fault if we have a shorty copy on a page boundary
Josef Bacik [Fri, 30 Sep 2011 19:23:54 +0000 (15:23 -0400)]
Btrfs: force a page fault if we have a shorty copy on a page boundary

A user reported a problem where ceph was getting into 100% cpu usage while doing
some writing.  It turns out we were doing a short write on a page that was not
uptodate, which means we fall back to copying one page at a time and fault the
page in.  The problem is that our position is on the page boundary, so our
fault-in logic wasn't actually reading the page, and we'd just spin forever or
until the page got read in by somebody else.  This patch forces a readpage if we
end up doing a short copy.  Alexandre could reproduce this easily with ceph and
reports it fixes his problem.  I also wrote a reproducer that no longer hangs my
box with this patch.  Thanks,

Reported-and-tested-by: Alexandre Oliva <aoliva@redhat.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agoMerge branch 'perf/urgent' of git://github.com/acmel/linux into perf/urgent
Ingo Molnar [Fri, 30 Sep 2011 18:08:56 +0000 (20:08 +0200)]
Merge branch 'perf/urgent' of git://github.com/acmel/linux into perf/urgent

13 years agoposix-cpu-timers: Cure SMP wobbles
Peter Zijlstra [Thu, 1 Sep 2011 10:42:04 +0000 (12:42 +0200)]
posix-cpu-timers: Cure SMP wobbles

David reported:

  Attached below is a watered-down version of rt/tst-cpuclock2.c from
  GLIBC.  Just build it with "gcc -o test test.c -lpthread -lrt" or
  similar.

  Run it several times, and you will see cases where the main thread
  will measure a process clock difference before and after the nanosleep
  which is smaller than the cpu-burner thread's individual thread clock
  difference.  This doesn't make any sense since the cpu-burner thread
  is part of the top-level process's thread group.

  I've reproduced this on both x86-64 and sparc64 (using both 32-bit and
  64-bit binaries).

  For example:

  [davem@boricha build-x86_64-linux]$ ./test
  process: before(0.001221967) after(0.498624371) diff(497402404)
  thread:  before(0.000081692) after(0.498316431) diff(498234739)
  self:    before(0.001223521) after(0.001240219) diff(16698)
  [davem@boricha build-x86_64-linux]$

  The diff of 'process' should always be >= the diff of 'thread'.

  I make sure to wrap the 'thread' clock measurements the most tightly
  around the nanosleep() call, and that the 'process' clock measurements
  are the outer-most ones.

  ---
  #include <unistd.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <fcntl.h>
  #include <string.h>
  #include <errno.h>
  #include <pthread.h>

  static pthread_barrier_t barrier;

  static void *chew_cpu(void *arg)
  {
          pthread_barrier_wait(&barrier);
          while (1)
                  __asm__ __volatile__("" : : : "memory");
          return NULL;
  }

  int main(void)
  {
          clockid_t process_clock, my_thread_clock, th_clock;
          struct timespec process_before, process_after;
          struct timespec me_before, me_after;
          struct timespec th_before, th_after;
          struct timespec sleeptime;
          unsigned long diff;
          pthread_t th;
          int err;

          err = clock_getcpuclockid(0, &process_clock);
          if (err)
                  return 1;

          err = pthread_getcpuclockid(pthread_self(), &my_thread_clock);
          if (err)
                  return 1;

          pthread_barrier_init(&barrier, NULL, 2);
          err = pthread_create(&th, NULL, chew_cpu, NULL);
          if (err)
                  return 1;

          err = pthread_getcpuclockid(th, &th_clock);
          if (err)
                  return 1;

          pthread_barrier_wait(&barrier);

          err = clock_gettime(process_clock, &process_before);
          if (err)
                  return 1;

          err = clock_gettime(my_thread_clock, &me_before);
          if (err)
                  return 1;

          err = clock_gettime(th_clock, &th_before);
          if (err)
                  return 1;

          sleeptime.tv_sec = 0;
          sleeptime.tv_nsec = 500000000;
          nanosleep(&sleeptime, NULL);

          err = clock_gettime(th_clock, &th_after);
          if (err)
                  return 1;

          err = clock_gettime(my_thread_clock, &me_after);
          if (err)
                  return 1;

          err = clock_gettime(process_clock, &process_after);
          if (err)
                  return 1;

          diff = process_after.tv_nsec - process_before.tv_nsec;
          printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
                 process_before.tv_sec, process_before.tv_nsec,
                 process_after.tv_sec, process_after.tv_nsec, diff);
          diff = th_after.tv_nsec - th_before.tv_nsec;
          printf("thread:  before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
                 th_before.tv_sec, th_before.tv_nsec,
                 th_after.tv_sec, th_after.tv_nsec, diff);
          diff = me_after.tv_nsec - me_before.tv_nsec;
          printf("self:    before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
                 me_before.tv_sec, me_before.tv_nsec,
                 me_after.tv_sec, me_after.tv_nsec, diff);

          return 0;
  }

This is due to us using p->se.sum_exec_runtime in
thread_group_cputime() where we iterate the thread group and sum all
data. This does not take time since the last schedule operation (tick
or otherwise) into account. We can cure this by using
task_sched_runtime() at the cost of having to take locks.
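
The effect can be illustrated with a small userspace model.  This is only a
sketch: struct task_model and both helpers are hypothetical and merely mirror
the idea of a stored runtime sum plus a not-yet-accounted running slice; they
are not the kernel's data structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a task; not the kernel's task_struct. */
struct task_model {
	uint64_t sum_exec_runtime; /* ns accounted at the last scheduler update */
	uint64_t exec_start;       /* timestamp of that update, in ns */
	int on_cpu;                /* nonzero while the task is running */
};

/* Runtime including the slice that has not been folded into the sum yet. */
static uint64_t task_runtime_now(const struct task_model *t, uint64_t now)
{
	uint64_t ns;

	assert(t != NULL);
	ns = t->sum_exec_runtime;
	if (t->on_cpu && now > t->exec_start)
		ns += now - t->exec_start;
	return ns;
}

/* Group total; 'precise' selects whether pending slices are included. */
static uint64_t group_runtime(const struct task_model *tasks, size_t n,
			      uint64_t now, int precise)
{
	uint64_t total = 0;
	size_t i;

	assert(tasks != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += precise ? task_runtime_now(&tasks[i], now)
				 : tasks[i].sum_exec_runtime;
	return total;
}
```

With one idle task and one cpu-burner that has been running since its last
update, the imprecise group sum can be smaller than the burner's own precise
thread clock, which is the anomaly David reported.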

This also means we can (and must) do away with
thread_group_sched_runtime() since the modified thread_group_cputime()
is now more accurate and would deadlock when called from
thread_group_sched_runtime().

Aside from that, it makes the function safe on 32 bit systems. The old
code added t->se.sum_exec_runtime unprotected. sum_exec_runtime is a
64bit value and could be changed on another cpu at the same time.

Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins
Tested-by: David Miller <davem@davemloft.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years agoALSA: hda - Fix a regression of the position-buffer check
Takashi Iwai [Fri, 30 Sep 2011 06:52:26 +0000 (08:52 +0200)]
ALSA: hda - Fix a regression of the position-buffer check

The commit a810364a0424c297242c6c66071a42f7675a5568
    ALSA: hda - Handle -1 as invalid position, too
caused a regression on some machines that require the position-buffer
instead of LPIB, e.g. resulting in noise when recording from the mic with
PulseAudio.

This patch fixes the detection by deferring the test to the same point as in
3.0, i.e. doing the position check only when requested in
azx_position_ok().

Reported-and-tested-by: Rocko Requin <rockorequin@hotmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
13 years agoResource: fix wrong resource window calculation
Ram Pai [Thu, 22 Sep 2011 07:48:58 +0000 (15:48 +0800)]
Resource: fix wrong resource window calculation

__find_resource() incorrectly returns a resource window which overlaps
an existing allocated window.  This happens when the parent's
resource-window spans 0x00000000 to 0xffffffff and is entirely allocated
to all its children resource-windows.

__find_resource() looks for gaps in resource allocation among the
children resource windows.  When it encounters the last child window it
blindly tries the range next to one allocated to the last child.  Since
the last child's window ends at 0xffffffff the calculation overflows,
leading the algorithm to believe that any window in the range 0x00000000
to 0xffffffff is available for allocation.  This leads to a conflicting
window allocation.
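
The wrap-around can be sketched in plain C.  The helper below is hypothetical
and is not the actual patch; it only shows that "end + 1" becomes 0 for a
window ending at 0xffffffff in a 32-bit address space, and one way to refuse
to compute a gap in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute where the gap after a child window begins.
 * Returns false when the child already ends at the top of the 32-bit
 * space, because end + 1 would wrap around to 0.
 */
static bool gap_start_after(uint32_t child_end, uint32_t *start)
{
	assert(start != NULL);
	if (child_end == UINT32_MAX)
		return false;
	*start = child_end + 1;
	return true;
}
```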

Michal Ludvig reported this issue seen on his platform.  The following
patch fixes the problem and has been verified by Michal.  I believe this
bug has been there for ages.  It got exposed by git commit 2bbc6942273b
("PCI : ability to relocate assigned pci-resources")

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Tested-by: Michal Ludvig <mludvig@logix.net.nz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge branch 'for-linus' of git://github.com/NewDreamNetwork/ceph-client
Linus Torvalds [Fri, 30 Sep 2011 02:58:58 +0000 (19:58 -0700)]
Merge branch 'for-linus' of git://github.com/NewDreamNetwork/ceph-client

* 'for-linus' of git://github.com/NewDreamNetwork/ceph-client:
  libceph: fix pg_temp mapping update
  libceph: fix pg_temp mapping calculation
  libceph: fix linger request requeuing
  libceph: fix parse options memory leak
  libceph: initialize ack_stamp to avoid unnecessary connection reset

13 years agoMerge branch 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus
Linus Torvalds [Fri, 30 Sep 2011 02:29:45 +0000 (19:29 -0700)]
Merge branch 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus

* 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus:
  [media] omap3isp: Fix build error in ispccdc.c
  [media] uvcvideo: Fix crash when linking entities
  [media] v4l: Make sure we hold a reference to the v4l2_device before using it
  [media] v4l: Fix use-after-free case in v4l2_device_release
  [media] uvcvideo: Set alternate setting 0 on resume if the bus has been reset
  [media] OMAP_VOUT: Fix build break caused by update_mode removal in DSS2

13 years agoMerge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
Linus Torvalds [Fri, 30 Sep 2011 02:28:26 +0000 (19:28 -0700)]
Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6

* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] cio: fix cio_tpi ignoring adapter interrupts
  [S390] gmap: always up mmap_sem properly
  [S390] Do not clobber personality flags on exec

13 years agoMerge git://github.com/davem330/sparc
Linus Torvalds [Fri, 30 Sep 2011 02:24:33 +0000 (19:24 -0700)]
Merge git://github.com/davem330/sparc

* git://github.com/davem330/sparc:
  sparc64: Force the execute bit in OpenFirmware's translation entries.
  sparc: Make '-p' boot option meaningful again.
  sparc, exec: remove redundant addr_limit assignment
  sparc64: Future proof Niagara cpu detection.

13 years agoMerge branch 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linux
Linus Torvalds [Fri, 30 Sep 2011 02:23:30 +0000 (19:23 -0700)]
Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linux

* 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linux:
  drm/i915: FBC off for ironlake and older, otherwise on by default
  drm/i915: Enable SDVO hotplug interrupts for HDMI and DVI
  drm/i915: Enable dither whenever display bpc < frame buffer bpc

13 years agopowerpc: Fix device-tree matching for Apple U4 bridge
Benjamin Herrenschmidt [Thu, 29 Sep 2011 05:57:01 +0000 (15:57 +1000)]
powerpc: Fix device-tree matching for Apple U4 bridge

Apple Quad G5 has some oddity in its device-tree which causes the new
generic matching code to fail to relate nodes for PCI-E devices below U4
with their respective struct pci_dev.  This breaks graphics on those
machines among others.

This fixes it using a quirk which copies the node pointer from the host
bridge for the root complex, which makes the generic code work for the
children afterward.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agobootup: move 'usermodehelper_enable()' a little earlier
wangyanqing [Thu, 29 Sep 2011 07:09:40 +0000 (15:09 +0800)]
bootup: move 'usermodehelper_enable()' a little earlier

Commit d5767c53535a ("bootup: move 'usermodehelper_enable()' to the end
of do_basic_setup()") moved 'usermodehelper_enable()' to the end of
do_basic_setup(), after the initcalls.  With that change uvesafb no longer
works on my computer, and I lose the boot splash.

So maybe we could call usermodehelper_enable() a little earlier, so that
tasks which need early init with the help of user mode keep working.

[ I would *really* prefer that initcalls not call into user space - even
  the real 'init' hasn't been execve'd yet, after all! But for uvesafb
  it really does look like we don't have much choice.

  I considered doing this when we mount the root filesystem, but
  depending on config options that is in multiple places.  We could do
  the usermode helper enable as a rootfs_initcall()..

  So I'm just using wang yanqing's trivial patch.  It's not wonderful,
  but it's simple and should work.  We should revisit this some day,
  though.      - Linus ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoperf symbols: Treat all memory maps without dso file as loaded
Jiri Olsa [Wed, 24 Aug 2011 13:18:34 +0000 (15:18 +0200)]
perf symbols: Treat all memory maps without dso file as loaded

The stack/vdso/heap memory maps don't have any dso file.  By setting the
perf dso objects as 'loaded' for these maps, we avoid unnecessary
warnings like:

  "Failed to open [stack], continuing without symbols"

All map__find_* functions still return NULL when searching for symbols
in these maps.

Link: http://lkml.kernel.org/r/20110824131834.GA2007@jolsa.brq.redhat.com
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf sched: Fix script command documentation
Jiri Olsa [Tue, 27 Sep 2011 09:16:35 +0000 (11:16 +0200)]
perf sched: Fix script command documentation

Fixed leftover from trace -> script rename.

Link: http://lkml.kernel.org/r/1317114995-4534-1-git-send-email-jolsa@redhat.com
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf report: Fix stdio event name header printing
Arnaldo Carvalho de Melo [Wed, 21 Sep 2011 18:43:04 +0000 (15:43 -0300)]
perf report: Fix stdio event name header printing

In the past we tried to avoid printing the name of the event when just
one event was found in the perf.data file.  After some refactorings it
ended up not printing the event name if just one hist_entry was found in
one of the events.

Fix it by always printing the name of the event, even if just one is
found.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-kikr0c7ou55bd9caok8569rf@git.kernel.org
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf: Support setting the disassembler style
Andi Kleen [Thu, 15 Sep 2011 21:31:41 +0000 (14:31 -0700)]
perf: Support setting the disassembler style

Add -M option to report/annotate to pass directly to objdump.  This
allows using -M intel for Intel style disassembler syntax, which is
useful for people who are used to the Intel syntax.

Link: http://lkml.kernel.org/r/1316122302-24306-2-git-send-email-andi@firstfloor.org
[committer note: Add missing Documentation bits, fixup conflicts with 3e6a2a7]
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf tools: Make stat/record print fatal signals of the target program
Andi Kleen [Thu, 15 Sep 2011 21:31:40 +0000 (14:31 -0700)]
perf tools: Make stat/record print fatal signals of the target program

When a program crashes under perf there is no message about it, unlike
when running it from bash. This can be confusing and lead to wrong
actions during debugging.

Print fatal signals in perf stat/record.
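
A minimal sketch of the idea, assuming the usual POSIX wait-status macros.
describe_fatal_signal() and reap_child_killed_by() are hypothetical names used
for illustration only and are not perf's code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: if 'status' (as filled in by waitpid()) says the
 * child was killed by a signal, write a message into buf and return the
 * signal number; otherwise return 0 and leave buf empty.
 */
static int describe_fatal_signal(int status, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	if (!WIFSIGNALED(status))
		return 0;
	snprintf(buf, len, "target terminated by signal %d", WTERMSIG(status));
	return WTERMSIG(status);
}

/* Demo helper: fork a child that kills itself with 'sig', then reap it. */
static int reap_child_killed_by(int sig, int *status)
{
	pid_t pid;

	assert(status != NULL);
	pid = fork();
	if (pid == 0) {
		raise(sig);
		_exit(0); /* only reached if the signal was not fatal */
	}
	if (pid < 0 || waitpid(pid, status, 0) != pid)
		return -1;
	return 0;
}
```

A caller would reap the target as usual and print the buffer whenever the
returned signal number is nonzero.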

Thanks to Furat Afram for finding the problem originally

Link: http://lkml.kernel.org/r/1316122302-24306-1-git-send-email-andi@firstfloor.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf stat: Fix spelling in comment
Jim Cromie [Wed, 7 Sep 2011 23:14:04 +0000 (17:14 -0600)]
perf stat: Fix spelling in comment

Link: http://lkml.kernel.org/r/1315437244-3788-6-git-send-email-jim.cromie@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf stat: Allow tab as cvs delimiter
Jim Cromie [Wed, 7 Sep 2011 23:14:03 +0000 (17:14 -0600)]
perf stat: Allow tab as cvs delimiter

If option -x '\t' is given, convert the two-character string '\t' to a
tab ("\t").  This makes CSV printing more flexible.
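
The conversion can be sketched as follows.  normalize_sep() is a hypothetical
helper name used only for illustration; the point is that a shell passes
'\t' as a backslash followed by 't', which has to be mapped to a real tab.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map the literal two characters \t to a tab. */
static const char *normalize_sep(const char *sep)
{
	assert(sep != NULL);
	/* "\\t" is a backslash followed by 't', as typed inside single quotes. */
	if (strcmp(sep, "\\t") == 0)
		return "\t";
	return sep;
}
```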

Link: http://lkml.kernel.org/r/1315437244-3788-5-git-send-email-jim.cromie@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf stat: Suppress printing std-dev when its 0
Jim Cromie [Wed, 7 Sep 2011 23:14:02 +0000 (17:14 -0600)]
perf stat: Suppress printing std-dev when its 0

For pretty output only (the column is preserved for CSV output), don't print
the std-deviation when it's 0.00.  Do this based upon the value, instead of
checking for --no-aggr, since the stats could conceivably be computed
over the runs on each CPU, and there's no reason to preclude that.

Link: http://lkml.kernel.org/r/1315437244-3788-4-git-send-email-jim.cromie@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf stat: Fix +- nan% in --no-aggr runs
Jim Cromie [Wed, 7 Sep 2011 23:14:01 +0000 (17:14 -0600)]
perf stat: Fix +- nan% in --no-aggr runs

Without this patch, running:

$ sudo ./perf stat -r20 --no-aggr -a perl -e '$i++ for 1..100000'

I get computations like this:

CPU0             12.488247 task-clock                #    1.224 CPUs utilized            ( +-  -nan% )
CPU1             12.488909 task-clock                #    1.225 CPUs utilized            ( +-  -nan% )
CPU2             12.500221 task-clock                #    1.226 CPUs utilized            ( +-  -nan% )
CPU3             12.481713 task-clock                #    1.224 CPUs utilized            ( +-  -nan% )

but with patch, I get:

CPU0              8.233682 task-clock                #    0.754 CPUs utilized            ( +-  0.00% )
CPU1              8.226318 task-clock                #    0.754 CPUs utilized            ( +-  0.00% )
CPU2              8.210737 task-clock                #    0.752 CPUs utilized            ( +-  0.00% )
CPU3              8.201691 task-clock                #    0.751 CPUs utilized            ( +-  0.00% )

Note that without --no-aggr, I get non-0 statistics both before and after patch:

        231.986022 task-clock                #    4.030 CPUs utilized            ( +-  0.97% )
               212 context-switches          #    0.001 M/sec                    ( +- 12.07% )
                 9 CPU-migrations            #    0.000 M/sec                    ( +- 25.80% )
               466 page-faults               #    0.002 M/sec                    ( +-  3.23% )
       174,318,593 cycles                    #    0.751 GHz                      ( +-  1.06% )

I couldn't see anything wrong in the caller, so I fixed it in
stddev_stats().  ISTM that 0.00 is better than nan, since perf stat was
passed -A (--no-aggr) so no standard deviation should be expected, and
nan is suggestive of a deeper error.

When running with --no-aggr, perhaps we should suppress the statistics
printing entirely, or do so when they are 0.00.
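
As a rough sketch of where the nan comes from: with a running mean/M2
accumulator, dividing by (n - 1) and then by n produces nan when no samples
were recorded.  The struct and helpers below are assumptions for illustration,
not a copy of perf's stddev_stats(); the square root is left out so the block
needs no math library, and the guard on fewer than two samples is my own
choice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical running-statistics accumulator (Welford style). */
struct stats_model {
	double n;    /* number of samples */
	double mean; /* running mean */
	double M2;   /* sum of squared deviations from the mean */
};

static void update_stats(struct stats_model *s, double val)
{
	double delta;

	assert(s != NULL);
	s->n += 1.0;
	delta = val - s->mean;
	s->mean += delta / s->n;
	s->M2 += delta * (val - s->mean);
}

/* Variance of the mean; returns 0.0 instead of nan for too few samples. */
static double variance_of_mean(const struct stats_model *s)
{
	assert(s != NULL);
	if (s->n < 2.0)
		return 0.0;
	return (s->M2 / (s->n - 1.0)) / s->n;
}
```

The standard deviation of the mean is the square root of the value returned
by variance_of_mean().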

Link: http://lkml.kernel.org/r/1315437244-3788-3-git-send-email-jim.cromie@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf stat: Add --log-fd <N> option to redirect stderr elsewhere
Jim Cromie [Wed, 7 Sep 2011 23:14:00 +0000 (17:14 -0600)]
perf stat: Add --log-fd <N> option to redirect stderr elsewhere

This perf stat option emulates valgrind's --log-fd option, allowing the
user to send perf results elsewhere, and leaving stderr for use by the
program under test.  This complements --output file option, and is
mutually exclusive with it.

   3>results  perf stat --log-fd 3          -- $cmd
   3>>results perf stat --log-fd 3 --append -- $cmd

The perl distro's make test.valgrind target uses valgrind's --log-fd
option; I've adapted it to invoke perf as well, and tested this patch
there.
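
On the tool side, consuming such a descriptor could look roughly like this.
open_log_fd() is a hypothetical helper, shown only to illustrate wrapping an
inherited descriptor with the POSIX fdopen() call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: wrap a descriptor inherited from the shell (for
 * example fd 3 from "3>results") in a stdio stream.  fdopen() with "w" does
 * not truncate; whether the file is truncated or appended to is decided by
 * the shell redirection (3> versus 3>>).
 */
static FILE *open_log_fd(int fd, int append)
{
	assert(fd >= 0);
	return fdopen(fd, append ? "a" : "w");
}
```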

Link: http://lkml.kernel.org/r/1315437244-3788-2-git-send-email-jim.cromie@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf top: Improve lost events warning
Arnaldo Carvalho de Melo [Thu, 1 Sep 2011 17:27:58 +0000 (14:27 -0300)]
perf top: Improve lost events warning

Now it warns every time that new events are lost.

And the TUI also warns now.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-w1n168yrvrppnq6887s4u0wx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf top browser: Fix up line width calculation
Arnaldo Carvalho de Melo [Thu, 1 Sep 2011 15:57:19 +0000 (12:57 -0300)]
perf top browser: Fix up line width calculation

Fixing an artifact where the last 3 chars of a long DSO name would
remain on the screen sometimes.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-dkiakcl3z69dh1bt9uegaktv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf buildid-list: Support showing the build id in an ELF file
Arnaldo Carvalho de Melo [Mon, 29 Aug 2011 11:33:17 +0000 (08:33 -0300)]
perf buildid-list: Support showing the build id in an ELF file

Try first reading the build id, validating that it is an ELF file, etc.
Cheap as libelf will bail out as soon as the magic number check fails.

Useful when investigating debuginfo packaging problems like this one:

[root@emilia ~]# perf buildid-list -i /usr/lib/debug/lib/modules/`uname -r`/vmlinux
77bb4ea591a602d455ace759a377c9adfff1aba3
[root@emilia ~]# perf buildid-list -k
07b0c016a2b30004e86132d0239945b1e88f5d75
[root@emilia ~]#

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4elot9oxwa0rr0d90dshca3a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf buildid-list: Add option to show the running kernel build id
Arnaldo Carvalho de Melo [Mon, 29 Aug 2011 11:07:22 +0000 (08:07 -0300)]
perf buildid-list: Add option to show the running kernel build id

[root@emilia ~]# perf buildid-list -k
07b0c016a2b30004e86132d0239945b1e88f5d75

Useful when diagnosing build id problems in debuginfo packages, etc.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-po1bl7acn6e1hhne90opmvtl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf script: Add drop monitor script
Neil Horman [Mon, 4 Jul 2011 17:40:17 +0000 (13:40 -0400)]
perf script: Add drop monitor script

A while back I created the dropmonitor protocol, which allowed users to get
reports of dropped frames communicated to them via a netlink socket.

While useful, several people have now asked that I integrate the ability
to do drop monitoring with perf, so they don't have to run additional
tools.

This patch adds a drop monitor script to the perf suite, and provides
the same output that the netlink socket does.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1309801217-22450-1-git-send-email-nhorman@tuxdriver.com
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agoperf symbols: Stop using 'self' in map_groups__ methods
Arnaldo Carvalho de Melo [Tue, 23 Aug 2011 17:31:30 +0000 (14:31 -0300)]
perf symbols: Stop using 'self' in map_groups__ methods

Stop using this python/OOP convention; it doesn't really help. Will do
more from time to time till we get it cleaned up in all of /perf.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-rl9e690y60vnuyng05yp1zd3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agocan bcm: fix incomplete tx_setup fix
Oliver Hartkopp [Thu, 29 Sep 2011 19:33:47 +0000 (15:33 -0400)]
can bcm: fix incomplete tx_setup fix

The commit aabdcb0b553b9c9547b1a506b34d55a764745870 ("can bcm: fix tx_setup
off-by-one errors") fixed only a part of the original problem reported by
Andre Naujoks. It turned out that the original code needed to be re-ordered
to reduce complexity and to finally fix the reported frame counting issues.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoperf tools: Fix raw sample reading
Jiri Olsa [Thu, 29 Sep 2011 15:05:08 +0000 (17:05 +0200)]
perf tools: Fix raw sample reading

The wrong pointer is passed for raw data sanity checking when parsing a
sample event.

This results in an invalid event, and perf record gets stuck in the
__perf_session__process_events function while processing build IDs
(process_buildids function).

The following command hangs in my setup:
./perf record -e raw_syscalls:sys_enter ls

The fix is to use the proper pointer to the raw data instead of the 'u'
union.

Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1317308709-9474-2-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
13 years agosparc64: Force the execute bit in OpenFirmware's translation entries.
David S. Miller [Thu, 29 Sep 2011 19:18:59 +0000 (12:18 -0700)]
sparc64: Force the execute bit in OpenFirmware's translation entries.

In the OF 'translations' property, the template TTEs in the mappings
never specify the executable bit.  This is the case even though some
of these mappings are for OF's code segment.

Therefore, we need to force the execute bit on in every mapping.

This problem can only really trigger on Niagara/sun4v machines and the
history behind this is a little complicated.

Previous to sun4v, the sun4u TTE entries lacked a hardware execute
permission bit.  So OF didn't have to ever worry about setting
anything to handle executable pages.  Any valid TTE loaded into the
I-TLB would be respected by the chip.

But sun4v Niagara chips have a real hardware enforced executable bit
in their TTEs.  So it has to be set or else the I-TLB throws an
instruction access exception with type code 6 (protection violation).

We've been extremely fortunate to not get bitten by this in the past.

The best I can tell is that the OF's mappings for its executable code
were mapped using permanent locked mappings on sun4v in the past.
Therefore, the fact that we didn't have the exec bit set in the OF
translations we would use did not matter in practice.

Thanks to Greg Onufer for helping me track this down.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoRDSRDMA: Fix cleanup of rds_iw_mr_pool
Jonathan Lallinger [Thu, 29 Sep 2011 07:58:41 +0000 (07:58 +0000)]
RDSRDMA: Fix cleanup of rds_iw_mr_pool

In the rds_iw_mr_pool struct the free_pinned field keeps track of
memory pinned by free MRs. While this field is incremented properly
upon allocation, it is never decremented upon unmapping. This would
cause the rds_rdma module to crash the kernel upon unloading, by
triggering the BUG_ON in the rds_iw_destroy_mr_pool function.

This change keeps track of the MRs that become unpinned, so that
free_pinned can be decremented appropriately.

Signed-off-by: Jonathan Lallinger <jonathan@ogc.us>
Signed-off-by: Steve Wise <swise@ogc.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: Documentation: Fix type of variables
Roy.Li [Wed, 28 Sep 2011 19:51:54 +0000 (19:51 +0000)]
net: Documentation: Fix type of variables

Signed-off-by: Roy.Li <rongqing.li@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'core' of git://amd64.org/linux/rric into perf/core
Ingo Molnar [Thu, 29 Sep 2011 15:35:29 +0000 (17:35 +0200)]
Merge branch 'core' of git://amd64.org/linux/rric into perf/core

13 years agoibmveth: Fix oops on request_irq failure
Brian King [Wed, 28 Sep 2011 05:33:43 +0000 (05:33 +0000)]
ibmveth: Fix oops on request_irq failure

If request_irq fails, the ibmveth driver will overwrite
the rc and end up returning a successful rc on its open
function, resulting in an oops later when a packet gets
sent and buffers are not allocated due to the failed open.
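
The pattern behind the bug can be sketched with stand-in functions.  All
names below are hypothetical stubs and not the driver's code: the buggy
variant reuses rc for the cleanup call and so returns 0, while the fixed
variant keeps the original error.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the IRQ request and the cleanup call. */
static int stub_request_irq(int fail) { return fail ? -EBUSY : 0; }
static int stub_release_lan(void) { return 0; } /* cleanup succeeds */

static int open_buggy(int irq_fails)
{
	int rc = stub_request_irq(irq_fails);

	if (rc != 0) {
		rc = stub_release_lan(); /* overwrites the IRQ error with 0 */
		return rc;
	}
	return 0;
}

static int open_fixed(int irq_fails)
{
	int rc = stub_request_irq(irq_fails);

	if (rc != 0) {
		int cleanup_rc = stub_release_lan();

		assert(cleanup_rc == 0); /* model only: cleanup cannot fail here */
		return rc;               /* report the original failure */
	}
	return 0;
}
```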

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv6: nullify ipv6_ac_list and ipv6_fl_list when creating new socket
Yan, Zheng [Sun, 25 Sep 2011 02:21:30 +0000 (02:21 +0000)]
ipv6: nullify ipv6_ac_list and ipv6_fl_list when creating new socket

ipv6_ac_list and ipv6_fl_list from listening socket are inadvertently
shared with new socket created for connection.

Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocxgb4: Fix EEH on IBM P7IOC
Divy Le Ray [Sat, 24 Sep 2011 06:11:31 +0000 (06:11 +0000)]
cxgb4: Fix EEH on IBM P7IOC

Fix EEH recovery on new P Series platform by
requesting fundamental reset.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocan bcm: fix tx_setup off-by-one errors
Oliver Hartkopp [Fri, 23 Sep 2011 08:23:47 +0000 (08:23 +0000)]
can bcm: fix tx_setup off-by-one errors

This patch fixes two off-by-one errors that canceled each other out.
Checking for the same condition two times in bcm_tx_timeout_tsklet() reduced
the count of frames to be sent by one. This did not show up the first time
tx_setup is invoked as an additional frame is sent due to TX_ANNOUNCE.
Invoking a second tx_setup on the same item led to a reduced (by 1) number of
sent frames.

Reported-by: Andre Naujoks <nautsch@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMAINTAINERS: tehuti: Alexander Indenbaum's address bounces
Ian Campbell [Wed, 21 Sep 2011 22:08:26 +0000 (22:08 +0000)]
MAINTAINERS: tehuti: Alexander Indenbaum's address bounces

I got:
Generating server: Tehuti.onmicrosoft.com

baum@tehutinetworks.net
#< #5.1.1 smtp;550 5.1.1 RESOLVER.ADR.RecipNotFound; not found> #SMTP#

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Alexander Indenbaum <baum@tehutinetworks.net>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agodp83640: reduce driver noise
Richard Cochran [Tue, 20 Sep 2011 01:25:42 +0000 (01:25 +0000)]
dp83640: reduce driver noise

The driver has two warning messages that might be triggered
by normal use cases. When they appear, the messages give the
impression of a never ending series of errors.

This commit changes them to debug messages instead.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoptp: fix L2 event message recognition
Richard Cochran [Tue, 20 Sep 2011 01:25:41 +0000 (01:25 +0000)]
ptp: fix L2 event message recognition

The IEEE 1588 standard defines two kinds of messages, event and general
messages. Event messages require time stamping, and general do not. When
using UDP transport, two separate ports are used for the two message
types.

The BPF designed to recognize event messages incorrectly classifies L2
general messages as event messages. This commit fixes the issue by
extending the filter to check the message type field for L2 PTP packets.
Event messages are distinguished from general messages by testing
the "general" bit.
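
The "general" bit test can be sketched in plain C rather than BPF. This is a hedged illustration that assumes the PTPv2 header layout, where the low nibble of the first header octet is the messageType and bit 3 (0x08) is set for general messages. The macro name `PTP_GEN_BIT` and the helper `ptp_is_event_message()` are chosen for this sketch and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: bit 3 of the 4-bit messageType marks a general message. */
#define PTP_GEN_BIT 0x08

/*
 * hdr points at the first octet of the PTP header, i.e. the byte that
 * follows the Ethernet header for L2 transport. The high nibble of that
 * octet is transport-specific and is masked off.
 */
bool ptp_is_event_message(const uint8_t *hdr)
{
	assert(hdr != NULL);

	uint8_t message_type = hdr[0] & 0x0f;

	return (message_type & PTP_GEN_BIT) == 0;
}
```

Under that assumption, Sync (0x0) and Delay_Req (0x1) classify as event messages, while Follow_Up (0x8) and Announce (0xB) do not.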

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobootup: move 'usermodehelper_enable()' to the end of do_basic_setup()
Linus Torvalds [Wed, 28 Sep 2011 17:23:44 +0000 (10:23 -0700)]
bootup: move 'usermodehelper_enable()' to the end of do_basic_setup()

Doing it just before starting to call into cpu_idle() made a sick kind
of sense only because the original bug we fixed (see commit
288d5abec831: "Boot up with usermodehelper disabled") was about problems
with some scheduler data structures not being initialized, and they had
better be initialized at that point.

But it really didn't make any other conceptual sense, and doing it after
the initial "schedule()" call for the idle thread actually opened up a
race: what if the main initialization thread did everything without
needing to sleep, and got all the way into user land too? Without
actually having scheduled back to the idle thread?

Now, in normal circumstances that doesn't ever happen, but it looks like
Richard Cochran triggered exactly that on his ARM IXP4xx machines:

  "I have some ARM IXP4xx based machines that use the two on chip MAC
   ports (aka NPEs).  The NPE needs a firmware in order to function.
   Ever since the following commit [that 288d5abec831 one], it is no
   longer possible to bring up the interfaces during the init scripts."

with a call trace showing an ioctl coming from user space. Richard says:

  "The init is busybox, and the startup script does mount, syslogd, and
   then ifup, so that all can go by quickly."

The fix is to move the usermodehelper_enable() into the main 'init'
thread, and just put it after we've done all our initcalls.  By then,
everything really should be up, but we've obviously not actually started
the user-mode portion of init yet.

Reported-and-tested-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agolibceph: fix pg_temp mapping update
Sage Weil [Wed, 28 Sep 2011 17:11:04 +0000 (10:11 -0700)]
libceph: fix pg_temp mapping update

The incremental map updates have a record for each pg_temp mapping that is
to be added/updated (len > 0) or removed (len == 0).  The old code was
written as if the updates were a complete enumeration; that was just wrong.
Update the code to remove 0-length entries and drop the rbtree traversal.
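
The incremental semantics can be sketched as follows. This is a simplified model, not the libceph code: it uses a small fixed array instead of an rbtree, and `struct pg_temp_map`, `pg_temp_find()` and `pg_temp_apply()` are hypothetical names. The point is that each record touches only its own pgid and entries not named in the update are left alone.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PG_TEMP 8	/* arbitrary capacity for this sketch */

struct pg_temp_entry {
	unsigned pgid;
	int len;
	int in_use;
};

struct pg_temp_map {
	struct pg_temp_entry e[MAX_PG_TEMP];
};

/* Return the live entry for pgid, or NULL if there is none. */
struct pg_temp_entry *pg_temp_find(struct pg_temp_map *m, unsigned pgid)
{
	for (size_t i = 0; i < MAX_PG_TEMP; i++)
		if (m->e[i].in_use && m->e[i].pgid == pgid)
			return &m->e[i];
	return NULL;
}

/*
 * Apply one incremental record: len == 0 removes the mapping, len > 0
 * adds or updates it. Returns 0 on success, -1 if the table is full.
 */
int pg_temp_apply(struct pg_temp_map *m, unsigned pgid, int len)
{
	assert(m != NULL && len >= 0);

	struct pg_temp_entry *e = pg_temp_find(m, pgid);

	if (len == 0) {
		if (e)
			e->in_use = 0;
		return 0;
	}
	if (!e) {
		for (size_t i = 0; i < MAX_PG_TEMP && !e; i++)
			if (!m->e[i].in_use)
				e = &m->e[i];
		if (!e)
			return -1;
		e->pgid = pgid;
		e->in_use = 1;
	}
	e->len = len;
	return 0;
}
```

Applying a removal record for one pgid leaves every other mapping in place, which is the behaviour a "complete enumeration" reading would get wrong.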

This avoids misdirected (and hung) requests that manifest as server
errors like

[WRN] client4104 10.0.1.219:0/275025290 misdirected client4104.1:129 0.1 to osd0 not [1,0] in e11/11

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agolibceph: fix pg_temp mapping calculation
Sage Weil [Wed, 28 Sep 2011 17:08:27 +0000 (10:08 -0700)]
libceph: fix pg_temp mapping calculation

We need to apply the modulo pg_num calculation before looking up a pgid in
the pg_temp mapping rbtree.  This fixes pg_temp mappings, and fixes
(some) misdirected requests that result in messages like

[WRN] client4104 10.0.1.219:0/275025290 misdirected client4104.1:129 0.1 to osd0 not [1,0] in e11/11

on the server and make the client block without getting a reply (at
least until the pg_temp mapping goes away, but that can take a long, long
time).
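
The ordering issue can be sketched like this. It is a hedged model only: the table is a flat array, the fold is a plain `%` (the real code may use a different folding helper), and `struct pg_temp` and `lookup_folded()` are hypothetical names. What it shows is that the raw seed has to be reduced by pg_num before it is used as the lookup key.

```c
#include <assert.h>
#include <stddef.h>

struct pg_temp {
	unsigned seed;	/* key: already folded into [0, pg_num) */
	int osd;	/* illustrative payload */
};

/* Fold the raw seed first, then search the table keyed by folded seeds. */
const struct pg_temp *lookup_folded(const struct pg_temp *tbl, size_t n,
				    unsigned raw_seed, unsigned pg_num)
{
	assert(tbl != NULL && pg_num > 0);

	unsigned seed = raw_seed % pg_num;

	for (size_t i = 0; i < n; i++)
		if (tbl[i].seed == seed)
			return &tbl[i];
	return NULL;
}
```

With pg_num = 8, a raw seed of 9 folds to 1 and finds the entry stored under 1. Looking up the unfolded value 9 directly would miss it.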

Reorder calc_pg_raw() a bit to make more sense.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge git://github.com/davem330/net
Linus Torvalds [Wed, 28 Sep 2011 15:39:05 +0000 (08:39 -0700)]
Merge git://github.com/davem330/net

* git://github.com/davem330/net:
  ipv6-multicast: Fix memory leak in IPv6 multicast.
  ipv6: check return value for dst_alloc
  net: check return value for dst_alloc
  ipv6-multicast: Fix memory leak in input path.
  bnx2x: add missing break in bnx2x_dcbnl_get_cap
  bnx2x: fix WOL by enablement PME in config space
  bnx2x: fix hw attention handling
  net: fix a typo in Documentation/networking/scaling.txt
  ath9k: Fix a dma warning/memory leak
  rtlwifi: rtl8192cu: Fix unitialized struct
  iwlagn: fix dangling scan request
  batman-adv: do_bcast has to be true for broadcast packets only
  cfg80211: Fix validation of AKM suites
  iwlegacy: do not use interruptible waits
  iwlegacy: fix command queue timeout
  ath9k_hw: Fix Rx DMA stuck for AR9003 chips

13 years agoMerge git://bedivere.hansenpartnership.com/git/scsi-rc-fixes-2.6
Linus Torvalds [Wed, 28 Sep 2011 15:23:39 +0000 (08:23 -0700)]
Merge git://bedivere.hansenpartnership.com/git/scsi-rc-fixes-2.6

* git://bedivere.hansenpartnership.com/git/scsi-rc-fixes-2.6:
  [SCSI] 3w-9xxx: fix iommu_iova leak
  [SCSI] cxgb3i: convert cdev->l2opt to use rcu to prevent NULL dereference
  [SCSI] scsi: qla4xxx needs libiscsi.o
  [SCSI] libsas: fix failure to revalidate domain for anything but the first expander child.
  [SCSI] aacraid: reset should disable MSI interrupt

13 years agohwmon: (coretemp) Avoid leaving around dangling pointer
Guenter Roeck [Sat, 24 Sep 2011 22:27:04 +0000 (15:27 -0700)]
hwmon: (coretemp) Avoid leaving around dangling pointer

Storing the struct temp_data pointer allocated from create_core_data()
when returning an error has the potential of leaving around a pointer
to freed memory. Reset it to NULL for error returns.
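
A minimal user-space sketch of the pattern, under stated assumptions: `slots[]`, `create_core_data_sketch()` and the `simulate_failure` flag are hypothetical and only model a pointer that is stored before a later step can fail.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define NUM_SLOTS 4

struct temp_data {
	int id;
};

/* Long-lived table that outlives any single create call. */
struct temp_data *slots[NUM_SLOTS];

/*
 * Allocate an entry and publish it in slots[idx]. If a later step fails,
 * free the entry and clear the slot so no pointer to freed memory remains.
 */
int create_core_data_sketch(unsigned idx, int simulate_failure)
{
	assert(idx < NUM_SLOTS);

	slots[idx] = calloc(1, sizeof(*slots[idx]));
	if (!slots[idx])
		return -ENOMEM;

	if (simulate_failure) {
		free(slots[idx]);
		slots[idx] = NULL;	/* without this the slot would dangle */
		return -EIO;
	}
	return 0;
}
```

After a failed call the slot reads back as NULL, so later code that checks the slot before using it cannot touch freed memory.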

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
13 years agohwmon: (coretemp) Fixup platform device ID change
Jean Delvare [Wed, 28 Sep 2011 15:11:00 +0000 (08:11 -0700)]
hwmon: (coretemp) Fixup platform device ID change

With recent change "hwmon: (coretemp) don't use kernel assigned CPU
number as platform device ID", the microcode check is now running on
a random CPU. Fix that by checking the microcode before creating the
platform device rather than at probe time.

Also avoid calling TO_PHYS_ID(cpu) twice in the same function, as it's
expensive.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
13 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Wed, 28 Sep 2011 15:03:00 +0000 (08:03 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: Free queue resources at blk_release_queue()

13 years agoMerge branch 'writeback-for-linus' of git://github.com/fengguang/linux
Linus Torvalds [Wed, 28 Sep 2011 15:01:05 +0000 (08:01 -0700)]
Merge branch 'writeback-for-linus' of git://github.com/fengguang/linux

* 'writeback-for-linus' of git://github.com/fengguang/linux:
  writeback: show raw dirtied_when in trace writeback_single_inode

13 years agoblock: Free queue resources at blk_release_queue()
Hannes Reinecke [Wed, 28 Sep 2011 14:07:01 +0000 (08:07 -0600)]
block: Free queue resources at blk_release_queue()

A kernel crash is observed when a mounted ext3/ext4 filesystem is
physically removed. The problem is that blk_cleanup_queue() frees up
some resources, e.g. by calling elevator_exit(), which are not checked for
in normal operation. So we should rather move these calls to the
destructor function blk_release_queue() as at that point all remaining
references are gone. However, in doing so we have to ensure that any
externally supplied queue_lock is disconnected as the driver might free
up the lock after the call of blk_cleanup_queue().

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
13 years agoMerge branch 'for-davem' of git://git.infradead.org/users/linville/wireless
David S. Miller [Wed, 28 Sep 2011 02:42:30 +0000 (22:42 -0400)]
Merge branch 'for-davem' of git://git.infradead.org/users/linville/wireless

13 years agoLinux 3.1-rc8
Linus Torvalds [Tue, 27 Sep 2011 22:48:34 +0000 (15:48 -0700)]
Linux 3.1-rc8