GitHub/mt8127/android_kernel_alcatel_ttab.git
10 years agotg3: Allow for recieve of full-size 8021AD frames
Vlad Yasevich [Tue, 30 Sep 2014 23:39:36 +0000 (19:39 -0400)]
tg3: Allow for recieve of full-size 8021AD frames

[ Upstream commit 7d3083ee36b51e425b6abd76778a2046906b0fd3 ]

When receiving a vlan-tagged frame that still contains
a vlan header, the length of the packet will be greater
then MTU+ETH_HLEN since it will account of the extra
vlan header.  TG3 checks this for the case for 802.1Q,
but not for 802.1ad.  As a result, full sized 802.1ad
frames get dropped by the card.

Add a check for 802.1ad protocol when receving full
sized frames.

Suggested-by: Prashant Sreedharan <prashant@broadcom.com>
CC: Prashant Sreedharan <prashant@broadcom.com>
CC: Michael Chan <mchan@broadcom.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agotg3: Work around HW/FW limitations with vlan encapsulated frames
Vlad Yasevich [Thu, 18 Sep 2014 14:31:17 +0000 (10:31 -0400)]
tg3: Work around HW/FW limitations with vlan encapsulated frames

[ Upstream commit 476c18850c6cbaa3f2bb661ae9710645081563b9 ]

TG3 appears to have an issue performing TSO and checksum offloading
correclty when the frame has been vlan encapsulated (non-accelrated).
In these cases, tcp checksum is not correctly updated.

This patch attempts to work around this issue.  After the patch,
802.1ad vlans start working correctly over tg3 devices.

CC: Prashant Sreedharan <prashant@broadcom.com>
CC: Michael Chan <mchan@broadcom.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agol2tp: fix race while getting PMTU on PPP pseudo-wire
Guillaume Nault [Wed, 3 Sep 2014 12:12:55 +0000 (14:12 +0200)]
l2tp: fix race while getting PMTU on PPP pseudo-wire

[ Upstream commit eed4d839b0cdf9d84b0a9bc63de90fd5e1e886fb ]

Use dst_entry held by sk_dst_get() to retrieve tunnel's PMTU.

The dst_mtu(__sk_dst_get(tunnel->sock)) call was racy. __sk_dst_get()
could return NULL if tunnel->sock->sk_dst_cache was reset just before the
call, thus making dst_mtu() dereference a NULL pointer:

[ 1937.661598] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 1937.664005] IP: [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
[ 1937.664005] PGD daf0c067 PUD d9f93067 PMD 0
[ 1937.664005] Oops: 0000 [#1] SMP
[ 1937.664005] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables udp_tunnel pppoe pppox ppp_generic slhc deflate ctr twofish_generic twofish_x86_64_3way xts lrw gf128mul glue_helper twofish_x86_64 twofish_common blowfish_generic blowfish_x86_64 blowfish_common des_generic cbc xcbc rmd160 sha512_generic hmac crypto_null af_key xfrm_algo 8021q garp bridge stp llc tun atmtcp clip atm ext3 mbcache jbd iTCO_wdt coretemp kvm_intel iTCO_vendor_support kvm pcspkr evdev ehci_pci lpc_ich mfd_core i5400_edac edac_core i5k_amb shpchp button processor thermal_sys xfs crc32c_generic libcrc32c dm_mod usbhid sg hid sr_mod sd_mod cdrom crc_t10dif crct10dif_common ata_generic ahci ata_piix tg3 libahci libata uhci_hcd ptp ehci_hcd pps_core usbcore scsi_mod libphy usb_common [last unloaded: l2tp_core]
[ 1937.664005] CPU: 0 PID: 10022 Comm: l2tpstress Tainted: G           O   3.17.0-rc1 #1
[ 1937.664005] Hardware name: HP ProLiant DL160 G5, BIOS O12 08/22/2008
[ 1937.664005] task: ffff8800d8fda790 ti: ffff8800c43c4000 task.ti: ffff8800c43c4000
[ 1937.664005] RIP: 0010:[<ffffffffa049db88>]  [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
[ 1937.664005] RSP: 0018:ffff8800c43c7de8  EFLAGS: 00010282
[ 1937.664005] RAX: ffff8800da8a7240 RBX: ffff8800d8c64600 RCX: 000001c325a137b5
[ 1937.664005] RDX: 8c6318c6318c6320 RSI: 000000000000010c RDI: 0000000000000000
[ 1937.664005] RBP: ffff8800c43c7ea8 R08: 0000000000000000 R09: 0000000000000000
[ 1937.664005] R10: ffffffffa048e2c0 R11: ffff8800d8c64600 R12: ffff8800ca7a5000
[ 1937.664005] R13: ffff8800c439bf40 R14: 000000000000000c R15: 0000000000000009
[ 1937.664005] FS:  00007fd7f610f700(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000
[ 1937.664005] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1937.664005] CR2: 0000000000000020 CR3: 00000000d9d75000 CR4: 00000000000027e0
[ 1937.664005] Stack:
[ 1937.664005]  ffffffffa049da80 ffff8800d8fda790 000000000000005b ffff880000000009
[ 1937.664005]  ffff8800daf3f200 0000000000000003 ffff8800c43c7e48 ffffffff81109b57
[ 1937.664005]  ffffffff81109b0e ffffffff8114c566 0000000000000000 0000000000000000
[ 1937.664005] Call Trace:
[ 1937.664005]  [<ffffffffa049da80>] ? pppol2tp_connect+0x235/0x41e [l2tp_ppp]
[ 1937.664005]  [<ffffffff81109b57>] ? might_fault+0x9e/0xa5
[ 1937.664005]  [<ffffffff81109b0e>] ? might_fault+0x55/0xa5
[ 1937.664005]  [<ffffffff8114c566>] ? rcu_read_unlock+0x1c/0x26
[ 1937.664005]  [<ffffffff81309196>] SYSC_connect+0x87/0xb1
[ 1937.664005]  [<ffffffff813e56f7>] ? sysret_check+0x1b/0x56
[ 1937.664005]  [<ffffffff8107590d>] ? trace_hardirqs_on_caller+0x145/0x1a1
[ 1937.664005]  [<ffffffff81213dee>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 1937.664005]  [<ffffffff8114c262>] ? spin_lock+0x9/0xb
[ 1937.664005]  [<ffffffff813092b4>] SyS_connect+0x9/0xb
[ 1937.664005]  [<ffffffff813e56d2>] system_call_fastpath+0x16/0x1b
[ 1937.664005] Code: 10 2a 84 81 e8 65 76 bd e0 65 ff 0c 25 10 bb 00 00 4d 85 ed 74 37 48 8b 85 60 ff ff ff 48 8b 80 88 01 00 00 48 8b b8 10 02 00 00 <48> 8b 47 20 ff 50 20 85 c0 74 0f 83 e8 28 89 83 10 01 00 00 89
[ 1937.664005] RIP  [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
[ 1937.664005]  RSP <ffff8800c43c7de8>
[ 1937.664005] CR2: 0000000000000020
[ 1939.559375] ---[ end trace 82d44500f28f8708 ]---

Fixes: f34c4a35d879 ("l2tp: take PMTU from tunnel UDP socket")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoopenvswitch: fix panic with multiple vlan headers
Jiri Benc [Thu, 21 Aug 2014 19:33:44 +0000 (21:33 +0200)]
openvswitch: fix panic with multiple vlan headers

[ Upstream commit 2ba5af42a7b59ef01f9081234d8855140738defd ]

When there are multiple vlan headers present in a received frame, the first
one is put into vlan_tci and protocol is set to ETH_P_8021Q. Anything in the
skb beyond the VLAN TPID may be still non-linear, including the inner TCI
and ethertype. While ovs_flow_extract takes care of IP and IPv6 headers, it
does nothing with ETH_P_8021Q. Later, if OVS_ACTION_ATTR_POP_VLAN is
executed, __pop_vlan_tci pulls the next vlan header into vlan_tci.

This leads to two things:

1. Part of the resulting ethernet header is in the non-linear part of the
   skb. When eth_type_trans is called later as the result of
   OVS_ACTION_ATTR_OUTPUT, kernel BUGs in __skb_pull. Also, __pop_vlan_tci
   is in fact accessing random data when it reads past the TPID.

2. network_header points into the ethernet header instead of behind it.
   mac_len is set to a wrong value (10), too.

Reported-by: Yulong Pei <ypei@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopacket: handle too big packets for PACKET_V3
Eric Dumazet [Fri, 15 Aug 2014 16:16:04 +0000 (09:16 -0700)]
packet: handle too big packets for PACKET_V3

[ Upstream commit dc808110bb62b64a448696ecac3938902c92e1ab ]

af_packet can currently overwrite kernel memory by out of bound
accesses, because it assumed a [new] block can always hold one frame.

This is not generally the case, even if most existing tools do it right.

This patch clamps too long frames as API permits, and issue a one time
error on syslog.

[  394.357639] tpacket_rcv: packet too big, clamped from 5042 to 3966. macoff=82

In this example, packet header tp_snaplen was set to 3966,
and tp_len was set to 5042 (skb->len)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.")
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agotcp: fix tcp_release_cb() to dispatch via address family for mtu_reduced()
Neal Cardwell [Thu, 14 Aug 2014 16:40:05 +0000 (12:40 -0400)]
tcp: fix tcp_release_cb() to dispatch via address family for mtu_reduced()

[ Upstream commit 4fab9071950c2021d846e18351e0f46a1cffd67b ]

Make sure we use the correct address-family-specific function for
handling MTU reductions from within tcp_release_cb().

Previously AF_INET6 sockets were incorrectly always using the IPv6
code path when sometimes they were handling IPv4 traffic and thus had
an IPv4 dst.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Diagnosed-by: Willem de Bruijn <willemb@google.com>
Fixes: 563d34d057862 ("tcp: dont drop MTU reduction indications")
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agosit: Fix ipip6_tunnel_lookup device matching criteria
Shmulik Ladkani [Thu, 14 Aug 2014 12:27:20 +0000 (15:27 +0300)]
sit: Fix ipip6_tunnel_lookup device matching criteria

[ Upstream commit bc8fc7b8f825ef17a0fb9e68c18ce94fa66ab337 ]

As of 4fddbf5d78 ("sit: strictly restrict incoming traffic to tunnel link device"),
when looking up a tunnel, tunnel's underlying interface (t->parms.link)
is verified to match incoming traffic's ingress device.

However the comparison was incorrectly based on skb->dev->iflink.

Instead, dev->ifindex should be used, which correctly represents the
interface from which the IP stack hands the ipip6 packets.

This allows setting up sit tunnels bound to vlan interfaces (otherwise
incoming ipip6 traffic on the vlan interface was dropped due to
ipip6_tunnel_lookup match failure).

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomyri10ge: check for DMA mapping errors
Stanislaw Gruszka [Tue, 12 Aug 2014 08:35:19 +0000 (10:35 +0200)]
myri10ge: check for DMA mapping errors

[ Upstream commit 10545937e866ccdbb7ab583031dbdcc6b14e4eb4 ]

On IOMMU systems DMA mapping can fail, we need to check for
that possibility.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoLinux 3.10.57
Greg Kroah-Hartman [Thu, 9 Oct 2014 19:18:54 +0000 (12:18 -0700)]
Linux 3.10.57

10 years agocpufreq: ondemand: Change the calculation of target frequency
Stratos Karafotis [Wed, 5 Jun 2013 16:01:25 +0000 (19:01 +0300)]
cpufreq: ondemand: Change the calculation of target frequency

commit dfa5bb622555d9da0df21b50f46ebdeef390041b upstream.

The ondemand governor calculates load in terms of frequency and
increases it only if load_freq is greater than up_threshold
multiplied by the current or average frequency.  This appears to
produce oscillations of frequency between min and max because,
for example, a relatively small load can easily saturate minimum
frequency and lead the CPU to the max.  Then, it will decrease
back to the min due to small load_freq.

Change the calculation method of load and target frequency on the
basis of the following two observations:

 - Load computation should not depend on the current or average
   measured frequency.  For example, absolute load of 80% at 100MHz
   is not necessarily equivalent to 8% at 1000MHz in the next
   sampling interval.

 - It should be possible to increase the target frequency to any
   value present in the frequency table proportional to the absolute
   load, rather than to the max only, so that:

   Target frequency = C * load

   where we take C = policy->cpuinfo.max_freq / 100.

Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
increase ~1.5% in performance. cpufreq_stats (time_in_state) shows
that middle frequencies are used more, with this patch.  Highest
and lowest frequencies were used less by ~9%.

[rjw: We have run multiple other tests on kernels with this
 change applied and in the vast majority of cases it turns out
 that the resulting performance improvement also leads to reduced
 consumption of energy.  The change is additionally justified by
 the overall simplification of the code in question.]

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agocpufreq: Fix wrong time unit conversion
Andreas Schwab [Sat, 7 Sep 2013 16:35:08 +0000 (18:35 +0200)]
cpufreq: Fix wrong time unit conversion

commit a857c0b9e24e39fe5be82451b65377795f9538d8 upstream.

The time spent by a CPU under a given frequency is stored in jiffies unit
in the cpu var cpufreq_stats_table->time_in_state[i], i being the index of
the frequency.

This is what is displayed in the following file on the right column:

     cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
     2301000 19835820
     2300000 3172
     [...]

Now cpufreq converts this jiffies unit delta to clock_t before returning it
to the user as in the above file. And that conversion is achieved using the API
cputime64_to_clock_t().

Although it accidentally works on traditional tick based cputime accounting, where
cputime_t maps directly to jiffies, it doesn't work with other types of cputime
accounting such as CONFIG_VIRT_CPU_ACCOUNTING_* where cputime_t can map to nsecs
or any granularity preffered by the architecture.

For example we get a buggy zero delta on full dyntick configurations:

     cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
     2301000 0
     2300000 0
     [...]

Fix this with using the proper jiffies_64_t to clock_t conversion.

Reported-and-tested-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agonl80211: clear skb cb before passing to netlink
Johannes Berg [Wed, 30 Jul 2014 12:55:26 +0000 (14:55 +0200)]
nl80211: clear skb cb before passing to netlink

commit bd8c78e78d5011d8111bc2533ee73b13a3bd6c42 upstream.

In testmode and vendor command reply/event SKBs we use the
skb cb data to store nl80211 parameters between allocation
and sending. This causes the code for CONFIG_NETLINK_MMAP
to get confused, because it takes ownership of the skb cb
data when the SKB is handed off to netlink, and it doesn't
explicitly clear it.

Clear the skb cb explicitly when we're done and before it
gets passed to netlink to avoid this issue.

Reported-by: Assaf Azulay <assaf.azulay@intel.com>
Reported-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrbd: fix regression 'out of mem, failed to invoke fence-peer helper'
Lars Ellenberg [Wed, 9 Jul 2014 19:18:32 +0000 (21:18 +0200)]
drbd: fix regression 'out of mem, failed to invoke fence-peer helper'

commit bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9 upstream.

Since linux kernel 3.13, kthread_run() internally uses
wait_for_completion_killable().  We sometimes may use kthread_run()
while we still have a signal pending, which we used to kick our threads
out of potentially blocking network functions, causing kthread_run() to
mistake that as a new fatal signal and fail.

Fix: flush_signals() before kthread_run().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agojiffies: Fix timeval conversion to jiffies
Andrew Hunter [Thu, 4 Sep 2014 21:17:16 +0000 (14:17 -0700)]
jiffies: Fix timeval conversion to jiffies

commit d78c9300c51d6ceed9f6d078d4e9366f259de28c upstream.

timeval_to_jiffies tried to round a timeval up to an integral number
of jiffies, but the logic for doing so was incorrect: intervals
corresponding to exactly N jiffies would become N+1. This manifested
itself particularly repeatedly stopping/starting an itimer:

setitimer(ITIMER_PROF, &val, NULL);
setitimer(ITIMER_PROF, NULL, &val);

would add a full tick to val, _even if it was exactly representable in
terms of jiffies_ (say, the result of a previous rounding.)  Doing
this repeatedly would cause unbounded growth in val.  So fix the math.

Here's what was wrong with the conversion: we essentially computed
(eliding seconds)

jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)

by using scaling arithmetic, which took the best approximation of
NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
x/(2^USEC_JIFFIE_SC), and computed:

jiffies = (usec * x) >> USEC_JIFFIE_SC

and rounded this calculation up in the intermediate form (since we
can't necessarily exactly represent TICK_NSEC in usec.) But the
scaling arithmetic is a (very slight) *over*approximation of the true
value; that is, instead of dividing by (1 usec/ 1 jiffie), we
effectively divided by (1 usec/1 jiffie)-epsilon (rounding
down). This would normally be fine, but we want to round timeouts up,
and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this
would be fine if our division was exact, but dividing this by the
slightly smaller factor was equivalent to adding just _over_ 1 to the
final result (instead of just _under_ 1, as desired.)

In particular, with HZ=1000, we consistently computed that 10000 usec
was 11 jiffies; the same was true for any exact multiple of
TICK_NSEC.

We could possibly still round in the intermediate form, adding
something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
convert usec->nsec, round in nanoseconds, and then convert using
time*spec*_to_jiffies.  This adds one constant multiplication, and is
not observably slower in microbenchmarks on recent x86 hardware.

Tested: the following program:

int main() {
  struct itimerval zero = {{0, 0}, {0, 0}};
  /* Initially set to 10 ms. */
  struct itimerval initial = zero;
  initial.it_interval.tv_usec = 10000;
  setitimer(ITIMER_PROF, &initial, NULL);
  /* Save and restore several times. */
  for (size_t i = 0; i < 10; ++i) {
    struct itimerval prev;
    setitimer(ITIMER_PROF, &zero, &prev);
    /* on old kernels, this goes up by TICK_USEC every iteration */
    printf("previous value: %ld %ld %ld %ld\n",
           prev.it_interval.tv_sec, prev.it_interval.tv_usec,
           prev.it_value.tv_sec, prev.it_value.tv_usec);
    setitimer(ITIMER_PROF, &prev, NULL);
  }
    return 0;
}

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Paul Turner <pjt@google.com>
Reported-by: Aaron Jacobs <jacobsa@google.com>
Signed-off-by: Andrew Hunter <ahh@google.com>
[jstultz: Tweaked to apply to 3.17-rc]
Signed-off-by: John Stultz <john.stultz@linaro.org>
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomd/raid5: disable 'DISCARD' by default due to safety concerns.
NeilBrown [Thu, 2 Oct 2014 03:45:00 +0000 (13:45 +1000)]
md/raid5: disable 'DISCARD' by default due to safety concerns.

commit 8e0e99ba64c7ba46133a7c8a3e3f7de01f23bd93 upstream.

It has come to my attention (thanks Martin) that 'discard_zeroes_data'
is only a hint.  Some devices in some cases don't do what it
says on the label.

The use of DISCARD in RAID5 depends on reads from discarded regions
being predictably zero.  If a write to a previously discarded region
performs a read-modify-write cycle it assumes that the parity block
was consistent with the data blocks.  If all were zero, this would
be the case.  If some are and some aren't this would not be the case.
This could lead to data corruption after a device failure when
data needs to be reconstructed from the parity.

As we cannot trust 'discard_zeroes_data', ignore it by default
and so disallow DISCARD on all raid4/5/6 arrays.

As many devices are trustworthy, and as there are benefits to using
DISCARD, add a module parameter to over-ride this caution and cause
DISCARD to work if discard_zeroes_data is set.

If a site want to enable DISCARD on some arrays but not on others they
should select DISCARD support at the filesystem level, and set the
raid456 module parameter.
    raid456.devices_handle_discard_safely=Y

As this is a data-safety issue, I believe this patch is suitable for
-stable.
DISCARD support for RAID456 was added in 3.7

Cc: Shaohua Li <shli@kernel.org>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Heinz Mauelshagen <heinzm@redhat.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Fixes: 620125f2bf8ff0c4969b79653b54d7bcc9d40637
Signed-off-by: NeilBrown <neilb@suse.de>
[bwh: Backported to 3.10: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomedia: vb2: fix VBI/poll regression
Hans Verkuil [Sat, 20 Sep 2014 19:16:35 +0000 (16:16 -0300)]
media: vb2: fix VBI/poll regression

commit 58d75f4b1ce26324b4d809b18f94819843a98731 upstream.

The recent conversion of saa7134 to vb2 unconvered a poll() bug that
broke the teletext applications alevt and mtt. These applications
expect that calling poll() without having called VIDIOC_STREAMON will
cause poll() to return POLLERR. That did not happen in vb2.

This patch fixes that behavior. It also fixes what should happen when
poll() is called when STREAMON is called but no buffers have been
queued. In that case poll() will also return POLLERR, but only for
capture queues since output queues will always return POLLOUT
anyway in that situation.

This brings the vb2 behavior in line with the old videobuf behavior.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomm: numa: Do not mark PTEs pte_numa when splitting huge pages
Mel Gorman [Thu, 2 Oct 2014 18:47:42 +0000 (19:47 +0100)]
mm: numa: Do not mark PTEs pte_numa when splitting huge pages

commit abc40bd2eeb77eb7c2effcaf63154aad929a1d5f upstream.

This patch reverts 1ba6e0b50b ("mm: numa: split_huge_page: transfer the
NUMA type from the pmd to the pte"). If a huge page is being split due
a protection change and the tail will be in a PROT_NONE vma then NUMA
hinting PTEs are temporarily created in the protected VMA.

 VM_RW|VM_PROTNONE
|-----------------|
      ^
      split here

In the specific case above, it should get fixed up by change_pte_range()
but there is a window of opportunity for weirdness to happen. Similarly,
if a huge page is shrunk and split during a protection update but before
pmd_numa is cleared then a pte_numa can be left behind.

Instead of adding complexity trying to deal with the case, this patch
will not mark PTEs NUMA when splitting a huge page. NUMA hinting faults
will not be triggered which is marginal in comparison to the complexity
in dealing with the corner cases during THP split.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomm, thp: move invariant bug check out of loop in __split_huge_page_map
Waiman Long [Wed, 6 Aug 2014 23:05:36 +0000 (16:05 -0700)]
mm, thp: move invariant bug check out of loop in __split_huge_page_map

commit f8303c2582b889351e261ff18c4d8eb197a77db2 upstream.

In __split_huge_page_map(), the check for page_mapcount(page) is
invariant within the for loop.  Because of the fact that the macro is
implemented using atomic_read(), the redundant check cannot be optimized
away by the compiler leading to unnecessary read to the page structure.

This patch moves the invariant bug check out of the loop so that it will
be done only once.  On a 3.16-rc1 based kernel, the execution time of a
microbenchmark that broke up 1000 transparent huge pages using munmap()
had an execution time of 38,245us and 38,548us with and without the
patch respectively.  The performance gain is about 1%.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Scott J Norton <scott.norton@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoring-buffer: Fix infinite spin in reading buffer
Steven Rostedt (Red Hat) [Thu, 2 Oct 2014 20:51:18 +0000 (16:51 -0400)]
ring-buffer: Fix infinite spin in reading buffer

commit 24607f114fd14f2f37e3e0cb3d47bce96e81e848 upstream.

Commit 651e22f2701b "ring-buffer: Always reset iterator to reader page"
fixed one bug but in the process caused another one. The reset is to
update the header page, but that fix also changed the way the cached
reads were updated. The cache reads are used to test if an iterator
needs to be updated or not.

A ring buffer iterator, when created, disables writes to the ring buffer
but does not stop other readers or consuming reads from happening.
Although all readers are synchronized via a lock, they are only
synchronized when in the ring buffer functions. Those functions may
be called by any number of readers. The iterator continues down when
its not interrupted by a consuming reader. If a consuming read
occurs, the iterator starts from the beginning of the buffer.

The way the iterator sees that a consuming read has happened since
its last read is by checking the reader "cache". The cache holds the
last counts of the read and the reader page itself.

Commit 651e22f2701b changed what was saved by the cache_read when
the rb_iter_reset() occurred, making the iterator never match the cache.
Then if the iterator calls rb_iter_reset(), it will go into an
infinite loop by checking if the cache doesn't match, doing the reset
and retrying, just to see that the cache still doesn't match! Which
should never happen as the reset is suppose to set the cache to the
current value and there's locks that keep a consuming reader from
having access to the data.

Fixes: 651e22f2701b "ring-buffer: Always reset iterator to reader page"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoinit/Kconfig: Fix HAVE_FUTEX_CMPXCHG to not break up the EXPERT menu
Josh Triplett [Fri, 3 Oct 2014 23:19:24 +0000 (16:19 -0700)]
init/Kconfig: Fix HAVE_FUTEX_CMPXCHG to not break up the EXPERT menu

commit 62b4d2041117f35ab2409c9f5c4b8d3dc8e59d0f upstream.

commit 03b8c7b623c80af264c4c8d6111e5c6289933666 ("futex: Allow
architectures to skip futex_atomic_cmpxchg_inatomic() test") added the
HAVE_FUTEX_CMPXCHG symbol right below FUTEX.  This placed it right in
the middle of the options for the EXPERT menu.  However,
HAVE_FUTEX_CMPXCHG does not depend on EXPERT or FUTEX, so Kconfig stops
placing items in the EXPERT menu, and displays the remaining several
EXPERT items (starting with EPOLL) directly in the General Setup menu.

Since both users of HAVE_FUTEX_CMPXCHG only select it "if FUTEX", make
HAVE_FUTEX_CMPXCHG itself depend on FUTEX.  With this change, the
subsequent items display as part of the EXPERT menu again; the EMBEDDED
menu now appears as the next top-level item in the General Setup menu,
which makes General Setup much shorter and more usable.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoperf: fix perf bug in fork()
Peter Zijlstra [Thu, 2 Oct 2014 23:17:02 +0000 (16:17 -0700)]
perf: fix perf bug in fork()

commit 6c72e3501d0d62fc064d3680e5234f3463ec5a86 upstream.

Oleg noticed that a cleanup by Sylvain actually uncovered a bug; by
calling perf_event_free_task() when failing sched_fork() we will not yet
have done the memset() on ->perf_event_ctxp[] and will therefore try and
'free' the inherited contexts, which are still in use by the parent
process.  This is bad..

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Sylvain 'ythier' Hitier <sylvain.hitier@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoudf: Avoid infinite loop when processing indirect ICBs
Jan Kara [Thu, 4 Sep 2014 12:06:55 +0000 (14:06 +0200)]
udf: Avoid infinite loop when processing indirect ICBs

commit c03aa9f6e1f938618e6db2e23afef0574efeeb65 upstream.

We did not implement any bound on number of indirect ICBs we follow when
loading inode. Thus corrupted medium could cause kernel to go into an
infinite loop, possibly causing a stack overflow.

Fix the possible stack overflow by removing recursion from
__udf_read_inode() and limit number of indirect ICBs we follow to avoid
infinite loops.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Chuck Ebbert <cebbert.lkml@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoLinux 3.10.56
Greg Kroah-Hartman [Sun, 5 Oct 2014 21:54:30 +0000 (14:54 -0700)]
Linux 3.10.56

10 years agovm_is_stack: use for_each_thread() rather then buggy while_each_thread()
Oleg Nesterov [Fri, 8 Aug 2014 21:19:17 +0000 (14:19 -0700)]
vm_is_stack: use for_each_thread() rather then buggy while_each_thread()

commit 4449a51a7c281602d3a385044ab928322a122a02 upstream.

Aleksei hit the soft lockup during reading /proc/PID/smaps.  David
investigated the problem and suggested the right fix.

while_each_thread() is racy and should die, this patch updates
vm_is_stack().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Aleksei Besogonov <alex.besogonov@gmail.com>
Tested-by: Aleksei Besogonov <alex.besogonov@gmail.com>
Suggested-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agooom_kill: add rcu_read_lock() into find_lock_task_mm()
Oleg Nesterov [Tue, 21 Jan 2014 23:50:01 +0000 (15:50 -0800)]
oom_kill: add rcu_read_lock() into find_lock_task_mm()

commit 4d4048be8a93769350efa31d2482a038b7de73d0 upstream.

find_lock_task_mm() expects it is called under rcu or tasklist lock, but
it seems that at least oom_unkillable_task()->task_in_mem_cgroup() and
mem_cgroup_out_of_memory()->oom_badness() can call it lockless.

Perhaps we could fix the callers, but this patch simply adds rcu lock
into find_lock_task_mm().  This also allows to simplify a bit one of its
callers, oom_kill_process().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Sergey Dyasly <dserrg@gmail.com>
Cc: Sameer Nanda <snanda@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: "Ma, Xindong" <xindong.ma@intel.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agooom_kill: has_intersects_mems_allowed() needs rcu_read_lock()
Oleg Nesterov [Tue, 21 Jan 2014 23:50:00 +0000 (15:50 -0800)]
oom_kill: has_intersects_mems_allowed() needs rcu_read_lock()

commit ad96244179fbd55b40c00f10f399bc04739b8e1f upstream.

At least out_of_memory() calls has_intersects_mems_allowed() without
even rcu_read_lock(), this is obviously buggy.

Add the necessary rcu_read_lock().  This means that we can not simply
return from the loop, we need "bool ret" and "break".

While at it, swap the names of task_struct's (the argument and the
local).  This cleans up the code a little bit and avoids the unnecessary
initialization.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Sergey Dyasly <dserrg@gmail.com>
Tested-by: Sergey Dyasly <dserrg@gmail.com>
Reviewed-by: Sameer Nanda <snanda@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: "Ma, Xindong" <xindong.ma@intel.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agooom_kill: change oom_kill.c to use for_each_thread()
Oleg Nesterov [Tue, 21 Jan 2014 23:49:58 +0000 (15:49 -0800)]
oom_kill: change oom_kill.c to use for_each_thread()

commit 1da4db0cd5c8a31d4468ec906b413e75e604b465 upstream.

Change oom_kill.c to use for_each_thread() rather than the racy
while_each_thread() which can loop forever if we race with exit.

Note also that most users were buggy even if while_each_thread() was
fine, the task can exit even _before_ rcu_read_lock().

Fortunately the new for_each_thread() only requires the stable
task_struct, so this change fixes both problems.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Sergey Dyasly <dserrg@gmail.com>
Tested-by: Sergey Dyasly <dserrg@gmail.com>
Reviewed-by: Sameer Nanda <snanda@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: "Ma, Xindong" <xindong.ma@intel.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agointroduce for_each_thread() to replace the buggy while_each_thread()
Oleg Nesterov [Tue, 21 Jan 2014 23:49:56 +0000 (15:49 -0800)]
introduce for_each_thread() to replace the buggy while_each_thread()

commit 0c740d0afc3bff0a097ad03a1c8df92757516f5c upstream.

while_each_thread() and next_thread() should die, almost every lockless
usage is wrong.

1. Unless g == current, the lockless while_each_thread() is not safe.

   while_each_thread(g, t) can loop forever if g exits, next_thread()
   can't reach the unhashed thread in this case. Note that this can
   happen even if g is the group leader, it can exec.

2. Even if while_each_thread() itself was correct, people often use
   it wrongly.

   It was never safe to just take rcu_read_lock() and loop unless
   you verify that pid_alive(g) == T, even the first next_thread()
   can point to the already freed/reused memory.

This patch adds signal_struct->thread_head and task->thread_node to
create the normal rcu-safe list with the stable head.  The new
for_each_thread(g, t) helper is always safe under rcu_read_lock() as
long as this task_struct can't go away.

Note: of course it is ugly to have both task_struct->thread_node and the
old task_struct->thread_group, we will kill it later, after we change
the users of while_each_thread() to use for_each_thread().

Perhaps we can kill it even before we convert all users, we can
reimplement next_thread(t) using the new thread_head/thread_node.  But
we can't do this right now because this will lead to subtle behavioural
changes.  For example, do/while_each_thread() always sees at least one
task, while for_each_thread() can do nothing if the whole thread group
has died.  Or thread_group_empty(), currently its semantics is not clear
unless thread_group_leader(p) and we need to audit the callers before we
can change it.

So this patch adds the new interface which has to coexist with the old
one for some time, hopefully the next changes will be more or less
straightforward and the old one will go away soon.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Sergey Dyasly <dserrg@gmail.com>
Tested-by: Sergey Dyasly <dserrg@gmail.com>
Reviewed-by: Sameer Nanda <snanda@chromium.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: "Ma, Xindong" <xindong.ma@intel.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agokernel/fork.c:copy_process(): unify CLONE_THREAD-or-thread_group_leader code
Oleg Nesterov [Wed, 3 Jul 2013 22:08:30 +0000 (15:08 -0700)]
kernel/fork.c:copy_process(): unify CLONE_THREAD-or-thread_group_leader code

commit 80628ca06c5d42929de6bc22c0a41589a834d151 upstream.

Cleanup and preparation for the next changes.

Move the "if (clone_flags & CLONE_THREAD)" code down under "if
(likely(p->pid))" and turn it into into the "else" branch.  This makes the
process/thread initialization more symmetrical and removes one check.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Sergey Dyasly <dserrg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoarm: multi_v7_defconfig: Enable Zynq UART driver
Soren Brinkmann [Wed, 19 Jun 2013 17:53:03 +0000 (10:53 -0700)]
arm: multi_v7_defconfig: Enable Zynq UART driver

commit 90de827b9c238f8d8209bc7adc70190575514315 upstream.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoext2: Fix fs corruption in ext2_get_xip_mem()
Jan Kara [Tue, 5 Nov 2013 00:15:38 +0000 (01:15 +0100)]
ext2: Fix fs corruption in ext2_get_xip_mem()

commit 7ba3ec5749ddb61f79f7be17b5fd7720eebc52de upstream.

Commit 8e3dffc651cb "Ext2: mark inode dirty after the function
dquot_free_block_nodirty is called" unveiled a bug in __ext2_get_block()
called from ext2_get_xip_mem(). That function called ext2_get_block()
mistakenly asking it to map 0 blocks while 1 was intended. Before the
above mentioned commit things worked out fine by luck but after that commit
we started returning that we allocated 0 blocks while we in fact
allocated 1 block and thus allocation was looping until all blocks in
the filesystem were exhausted.

Fix the problem by properly asking for one block and also add assertion
in ext2_get_blocks() to catch similar problems.

Reported-and-tested-by: Andiry Xu <andiry.xu@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoserial: 8250_dma: check the result of TX buffer mapping
Heikki Krogerus [Mon, 28 Apr 2014 12:59:56 +0000 (15:59 +0300)]
serial: 8250_dma: check the result of TX buffer mapping

commit d4089a332883ad969700aac5dd4dd5f1c4fee825 upstream.

Using dma_mapping_error() to make sure the mapping did not
fail.

Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: "Petallo, MauriceX R" <mauricex.r.petallo@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoARM: 7748/1: oabi: handle faults when loading swi instruction from userspace
Will Deacon [Wed, 5 Jun 2013 10:25:13 +0000 (11:25 +0100)]
ARM: 7748/1: oabi: handle faults when loading swi instruction from userspace

commit 1aa2b3b7a6c4f3dbd3671171113a20e6a6190e3b upstream.

Running an OABI_COMPAT kernel on an SMP platform can lead to fun and
games with page aging.

If one CPU issues a swi instruction immediately before another CPU
decides to mkold the page containing the swi instruction, then we will
fault attempting to load the instruction during the vector_swi handler
in order to retrieve its immediate field. Since this fault is not
currently dealt with by our exception tables, this results in a panic:

  Unable to handle kernel paging request at virtual address 4020841c
  pgd = c490c000
  [4020841c] *pgd=84451831, *pte=bf05859d, *ppte=00000000
  Internal error: Oops: 17 [#1] PREEMPT SMP ARM
  Modules linked in: hid_sony(O)
  CPU: 1    Tainted: G        W  O  (3.4.0-perf-gf496dca-01162-gcbcc62b #1)
  PC is at vector_swi+0x28/0x88
  LR is at 0x40208420

This patch wraps all of the swi instruction loads with the USER macro
and provides a shared exception table entry which simply rewinds the
saved user PC and returns from the system call (without setting tbl, so
there's no worries with tracing or syscall restarting). Returning to
userspace will re-enter the page fault handler, from where we will
probably send SIGSEGV to the current task.

Reported-by: Wang, Yalin <yalin.wang@sonymobile.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agonetfilter: nf_conntrack: avoid large timeout for mid-stream pickup
Florian Westphal [Thu, 13 Jun 2013 15:31:28 +0000 (17:31 +0200)]
netfilter: nf_conntrack: avoid large timeout for mid-stream pickup

commit 6547a221871f139cc56328a38105d47c14874cbe upstream.

When loose tracking is enabled (default), non-syn packets cause
creation of new conntracks in established state with default timeout for
established state (5 days).  This causes the table to fill up with UNREPLIED
when the 'new ack' packet happened to be the last-ack of a previous,
already timed-out connection.

Consider:

A 192.168.x.52792 > 10.184.y.80: F, 426:426(0) ack 9237 win 255
B 10.184.y.80 > 192.168.x.52792: ., ack 427 win 123
<61 second pause>
C 10.184.y.80 > 192.168.x.52792: F, 9237:9237(0) ack 427 win 123
D 192.168.x.52792 > 10.184.y.80: ., ack 9238 win 255

B moves conntrack to CLOSE_WAIT and will kill it after 60 second timeout,
C is ignored (FIN set), but last packet (D) causes new ct with 5-days timeout.

Use UNACK timeout (5 minutes) instead to get rid of these entries sooner
when in ESTABLISHED state without having seen traffic in both directions.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Koch <florian.koch1981@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoPM / sleep: Use valid_state() for platform-dependent sleep states only
Rafael J. Wysocki [Mon, 26 May 2014 11:40:53 +0000 (13:40 +0200)]
PM / sleep: Use valid_state() for platform-dependent sleep states only

commit 43e8317b0bba1d6eb85f38a4a233d82d7c20d732 upstream.

Use the observation that, for platform-dependent sleep states
(PM_SUSPEND_STANDBY, PM_SUSPEND_MEM), a given state is either
always supported or always unsupported and store that information
in pm_states[] instead of calling valid_state() every time we
need to check it.

Also do not use valid_state() for PM_SUSPEND_FREEZE, which is always
valid, and move the pm_test_level validity check for PM_SUSPEND_FREEZE
directly into enter_state().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoPM / sleep: Add state field to pm_states[] entries
Rafael J. Wysocki [Mon, 26 May 2014 11:40:47 +0000 (13:40 +0200)]
PM / sleep: Add state field to pm_states[] entries

commit 27ddcc6596e50cb8f03d2e83248897667811d8f6 upstream.

To allow sleep states corresponding to the "mem", "standby" and
"freeze" lables to be different from the pm_states[] indexes of
those strings, introduce struct pm_sleep_state, consisting of
a string label and a state number, and turn pm_states[] into an
array of objects of that type.

This modification should not lead to any functional changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoipvs: fix ipv6 hook registration for local replies
Julian Anastasov [Fri, 22 Aug 2014 14:53:41 +0000 (17:53 +0300)]
ipvs: fix ipv6 hook registration for local replies

commit eb90b0c734ad793d5f5bf230a9e9a4dcc48df8aa upstream.

commit fc604767613b6d2036cdc35b660bc39451040a47
("ipvs: changes for local real server") from 2.6.37
introduced DNAT support to local real server but the
IPv6 LOCAL_OUT handler ip_vs_local_reply6() is
registered incorrectly as IPv4 hook causing any outgoing
IPv4 traffic to be dropped depending on the IP header values.

Chris tracked down the problem to CONFIG_IP_VS_IPV6=y
Bug report: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349768

Reported-by: Chris J Arges <chris.j.arges@canonical.com>
Tested-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoipvs: Maintain all DSCP and ECN bits for ipv6 tun forwarding
Alex Gartrell [Wed, 16 Jul 2014 22:57:34 +0000 (15:57 -0700)]
ipvs: Maintain all DSCP and ECN bits for ipv6 tun forwarding

commit 76f084bc10004b3050b2cff9cfac29148f1f6088 upstream.

Previously, only the four high bits of the tclass were maintained in the
ipv6 case.  This matches the behavior of ipv4, though whether or not we
should reflect ECN bits may be up for debate.

Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoipvs: avoid netns exit crash on ip_vs_conn_drop_conntrack
Julian Anastasov [Thu, 10 Jul 2014 06:24:01 +0000 (09:24 +0300)]
ipvs: avoid netns exit crash on ip_vs_conn_drop_conntrack

commit 2627b7e15c5064ddd5e578e4efd948d48d531a3f upstream.

commit 8f4e0a18682d91 ("IPVS netns exit causes crash in conntrack")
added second ip_vs_conn_drop_conntrack call instead of just adding
the needed check. As result, the first call still can cause
crash on netns exit. Remove it.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomd/raid1: fix_read_error should act on all non-faulty devices.
NeilBrown [Thu, 18 Sep 2014 01:09:04 +0000 (11:09 +1000)]
md/raid1: fix_read_error should act on all non-faulty devices.

commit b8cb6b4c121e1bf1963c16ed69e7adcb1bc301cd upstream.

If a devices is being recovered it is not InSync and is not Faulty.

If a read error is experienced on that device, fix_read_error()
will be called, but it ignores non-InSync devices.  So it will
neither fix the error nor fail the device.

It is incorrect that fix_read_error() ignores non-InSync devices.
It should only ignore Faulty devices.  So fix it.

This became a bug when we allowed reading from a device that was being
recovered.  It is suitable for any subsequent -stable kernel.

Fixes: da8840a747c0dbf49506ec906757a6b87b9741e9
Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Tested-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomedia: cx18: fix kernel oops with tda8290 tuner
Hans Verkuil [Tue, 26 Aug 2014 05:59:53 +0000 (02:59 -0300)]
media: cx18: fix kernel oops with tda8290 tuner

commit 6a03dc92cc2edfa2257502557b9f714893987383 upstream.

This was caused by an uninitialized setup.config field.

Based on a suggestion from Devin Heitmueller.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Thanks-to: Devin Heitmueller <dheitmueller@kernellabs.com>
Reported-by: Scott Robinson <scott.robinson55@gmail.com>
Tested-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoFix nasty 32-bit overflow bug in buffer i/o code.
Anton Altaparmakov [Mon, 22 Sep 2014 00:53:03 +0000 (01:53 +0100)]
Fix nasty 32-bit overflow bug in buffer i/o code.

commit f2d5a94436cc7cc0221b9a81bba2276a25187dd3 upstream.

On 32-bit architectures, the legacy buffer_head functions are not always
handling the sector number with the proper 64-bit types, and will thus
fail on 4TB+ disks.

Any code that uses __getblk() (and thus bread(), breadahead(),
sb_bread(), sb_breadahead(), sb_getblk()), and calls it using a 64-bit
block on a 32-bit arch (where "long" is 32-bit) causes an inifinite loop
in __getblk_slow() with an infinite stream of errors logged to dmesg
like this:

  __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
  b_state=0x00000020, b_size=512
  device sda1 blocksize: 512

Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the
top 32-bits are missing (in this case the 0x1 at the top).

This is because grow_dev_page() is broken and has a 32-bit overflow due
to shifting the page index value (a pgoff_t - which is just 32 bits on
32-bit architectures) left-shifted as the block number.  But the top
bits to get lost as the pgoff_t is not type cast to sector_t / 64-bit
before the shift.

This patch fixes this issue by type casting "index" to sector_t before
doing the left shift.

Note this is not a theoretical bug but has been seen in the field on a
4TiB hard drive with logical sector size 512 bytes.

This patch has been verified to fix the infinite loop problem on 3.17-rc5
kernel using a 4TB disk image mounted using "-o loop".  Without this patch
doing a "find /nt" where /nt is an NTFS volume causes the inifinite loop
100% reproducibly whilst with the patch it works fine as expected.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoperf kmem: Make it work again on non NUMA machines
Jiri Olsa [Thu, 12 Sep 2013 16:39:36 +0000 (18:39 +0200)]
perf kmem: Make it work again on non NUMA machines

commit 4921e320244e099bdf237fd10428594ce5f5b87d upstream.

The commit '2814eb0 perf kmem: Remove die() calls' disabled 'perf kmem'
command for machines without numa support. It made the command fail if
'/sys/devices/system/node' dir wasn't found.

Skipping the numa based initialization in case the directory is not
found and continue execution.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1379003976-5839-5-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: zhangzhiqiang <zhangzhiqiang.zhang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoperf: Fix a race condition in perf_remove_from_context()
Cong Wang [Tue, 2 Sep 2014 22:27:20 +0000 (15:27 -0700)]
perf: Fix a race condition in perf_remove_from_context()

commit 3577af70a2ce4853d58e57d832e687d739281479 upstream.

We saw a kernel soft lockup in perf_remove_from_context(),
it looks like the `perf` process, when exiting, could not go
out of the retry loop. Meanwhile, the target process was forking
a child. So either the target process should execute the smp
function call to deactive the event (if it was running) or it should
do a context switch which deactives the event.

It seems we optimize out a context switch in perf_event_context_sched_out(),
and what's more important, we still test an obsolete task pointer when
retrying, so no one actually would deactive that event in this situation.
Fix it directly by reloading the task pointer in perf_remove_from_context().

This should cure the above soft lockup.

Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1409696840-843-1-git-send-email-xiyou.wangcong@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoalarmtimer: Lock k_itimer during timer callback
Richard Larocque [Wed, 10 Sep 2014 01:31:05 +0000 (18:31 -0700)]
alarmtimer: Lock k_itimer during timer callback

commit 474e941bed9262f5fa2394f9a4a67e24499e5926 upstream.

Locks the k_itimer's it_lock member when handling the alarm timer's
expiry callback.

The regular posix timers defined in posix-timers.c have this lock held
during timout processing because their callbacks are routed through
posix_timer_fn().  The alarm timers follow a different path, so they
ought to grab the lock somewhere else.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: Richard Larocque <rlarocque@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoalarmtimer: Do not signal SIGEV_NONE timers
Richard Larocque [Wed, 10 Sep 2014 01:31:04 +0000 (18:31 -0700)]
alarmtimer: Do not signal SIGEV_NONE timers

commit 265b81d23a46c39df0a735a3af4238954b41a4c2 upstream.

Avoids sending a signal to alarm timers created with sigev_notify set to
SIGEV_NONE by checking for that special case in the timeout callback.

The regular posix timers avoid sending signals to SIGEV_NONE timers by
not scheduling any callbacks for them in the first place.  Although it
would be possible to do something similar for alarm timers, it's simpler
to handle this as a special case in the timeout.

Prior to this patch, the alarm timer would ignore the sigev_notify value
and try to deliver signals to the process anyway.  Even worse, the
sanity check for the value of sigev_signo is skipped when SIGEV_NONE was
specified, so the signal number could be bogus.  If sigev_signo was an
unitialized value (as it often would be if SIGEV_NONE is used), then
it's hard to predict which signal will be sent.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: Richard Larocque <rlarocque@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoparisc: Only use -mfast-indirect-calls option for 32-bit kernel builds
John David Anglin [Tue, 23 Sep 2014 00:54:50 +0000 (20:54 -0400)]
parisc: Only use -mfast-indirect-calls option for 32-bit kernel builds

commit d26a7730b5874a5fa6779c62f4ad7c5065a94723 upstream.

In spite of what the GCC manual says, the -mfast-indirect-calls has
never been supported in the 64-bit parisc compiler. Indirect calls have
always been done using function descriptors irrespective of the
-mfast-indirect-calls option.

Recently, it was noticed that a function descriptor was always requested
when the -mfast-indirect-calls option was specified. This caused
problems when the option was used in  application code and doesn't make
any sense because the whole point of the option is to avoid using a
function descriptor for indirect calls.

Fixing this broke 64-bit kernel builds.

I will fix GCC but for now we need the attached change. This results in
the same kernel code as before.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopowerpc/perf: Fix ABIv2 kernel backtraces
Anton Blanchard [Tue, 26 Aug 2014 02:44:15 +0000 (12:44 +1000)]
powerpc/perf: Fix ABIv2 kernel backtraces

commit 85101af13bb854a6572fa540df7c7201958624b9 upstream.

ABIv2 kernels are failing to backtrace through the kernel. An example:

39.30%  readseek2_proce  [kernel.kallsyms]    [k] find_get_entry
            |
            --- find_get_entry
               __GI___libc_read

The problem is in valid_next_sp() where we check that the new stack
pointer is at least STACK_FRAME_OVERHEAD below the previous one.

ABIv1 has a minimum stack frame size of 112 bytes consisting of 48 bytes
and 64 bytes of parameter save area. ABIv2 changes that to 32 bytes
with no paramter save area.

STACK_FRAME_OVERHEAD is in theory the minimum stack frame size,
but we over 240 uses of it, some of which assume that it includes
space for the parameter area.

We need to work through all our stack defines and rationalise them
but let's fix perf now by creating STACK_FRAME_MIN_SIZE and using
in valid_next_sp(). This fixes the issue:

30.64%  readseek2_proce  [kernel.kallsyms]    [k] find_get_entry
            |
            --- find_get_entry
               pagecache_get_page
               generic_file_read_iter
               new_sync_read
               vfs_read
               sys_read
               syscall_exit
               __GI___libc_read

Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agosched: Fix unreleased llc_shared_mask bit during CPU hotplug
Wanpeng Li [Wed, 24 Sep 2014 08:38:05 +0000 (16:38 +0800)]
sched: Fix unreleased llc_shared_mask bit during CPU hotplug

commit 03bd4e1f7265548832a76e7919a81f3137c44fd1 upstream.

The following bug can be triggered by hot adding and removing a large number of
xen domain0's vcpus repeatedly:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group
PGD 5a9d5067 PUD 13067 PMD 0
Oops: 0000 [#3] SMP
[...]
Call Trace:
load_balance
? _raw_spin_unlock_irqrestore
idle_balance
__schedule
schedule
schedule_timeout
? lock_timer_base
schedule_timeout_uninterruptible
msleep
lock_device_hotplug_sysfs
online_store
dev_attr_store
sysfs_write_file
vfs_write
SyS_write
system_call_fastpath

Last level cache shared mask is built during CPU up and the
build_sched_domain() routine takes advantage of it to setup
the sched domain CPU topology.

However, llc_shared_mask is not released during CPU disable,
which leads to an invalid sched domainCPU topology.

This patch fix it by releasing the llc_shared_mask correctly
during CPU disable.

Yasuaki also reported that this can happen on real hardware:

  https://lkml.org/lkml/2014/7/22/1018

His case is here:

==
Here is an example on my system.
My system has 4 sockets and each socket has 15 cores and HT is
enabled. In this case, each core of sockes is numbered as
follows:

 | CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-44, 90-104
Socket#3 | 45-59, 105-119

Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.

It means that last level cache of Socket#2 is shared with
CPU#30-44 and 90-104.

When hot-removing socket#2 and #3, each core of sockets is
numbered as follows:

 | CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89

But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30
remains having 0x3fff80000001fffc0000000.

After that, when hot-adding socket#2 and #3, each core of
sockets is numbered as follows:

 | CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-59
Socket#3 | 90-119

Then llc_shared_mask of CPU#30 becomes
0x3fff8000fffffffc0000000. It means that last level cache of
Socket#2 is shared with CPU#30-59 and 90-104. So the mask has
the wrong value.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Tested-by: Linn Crosetto <linn@hp.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoocfs2/dlm: do not get resource spinlock if lockres is new
Joseph Qi [Thu, 25 Sep 2014 23:05:16 +0000 (16:05 -0700)]
ocfs2/dlm: do not get resource spinlock if lockres is new

commit 5760a97c7143c208fa3a8f8cad0ed7dd672ebd28 upstream.

There is a deadlock case which reported by Guozhonghua:
  https://oss.oracle.com/pipermail/ocfs2-devel/2014-September/010079.html

This case is caused by &res->spinlock and &dlm->master_lock
misordering in different threads.

It was introduced by commit 8d400b81cc83 ("ocfs2/dlm: Clean up refmap
helpers").  Since lockres is new, it doesn't not require the
&res->spinlock.  So remove it.

Fixes: 8d400b81cc83 ("ocfs2/dlm: Clean up refmap helpers")
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Reported-by: Guozhonghua <guozhonghua@h3c.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agonilfs2: fix data loss with mmap()
Andreas Rohner [Thu, 25 Sep 2014 23:05:14 +0000 (16:05 -0700)]
nilfs2: fix data loss with mmap()

commit 56d7acc792c0d98f38f22058671ee715ff197023 upstream.

This bug leads to reproducible silent data loss, despite the use of
msync(), sync() and a clean unmount of the file system.  It is easily
reproducible with the following script:

  ----------------[BEGIN SCRIPT]--------------------
  mkfs.nilfs2 -f /dev/sdb
  mount /dev/sdb /mnt

  dd if=/dev/zero bs=1M count=30 of=/mnt/testfile

  umount /mnt
  mount /dev/sdb /mnt
  CHECKSUM_BEFORE="$(md5sum /mnt/testfile)"

  /root/mmaptest/mmaptest /mnt/testfile 30 10 5

  sync
  CHECKSUM_AFTER="$(md5sum /mnt/testfile)"
  umount /mnt
  mount /dev/sdb /mnt
  CHECKSUM_AFTER_REMOUNT="$(md5sum /mnt/testfile)"
  umount /mnt

  echo "BEFORE MMAP:\t$CHECKSUM_BEFORE"
  echo "AFTER MMAP:\t$CHECKSUM_AFTER"
  echo "AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT"
  ----------------[END SCRIPT]--------------------

The mmaptest tool looks something like this (very simplified, with
error checking removed):

  ----------------[BEGIN mmaptest]--------------------
  data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE,
              MAP_SHARED, fd, file_offset);

  for (i = 0; i < write_count; ++i) {
        memcpy(data + i * 4096, buf, sizeof(buf));
        msync(data, file_size - file_offset, MS_SYNC))
  }
  ----------------[END mmaptest]--------------------

The output of the script looks something like this:

  BEFORE MMAP:    281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile
  AFTER MMAP:     6604a1c31f10780331a6850371b3a313  /mnt/testfile
  AFTER REMOUNT:  281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile

So it is clear, that the changes done using mmap() do not survive a
remount.  This can be reproduced a 100% of the time.  The problem was
introduced in commit 136e8770cd5d ("nilfs2: fix issue of
nilfs_set_page_dirty() for page at EOF boundary").

If the page was read with mpage_readpage() or mpage_readpages() for
example, then it has no buffers attached to it.  In that case
page_has_buffers(page) in nilfs_set_page_dirty() will be false.
Therefore nilfs_set_file_dirty() is never called and the pages are never
collected and never written to disk.

This patch fixes the problem by also calling nilfs_set_file_dirty() if the
page has no buffers attached to it.

[akpm@linux-foundation.org: s/PAGE_SHIFT/PAGE_CACHE_SHIFT/]
Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agofs/notify: don't show f_handle if exportfs_encode_inode_fh failed
Andrey Vagin [Tue, 9 Sep 2014 21:51:06 +0000 (14:51 -0700)]
fs/notify: don't show f_handle if exportfs_encode_inode_fh failed

commit 7e8824816bda16bb11ff5ff1e1212d642e57b0b3 upstream.

Currently we handle only ENOSPC.  In case of other errors the file_handle
variable isn't filled properly and we will show a part of stack.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agofsnotify/fdinfo: use named constants instead of hardcoded values
Andrey Vagin [Tue, 9 Sep 2014 21:51:04 +0000 (14:51 -0700)]
fsnotify/fdinfo: use named constants instead of hardcoded values

commit 1fc98d11cac6dd66342e5580cb2687e5b1e9a613 upstream.

MAX_HANDLE_SZ is equal to 128, but currently the size of pad is only 64
bytes, so exportfs_encode_inode_fh can return an error.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agokcmp: fix standard comparison bug
Rasmus Villemoes [Tue, 9 Sep 2014 21:51:01 +0000 (14:51 -0700)]
kcmp: fix standard comparison bug

commit acbbe6fbb240a927ee1f5994f04d31267d422215 upstream.

The C operator <= defines a perfectly fine total ordering on the set of
values representable in a long.  However, unlike its namesake in the
integers, it is not translation invariant, meaning that we do not have
"b <= c" iff "a+b <= a+c" for all a,b,c.

This means that it is always wrong to try to boil down the relationship
between two longs to a question about the sign of their difference,
because the resulting relation [a LEQ b iff a-b <= 0] is neither
anti-symmetric or transitive.  The former is due to -LONG_MIN==LONG_MIN
(take any two a,b with a-b = LONG_MIN; then a LEQ b and b LEQ a, but a !=
b).  The latter can either be seen observing that x LEQ x+1 for all x,
implying x LEQ x+1 LEQ x+2 ...  LEQ x-1 LEQ x; or more directly with the
simple example a=LONG_MIN, b=0, c=1, for which a-b < 0, b-c < 0, but a-c >
0.

Note that it makes absolutely no difference that a transmogrying bijection
has been applied before the comparison is done.  In fact, had the
obfuscation not been done, one could probably not observe the bug
(assuming all values being compared always lie in one half of the address
space, the mathematical value of a-b is always representable in a long).
As it stands, one can easily obtain three file descriptors exhibiting the
non-transitivity of kcmp().

Side note 1: I can't see that ensuring the MSB of the multiplier is
set serves any purpose other than obfuscating the obfuscating code.

Side note 2:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <assert.h>
#include <sys/syscall.h>

enum kcmp_type {
        KCMP_FILE,
        KCMP_VM,
        KCMP_FILES,
        KCMP_FS,
        KCMP_SIGHAND,
        KCMP_IO,
        KCMP_SYSVSEM,
        KCMP_TYPES,
};
pid_t pid;

int kcmp(pid_t pid1, pid_t pid2, int type,
 unsigned long idx1, unsigned long idx2)
{
return syscall(SYS_kcmp, pid1, pid2, type, idx1, idx2);
}
int cmp_fd(int fd1, int fd2)
{
int c = kcmp(pid, pid, KCMP_FILE, fd1, fd2);
if (c < 0) {
perror("kcmp");
exit(1);
}
assert(0 <= c && c < 3);
return c;
}
int cmp_fdp(const void *a, const void *b)
{
static const int normalize[] = {0, -1, 1};
return normalize[cmp_fd(*(int*)a, *(int*)b)];
}
#define MAX 100 /* This is plenty; I've seen it trigger for MAX==3 */
int main(int argc, char *argv[])
{
int r, s, count = 0;
int REL[3] = {0,0,0};
int fd[MAX];
pid = getpid();
while (count < MAX) {
r = open("/dev/null", O_RDONLY);
if (r < 0)
break;
fd[count++] = r;
}
printf("opened %d file descriptors\n", count);
for (r = 0; r < count; ++r) {
for (s = r+1; s < count; ++s) {
REL[cmp_fd(fd[r], fd[s])]++;
}
}
printf("== %d\t< %d\t> %d\n", REL[0], REL[1], REL[2]);
qsort(fd, count, sizeof(fd[0]), cmp_fdp);
memset(REL, 0, sizeof(REL));

for (r = 0; r < count; ++r) {
for (s = r+1; s < count; ++s) {
REL[cmp_fd(fd[r], fd[s])]++;
}
}
printf("== %d\t< %d\t> %d\n", REL[0], REL[1], REL[2]);
return (REL[0] + REL[2] != 0);
}

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoRevert "mac80211: disable uAPSD if all ACs are under ACM"
Johannes Berg [Mon, 25 Aug 2014 10:08:09 +0000 (12:08 +0200)]
Revert "mac80211: disable uAPSD if all ACs are under ACM"

commit bb512ad0732232f1d2693bb68f31a76bed8f22ae upstream.

This reverts commit 24aa11ab8ae03292d38ec0dbd9bc2ac49fe8a6dd.

That commit was wrong since it uses data that hasn't even been set
up yet, but might be a hold-over from a previous connection.

Additionally, it seems like a driver-specific workaround that
shouldn't have been in mac80211 to start with.

Fixes: 24aa11ab8ae0 ("mac80211: disable uAPSD if all ACs are under ACM")
Reviewed-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agousb: dwc3: core: fix ordering for PHY suspend
Felipe Balbi [Wed, 3 Sep 2014 21:13:37 +0000 (16:13 -0500)]
usb: dwc3: core: fix ordering for PHY suspend

commit dc99f16f076559235c92d3eb66d03d1310faea08 upstream.

We can't suspend the PHYs before dwc3_core_exit_mode()
has been called, that's because the host and/or device
sides might still need to communicate with the far end
link partner.

Fixes: 8ba007a (usb: dwc3: core: enable the USB2 and USB3 phy in probe)
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agousb: dwc3: core: fix order of PM runtime calls
Felipe Balbi [Tue, 2 Sep 2014 19:57:20 +0000 (14:57 -0500)]
usb: dwc3: core: fix order of PM runtime calls

commit fed33afce0eda44a46ae24d93aec1b5198c0bac4 upstream.

Currently, we disable pm_runtime before all register
accesses are done, this is dangerous and might lead
to abort exceptions due to the driver trying to access
a register which is clocked by a clock which was long
gated.

Fix that by moving pm_runtime_put_sync() and pm_runtime_disable()
as the last thing we do before returning from our ->remove()
method.

Fixes: 72246da (usb: Introduce DesignWare USB3 DRD Driver)
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agousb: host: xhci: fix compliance mode workaround
Felipe Balbi [Wed, 27 Aug 2014 21:38:04 +0000 (16:38 -0500)]
usb: host: xhci: fix compliance mode workaround

commit 96908589a8b2584b1185f834d365f5cc360e8226 upstream.

Commit 71c731a (usb: host: xhci: Fix Compliance Mode
on SN65LVP3502CP Hardware) implemented a workaround
for a known issue with Texas Instruments' USB 3.0
redriver IC but it left a condition where any xHCI
host would be taken out of reset if port was placed
in compliance mode and there was no device connected
to the port.

That condition would trigger a fake connection to a
non-existent device so that usbcore would trigger a
warm reset of the port, thus taking the link out of
reset.

This has the side-effect of preventing any xHCI host
connected to a Linux machine from starting and running
the USB 3.0 Electrical Compliance Suite because the
port will mysteriously taken out of compliance mode
and, thus, xHCI won't step through the necessary
compliance patterns for link validation.

This patch fixes the issue by just adding a missing
check for XHCI_COMP_MODE_QUIRK inside
xhci_hub_report_usb3_link_state() when PORT_CAS isn't
set.

This patch should be backported to all kernels containing
commit 71c731a.

Fixes: 71c731a (usb: host: xhci: Fix Compliance Mode on SN65LVP3502CP Hardware)
Cc: Alexis R. Cortes <alexis.cortes@ti.com>
Cc: <stable@vger.kernel.org> # v3.2+
Signed-off-by: Felipe Balbi <balbi@ti.com>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agogenhd: fix leftover might_sleep() in blk_free_devt()
Jens Axboe [Tue, 16 Sep 2014 19:38:51 +0000 (13:38 -0600)]
genhd: fix leftover might_sleep() in blk_free_devt()

commit 46f341ffcfb5d8530f7d1e60f3be06cce6661b62 upstream.

Commit 2da78092 changed the locking from a mutex to a spinlock,
so we now longer sleep in this context. But there was a leftover
might_sleep() in there, which now triggers since we do the final
free from an RCU callback. Get rid of it.

Reported-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agolockd: fix rpcbind crash on lockd startup failure
J. Bruce Fields [Fri, 29 Aug 2014 20:25:50 +0000 (16:25 -0400)]
lockd: fix rpcbind crash on lockd startup failure

commit 7c17705e77b12b20fb8afb7c1b15dcdb126c0c12 upstream.

Nikita Yuschenko reported that booting a kernel with init=/bin/sh and
then nfs mounting without portmap or rpcbind running using a busybox
mount resulted in:

  # mount -t nfs 10.30.130.21:/opt /mnt
  svc: failed to register lockdv1 RPC service (errno 111).
  lockd_up: makesock failed, error=-111
  Unable to handle kernel paging request for data at address 0x00000030
  Faulting instruction address: 0xc055e65c
  Oops: Kernel access of bad area, sig: 11 [#1]
  MPC85xx CDS
  Modules linked in:
  CPU: 0 PID: 1338 Comm: mount Not tainted 3.10.44.cge #117
  task: cf29cea0 ti: cf35c000 task.ti: cf35c000
  NIP: c055e65c LR: c0566490 CTR: c055e648
  REGS: cf35dad0 TRAP: 0300   Not tainted  (3.10.44.cge)
  MSR: 00029000 <CE,EE,ME>  CR: 22442488  XER: 20000000
  DEAR: 00000030, ESR: 00000000

  GPR00: c05606f4 cf35db80 cf29cea0 cf0ded80 cf0dedb8 00000001 1dec3086
  00000000
  GPR08: 00000000 c07b1640 00000007 1dec3086 22442482 100b9758 00000000
  10090ae8
  GPR16: 00000000 000186a5 00000000 00000000 100c3018 bfa46edc 100b0000
  bfa46ef0
  GPR24: cf386ae0 c07834f0 00000000 c0565f88 00000001 cf0dedb8 00000000
  cf0ded80
  NIP [c055e65c] call_start+0x14/0x34
  LR [c0566490] __rpc_execute+0x70/0x250
  Call Trace:
  [cf35db80] [00000080] 0x80 (unreliable)
  [cf35dbb0] [c05606f4] rpc_run_task+0x9c/0xc4
  [cf35dbc0] [c0560840] rpc_call_sync+0x50/0xb8
  [cf35dbf0] [c056ee90] rpcb_register_call+0x54/0x84
  [cf35dc10] [c056f24c] rpcb_register+0xf8/0x10c
  [cf35dc70] [c0569e18] svc_unregister.isra.23+0x100/0x108
  [cf35dc90] [c0569e38] svc_rpcb_cleanup+0x18/0x30
  [cf35dca0] [c0198c5c] lockd_up+0x1dc/0x2e0
  [cf35dcd0] [c0195348] nlmclnt_init+0x2c/0xc8
  [cf35dcf0] [c015bb5c] nfs_start_lockd+0x98/0xec
  [cf35dd20] [c015ce6c] nfs_create_server+0x1e8/0x3f4
  [cf35dd90] [c0171590] nfs3_create_server+0x10/0x44
  [cf35dda0] [c016528c] nfs_try_mount+0x158/0x1e4
  [cf35de20] [c01670d0] nfs_fs_mount+0x434/0x8c8
  [cf35de70] [c00cd3bc] mount_fs+0x20/0xbc
  [cf35de90] [c00e4f88] vfs_kern_mount+0x50/0x104
  [cf35dec0] [c00e6e0c] do_mount+0x1d0/0x8e0
  [cf35df10] [c00e75ac] SyS_mount+0x90/0xd0
  [cf35df40] [c000ccf4] ret_from_syscall+0x0/0x3c

The addition of svc_shutdown_net() resulted in two calls to
svc_rpcb_cleanup(); the second is no longer necessary and crashes when
it calls rpcb_register_call with clnt=NULL.

Reported-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
Fixes: 679b033df484 "lockd: ensure we tear down any live sockets when socket creation fails during lockd_up"
Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agortlwifi: rtl8192cu: Add new ID
Larry Finger [Sun, 24 Aug 2014 22:49:43 +0000 (17:49 -0500)]
rtlwifi: rtl8192cu: Add new ID

commit c66517165610b911e4c6d268f28d8c640832dbd1 upstream.

The Sitecom WLA-2102 adapter uses this driver.

Reported-by: Nico Baggus <nico-linux@noci.xs4all.nl>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Nico Baggus <nico-linux@noci.xs4all.nl>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopercpu: perform tlb flush after pcpu_map_pages() failure
Tejun Heo [Fri, 15 Aug 2014 20:06:10 +0000 (16:06 -0400)]
percpu: perform tlb flush after pcpu_map_pages() failure

commit 849f5169097e1ba35b90ac9df76b5bb6f9c0aabd upstream.

If pcpu_map_pages() fails midway, it unmaps the already mapped pages.
Currently, it doesn't flush tlb after the partial unmapping.  This may
be okay in most cases as the established mapping hasn't been used at
that point but it can go wrong and when it goes wrong it'd be
extremely difficult to track down.

Flush tlb after the partial unmapping.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopercpu: fix pcpu_alloc_pages() failure path
Tejun Heo [Fri, 15 Aug 2014 20:06:06 +0000 (16:06 -0400)]
percpu: fix pcpu_alloc_pages() failure path

commit f0d279654dea22b7a6ad34b9334aee80cda62cde upstream.

When pcpu_alloc_pages() fails midway, pcpu_free_pages() is invoked to
free what has already been allocated.  The invocation is across the
whole requested range and pcpu_free_pages() will try to free all
non-NULL pages; unfortunately, this is incorrect as
pcpu_get_pages_and_bitmap(), unlike what its comment suggests, doesn't
clear the pages array and thus the array may have entries from the
previous invocations making the partial failure path free incorrect
pages.

Fix it by open-coding the partial freeing of the already allocated
pages.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopercpu: free percpu allocation info for uniprocessor system
Honggang Li [Tue, 12 Aug 2014 13:36:15 +0000 (21:36 +0800)]
percpu: free percpu allocation info for uniprocessor system

commit 3189eddbcafcc4d827f7f19facbeddec4424eba8 upstream.

Currently, only SMP system free the percpu allocation info.
Uniprocessor system should free it too. For example, one x86 UML
virtual machine with 256MB memory, UML kernel wastes one page memory.

Signed-off-by: Honggang Li <enjoymindful@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoata_piix: Add Device IDs for Intel 9 Series PCH
James Ralston [Wed, 27 Aug 2014 21:31:58 +0000 (14:31 -0700)]
ata_piix: Add Device IDs for Intel 9 Series PCH

commit 6cad1376954e591c3c41500c4e586e183e7ffe6d upstream.

This patch adds the IDE mode SATA Device IDs for the Intel 9 Series PCH.

Signed-off-by: James Ralston <james.d.ralston@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: i8042 - add nomux quirk for Avatar AVIU-145A6
Hans de Goede [Thu, 11 Sep 2014 17:10:26 +0000 (10:10 -0700)]
Input: i8042 - add nomux quirk for Avatar AVIU-145A6

commit d2682118f4bb3ceb835f91c1a694407a31bb7378 upstream.

The sys_vendor / product_name are somewhat generic unfortunately, so this
may lead to some false positives. But nomux usually does no harm, where as
not having it clearly is causing problems on the Avatar AVIU-145A6.

https://bugzilla.kernel.org/show_bug.cgi?id=77391

Reported-by: Hugo P <saurosii@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: i8042 - add Fujitsu U574 to no_timeout dmi table
Hans de Goede [Wed, 10 Sep 2014 20:53:37 +0000 (13:53 -0700)]
Input: i8042 - add Fujitsu U574 to no_timeout dmi table

commit cc18a69c92d0972bc2fc5a047ee3be1e8398171b upstream.

https://bugzilla.kernel.org/show_bug.cgi?id=69731

Reported-by: Jason Robinson <mail@jasonrobinson.me>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: atkbd - do not try 'deactivate' keyboard on any LG laptops
Dmitry Torokhov [Wed, 10 Sep 2014 20:50:37 +0000 (13:50 -0700)]
Input: atkbd - do not try 'deactivate' keyboard on any LG laptops

commit c01206796139e2b1feb7539bc72174fef1c6dc6e upstream.

We are getting more and more reports about LG laptops not having
functioning keyboard if we try to deactivate keyboard during probe.
Given that having keyboard deactivated is merely "nice to have"
instead of a hard requirement for probing, let's disable it on all
LG boxes instead of trying to hunt down particular models.

This change is prompted by patches trying to add "LG Electronics"/"ROCKY"
and "LG Electronics"/"LW60-F27B" to the DMI list.

https://bugzilla.kernel.org/show_bug.cgi?id=77051

Reported-by: Jaime Velasco Juan <jsagarribay@gmail.com>
Reported-by: Georgios Tsalikis <georgios@tsalikis.net>
Tested-by: Jaime Velasco Juan <jsagarribay@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: elantech - fix detection of touchpad on ASUS s301l
Hans de Goede [Mon, 8 Sep 2014 21:39:52 +0000 (14:39 -0700)]
Input: elantech - fix detection of touchpad on ASUS s301l

commit 271329b3c798b2102120f5df829071c211ef00ed upstream.

Adjust Elantech signature validation to account fo rnewer models of
touchpads.

Reported-and-tested-by: Màrius Monton <marius.monton@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: synaptics - add support for ForcePads
Dmitry Torokhov [Sat, 30 Aug 2014 20:51:06 +0000 (13:51 -0700)]
Input: synaptics - add support for ForcePads

commit 5715fc764f7753d464dbe094b5ef9cffa6e479a4 upstream.

ForcePads are found on HP EliteBook 1040 laptops. They lack any kind of
physical buttons, instead they generate primary button click when user
presses somewhat hard on the surface of the touchpad. Unfortunately they
also report primary button click whenever there are 2 or more contacts
on the pad, messing up all multi-finger gestures (2-finger scrolling,
multi-finger tapping, etc). To cope with this behavior we introduce a
delay (currently 50 msecs) in reporting primary press in case more
contacts appear.

Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: serport - add compat handling for SPIOCSTYPE ioctl
John Sung [Tue, 9 Sep 2014 17:06:51 +0000 (10:06 -0700)]
Input: serport - add compat handling for SPIOCSTYPE ioctl

commit a80d8b02751060a178bb1f7a6b7a93645a7a308b upstream.

When running a 32-bit inputattach utility in a 64-bit system, there will be
error code "inputattach: can't set device type". This is caused by the
serport device driver not supporting compat_ioctl, so that SPIOCSTYPE ioctl
fails.

Signed-off-by: John Sung <penmount.touch@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodm crypt: fix access beyond the end of allocated space
Mikulas Patocka [Thu, 28 Aug 2014 15:09:31 +0000 (11:09 -0400)]
dm crypt: fix access beyond the end of allocated space

commit d49ec52ff6ddcda178fc2476a109cf1bd1fa19ed upstream.

The DM crypt target accesses memory beyond allocated space resulting in
a crash on 32 bit x86 systems.

This bug is very old (it dates back to 2.6.25 commit 3a7f6c990ad04 "dm
crypt: use async crypto").  However, this bug was masked by the fact
that kmalloc rounds the size up to the next power of two.  This bug
wasn't exposed until 3.17-rc1 commit 298a9fa08a ("dm crypt: use per-bio
data").  By switching to using per-bio data there was no longer any
padding beyond the end of a dm-crypt allocated memory block.

To minimize allocation overhead dm-crypt puts several structures into one
block allocated with kmalloc.  The block holds struct ablkcipher_request,
cipher-specific scratch pad (crypto_ablkcipher_reqsize(any_tfm(cc))),
struct dm_crypt_request and an initialization vector.

The variable dmreq_start is set to offset of struct dm_crypt_request
within this memory block.  dm-crypt allocates the block with this size:
cc->dmreq_start + sizeof(struct dm_crypt_request) + cc->iv_size.

When accessing the initialization vector, dm-crypt uses the function
iv_of_dmreq, which performs this calculation: ALIGN((unsigned long)(dmreq
+ 1), crypto_ablkcipher_alignmask(any_tfm(cc)) + 1).

dm-crypt allocated "cc->iv_size" bytes beyond the end of dm_crypt_request
structure.  However, when dm-crypt accesses the initialization vector, it
takes a pointer to the end of dm_crypt_request, aligns it, and then uses
it as the initialization vector.  If the end of dm_crypt_request is not
aligned on a crypto_ablkcipher_alignmask(any_tfm(cc)) boundary the
alignment causes the initialization vector to point beyond the allocated
space.

Fix this bug by calculating the variable iv_size_padding and adding it
to the allocated size.

Also correct the alignment of dm_crypt_request.  struct dm_crypt_request
is specific to dm-crypt (it isn't used by the crypto subsystem at all),
so it is aligned on __alignof__(struct dm_crypt_request).

Also align per_bio_data_size on ARCH_KMALLOC_MINALIGN, so that it is
aligned as if the block was allocated with kmalloc.

Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoblock: Fix dev_t minor allocation lifetime
Keith Busch [Tue, 26 Aug 2014 15:05:36 +0000 (09:05 -0600)]
block: Fix dev_t minor allocation lifetime

commit 2da78092dda13f1efd26edbbf99a567776913750 upstream.

Releases the dev_t minor when all references are closed to prevent
another device from acquiring the same major/minor.

Since the partition's release may be invoked from call_rcu's soft-irq
context, the ext_dev_idr's mutex had to be replaced with a spinlock so
as not so sleep.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoworkqueue: apply __WQ_ORDERED to create_singlethread_workqueue()
Tejun Heo [Fri, 12 Sep 2014 19:14:30 +0000 (04:14 +0900)]
workqueue: apply __WQ_ORDERED to create_singlethread_workqueue()

commit e09c2c295468476a239d13324ce9042ec4de05eb upstream.

create_singlethread_workqueue() is a compat interface for single
threaded workqueue which maps to ordered workqueue w/ rescuer in the
current implementation.  create_singlethread_workqueue() currently
implemented by invoking alloc_workqueue() w/ appropriate parameters.

8719dceae2f9 ("workqueue: reject adjusting max_active or applying
attrs to ordered workqueues") introduced __WQ_ORDERED to protect
ordered workqueues against dynamic attribute changes which can break
ordering guarantees but forgot to apply it to
create_singlethread_workqueue().  This in itself is okay as nobody
currently uses dynamic attribute change on workqueues created with
create_singlethread_workqueue().

However, 4c16bd327c ("workqueue: implement NUMA affinity for unbound
workqueues") broke singlethreaded guarantee for ordered workqueues
through allocating a separate pool_workqueue on each NUMA node by
default.  A later change 8a2b75384444 ("workqueue: fix ordered
workqueues in NUMA setups") fixed it by allocating only one global
pool_workqueue if __WQ_ORDERED is set.

Combined, the __WQ_ORDERED omission in create_singlethread_workqueue()
became critical breaking its single threadedness and ordering
guarantee.

Let's make create_singlethread_workqueue() wrap
alloc_ordered_workqueue() instead so that it inherits __WQ_ORDERED and
can implicitly track future ordered_workqueue changes.

v2: I missed that __WQ_ORDERED now protects against pwq splitting
    across NUMA nodes and incorrectly described the patch as a
    nice-to-have fix to protect against future dynamic attribute
    usages.  Oleg pointed out that this is actually a critical
    breakage due to 8a2b75384444 ("workqueue: fix ordered workqueues
    in NUMA setups").

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Anderson <mike.anderson@us.ibm.com>
Cc: Oleg Nesterov <onestero@redhat.com>
Cc: Gustavo Luiz Duarte <gduarte@redhat.com>
Cc: Tomas Henzl <thenzl@redhat.com>
Fixes: 4c16bd327c ("workqueue: implement NUMA affinity for unbound workqueues")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoRevert "iwlwifi: dvm: don't enable CTS to self"
Emmanuel Grumbach [Sun, 31 Aug 2014 19:11:11 +0000 (22:11 +0300)]
Revert "iwlwifi: dvm: don't enable CTS to self"

commit f47f46d7b09cf1d09e4b44b6cc4dd7d68a08028c upstream.

This reverts commit 43d826ca5979927131685cc2092c7ce862cb91cd.

This commit caused packet loss.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoSCSI: libiscsi: fix potential buffer overrun in __iscsi_conn_send_pdu
Mike Christie [Wed, 3 Sep 2014 05:00:39 +0000 (00:00 -0500)]
SCSI: libiscsi: fix potential buffer overrun in __iscsi_conn_send_pdu

commit db9bfd64b14a3a8f1868d2164518fdeab1b26ad1 upstream.

This patches fixes a potential buffer overrun in __iscsi_conn_send_pdu.
This function is used by iscsi drivers and userspace to send iscsi PDUs/
commands. For login commands, we have a set buffer size. For all other
commands we do not support data buffers.

This was reported by Dan Carpenter here:
http://www.spinics.net/lists/linux-scsi/msg66838.html

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoNFC: microread: Potential overflows in microread_target_discovered()
Dan Carpenter [Mon, 1 Sep 2014 17:27:29 +0000 (20:27 +0300)]
NFC: microread: Potential overflows in microread_target_discovered()

commit d07f1e8600ccb885c8f4143402b8912f7d827bcb upstream.

Smatch says that skb->data is untrusted so we need to check to make sure
that the memcpy() doesn't overflow.

Fixes: cfad1ba87150 ('NFC: Initial support for Inside Secure microread')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoiscsi-target: Fix memory corruption in iscsit_logout_post_handler_diffcid
Nicholas Bellinger [Wed, 17 Sep 2014 18:45:17 +0000 (11:45 -0700)]
iscsi-target: Fix memory corruption in iscsit_logout_post_handler_diffcid

commit b53b0d99d6fbf7d44330395349a895521cfdbc96 upstream.

This patch fixes a bug in iscsit_logout_post_handler_diffcid() where
a pointer used as storage for list_for_each_entry() was incorrectly
being used to determine if no matching entry had been found.

This patch changes iscsit_logout_post_handler_diffcid() to key off
bool conn_found to determine if the function needs to exit early.

Reported-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoiscsi-target: avoid NULL pointer in iscsi_copy_param_list failure
Joern Engel [Tue, 2 Sep 2014 21:49:54 +0000 (17:49 -0400)]
iscsi-target: avoid NULL pointer in iscsi_copy_param_list failure

commit 8ae757d09c45102b347a1bc2867f54ffc1ab8fda upstream.

In iscsi_copy_param_list() a failed iscsi_param_list memory allocation
currently invokes iscsi_release_param_list() to cleanup, and will promptly
trigger a NULL pointer dereference.

Instead, go ahead and return for the first iscsi_copy_param_list()
failure case.

Found by coverity.

Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoTarget/iser: Don't put isert_conn inside disconnected handler
Sagi Grimberg [Wed, 2 Jul 2014 13:19:25 +0000 (16:19 +0300)]
Target/iser: Don't put isert_conn inside disconnected handler

commit 0fc4ea701fcf5bc51ace4e288af5be741465f776 upstream.

disconnected_handler is invoked on several CM events (such
as DISCONNECTED, DEVICE_REMOVAL, TIMEWAIT_EXIT...). Since
multiple  events can occur while before isert_free_conn is
invoked, we might put all isert_conn references and free
the connection too early.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoTarget/iser: Get isert_conn reference once got to connected_handler
Sagi Grimberg [Wed, 2 Jul 2014 13:19:24 +0000 (16:19 +0300)]
Target/iser: Get isert_conn reference once got to connected_handler

commit c2f88b17a1d97ca4ecd96cc22333a7a4f1407d39 upstream.

In case the connection didn't reach connected state, disconnected
handler will never be invoked thus the second kref_put on
isert_conn will be missing.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoiio:inkern: fix overwritten -EPROBE_DEFER in of_iio_channel_get_by_name
Johannes Pointner [Mon, 25 Aug 2014 08:04:00 +0000 (09:04 +0100)]
iio:inkern: fix overwritten -EPROBE_DEFER in of_iio_channel_get_by_name

commit 872687f626e033b4ddfaec1e410057cfc6636d77 upstream.

Fixes: a2c12493ed7e ('iio: of_iio_channel_get_by_name() returns non-null pointers for error legs')

which improperly assumes that of_iio_channel_get_by_name must always
return NULL and thus now hides -EPROBE_DEFER.

Signed-off-by: Johannes Pointner <johannes.pointner@br-automation.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoiio:magnetometer: bugfix magnetometers gain values
Denis CIOCCA [Thu, 9 Oct 2014 12:55:00 +0000 (13:55 +0100)]
iio:magnetometer: bugfix magnetometers gain values

commit a31d0928999fbf33b3a6042e8bcb7b7f7e07d094 upstream.

This patch fix gains values. The first driver was designed using
engineering samples, in mass production the values are changed.

Signed-off-by: Denis Ciocca <denis.ciocca@st.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoiio: adc: ad_sigma_delta: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: adc: ad_sigma_delta: Fix indio_dev->trig assignment

commit 9e5846be33277802c0c76e5c12825d0e4d27f639 upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoiio: st_sensors: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: st_sensors: Fix indio_dev->trig assignment

commit f0e84acd7056e6d7ade551c6439531606ae30a46 upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoiio: meter: ade7758: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: meter: ade7758: Fix indio_dev->trig assignment

commit 0495081179212b758775df752e657ea71dcae020 upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoiio: inv_mpu6050: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: inv_mpu6050: Fix indio_dev->trig assignment

commit b07e3b3850b2e1f09c19f54d3ed7210d9f529e2c upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoiio: gyro: itg3200: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: gyro: itg3200: Fix indio_dev->trig assignment

commit 0b4dce2ee694a991ef38203ec5ff91a738518cb3 upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoiio:trigger: modify return value for iio_trigger_get
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio:trigger: modify return value for iio_trigger_get

commit f153566570fb9e32c2f59182883f4f66048788fb upstream.

Instead of a void function, return the trigger pointer.

Whilst not in of itself a fix, this makes the following set of
7 fixes cleaner than they would otherwise be.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoCIFS: Fix SMB2 readdir error handling
Pavel Shilovsky [Mon, 18 Aug 2014 16:49:57 +0000 (20:49 +0400)]
CIFS: Fix SMB2 readdir error handling

commit 52755808d4525f4d5b86d112d36ffc7a46f3fb48 upstream.

SMB2 servers indicates the end of a directory search with
STATUS_NO_MORE_FILE error code that is not processed now.
This causes generic/257 xfstest to fail. Fix this by triggering
the end of search by this error code in SMB2_query_directory.

Also when negotiating CIFS protocol we tell the server to close
the search automatically at the end and there is no need to do
it itself. In the case of SMB2 protocol, we need to close it
explicitly - separate close directory checks for different
protocols.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoCIFS: Fix directory rename error
Pavel Shilovsky [Fri, 22 Aug 2014 09:32:09 +0000 (13:32 +0400)]
CIFS: Fix directory rename error

commit a07d322059db66b84c9eb4f98959df468e88b34b upstream.

CIFS servers process nlink counts differently for files and directories.
In cifs_rename() if we the request fails on the existing target, we
try to remove it through cifs_unlink() but this is not what we want
to do for directories. As the result the following sequence of commands

mkdir {1,2}; mv -T 1 2; rmdir {1,2}; mkdir {1,2}; echo foo > 2/bar

and XFS test generic/023 fail with -ENOENT error. That's why the second
mkdir reuses the existing inode (target inode of the mv -T command) with
S_DEAD flag.

Fix this by checking whether the target is directory or not and
calling cifs_rmdir() rather than cifs_unlink() for directories.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoASoC: davinci-mcasp: Correct rx format unit configuration
Peter Ujfalusi [Thu, 4 Sep 2014 07:52:53 +0000 (10:52 +0300)]
ASoC: davinci-mcasp: Correct rx format unit configuration

commit fe0a29e163a5d045c73faab682a8dac71c2f8012 upstream.

In case of capture we should not use rotation. The reverse and mask is
enough to get the data align correctly from the bus to MCU:
Format   data from bus    after reverse (XRBUF)
S16_LE:  |LSB|MSB|xxx|xxx|  |xxx|xxx|MSB|LSB|
S24_3LE: |LSB|DAT|MSB|xxx|  |xxx|MSB|DAT|LSB|
S24_LE:  |LSB|DAT|MSB|xxx|  |xxx|MSB|DAT|LSB|
S32_LE:  |LSB|DAT|DAT|MSB|  |MSB|DAT|DAT|LSB|

With this patch all supported formats will work for playback and capture.

Reported-by: Jyri Sarha <jsarha@ti.com> (broken S24_3LE capture)
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoshmem: fix nlink for rename overwrite directory
Miklos Szeredi [Wed, 24 Sep 2014 15:56:17 +0000 (17:56 +0200)]
shmem: fix nlink for rename overwrite directory

commit b928095b0a7cff7fb9fcf4c706348ceb8ab2c295 upstream.

If overwriting an empty directory with rename, then need to drop the extra
nlink.

Test prog:

#include <stdio.h>
#include <fcntl.h>
#include <err.h>
#include <sys/stat.h>

int main(void)
{
const char *test_dir1 = "test-dir1";
const char *test_dir2 = "test-dir2";
int res;
int fd;
struct stat statbuf;

res = mkdir(test_dir1, 0777);
if (res == -1)
err(1, "mkdir(\"%s\")", test_dir1);

res = mkdir(test_dir2, 0777);
if (res == -1)
err(1, "mkdir(\"%s\")", test_dir2);

fd = open(test_dir2, O_RDONLY);
if (fd == -1)
err(1, "open(\"%s\")", test_dir2);

res = rename(test_dir1, test_dir2);
if (res == -1)
err(1, "rename(\"%s\", \"%s\")", test_dir1, test_dir2);

res = fstat(fd, &statbuf);
if (res == -1)
err(1, "fstat(%i)", fd);

if (statbuf.st_nlink != 0) {
fprintf(stderr, "nlink is %lu, should be 0\n", statbuf.st_nlink);
return 1;
}

return 0;
}

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agox86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8
Dave Young [Tue, 26 Aug 2014 09:06:41 +0000 (17:06 +0800)]
x86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8

commit 3eddc69ffeba092d288c386646bfa5ec0fce25fd upstream.

3.16 kernel boot fail with earlyprintk=efi, it keeps scrolling at the
bottom line of screen.

Bisected, the first bad commit is below:
commit 86dfc6f339886559d80ee0d4bd20fe5ee90450f0
Author: Lv Zheng <lv.zheng@intel.com>
Date:   Fri Apr 4 12:38:57 2014 +0800

    ACPICA: Tables: Fix table checksums verification before installation.

I did some debugging by enabling both serial and efi earlyprintk, below is
some debug dmesg, seems early_ioremap fails in scroll up function due to
no free slot, see below dmesg output:

  WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:116 __early_ioremap+0x90/0x1c4()
  __early_ioremap(ed00c80000000c80) not found slot
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 3.17.0-rc1+ #204
  Hardware name: Hewlett-Packard HP Z420 Workstation/1589, BIOS J61 v03.15 05/09/2013
  Call Trace:
    dump_stack+0x4e/0x7a
    warn_slowpath_common+0x75/0x8e
    ? __early_ioremap+0x90/0x1c4
    warn_slowpath_fmt+0x47/0x49
    __early_ioremap+0x90/0x1c4
    ? sprintf+0x46/0x48
    early_ioremap+0x13/0x15
    early_efi_map+0x24/0x26
    early_efi_scroll_up+0x6d/0xc0
    early_efi_write+0x1b0/0x214
    call_console_drivers.constprop.21+0x73/0x7e
    console_unlock+0x151/0x3b2
    ? vprintk_emit+0x49f/0x532
    vprintk_emit+0x521/0x532
    ? console_unlock+0x383/0x3b2
    printk+0x4f/0x51
    acpi_os_vprintf+0x2b/0x2d
    acpi_os_printf+0x43/0x45
    acpi_info+0x5c/0x63
    ? __acpi_map_table+0x13/0x18
    ? acpi_os_map_iomem+0x21/0x147
    acpi_tb_print_table_header+0x177/0x186
    acpi_tb_install_table_with_override+0x4b/0x62
    acpi_tb_install_standard_table+0xd9/0x215
    ? early_ioremap+0x13/0x15
    ? __acpi_map_table+0x13/0x18
    acpi_tb_parse_root_table+0x16e/0x1b4
    acpi_initialize_tables+0x57/0x59
    acpi_table_init+0x50/0xce
    acpi_boot_table_init+0x1e/0x85
    setup_arch+0x9b7/0xcc4
    start_kernel+0x94/0x42d
    ? early_idt_handlers+0x120/0x120
    x86_64_start_reservations+0x2a/0x2c
    x86_64_start_kernel+0xf3/0x100

Quote reply from Lv.zheng about the early ioremap slot usage in this case:

"""
In early_efi_scroll_up(), 2 mapping entries will be used for the src/dst screen buffer.
In drivers/acpi/acpica/tbutils.c, we've improved the early table loading code in acpi_tb_parse_root_table().
We now need 2 mapping entries:
1. One mapping entry is used for RSDT table mapping. Each RSDT entry contains an address for another ACPI table.
2. For each entry in RSDP, we need another mapping entry to map the table to perform necessary check/override before installing it.

When acpi_tb_parse_root_table() prints something through EFI earlyprintk console, we'll have 4 mapping entries used.
The current 4 slots setting of early_ioremap() seems to be too small for such a use case.
"""

Thus increase the slot to 8 in this patch to fix this issue.
boot-time mappings become 512 page with this patch.

Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoKVM: x86: handle idiv overflow at kvm_write_tsc
Marcelo Tosatti [Wed, 12 Jun 2013 02:31:12 +0000 (23:31 -0300)]
KVM: x86: handle idiv overflow at kvm_write_tsc

commit 8915aa27d5efbb9185357175b0acf884325565f9 upstream.

Its possible that idivl overflows (due to large delta stored in usdiff,
valid scenario).

Create an exception handler to catch the overflow exception (division by zero
is protected by vcpu->arch.virtual_tsc_khz check), and interpret it accordingly
(delta is larger than USEC_PER_SEC).

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=969644

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Philipp Hahn <hahn@univention.de>
Tested-by: Philipp Hahn <hahn@univention.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoregmap: Fix handling of volatile registers for format_write() chips
Mark Brown [Tue, 26 Aug 2014 11:12:17 +0000 (12:12 +0100)]
regmap: Fix handling of volatile registers for format_write() chips

commit 5844a8b9d98ec11ce1d77610daacf3f0a0e14715 upstream.

A previous over-zealous factorisation of code means that we only treat
registers as volatile if they are readable. For most devices this is fine
since normally most registers can be read and volatility implies
readability but for format_write() devices where there is no readback from
the hardware and we use volatility to mean simply uncacheability this means
that we end up treating all registers as cacheble.

A bigger refactoring of the code to clarify this is in order but as a fix
make a minimal change and only check readability when checking volatility
if there is no format_write() operation defined for the device.

Signed-off-by: Mark Brown <broonie@linaro.org>
Tested-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoACPICA: Update to GPIO region handler interface.
Bob Moore [Tue, 23 Sep 2014 02:35:47 +0000 (10:35 +0800)]
ACPICA: Update to GPIO region handler interface.

commit 75ec6e55f1384548311a13ce4fcb39c516053314 upstream.

Changes to correct several GPIO issues:

1) The update_rule in a GPIO field definition is now ignored;
a read-modify-write operation is never performed for GPIO fields.
(Internally, this means that the field assembly/disassembly
code is completely bypassed for GPIO.)

2) The Address parameter passed to a GPIO region handler is
now the bit offset of the field from a previous Connection()
operator. Thus, it becomes a "Pin Number Index" into the
Connection() resource descriptor.

3) The bit_width parameter passed to a GPIO region handler is
now the exact bit width of the GPIO field. Thus, it can be
interpreted as "number of pins".

Overall, we can now say that the region handler interface
to GPIO handlers is a raw "bit/pin" addressed interface, not
a byte-addressed interface like the system_memory handler interface.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoMIPS: mcount: Adjust stack pointer for static trace in MIPS32
Markos Chandras [Tue, 16 Sep 2014 14:55:12 +0000 (15:55 +0100)]
MIPS: mcount: Adjust stack pointer for static trace in MIPS32

commit 8a574cfa2652545eb95595d38ac2a0bb501af0ae upstream.

Every mcount() call in the MIPS 32-bit kernel is done as follows:

[...]
move at, ra
jal _mcount
addiu sp, sp, -8
[...]

but upon returning from the mcount() function, the stack pointer
is not adjusted properly. This is explained in details in 58b69401c797
(MIPS: Function tracer: Fix broken function tracing).

Commit ad8c396936e3 ("MIPS: Unbreak function tracer for 64-bit kernel.)
fixed the stack manipulation for 64-bit but it didn't fix it completely
for MIPS32.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7792/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoMIPS: ZBOOT: add missing <linux/string.h> include
Aurelien Jarno [Sun, 20 Jul 2014 17:58:23 +0000 (19:58 +0200)]
MIPS: ZBOOT: add missing <linux/string.h> include

commit 29593fd5a8149462ed6fad0d522234facdaee6c8 upstream.

Commit dc4d7b37 (MIPS: ZBOOT: gather string functions into string.c)
moved the string related functions into a separate file, which might
cause the following build error, depending on the configuration:

| CC      arch/mips/boot/compressed/decompress.o
| In file included from linux/arch/mips/boot/compressed/../../../../lib/decompress_unxz.c:234:0,
|                  from linux/arch/mips/boot/compressed/decompress.c:67:
| linux/arch/mips/boot/compressed/../../../../lib/xz/xz_dec_stream.c: In function 'fill_temp':
| linux/arch/mips/boot/compressed/../../../../lib/xz/xz_dec_stream.c:162:2: error: implicit declaration of function 'memcpy' [-Werror=implicit-function-declaration]
| cc1: some warnings being treated as errors
| linux/scripts/Makefile.build:308: recipe for target 'arch/mips/boot/compressed/decompress.o' failed
| make[6]: *** [arch/mips/boot/compressed/decompress.o] Error 1
| linux/arch/mips/Makefile:308: recipe for target 'vmlinuz' failed

It does not fail with the standard configuration, as when
CONFIG_DYNAMIC_DEBUG is not enabled <linux/string.h> gets included in
include/linux/dynamic_debug.h. There might be other ways for it to
get indirectly included.

We can't add the include directly in xz_dec_stream.c as some
architectures might want to use a different version for the boot/
directory (see for example arch/x86/boot/string.h).

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7420/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoARM: 8165/1: alignment: don't break misaligned NEON load/store
Robin Murphy [Thu, 25 Sep 2014 10:56:19 +0000 (11:56 +0100)]
ARM: 8165/1: alignment: don't break misaligned NEON load/store

commit 5ca918e5e3f9df4634077c06585c42bc6a8d699a upstream.

The alignment fixup incorrectly decodes faulting ARM VLDn/VSTn
instructions (where the optional alignment hint is given but incorrect)
as LDR/STR, leading to register corruption. Detect these and correctly
treat them as unhandled, so that userspace gets the fault it expects.

Reported-by: Simon Hosie <simon.hosie@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>