GitHub/LineageOS/G12/android_kernel_amlogic_linux-4.9.git
13 years agonetfilter: fix race in conntrack between dump_table and destroy
Stephen Hemminger [Tue, 11 Jan 2011 22:54:42 +0000 (23:54 +0100)]
netfilter: fix race in conntrack between dump_table and destroy

The netlink interface to dump the connection tracking table has a race
when entries are deleted at the same time. A customer reported a crash
and the backtrace showed that ctnetlink_dump_table was running while a
conntrack entry was being destroyed.
(see https://bugzilla.vyatta.com/show_bug.cgi?id=6402).

According to RCU documentation, when using hlist_nulls the reader
must handle the case of seeing a deleted entry and not proceed
further down the linked list.  The old code would continue
which caused the scan to walk into the free list.

This patch uses locking (rather than RCU) for this operation, which
is guaranteed safe and no longer requires taking a reference during
the dump operation.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
13 years agonetfilter: x_tables: dont block BH while reading counters
Eric Dumazet [Mon, 10 Jan 2011 19:11:38 +0000 (20:11 +0100)]
netfilter: x_tables: dont block BH while reading counters

Using "iptables -L" with a lot of rules causes excessive BH latency.
Jesper mentioned ~6 ms and worried about frame drops.

Switch to a per_cpu seqlock scheme, so that taking a snapshot of
counters doesn't need to block BH (for this cpu, but also other cpus).

This adds two increments of the seqlock sequence per ipt_do_table() call,
which is a reasonable cost for letting "iptables -L" run without blocking BH
processing.
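
The seqlock idea can be sketched in user-space C11 roughly as follows. This is
a minimal single-writer illustration, not the kernel code touched by this
patch: struct counter_slot, slot_add() and slot_snapshot() are hypothetical
names, and sequentially consistent atomics are used for simplicity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-CPU slot: one sequence counter guarding two counters. */
struct counter_slot {
    atomic_uint seq;             /* odd while an update is in progress */
    _Atomic uint64_t packets;
    _Atomic uint64_t bytes;
};

/* Writer side (one writer per slot): two sequence increments per update. */
static void slot_add(struct counter_slot *s, uint64_t bytes)
{
    assert(s);
    atomic_fetch_add(&s->seq, 1);                      /* begin: seq is odd */
    atomic_store(&s->packets, atomic_load(&s->packets) + 1);
    atomic_store(&s->bytes, atomic_load(&s->bytes) + bytes);
    atomic_fetch_add(&s->seq, 1);                      /* end: seq is even */
}

/* Reader side: never blocks the writer, retries if an update raced with it. */
static void slot_snapshot(struct counter_slot *s,
                          uint64_t *packets, uint64_t *bytes)
{
    unsigned int start;

    assert(s && packets && bytes);
    do {
        start = atomic_load(&s->seq);
        *packets = atomic_load(&s->packets);
        *bytes = atomic_load(&s->bytes);
    } while ((start & 1u) || start != atomic_load(&s->seq));
}
```

The reader pays with a possible retry instead of making the writer wait, which
is the trade-off described above.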

Reported-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
13 years agoixgbe: update ntuple filter configuration
Alexander Duyck [Thu, 6 Jan 2011 14:29:59 +0000 (14:29 +0000)]
ixgbe: update ntuple filter configuration

This change fixes several issues found in ntuple filtering while I was
doing the ATR refactor.

Specifically I updated the masks to work correctly with the latest version
of ethtool, I cleaned up the exception handling and added detailed error
output when a filter is rejected, and corrected several bits that were set
incorrectly in ixgbe_type.h.

The previous version of this patch included a printk that was left over from
me fixing the filter setup.  This patch does not include that printk.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoixgbe: further flow director performance optimizations
Alexander Duyck [Thu, 6 Jan 2011 14:29:58 +0000 (14:29 +0000)]
ixgbe: further flow director performance optimizations

This change adds a compressed input type for atr signature hash
computation.  It also drops the use of the set functions when setting up
the ATR input since we can then directly setup the hash input as two dwords
that can be stored and passed as registers.

With these changes the cost of computing the hash is low enough that we can
perform a hash computation on each TCP SYN flagged packet allowing us to
drop the number of flow director misses considerably in tests such as
netperf TCP_CRR.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoixgbe: cleanup flow director hash computation to improve performance
Alexander Duyck [Thu, 6 Jan 2011 14:29:57 +0000 (14:29 +0000)]
ixgbe: cleanup flow director hash computation to improve performance

This change cleans up the layout of the flow director data, and the
algorithm used to calculate the hash resulting in a 35x / 3500% performance
increase versus the old flow director hash computation.  The overall effect
is only a 1% increase in transactions per second though due to the fact
that only 1 packet in 20 is actually hashed.

TCP_RR before:
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       60.00    23059.27
16384  87380

TCP_RR after:
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       60.00    23239.98
16384  87380

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoixgbe: make sure per Rx queue is disabled before unmapping the receive buffer
Yi Zou [Thu, 6 Jan 2011 14:29:56 +0000 (14:29 +0000)]
ixgbe: make sure per Rx queue is disabled before unmapping the receive buffer

When disabling the Rx logic globally, we also want to disable the per Rx
queue receive logic via the per queue Rx control register RXDCTL, so that no
more DMA happens from the packet buffer to the receive buffer associated with
the Rx ring before we start unmapping the Rx ring receive buffer. The hardware
may take up to 100us before the corresponding Rx queue is really disabled.
Added ixgbe_disable_rx_queue() for this purpose.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoe1000: Add support for the CE4100 reference platform
Dirk Brandewie [Thu, 6 Jan 2011 14:29:54 +0000 (14:29 +0000)]
e1000: Add support for the CE4100 reference platform

This patch adds support for the gigabit phys present on the CE4100 reference
platforms.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoe1000e: add custom set_d[0|3]_lplu_state function pointer for 82574
Bruce Allan [Thu, 6 Jan 2011 14:29:53 +0000 (14:29 +0000)]
e1000e: add custom set_d[0|3]_lplu_state function pointer for 82574

82574 needs to configure Low Power Link Up (or LPLU) differently than
the other parts in the 8257x family supported by the driver.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoe1000e: power off PHY after reset when interface is down
Bruce Allan [Thu, 6 Jan 2011 14:29:52 +0000 (14:29 +0000)]
e1000e: power off PHY after reset when interface is down

Some Phys supported by the driver do not remain powered off across a reset
of the device when the interface is down, e.g. on 82571, but not on 82574.
This patch powers down (only when WoL is disabled) the PHY after a reset if
the interface is down and the ethtool diagnostics are not currently running.

The ethtool diagnostic function required a minor re-factor as a result, and
the e1000_[get|put]_hw_control() functions are renamed since they are no
longer static to netdev.c as they are needed by the ethtool diagnostics.
A couple minor whitespace issues were cleaned up, too.

Reported-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoe1000e: use ether_crc_le() rather than re-write it
Bruce Allan [Thu, 6 Jan 2011 14:29:51 +0000 (14:29 +0000)]
e1000e: use ether_crc_le() rather than re-write it

For the 82579 jumbo frame workaround, there is no need to re-write the CRC
calculation functionality already found in the kernel's ether_crc_le().

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoe1000e: properly bounds-check string functions
Bruce Allan [Thu, 6 Jan 2011 14:29:50 +0000 (14:29 +0000)]
e1000e: properly bounds-check string functions

Use string functions with bounds checking rather than their non-bounds
checking counterparts, and do not hard code these boundaries.
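
As a rough user-space illustration of the idea (not the driver's code), the
sketch below derives the bound from the destination with sizeof() instead of
hard-coding it. struct drv_info and set_version() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a fixed-size driver info field. */
struct drv_info {
    char version[8];
};

/* Copy src into the field, truncating if needed; returns 1 if truncated. */
static int set_version(struct drv_info *info, const char *src)
{
    int n;

    assert(info && src);
    /* sizeof() tracks the array, so resizing the field needs no other change. */
    n = snprintf(info->version, sizeof(info->version), "%s", src);
    return n < 0 || (size_t)n >= sizeof(info->version);
}
```

The result is always NUL-terminated, and the caller can tell from the return
value whether the string had to be cut.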

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoe1000e: convert calls of ops.[read|write]_reg to e1e_[r|w]phy
Bruce Allan [Thu, 6 Jan 2011 14:29:49 +0000 (14:29 +0000)]
e1000e: convert calls of ops.[read|write]_reg to e1e_[r|w]phy

Cleans up the code a bit by using the driver-specific e1e_rphy and
e1e_wphy macros instead of the full function pointer variants.  Fix
a couple of whitespace issues with two already existing calls to e1e_wphy.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoe1000e: cleanup variables set but not used
Bruce Allan [Thu, 6 Jan 2011 14:29:48 +0000 (14:29 +0000)]
e1000e: cleanup variables set but not used

The ICR register is clear on read and we don't care what the returned value
is when resetting the hardware so the icr variable(s) can be removed.  We
should not ignore the return from e1000_lv_jumbo_workaround_ich8lan() and
from e1000_get_phy_id_82571() (dump a debug message when it fails and when
an unknown Phy id is returned).

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet offloading: Convert checksums to use centrally computed features.
Jesse Gross [Sun, 9 Jan 2011 06:23:35 +0000 (06:23 +0000)]
net offloading: Convert checksums to use centrally computed features.

In order to compute the features for other offloads (primarily
scatter/gather), we need to first check the ability of the NIC to
offload the checksum for the packet.  Since we have already computed
this, we can directly use the result instead of figuring it out
again.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet offloading: Convert skb_need_linearize() to use precomputed features.
Jesse Gross [Sun, 9 Jan 2011 06:23:34 +0000 (06:23 +0000)]
net offloading: Convert skb_need_linearize() to use precomputed features.

This switches skb_need_linearize() to use the features that have
been centrally computed.  In doing so, this fixes a problem where
scatter/gather should not be used because the card does not support
checksum offloading on that type of packet.  On device registration
we only check that some form of checksum offloading is available if
scatter/gather is enabled but we must also check at transmission
time.  Examples of this include IPv6 or vlan packets on a NIC that
only supports IPv4 offloading.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet offloading: Convert dev_gso_segment() to use precomputed features.
Jesse Gross [Sun, 9 Jan 2011 06:23:33 +0000 (06:23 +0000)]
net offloading: Convert dev_gso_segment() to use precomputed features.

This switches dev_gso_segment() to use the device features computed
by the centralized routine.  In doing so, it fixes a problem where
it would always use dev->features, instead of those appropriate
to the number of vlan tags if any are present.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet offloading: Pass features into netif_needs_gso().
Jesse Gross [Sun, 9 Jan 2011 06:23:32 +0000 (06:23 +0000)]
net offloading: Pass features into netif_needs_gso().

Now that there is a single function that can compute the device
features relevant to a packet, we don't want to run it for each
offload.  This converts netif_needs_gso() to take the features
of the device, rather than computing them itself.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet offloading: Generalize netif_get_vlan_features().
Jesse Gross [Sun, 9 Jan 2011 06:23:31 +0000 (06:23 +0000)]
net offloading: Generalize netif_get_vlan_features().

netif_get_vlan_features() is currently only used by netif_needs_gso(),
so it only concerns itself with GSO features.  However, several other
places also should take into account the contents of the packet when
deciding whether to offload to hardware.  This generalizes the function
to return features about all of the various forms of offloading.  Since
offloads tend to be linked together, this avoids duplicating the logic
in each location (i.e. the scatter/gather code also needs the checksum
logic).

Suggested-by: Michał Mirosław <mirqus@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet offloading: Accept NETIF_F_HW_CSUM for all protocols.
Jesse Gross [Sun, 9 Jan 2011 06:23:30 +0000 (06:23 +0000)]
net offloading: Accept NETIF_F_HW_CSUM for all protocols.

We currently only have software fallback for one type of checksum: the
TCP/UDP one's complement.  This means that a protocol that uses hardware
offloading for a different type of checksum (FCoE, SCTP) must directly
check the device's features and do the right thing ahead of time.  By
the time we get to dev_can_checksum(), we're only deciding whether to
apply the one algorithm in software or hardware.  NETIF_F_HW_CSUM has the
same capabilities as the software version, so we should always use it if
present.  The primary advantage of this is multiply tagged vlans can use
hardware checksumming.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agor8169: delay phy init until device opens.
françois romieu [Sat, 8 Jan 2011 02:17:26 +0000 (02:17 +0000)]
r8169: delay phy init until device opens.

This works around the 60s firmware load failure timeout for the
non-modular case.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: fix kernel-doc warning in core/filter.c
Randy Dunlap [Sat, 8 Jan 2011 17:41:42 +0000 (17:41 +0000)]
net: fix kernel-doc warning in core/filter.c

Fix new kernel-doc notation warning in net/core/filter.c:

Warning(net/core/filter.c:172): No description found for parameter 'fentry'
Warning(net/core/filter.c:172): Excess function parameter 'filter' description in 'sk_run_filter'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/sock.h: make some fields private to fix kernel-doc warning(s)
Randy Dunlap [Sat, 8 Jan 2011 17:39:21 +0000 (17:39 +0000)]
net/sock.h: make some fields private to fix kernel-doc warning(s)

Fix new kernel-doc notation warning in sock.h by annotating skc_dontcopy_*
as private fields.

Warning(include/net/sock.h:163): No description found for parameter 'skc_dontcopy_end[0]'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonetlink: test for all flags of the NLM_F_DUMP composite
Jan Engelhardt [Fri, 7 Jan 2011 03:15:05 +0000 (03:15 +0000)]
netlink: test for all flags of the NLM_F_DUMP composite

Because NLM_F_DUMP is composed of two bits, NLM_F_ROOT | NLM_F_MATCH,
"if (x & NLM_F_DUMP)" tests for _either_ of the bits being set.
Since NLM_F_MATCH's value overlaps with NLM_F_EXCL, non-dump
requests with NLM_F_EXCL set are mistaken for dump requests.

Substitute the condition to test for _all_ bits being set.
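
The difference between the two tests can be shown with a small C sketch. The
flag values are copied locally from <linux/netlink.h> so the snippet builds
anywhere; the helper names is_dump_any_bit() and is_dump_all_bits() are
illustrative only and are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies of the flag values from <linux/netlink.h>. */
#define NLM_F_ROOT   0x100
#define NLM_F_MATCH  0x200
#define NLM_F_DUMP   (NLM_F_ROOT | NLM_F_MATCH)
#define NLM_F_EXCL   0x200   /* same bit as NLM_F_MATCH */

static_assert(NLM_F_MATCH == NLM_F_EXCL, "the two flags share one bit");

/* Old test: true when either bit of the composite is set. */
static bool is_dump_any_bit(unsigned int flags)
{
    return (flags & NLM_F_DUMP) != 0;
}

/* Fixed test: true only when both bits of the composite are set. */
static bool is_dump_all_bits(unsigned int flags)
{
    return (flags & NLM_F_DUMP) == NLM_F_DUMP;
}
```

With flags set to NLM_F_EXCL alone, the first helper reports a dump request
and the second does not.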

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoforcedeth: Do not use legacy PCI power management
Rafael J. Wysocki [Fri, 7 Jan 2011 11:12:05 +0000 (11:12 +0000)]
forcedeth: Do not use legacy PCI power management

The forcedeth driver uses the legacy PCI power management, so it has
to do PCI-specific things in its ->suspend() and ->resume() callbacks
and some of them are not done correctly.

Convert forcedeth to the new PCI power management framework and make
it let the PCI subsystem take care of all the PCI-specific aspects of
device handling during system power transitions.

Tested with nVidia Corporation MCP55 Ethernet (rev a2).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6
David S. Miller [Mon, 10 Jan 2011 00:16:57 +0000 (16:16 -0800)]
Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6

13 years agosky2: convert to new VLAN model (v0.2)
Stephen Hemminger [Sun, 9 Jan 2011 23:54:15 +0000 (15:54 -0800)]
sky2: convert to new VLAN model (v0.2)

This converts sky2 to new VLAN offload flags control via ethtool.
It also allows for transmit offload of vlan tagged frames which
was not possible before.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosky2: fix limited auto negotiation
Stephen Hemminger [Thu, 6 Jan 2011 18:40:36 +0000 (18:40 +0000)]
sky2: fix limited auto negotiation

The sky2 driver would always try all possible supported speeds even
if the user only asked for a limited set of speed/duplex combinations.

Reported-by: Mohsen Hariri <m.hariri@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobnx2x: Fix the race on bp->stats_pending.
Vladislav Zolotarov [Sun, 9 Jan 2011 02:20:34 +0000 (02:20 +0000)]
bnx2x: Fix the race on bp->stats_pending.

Fix the race on bp->stats_pending between the timer and a LINK_UP event
handler.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobnx2x: Move to D0 before clearing MSI/MSI-X configuration.
Vladislav Zolotarov [Sun, 9 Jan 2011 02:20:19 +0000 (02:20 +0000)]
bnx2x: Move to D0 before clearing MSI/MSI-X configuration.

Move to D0 before clearing MSI/MSI-X configuration. Otherwise MSI/MSI-X
won't be cleared.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobnx2x: registers dump fixes
Vladislav Zolotarov [Sun, 9 Jan 2011 02:20:04 +0000 (02:20 +0000)]
bnx2x: registers dump fixes

Fixes in registers dump:
        - Properly calculate dump length for 57712.
        - Prevent HW blocks parity attentions when dumping registers in order to
prevent false parity errors handling.
        - Update the bnx2x_dump.h file: old one had a few bugs that could cause
fatal HW error as a result of a registers dump.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobnx2x: Don't prevent RSS configuration in INT#x and MSI interrupt modes.
Vladislav Zolotarov [Sun, 9 Jan 2011 02:19:40 +0000 (02:19 +0000)]
bnx2x: Don't prevent RSS configuration in INT#x and MSI interrupt modes.

Don't prevent RSS configuration in INT#x and MSI interrupt modes. Otherwise
Rx hash key won't be available.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMadge Ambassador ATM Adapter driver: Always release_firmware() in ucode_init() and...
Jesper Juhl [Sun, 9 Jan 2011 11:32:38 +0000 (11:32 +0000)]
Madge Ambassador ATM Adapter driver: Always release_firmware() in ucode_init() and don't leak memory.

Failure to call release_firmware() will result in a memory leak in
drivers/atm/ambassador.c::ucode_init().
This patch makes sure we always call release_firmware() when needed,
thus removing the leak(s).
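
The general shape of such a fix is a single release point that every exit
path passes through. The sketch below is a user-space analogue under stated
assumptions: fake_request(), fake_release() and load_ucode() are hypothetical
stand-ins, not the kernel firmware API or the driver's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for request_firmware()/release_firmware(). */
struct fake_fw {
    size_t size;
    unsigned char *data;
};

static int live_fw;   /* images currently held, to make leaks visible */

static int fake_request(struct fake_fw **fw, size_t size)
{
    struct fake_fw *f = calloc(1, sizeof(*f));

    if (!f)
        return -1;
    f->data = calloc(size ? size : 1, 1);
    if (!f->data) {
        free(f);
        return -1;
    }
    f->size = size;
    live_fw++;
    *fw = f;
    return 0;
}

static void fake_release(struct fake_fw *fw)
{
    assert(fw);
    free(fw->data);
    free(fw);
    live_fw--;
}

/* Every exit after a successful request goes through the "out" label. */
static int load_ucode(size_t size, size_t min_size)
{
    struct fake_fw *fw;
    int err = fake_request(&fw, size);

    if (err)
        return err;            /* nothing to release yet */

    if (fw->size < min_size) {
        err = -2;              /* image too short: still released below */
        goto out;
    }
    /* ... parse and upload the image here ... */
    err = 0;
out:
    fake_release(fw);
    return err;
}
```

After both the success path and the error path, live_fw is back to zero.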

Yes, I know checkpatch complains about this patch, but it was either that
or completely mess up the existing style, so I opted to use the existing
style and live with the checkpatch related flak.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agohamradio: Resolve memory leak due to missing firmware release in add_mcs()
Jesper Juhl [Thu, 6 Jan 2011 10:50:29 +0000 (10:50 +0000)]
hamradio: Resolve memory leak due to missing firmware release in add_mcs()

Failure to release_firmware() in drivers/net/hamradio/yam.c::add_mcs()
causes a memory leak.
This patch should fix it.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/fec: add dual fec support for mx28
Shawn Guo [Wed, 5 Jan 2011 21:13:13 +0000 (21:13 +0000)]
net/fec: add dual fec support for mx28

This patch is to add mx28 dual fec support. Here are some key notes
for mx28 fec controller.

 - The mx28 fec controller, named ENET-MAC, is a different IP from the FEC
   used on other i.mx variants.  But they are basically compatible
   at the software interface, so it's possible to share the same driver.
 - The ENET-MAC design on mx28 made an improper assumption that it runs
   on a big-endian system. As a result, the driver has to swap every
   frame going to and coming from the controller.
 - The external phys can only be configured by fec0, which means fec1
   cannot work independently and both phys need to be configured by
   the mii_bus attached to fec0.
 - An ENET-MAC reset also resets the mac address registers.
 - ENET-MAC MII/RMII mode and 10M/100M speed are configured
   differently from FEC.
 - The ETHER_EN bit must be set for ENET-MAC interrupts to work.
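
The frame swap mentioned in the notes above might look roughly like the
following sketch. It is only an illustration of swapping each 32-bit word in
place; swab32_word() and swap_frame() are hypothetical names, and the driver's
actual helper is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t swab32_word(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Swap every 32-bit word of a frame buffer in place.  len is in bytes and
 * is rounded up to whole words, so the buffer must be padded to a multiple
 * of 4 bytes.
 */
static void swap_frame(uint32_t *buf, size_t len)
{
    size_t words = (len + 3) / 4;

    assert(buf);
    for (size_t i = 0; i < words; i++)
        buf[i] = swab32_word(buf[i]);
}
```

Applying swap_frame() twice to the same buffer restores the original
contents, so the same routine can serve both the transmit and receive paths.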

Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/fec: improve pm for better suspend/resume
Shawn Guo [Wed, 5 Jan 2011 21:13:12 +0000 (21:13 +0000)]
net/fec: improve pm for better suspend/resume

The following commit made a fix to use fec_enet_open/fec_enet_close
over fec_enet_init/fec_stop for suspend/resume, because fec_enet_init
does not allow to have a working network interface at resume.

  e3fe8558c7fc182972c3d947d88744482111f304
  net/fec: fix pm to survive to suspend/resume

This fix works for the i.mx/mxc fec controller, but fails on mx28 fec,
which has a different interrupt logic design. On i.mx fec, an interrupt
can be triggered even if bit ETHER_EN of the ECR register is not set. But
on mx28 fec, ETHER_EN must be set for interrupts to work. Meanwhile, the
MII interrupt is mandatory to resume the driver, because MDIO
read/write was changed to interrupt mode by the commit below.

  97b72e4320a9aaa4a7f1592ee7d2da7e2c9bd349
  fec: use interrupt for MDIO completion indication

fec_restart/fec_stop comes out as the solution working for both
cases.

Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/fec: add mac field into platform data and consolidate fec_get_mac
Shawn Guo [Wed, 5 Jan 2011 21:13:11 +0000 (21:13 +0000)]
net/fec: add mac field into platform data and consolidate fec_get_mac

Add a mac field to fec_platform_data and consolidate function
fec_get_mac to get the mac address in the following order.

 1) module parameter via kernel command line fec.macaddr=0x00,0x04,...
 2) from flash in case of CONFIG_M5272 or fec_platform_data mac
    field for others, which typically have mac stored in fuse
 3) fec mac address registers set by bootloader
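
The lookup order above amounts to a first-match fallback chain, which could be
sketched as below. This is an illustration only: pick_mac() and mac_is_set()
are hypothetical names, and treating an all-zero address as "not provided" is
an assumption of the sketch rather than something stated in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Assumption: an all-zero (or missing) address means "not provided". */
static bool mac_is_set(const uint8_t *mac)
{
    static const uint8_t zero[MAC_LEN];

    return mac && memcmp(mac, zero, MAC_LEN) != 0;
}

/*
 * Pick the first usable address in priority order:
 * 1) module parameter, 2) flash/platform data, 3) registers from bootloader.
 * Returns the 1-based index of the source used, or 0 if none was usable.
 */
static int pick_mac(uint8_t *out, const uint8_t *param,
                    const uint8_t *pdata, const uint8_t *regs)
{
    const uint8_t *src[] = { param, pdata, regs };

    assert(out);
    for (int i = 0; i < 3; i++) {
        if (mac_is_set(src[i])) {
            memcpy(out, src[i], MAC_LEN);
            return i + 1;
        }
    }
    return 0;
}
```

Returning the source index makes it easy for a caller to log where the
address came from.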

Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/fec: remove the use of "index" which is legacy
Shawn Guo [Wed, 5 Jan 2011 21:13:10 +0000 (21:13 +0000)]
net/fec: remove the use of "index" which is legacy

The "index" becomes legacy since fep->pdev->id starts working
to identify the instance.

Moreover, the call of fec_enet_init(ndev, 0) always passes 0
to fep->index. This makes the following code in fec_get_mac buggy.

	/* Adjust MAC if using default MAC address */
	if (iap == fec_mac_default)
		dev->dev_addr[ETH_ALEN-1] = fec_mac_default[ETH_ALEN-1] + fep->index;

It may be the time to remove "index" and use fep->pdev->id instead.

Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/fec: fix MMFR_OP type in fec_enet_mdio_write
Shawn Guo [Wed, 5 Jan 2011 21:13:09 +0000 (21:13 +0000)]
net/fec: fix MMFR_OP type in fec_enet_mdio_write

FEC_MMFR_OP_WRITE should be used rather than FEC_MMFR_OP_READ in
an mdio write operation.

It's probably a typo introduced by commit:

e6b043d512fa8d9a3801bf5d72bfa3b8fc3b3cc8
netdev/fec.c: add phylib supporting to enable carrier detection (v2)

Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/hfsplus
Linus Torvalds [Sat, 8 Jan 2011 01:16:27 +0000 (17:16 -0800)]
Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/hfsplus

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/hfsplus:
  hfsplus: %L-to-%ll, macro correction, and remove unneeded braces
  hfsplus: spaces/indentation clean-up
  hfsplus: C99 comments clean-up
  hfsplus: over 80 character lines clean-up
  hfsplus: fix an artifact in ioctl flag checking
  hfsplus: flush disk caches in sync and fsync
  hfsplus: optimize fsync
  hfsplus: split up inode flags
  hfsplus: write up fsync for directories
  hfsplus: simplify fsync
  hfsplus: avoid useless work in hfsplus_sync_fs
  hfsplus: make sure sync writes out all metadata
  hfsplus: use raw bio access for partition tables
  hfsplus: use raw bio access for the volume headers
  hfsplus: always use hfsplus_sync_fs to write the volume header
  hfsplus: silence a few debug printks
  hfsplus: fix option parsing during remount

Fix up conflicts due to VFS changes in fs/hfsplus/{hfsplus_fs.h,unicode.c}

13 years agoMerge branch 'next-spi' of git://git.secretlab.ca/git/linux-2.6
Linus Torvalds [Sat, 8 Jan 2011 01:08:46 +0000 (17:08 -0800)]
Merge branch 'next-spi' of git://git.secretlab.ca/git/linux-2.6

* 'next-spi' of git://git.secretlab.ca/git/linux-2.6: (77 commits)
  spi/omap: Fix DMA API usage in OMAP MCSPI driver
  spi/imx: correct the test on platform_get_irq() return value
  spi/topcliff: Typo fix threhold to threshold
  spi/dw_spi Typo change diable to disable.
  spi/fsl_espi: change the read behaviour of the SPIRF
  spi/mpc52xx-psc-spi: move probe/remove to proper sections
  spi/dw_spi: add DMA support
  spi/dw_spi: change to EXPORT_SYMBOL_GPL for exported APIs
  spi/dw_spi: Fix too short timeout in spi polling loop
  spi/pl022: convert running variable
  spi/pl022: convert busy flag to a bool
  spi/pl022: pass the returned sglen to the DMA engine
  spi/pl022: map the buffers on the DMA engine
  spi/topcliff_pch: Fix data transfer issue
  spi/imx: remove autodetection
  spi/pxa2xx: pass of_node to spi device and set a parent device
  spi/pxa2xx: Modify RX-Tresh instead of busy-loop for the remaining RX bytes.
  spi/pxa2xx: Add chipselect support for Sodaville
  spi/pxa2xx: Consider CE4100's FIFO depth
  spi/pxa2xx: Add CE4100 support
  ...

13 years agoMerge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Linus Torvalds [Sat, 8 Jan 2011 01:02:58 +0000 (17:02 -0800)]
Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu

* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (30 commits)
  gameport: use this_cpu_read instead of lookup
  x86: udelay: Use this_cpu_read to avoid address calculation
  x86: Use this_cpu_inc_return for nmi counter
  x86: Replace uses of current_cpu_data with this_cpu ops
  x86: Use this_cpu_ops to optimize code
  vmstat: User per cpu atomics to avoid interrupt disable / enable
  irq_work: Use per cpu atomics instead of regular atomics
  cpuops: Use cmpxchg for xchg to avoid lock semantics
  x86: this_cpu_cmpxchg and this_cpu_xchg operations
  percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support
  percpu,x86: relocate this_cpu_add_return() and friends
  connector: Use this_cpu operations
  xen: Use this_cpu_inc_return
  taskstats: Use this_cpu_ops
  random: Use this_cpu_inc_return
  fs: Use this_cpu_inc_return in buffer.c
  highmem: Use this_cpu_xx_return() operations
  vmstat: Use this_cpu_inc_return for vm statistics
  x86: Support for this_cpu_add, sub, dec, inc_return
  percpu: Generic support for this_cpu_add, sub, dec, inc_return
  ...

Fixed up conflicts: in arch/x86/kernel/{apic/nmi.c, apic/x2apic_uv_x.c, process.c}
as per Tejun.

13 years agoMerge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Linus Torvalds [Sat, 8 Jan 2011 00:58:04 +0000 (16:58 -0800)]
Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits)
  usb: don't use flush_scheduled_work()
  speedtch: don't abuse struct delayed_work
  media/video: don't use flush_scheduled_work()
  media/video: explicitly flush request_module work
  ioc4: use static work_struct for ioc4_load_modules()
  init: don't call flush_scheduled_work() from do_initcalls()
  s390: don't use flush_scheduled_work()
  rtc: don't use flush_scheduled_work()
  mmc: update workqueue usages
  mfd: update workqueue usages
  dvb: don't use flush_scheduled_work()
  leds-wm8350: don't use flush_scheduled_work()
  mISDN: don't use flush_scheduled_work()
  macintosh/ams: don't use flush_scheduled_work()
  vmwgfx: don't use flush_scheduled_work()
  tpm: don't use flush_scheduled_work()
  sonypi: don't use flush_scheduled_work()
  hvsi: don't use flush_scheduled_work()
  xen: don't use flush_scheduled_work()
  gdrom: don't use flush_scheduled_work()
  ...

Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c
as per Tejun.

13 years agoMerge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 7 Jan 2011 22:55:48 +0000 (14:55 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Constify function scope static struct sched_param usage
  sched: Fix strncmp operation
  sched: Move sched_autogroup_exit() to free_signal_struct()
  sched: Fix struct autogroup memory leak
  sched: Mark autogroup_init() __init
  sched: Consolidate the name of root_task_group and init_task_group

13 years agoMerge branch 'x86-apic-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Fri, 7 Jan 2011 22:55:31 +0000 (14:55 -0800)]
Merge branch 'x86-apic-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-apic-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: apic: Cleanup and simplify setup_local_APIC()
  x86: Further simplify mp_irq info handling
  x86: Unify 3 similar ways of saving mp_irqs info
  x86, ioapic: Avoid writing io_apic id if already correct
  x86, x2apic: Don't map lapic addr for preenabled x2apic systems
  x86, sfi: Use register_lapic_address()
  x86, apic: Use register_lapic_address() in init_apic_mapping()
  x86, apic: Remove early_init_lapic_mapping()
  x86, apic: Unify identical register_lapic_address() functions

13 years agoMerge branch 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Linus Torvalds [Fri, 7 Jan 2011 22:54:03 +0000 (14:54 -0800)]
Merge branch 'mce-for-linus' of git://git./linux/kernel/git/bp/bp

* 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC, MCE: Fix NB error formatting
  EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit
  EDAC, MCE: Enable MCE decoding on F15h
  EDAC, MCE: Allow F15h bank 6 MCE injection
  EDAC, MCE: Shorten error report formatting
  EDAC, MCE: Overhaul error fields extraction macros
  EDAC, MCE: Add F15h FP MCE decoder
  EDAC, MCE: Add F15 EX MCE decoder
  EDAC, MCE: Add an F15h NB MCE decoder
  EDAC, MCE: No F15h LS MCE decoder
  EDAC, MCE: Add F15h CU MCE decoder
  EDAC, MCE: Add F15h IC MCE decoder
  EDAC, MCE: Add F15h DC MCE decoder
  EDAC, MCE: Select extended error code mask

13 years agoMerge branch 'edac-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Linus Torvalds [Fri, 7 Jan 2011 22:53:42 +0000 (14:53 -0800)]
Merge branch 'edac-for-linus' of git://git./linux/kernel/git/bp/bp

* 'edac-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  amd64_edac: Disable DRAM ECC injection on K8
  EDAC: Fixup scrubrate manipulation
  amd64_edac: Remove two-stage initialization
  amd64_edac: Check ECC capabilities initially
  amd64_edac: Carve out ECC-related hw settings
  amd64_edac: Remove PCI ECS enabling functions
  amd64_edac: Remove explicit Kconfig PCI dependency
  amd64_edac: Allocate driver instances dynamically
  amd64_edac: Rework printk macros
  amd64_edac: Rename CPU PCI devices
  amd64_edac: Concentrate per-family init even more
  amd64_edac: Cleanup the CPU PCI device reservation
  amd64_edac: Simplify CPU family detection
  amd64_edac: Add per-family init function
  amd64_edac: Use cached extended CPU model
  amd64_edac: Remove F11h support

13 years agoMerge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
Linus Torvalds [Fri, 7 Jan 2011 22:50:50 +0000 (14:50 -0800)]
Merge branch 'for-linus' of git://git390.marist.edu/linux-2.6

* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (65 commits)
  [S390] prevent unnecessary loops_per_jiffy recalculation
  [S390] cpuinfo: use get_online_cpus() instead of preempt_disable()
  [S390] smp: remove cpu hotplug messages
  [S390] mutex: enable spinning mutex on s390
  [S390] mutex: Introduce arch_mutex_cpu_relax()
  [S390] cio: fix ccwgroup unregistration race condition
  [S390] perf: add DWARF register lookup for s390
  [S390] cleanup ftrace backend functions
  [S390] ptrace cleanup
  [S390] smp/idle: call init_idle() before starting a new cpu
  [S390] smp: delay idle task creation
  [S390] dasd: Correct retry counter for terminated I/O.
  [S390] dasd: Add support for raw ECKD access.
  [S390] dasd: Prevent deadlock during suspend/resume.
  [S390] dasd: Improve handling of stolen DASD reservation
  [S390] dasd: do path verification for paths added at runtime
  [S390] dasd: add High Performance FICON multitrack support
  [S390] cio: reduce memory consumption of itcw structures
  [S390] nmi: enable machine checks early
  [S390] qeth: buffer count imbalance
  ...

13 years agoMerge branch 'rmobile-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal...
Linus Torvalds [Fri, 7 Jan 2011 22:50:14 +0000 (14:50 -0800)]
Merge branch 'rmobile-latest' of git://git./linux/kernel/git/lethal/sh-2.6

* 'rmobile-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (67 commits)
  ARM: mach-shmobile: update for SMP changes.
  ARM: mach-shmobile: update for GIC changes.
  ARM: mach-shmobile: Fix up clkdev fallout for SH73A0.
  dma: shdma: don't register the global die notifier multiple times
  ARM: mach-shmobile: Rely on run-time IRQ handlers
  ARM: mach-shmobile: Run-time IRQ handler for GIC
  ARM: mach-shmobile: Run-time IRQ handler for INTCA
  ARM: mach-shmobile: Enable CONFIG_MULTI_IRQ_HANDLER
  ARM: mach-shmobile: Use shared GIC entry macros
  ARM: mach-shmobile: mackerel: Add zboot support
  ARM: mach-shmobile: mackerel: Add HDMI sound support
  ARM: mach-shmobile: mackerel: add HDMI video support
  ARM: mach-shmobile: ap4evb: fixup clk_put timing of fsib_clk
  ARM: mach-shmobile: sh73a0: fix div4 table
  ARM: mach-shmobile: ap4/mackerel: modify wrong comment out of USB
  ARM: mach-shmobile: Mackerel VGA camera support
  mmc: sh_mmcif: make DMA support by the driver unconditional
  ARM: mach-shmobile: Add eMMC support through MMCIF on AG5EVM
  ARM: mach-shmobile: Use pullups for AG5EVM KEYSC pins
  ARM: mach-shmobile: sh73a0 GPIO pullup improvement
  ...

13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Fri, 7 Jan 2011 22:45:47 +0000 (14:45 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (58 commits)
  Input: wacom_w8001 - support pen or touch only devices
  Input: wacom_w8001 - use __set_bit to set keybits
  Input: bu21013_ts - fix misuse of logical operation in place of bitop
  Input: i8042 - add Acer Aspire 5100 to the Dritek list
  Input: wacom - add support for digitizer in Lenovo W700
  Input: psmouse - disable the synaptics extension on OLPC machines
  Input: psmouse - fix up Synaptics comment
  Input: synaptics - ignore bogus mt packet
  Input: synaptics - add multi-finger and semi-mt support
  Input: synaptics - report clickpad property
  input: mt: Document interface updates
  Input: fix double equality sign in uevent
  Input: introduce device properties
  hid: egalax: Add support for Wetab (726b)
  Input: include MT library as source for kerneldoc
  MAINTAINERS: Update input-mt entry
  hid: egalax: Add support for Samsung NB30 netbook
  hid: egalax: Document the new devices in Kconfig
  hid: egalax: Add support for Wetab
  hid: egalax: Convert to MT slots
  ...

Fixed up trivial conflict in drivers/input/keyboard/Kconfig

13 years agoMerge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
Linus Torvalds [Fri, 7 Jan 2011 22:39:20 +0000 (14:39 -0800)]
Merge branch 'tty-next' of git://git./linux/kernel/git/gregkh/tty-2.6

* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (36 commits)
  serial: apbuart: Fixup apbuart_console_init()
  TTY: Add tty ioctl to figure device node of the system console.
  tty: add 'active' sysfs attribute to tty0 and console device
  drivers: serial: apbuart: Handle OF failures gracefully
  Serial: Avoid unbalanced IRQ wake disable during resume
  tty: fix typos/errors in tty_driver.h comments
  pch_uart : fix warnings for 64bit compile
  8250: fix uninitialized FIFOs
  ip2: fix compiler warning on ip2main_pci_tbl
  specialix: fix compiler warning on specialix_pci_tbl
  rocket: fix compiler warning on rocket_pci_ids
  8250: add a UPIO_DWAPB32 for 32 bit accesses
  8250: use container_of() instead of casting
  serial: omap-serial: Add support for kernel debugger
  serial: fix pch_uart kconfig & build
  drivers: char: hvc: add arm JTAG DCC console support
  RS485 documentation: add 16C950 UART description
  serial: ifx6x60: fix memory leak
  serial: ifx6x60: free IRQ on error
  Serial: EG20T: add PCH_UART driver
  ...

Fixed up conflicts in drivers/serial/apbuart.c with evil merge that
makes the code look fairly sane (unlike either side).

13 years agoMerge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
Linus Torvalds [Fri, 7 Jan 2011 21:16:28 +0000 (13:16 -0800)]
Merge branch 'usb-next' of git://git./linux/kernel/git/gregkh/usb-2.6

* 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (144 commits)
  USB: add support for Dream Cheeky DL100B Webmail Notifier (1d34:0004)
  USB: serial: ftdi_sio: add support for TIOCSERGETLSR
  USB: ehci-mxc: Setup portsc register prior to accessing OTG viewport
  USB: atmel_usba_udc: fix freeing irq in usba_udc_remove()
  usb: ehci-omap: fix tll channel enable mask
  usb: ohci-omap3: fix trivial typo
  USB: gadget: ci13xxx: don't assume that PAGE_SIZE is 4096
  USB: gadget: ci13xxx: fix complete() callback for no_interrupt rq's
  USB: gadget: update ci13xxx to work with g_ether
  USB: gadgets: ci13xxx: fix probing of compiled-in gadget drivers
  Revert "USB: musb: pm: don't rely fully on clock support"
  Revert "USB: musb: blackfin: pm: make it work"
  USB: uas: Use GFP_NOIO instead of GFP_KERNEL in I/O submission path
  USB: uas: Ensure we only bind to a UAS interface
  USB: uas: Rename sense pipe and sense urb to status pipe and status urb
  USB: uas: Use kzalloc instead of kmalloc
  USB: uas: Fix up the Sense IU
  usb: musb: core: kill unneeded #include's
  DA8xx: assign name to MUSB IRQ resource
  usb: gadget: g_ncm added
  ...

Manually fix up trivial conflicts in USB Kconfig changes in:
arch/arm/mach-omap2/Kconfig
arch/sh/Kconfig
drivers/usb/Kconfig
drivers/usb/host/ehci-hcd.c
and annoying chip clock data conflicts in:
arch/arm/mach-omap2/clock3xxx_data.c
arch/arm/mach-omap2/clock44xx_data.c

13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
Linus Torvalds [Fri, 7 Jan 2011 20:47:02 +0000 (12:47 -0800)]
Merge git://git./linux/kernel/git/jejb/scsi-misc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (147 commits)
  [SCSI] arcmsr: fix write to device check
  [SCSI] lpfc: lower stack use in lpfc_fc_frame_check
  [SCSI] eliminate an unnecessary local variable from scsi_remove_target()
  [SCSI] libiscsi: use bh locking instead of irq with session lock
  [SCSI] libiscsi: do not take host lock in queuecommand
  [SCSI] be2iscsi: fix null ptr when accessing task hdr
  [SCSI] be2iscsi: fix gfp use in alloc_pdu
  [SCSI] libiscsi: add more informative failure message during iscsi scsi eh
  [SCSI] gdth: Add missing call to gdth_ioctl_free
  [SCSI] bfa: remove unused definitions and misc cleanups
  [SCSI] bfa: remove inactive functions
  [SCSI] bfa: replace bfa_assert with WARN_ON
  [SCSI] qla2xxx: Use sg_next to fetch next sg element while walking sg list.
  [SCSI] qla2xxx: Fix to avoid recursive lock failure during BSG timeout.
  [SCSI] qla2xxx: Remove code to not reset ISP82xx on failure.
  [SCSI] qla2xxx: Display mailbox register 4 during 8012 AEN for ISP82XX parts.
  [SCSI] qla2xxx: Don't perform a BIG_HAMMER if Get-ID (0x20) mailbox command fails on CNAs.
  [SCSI] qla2xxx: Remove redundant module parameter permission bits
  [SCSI] qla2xxx: Add sysfs node for displaying board temperature.
  [SCSI] qla2xxx: Code cleanup to remove unwanted comments and code.
  ...

13 years agoinput/tc3589x: fix compile error
Dan Carpenter [Fri, 7 Jan 2011 19:47:37 +0000 (20:47 +0100)]
input/tc3589x: fix compile error

There was a semi-colon missing and it broke the compile.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agospi/omap: Fix DMA API usage in OMAP MCSPI driver
Russell King - ARM Linux [Fri, 7 Jan 2011 15:49:20 +0000 (15:49 +0000)]
spi/omap: Fix DMA API usage in OMAP MCSPI driver

Running the latest kernel on the 4430SDP board with DMA API debugging
enabled results in this:

WARNING: at lib/dma-debug.c:803 check_unmap+0x19c/0x6f0()
NULL NULL: DMA-API: device driver tries to free DMA memory it has not allocated
[device address=0x000000008129901a] [size=260 bytes]
Modules linked in:
Backtrace:
[<c003cbe0>] (dump_backtrace+0x0/0x10c) from [<c0278da8>] (dump_stack+0x18/0x1c)
 r7:c1839dc0 r6:c0198578 r5:c0304b17 r4:00000323
[<c0278d90>] (dump_stack+0x0/0x1c) from [<c005b158>] (warn_slowpath_common+0x58/0x70)
[<c005b100>] (warn_slowpath_common+0x0/0x70) from [<c005b214>] (warn_slowpath_fmt+0x38/0x40)
 r8:c1839e40 r7:00000000 r6:00000104 r5:00000000 r4:8129901a
[<c005b1dc>] (warn_slowpath_fmt+0x0/0x40) from [<c0198578>] (check_unmap+0x19c/0x6f0)
 r3:c03110de r2:c0304e6b
[<c01983dc>] (check_unmap+0x0/0x6f0) from [<c0198cd8>] (debug_dma_unmap_page+0x74/0x80)
[<c0198c64>] (debug_dma_unmap_page+0x0/0x80) from [<c01d5ad8>] (omap2_mcspi_work+0x514/0xbf0)
[<c01d55c4>] (omap2_mcspi_work+0x0/0xbf0) from [<c006dfb0>] (process_one_work+0x294/0x400)
[<c006dd1c>] (process_one_work+0x0/0x400) from [<c006e50c>] (worker_thread+0x220/0x3f8)
[<c006e2ec>] (worker_thread+0x0/0x3f8) from [<c00738d0>] (kthread+0x88/0x90)
[<c0073848>] (kthread+0x0/0x90) from [<c005e924>] (do_exit+0x0/0x5fc)
 r7:00000013 r6:c005e924 r5:c0073848 r4:c1829ee0
---[ end trace 1b75b31a2719ed20 ]---

I've no idea why this driver uses NULL for dma_unmap_single instead of
the &spi->dev that is lying around just waiting to be used in that
function - but it's an easy fix.

Also replace this comment with a FIXME comment:
                /* Do DMA mapping "early" for better error reporting and
                 * dcache use.  Note that if dma_unmap_single() ever starts
                 * to do real work on ARM, we'd need to clean up mappings
                 * for previous transfers on *ALL* exits of this loop...
                 */
as the comment is not true - we do work in dma_unmap() functions,
particularly on ARMv6 and above.  I've corrected the existing unmap
functions but if any others are required they must be added ASAP.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
13 years agospi/imx: correct the test on platform_get_irq() return value
Richard Genoud [Fri, 7 Jan 2011 14:26:01 +0000 (15:26 +0100)]
spi/imx: correct the test on platform_get_irq() return value

The test "if (spi_imx->irq <= 0)" is not testing the IRQ value, but
the return value of platform_get_irq().  As platform_get_irq() can
return an error (-ENXIO) or the IRQ value it found, the test should be
"if (spi_imx->irq < 0)"

[grant.likely: Note: In general, Linux irq number 0 should also mean
no irq, but arm still allows devices to be assigned 0, and the imx
platform uses 0 for one of the spi devices, so this patch is needed
for the device to work]

Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
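
The difference between the two tests in the spi/imx fix above can be
illustrated with a small stand-alone sketch. The names demo_get_irq(),
irq_lookup_failed_old() and irq_lookup_failed_new() are hypothetical and
only mimic the return convention described in the commit message (IRQ
number on success, negative errno on failure); this is not the driver code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a platform_get_irq()-style lookup: returns
 * the IRQ number on success (0 is a valid number on this imaginary
 * platform) or a negative errno value when no IRQ is found. */
int demo_get_irq(bool present, int irq)
{
	return present ? irq : -ENXIO;
}

/* Old test: wrongly treats a valid IRQ 0 as a failure. */
bool irq_lookup_failed_old(int ret)
{
	return ret <= 0;
}

/* Corrected test: only negative return values are errors. */
bool irq_lookup_failed_new(int ret)
{
	return ret < 0;
}
```

With a device assigned IRQ 0, the old test rejects the device while the
corrected test accepts it; both still reject -ENXIO.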
13 years agoMerge branch 'vfs-scale-working' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 7 Jan 2011 16:56:33 +0000 (08:56 -0800)]
Merge branch 'vfs-scale-working' of git://git./linux/kernel/git/npiggin/linux-npiggin

* 'vfs-scale-working' of git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin: (57 commits)
  fs: scale mntget/mntput
  fs: rename vfsmount counter helpers
  fs: implement faster dentry memcmp
  fs: prefetch inode data in dcache lookup
  fs: improve scalability of pseudo filesystems
  fs: dcache per-inode inode alias locking
  fs: dcache per-bucket dcache hash locking
  bit_spinlock: add required includes
  kernel: add bl_list
  xfs: provide simple rcu-walk ACL implementation
  btrfs: provide simple rcu-walk ACL implementation
  ext2,3,4: provide simple rcu-walk ACL implementation
  fs: provide simple rcu-walk generic_check_acl implementation
  fs: provide rcu-walk aware permission i_ops
  fs: rcu-walk aware d_revalidate method
  fs: cache optimise dentry and inode for rcu-walk
  fs: dcache reduce branches in lookup path
  fs: dcache remove d_mounted
  fs: fs_struct use seqlock
  fs: rcu-walk for path lookup
  ...

13 years agosched: Constify function scope static struct sched_param usage
Peter Zijlstra [Fri, 7 Jan 2011 12:41:40 +0000 (13:41 +0100)]
sched: Constify function scope static struct sched_param usage

Function-scope statics are discouraged because they are
easily overlooked and can cause subtle bugs/races due to
their global (non-SMP safe) nature.

Linus noticed that we did this for sched_param - at minimum
make them const.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: Message-ID: <AANLkTinotRxScOHEb0HgFgSpGPkq_6jKTv5CfvnQM=ee@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agosched: Fix strncmp operation
Hillf Danton [Thu, 6 Jan 2011 12:58:12 +0000 (20:58 +0800)]
sched: Fix strncmp operation

One of the operands, buf, is incorrect: the input is stripped first, and
if leading white space is removed, the stripped string no longer starts
at buf, so the subsequent string comparison must not use it.

It is fixed by replacing buf with cmp.

Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <AANLkTinOPuYsVovrZpbuCCmG5deEyc8WgA_A1RJx_YK7@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
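
A minimal sketch of the bug class fixed by the strncmp commit above. The
helper skip_leading_space() and the "NO_" prefix are assumptions made for
illustration; the commit message only states that buf is stripped into cmp
and that the comparison must use cmp.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: skip leading white space, as a strip routine would. */
const char *skip_leading_space(const char *s)
{
	while (*s && isspace((unsigned char)*s))
		s++;
	return s;
}

/* Buggy variant: strips into cmp but then tests the original buf. */
bool has_no_prefix_buggy(const char *buf)
{
	const char *cmp = skip_leading_space(buf);

	(void)cmp;	/* computed but ignored, which is the bug */
	return strncmp(buf, "NO_", 3) == 0;
}

/* Fixed variant: compares starting at the stripped pointer. */
bool has_no_prefix_fixed(const char *buf)
{
	const char *cmp = skip_leading_space(buf);

	return strncmp(cmp, "NO_", 3) == 0;
}
```

The two variants agree on input without leading white space and differ
only when stripping actually moves the start of the string.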
13 years agosched: Move sched_autogroup_exit() to free_signal_struct()
Mike Galbraith [Wed, 5 Jan 2011 10:16:04 +0000 (11:16 +0100)]
sched: Move sched_autogroup_exit() to free_signal_struct()

Per Oleg's suggestion, undo fork failure free/put_signal_struct change,
and move sched_autogroup_exit() to free_signal_struct() instead.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294222564.8369.6.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agosched: Fix struct autogroup memory leak
Mike Galbraith [Wed, 5 Jan 2011 10:11:25 +0000 (11:11 +0100)]
sched: Fix struct autogroup memory leak

Seems I lost a change somewhere, leaking memory.

sched: fix struct autogroup memory leak

Add missing change to actually use autogroup_free().

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294222285.8369.2.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agosched: Mark autogroup_init() __init
Yong Zhang [Fri, 7 Jan 2011 04:43:45 +0000 (12:43 +0800)]
sched: Mark autogroup_init() __init

autogroup_init() is only called at boot time.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294375425-31065-1-git-send-email-yong.zhang0@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agosched: Consolidate the name of root_task_group and init_task_group
Yong Zhang [Fri, 7 Jan 2011 07:17:36 +0000 (15:17 +0800)]
sched: Consolidate the name of root_task_group and init_task_group

root_task_group is a leftover of USER_SCHED; it is now always the
same as init_task_group.
But as Mike suggested, root_task_group is probably the more suitable
name to keep for a tree.
So in this patch:
  init_task_group      --> root_task_group
  init_task_group_load --> root_task_group_load
  INIT_TASK_GROUP_LOAD --> ROOT_TASK_GROUP_LOAD

Suggested-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110107071736.GA32635@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agoMerge branch 'linus' into x86/apic-cleanups
Ingo Molnar [Fri, 7 Jan 2011 13:14:15 +0000 (14:14 +0100)]
Merge branch 'linus' into x86/apic-cleanups

Conflicts:
arch/x86/include/asm/io_apic.h

Merge reason: Resolve the conflict, update to a more recent -rc base

Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agodccp: make upper bound for seq_window consistent on 32/64 bit
Gerrit Renker [Sun, 2 Jan 2011 17:15:58 +0000 (18:15 +0100)]
dccp: make upper bound for seq_window consistent on 32/64 bit

The 'seq_window' sysctl sets the initial value for the DCCP Sequence Window,
which may range from 32..2^46-1 (RFC 4340, 7.5.2). The patch sets the upper
bound consistently to 2^32-1 on both 32 and 64 bit systems, which should be
sufficient - with an RTT of 1 sec and 1-byte packets, a seq_window of 2^32-1
corresponds to a link speed of 34 Gbps.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
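
The link-speed figure quoted in the seq_window commit above can be checked
with a short calculation. The function name window_link_speed_bps() is
hypothetical; the sketch assumes one full window of packets is sent per
round-trip time.

```c
#include <stdint.h>

/* Bits per second needed to send `window` packets of `bytes_per_packet`
 * bytes each within one round-trip time of `rtt_seconds` seconds. */
double window_link_speed_bps(uint64_t window, unsigned bytes_per_packet,
			     double rtt_seconds)
{
	return (double)window * bytes_per_packet * 8.0 / rtt_seconds;
}
```

For a window of 2^32-1, 1-byte packets and an RTT of 1 second this gives
(2^32-1) * 8 = 34359738360 bit/s, i.e. roughly 34 Gbps.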
13 years agodccp: fix bug in updating the GSR
Samuel Jero [Thu, 30 Dec 2010 11:15:41 +0000 (12:15 +0100)]
dccp: fix bug in updating the GSR

Currently dccp_check_seqno allows any valid packet to update the Greatest
Sequence Number Received, even if that packet's sequence number is less than
the current GSR. This patch adds a check to make sure that the new packet's
sequence number is greater than GSR.

Signed-off-by: Samuel Jero <sj323707@ohio.edu>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
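
The check added by the GSR commit above can be sketched as follows. DCCP
sequence numbers are 48 bits wide and compared in circular order; the
helpers seq48_after() and demo_next_gsr() are hypothetical names written
for this illustration and are not the kernel's own functions.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

/* True if 48-bit sequence number a is later than b in circular order.
 * The 48-bit difference is shifted into the top of a 64-bit word so
 * that the sign bit reflects the circular distance. */
bool seq48_after(uint64_t a, uint64_t b)
{
	return (int64_t)((a - b) << 16) > 0;
}

/* Returns the new Greatest Sequence Number Received: the GSR only moves
 * forward, so an older (but otherwise valid) packet leaves it unchanged. */
uint64_t demo_next_gsr(uint64_t gsr, uint64_t seqno)
{
	seqno &= SEQ48_MASK;
	return seq48_after(seqno, gsr) ? seqno : gsr;
}
```

Without the comparison, a late packet with a smaller sequence number
would move the GSR backwards.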
13 years agodccp: fix return value for sequence-invalid packets
Samuel Jero [Thu, 30 Dec 2010 11:15:16 +0000 (12:15 +0100)]
dccp: fix return value for sequence-invalid packets

Currently dccp_check_seqno returns 0 (indicating a valid packet) if the
acknowledgment number is out of bounds and the sync that RFC 4340 mandates at
this point is currently being rate-limited. This function should return -1,
indicating an invalid packet.

Signed-off-by: Samuel Jero <sj323707@ohio.edu>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
13 years agoEDAC, MCE: Fix NB error formatting
Borislav Petkov [Thu, 25 Nov 2010 14:40:27 +0000 (15:40 +0100)]
EDAC, MCE: Fix NB error formatting

Minor formatting fixup, since the information about which core was associated
with the MCE is not always valid.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit
Randy Dunlap [Sat, 13 Nov 2010 16:44:26 +0000 (11:44 -0500)]
EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit

Building for X86_32 produces shift count warnings, so use BIT_64() to
eliminate the warnings.

drivers/edac/mce_amd.c:778: warning: left shift count >= width of type
drivers/edac/mce_amd.c:778: warning: left shift count >= width of type

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: bluesmoke-devel@lists.sourceforge.net
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
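
The warning in the BIT_64() commit above arises when a constant of type
long is shifted by 32 or more on a target where long is 32 bits wide. A
minimal sketch of the remedy, using a hypothetical DEMO_BIT_64() macro in
place of the kernel's own definition:

```c
#include <stdint.h>

/* Hypothetical stand-in for a BIT_64()-style macro: the constant 1 is
 * widened to 64 bits before shifting, so shift counts up to 63 are
 * valid regardless of the width of long. */
#define DEMO_BIT_64(n) (UINT64_C(1) << (n))

/* Tests bit n (0..63) of a 64-bit status word. */
int demo_test_bit64(uint64_t word, unsigned n)
{
	return (word & DEMO_BIT_64(n)) != 0;
}
```

By contrast, an expression such as 1UL << 40 has a shift count that
exceeds the operand width on 32-bit builds, which is what the compiler
reports.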
13 years agoEDAC, MCE: Enable MCE decoding on F15h
Borislav Petkov [Wed, 22 Sep 2010 15:44:51 +0000 (17:44 +0200)]
EDAC, MCE: Enable MCE decoding on F15h

Now that everything is in place, enable MCE decoding on F15h. Make
the initcall routine a bit more readable.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: Allow F15h bank 6 MCE injection
Borislav Petkov [Tue, 9 Nov 2010 18:41:49 +0000 (19:41 +0100)]
EDAC, MCE: Allow F15h bank 6 MCE injection

F15h adds a sixth MCE bank: adjust bank number check in the injection
code.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: Shorten error report formatting
Borislav Petkov [Wed, 22 Sep 2010 15:42:27 +0000 (17:42 +0200)]
EDAC, MCE: Shorten error report formatting

Shorten up MCi_STATUS flags and add BD's new deferred and poison types.
Also, simplify formatting.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: Overhaul error fields extraction macros
Borislav Petkov [Wed, 22 Sep 2010 14:08:37 +0000 (16:08 +0200)]
EDAC, MCE: Overhaul error fields extraction macros

Make macro names shorter thus making code shorter and more clear.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: Add F15h FP MCE decoder
Borislav Petkov [Wed, 22 Sep 2010 13:37:58 +0000 (15:37 +0200)]
EDAC, MCE: Add F15h FP MCE decoder

Add decoder for FP MCEs.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: Add F15 EX MCE decoder
Borislav Petkov [Wed, 22 Sep 2010 13:28:59 +0000 (15:28 +0200)]
EDAC, MCE: Add F15 EX MCE decoder

Integrate the single FIROB signature into an expanded table along with
the new BD MCE types.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: Add an F15h NB MCE decoder
Borislav Petkov [Wed, 22 Sep 2010 13:06:24 +0000 (15:06 +0200)]
EDAC, MCE: Add an F15h NB MCE decoder

by (almost) reusing the F10h one since the signatures are the same.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: No F15h LS MCE decoder
Borislav Petkov [Wed, 22 Sep 2010 09:53:32 +0000 (11:53 +0200)]
EDAC, MCE: No F15h LS MCE decoder

F15h BD doesn't generate LS MCEs so warn about it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: Add F15h CU MCE decoder
Borislav Petkov [Tue, 21 Sep 2010 18:45:10 +0000 (20:45 +0200)]
EDAC, MCE: Add F15h CU MCE decoder

MCE bank 2 is redefined from a BU to a CU (Combined Unit) bank on F15h.
Add a decoder function for CU MCEs.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: Add F15h IC MCE decoder
Borislav Petkov [Mon, 8 Nov 2010 14:03:35 +0000 (15:03 +0100)]
EDAC, MCE: Add F15h IC MCE decoder

Add support for decoding F15h IC MCEs.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: Add F15h DC MCE decoder
Borislav Petkov [Fri, 17 Sep 2010 17:22:34 +0000 (19:22 +0200)]
EDAC, MCE: Add F15h DC MCE decoder

Add a decoder for F15h DC MCEs to support the new types of DC MCEs
introduced by the BD microarchitecture.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC, MCE: Select extended error code mask
Borislav Petkov [Fri, 17 Sep 2010 17:11:47 +0000 (19:11 +0200)]
EDAC, MCE: Select extended error code mask

F15h enlarges the extended error code of an MCE to a 5-bit field
(MCi_STATUS[20:16]). Add a mask variable whose default of 0xf is
overridden on F15h.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
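
The effect of the mask described in the commit above can be sketched as
follows. The function name demo_ext_err_code() is hypothetical; 0x1f is
simply the mask covering the 5-bit field MCi_STATUS[20:16], while 0xf
keeps only bits [19:16].

```c
#include <stdint.h>

/* Extracts the extended error code that starts at bit 16 of an
 * MCi_STATUS value, keeping as many bits as `mask` allows. */
unsigned demo_ext_err_code(uint64_t status, unsigned mask)
{
	return (unsigned)(status >> 16) & mask;
}
```

With the default 4-bit mask, a 5-bit code such as 0x13 would be truncated
to 0x3, which is why the wider mask is needed on F15h.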
13 years agoamd64_edac: Disable DRAM ECC injection on K8
Borislav Petkov [Fri, 26 Nov 2010 18:24:44 +0000 (19:24 +0100)]
amd64_edac: Disable DRAM ECC injection on K8

K8 does not allow for an atomic RMW to a cacheline as F10h does, so
disable the error injection interface for it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoEDAC: Fixup scrubrate manipulation
Borislav Petkov [Wed, 24 Nov 2010 18:52:09 +0000 (19:52 +0100)]
EDAC: Fixup scrubrate manipulation

Make the ->{get|set}_sdram_scrub_rate return the actual scrub rate
bandwidth it succeeded in setting and remove the superfluous arg pointer used
for that. A negative value returned still means that an error occurred
while setting the scrubrate. Document this for future reference.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
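
The return convention described in the scrubrate commit above (actual rate
on success, negative value on error) might look like this in outline. The
rate table and the name demo_set_scrub_rate() are assumptions made for
illustration, not the EDAC implementation.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical table of bandwidths the hardware can scrub at, highest
 * first. */
static const int demo_rates[] = { 1600, 800, 400, 100 };

/* Returns the rate actually programmed (the highest supported rate not
 * exceeding the request), or a negative errno value on failure. No
 * output pointer is needed to report the result. */
int demo_set_scrub_rate(int requested_bw)
{
	size_t i;

	for (i = 0; i < sizeof(demo_rates) / sizeof(demo_rates[0]); i++) {
		if (demo_rates[i] <= requested_bw)
			return demo_rates[i];
	}
	return -EINVAL;
}
```

A caller then distinguishes the two cases with a single sign check on the
return value.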
13 years agoamd64_edac: Remove two-stage initialization
Borislav Petkov [Fri, 15 Oct 2010 17:25:38 +0000 (19:25 +0200)]
amd64_edac: Remove two-stage initialization

Now that all prerequisites are in place, drop the two-stage driver
instances initialization in favor of the following simple init sequence:

1. Probe PCI device: we only test ECC capabilities here and if none exit
early.

2. If the hw supports ECC and it is/can be enabled, we init the per-node
instance.

Remove "amd64_" prefix from static functions touched, while at it.

There actually should be no visible functional change resulting from
this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Check ECC capabilities initially
Borislav Petkov [Fri, 15 Oct 2010 15:44:04 +0000 (17:44 +0200)]
amd64_edac: Check ECC capabilities initially

Rework the code to check the hardware ECC capabilities at PCI probing
time. We do all further initialization only if we actually can/have ECC
enabled.

While at it:
0. Fix function naming.
1. Simplify/clarify debug output.
2. Remove amd64_ prefix from the static functions
3. Reorganize code.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Carve out ECC-related hw settings
Borislav Petkov [Thu, 14 Oct 2010 14:01:30 +0000 (16:01 +0200)]
amd64_edac: Carve out ECC-related hw settings

This is in preparation for the init path reorganization where we want
only to

1) test whether a particular node supports ECC
2) can it be enabled

and only then do the necessary allocation/initialization. For that,
we need to decouple the ECC settings of the node from the instance's
descriptor.

There should be no functional change introduced by this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Remove PCI ECS enabling functions
Borislav Petkov [Thu, 14 Oct 2010 12:37:13 +0000 (14:37 +0200)]
amd64_edac: Remove PCI ECS enabling functions

PCI ECS has been enabled by default on AMD since 2.6.26, so this code is
just superfluous now; remove it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Remove explicit Kconfig PCI dependency
Borislav Petkov [Wed, 13 Oct 2010 20:12:15 +0000 (22:12 +0200)]
amd64_edac: Remove explicit Kconfig PCI dependency

AMD_NB pulls in the dependency on PCI. Clarify/fix help text while at it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Allocate driver instances dynamically
Borislav Petkov [Wed, 13 Oct 2010 14:11:59 +0000 (16:11 +0200)]
amd64_edac: Allocate driver instances dynamically

Remove static allocation in favor of dynamically allocating space for as
many driver instances as northbridges present on the system.

There should be no functional change resulting from this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Rework printk macros
Borislav Petkov [Thu, 7 Oct 2010 16:29:15 +0000 (18:29 +0200)]
amd64_edac: Rework printk macros

Add a macro per printk level, shorten up error messages. Add relevant
information to KERN_INFO level. No functional change.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Rename CPU PCI devices
Borislav Petkov [Fri, 1 Oct 2010 18:11:07 +0000 (20:11 +0200)]
amd64_edac: Rename CPU PCI devices

Rename variables representing PCI devices to their BKDG names for faster
search and shorter, clearer code.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Concentrate per-family init even more
Borislav Petkov [Fri, 1 Oct 2010 17:35:38 +0000 (19:35 +0200)]
amd64_edac: Concentrate per-family init even more

Move the remaining per-family init code into the proper place and
simplify the rest of the initialization. Reorganize error handling in
amd64_init_one_instance().

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Cleanup the CPU PCI device reservation
Borislav Petkov [Fri, 1 Oct 2010 17:27:58 +0000 (19:27 +0200)]
amd64_edac: Cleanup the CPU PCI device reservation

Shorten code and clarify comments, return proper -E* values on error.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Simplify CPU family detection
Borislav Petkov [Fri, 1 Oct 2010 17:20:05 +0000 (19:20 +0200)]
amd64_edac: Simplify CPU family detection

Concentrate CPU family detection in the per-family init function.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Add per-family init function
Borislav Petkov [Fri, 1 Oct 2010 16:38:19 +0000 (18:38 +0200)]
amd64_edac: Add per-family init function

Run a per-family init function which does all the settings based on
the family this driver instance is running on. Move the scrubrate
calculation in it and simplify code.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Use cached extended CPU model
Borislav Petkov [Fri, 1 Oct 2010 17:44:53 +0000 (19:44 +0200)]
amd64_edac: Use cached extended CPU model

... instead of computing it needlessly again.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agoamd64_edac: Remove F11h support
Borislav Petkov [Fri, 1 Oct 2010 16:19:06 +0000 (18:19 +0200)]
amd64_edac: Remove F11h support

F11h doesn't support DRAM ECC so whack it away.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
13 years agofs: scale mntget/mntput
Nick Piggin [Fri, 7 Jan 2011 06:50:11 +0000 (17:50 +1100)]
fs: scale mntget/mntput

The problem that this patch aims to fix is vfsmount refcounting scalability.
We need to take a reference on the vfsmount for every successful path lookup,
and those lookups often go to the same mount point.

The fundamental difficulty is that a "simple" reference count can never be made
scalable, because any time a reference is dropped, we must check whether that
was the last reference. To do that requires communication with all other CPUs
that may have taken a reference count.

We can make refcounts more scalable in a couple of ways, involving keeping
distributed counters, and checking for the global-zero condition less
frequently.

- check the global sum once every interval (this will delay zero detection
  for some interval, so it's probably a showstopper for vfsmounts).

- keep a local count and only take the global sum when local reaches 0 (this
  is difficult for vfsmounts, because we can't hold preempt off for the life of
  a reference, so a counter would need to be per-thread or tied strongly to a
  particular CPU which requires more locking).

- keep a local difference of increments and decrements, which allows us to sum
  the total difference and hence find the refcount when summing all CPUs. Then,
  keep a single integer "long" refcount for slow and long lasting references,
  and only take the global sum of local counters when the long refcount is 0.

This last scheme is what I implemented here. Attached mounts and process root
and working directory references are "long" references, and everything else is
a short reference.
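
A minimal single-threaded sketch of that last scheme follows, to show the
bookkeeping only. It is not the code from this patch: the names
(mount_sketch, get_short, put_short, get_long, put_long) and the fixed
NR_CPUS are assumptions, and the locking and preemption control that a real
implementation needs are left out.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* assumed fixed CPU count for this sketch */

struct mount_sketch {
	long local_diff[NR_CPUS];	/* per-CPU increments minus decrements */
	long longrefs;			/* slow, long-lasting references */
	bool released;			/* set once the last reference is gone */
};

/* Sum the per-CPU differences; individual slots may be negative. */
static long sum_local(const struct mount_sketch *m)
{
	long sum = 0;

	for (int i = 0; i < NR_CPUS; i++)
		sum += m->local_diff[i];
	return sum;
}

/* Fast path: a short reference only touches the caller's own slot. */
static void get_short(struct mount_sketch *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	m->local_diff[cpu]++;
}

/* Drop a short reference. Returns true if the object was released. */
static bool put_short(struct mount_sketch *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	m->local_diff[cpu]--;

	if (m->longrefs > 0)
		return false;	/* common case: no global sum needed */

	if (sum_local(m) == 0) {
		m->released = true;
		return true;
	}
	return false;
}

static void get_long(struct mount_sketch *m)
{
	m->longrefs++;
}

/* Drop a long reference; take the global sum only when none remain. */
static bool put_long(struct mount_sketch *m)
{
	m->longrefs--;
	if (m->longrefs > 0)
		return false;

	if (sum_local(m) == 0) {
		m->released = true;
		return true;
	}
	return false;
}
```

Note that a reference taken on one CPU and dropped on another leaves +1 in
one slot and -1 in the other; only the sum over all slots is meaningful.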

This allows scalable vfsmount references during path walking over mounted
subtrees and unattached (lazy umounted) mounts with processes still running
in them.

This results in one fewer atomic op in the fastpath: mntget is now just a
per-CPU inc, rather than an atomic inc; and mntput just requires a spinlock
and non-atomic decrement in the common case. However, the code is otherwise
bigger and heavier, so single-threaded performance is basically a wash.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
13 years ago fs: rename vfsmount counter helpers
Nick Piggin [Fri, 7 Jan 2011 06:50:10 +0000 (17:50 +1100)]
fs: rename vfsmount counter helpers

Suggested by Andreas: the mnt_ prefix is a clearer namespace, follows kernel
conventions better, and is easier to tab complete. I introduced the old
names, so I'll admit they were not good choices.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
13 years ago fs: implement faster dentry memcmp
Nick Piggin [Fri, 7 Jan 2011 06:50:09 +0000 (17:50 +1100)]
fs: implement faster dentry memcmp

The standard memcmp function on a Westmere system shows up hot in
profiles in the `git diff` workload (both parallel and single threaded),
and it is likely due to the costs associated with trapping into
microcode, and little opportunity to improve memory access (dentry
name is not likely to take up more than a cacheline).

So replace it with an open-coded byte comparison. This increases code
size by 8 bytes in the critical __d_lookup_rcu function, but the
speedup is huge, averaging 10 runs of each:

git diff st   user   sys    elapsed  CPU
before        1.15   2.57   3.82     97.1
after         1.14   2.35   3.61     96.8

git diff mt   user   sys    elapsed  CPU
before        1.27   3.85   1.46     349
after         1.26   3.54   1.43     333

Elapsed time for single threaded git diff at 95.0% confidence:
        -0.21  +/- 0.01
        -5.45% +/- 0.24%

It's -0.66% +/- 0.06% elapsed time on my Opteron, so rep cmp costs on the
fam10h seem to be relatively smaller, but there is still a win.
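
For illustration, an open-coded byte comparison of two names might look
like the sketch below. The function name and signature are hypothetical and
this is not necessarily the exact code in the patch; it only shows the
technique of comparing lengths first and then walking the bytes in a plain
loop instead of calling memcmp.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compare two length-counted names byte by byte.
 * Returns 0 if they are equal and 1 otherwise (no ordering is reported).
 */
static int name_cmp_sketch(const unsigned char *a, size_t alen,
			   const unsigned char *b, size_t blen)
{
	/* Pointers may only be NULL when there is nothing to compare. */
	assert(alen == 0 || (a != NULL && b != NULL));

	if (alen != blen)
		return 1;

	while (alen--) {
		if (*a++ != *b++)
			return 1;
	}
	return 0;
}
```

Because the caller only needs equal or not equal, the loop can stop at the
first differing byte and never has to compute which name sorts first.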

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
13 years ago fs: prefetch inode data in dcache lookup
Nick Piggin [Fri, 7 Jan 2011 06:50:08 +0000 (17:50 +1100)]
fs: prefetch inode data in dcache lookup

This makes single threaded git diff -1.25% +/- 0.05% elapsed time on my
2s12c24t Westmere system, and -0.86% +/- 0.05% on my 2s8c Barcelona, by
prefetching the important first cacheline of the inode while we do the
actual name compare and other operations on the dentry.
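
The idea can be sketched in userspace C as below. The struct and function
names are hypothetical, and __builtin_prefetch (a GCC/Clang builtin) stands
in for whatever prefetch helper the kernel code uses; the point is only the
ordering: issue the prefetch first, then do the name compare while the
cacheline is in flight.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the real structures. */
struct inode_sketch {
	unsigned long ino;
	unsigned int mode;
};

struct dentry_sketch {
	const char *name;
	size_t len;
	struct inode_sketch *inode;	/* NULL for a negative entry */
};

/* Returns the entry's inode if the name matches, NULL otherwise. */
static struct inode_sketch *lookup_one(const struct dentry_sketch *d,
				       const char *name, size_t len)
{
	assert(d != NULL && name != NULL);

	/*
	 * Ask for the start of the inode early. A prefetch is only a
	 * hint, so it does not change the result of the lookup.
	 */
	if (d->inode)
		__builtin_prefetch(d->inode);

	/* The name compare runs while the prefetch is outstanding. */
	if (d->len != len || memcmp(d->name, name, len) != 0)
		return NULL;

	return d->inode;
}
```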

There was no measurable slowdown in the single file stat case, or the creat
case (where negative dentries would be common).

Signed-off-by: Nick Piggin <npiggin@kernel.dk>