Mauro Carvalho Chehab [Wed, 14 Oct 2009 11:02:40 +0000 (08:02 -0300)]
i7core_edac: Use a more generic approach for probing PCI devices
Currently, only one PCI set of tables is allowed. This prevents using
the driver for other devices like Lynnfield, with have a different
set of PCI ID's.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 14 Oct 2009 09:07:07 +0000 (06:07 -0300)]
i7core_edac: PCI device is called NONCORE, instead of NOCORE
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 8 Oct 2009 16:11:08 +0000 (13:11 -0300)]
i7core_edac: Fix ringbuffer maxsize
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 5 Oct 2009 12:40:09 +0000 (09:40 -0300)]
i7core_edac: First store, then increment
Fix ringbuffer store logic.
While here, add a few comments to the code and remove the undesired
printk that could otherwise be called during NMI time.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sun, 4 Oct 2009 14:54:56 +0000 (11:54 -0300)]
i7core_edac: Better parse "any" addrmask
Instead of accepting just "any", accept also "any\n"
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sun, 4 Oct 2009 13:15:40 +0000 (10:15 -0300)]
i7core_edac: Use a lockless ringbuffer
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 25 Sep 2009 16:42:25 +0000 (13:42 -0300)]
edac: Create an unique instance for each kobj
Current code only works when there's just one memory
controller, since we need one kobj for each instance.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 24 Sep 2009 20:28:50 +0000 (17:28 -0300)]
Documentation/edac.txt: Reflect the sysfs changes at the document
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 24 Sep 2009 20:25:43 +0000 (17:25 -0300)]
i7core_edac: Convert UDIMM error counters into a proper sysfs group
Instead of displaying 3 values at the same var, break it into 3
different sysfs nodes:
/sys/devices/system/edac/mc/mc0/all_channel_counts/udimm0
/sys/devices/system/edac/mc/mc0/all_channel_counts/udimm1
/sys/devices/system/edac/mc/mc0/all_channel_counts/udimm2
For registered dimms, however, the error counters are already being
displayed at:
/sys/devices/system/edac/mc/mc0/csrow*/ce_count
So, there's no need to add any extra sysfs nodes.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 24 Sep 2009 19:36:32 +0000 (16:36 -0300)]
edac: Don't create csrow entries on instance groups
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 24 Sep 2009 19:23:42 +0000 (16:23 -0300)]
edac: store/show methods for device groups weren't working
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 23 Sep 2009 21:56:47 +0000 (18:56 -0300)]
i7core_edac: Add support for sysfs addrmatch group
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 23 Sep 2009 19:26:09 +0000 (16:26 -0300)]
edac_core: Allow the creation of sysfs groups
Currently, all sysfs nodes are stored at /sys/.*/mc. (regex)
However, sometimes it is needed to create attribute groups.
This patch extends edac_core to allow groups creation.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 24 Sep 2009 12:58:26 +0000 (09:58 -0300)]
i7core_edac: Avoid printing a warning when debug is disabled
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 24 Sep 2009 12:59:13 +0000 (09:59 -0300)]
i7core_edac: We need to use list_for_each_entry_safe to avoid errors
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sun, 6 Sep 2009 02:06:50 +0000 (23:06 -0300)]
i7core_edac: change remove module strategy
The old remove module stragegy didn't work on devices with multiple
cores, since only one PCI device is used to open all mc's, due to
Nehalem nature.
Also, it were based at pdev value. However, this doesn't point to the
pci device used at mci->dev.
So, instead, it unregisters all devices at once, deleting them from the
device list.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 5 Sep 2009 15:16:19 +0000 (12:16 -0300)]
i7core_edac: remove static counter for max sockets
The number of sockets is now fully dynamic. Get rid of this obsolete
var.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 5 Sep 2009 15:15:20 +0000 (12:15 -0300)]
i7core_edac: at remove, don't remove all pci devices at once
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 5 Sep 2009 08:10:31 +0000 (05:10 -0300)]
i7core_edac: Fix a bug when printing error counts with RDIMMs
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 5 Sep 2009 08:10:15 +0000 (05:10 -0300)]
Documentation/edac.txt: Improve it to reflect the latest changes at the driver
Signed-off-by: Mauro Carvalho Chehab <mcheahb@redhat.com>
Mauro Carvalho Chehab [Sat, 5 Sep 2009 07:12:02 +0000 (04:12 -0300)]
i7core_edac: a few fixes for multiple mc's
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 5 Sep 2009 06:27:04 +0000 (03:27 -0300)]
i7core_edac: sanity check: print a warning if a mcelog is ignored
In thesis, the other mc controller should handle it.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 5 Sep 2009 05:35:08 +0000 (02:35 -0300)]
i7core_edac: create one mc per socket/QPI
Instead of creating just one memory controller, create one per socket
(e. g. per Quick Link Path Interconnect).
This better reflects the Nehalem architecture.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 5 Sep 2009 03:52:11 +0000 (00:52 -0300)]
Dynamically allocate memory for PCI devices
Instead of using a static table assuming always 2 CPU sockets, allocate
space dynamically for Nehalem PCI devs.
This patch is part of a series of patches that changes i7core_edac to
allow more than 2 sockets and to properly report one memory controller
per socket.
Mauro Carvalho Chehab [Sat, 5 Sep 2009 03:47:21 +0000 (00:47 -0300)]
i7core: temporary workaround to allow it to compile against 2.6.30
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 3 Sep 2009 23:17:26 +0000 (20:17 -0300)]
i7core_edac: Improve corrected_error_counts output for RDIMM
Just cosmetics. instead of showing something like:
socket 0, channel 2dimm0: 1
dimm1: 0
dimm2: 0
socket 1, channel 2dimm0: 0
dimm1: 0
dimm2: 0
Show:
socket 0, channel 2 RDIMM0: 1 RDIMM1: 0 RDIMM2: 0
socket 0, channel 2 RDIMM0: 0 RDIMM1: 0 RDIMM2: 0
This is more synthetic and easier to parse.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Keith Mannthey [Thu, 3 Sep 2009 03:05:05 +0000 (00:05 -0300)]
i7core_edac: Probe on Xeons eariler
On the Xeon 55XX series cpus the pci deives are not exposed via acpi so
we much explicitly probe them to make the usable as a Linux PCI device.
This moves the detection of this state to before pci_register_driver is
called. Its present position was not working on my systems, the driver
would complain about not finding a specific device.
This patch allows the driver to load on my systems.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 3 Sep 2009 02:52:36 +0000 (23:52 -0300)]
i7core: Use registered memories per processor
Instead of assuming that the entire machine has either registered or
unregistered memories, do it at CPU socket based.
While here, fix a bug at i7core_mce_output_error(), where the we're
using m->cpu directly as if it would represent a socket. Instead, the
proper socket_id is given by cpu_data[m->cpu].phys_proc_id.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
Mauro Carvalho Chehab [Thu, 3 Sep 2009 02:49:59 +0000 (23:49 -0300)]
i7core_edac: Use Device 3 function 2 to report errors with RDIMM's
Nehalem and upper chipsets provide an special device that has corrected memory
error counters detected with registered dimms. This device is only seen if
there are registered memories plugged.
After this patch, on a machine fully equiped with RDIMM's, it will use the
Device 3 function 2 to count corrected errors instead on relying at mcelog.
For unregistered DIMMs, it will keep the old behavior, counting errors
via mcelog.
This patch were developed together with Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Keith Mannthey [Thu, 3 Sep 2009 02:46:59 +0000 (23:46 -0300)]
i7core_edac: Fix ecc enable shift
From: Keith Mannthey <kmannth@us.ibm.com>
Simple correction to a shift value.
ECC_ENABLED is bit 4 of MC_STATUS, Dev 3 Fun 0 Offset 0x4c
This correctly identifies the state of the ECC at the machine.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 3 Sep 2009 02:43:33 +0000 (23:43 -0300)]
i7core_edac: Print an error message if pci register fails
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 6 Aug 2009 00:36:35 +0000 (21:36 -0300)]
i7core_edac: CodingSyle fixes/cleanups
No functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 6 Aug 2009 00:16:56 +0000 (21:16 -0300)]
Documentation/edac.txt: Add Nehalem specific EDAC characteristics
As Nehalem has a different binding to EDAC API, and its own different
error injection code, documents it.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 5 Aug 2009 23:27:15 +0000 (20:27 -0300)]
i7core_edac: fix error injection
There were two stupid error injection bugs introduced by wrong
cut-and-paste: one at socket store, and another at the error inject
register. The last one were causing the code to not work at all.
While here, adds debug messages to allow seeing what registers are being
set while sending error injection.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 5 Aug 2009 22:28:27 +0000 (19:28 -0300)]
i7core_edac: fix error codes for sysfs error injection interface
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 Jul 2009 00:45:50 +0000 (21:45 -0300)]
i7core_edac: some fixes at error injection code
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 20 Jul 2009 21:48:18 +0000 (18:48 -0300)]
i7core_edac: Some cleanups at displayed info
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 Jul 2009 15:22:28 +0000 (12:22 -0300)]
i7core: remove some uneeded noisy debug messages
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 Jul 2009 15:20:04 +0000 (12:20 -0300)]
i7core: add socket info at the debug msg
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 Jul 2009 13:44:30 +0000 (10:44 -0300)]
i7core: better document i7core_get_active_channels()
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 Jul 2009 13:43:08 +0000 (10:43 -0300)]
i7core: fix get_devices routine for Xeon55xx
i7core_get_devices() were preparet to get just the first found device of each type.
Due to that, on Xeon 55xx, only socket 1 were retrived.
Rework i7core_get_devices() to clean it and to properly support Xeon 55xx.
While here, fix a small typo.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 17 Jul 2009 13:54:23 +0000 (10:54 -0300)]
i7core: enrich error information based on memory transaction type
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 17 Jul 2009 13:28:15 +0000 (10:28 -0300)]
i7core: check if the memory error is fatal or non-fatal
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 17 Jul 2009 03:09:10 +0000 (00:09 -0300)]
i7core: fix probing on Xeon55xx
Xeon55xx fails to probe with this error message:
EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 1660: MC: drivers/edac/i7core_edac.c: i7core_init()
EDAC i7core: Device not found: dev 00:00.0 PCI ID 8086:2c41
i7core_edac: probe of 0000:00:14.0 failed with error -22
This is due to the fact that, on Xeon35xx (and i7core), device 00.0 has
PCI ID 8086:2c40.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 15 Jul 2009 22:53:24 +0000 (19:53 -0300)]
i7core_edac: some fixes at memory error parser
m->bank is not related to the memory bank but, instead, to the MCA Error
register bank. Fix it accordingly. While here, improves the comments for
Nehalem bank.
A later fix is needed, in order to get bank/rank information from MCA
error log.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 15 Jul 2009 22:01:08 +0000 (19:01 -0300)]
i7core_edac: decode mcelog error and send it via edac interface
Enriches mcelog error by using the encoded information at MCE status and
misc registers (IA32_MCx_STATUS, IA32_MCx_MISC).
Some fixes are still needed here, in order to properly fill the EDAC
fields.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 15 Jul 2009 12:02:32 +0000 (09:02 -0300)]
i7core_edac: maps all sockets as if ther are one MC controller
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 15 Jul 2009 09:56:23 +0000 (06:56 -0300)]
i7core_edac: add support for more than one MC socket
Some Nehalem architectures have more than one MC socket. Socket 0 is
located at bus 255.
Currently, it is using up to 2 sockets, but increasing it to a larger
number is just a matter of increasing MAX_SOCKETS definition.
This seems to be required for properly support of Xeon 55xx.
Still needs testing with Xeon 55xx.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 10 Jul 2009 21:39:53 +0000 (18:39 -0300)]
i7core_edac: Add a code to probe Xeon 55xx bus
This code changes the detection procedure of i7core_edac. Instead of
directly probing for MC registers, it probes for another register found
on Nehalem. If found, it tries to pick the first MC PCI BUS. This should
work fine with Xeon 35xx, but, on Xeon 55xx, this is at bus 254 and 255
that are not properly detected by the non-legacy PCI methods.
The new detection code scans specifically at buses 254 and 255 for the
Xeon 55xx devices.
This code has not tested yet. After working, a change at the code will
be needed, since the i7core is not yet ready for working with 2 sets of
MC.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Aristeu Rozanski [Fri, 10 Jul 2009 01:21:13 +0000 (22:21 -0300)]
pci: Add a probing code that seeks for an specific bus
This patch adds a probing code that seeks for an specific pci bus. It
still needs testing, but it is hoped that this will help to identify the
memory controller with Xeon 55xx series.
Signed-off-by: Aristeu Sergio <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 10 Jul 2009 01:14:35 +0000 (22:14 -0300)]
i7core_edac: Adds write unlock to MC registers
The public Intel Xeon 5500 volume 2 datasheet describes, on page 53,
session 2.6.7 a register that can lock/unlock Memory Controller the
configuration register, called MC_CFG_CONTROL.
Adds support for it in the hope that software error injection would
work. With my tests with Xeon 35xx, there's still something missing.
With a program that does sequencial bit writes at dev 0.0, sometimes, it
produces error injection, after unblocking the MC_CFG_CONTROL (and,
sometimes, it just locks my testing machine).
I'll try later to discover by trial and error what's the register that
solves this issue on Xeon 35xx.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 10 Jul 2009 01:06:41 +0000 (22:06 -0300)]
i7core_edac: Add edac_mce glue
Adds a glue code to allow i7core to work with mcelog. With the glue,
i7core registers itself on edac_mce. At mce, when an error is detected,
it calls all registered drivers (in this case, i7core), for EDAC error
handling.
TODO: It currently just prints the MCE error log using about the same
format as mce panic messages. The error message should be enhanced
with mcelog userspace info and converted into the proper EDAC format,
to feed the EDAC error counts.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 10 Jul 2009 01:04:30 +0000 (22:04 -0300)]
edac/Kconfig: edac_mce can't be module
Since mcelog is bool, edac_mce glue should also be bool, or otherwise
will not work.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 Jul 2009 09:57:45 +0000 (06:57 -0300)]
edac_mce: Add an interface driver to report mce errors via edac
edac_mce module is an interface module that gets mcelog data and
forwards to any registered edac module that expects to receive data via
mce.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:31 +0000 (22:48 -0300)]
i7core_edac: CodingStyle fixes
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:31 +0000 (22:48 -0300)]
i7core_edac: fill csrows edac sysfs info
csrows is still fake, since we can't identify its representation with
Nehalem registers.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:31 +0000 (22:48 -0300)]
i7core_edac: Memory info fixes and preparation for properly filling cswrow data
Now, memory size is properly displayed:
EDAC i7core: DOD Max limits: DIMMS: 2, 1-ranked, 8-banked
EDAC i7core: DOD Max rows x colums = 0x4000 x 0x400
EDAC i7core: Memory channel configuration:
EDAC i7core: Ch0 phy rd0, wr0 (0x063f7c31): 2 ranks, UDIMMs
EDAC i7core: dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8,
numrank: 1, numrow: 0x4000, numcol: 0x400
EDAC i7core: dimm 1 (0x00001288) 1024 Mb offset: 4, numbank: 8,
numrank: 1, numrow: 0x4000, numcol: 0x400
EDAC i7core: Ch1 phy rd1, wr1 (0x063f7c31): 2 ranks, UDIMMs
EDAC i7core: dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8,
numrank: 1, numrow: 0x4000, numcol: 0x400
EDAC i7core: Ch2 phy rd3, wr3 (0x063f7c31): 2 ranks, UDIMMs
EDAC i7core: dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8,
numrank: 1, numrow: 0x4000, numcol: 0x400
Still, as the way to retrieve csrows info is not known, it does a
mapping of what's available to csrows basic unit at edac core.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:30 +0000 (22:48 -0300)]
i7core_edac: Get more info about the memory DIMMs
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:30 +0000 (22:48 -0300)]
i7core_edac: Add more information about each active dimm
Thanks-to: Aristeu Rozanski <aris@redhat.com> for part of the code
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:30 +0000 (22:48 -0300)]
i7core_edac: Improve error handling
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:30 +0000 (22:48 -0300)]
i7core_edac: Properly fill struct csrow_info
Thanks-to: Aristeu Rozanski <aris@redhat.com> for part of the code
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:30 +0000 (22:48 -0300)]
i7core_edac: Add additional tests for error detection
Properly check the number of channels and improve probing error detection
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:29 +0000 (22:48 -0300)]
i7core_edac: Add a memory check routine, based on device 3 function 4
This function appears only on Xeon 5500 datasheet. Yet, testing with a
Xeon 3503 showed that this is also implemented on other Nehalem
processors.
At the first read, MC_TEST_ERR_RCV1 and MC_TEST_ERR_RCV0 can contain any
value. Modify CE error logic to update the error count only after the
second read.
An alternative approach would be to do a write at rcv0 and rcv1
registers, but it seemed better to keep they untouched, since BIOS might
eventually assume that they are exclusive for their usage.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:29 +0000 (22:48 -0300)]
i7core_edac: need mci->edac_check, otherwise module removal doesn't work
There are some locking troubles with edac_core: if you don't declare an
edac_check, module may suffer from soft lock.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:29 +0000 (22:48 -0300)]
i7core_edac: A few fixes at error injection code
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:29 +0000 (22:48 -0300)]
i7core_edac: Show read/write virtual/physical channel association
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:29 +0000 (22:48 -0300)]
i7core_edac: Registers all supported MC functions
Now, it will try to register on all supported Memory Controller
functions.
It should be noticed that dev3, function 2 is present only on chips with
Registered DIMM's, according to the datasheet. So, the driver doesn't
return -ENODEV is all functions but this one were successfully
registered and enabled:
EDAC i7core: Registered device 8086:2c18 fn=3 0
EDAC i7core: Registered device 8086:2c19 fn=3 1
EDAC i7core: Device not found: PCI ID 8086:2c1a (dev 3, func 2)
EDAC i7core: Registered device 8086:2c1c fn=3 4
EDAC i7core: Registered device 8086:2c20 fn=4 0
EDAC i7core: Registered device 8086:2c21 fn=4 1
EDAC i7core: Registered device 8086:2c22 fn=4 2
EDAC i7core: Registered device 8086:2c23 fn=4 3
EDAC i7core: Registered device 8086:2c28 fn=5 0
EDAC i7core: Registered device 8086:2c29 fn=5 1
EDAC i7core: Registered device 8086:2c2a fn=5 2
EDAC i7core: Registered device 8086:2c2b fn=5 3
EDAC i7core: Registered device 8086:2c30 fn=6 0
EDAC i7core: Registered device 8086:2c31 fn=6 1
EDAC i7core: Registered device 8086:2c32 fn=6 2
EDAC i7core: Registered device 8086:2c33 fn=6 3
EDAC i7core: Driver loaded.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:29 +0000 (22:48 -0300)]
i7core_edac: Add more status functions to EDAC driver
This patch were co-authored with Aristeu Rozanski.
Signed-off-by: Aristeu Sergio <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:48:28 +0000 (22:48 -0300)]
i7core_edac: Add error insertion code for Nehalem
Implements set_inject_error() with the low-level code needed to inject
memory errors at Nehalem, and adds some sysfs nodes to allow error injection
The next patch will add an API for error injection.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 23 Jun 2009 01:41:15 +0000 (22:41 -0300)]
i7core_edac: Add an EDAC memory controller driver for Nehalem chipsets
This driver is meant to support i7 core/i7core extreme desktop
processors and Xeon 35xx/55xx series with integrated memory controller.
It is likely that it can be expanded in the future to work with other
processor series based at the same Memory Controller design.
For now, it has just a few MCH status reads.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Linus Torvalds [Fri, 30 Apr 2010 03:02:05 +0000 (20:02 -0700)]
Linux 2.6.34-rc6
Linus Torvalds [Fri, 30 Apr 2010 03:01:42 +0000 (20:01 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
kgdb: don't needlessly skip PAGE_USER test for Fsl booke
Linus Torvalds [Fri, 30 Apr 2010 02:49:34 +0000 (19:49 -0700)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: add a shrinker to background inode reclaim
Wufei [Wed, 28 Apr 2010 21:42:32 +0000 (17:42 -0400)]
kgdb: don't needlessly skip PAGE_USER test for Fsl booke
The bypassing of this test is a leftover from 2.4 vintage
kernels, and is no longer appropriate, or even used by KGDB.
Currently KGDB uses probe_kernel_write() for all access to
memory via the KGDB core, so it can simply be deleted.
This fixes CVE-2010-1446.
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Wufei <fei.wu@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Linus Torvalds [Fri, 30 Apr 2010 00:18:07 +0000 (17:18 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
exofs: Fix "add bdi backing to mount session" fall out
fs: fs/super.c needs to include backing-dev.h for !CONFIG_BLOCK
Linus Torvalds [Fri, 30 Apr 2010 00:17:35 +0000 (17:17 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 6061/1: PL061 GPIO: Bug fix - setting gpio for HIGH_LEVEL interrupt is not working.
ARM: 5957/1: ARM: RealView SD/MMC Card detection and write-protect using GPIOLIB
ARM: 6030/1: KS8695: enable console
ARM: 6060/1: PL061 GPIO: Setting gpio val after changing direction to OUT.
ARM: 6059/1: PL061 GPIO: Changing *_irq_chip_data with *_irq_data for real irqs.
ARM: 6023/1: update bcmring_defconfig to latest version and fix build error
ARM: fix build error in arch/arm/kernel/process.c
Linus Torvalds [Fri, 30 Apr 2010 00:16:36 +0000 (17:16 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/ps3: Update ps3_defconfig
powerpc/ps3: Update platform maintainer
powerpc/pseries: Flush lazy kernel mappings after unplug operations
powerpc/numa: Add form 1 NUMA affinity
powerpc/fsl-booke: Fix CONFIG_RELOCATABLE support on FSL Book-E ppc32
powerpc: 2.6.34 update of defconfigs for embedded 6xx/7xxx, 8xx, 8xxx
powerpc/mpc8xxx defconfigs - turn off SYSFS_DEPRECATED
powerpc/83xx: configure SIL SATA driver in 83xx-wide defconfig
powerpc/83xx: enable EPOLL syscall in defconfig
powerpc/83xx: add RTC drivers in 83xx defconfig
powerpc/fsl-cpm: Configure clock correctly for SCC
powerpc/fsl_booke: Correct test for MMU_FTR_BIG_PHYS
powerpc/85xx/86xx: Fix build w/ CONFIG_PCI=n
viresh kumar [Thu, 29 Apr 2010 11:22:52 +0000 (12:22 +0100)]
ARM: 6061/1: PL061 GPIO: Bug fix - setting gpio for HIGH_LEVEL interrupt is not working.
In current implementation of PL061, setting type of irq to HIGH_LEVEL is not
working. This patch fixes this bug.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Dave Chinner [Wed, 28 Apr 2010 23:55:50 +0000 (09:55 +1000)]
xfs: add a shrinker to background inode reclaim
On low memory boxes or those with highmem, kernel can OOM before the
background reclaims inodes via xfssyncd. Add a shrinker to run inode
reclaim so that it inode reclaim is expedited when memory is low.
This is more complex than it needs to be because the VM folk don't
want a context added to the shrinker infrastructure. Hence we need
to add a global list of XFS mount structures so the shrinker can
traverse them.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Boaz Harrosh [Thu, 29 Apr 2010 18:35:29 +0000 (20:35 +0200)]
exofs: Fix "add bdi backing to mount session" fall out
The patch: add bdi backing to mount session
(
b3d0ab7e60d1865bb6f6a79a77aaba22f2543236)
Has a bug in the placement of the bdi member at
struct exofs_sb_info. The layout member must be kept
last.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Thu, 29 Apr 2010 18:33:35 +0000 (20:33 +0200)]
fs: fs/super.c needs to include backing-dev.h for !CONFIG_BLOCK
When CONFIG_BLOCK is set, it ends up getting backing-dev.h included.
But for !CONFIG_BLOCK, it isn't so lucky. The proper thing to do is
include <linux/backing-dev.h> directly from the file it's used from,
so do that.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Linus Torvalds [Thu, 29 Apr 2010 17:23:44 +0000 (10:23 -0700)]
Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
nfs: fix memory leak in nfs_get_sb with CONFIG_NFS_V4
nfs: fix some issues in nfs41_proc_reclaim_complete()
NFS: Ensure that nfs_wb_page() waits for Pg_writeback to clear
NFS: Fix an unstable write data integrity race
nfs: testing for null instead of ERR_PTR()
NFS: rsize and wsize settings ignored on v4 mounts
NFSv4: Don't attempt an atomic open if the file is a mountpoint
SUNRPC: Fix a bug in rpcauth_prune_expired
Arnd Bergmann [Wed, 28 Apr 2010 12:36:41 +0000 (14:36 +0200)]
pktcdvd: improve BKL and compat_ioctl.c usage
The pktcdvd driver uses proper locking and does not need the BKL in the
ioctl and llseek functions of the character device, so kill both.
Moving the compat_ioctl handling from common code into the driver itself
fixes build problems when CONFIG_BLOCK is disabled.
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Boaz Harrosh [Thu, 29 Apr 2010 10:38:00 +0000 (13:38 +0300)]
exofs: Fix "add bdi backing to mount session" fall out
Commit
b3d0ab7e60d1865bb6f6a79a77aaba22f2543236 ("exofs: add bdi backing
to mount session") has a bug in the placement of the bdi member at
struct exofs_sb_info. The layout member must be kept last.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 29 Apr 2010 03:41:55 +0000 (20:41 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/x86/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
x86: Disable large pages on CPUs with Atom erratum AAE44
x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzero
x86, mrst: Conditionally register cpu hotplug notifier for apbt
Linus Torvalds [Thu, 29 Apr 2010 03:40:17 +0000 (20:40 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/PCI: compute Address Space length rather than using _LEN
x86/PCI: never allocate PCI MMIO resources below BIOS_END
Al Viro [Thu, 29 Apr 2010 02:10:43 +0000 (03:10 +0100)]
nfs d_revalidate() is too trigger-happy with d_drop()
If dentry found stale happens to be a root of disconnected tree, we
can't d_drop() it; its d_hash is actually part of s_anon and d_drop()
would simply hide it from shrink_dcache_for_umount(), leading to
all sorts of fun, including busy inodes on umount and oopsen after
that.
Bug had been there since at least 2006 (commit c636eb already has it),
so it's definitely -stable fodder.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Colin Tuckley [Wed, 24 Feb 2010 14:23:10 +0000 (15:23 +0100)]
ARM: 5957/1: ARM: RealView SD/MMC Card detection and write-protect using GPIOLIB
The switch to using GPIOLIB broke the sd/mmc card detection on the
RealView development boards if GPIO_PL061 was not selected.
This patch selects GPIO_PL061 if GPIOLIB is selected.
The sense of the return value from mmc_status has also changed
and is corrected.
Signed-off-by: Colin Tuckley <colin.tuckley@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Wed, 28 Apr 2010 20:37:31 +0000 (13:37 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/lrg/voltage-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6:
regulator: fix enabling regulator issue on max8925
Linus Torvalds [Wed, 28 Apr 2010 20:37:06 +0000 (13:37 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
sfc: Change falcon_probe_board() to fail for unsupported boards
sfc: Always close net device at the end of a disabling reset
sfc: Wait at most 10ms for the MC to finish reading out MAC statistics
sctp: Fix oops when sending queued ASCONF chunks
sctp: fix to calc the INIT/INIT-ACK chunk length correctly is set
sctp: per_cpu variables should be in bh_disabled section
sctp: fix potential reference of a freed pointer
sctp: avoid irq lock inversion while call sk->sk_data_ready()
Revert "tcp: bind() fix when many ports are bound"
net/usb: add sierra_net.c driver
cdc_ether: fix autosuspend for mbm devices
bluetooth: handle l2cap_create_connless_pdu() errors
gianfar: Wait for both RX and TX to stop
ipheth: potential null dereferences on error path
smc91c92_cs: spin_unlock_irqrestore before calling smc_interrupt()
drivers/usb/net/kaweth.c: add device "Allied Telesyn AT-USB10 USB Ethernet Adapter"
bnx2: Update version to 2.0.9.
bnx2: Prevent "scheduling while atomic" warning with cnic, bonding and vlan.
bnx2: Fix lost MSI-X problem on 5709 NICs.
cxgb3: Wait longer for control packets on initialization
...
Ben Hutchings [Wed, 28 Apr 2010 09:01:50 +0000 (09:01 +0000)]
sfc: Change falcon_probe_board() to fail for unsupported boards
The driver needs specific PHY and board support code for each SFC4000
board; there is no point trying to continue if it is missing.
Currently unsupported boards can trigger an 'oops'.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Wed, 28 Apr 2010 09:01:33 +0000 (09:01 +0000)]
sfc: Always close net device at the end of a disabling reset
This fixes a regression introduced by commit
eb9f6744cbfa97674c13263802259b5aa0034594 "sfc: Implement ethtool
reset operation".
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Wed, 28 Apr 2010 09:00:35 +0000 (09:00 +0000)]
sfc: Wait at most 10ms for the MC to finish reading out MAC statistics
The original code would wait indefinitely if MAC stats DMA failed.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Wed, 28 Apr 2010 08:47:22 +0000 (08:47 +0000)]
sctp: Fix oops when sending queued ASCONF chunks
When we finish processing ASCONF_ACK chunk, we try to send
the next queued ASCONF. This action runs the sctp state
machine recursively and it's not prepared to do so.
kernel BUG at kernel/timer.c:790!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/module/ipv6/initstate
Modules linked in: sha256_generic sctp libcrc32c ipv6 dm_multipath
uinput 8139too i2c_piix4 8139cp mii i2c_core pcspkr virtio_net joydev
floppy virtio_blk virtio_pci [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Not tainted 2.6.34-rc4 #15 /Bochs
EIP: 0060:[<
c044a2ef>] EFLAGS:
00010286 CPU: 0
EIP is at add_timer+0xd/0x1b
EAX:
cecbab14 EBX:
000000f0 ECX:
c0957b1c EDX:
03595cf4
ESI:
cecba800 EDI:
cf276f00 EBP:
c0957aa0 ESP:
c0957aa0
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process swapper (pid: 0, ti=
c0956000 task=
c0988ba0 task.ti=
c0956000)
Stack:
c0957ae0 d1851214 c0ab62e4 c0ab5f26 0500ffff 00000004 00000005 00000004
<0>
00000000 d18694fd 00000004 1666b892 cecba800 cecba800 c0957b14
00000004
<0>
c0957b94 d1851b11 ceda8b00 cecba800 cf276f00 00000001 c0957b14
000000d0
Call Trace:
[<
d1851214>] ? sctp_side_effects+0x607/0xdfc [sctp]
[<
d1851b11>] ? sctp_do_sm+0x108/0x159 [sctp]
[<
d1863386>] ? sctp_pname+0x0/0x1d [sctp]
[<
d1861a56>] ? sctp_primitive_ASCONF+0x36/0x3b [sctp]
[<
d185657c>] ? sctp_process_asconf_ack+0x2a4/0x2d3 [sctp]
[<
d184e35c>] ? sctp_sf_do_asconf_ack+0x1dd/0x2b4 [sctp]
[<
d1851ac1>] ? sctp_do_sm+0xb8/0x159 [sctp]
[<
d1863334>] ? sctp_cname+0x0/0x52 [sctp]
[<
d1854377>] ? sctp_assoc_bh_rcv+0xac/0xe1 [sctp]
[<
d1858f0f>] ? sctp_inq_push+0x2d/0x30 [sctp]
[<
d186329d>] ? sctp_rcv+0x797/0x82e [sctp]
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Yuansong Qiao <ysqiao@research.ait.ie>
Signed-off-by: Shuaijun Zhang <szhang@research.ait.ie>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Wed, 28 Apr 2010 08:47:21 +0000 (08:47 +0000)]
sctp: fix to calc the INIT/INIT-ACK chunk length correctly is set
When calculating the INIT/INIT-ACK chunk length, we should not
only account the length of parameters, but also the parameters
zero padding length, such as AUTH HMACS parameter and CHUNKS
parameter. Without the parameters zero padding length we may get
following oops.
skb_over_panic: text:
ce2068d2 len:130 put:6 head:
cac3fe00 data:
cac3fe00 tail:0xcac3fe82 end:0xcac3fe80 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:127!
invalid opcode: 0000 [#2] SMP
last sysfs file: /sys/module/aes_generic/initstate
Modules linked in: authenc ......
Pid: 4102, comm: sctp_darn Tainted: G D 2.6.34-rc2 #6
EIP: 0060:[<
c0607630>] EFLAGS:
00010282 CPU: 0
EIP is at skb_over_panic+0x37/0x3e
EAX:
00000078 EBX:
c07c024b ECX:
c07c02b9 EDX:
cb607b78
ESI:
00000000 EDI:
cac3fe7a EBP:
00000002 ESP:
cb607b74
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process sctp_darn (pid: 4102, ti=
cb607000 task=
cabdc990 task.ti=
cb607000)
Stack:
c07c02b9 ce2068d2 00000082 00000006 cac3fe00 cac3fe00 cac3fe82 cac3fe80
<0>
c07c024b cac3fe7c cac3fe7a c0608dec ca986e80 ce2068d2 00000006 0000007a
<0>
cb8120ca ca986e80 cb812000 00000003 cb8120c4 ce208a25 cb8120ca cadd9400
Call Trace:
[<
ce2068d2>] ? sctp_addto_chunk+0x45/0x85 [sctp]
[<
c0608dec>] ? skb_put+0x2e/0x32
[<
ce2068d2>] ? sctp_addto_chunk+0x45/0x85 [sctp]
[<
ce208a25>] ? sctp_make_init+0x279/0x28c [sctp]
[<
c0686a92>] ? apic_timer_interrupt+0x2a/0x30
[<
ce1fdc0b>] ? sctp_sf_do_prm_asoc+0x2b/0x7b [sctp]
[<
ce202823>] ? sctp_do_sm+0xa0/0x14a [sctp]
[<
ce2133b9>] ? sctp_pname+0x0/0x14 [sctp]
[<
ce211d72>] ? sctp_primitive_ASSOCIATE+0x2b/0x31 [sctp]
[<
ce20f3cf>] ? sctp_sendmsg+0x7a0/0x9eb [sctp]
[<
c064eb1e>] ? inet_sendmsg+0x3b/0x43
[<
c04244b7>] ? task_tick_fair+0x2d/0xd9
[<
c06031e1>] ? sock_sendmsg+0xa7/0xc1
[<
c0416afe>] ? smp_apic_timer_interrupt+0x6b/0x75
[<
c0425123>] ? dequeue_task_fair+0x34/0x19b
[<
c0446abb>] ? sched_clock_local+0x17/0x11e
[<
c052ea87>] ? _copy_from_user+0x2b/0x10c
[<
c060ab3a>] ? verify_iovec+0x3c/0x6a
[<
c06035ca>] ? sys_sendmsg+0x186/0x1e2
[<
c042176b>] ? __wake_up_common+0x34/0x5b
[<
c04240c2>] ? __wake_up+0x2c/0x3b
[<
c057e35c>] ? tty_wakeup+0x43/0x47
[<
c04430f2>] ? remove_wait_queue+0x16/0x24
[<
c0580c94>] ? n_tty_read+0x5b8/0x65e
[<
c042be02>] ? default_wake_function+0x0/0x8
[<
c0604e0e>] ? sys_socketcall+0x17f/0x1cd
[<
c040264c>] ? sysenter_do_call+0x12/0x22
Code: 0f 45 de 53 ff b0 98 00 00 00 ff b0 94 ......
EIP: [<
c0607630>] skb_over_panic+0x37/0x3e SS:ESP 0068:
cb607b74
To reproduce:
# modprobe sctp
# echo 1 > /proc/sys/net/sctp/addip_enable
# echo 1 > /proc/sys/net/sctp/auth_enable
# sctp_test -H 3ffe:501:ffff:100:20c:29ff:fe4d:f37e -P 800 -l
# sctp_darn -H 3ffe:501:ffff:100:20c:29ff:fe4d:f37e -P 900 -h 192.168.0.21 -p 800 -I -s -t
sctp_darn ready to send...
3ffe:501:ffff:100:20c:29ff:fe4d:f37e:900-192.168.0.21:800 Interactive mode> bindx-add=192.168.0.21
3ffe:501:ffff:100:20c:29ff:fe4d:f37e:900-192.168.0.21:800 Interactive mode> bindx-add=192.168.1.21
3ffe:501:ffff:100:20c:29ff:fe4d:f37e:900-192.168.0.21:800 Interactive mode> snd=10
------------------------------------------------------------------
eth0 has addresses: 3ffe:501:ffff:100:20c:29ff:fe4d:f37e and 192.168.0.21
eth1 has addresses: 192.168.1.21
------------------------------------------------------------------
Reported-by: George Cheimonidis <gchimon@gmail.com>
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Wed, 28 Apr 2010 08:47:20 +0000 (08:47 +0000)]
sctp: per_cpu variables should be in bh_disabled section
Since the change of the atomics to percpu variables, we now
have to disable BH in process context when touching percpu variables.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Wed, 28 Apr 2010 08:47:19 +0000 (08:47 +0000)]
sctp: fix potential reference of a freed pointer
When sctp attempts to update an assocition, it removes any
addresses that were not in the updated INITs. However, the loop
may attempt to refrence a transport with address after removing it.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Wed, 28 Apr 2010 08:47:18 +0000 (08:47 +0000)]
sctp: avoid irq lock inversion while call sk->sk_data_ready()
sk->sk_data_ready() of sctp socket can be called from both BH and non-BH
contexts, but the default sk->sk_data_ready(), sock_def_readable(), can
not be used in this case. Therefore, we have to make a new function
sctp_data_ready() to grab sk->sk_data_ready() with BH disabling.
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.33-rc6 #129
---------------------------------------------------------
sctp_darn/1517 just changed the state of lock:
(clock-AF_INET){++.?..}, at: [<
c06aab60>] sock_def_readable+0x20/0x80
but this lock took another, SOFTIRQ-unsafe lock in the past:
(slock-AF_INET){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
1 lock held by sctp_darn/1517:
#0: (sk_lock-AF_INET){+.+.+.}, at: [<
cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 28 Apr 2010 18:25:59 +0000 (11:25 -0700)]
Revert "tcp: bind() fix when many ports are bound"
This reverts two commits:
fda48a0d7a8412cedacda46a9c0bf8ef9cd13559
tcp: bind() fix when many ports are bound
and a follow-on fix for it:
6443bb1fc2050ca2b6585a3fa77f7833b55329ed
ipv6: Fix inet6_csk_bind_conflict()
It causes problems with binding listening sockets when time-wait
sockets from a previous instance still are alive.
It's too late to keep fiddling with this so late in the -rc
series, and we'll deal with it in net-next-2.6 instead.
Signed-off-by: David S. Miller <davem@davemloft.net>
Xiaotian Feng [Thu, 22 Apr 2010 10:56:17 +0000 (18:56 +0800)]
nfs: fix memory leak in nfs_get_sb with CONFIG_NFS_V4
With CONFIG_NFS_V4 and data version 4, nfs_get_sb will allocate memory for
export_path in nfs4_validate_text_mount_data, so we need to free it then.
This is addressed in following kmemleak report:
unreferenced object 0xffff88016bf48a50 (size 16):
comm "mount.nfs", pid 22567, jiffies
4651574704 (age 175471.200s)
hex dump (first 16 bytes):
2f 6f 70 74 2f 77 6f 72 6b 00 6b 6b 6b 6b 6b a5 /opt/work.kkkkk.
backtrace:
[<
ffffffff814b34f9>] kmemleak_alloc+0x60/0xa7
[<
ffffffff81102c76>] kmemleak_alloc_recursive.clone.5+0x1b/0x1d
[<
ffffffff811046b3>] __kmalloc_track_caller+0x18f/0x1b7
[<
ffffffff810e1b08>] kstrndup+0x37/0x54
[<
ffffffffa0336971>] nfs_parse_devname+0x152/0x204 [nfs]
[<
ffffffffa0336af3>] nfs4_validate_text_mount_data+0xd0/0xdc [nfs]
[<
ffffffffa0338deb>] nfs_get_sb+0x325/0x736 [nfs]
[<
ffffffff81113671>] vfs_kern_mount+0xbd/0x17c
[<
ffffffff81113798>] do_kern_mount+0x4d/0xed
[<
ffffffff81129a87>] do_mount+0x787/0x7fe
[<
ffffffff81129b86>] sys_mount+0x88/0xc2
[<
ffffffff81009b42>] system_call_fastpath+0x16/0x1b
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Benny Halevy <bhalevy@panasas.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>