Xiaolong Ye reported the following failure on Broadwell D server:
EDAC sbridge: Some needed devices are missing
EDAC MC: Removed device 0 for sbridge_edac.c Broadwell SrcID#0_Ha#0: DEV 0000:ff:12.0
EDAC sbridge: Couldn't find mci handler
EDAC sbridge: Failed to register device with error -19.
Broadwell D (only IMC0 per socket) and Broadwell X (IMC0 and IMC1 per
socket) use the same PCI device IDs for IMC0 per socket, then they
share pci_dev_descr_broadwell_table (n_imcs_per_sock=2). In this case,
Broadwell D wrongly creates the nonexistent SOCK EDAC memory controller
and reports above error messages, since it has no IMC1 per socket.
Avoid creating the nonexistent SOCK memory controller.
Reported-and-tested-by: Xiaolong Ye <xiaolong.ye@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170608113351.25323-1-qiuxu.zhuo@intel.com
[ Massage. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
next_imc:
sbridge_dev = get_sbridge_dev(bus, dev_descr->dom, multi_bus, sbridge_dev);
if (!sbridge_dev) {
+
+ if (dev_descr->dom == SOCK)
+ goto out_imc;
+
sbridge_dev = alloc_sbridge_dev(bus, dev_descr->dom, table);
if (!sbridge_dev) {
pci_dev_put(pdev);
if (dev_descr->dom == SOCK && i < table->n_imcs_per_sock)
goto next_imc;
+out_imc:
/* Be sure that the device is enabled */
if (unlikely(pci_enable_device(pdev) < 0)) {
sbridge_printk(KERN_ERR,