While working on a different issue, I noticed an annoying use
after free bug on my machine when unloading the ixgbe driver:
[ 8642.318797] ixgbe 0000:02:00.1: removed PHC on p2p2
[ 8642.742716] ixgbe 0000:02:00.1: complete
[ 8642.743784] BUG: unable to handle kernel paging request at
ffff8807d3740a90
[ 8642.744828] IP: [<
ffffffffa01c77dc>] ixgbe_remove+0xfc/0x1b0 [ixgbe]
[ 8642.745886] PGD
20c6067 PUD
81c1f6067 PMD
81c15a067 PTE
80000007d3740060
[ 8642.746956] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[ 8642.748039] Modules linked in: [...]
[ 8642.752929] CPU: 1 PID: 1225 Comm: rmmod Not tainted 3.18.0-rc2+ #49
[ 8642.754203] Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 1.1b 11/01/2013
[ 8642.755505] task:
ffff8807e34d3fe0 ti:
ffff8807b7204000 task.ti:
ffff8807b7204000
[ 8642.756831] RIP: 0010:[<
ffffffffa01c77dc>] [<
ffffffffa01c77dc>] ixgbe_remove+0xfc/0x1b0 [ixgbe]
[...]
[ 8642.774335] Stack:
[ 8642.775805]
ffff8807ee824098 ffff8807ee824098 ffffffffa01f3000 ffff8807ee824000
[ 8642.777326]
ffff8807b7207e18 ffffffff8137720f ffff8807ee824098 ffff8807ee824098
[ 8642.778848]
ffffffffa01f3068 ffff8807ee8240f8 ffff8807b7207e38 ffffffff8144180f
[ 8642.780365] Call Trace:
[ 8642.781869] [<
ffffffff8137720f>] pci_device_remove+0x3f/0xc0
[ 8642.783395] [<
ffffffff8144180f>] __device_release_driver+0x7f/0xf0
[ 8642.784876] [<
ffffffff814421f8>] driver_detach+0xb8/0xc0
[ 8642.786352] [<
ffffffff814414a9>] bus_remove_driver+0x59/0xe0
[ 8642.787783] [<
ffffffff814429d0>] driver_unregister+0x30/0x70
[ 8642.789202] [<
ffffffff81375c65>] pci_unregister_driver+0x25/0xa0
[ 8642.790657] [<
ffffffffa01eb38e>] ixgbe_exit_module+0x1c/0xc8e [ixgbe]
[ 8642.792064] [<
ffffffff810f93a2>] SyS_delete_module+0x132/0x1c0
[ 8642.793450] [<
ffffffff81012c61>] ? do_notify_resume+0x61/0xa0
[ 8642.794837] [<
ffffffff816d2029>] system_call_fastpath+0x12/0x17
The issue is that test_and_set_bit() done on adapter->state is being
performed *after* the netdevice has been freed via free_netdev().
When netdev is being allocated on initialization time, it allocates
a private area, here struct ixgbe_adapter, that resides after the
net_device structure. In ixgbe_probe(), the device init routine,
we set up the adapter after alloc_etherdev_mq() on the private area
and add a reference for the pci_dev as well via pci_set_drvdata().
Both in the error path of ixgbe_probe(), but also on module unload
when ixgbe_remove() is being called, commit
41c62843eb6a ("ixgbe:
Fix rcu warnings induced by LER") accesses adapter after free_netdev().
The patch stores the result in a bool and thus fixes above oops on my
side.
Fixes:
41c62843eb6a ("ixgbe: Fix rcu warnings induced by LER")
Cc: stable <stable@vger.kernel.org>
Cc: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
int i, err, pci_using_dac, expected_gts;
unsigned int indices = MAX_TX_QUEUES;
u8 part_str[IXGBE_PBANUM_LENGTH];
+ bool disable_dev = false;
#ifdef IXGBE_FCOE
u16 device_caps;
#endif
iounmap(adapter->io_addr);
kfree(adapter->mac_table);
err_ioremap:
+ disable_dev = !test_and_set_bit(__IXGBE_DISABLED, &adapter->state);
free_netdev(netdev);
err_alloc_etherdev:
pci_release_selected_regions(pdev,
pci_select_bars(pdev, IORESOURCE_MEM));
err_pci_reg:
err_dma:
- if (!adapter || !test_and_set_bit(__IXGBE_DISABLED, &adapter->state))
+ if (!adapter || disable_dev)
pci_disable_device(pdev);
return err;
}
{
struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
struct net_device *netdev = adapter->netdev;
+ bool disable_dev;
ixgbe_dbg_adapter_exit(adapter);
e_dev_info("complete\n");
kfree(adapter->mac_table);
+ disable_dev = !test_and_set_bit(__IXGBE_DISABLED, &adapter->state);
free_netdev(netdev);
pci_disable_pcie_error_reporting(pdev);
- if (!test_and_set_bit(__IXGBE_DISABLED, &adapter->state))
+ if (disable_dev)
pci_disable_device(pdev);
}