In the current code, dev_map_free() can still race with dev_map_notification().
In dev_map_free(), we remove dtab from the list of dtabs after we purged
all entries from it. However, we don't do xchg() with NULL or the like,
so the entry at that point is still pointing to the device. If an unregister
notification comes in at the same time, we therefore risk a double-free,
since the pointer is still present in the map and is then pushed again to
__dev_map_entry_free().
All this is completely unnecessary. Just remove the dtab from the list
right before the synchronize_rcu(), so that all outstanding readers from
the notifier list have finished by then. We then no longer need to deal
with this corner case, nor do we need to nullify dev entries. This is
fine because we iterate over the map releasing all entries, and therefore
all dev references, anyway.
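The ordering argument above can be illustrated with a small userspace
model. This is only a sketch, not kernel code: every model_* name is
hypothetical, the on_list flag stands in for list membership plus the
synchronize_rcu() grace period, and the concurrent notifier is modelled
as a call that lands right after the purge loop. Releases are counted
instead of freed so that the double release is observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_ENTRIES 4

/* Hypothetical stand-ins for the kernel structures, not the real layout. */
struct model_dev {
	int ifindex;
	int release_count;	/* how often this entry was "freed" */
};

struct model_dtab {
	struct model_dev *entries[MODEL_MAX_ENTRIES];
	bool on_list;		/* models membership in the list of dtabs */
};

/* Models releasing an entry: counts the release instead of freeing. */
static void model_entry_free(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->release_count++;
}

/* Models the unregister notifier: it only sees dtabs still on the list. */
static void model_notifier(struct model_dtab *dtab, int ifindex)
{
	if (!dtab->on_list)
		return;
	for (size_t i = 0; i < MODEL_MAX_ENTRIES; i++) {
		struct model_dev *dev = dtab->entries[i];

		if (dev && dev->ifindex == ifindex) {
			dtab->entries[i] = NULL;
			model_entry_free(dev);
		}
	}
}

/* Returns how often the single entry was released during teardown. */
static int model_teardown(bool unlink_first)
{
	struct model_dev dev = { .ifindex = 7, .release_count = 0 };
	struct model_dtab dtab = { .entries = { &dev }, .on_list = true };

	if (unlink_first)
		dtab.on_list = false;	/* new order: unlink before purging */

	/* Purge loop: releases entries but leaves the pointers in place. */
	for (size_t i = 0; i < MODEL_MAX_ENTRIES; i++) {
		if (dtab.entries[i])
			model_entry_free(dtab.entries[i]);
	}

	/* An unregister notification arrives during teardown. */
	model_notifier(&dtab, 7);

	if (!unlink_first)
		dtab.on_list = false;	/* old order: unlink after purging */

	return dev.release_count;
}
```

In this model, model_teardown(false) returns 2 (the old order releases
the entry twice), while model_teardown(true) returns 1, because the
notifier no longer finds the dtab on the list.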
Fixes: 4cc7b9544b9a ("bpf: devmap fix mutex in rcu critical section")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* no further reads against netdev_map. It does __not__ ensure pending
* flush operations (if any) are complete.
*/
+
+ spin_lock(&dev_map_lock);
+ list_del_rcu(&dtab->list);
+ spin_unlock(&dev_map_lock);
+
synchronize_rcu();
/* To ensure all pending flush operations have completed wait for flush
cpu_relax();
}
- /* Although we should no longer have datapath or bpf syscall operations
- * at this point we we can still race with netdev notifier, hence the
- * lock.
- */
for (i = 0; i < dtab->map.max_entries; i++) {
struct bpf_dtab_netdev *dev;
/* At this point bpf program is detached and all pending operations
* _must_ be complete
*/
- spin_lock(&dev_map_lock);
- list_del_rcu(&dtab->list);
- spin_unlock(&dev_map_lock);
free_percpu(dtab->flush_needed);
bpf_map_area_free(dtab->netdev_map);
kfree(dtab);