netfilter: br_netfilter: Drop dst references before setting.
authorHarsh Modi <harshmodi@google.com>
Wed, 31 Aug 2022 05:36:03 +0000 (22:36 -0700)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 15 Sep 2022 10:39:46 +0000 (12:39 +0200)
[ Upstream commit d047283a7034140ea5da759a494fd2274affdd46 ]

The IPv6 path already drops dst in the daddr changed case, but the IPv4
path does not. This change makes the two code paths consistent.

Further, it is possible that there is already a metadata_dst allocated from
ingress that might already be attached to skbuff->dst while following
the bridge path. If it is not released before setting a new
metadata_dst, it will be leaked. This is similar to what is done in
bpf_set_tunnel_key() or ip6_route_input().

It is important to note that the memory being leaked is not the dst
being set in the bridge code, but rather memory allocated from some
other code path that is not being freed correctly before the skb dst is
overwritten.

An example of the leakage fixed by this commit found using kmemleak:

unreferenced object 0xffff888010112b00 (size 256):
  comm "softirq", pid 0, jiffies 4294762496 (age 32.012s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 80 16 f1 83 ff ff ff ff  ................
    e1 4e f6 82 ff ff ff ff 00 00 00 00 00 00 00 00  .N..............
  backtrace:
    [<00000000d79567ea>] metadata_dst_alloc+0x1b/0xe0
    [<00000000be113e13>] udp_tun_rx_dst+0x174/0x1f0
    [<00000000a36848f4>] geneve_udp_encap_recv+0x350/0x7b0
    [<00000000d4afb476>] udp_queue_rcv_one_skb+0x380/0x560
    [<00000000ac064aea>] udp_unicast_rcv_skb+0x75/0x90
    [<000000009a8ee8c5>] ip_protocol_deliver_rcu+0xd8/0x230
    [<00000000ef4980bb>] ip_local_deliver_finish+0x7a/0xa0
    [<00000000d7533c8c>] __netif_receive_skb_one_core+0x89/0xa0
    [<00000000a879497d>] process_backlog+0x93/0x190
    [<00000000e41ade9f>] __napi_poll+0x28/0x170
    [<00000000b4c0906b>] net_rx_action+0x14f/0x2a0
    [<00000000b20dd5d4>] __do_softirq+0xf4/0x305
    [<000000003a7d7e15>] __irq_exit_rcu+0xc3/0x140
    [<00000000968d39a2>] sysvec_apic_timer_interrupt+0x9e/0xc0
    [<000000009e920794>] asm_sysvec_apic_timer_interrupt+0x16/0x20
    [<000000008942add0>] native_safe_halt+0x13/0x20

Florian Westphal says: "Original code was likely fine because nothing
ever did set a skb->dst entry earlier than bridge in those days."

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Harsh Modi <harshmodi@google.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
net/bridge/br_netfilter_hooks.c
net/bridge/br_netfilter_ipv6.c

index 11d4d18012fed9a40e2dbf87594a2817edade794..560a36c68be203201cfb7e46e23686df96523e5d 100644 (file)
@@ -382,6 +382,7 @@ static int br_nf_pre_routing_finish(struct net *net, struct sock *sk, struct sk_
                                /* - Bridged-and-DNAT'ed traffic doesn't
                                 *   require ip_forwarding. */
                                if (rt->dst.dev == dev) {
+                                       skb_dst_drop(skb);
                                        skb_dst_set(skb, &rt->dst);
                                        goto bridged_dnat;
                                }
@@ -411,6 +412,7 @@ bridged_dnat:
                        kfree_skb(skb);
                        return 0;
                }
+               skb_dst_drop(skb);
                skb_dst_set_noref(skb, &rt->dst);
        }
 
index 8c08dd07419f6d2a60669d6e26b5cbf5303eff2f..54dcead37dfd06335d5111a39ad242e485c808c7 100644 (file)
@@ -201,6 +201,7 @@ static int br_nf_pre_routing_finish_ipv6(struct net *net, struct sock *sk, struc
                        kfree_skb(skb);
                        return 0;
                }
+               skb_dst_drop(skb);
                skb_dst_set_noref(skb, &rt->dst);
        }