Patrick McHardy [Mon, 14 Apr 2008 09:15:52 +0000 (11:15 +0200)]
[NETFILTER]: nf_conntrack_tcp: catch invalid state updates over ctnetlink
Invalid states can cause out-of-bound memory accesses of the state table.
Also don't insist on having a new state contained in the netlink message.
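A minimal sketch of the kind of check described above, not the actual patch. The state set, the timeout table and every demo_* name are hypothetical; the point is only that a state received from userspace is range-checked before it can index a table, and that a message without a state attribute is accepted unchanged.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical state set; the real conntrack TCP states differ. */
enum demo_state {
    DEMO_NONE,
    DEMO_SYN_SENT,
    DEMO_ESTABLISHED,
    DEMO_CLOSE,
    DEMO_STATE_MAX
};

/* Table indexed by state, as a stand-in for the kernel's state table. */
static const unsigned int demo_timeouts[DEMO_STATE_MAX] = { 0, 120, 432000, 10 };

struct demo_conn {
    unsigned int state;
};

/*
 * has_state models "the netlink message carried a state attribute".
 * A missing attribute is not an error; an out-of-range value is.
 */
static int demo_update_state(struct demo_conn *c, int has_state,
                             unsigned int new_state)
{
    assert(c != NULL);
    if (!has_state)
        return 0;
    if (new_state >= DEMO_STATE_MAX)
        return -EINVAL;
    c->state = new_state;
    return 0;
}

/* Safe only because demo_update_state() never stores an invalid index. */
static unsigned int demo_timeout(const struct demo_conn *c)
{
    return demo_timeouts[c->state];
}
```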
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Mon, 14 Apr 2008 09:15:52 +0000 (11:15 +0200)]
[NETFILTER]: nf_nat: kill helper and seq_adjust hooks
Connection tracking helpers (specifically FTP) need to be called
before NAT sequence numbers adjustments are performed to be able
to compare them against previously seen ones. We've introduced
two new hooks around 2.6.11 to maintain this ordering when NAT
modules were changed to get called from conntrack helpers directly.
However, the cost of netfilter hooks is quite high and sequence number
adjustments are only rarely needed. Add an RCU-protected
sequence number adjustment function pointer and call it from
IPv4 conntrack after calling the helper.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Mon, 14 Apr 2008 09:15:51 +0000 (11:15 +0200)]
[NETFILTER]: nf_conntrack_extend: warn on confirmed conntracks
New extensions may only be added to unconfirmed conntracks to avoid races
when reallocating the storage.
Also change NF_CT_ASSERT to use WARN_ON to get backtraces.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Mon, 14 Apr 2008 09:15:51 +0000 (11:15 +0200)]
[NETFILTER]: nf_nat: don't add NAT extension for confirmed conntracks
Adding extensions to confirmed conntracks is not allowed to avoid races
on reallocation. Don't set up NAT for confirmed conntracks in case the NAT
module is loaded late.
This has one side-effect: the connections existing before the NAT module
was loaded won't enter the bysource hash. The only case where this actually
makes a difference is in case of SNAT to a multirange where the IP before
NAT is also part of the range. Since old connections don't enter the
bysource hash the first new connection from the IP will have a new address
selected. This shouldn't matter at all.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Mon, 14 Apr 2008 09:15:50 +0000 (11:15 +0200)]
[NETFILTER]: nf_nat: remove obsolete check for ICMP redirects
Locally generated ICMP packets have a reference to the conntrack entry
of the original packet manually attached by icmp_send(). Therefore the
check for locally originated untracked ICMP redirects can never be
true.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Mon, 14 Apr 2008 09:15:50 +0000 (11:15 +0200)]
[NETFILTER]: nf_nat: add SCTP protocol support
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Thu, 20 Mar 2008 14:15:57 +0000 (15:15 +0100)]
[NETFILTER]: nf_nat: add DCCP protocol support
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Thu, 20 Mar 2008 14:15:55 +0000 (15:15 +0100)]
[NETFILTER]: nf_conntrack: add DCCP protocol support
Add DCCP conntrack helper. Thanks to Gerrit Renker <gerrit@erg.abdn.ac.uk>
for review and testing.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Thu, 20 Mar 2008 14:15:53 +0000 (15:15 +0100)]
[NETFILTER]: Add partial checksum validation helper
Move the UDP-Lite conntrack checksum validation to a generic helper
similar to nf_checksum() and make it fall back to nf_checksum()
in case the full packet is to be checksummed and hardware checksums
are available. This is to be used by DCCP conntrack, which also
needs to verify partial checksums.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Thu, 20 Mar 2008 14:15:51 +0000 (15:15 +0100)]
[NETFILTER]: nf_nat: add UDP-Lite support
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Thu, 20 Mar 2008 14:15:49 +0000 (15:15 +0100)]
[NETFILTER]: nf_nat: remove unused name from struct nf_nat_protocol
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Mon, 14 Apr 2008 09:15:47 +0000 (11:15 +0200)]
[NETFILTER]: nf_conntrack_netlink: clean up NAT protocol parsing
Move responsibility for setting the IP_NAT_RANGE_PROTO_SPECIFIED flag
to the NAT protocol, properly propagate errors and get rid of ugly
return value convention.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Mon, 14 Apr 2008 09:15:47 +0000 (11:15 +0200)]
[NETFILTER]: nf_nat: move NAT ctnetlink helpers to nf_nat_proto_common
Move to nf_nat_proto_common and rename to nf_nat_proto_... since they're
also used by protocols that don't have port numbers.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Mon, 14 Apr 2008 09:15:46 +0000 (11:15 +0200)]
[NETFILTER]: nf_nat: fix random mode not to overwrite port rover
The port rover should not get overwritten when using random mode,
otherwise other rules will also use more or less random ports.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Thu, 20 Mar 2008 14:15:47 +0000 (15:15 +0100)]
[NETFILTER]: nf_nat: add helpers for common NAT protocol operations
Add generic ->in_range and ->unique_tuple ops to avoid duplicating them
again and again for future NAT modules and save a few bytes of text:
net/ipv4/netfilter/nf_nat_proto_tcp.c:
  tcp_in_range     |  -62 (removed)
  tcp_unique_tuple | -259 # 271 -> 12, # inlines: 1 -> 0, size inlines: 7 -> 0
 2 functions changed, 321 bytes removed
net/ipv4/netfilter/nf_nat_proto_udp.c:
  udp_in_range     |  -62 (removed)
  udp_unique_tuple | -259 # 271 -> 12, # inlines: 1 -> 0, size inlines: 7 -> 0
 2 functions changed, 321 bytes removed
net/ipv4/netfilter/nf_nat_proto_gre.c:
  gre_in_range     |  -62 (removed)
 1 function changed, 62 bytes removed
vmlinux:
 5 functions changed, 704 bytes removed
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Mon, 14 Apr 2008 09:15:45 +0000 (11:15 +0200)]
[NETFILTER]: {ip,ip6,arp}_tables: return EAGAIN for invalid SO_GET_ENTRIES size
Rule dumping is performed in two steps: first userspace gets the
ruleset size using getsockopt(SO_GET_INFO) and allocates memory,
then it calls getsockopt(SO_GET_ENTRIES) to actually dump the
ruleset. When another process changes the ruleset in between, the
size from the first getsockopt call doesn't match anymore and
the kernel aborts. Unfortunately it returns EINVAL, as for multiple
other possible errors, so userspace can't distinguish this case
from real errors.
Return EAGAIN so userspace can retry the operation.
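The retry this enables in userspace could look roughly like the sketch below. It does not call the real getsockopt() interface: demo_get_info() and demo_get_entries() are hypothetical stand-ins that simulate a ruleset being resized by another process between the two steps, so the loop can be exercised without a kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Simulated kernel state (hypothetical, for illustration only). */
static size_t demo_ruleset_size = 64;
static int demo_pending_changes;    /* racing updates still to happen */

/* Models getsockopt(SO_GET_INFO): report the current ruleset size. */
static size_t demo_get_info(void)
{
    size_t reported = demo_ruleset_size;

    if (demo_pending_changes > 0) {
        /* Another process changes the ruleset right after we looked. */
        demo_pending_changes--;
        demo_ruleset_size += 16;
    }
    return reported;
}

/* Models getsockopt(SO_GET_ENTRIES): a stale size yields EAGAIN. */
static int demo_get_entries(void *buf, size_t size)
{
    (void)buf;
    if (size != demo_ruleset_size) {
        errno = EAGAIN;
        return -1;
    }
    return 0;
}

/* Two-step dump that retries only when the size went stale. */
static void *demo_dump_rules(size_t *size_out, int *attempts_out)
{
    int attempts = 0;

    assert(size_out != NULL && attempts_out != NULL);
    for (;;) {
        size_t size = demo_get_info();
        void *buf = malloc(size);

        if (buf == NULL)
            return NULL;
        attempts++;
        if (demo_get_entries(buf, size) == 0) {
            *size_out = size;
            *attempts_out = attempts;
            return buf;
        }
        free(buf);
        if (errno != EAGAIN)
            return NULL;    /* a real error: give up */
    }
}
```

A production loop would probably also cap the number of retries; that is left out here to keep the sketch short.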
Fixes (with current iptables SVN version) netfilter bugzilla #104.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Mon, 14 Apr 2008 09:15:45 +0000 (11:15 +0200)]
[NETFILTER]: nf_conntrack_sip: clear address in parse_addr()
Some callers pass uninitialized structures; clear the address to make
sure later comparisons work properly.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Mon, 14 Apr 2008 09:15:44 +0000 (11:15 +0200)]
[NETFILTER]: Explicitly initialize .priority in arptable_filter
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Mon, 14 Apr 2008 09:15:44 +0000 (11:15 +0200)]
[NETFILTER]: remove arpt_(un)register_target indirection macros
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Mon, 14 Apr 2008 09:15:43 +0000 (11:15 +0200)]
[NETFILTER]: remove arpt_target indirection macro
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Mon, 14 Apr 2008 09:15:43 +0000 (11:15 +0200)]
[NETFILTER]: remove arpt_table indirection macro
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Mon, 14 Apr 2008 09:15:42 +0000 (11:15 +0200)]
[NETFILTER]: annotate rest of nf_nat_* with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Mon, 14 Apr 2008 09:15:42 +0000 (11:15 +0200)]
[NETFILTER]: annotate rest of nf_conntrack_* with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Mon, 14 Apr 2008 09:15:35 +0000 (11:15 +0200)]
[NETFILTER]: annotate {arp,ip,ip6,x}tables with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Mon, 14 Apr 2008 07:56:05 +0000 (09:56 +0200)]
[NETFILTER]: annotate xtables targets with const and remove casts
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Mon, 14 Apr 2008 07:56:04 +0000 (09:56 +0200)]
[NETFILTER]: xt_sctp: simplify xt_sctp.h
The use of xt_sctp.h flagged up -Wshadow warnings in userspace, which
prompted me to look at it and clean it up. Basic operations have been
directly replaced by library calls (memcpy and memset are both available
in the kernel and userspace, and are usually faster than a self-made
loop). The is_set and is_clear functions now use a processing time
shortcut, too.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Robert P. J. Day [Mon, 14 Apr 2008 07:56:03 +0000 (09:56 +0200)]
[NETFILTER]: Use non-deprecated __RW_LOCK_UNLOCKED macro
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Robert P. J. Day [Mon, 14 Apr 2008 07:56:03 +0000 (09:56 +0200)]
[NETFILTER]: bridge netfilter: use non-deprecated __RW_LOCK_UNLOCKED macro.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Mon, 14 Apr 2008 07:56:02 +0000 (09:56 +0200)]
[NETFILTER]: ip_tables: per-netns FILTER/MANGLE/RAW tables for real
Commit
9335f047fe61587ec82ff12fbb1220bcfdd32006 aka
"[NETFILTER]: ip_tables: per-netns FILTER, MANGLE, RAW"
added per-netns _view_ of iptables rules. They were shown to user, but
ignored by filtering code. Now that it's possible to at least ping loopback,
per-netns tables can affect filtering decisions.
netns is taken in case of
PRE_ROUTING, LOCAL_IN -- from in device,
POST_ROUTING, LOCAL_OUT -- from out device,
FORWARD -- from in device which should be equal to out device's netns.
This code is relatively new, so BUG_ON was plugged.
Wrappers were added to a) keep code the same for CONFIG_NET_NS=n users
(overwhelming majority), b) consolidate code in one place -- similar
changes will be done in ipv6 and arp netfilter code.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Thu, 20 Mar 2008 14:15:45 +0000 (15:15 +0100)]
[NETFILTER]: {ip,ip6}t_LOG: print MARK value in log output
Dump the mark value in log messages similar to nfnetlink_log. This
is useful for debugging complex setups where marks are used for
routing or traffic classification.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Thu, 20 Mar 2008 14:15:43 +0000 (15:15 +0100)]
[NETFILTER]: nf_conntrack: less hairy ifdefs around proc and sysctl
Patch splits creation of /proc/net/nf_conntrack, /proc/net/stat/nf_conntrack
and net.netfilter hierarchy into their own functions with dummy ones
if PROC_FS or SYSCTL is not set. Also, remove dead "ret = 0" write
while I'm at it.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Gerrit Renker [Mon, 14 Apr 2008 07:05:28 +0000 (00:05 -0700)]
[SKB]: __skb_queue_tail = __skb_insert before
This expresses __skb_queue_tail() in terms of __skb_insert(),
using __skb_insert_before() as auxiliary function.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 14 Apr 2008 07:05:09 +0000 (00:05 -0700)]
[SKB]: __skb_append = __skb_queue_after
This expresses __skb_append in terms of __skb_queue_after, exploiting that
__skb_append(old, new, list) = __skb_queue_after(list, old, new).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 14 Apr 2008 07:04:51 +0000 (00:04 -0700)]
[SKB]: __skb_queue_after(prev) = __skb_insert(prev, prev->next)
By reordering, __skb_queue_after() is expressed in terms of __skb_insert().
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 14 Apr 2008 07:04:12 +0000 (00:04 -0700)]
[SKB]: __skb_dequeue = skb_peek + __skb_unlink
By rearranging the order of declarations, __skb_dequeue() is expressed in terms of
* skb_peek() and
* __skb_unlink(),
thus in effect mirroring the analogue implementation of __skb_dequeue_tail().
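The layering described in this entry and the three [SKB] entries above it can be sketched on a plain circular doubly linked list. This is an illustration under assumed names (demo_node, demo_list and the demo_* functions are hypothetical), not the code from include/linux/skbuff.h; it only shows how every queue operation reduces to one insert primitive plus peek and unlink.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for sk_buff and sk_buff_head. */
struct demo_node {
    struct demo_node *next, *prev;
    int id;
};

struct demo_list {
    struct demo_node head;  /* sentinel: an empty list points at itself */
    unsigned int qlen;
};

static void demo_list_init(struct demo_list *l)
{
    l->head.next = l->head.prev = &l->head;
    l->qlen = 0;
}

/* The one primitive: link newn between two adjacent nodes. */
static void demo_insert(struct demo_node *newn, struct demo_node *prev,
                        struct demo_node *next, struct demo_list *l)
{
    assert(prev->next == next && next->prev == prev);
    newn->next = next;
    newn->prev = prev;
    next->prev = newn;
    prev->next = newn;
    l->qlen++;
}

/* queue_after(prev) = insert(prev, prev->next) */
static void demo_queue_after(struct demo_list *l, struct demo_node *prev,
                             struct demo_node *newn)
{
    demo_insert(newn, prev, prev->next, l);
}

static void demo_queue_before(struct demo_list *l, struct demo_node *next,
                              struct demo_node *newn)
{
    demo_insert(newn, next->prev, next, l);
}

/* append(old, new, list) = queue_after(list, old, new) */
static void demo_append(struct demo_node *old, struct demo_node *newn,
                        struct demo_list *l)
{
    demo_queue_after(l, old, newn);
}

/* Queueing at the tail is inserting before the sentinel head. */
static void demo_queue_tail(struct demo_list *l, struct demo_node *newn)
{
    demo_queue_before(l, &l->head, newn);
}

static struct demo_node *demo_peek(struct demo_list *l)
{
    return l->head.next == &l->head ? NULL : l->head.next;
}

static void demo_unlink(struct demo_node *n, struct demo_list *l)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = NULL;
    l->qlen--;
}

/* dequeue = peek + unlink */
static struct demo_node *demo_dequeue(struct demo_list *l)
{
    struct demo_node *n = demo_peek(l);

    if (n != NULL)
        demo_unlink(n, l);
    return n;
}
```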
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Mon, 14 Apr 2008 06:59:13 +0000 (23:59 -0700)]
[IPV6] MROUTE: Add stats in multicast routing module method ip6_mr_forward().
This patch adds a call to increment IPSTATS_MIB_OUTFORWDATAGRAMS
when forwarding the packet in ip6_mr_forward() in the IPv6 multicast
routing module (net/ipv6/ip6mr.c).
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Mon, 14 Apr 2008 06:30:47 +0000 (23:30 -0700)]
[NET]: Sink IPv6 menuoptions into its own submenu
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki [Mon, 14 Apr 2008 06:21:52 +0000 (23:21 -0700)]
[IPV6]: Share common code-paths for sticky socket options.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki [Mon, 14 Apr 2008 06:21:16 +0000 (23:21 -0700)]
[IPV6] MROUTE: Do not call ipv6_find_idev() directly.
Since NETDEV_REGISTER notifier chain is responsible for creating
inet6_dev{}, we do not need to call ipv6_find_idev() directly here.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:33:06 +0000 (22:33 -0700)]
[NETNS][DCCPV6]: Make per-net socket lookup.
The inet6_lookup family of functions requires a net to lookup
a socket in, so give a proper one to them.
No more things to do for dccpv6, since routing is OK and the
ipv4-like transport layer filtering is not done for ipv6.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:32:45 +0000 (22:32 -0700)]
[NETNS][DCCPV6]: Actually create ctl socket on each net and use it.
Move the call to inet_ctl_sock_create to init callback (and
inet_ctl_sock_destroy to exit one) and use proper ctl sock
in dccp_v6_ctl_send_reset.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:32:25 +0000 (22:32 -0700)]
[NETNS][DCCPV6]: Move the dccp_v6_ctl_sk on the struct net.
And replace all its usage with init_net's socket.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:32:02 +0000 (22:32 -0700)]
[NETNS][DCCPV6]: Add dummy per-net operations.
They will be responsible for ctl socket initialization, but
currently they are void.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:31:32 +0000 (22:31 -0700)]
[NETNS][DCCPV6]: Don't pass NULL to ip6_dst_lookup.
This call uses the sock to get the net to lookup the routing
in. With CONFIG_NET_NS this code will OOPS, since the sk ptr
is NULL.
After looking inside the ip6_dst_lookup and drawing the analogy
with respective ipv6 code, it seems, that the dccp ctl socket
is a good candidate for the first argument.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:31:05 +0000 (22:31 -0700)]
[NETNS][DCCPV4]: Enable DCCPv4 in net namespaces.
This enables sockets creation with IPPROTO_DCCP and enables
the ip level to pass DCCP packets to the DCCP level.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:30:43 +0000 (22:30 -0700)]
[NETNS][DCCPV4]: Make per-net socket lookup.
The inet_lookup family of functions requires a net to lookup
a socket in, so give a proper one to them.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:30:19 +0000 (22:30 -0700)]
[NETNS][DCCPV4]: Use proper net to route the reset packet.
The dccp_v4_route_skb used in dccp_v4_ctl_send_reset, currently
works with init_net's routing tables - fix it.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:29:59 +0000 (22:29 -0700)]
[NETNS][DCCPV4]: Actually create ctl socket on each net and use it.
Move the call to inet_ctl_sock_create to init callback (and
inet_ctl_sock_destroy to exit one) and use proper ctl sock
in dccp_v4_ctl_send_reset.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:29:37 +0000 (22:29 -0700)]
[NETNS][DCCPV4]: Move the dccp_v4_ctl_sk on the struct net.
And replace all its usage with init_net's socket.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:29:13 +0000 (22:29 -0700)]
[NETNS][DCCPV4]: Add dummy per-net operations.
They will be responsible for ctl socket initialization, but
currently they are void.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Mon, 14 Apr 2008 05:28:42 +0000 (22:28 -0700)]
[NETNS]: Add an empty netns_dccp structure on struct net.
According to the overall struct net design, it will be
filled with DCCP-related members.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Mon, 14 Apr 2008 05:13:53 +0000 (22:13 -0700)]
[TCP]: Remove owner from tcp_seq_afinfo.
Move it to tcp_seq_afinfo->seq_fops as should be.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Mon, 14 Apr 2008 05:13:30 +0000 (22:13 -0700)]
[TCP]: Place file operations directly into tcp_seq_afinfo.
No need to have separate never-used variable.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Mon, 14 Apr 2008 05:12:41 +0000 (22:12 -0700)]
[TCP]: Cleanup /proc/tcp[6] creation/removal.
Replace seq_open with seq_open_net and remove tcp_seq_release
completely. seq_release_net will do this job just fine.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Mon, 14 Apr 2008 05:12:13 +0000 (22:12 -0700)]
[TCP]: Move seq_ops from tcp_iter_state to tcp_seq_afinfo.
No need to create seq_operations for each instance of 'netstat'.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Mon, 14 Apr 2008 05:11:46 +0000 (22:11 -0700)]
[TCP]: No need to check afinfo != NULL in tcp_proc_(un)register.
tcp_proc_register/tcp_proc_unregister are called with a static pointer only.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Mon, 14 Apr 2008 05:11:14 +0000 (22:11 -0700)]
[TCP]: Replace struct net on tcp_iter_state with seq_net_private.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denys Vlasenko [Mon, 14 Apr 2008 04:54:34 +0000 (21:54 -0700)]
[ATM] drivers/atm/horizon.c: stop inlining largish static functions
drivers/atm/horizon.c has an unusually large number
of static inline functions - 36.
I looked through them. Most of them seem to be small enough,
but a few are big, and others use udelay or busy loops,
and as such are better not inlined.
This patch removes "inline" from these static functions
(regardless of number of callsites - gcc nowadays auto-inlines
statics with one callsite).
Size difference for 32bit x86:
   text    data     bss     dec     hex filename
   8201     180       6    8387    20c3 linux-2.6-ALLYES/drivers/atm/horizon.o
   7840     180       6    8026    1f5a linux-2.6.inline-ALLYES/drivers/atm/horizon.o
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 14 Apr 2008 04:50:08 +0000 (21:50 -0700)]
[INET]: sk_reuse is valbool
sk_reuse is declared as "unsigned char", but is set as type valbool in net/core/sock.c.
There is no other place in net/ where sk->sk_reuse is set to a value > 1, so the test
"sk_reuse > 1" can not be true.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
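For the sk_reuse entry above, a small sketch of why the field can only ever hold 0 or 1. The structure and function names are hypothetical; the normalisation itself mirrors the "valbool" idiom the message refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical socket with a one-byte reuse flag. */
struct demo_sock {
    unsigned char reuse;
};

/* Collapse any integer option value to 0 or 1. */
static int demo_valbool(int val)
{
    return val ? 1 : 0;
}

/* The only writer of ->reuse stores a normalised value. */
static void demo_set_reuse(struct demo_sock *sk, int val)
{
    assert(sk != NULL);
    sk->reuse = (unsigned char)demo_valbool(val);
}
```

With a setter like this as the single place of assignment, a later test for reuse > 1 is dead code.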
Allan Stephens [Mon, 14 Apr 2008 04:35:11 +0000 (21:35 -0700)]
[TIPC]: Improve socket time conversions
This patch modifies TIPC's socket code to use standard kernel
routines to handle time conversions between jiffies and ms.
This ensures proper operation even when HZ isn't 1000.
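To illustrate why treating jiffies as milliseconds breaks when HZ isn't 1000, here is a simplified userspace sketch. DEMO_HZ and the demo_* helpers are hypothetical and only handle HZ values that divide 1000 evenly; the kernel's own routines (presumably msecs_to_jiffies() and jiffies_to_msecs()) cover the general case.

```c
#include <assert.h>

/* Assumed tick rate for the example; many kernels are not built with 1000. */
#define DEMO_HZ 250

#if 1000 % DEMO_HZ != 0
#error "this simplified sketch needs DEMO_HZ to divide 1000"
#endif

/* Milliseconds to ticks, rounding up so a timeout never ends early. */
static unsigned long demo_msecs_to_jiffies(unsigned int ms)
{
    const unsigned int ms_per_tick = 1000 / DEMO_HZ;

    return (ms + ms_per_tick - 1) / ms_per_tick;
}

/* Ticks back to milliseconds. */
static unsigned int demo_jiffies_to_msecs(unsigned long j)
{
    return (unsigned int)(j * (1000 / DEMO_HZ));
}
```

With DEMO_HZ at 250, a 3000 ms timeout is 750 ticks. Code that passed 3000 straight through as a tick count would wait four times too long.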
Acknowledgements to Eric Sesterhenn <snakebyte@gmx.de> for
identifying this issue and proposing a solution.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allan Stephens [Mon, 14 Apr 2008 04:33:17 +0000 (21:33 -0700)]
[TIPC]: Remove redundant socket wait queue initialization
This patch eliminates re-initialization of the standard socket
wait queue used for sleeping in TIPC's socket creation code.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 13 Apr 2008 02:19:46 +0000 (19:19 -0700)]
Merge branch 'net-2.6.26-misc-20080412b' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev
Paul Moore [Sun, 13 Apr 2008 02:07:52 +0000 (19:07 -0700)]
LSM: Make the Labeled IPsec hooks more stack friendly
The xfrm_get_policy() and xfrm_add_pol_expire() put some rather large structs
on the stack to work around the LSM API. This patch attempts to fix that
problem by changing the LSM API to require only the relevant "security"
pointers instead of the entire SPD entry; we do this for all of the
security_xfrm_policy*() functions to keep things consistent.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Moore [Sun, 13 Apr 2008 02:06:42 +0000 (19:06 -0700)]
NetLabel: Allow passing the LSM domain as a shared pointer
Unlike SELinux, Smack doesn't need to create a private copy of the LSM
"domain" when setting NetLabel security attributes; however, the current
NetLabel code requires a private copy of the LSM "domain". This patch fixes
that by letting the LSM determine how it wants to pass the domain value.
* NETLBL_SECATTR_DOMAIN_CPY
The current behavior, NetLabel assumes that the domain value is a copy and
frees it when done
* NETLBL_SECATTR_DOMAIN
New, Smack-friendly behavior, NetLabel assumes that the domain value is a
reference to a string managed by the LSM and does not free it when done
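The ownership difference between the two flags can be sketched as follows. This is a userspace illustration with hypothetical DEMO_* flags and demo_* helpers; in particular, encoding "copy" as "domain present" plus a separate "free it" bit is an assumption made for the example, not something the message states.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_SECATTR_DOMAIN      0x1u   /* domain pointer is valid */
#define DEMO_SECATTR_FREE_DOMAIN 0x2u   /* attribute owns the string */
#define DEMO_SECATTR_DOMAIN_CPY  (DEMO_SECATTR_DOMAIN | DEMO_SECATTR_FREE_DOMAIN)

struct demo_secattr {
    unsigned int flags;
    char *domain;
};

static int demo_frees;  /* counts how often destroy really freed a string */

/* Copy style: the attribute gets its own string and frees it later. */
static int demo_set_domain_copy(struct demo_secattr *a, const char *d)
{
    size_t n = strlen(d) + 1;
    char *copy = malloc(n);

    if (copy == NULL)
        return -1;
    memcpy(copy, d, n);
    a->domain = copy;
    a->flags = DEMO_SECATTR_DOMAIN_CPY;
    return 0;
}

/* Shared style: the attribute borrows a string the LSM keeps managing. */
static void demo_set_domain_shared(struct demo_secattr *a, char *d)
{
    a->domain = d;
    a->flags = DEMO_SECATTR_DOMAIN;
}

/* Frees the domain only when the attribute owns it. */
static void demo_secattr_destroy(struct demo_secattr *a)
{
    assert(a != NULL);
    if (a->flags & DEMO_SECATTR_FREE_DOMAIN) {
        free(a->domain);
        demo_frees++;
    }
    a->domain = NULL;
    a->flags = 0;
}
```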
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Sun, 13 Apr 2008 02:04:38 +0000 (19:04 -0700)]
[AF_UNIX]: Use SEQ_START_TOKEN
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sun, 13 Apr 2008 01:55:42 +0000 (18:55 -0700)]
MAINTAINERS: New sctp mailing list
Add a new sctp mailing list linux-sctp@vger.kernel.org.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gui Jianfeng [Sun, 13 Apr 2008 01:55:12 +0000 (18:55 -0700)]
[SCTP]: Remove an unused parameter from sctp_cmd_hb_timer_update
The 'asoc' parameter to sctp_cmd_hb_timer_update() is unused, and
we can remove it.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert P. J. Day [Sun, 13 Apr 2008 01:54:24 +0000 (18:54 -0700)]
[SCTP]: "list_for_each()" -> "list_for_each_entry()" where appropriate.
Replacing (almost) all invocations of list_for_each() with
list_for_each_entry() tightens up the code and allows for the deletion
of numerous list iterator variables that are no longer necessary.
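A before/after sketch of this kind of conversion, written against a tiny self-made list so it compiles in userspace. The demo_* macros are hypothetical simplifications of the ones in include/linux/list.h; unlike the kernel's list_for_each_entry(), the entry variant here takes the element type explicitly to stay within standard C.

```c
#include <assert.h>
#include <stddef.h>

struct demo_list_head {
    struct demo_list_head *next, *prev;
};

#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define demo_list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

#define demo_list_for_each_entry(pos, head, type, member) \
    for (pos = demo_container_of((head)->next, type, member); \
         &pos->member != (head); \
         pos = demo_container_of(pos->member.next, type, member))

/* Hypothetical list element. */
struct demo_chunk {
    int len;
    struct demo_list_head list;
};

static void demo_list_add_tail(struct demo_list_head *n,
                               struct demo_list_head *head)
{
    assert(n != NULL && head != NULL);
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Old style: a separate iterator plus a manual conversion per element. */
static int demo_total_old(struct demo_list_head *head)
{
    struct demo_list_head *pos;
    struct demo_chunk *c;
    int total = 0;

    demo_list_for_each(pos, head) {
        c = demo_container_of(pos, struct demo_chunk, list);
        total += c->len;
    }
    return total;
}

/* New style: the iterator variable disappears. */
static int demo_total_new(struct demo_list_head *head)
{
    struct demo_chunk *c;
    int total = 0;

    demo_list_for_each_entry(c, head, struct demo_chunk, list)
        total += c->len;
    return total;
}
```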
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Sun, 13 Apr 2008 01:53:48 +0000 (18:53 -0700)]
[SCTP]: Correct /proc/net/assocs formatting error
Recently I posted a patch to add some informational items to
/proc/net/sctp/assocs. All the information is correct, but because
of how the seqfile show operation is laid out, some of the formatting
is backwards. This patch corrects that formatting, so that the new
information appears at the end of each line, rather than in the middle.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki [Fri, 11 Apr 2008 14:51:26 +0000 (23:51 +0900)]
[IPV6]: Fix IPV6_RECVERR for connected raw sockets.
Based on patch from Dmitry Butskoy <buc@odusz.so-cdu.ru>.
Closes: 10437
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Brian Haley [Fri, 11 Apr 2008 04:38:24 +0000 (00:38 -0400)]
[IPv6]: Change IPv6 unspecified destination address to ::1 for raw and un-connected sockets
This patch fixes a difference between IPv4 and IPv6 when sending packets
to the unspecified address (either 0.0.0.0 or ::) when using raw or
un-connected UDP sockets. There are two cases where IPv6 either fails
to send anything, or sends with the destination address set to ::. For
example:
--> ping -c1 0.0.0.0
PING 0.0.0.0 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.032 ms
--> ping6 -c1 ::
PING ::(::) 56 data bytes
ping: sendmsg: Invalid argument
Doing a sendto("0.0.0.0") reveals:
10:55:01.495090 IP localhost.32780 > localhost.7639: UDP, length 100
Doing a sendto("::") reveals:
10:56:13.262478 IP6 fe80::217:8ff:fe7d:4718.32779 > ::.7639: UDP, length 100
If you issue a connect() first in the UDP case, it will be sent to ::1,
similar to what happens with TCP.
This restores the BSD-ism.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Rami Rosen [Thu, 10 Apr 2008 09:40:10 +0000 (12:40 +0300)]
[IPV6] MROUTE: Adjust IPV6 multicast routing module to use mroute6 header declarations.
- This patch adjusts IPv6 multicast routing module, net/ipv6/ip6mr.c,
to use mroute6 header definitions instead of mroute.
(MFC6_LINES instead of MFC_LINES, MAXMIFS instead of MAXVIFS, mifi_t
instead of vifi_t.)
- In addition, inclusion of some headers was removed as it is not needed.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Sat, 12 Apr 2008 03:59:42 +0000 (12:59 +0900)]
[IPV6]: Check length of int/boolean optval provided by user in setsockopt().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Wang Chen [Mon, 7 Apr 2008 01:42:07 +0000 (09:42 +0800)]
[IPV6]: Check length of optval provided by user in setsockopt().
Check the length of setsockopt's optval, which is provided by the user,
before copying it from user space.
For POSIX compliance, return -EINVAL for setsockopt of short lengths.
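The check amounts to validating optlen before touching optval. A minimal userspace sketch, assuming an int-valued option; demo_setsockopt_int() is a hypothetical helper and memcpy() stands in for the kernel's copy from user space.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Store an int option, rejecting buffers too short to hold one. */
static int demo_setsockopt_int(int *dst, const void *optval,
                               unsigned int optlen)
{
    int val;

    assert(dst != NULL && optval != NULL);
    if (optlen < sizeof(int))
        return -EINVAL;     /* short length: refuse before copying */
    memcpy(&val, optval, sizeof(val));
    *dst = val;
    return 0;
}
```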
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 06:42:12 +0000 (15:42 +0900)]
[IPV6] MIP6: Use our standard definitions for paddings.
MIP6_OPT_PAD_X are actually for paddings in destination
option header. Replace them with our standard IPV6_TLV_PADX.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 06:42:11 +0000 (15:42 +0900)]
[IPV6]: Use in6addr_any where appropriate.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 06:42:11 +0000 (15:42 +0900)]
[IPV6]: Define constants for link-local multicast addresses.
- Define link-local all-node / all-router multicast addresses.
- Remove ipv6_addr_all_nodes() and ipv6_addr_all_routers().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 06:42:10 +0000 (15:42 +0900)]
[IPV6]: Make address arguments const.
- net/ipv6/addrconf.c:
ipv6_get_ifaddr(), ipv6_dev_get_saddr()
- net/ipv6/mcast.c:
ipv6_sock_mc_join(), ipv6_sock_mc_drop(),
inet6_mc_check(),
ipv6_dev_mc_inc(), __ipv6_dev_mc_dec(), ipv6_dev_mc_dec(),
ipv6_chk_mcast_addr()
- net/ipv6/route.c:
rt6_lookup(), icmp6_dst_alloc()
- net/ipv6/ip6_output.c:
ip6_nd_hdr()
- net/ipv6/ndisc.c:
ndisc_send_ns(), ndisc_send_rs(), ndisc_send_redirect(),
ndisc_get_neigh(), __ndisc_send()
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 06:42:09 +0000 (15:42 +0900)]
[IPV6] ADDRCONF: Uninline ipv6_isatap_eui64().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 06:42:08 +0000 (15:42 +0900)]
[IPV6] ADDRCONF: Uninline ipv6_addr_hash().
The function is only used in net/ipv6/addrconf.c.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Fri, 11 Apr 2008 11:17:55 +0000 (20:17 +0900)]
[IPV6]: Use XOR and OR rather than multiple ands for ipv6 address comparisons.
ipv6_addr_equal(), ipv6_addr_v4mapped(),
ipv6_addr_is_ll_all_{nodes,routers}(),
ipv6_masked_addr_cmp()
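A sketch of the technique on a 128-bit address held as four 32-bit words. The struct and the demo_* functions are hypothetical and not the kernel's definitions; they only contrast a chain of && comparisons with a single XOR/OR expression tested against zero once.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Hypothetical address type: four words in network byte order. */
struct demo_in6_addr {
    uint32_t w[4];
};

/* Baseline: up to four compares and branches. */
static int demo_equal_ands(const struct demo_in6_addr *a,
                           const struct demo_in6_addr *b)
{
    return a->w[0] == b->w[0] && a->w[1] == b->w[1] &&
           a->w[2] == b->w[2] && a->w[3] == b->w[3];
}

/* XOR each word pair, OR the results, test against zero once. */
static int demo_equal_xor(const struct demo_in6_addr *a,
                          const struct demo_in6_addr *b)
{
    return ((a->w[0] ^ b->w[0]) | (a->w[1] ^ b->w[1]) |
            (a->w[2] ^ b->w[2]) | (a->w[3] ^ b->w[3])) == 0;
}

/* ::ffff:a.b.c.d has two zero words followed by 0x0000ffff. */
static int demo_v4mapped(const struct demo_in6_addr *a)
{
    return (a->w[0] | a->w[1] | (a->w[2] ^ htonl(0x0000ffff))) == 0;
}
```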
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 06:42:07 +0000 (15:42 +0900)]
[IPV6]: Use ipv6_addr_equal() instead of !ipv6_addr_cmp().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 06:41:28 +0000 (15:41 +0900)]
[IPV6] FIB_RULE: Sparse: fib6_rules_cleanup() is of void.
| net/ipv6/fib6_rules.c:319:2: warning: returning void-valued expression
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 06:41:28 +0000 (15:41 +0900)]
[IPV6]: Sparse: Reuse previous declaration where appropriate.
| net/ipv6/ipv6_sockglue.c:162:16: warning: symbol 'net' shadows an earlier one
| net/ipv6/ipv6_sockglue.c:111:13: originally declared here
| net/ipv6/ipv6_sockglue.c:175:16: warning: symbol 'net' shadows an earlier one
| net/ipv6/ipv6_sockglue.c:111:13: originally declared here
| net/ipv6/ip6mr.c:1241:10: warning: symbol 'ret' shadows an earlier one
| net/ipv6/ip6mr.c:1163:6: originally declared here
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 06:41:27 +0000 (15:41 +0900)]
[IPV6] SIT: Sparse: Use NULL pointer instead of 0.
| net/ipv6/sit.c:382:42: warning: Using plain integer as NULL pointer
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 06:41:26 +0000 (15:41 +0900)]
[IPV6]: Kill several warnings without CONFIG_IPV6_MROUTE.
Pointed out by Andrew Morton <akpm@linux-foundation.org>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Stephen Hemminger [Thu, 10 Apr 2008 11:00:28 +0000 (04:00 -0700)]
IPV4: use xor rather than multiple ands for route compare
The comparison in ip_route_input is a hot path. By recoding the C
"and" as bit operations, fewer conditional branches get generated,
so the code should be faster. Maybe someday GCC will be smart
enough to do this?
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
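Editorial sketch of the route-key comparison described in the commit above, applied to fields of different widths. The struct route_key layout, the field names and the "oif must be zero" condition are assumptions made for illustration; they are not copied from ip_route_input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cached-route key; the real flow key has more fields. */
struct route_key {
	uint32_t dst;
	uint32_t src;
	int iif;
	int oif;
	uint8_t tos;
};

/* Branchy form: up to five separate compare-and-jump sequences. */
static int key_match_ands(const struct route_key *k, uint32_t daddr,
			  uint32_t saddr, int iif, uint8_t tos)
{
	return k->dst == daddr && k->src == saddr &&
	       k->iif == iif && k->oif == 0 && k->tos == tos;
}

/* Bitwise form: every difference is folded into one word, so a single
 * test against zero decides the match. A field that must itself be
 * zero (oif here) is simply OR'ed in unchanged. */
static int key_match_xor(const struct route_key *k, uint32_t daddr,
			 uint32_t saddr, int iif, uint8_t tos)
{
	return ((k->dst ^ daddr) |
		(k->src ^ saddr) |
		(uint32_t)(k->iif ^ iif) |
		(uint32_t)k->oif |
		(uint32_t)(k->tos ^ tos)) == 0;
}

/* Sanity check: both forms must give the same answer. */
static void key_match_check(const struct route_key *k, uint32_t daddr,
			    uint32_t saddr, int iif, uint8_t tos)
{
	assert(key_match_ands(k, daddr, saddr, iif, tos) ==
	       key_match_xor(k, daddr, saddr, iif, tos));
}
```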
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 10:50:13 +0000 (03:50 -0700)]
[SCTP]: Use snmp_mib_{init,free}().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki [Thu, 10 Apr 2008 10:48:43 +0000 (03:48 -0700)]
[DCCP]: Use snmp_mib_{init,free}().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Thu, 10 Apr 2008 10:47:34 +0000 (03:47 -0700)]
ipv4: fib_trie leaf free optimization
Avoid an unneeded test in the case where the object to be freed
has to be a leaf. There is no need to use the generic tnode_free()
function; instead just set up the leaf to be freed.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Thu, 10 Apr 2008 10:46:12 +0000 (03:46 -0700)]
ipv4: fib_trie remove unused argument
The trie pointer is passed down to flush_list and flush_leaf
but never used.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 10 Apr 2008 10:33:03 +0000 (03:33 -0700)]
[ATM]: Use SEQ_START_TOKEN
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Thu, 10 Apr 2008 10:12:40 +0000 (03:12 -0700)]
[Syncookies]: Add support for TCP options via timestamps.
Allow the use of SACK and window scaling when syncookies are used
and the client supports tcp timestamps. Options are encoded into
the timestamp sent in the syn-ack and restored from the timestamp
echo when the ack is received.
Based on earlier work by Glenn Griffin.
This patch avoids increasing the size of structs by encoding TCP
options into the least significant bits of the timestamp and
by not using any 'timestamp offset'.
The downside is that the timestamp sent in the packet after the synack
will increase by several seconds.
changes since v1:
don't duplicate timestamp echo decoding function, put it into ipv4/syncookie.c
and have ipv6/syncookies.c use it.
Feedback from Glenn Griffin: fix line indented with spaces, kill redundant if ()
Reviewed-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
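Editorial sketch of the idea in the syncookie commit above: pack TCP options into the low bits of the timestamp sent in the syn-ack, then recover them from the timestamp echo. The bit layout (4 bits of window scale, 1 bit for SACK), the 0xf "no window scaling" sentinel and all names are assumptions chosen for illustration; the patch's actual layout is not given in the message.

```c
#include <assert.h>
#include <stdint.h>

#define TS_OPT_BITS	5u			/* assumed: 4 wscale + 1 SACK */
#define TS_OPT_MASK	((1u << TS_OPT_BITS) - 1u)
#define TS_WSCALE_MASK	0xfu			/* all ones = no window scaling */
#define TS_SACK_BIT	(1u << 4)

/* Hypothetical option set negotiated in the SYN. */
struct tcp_opts {
	int wscale_ok;
	unsigned int snd_wscale;	/* valid range 0..14 */
	int sack_ok;
};

/* Build the timestamp for the syn-ack from the current clock value. */
static uint32_t ts_encode(uint32_t now, const struct tcp_opts *o)
{
	uint32_t options = TS_WSCALE_MASK;
	uint32_t ts;

	if (o->wscale_ok) {
		assert(o->snd_wscale <= 14);
		options = o->snd_wscale;
	}
	if (o->sack_ok)
		options |= TS_SACK_BIT;

	/* Replace the low bits of the clock with the option bits. */
	ts = (now & ~TS_OPT_MASK) | options;

	/* Never send a timestamp from the future: step back one slot.
	 * The subtraction wraps modulo 2^32, as uint32_t arithmetic does. */
	if (ts > now)
		ts -= 1u << TS_OPT_BITS;
	return ts;
}

/* Recover the options from the timestamp echoed in the final ACK. */
static void ts_decode(uint32_t tsecr, struct tcp_opts *o)
{
	uint32_t options = tsecr & TS_OPT_MASK;

	o->sack_ok = (options & TS_SACK_BIT) != 0;
	o->wscale_ok = (options & TS_WSCALE_MASK) != TS_WSCALE_MASK;
	o->snd_wscale = o->wscale_ok ? (options & TS_WSCALE_MASK) : 0;
}
```

Because the encoded value lags the real clock, the next timestamp sent from the real clock appears to jump forward. With more option bits the jump grows, which is consistent with the downside the message mentions.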
Stephen Hemminger [Thu, 10 Apr 2008 09:56:38 +0000 (02:56 -0700)]
IPV4: fib_trie use vmalloc for large tnodes
Use vmalloc rather than alloc_pages to avoid wasting memory.
The problem is that tnode structure has a power of 2 sized array,
plus a header. So the current code wastes almost half the memory
allocated because it always needs the next bigger size to hold
that small header.
This is similar to an earlier patch by Eric, but instead of a list
and lock, I used a workqueue to handle the fact that vfree can't
be done in interrupt context.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
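Editorial sketch of the arithmetic behind the fib_trie commit above: a power-of-2 child array plus a small header lands just past a power-of-2 size, so an allocator that hands out power-of-2 page blocks rounds it up to nearly double. The 4096-byte page, 8-byte child pointer and 32-byte header are assumed values, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_BYTES		4096u	/* assumed page size */
#define CHILD_PTR_BYTES		8u	/* assumed pointer size */
#define TNODE_HEADER_BYTES	32u	/* hypothetical header size */

/* Bytes needed by a tnode with 2^bits children. */
static size_t tnode_bytes(unsigned int bits)
{
	assert(bits < 32);
	return TNODE_HEADER_BYTES + ((size_t)CHILD_PTR_BYTES << bits);
}

/* Page-block allocator: size is a power-of-2 number of pages. */
static size_t pow2_pages_bytes(size_t bytes)
{
	size_t sz = PAGE_BYTES;

	while (sz < bytes)
		sz <<= 1;
	return sz;
}

/* Page-granular allocator: size is rounded up to whole pages only. */
static size_t whole_pages_bytes(size_t bytes)
{
	return (bytes + PAGE_BYTES - 1) / PAGE_BYTES * PAGE_BYTES;
}
```

Under these assumptions a tnode with 16 bits needs 524320 bytes. The power-of-2 rounding takes 1048576 bytes, while whole-page rounding takes 528384 bytes.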
Rami Rosen [Thu, 10 Apr 2008 09:31:20 +0000 (02:31 -0700)]
[IPV6]: Remove unused declarations in include/net/ip6_route.h.
1) Standalone ip6_null_entry is no longer needed as it is replaced by
the ip6_null_entry member of ipv6 (instance of struct netns_ipv6) in
struct net (as a result of Network Namespaces patches).
2) These 3 methods from this same header are not defined anywhere:
ip6_rt_addr_add(), ip6_rt_addr_del(), rt6_sndmsg()
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cornelia Huck [Thu, 10 Apr 2008 09:12:45 +0000 (02:12 -0700)]
iucv: Delay bus registration until core is ready.
If we register the iucv bus after the infrastructure is ready,
userspace can start relying on it when it receives the uevent
for the bus.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiko Carstens [Thu, 10 Apr 2008 09:12:03 +0000 (02:12 -0700)]
iucv: get rid of in_atomic() use.
This BUG_ON is not needed, since all (debug) checks are also done
in smp_call_function() which gets called by this function.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert P. J. Day [Thu, 10 Apr 2008 09:11:24 +0000 (02:11 -0700)]
af_iucv: Use non-deprecated __RW_LOCK_UNLOCKED macro.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 10 Apr 2008 09:02:28 +0000 (02:02 -0700)]
[SKFILTER]: Add SKF_ADF_NLATTR instruction
SKF_AD_NLATTR searches for a netlink attribute, which avoids manually
parsing and walking attributes. It takes the offset at which to start
searching in the 'A' register and the attribute type in the 'X' register
and returns the offset in the 'A' register. When the attribute is not
found it returns zero.
A top-level attribute can be located using a filter like this
(example for nfnetlink, using struct nfgenmsg):
    ...
    {
        /* A = offset of first attribute */
        .code = BPF_LD | BPF_IMM,
        .k = sizeof(struct nlmsghdr) + sizeof(struct nfgenmsg)
    },
    {
        /* X = CTA_PROTOINFO */
        .code = BPF_LDX | BPF_IMM,
        .k = CTA_PROTOINFO,
    },
    {
        /* A = netlink attribute offset */
        .code = BPF_LD | BPF_B | BPF_ABS,
        .k = SKF_AD_OFF + SKF_AD_NLATTR
    },
    {
        /* Exit if not found */
        .code = BPF_JMP | BPF_JEQ | BPF_K,
        .k = 0,
        .jt = <error>
    },
    ...
A nested attribute below the CTA_PROTOINFO attribute would then
be parsed like this:
    ...
    {
        /* A += sizeof(struct nlattr) */
        .code = BPF_ALU | BPF_ADD | BPF_K,
        .k = sizeof(struct nlattr),
    },
    {
        /* X = CTA_PROTOINFO_TCP */
        .code = BPF_LDX | BPF_IMM,
        .k = CTA_PROTOINFO_TCP,
    },
    {
        /* A = netlink attribute offset */
        .code = BPF_LD | BPF_B | BPF_ABS,
        .k = SKF_AD_OFF + SKF_AD_NLATTR
    },
    ...
The data of an attribute can be loaded into 'A' like this:
    ...
    {
        /* X = A (attribute offset) */
        .code = BPF_MISC | BPF_TAX,
    },
    {
        /* A = skb->data[X + k] */
        .code = BPF_LD | BPF_B | BPF_IND,
        .k = sizeof(struct nlattr),
    },
    ...
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
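Editorial sketch of what the lookup described in the commit above computes, modelled in plain userspace C: start at an offset (the 'A' register), walk the attributes, and return the offset of the first one whose type matches (the 'X' register), or zero if none is found. struct attr_hdr, find_attr() and put_attr() are hypothetical names; the header mirrors the two 16-bit fields of struct nlattr and assumes 4-byte attribute alignment. This is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of struct nlattr: length (header included), type. */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

/* Attributes are padded to a 4-byte boundary. */
static size_t attr_align(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Return the offset of the first attribute of the given type at or
 * after 'start', or 0 if there is none or the data is malformed. */
static size_t find_attr(const unsigned char *buf, size_t buflen,
			size_t start, uint16_t type)
{
	size_t off = start;

	while (off <= buflen && buflen - off >= sizeof(struct attr_hdr)) {
		struct attr_hdr h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > buflen - off)
			break;			/* malformed attribute */
		if (h.type == type)
			return off;
		off += attr_align(h.len);
	}
	return 0;
}

/* Test helper: append one attribute at 'off', return the next offset.
 * The caller must provide a zeroed buffer that is large enough. */
static size_t put_attr(unsigned char *buf, size_t off, uint16_t type,
		       const void *data, uint16_t dlen)
{
	struct attr_hdr h;

	assert(dlen <= UINT16_MAX - sizeof(h));
	h.len = (uint16_t)(sizeof(h) + dlen);
	h.type = type;
	memcpy(buf + off, &h, sizeof(h));
	if (dlen)
		memcpy(buf + off + sizeof(h), data, dlen);
	return off + attr_align(h.len);
}
```

Zero works as the "not found" value because a netlink message begins with its message header, so no attribute can start at offset 0. The payload of a found attribute begins sizeof(struct attr_hdr) bytes after the returned offset, matching the last filter fragment in the message.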
Rami Rosen [Thu, 10 Apr 2008 09:01:21 +0000 (02:01 -0700)]
[IPV6] Remove three method declarations in include/net/ndisc.h.
This patch removes two unused method declarations in
include/net/ndisc.h: ndisc_forwarding_on(void) and
ndisc_forwarding_off(void).
Also, igmp6_cleanup(void) is declared twice in this header, so one
of the two igmp6_cleanup(void) declarations is removed.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>