Patrick McHardy [Thu, 12 Jul 2007 02:42:31 +0000 (19:42 -0700)]
[RTNETLINK]: rtnl_link: allow specifying initial device address
Drivers need to validate the initial addresses in their netlink attribute
validation function or manually reject them if they can't support this.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 12 Jul 2007 02:42:13 +0000 (19:42 -0700)]
[RTNETLINK]: rtnl_link API simplification
All drivers need to unregister their devices in the module unload function.
While doing so they must hold the rtnl and atomically unregister the
rtnl_link ops as well. This makes the rtnl_link_unregister function that
takes the rtnl itself completely useless.
Provide default newlink/dellink functions, make __rtnl_link_unregister and
rtnl_link_unregister unregister all devices with matching rtnl_link_ops and
change the existing users to take advantage of that.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 12 Jul 2007 02:45:24 +0000 (19:45 -0700)]
[VLAN]: Fix MAC address handling
The VLAN MAC address handling is broken in multiple ways. When the address
being set differs from the real device's address, the real device is put in
promiscuous mode twice, but never taken out again. Additionally it doesn't
resync when the real device's address is changed and needlessly puts it in
promiscuous mode when the vlan device is still down.
Fix by moving address handling to vlan_dev_open/vlan_dev_stop and properly
deal with address changes in the device notifier. Also switch to
dev_unicast_add (which needs the exact same handling).
Since the set_mac_address handler is identical to the generic ethernet one
with these changes, kill it and use ether_setup().
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 12 Jul 2007 02:41:18 +0000 (19:41 -0700)]
[ETH]: Validate address in eth_mac_addr
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2007 02:37:40 +0000 (19:37 -0700)]
Merge /linux/kernel/git/holtmann/bluetooth-2.6
Olaf Kirch [Thu, 12 Jul 2007 02:32:02 +0000 (19:32 -0700)]
[NET]: Fix races in net_rx_action vs netpoll.
Keep netpoll/poll_napi from messing with the poll_list.
Only net_rx_action is allowed to manipulate the list.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Miklos Szeredi [Wed, 11 Jul 2007 21:22:39 +0000 (14:22 -0700)]
[AF_UNIX]: Rewrite garbage collector, fixes race.
Throw out the old mark & sweep garbage collector and put in a
refcounting cycle detecting one.
The old one had a race with recvmsg that resulted in false positives
and hence data loss. The old algorithm operated on all unix sockets
in the system, so any additional locking would have meant performance
problems for all users of these.
The new algorithm instead only operates on "in flight" sockets, which
are very rare, and the additional locking for these doesn't negatively
impact the vast majority of users.
In fact it's probable that there weren't *any* heavy senders of
sockets over sockets, otherwise the above race would have been
discovered long ago.
The patch works OK with the app that exposed the race with the old
code. The garbage collection has also been verified to work in a few
simple cases.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Wed, 11 Jul 2007 06:24:52 +0000 (23:24 -0700)]
[NETFILTER]: {ip, nf}_conntrack_sctp: fix remotely triggerable NULL ptr dereference (CVE-2007-2876)
When creating a new connection by sending an unknown chunk type, we
don't transition to a valid state, causing a NULL pointer dereference
in sctp_packet when accessing sctp_timeouts[SCTP_CONNTRACK_NONE].
Fix by not creating a new conntrack entry if the initial state is invalid.
Noticed by Vilmos Nebehaj <vilmos.nebehaj@ramsys.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Philippe De Muyter [Wed, 11 Jul 2007 06:07:31 +0000 (23:07 -0700)]
[NET]: Make all initialized struct seq_operations const.
Make all initialized struct seq_operations in net/ const
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Wed, 11 Jul 2007 06:06:43 +0000 (23:06 -0700)]
[UDP]: Fix length check.
Rémi Denis-Courmont wrote:
> Right. By the way, shouldn't "len" rather be signed in there?
>
>         unsigned int len;
>
>         /* if we're overly short, let UDP handle it */
>         len = skb->len - sizeof(struct udphdr);
>         if (len <= 0)
>                 goto udp;
It should, but the < 0 case can't happen since __udp4_lib_rcv
already makes sure that we have at least a complete UDP header.
Anyways, this patch fixes it.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
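The check quoted in the [UDP] entry above can never see a negative value, because the subtraction is carried out in unsigned arithmetic and a short packet wraps around to a very large length. A minimal userspace sketch of that effect follows; `HDR_LEN` and both function names are hypothetical stand-ins chosen for illustration, not the kernel code or the actual fix.

```c
#include <assert.h>

/* Hypothetical stand-in for sizeof(struct udphdr): a UDP header is 8 bytes. */
#define HDR_LEN 8u

/* Mirrors the quoted check: with an unsigned result, "len <= 0" is only
 * true when len == 0; a shorter packet wraps around to a huge value. */
int too_short_unsigned(unsigned int skb_len)
{
    unsigned int len = skb_len - HDR_LEN;
    return len <= 0;
}

/* Comparing before subtracting avoids the wraparound entirely. */
int too_short_checked(unsigned int skb_len)
{
    return skb_len <= HDR_LEN;
}
```

As the message notes, the caller already guarantees a complete header, so the wrapped case is unreachable in practice; the sketch only shows why the comparison reads misleadingly.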
Micah Gruber [Wed, 11 Jul 2007 06:04:19 +0000 (23:04 -0700)]
[IPV6]: Remove unneeded pointer idev from addrconf_cleanup().
This trivial patch removes the unneeded pointer idev returned from
__in6_dev_get(), which is never used. The check for NULL can be simply
done by if (__in6_dev_get(dev) == NULL).
Signed-off-by: Micah Gruber <micah.gruber@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Wed, 11 Jul 2007 06:02:12 +0000 (23:02 -0700)]
[DECNET]: Another unnecessary net/tcp.h inclusion in net/dn.h
No longer needed.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki [Wed, 23 May 2007 04:28:48 +0000 (13:28 +0900)]
[IPV6]: Make IPV6_{RECV,2292}RTHDR boolean options.
Because reversing RH0 is no longer supported now that RH0 is deprecated,
let's make IPV6_{RECV,2292}RTHDR boolean options.
Booleans are more appropriate from a standards point of view.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki [Wed, 11 Jul 2007 05:55:49 +0000 (22:55 -0700)]
[IPV6]: Do not send RH0 anymore.
Based on <draft-ietf-ipv6-deprecate-rh0-00.txt>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki [Wed, 11 Jul 2007 05:47:58 +0000 (22:47 -0700)]
[IPV6]: Restore semantics of Routing Header processing.
The "fix" for the emerging security threat was overkill and it broke the
basic semantics of IPv6 routing header processing. We should treat
RT0 (or even RT2, depending on configuration) as an "unknown" RH type so
that we
- silently ignore the routing header if segleft == 0
- send an ICMPv6 Parameter Problem message back to the sender,
  otherwise.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
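The two-way rule described in the routing header entry above can be sketched as a single decision function. This is an illustrative userspace fragment only: the enum and the function name are hypothetical and do not come from the kernel sources.

```c
#include <assert.h>

/* Hypothetical outcome codes for this sketch; not kernel identifiers. */
enum rh_action {
    RH_IGNORE,          /* skip the header and continue processing */
    RH_PARAM_PROBLEM    /* answer with ICMPv6 Parameter Problem */
};

/* Treatment of a routing header whose type is not processed:
 * only the Segments Left field decides the outcome. */
enum rh_action unknown_rh_action(unsigned int segments_left)
{
    return segments_left == 0 ? RH_IGNORE : RH_PARAM_PROBLEM;
}
```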
Ranjit Manomohan [Wed, 11 Jul 2007 05:43:16 +0000 (22:43 -0700)]
[NET_SCHED]: Make HTB scheduler work with TSO.
Currently the HTB scheduler does not correctly account for TSO packets
which causes large inaccuracies in the bandwidth control when using TSO.
This patch allows the HTB scheduler to work with TSO enabled devices.
Signed-off-by: Ranjit Manomohan <ranjitm@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 11 Jul 2007 05:41:55 +0000 (22:41 -0700)]
[NET]: Update comments for skb checksums
Rusty (whose comments we should all study and emulate :) pointed
out that our comments for skb checksums are no longer up-to-date.
So here is a patch to
1) add the case of partial checksums on input;
2) update partial checksum case to mention csum_start/csum_offset;
3) mention the new IPv6 feature bit.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcel Holtmann [Wed, 11 Jul 2007 07:51:55 +0000 (09:51 +0200)]
[Bluetooth] Add basics to better support and handle eSCO links
To better support and handle eSCO links in the future, a bunch of
constants need to be added and some basic routines need to be
updated. This is the initial step.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Patrick McHardy [Mon, 9 Jul 2007 22:33:40 +0000 (15:33 -0700)]
[NET]: Avoid copying writable clones in tunnel drivers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Philippe De Muyter [Mon, 9 Jul 2007 22:32:57 +0000 (15:32 -0700)]
[IPV4]: Make ip_tos2prio const.
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Mon, 9 Jul 2007 22:30:19 +0000 (15:30 -0700)]
[NET]: Fix gen_estimator timer removal race
As noticed by Jarek Poplawski <jarkao2@o2.pl>, the timer removal in
gen_kill_estimator races with the timer function rearming the timer.
Check whether the timer list is empty before rearming the timer
in the timer function to fix this.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Satyam Sharma [Mon, 9 Jul 2007 22:22:23 +0000 (15:22 -0700)]
[NETPOLL]: Fix a leak-n-bug in netpoll_cleanup()
93ec2c723e3f8a216dde2899aeb85c648672bc6b applied excessive duct tape to
the netpoll beast's netpoll_cleanup(), thus substituting one leak with
another, and opening up a little buglet :-)
net_device->npinfo (netpoll_info) is a shared and refcounted object and
cannot simply be set NULL the first time netpoll_cleanup() is called.
Otherwise, further netpoll_cleanup()'s see np->dev->npinfo == NULL and
become no-ops, thus leaking. And it's a bug too: the first call to
netpoll_cleanup() would thus (annoyingly) "disable" other (still alive)
netpolls too. Maybe nobody noticed this because netconsole (only user
of netpoll) never supported multiple netpoll objects earlier.
This is a trivial and obvious one-line fixlet.
Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
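The point of the [NETPOLL] entry above, that a shared refcounted object must only be cleared when its last user goes away, can be sketched in a few lines. The structure and function names below are hypothetical stand-ins for `netpoll_info` and `net_device`, and the sketch is single-threaded: it ignores the locking and atomic counters the real code would need.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for netpoll_info and net_device. */
struct shared_info { int refcnt; };
struct device { struct shared_info *info; };

/* Attach one more user to the device's shared info, allocating on first use.
 * Returns 0 on success, -1 if allocation fails. */
int info_get(struct device *dev)
{
    if (dev->info == NULL) {
        dev->info = calloc(1, sizeof(*dev->info));
        if (dev->info == NULL)
            return -1;
    }
    dev->info->refcnt++;
    return 0;
}

/* Detach one user. The pointer is cleared and the object freed only when
 * the last reference goes away, so remaining users keep working. */
void info_put(struct device *dev)
{
    if (dev->info == NULL)
        return;
    if (--dev->info->refcnt == 0) {
        free(dev->info);
        dev->info = NULL;
    }
}
```

Clearing the pointer unconditionally in `info_put` would reproduce the bug described above: the second user's cleanup would see NULL and become a no-op.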
Robert P. J. Day [Mon, 9 Jul 2007 20:20:54 +0000 (13:20 -0700)]
[RXRPC]: Remove Makefile reference to obsolete RXRPC config variable
Since there is no Kconfig variable RXRPC anywhere in the tree, and the
variable AF_RXRPC performs exactly the same function, remove the
reference to CONFIG_RXRPC from net/Makefile.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Aloni [Mon, 9 Jul 2007 20:20:12 +0000 (13:20 -0700)]
[NETFILTER] net/ipv4/netfilter/ip_tables.c: lower printk severity
Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adrian Bunk [Mon, 9 Jul 2007 20:18:57 +0000 (13:18 -0700)]
[DCCP]: Make struct dccp_li_cachep static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthias Kaehlcke [Mon, 9 Jul 2007 20:18:12 +0000 (13:18 -0700)]
[IRDA]: use mutex instead of semaphore in VLSI 82C147 IrDA controller driver
The VLSI 82C147 IrDA controller driver uses a semaphore as mutex. Use the
mutex API instead of the (binary) semaphore.
Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Mon, 9 Jul 2007 20:16:00 +0000 (13:16 -0700)]
[NET]: "wrong timeout value in sk_wait_data()": cleanups
- save 4 bytes
- it's read-mostly.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelianov [Mon, 9 Jul 2007 20:15:14 +0000 (13:15 -0700)]
[NET]: Make some network-related proc files use seq_list_xxx helpers
This includes /proc/net/protocols, /proc/net/rxrpc_calls and
/proc/net/rxrpc_connections files.
All three need seq_list_start_head to show some header.
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelianov [Mon, 9 Jul 2007 20:12:24 +0000 (13:12 -0700)]
[ATM] br2684: Use seq_list_xxx helpers
The .show callback receives the list_head pointer now, not the struct
br2684_dev one.
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Sun, 8 Jul 2007 06:03:44 +0000 (23:03 -0700)]
[NET]: netdevice mtu assumptions documentation
Document the expectations about device MTU handling.
The documentation about oversize packet handling is probably too
loose.
IMHO devices should drop oversize packets for robustness,
but many devices allow them now. For example, if you set mtu to 1200
bytes, most ether devices will allow a 1500 byte frame in.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Sun, 8 Jul 2007 05:59:14 +0000 (22:59 -0700)]
[NET]: netdevice locking assumptions documentation
Update the documentation about locking assumptions.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Sun, 8 Jul 2007 05:54:56 +0000 (22:54 -0700)]
[BNX2]: Seems to not need net/tcp.h
Got bored of always recompiling it for no reason.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 8 Jul 2007 05:52:37 +0000 (22:52 -0700)]
[BNX2]: Update version to 1.6.2.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 8 Jul 2007 05:52:02 +0000 (22:52 -0700)]
[BNX2]: Print management firmware version.
Add management firmware version for ethtool -i.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 8 Jul 2007 05:51:36 +0000 (22:51 -0700)]
[BNX2]: Enhance the heartbeat.
In addition to the periodic heartbeat, we're adding a heartbeat
request interrupt when the heartbeat is late. This is needed during
netpoll where the timer is not available. -rt kernels will also
benefit since the timer is not as accurate.
[ We discussed this patch last time and we decided that the -rt
kernel problem alone did not justify this patch. I think the
netpoll problem makes this patch necessary. ]
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 8 Jul 2007 05:51:03 +0000 (22:51 -0700)]
[BNX2]: Reduce spurious INTA interrupts.
Spurious interrupts are often encountered especially on systems
using the 8259 PIC mode. This is because the I/O write to deassert
the interrupt is posted and won't get to the chip immediately. As
a result, the IRQ may remain asserted after the IRQ handler exits,
causing spurious interrupts.
Add read back to flush the I/O write to deassert the IRQ immediately.
We also store the last_status_idx immediately in the IRQ handler to
help detect whether the interrupt is ours or not when the IRQ is
entered again before ->poll gets called.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 8 Jul 2007 05:50:37 +0000 (22:50 -0700)]
[BNX2]: Modify link up message.
Modify the link up dmesg to report remote copper or Serdes link.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 8 Jul 2007 05:50:15 +0000 (22:50 -0700)]
[BNX2]: Add ethtool support for remote PHY.
Modify the driver's ethtool_ops->get_settings and set_settings
functions to support remote PHY. Users control the remote copper
PHY settings by specifying link settings for the tp (twisted pair)
port.
The nway_reset function is also modified to support remote PHY.
mii-tool operations are not supported on remote PHY and we will
return -EOPNOTSUPP.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 8 Jul 2007 05:49:43 +0000 (22:49 -0700)]
[BNX2]: Add support for remote PHY.
In blade servers, the Serdes PHY in 5708S can control the remote
copper PHY through autonegotiation on the backplane. This patch adds
the logic to interface with the firmware to control the remote PHY
autonegotiation and to handle remote PHY link events.
When remote PHY is present, the 5708S Serdes device practically
becomes a copper device with full control over the 1000Base-T
link settings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 8 Jul 2007 05:48:31 +0000 (22:48 -0700)]
[BNX2]: Add remote PHY bit definitions.
Add new fields in struct bnx2 and other bit definitions in shared
memory to support remote PHY.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 8 Jul 2007 05:48:00 +0000 (22:48 -0700)]
[BNX2]: Add bnx2_set_default_link().
Put existing code to set up the default link settings in this new
function. This makes it easier to support the remote PHY feature in
the next few patches.
Also change ETHTOOL_ALL_FIBRE_SPEED to include 2500Mbps if supported.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Balazs Scheidler [Sun, 8 Jul 2007 05:41:01 +0000 (22:41 -0700)]
[NETFILTER]: x_tables: add more detail to error message about match/target mask mismatch
Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:40:26 +0000 (22:40 -0700)]
[NETFILTER]: nf_queue: Use RCU and mutex for queue handlers
Queue handlers are registered/unregistered only in process context.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:40:08 +0000 (22:40 -0700)]
[NETFILTER]: nfnetlink_queue: don't unregister handler of other subsystem
The queue handlers registered by ip[6]_queue.ko at initialization should
not be unregistered at the request of a userland program
using nfnetlink_queue. If we allow that, there is no way to register
the handlers of built-in ip[6]_queue again.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:39:38 +0000 (22:39 -0700)]
[NETFILTER]: Convert DEBUGP to pr_debug
Convert DEBUGP to pr_debug and fix lots of non-compiling debug statements.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:39:16 +0000 (22:39 -0700)]
[NETFILTER]: xt_helper: use RCU
The ->helper pointer is protected by RCU, no need to take
nf_conntrack_lock. Also remove excessive debugging.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:38:54 +0000 (22:38 -0700)]
[NETFILTER]: nf_conntrack_h323: turn some printks into DEBUGPs
Don't spam the ringbuffer with decoding errors. The only printks remaining
are for dropped packets when we're certain they are H.323.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:38:30 +0000 (22:38 -0700)]
[NETFILTER]: ipt_CLUSTERIP: add compat code
Adjust structure size and don't expect pointers passed in from
userspace to be valid. Also replace an enum in an ABI structure
by a fixed size type.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:38:07 +0000 (22:38 -0700)]
[NETFILTER]: ipt_SAME: add to feature-removal-schedule
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:37:38 +0000 (22:37 -0700)]
[NETFILTER]: nf_conntrack: early_drop improvement
When the maximum number of conntrack entries is reached and a new
one needs to be allocated, conntrack tries to drop an unassured
connection from the same hash bucket the new conntrack would hash
to. Since with a properly sized hash the average number of entries
per bucket is 1, the chances of actually finding one are not very
good. This patch makes it walk the hash until at least
8 entries have been checked.
Based on patch by Vasily Averin <vvs@sw.ru>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
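The bucket walk described in the early_drop entry above can be sketched as follows. This is a simplified userspace model, not the conntrack code: the table layout, the constants `NBUCKETS` and `MAX_CHAIN`, and the function name are all assumptions made for the example; only the threshold of 8 comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS    4   /* hypothetical tiny table for the sketch */
#define MAX_CHAIN   4   /* hypothetical fixed chain capacity */
#define MIN_CHECKED 8   /* minimum number of entries to look at */

struct entry { int id; int assured; };
struct bucket { struct entry e[MAX_CHAIN]; size_t n; };

/* Starting at bucket `hash` (which must be < NBUCKETS), scan whole buckets,
 * wrapping around, until an unassured entry has been seen or at least
 * MIN_CHECKED entries were examined. Returns the last unassured entry of
 * the bucket where the scan stopped, or NULL if none was found. */
const struct entry *find_drop_candidate(const struct bucket *tbl, size_t hash)
{
    const struct entry *victim = NULL;
    size_t checked = 0;

    for (size_t i = 0; i < NBUCKETS; i++) {
        const struct bucket *b = &tbl[hash];
        for (size_t j = 0; j < b->n; j++) {
            if (!b->e[j].assured)
                victim = &b->e[j];
            checked++;
        }
        if (victim != NULL || checked >= MIN_CHECKED)
            break;
        hash = (hash + 1) % NBUCKETS;
    }
    return victim;
}
```

With roughly one entry per bucket, looking only at the starting bucket would usually find nothing; continuing into neighbouring buckets is what gives the search a realistic chance.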
Patrick McHardy [Sun, 8 Jul 2007 05:37:03 +0000 (22:37 -0700)]
[NETFILTER]: nf_conntrack: mark helpers __read_mostly
Most are __read_mostly already, this changes the remaining ones.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:36:46 +0000 (22:36 -0700)]
[NETFILTER]: nf_conntrack_helper: use hashtable for conntrack helpers
Eliminate the last global list searched for every new connection.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:36:24 +0000 (22:36 -0700)]
[NETFILTER]: nf_conntrack_expect: introduce nf_conntrack_expect_max sysctl
As a last step of preventing DoS by creating lots of expectations, this
patch introduces a global maximum and a sysctl to control it. The default
is initialized to 4 * the expectation hash table size, which results in
1/64 of the default maximum of conntracks.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:35:56 +0000 (22:35 -0700)]
[NETFILTER]: nf_conntrack_expect: maintain per conntrack expectation list
This patch brings back the per-conntrack expectation list that was
removed around 2.6.10 to avoid walking all expectations on expectation
eviction and conntrack destruction.
As these were the last users of the global expectation list, this patch
also kills that.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:35:21 +0000 (22:35 -0700)]
[NETFILTER]: nf_conntrack_helper/nf_conntrack_netlink: convert to expectation hash
Convert from the global expectation list to the hash table.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:34:07 +0000 (22:34 -0700)]
[NETFILTER]: nf_conntrack_expect: convert proc functions to hash
Convert from the global expectation list to the hash table.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:33:47 +0000 (22:33 -0700)]
[NETFILTER]: nf_conntrack: use hashtable for expectations
Currently all expectations are kept on a global list that
- needs to be searched for every new connection
- needs to be walked for evicting expectations when a master connection
has reached its limit
- needs to be walked on connection destruction for connections that
have open expectations
This is obviously not good, especially when considering helpers like
H.323 that register *lots* of expectations and can set up permanent
expectations, but it also allows for an easy DoS against firewalls
using connection tracking helpers.
Use a hashtable for expectations to avoid incurring the search overhead
for every new connection. The default hash size is 1/256 of the conntrack
hash table size; this can be overridden using a module parameter.
This patch only introduces the hash table for expectation lookups and
keeps the other users of the global list to reduce the noise; the
following patches will get rid of the list completely.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:32:53 +0000 (22:32 -0700)]
[NETFILTER]: nf_conntrack: move expectation related init code to nf_conntrack_expect.c
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:32:34 +0000 (22:32 -0700)]
[NETFILTER]: nf_conntrack_netlink: sync expectation dumping with conntrack table dumping
Resync expectation table dumping code with conntrack dumping: don't
rely on the unique ID anymore since that requires walking the list
backwards, which doesn't work with the upcoming conversion to hlists.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:32:03 +0000 (22:32 -0700)]
[NETFILTER]: nf_conntrack_expect: avoid useless list walking
Don't walk the list when unexpecting an expectation, we already
have a reference and the timer check is enough to guarantee
that it still is on the list.
This comment suggests that it was copied there by mistake from
expectation eviction:
/* choose the oldest expectation to evict */
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:31:32 +0000 (22:31 -0700)]
[NETFILTER]: nf_conntrack: reduce masks to a subset of tuples
Since conntrack currently allows masks to be used for every bit of both
helper and expectation tuples, we can't hash them and have to keep
them on two global lists that are searched for every new connection.
This patch removes the never used ability to use masks for the
destination part of the expectation tuple and completely removes
masks from helpers since the only reasonable choice is a full
match on l3num, protonum and src.u.all.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:31:07 +0000 (22:31 -0700)]
[NETFILTER]: nf_conntrack_ftp: use nf_ct_expect_init
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:30:49 +0000 (22:30 -0700)]
[NETFILTER]: nf_conntrack_expect: function naming unification
Currently there is a wild mix of nf_conntrack_expect_, nf_ct_exp_,
expect_, exp_, ...
Consistently use nf_ct_ as prefix for exported functions.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:30:27 +0000 (22:30 -0700)]
[NETFILTER]: nf_nat: use hlists for bysource hash
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:30:08 +0000 (22:30 -0700)]
[NETFILTER]: nf_conntrack: export hash allocation/destruction functions
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:28:42 +0000 (22:28 -0700)]
[NETFILTER]: nf_conntrack: remove 'ignore_conntrack' argument from nf_conntrack_find_get
All callers pass NULL, this also doesn't seem very useful for modules.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:28:14 +0000 (22:28 -0700)]
[NETFILTER]: nf_conntrack: use hlists for conntrack hash
Convert conntrack hash to hlists to reduce its size and cache
footprint. Since the default hashsize to max. entries ratio
sucks (1:16), this patch doesn't reduce the amount of memory
used for the hash by default, but instead uses a better ratio
of 1:8, which results in the same max. entries value.
One thing worth noting is early_drop. It really should use LRU,
so it now has to iterate over the entire chain to find the last
unconfirmed entry. Since chains shouldn't be very long and the
entire operation is very rare this shouldn't be a problem.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
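The memory argument in the hlist entry above can be checked with a little arithmetic: an hlist head is one pointer where a list head is two, so the same amount of table memory holds twice as many buckets, and halving the entries-per-bucket ratio from 16 to 8 leaves the maximum entry count unchanged. The sketch below uses simplified stand-in structures and hypothetical function names, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins: a doubly linked list head holds two pointers,
 * an hlist head holds one. */
struct list_head_sketch  { void *next, *prev; };
struct hlist_head_sketch { void *first; };

/* Number of bucket heads that fit in `bytes` of table memory. */
size_t buckets_list(size_t bytes)
{
    return bytes / sizeof(struct list_head_sketch);
}

size_t buckets_hlist(size_t bytes)
{
    return bytes / sizeof(struct hlist_head_sketch);
}

/* Maximum entries for a given bucket count and entries-per-bucket ratio. */
size_t max_entries(size_t buckets, size_t per_bucket)
{
    return buckets * per_bucket;
}
```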
Patrick McHardy [Sun, 8 Jul 2007 05:27:33 +0000 (22:27 -0700)]
[NETFILTER]: nf_conntrack: round up hashsize to next multiple of PAGE_SIZE
Don't let the rest of the page go to waste.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
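The rounding in the hashsize entry above amounts to raising the bucket count to the next value that fills whole pages. A minimal sketch, assuming a 4096-byte page and a bucket head size that divides the page size evenly; `PAGE_SIZE_SKETCH` and the function name are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096u  /* assumed page size for the example */

/* Round a requested bucket count up so that the table fills whole pages.
 * `head_size` is the size of one bucket head and is assumed to divide the
 * page size evenly. */
size_t round_up_buckets(size_t buckets, size_t head_size)
{
    size_t per_page = PAGE_SIZE_SKETCH / head_size;
    return (buckets + per_page - 1) / per_page * per_page;
}
```

For example, 1000 buckets of 8 bytes each would occupy just under two pages, so the count is raised to 1024 and the second page is used completely.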
Patrick McHardy [Sun, 8 Jul 2007 05:27:06 +0000 (22:27 -0700)]
[NETFILTER]: nf_conntrack_extend: use __read_mostly for struct nf_ct_ext_type
Also make them static.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:26:35 +0000 (22:26 -0700)]
[NETFILTER]: nf_nat: merge nf_conn and nf_nat_info
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:26:16 +0000 (22:26 -0700)]
[NETFILTER]: nf_nat: kill global 'destroy' operation
This kills the global 'destroy' operation which was used by NAT.
Instead it uses the extension infrastructure so that multiple
extensions can register their own operations.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:25:51 +0000 (22:25 -0700)]
[NETFILTER]: nf_conntrack: remove old memory allocator of conntrack
Memory space for help and NAT is now allocated by the extension
infrastructure.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:25:28 +0000 (22:25 -0700)]
[NETFILTER]: nf_nat: remove unused nf_nat_module_is_loaded
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:24:28 +0000 (22:24 -0700)]
[NETFILTER]: nf_nat: use extension infrastructure
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:24:04 +0000 (22:24 -0700)]
[NETFILTER]: nf_nat: add reference to conntrack from entry of bysource list
I will split 'struct nf_nat_info' out from conntrack. So I cannot use
'offsetof' to get the pointer to conntrack from it.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:23:42 +0000 (22:23 -0700)]
[NETFILTER]: nf_conntrack: use extension infrastructure for helper
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:23:21 +0000 (22:23 -0700)]
[NETFILTER]: nf_conntrack: introduce extension infrastructure
The old space allocator of conntrack had extensibility problems.
- It required a slab cache per combination of extensions.
- It had to predict which extensions would be assigned, but it was
  impossible to predict that completely, so we allocated a bigger memory
  object than really required.
- It needed to search for the helper twice due to a locking issue.
Now the basic information of a connection is stored in 'struct nf_conn',
and storage for extensions (helper, NAT) is allocated by kmalloc.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
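The idea behind the extension infrastructure commit above can be sketched in userspace as follows. This is only an illustration under stated assumptions: the names (struct conn, conn_ext_add, conn_ext_find, conn_ext_free, enum ext_id) are hypothetical, calloc stands in for kmalloc, and one pointer per extension is a simplification that need not match the kernel's actual layout.

```c
#include <stdlib.h>

/* Hypothetical extension identifiers for this sketch. */
enum ext_id { EXT_HELPER, EXT_NAT, EXT_NUM };

/* Fixed-size base object; extensions are attached only when needed. */
struct conn {
    int id;
    void *ext[EXT_NUM];   /* NULL until the extension is added */
};

/* Allocate zeroed storage for an extension on first use.
 * Returns the storage, or NULL if allocation failed. */
void *conn_ext_add(struct conn *ct, enum ext_id id, size_t size)
{
    if (ct->ext[id] == NULL)
        ct->ext[id] = calloc(1, size);
    return ct->ext[id];
}

/* Look up an extension; NULL means it was never assigned. */
void *conn_ext_find(const struct conn *ct, enum ext_id id)
{
    return ct->ext[id];
}

/* Release all extension storage attached to the connection. */
void conn_ext_free(struct conn *ct)
{
    for (int i = 0; i < EXT_NUM; i++) {
        free(ct->ext[i]);
        ct->ext[i] = NULL;
    }
}
```

With this shape the base object has one size regardless of which extensions a connection ends up using, so a single cache suffices and nothing has to be predicted at allocation time.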
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:22:33 +0000 (22:22 -0700)]
[NETFILTER]: nf_nat: move NAT declarations from nf_conntrack_ipv4.h to nf_nat.h
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sun, 8 Jul 2007 05:22:02 +0000 (22:22 -0700)]
[NETFILTER]: x_tables: mark matches and targets __read_mostly
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jozsef Kadlecsik [Sun, 8 Jul 2007 05:21:23 +0000 (22:21 -0700)]
[NETFILTER]: x_tables: add TRACE target
The TRACE target can be used to follow IP and IPv6 packets through
the ruleset.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Sun, 8 Jul 2007 05:20:36 +0000 (22:20 -0700)]
[NETFILTER]: Add u32 match
Along comes... xt_u32, a revamped ipt_u32 from POM-NG, plus:
* 2007-06-02: added ipv6 support
* 2007-06-05: uses kmalloc for the big buffer
* 2007-06-05: added inversion
* 2007-06-20: use skb_copy_bits() and get rid of the big buffer
and lock (suggested by Pablo Neira Ayuso)
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
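As a rough userspace illustration of the last changelog item in the u32 commit above (copying only the needed bytes instead of keeping a big buffer and a lock), the sketch below reads four bytes at an offset, masks them and compares against a range. The helper copy_bits is a hypothetical stand-in for skb_copy_bits(), and u32_test with its parameters is an assumption for illustration, not the module's real interface; treating an out-of-range read as a plain non-match is also a simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bounds-checked copy out of a flat packet buffer (userspace stand-in).
 * Returns 0 on success, -1 if the requested bytes are out of range. */
int copy_bits(const uint8_t *pkt, size_t pkt_len, size_t offset,
              void *to, size_t len)
{
    if (offset > pkt_len || len > pkt_len - offset)
        return -1;
    memcpy(to, pkt + offset, len);
    return 0;
}

/* Read a big-endian 32-bit value at 'offset', apply 'mask', and test
 * whether the result lies in [min, max]; 'invert' flips the verdict.
 * Assumption of this sketch: a failed read never matches. */
bool u32_test(const uint8_t *pkt, size_t pkt_len, size_t offset,
              uint32_t mask, uint32_t min, uint32_t max, bool invert)
{
    uint8_t b[4];
    uint32_t val;

    if (copy_bits(pkt, pkt_len, offset, b, sizeof(b)) < 0)
        return false;

    val = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
          ((uint32_t)b[2] << 8) | (uint32_t)b[3];
    val &= mask;

    return (val >= min && val <= max) != invert;
}
```

Because only a four-byte stack buffer is used per test, no shared scratch buffer needs to be protected by a lock.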
Jerome Borsboom [Sun, 8 Jul 2007 05:19:48 +0000 (22:19 -0700)]
[NETFILTER]: nf_nat_sip: only perform RTP DNAT if SIP session was SNATed
DNAT of the RTP session is only necessary if the SIP session has
been SNATed.
Signed-off-by: Jerome Borsboom <j.borsboom@erasmusmc.nl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Sun, 8 Jul 2007 05:19:08 +0000 (22:19 -0700)]
[NETFILTER]: Remove redundant parentheses/braces
Removes redundant parentheses and braces (and adds one pair in an
xt_tcpudp.c macro).
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Sun, 8 Jul 2007 05:17:36 +0000 (22:17 -0700)]
[NETFILTER]: Remove incorrect inline markers
device_cmp: the function's address is taken (call to nf_ct_iterate_cleanup)
alloc_null_binding: referenced externally
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Sun, 8 Jul 2007 05:16:55 +0000 (22:16 -0700)]
[NETFILTER]: add some consts, remove some casts
Make a number of variables const and/or remove unneeded casts.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Sun, 8 Jul 2007 05:16:26 +0000 (22:16 -0700)]
[NETFILTER]: x_tables: switch xt_target->checkentry to bool
Switch the return type of target checkentry functions to boolean.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Sun, 8 Jul 2007 05:16:00 +0000 (22:16 -0700)]
[NETFILTER]: x_tables: switch xt_match->checkentry to bool
Switch the return type of match checkentry functions to boolean.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Sun, 8 Jul 2007 05:15:35 +0000 (22:15 -0700)]
[NETFILTER]: x_tables: switch xt_match->match to bool
Switch the return type of match functions to boolean
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Sun, 8 Jul 2007 05:15:12 +0000 (22:15 -0700)]
[NETFILTER]: x_tables: switch hotdrop to bool
Switch the "hotdrop" variables to boolean
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
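The bool conversions in the x_tables commits above can be pictured with a small userspace sketch. The structure port_range and the function port_match are hypothetical and much simpler than a real match callback; the point is only that both the verdict and the "hotdrop" out-parameter carry yes/no answers, which bool expresses directly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical match configuration for this sketch. */
struct port_range {
    uint16_t min;
    uint16_t max;
    bool invert;
};

/* Returns true if the big-endian port at the start of 'hdr' is inside
 * the configured range (or outside it when 'invert' is set). If the
 * header is too short to hold a port, sets *hotdrop to ask the caller
 * to drop the packet and reports no match. */
bool port_match(const uint8_t *hdr, size_t len,
                const struct port_range *r, bool *hotdrop)
{
    uint16_t port;

    if (len < 2) {
        *hotdrop = true;
        return false;
    }

    port = (uint16_t)((hdr[0] << 8) | hdr[1]);
    return (port >= r->min && port <= r->max) != r->invert;
}
```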
Yasuyuki Kozakai [Sun, 8 Jul 2007 05:14:23 +0000 (22:14 -0700)]
[NETFILTER]: ip6_tables: fix explanation of valid upper protocol number
This explains the allowed upper protocol numbers. IP6T_F_NOPROTO was
introduced to use 0 as Hop-by-Hop option header, not wildcard. But that
seemed to be forgotten. 0 has been used as wildcard since 2002-08-23.
Signed-off-by: Yasuyuki Kozakai <yasuyuki@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jing Min Zhao [Sun, 8 Jul 2007 05:13:17 +0000 (22:13 -0700)]
[NETFILTER]: nf_conntrack_h323: check range first in sequence extension
Check range before checking STOP flag. This optimization may save a
nanosecond or less :)
Signed-off-by: Jing Min Zhao <zhaojingmin@vivecode.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
James Chapman [Fri, 6 Jul 2007 00:08:05 +0000 (17:08 -0700)]
[UDP]: Cleanup UDP encapsulation code
This cleanup fell out after adding L2TP support where a new encap_rcv
funcptr was added to struct udp_sock. Have XFRM use the new encap_rcv
funcptr, which allows us to move the XFRM encap code from udp.c into
xfrm4_input.c.
Make xfrm4_rcv_encap() static since it is no longer called externally.
Signed-off-by: James Chapman <jchapman@katalix.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
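The function-pointer hook described in the UDP encapsulation commit above might look roughly like this in a userspace sketch. All names (struct usock, struct pkt, udp_deliver, esp_like_rcv) are hypothetical, and the return convention used here (zero or negative means the handler took the packet, positive means continue as plain UDP) is an assumption of the sketch.

```c
#include <stddef.h>

/* Hypothetical packet and socket types for this sketch. */
struct pkt {
    int len;
};

struct usock {
    /* Optional encapsulation hook. Sketch convention: a return value
     * <= 0 means the handler consumed the packet, > 0 means it should
     * continue through normal UDP delivery. */
    int (*encap_rcv)(struct usock *sk, struct pkt *p);
    int delivered;   /* packets delivered as plain UDP */
};

/* Generic receive path: knows nothing about the encapsulation type. */
int udp_deliver(struct usock *sk, struct pkt *p)
{
    if (sk->encap_rcv != NULL) {
        int ret = sk->encap_rcv(sk, p);

        if (ret <= 0)
            return ret;   /* handled by the encapsulation layer */
    }

    sk->delivered++;
    return 0;
}

/* Example handler living outside the UDP code: claims packets of at
 * least 8 bytes and passes shorter ones back to plain UDP. */
int esp_like_rcv(struct usock *sk, struct pkt *p)
{
    (void)sk;
    return p->len >= 8 ? 0 : 1;
}
```

Keeping the hook in the socket lets each encapsulation user supply its own handler, which is what allows the protocol-specific code to move out of the generic UDP file.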
G. Liakhovetski [Tue, 3 Jul 2007 05:56:57 +0000 (22:56 -0700)]
[IrDA]: tsap init routine factorisation.
This patch extracts common code from irttp_open_tsap() and irttp_dup()
into a new function to 1) avoid code duplication, 2) help avoid
forgetting object initialization in the tsap duplication path in the
future.
Signed-off-by: G. Liakhovetski <gl@dsa-ac.de>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Samuel Ortiz [Tue, 3 Jul 2007 05:56:15 +0000 (22:56 -0700)]
[IrDA]: kingsun-sir.c charset fix.
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Samuel Ortiz [Tue, 3 Jul 2007 05:55:31 +0000 (22:55 -0700)]
[IrDA]: Monitor mode.
Through the IrDA netlink set mode command, we switch to IrDA monitor
mode, where one IrLAP instance receives all the packets on the media,
without ever responding to them.
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Samuel Ortiz [Tue, 3 Jul 2007 05:54:18 +0000 (22:54 -0700)]
[IrDA]: Netlink layer.
First IrDA configuration netlink layer implementation.
Currently, we only support the set/get mode commands.
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guido Guenther [Tue, 3 Jul 2007 05:50:25 +0000 (22:50 -0700)]
[NET]: Allow group ownership of TUN/TAP devices.
Introduce a new ioctl, TUNSETGROUP, for setting group ownership of tap
devices. A user is now allowed to send packets if either his euid or
his egid matches the one specified via tunctl (via -u or -g
respectively). If both gid and uid are set via tunctl, both have to
match.
Signed-off-by: Guido Guenther <agx@sigxcpu.org>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
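The ownership rule in the TUN/TAP commit above can be written as a small predicate. This is a sketch under assumptions: struct tun_owner, ID_UNSET and tun_may_attach are hypothetical names, an unset owner or group is modelled as -1, and the net_admin override is an assumption of the sketch rather than something the commit message states.

```c
#include <stdbool.h>

#define ID_UNSET (-1)   /* sketch convention: no owner/group configured */

/* Hypothetical per-device ownership settings. */
struct tun_owner {
    int owner;   /* uid set via "tunctl -u", or ID_UNSET */
    int group;   /* gid set via "tunctl -g", or ID_UNSET */
};

/* A configured owner must match the caller's euid and a configured
 * group must match the caller's egid. When both are configured, both
 * must match; when only one is configured, only that one is checked. */
bool tun_may_attach(const struct tun_owner *t, int euid, int egid,
                    bool net_admin)
{
    bool owner_mismatch = t->owner != ID_UNSET && euid != t->owner;
    bool group_mismatch = t->group != ID_UNSET && egid != t->group;

    return net_admin || !(owner_mismatch || group_mismatch);
}
```

For example, with only a group configured, any user whose egid equals that group passes the check regardless of euid.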
Patrick McHardy [Tue, 3 Jul 2007 05:49:07 +0000 (22:49 -0700)]
[NET_SCHED]: Remove unnecessary includes
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 3 Jul 2007 05:48:13 +0000 (22:48 -0700)]
[NET_SCHED]: sch_htb: use generic estimator
Use the generic estimator instead of reimplementing (parts of) it.
For compatibility always create a default estimator for new classes.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 3 Jul 2007 05:47:37 +0000 (22:47 -0700)]
[NET_SCHED]: Remove unnecessary stats_lock pointers
Remove stats_lock pointers from qdisc-internal structures; in all cases
they point to dev->queue_lock. The only case where the pointer is
necessary is for top-level qdiscs, where it might also point to
dev->ingress_lock in case of the ingress qdisc. Also remove it from
actions completely; it always points to the action's internal lock.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>