Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:37 +0000 (15:23 -0800)]
IB/ipath: misc infiniband code, part 2
Management datagram support, queue pairs, and reliable and unreliable
connections.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:36 +0000 (15:23 -0800)]
IB/ipath: misc infiniband code, part 1
Completion queues, local and remote memory keys, and memory region
support.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:35 +0000 (15:23 -0800)]
IB/ipath: infiniband RC protocol support
This is an implementation of the Infiniband RC ("reliable connection")
protocol.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:34 +0000 (15:23 -0800)]
IB/ipath: infiniband UC and UD protocol support
These files implement the Infiniband UC ("unreliable connection") and UD
("unreliable datagram") protocols.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:33 +0000 (15:23 -0800)]
IB/ipath: infiniband header files
These header files are used by the layered Infiniband driver.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:32 +0000 (15:23 -0800)]
IB/ipath: layering interfaces used by higher-level driver code
The layering interfaces are used to implement the Infiniband protocols
and the ethernet emulation driver.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:31 +0000 (15:23 -0800)]
IB/ipath: support for userspace apps using core driver
These files introduce a char device that userspace apps use to gain
direct memory-mapped access to the InfiniPath hardware, and routines for
pinning and unpinning user memory in cases where the hardware needs to
DMA into the user address space.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:30 +0000 (15:23 -0800)]
IB/ipath: sysfs and ipathfs support for core driver
The ipathfs filesystem contains files that are not appropriate for
sysfs, because they contain binary data. The hierarchy is simple; the
top-level directory contains driver-wide attribute files, while numbered
subdirectories contain per-device attribute files.
Our userspace code currently expects this filesystem to be mounted on
/ipathfs, but a final location has not yet been chosen.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:29 +0000 (15:23 -0800)]
IB/ipath: misc driver support code
EEPROM support, interrupt handling, statistics gathering, and write
combining management for x86_64.
A note regarding i2c: The Atmel EEPROM hardware we use looks like an
i2c device electrically, but is not i2c compliant at all from a
functional perspective. We tried using the kernel's i2c support to
talk to it, but failed.
Normal i2c devices have a single 7-bit or 10-bit i2c address that they
respond to. Valid 7-bit addresses range from 0x03 to 0x77. Addresses
0x00 to 0x02 and 0x78 to 0x7F are special reserved addresses
(e.g. 0x00 is the "general call" address.) The Atmel device, on the
other hand, responds to ALL addresses. It's designed to be the only
device on a given i2c bus. A given i2c device address corresponds to
the memory address within the i2c device itself.
At least one reason why the linux core i2c stuff won't work for this
is that it prohibits access to reserved addresses like 0x00, which are
really valid addresses on the Atmel devices.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:28 +0000 (15:23 -0800)]
IB/ipath: chip initialisation code, and diag support
ipath_init_chip.c sets up an InfiniPath device for use.
ipath_diag.c permits userspace diagnostic tools to read and write a
chip's registers. It is different in purpose from the mmap interfaces
to the /sys/bus/pci resource files.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:27 +0000 (15:23 -0800)]
IB/ipath: support for PCI Express devices
This file contains routines and definitions specific to InfiniPath
devices that have PCI Express interfaces.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:26 +0000 (15:23 -0800)]
IB/ipath: support for HyperTransport devices
The ipath_ht400.c file contains routines and definitions specific to
HyperTransport-based InfiniPath devices.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:25 +0000 (15:23 -0800)]
IB/ipath: core driver header files
ipath_common.h and ips_common.h contain definitions shared between
userspace and the kernel.
ipath_kernel.h is the core driver header file.
ipath_debug.h contains mask values used for controlling driver debugging.
ipath_registers.h contains bitmask definitions used in chip registers.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Wed, 29 Mar 2006 23:23:24 +0000 (15:23 -0800)]
IB/ipath: core device driver
The ipath driver is a low-level driver for PathScale InfiniPath host
channel adapters (HCAs) based on the HT-400 and PE-800 chips, including
the InfiniPath HT-460, the small form factor InfiniPath HT-460, the
InfiniPath HT-470 and the Linux Networx LS/X.
The ipath_driver.c file contains much of the low-level device handling
code.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Linus Torvalds [Thu, 30 Mar 2006 22:32:38 +0000 (14:32 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mad: RMPP support for additional classes
IB/mad: include GID/class when matching receives
IB/mthca: Fix section mismatch problems
IPoIB: Fix oops with raw sockets
IB/mthca: Fix check of size in SRQ creation
IB/srp: Fix unmapping of fake scatterlist
Linus Torvalds [Thu, 30 Mar 2006 22:29:20 +0000 (14:29 -0800)]
Merge branch 'upstream-linus' of /linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[PATCH] sata_mv: three bug fixes
[PATCH] libata: ata_dev_init_params() fixes
[PATCH] libata: Fix interesting use of "extern" and also some bracketing
[PATCH] libata: Simplex and other mode filtering logic
[PATCH] libata - ATA is both ATA and CFA
[PATCH] libata: Add ->set_mode hook for odd drivers
[PATCH] libata: BMDMA handling updates
[PATCH] libata: kill trailing whitespace
[PATCH] libata: add FIXME above ata_dev_xfermask()
[PATCH] libata: cosmetic changes in ata_bus_softreset()
[PATCH] libata: kill E.D.D.
Linus Torvalds [Thu, 30 Mar 2006 22:26:27 +0000 (14:26 -0800)]
Merge branch 'drm-patches' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: remove drm_{alloc,free}_pages
drm: sis fix compile warning
drm: add new radeon PCI ids..
drm: read breadcrumb in IRQ handler
drm: fixup i915 breadcrumb read/write
drm: remove pointless checks in radeon_state
drm: fixup improper cast.
drm: rationalise some pci ids
drm: Add general-purpose packet for manipulating scratch registers (r300)
drm: rework radeon memory map (radeon 1.23)
drm: update r300 register names
drm: fixup PCI DMA support
Linus Torvalds [Thu, 30 Mar 2006 20:38:18 +0000 (12:38 -0800)]
Merge branch 'release' of git://git./linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] ioremap() should prefer WB over UC
[IA64] Add __mca_table to the DISCARD list in gate.lds
[IA64] Move __mca_table out of the __init section
[IA64] simplify some condition checks in iosapic_check_gsi_range
[IA64] correct some messages and fixes some minor things
[IA64-SGI] fix for-loop in sn_hwperf_geoid_to_cnode()
[IA64-SGI] sn_hwperf use of num_online_cpus()
[IA64] optimize flush_tlb_range on large numa box
[IA64] lazy_mmu_prot_update needs to be aware of huge pages
Jens Axboe [Thu, 30 Mar 2006 13:16:46 +0000 (15:16 +0200)]
[PATCH] splice: add support for SPLICE_F_MOVE flag
This enables the caller to migrate pages from one address space page
cache to another. In buzz word marketing, you can do zero-copy file
copies!
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jens Axboe [Thu, 30 Mar 2006 13:15:30 +0000 (15:15 +0200)]
[PATCH] Introduce sys_splice() system call
This adds support for the sys_splice system call. Using a pipe as a
transport, it can connect to files or sockets (latter as output only).
From the splice.c comments:
"splice": joining two ropes together by interweaving their strands.
This is the "extended pipe" functionality, where a pipe is used as
an arbitrary in-memory buffer. Think of a pipe as a small kernel
buffer that you can use to transfer data from one end to the other.
The traditional unix read/write is extended with a "splice()" operation
that transfers data buffers to or from a pipe buffer.
Named by Larry McVoy, original implementation from Linus, extended by
Jens to support splicing to files and fixing the initial implementation
bugs.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Bjorn Helgaas [Thu, 30 Mar 2006 16:53:39 +0000 (09:53 -0700)]
[IA64] ioremap() should prefer WB over UC
efi_memmap_init() collects full granules of WB memory, without
regard for whether they also support UC. So in order for ioremap()
to work for main memory, it must prefer WB mappings when possible.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Jes Sorensen [Thu, 30 Mar 2006 15:13:22 +0000 (10:13 -0500)]
[IA64] Add __mca_table to the DISCARD list in gate.lds
Add __mca_table to the DISCARD list for the gate.lds linker script to
avoid broken linker references when linking the final vmlinux file.
Also add comment to include/asm-ia64/asmmacros.h to avoid anyone else
hitting this problem in the future.
Credits to James Bottomley <James.Bottomley@SteelEye.com> for spotting
the DISCARD list in gate.lds.S
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Hal Rosenstock [Wed, 29 Mar 2006 00:40:04 +0000 (16:40 -0800)]
IB/mad: RMPP support for additional classes
Add RMPP support for additional management classes that support it.
Also, validate RMPP is consistent with management class specified.
Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Jack Morgenstein [Wed, 29 Mar 2006 00:39:07 +0000 (16:39 -0800)]
IB/mad: include GID/class when matching receives
Received responses are currently matched against sent requests based
on TID only. According to the spec, responses should match based on
the combination of TID, management class, and requester LID/GID.
Without the additional qualification, an agent that is responding to
two requests, both of which have the same TID, can match RMPP ACKs
with the incorrect transaction. This problem can occur on the SM node
when responding to SA queries.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Jeff Garzik [Thu, 30 Mar 2006 00:43:31 +0000 (19:43 -0500)]
Merge branch 'mv-merge'
Conflicts:
drivers/scsi/sata_mv.c
Mark Lord [Wed, 29 Mar 2006 14:50:31 +0000 (09:50 -0500)]
[PATCH] sata_mv: three bug fixes
(1) A DMA transfer size of 0x10000 was not being written
as 0x0000 in the PRDs. Fixed.
(1) The DEV_IRQ interrupt cause bit happens spuriously
during EDMA operation, and was not being ignored by the driver.
This led to various "drive busy" errors being reported,
with associated unpredictable behaviour. Fixed.
(2) If a SATA or PCI interrupt was received with no outstanding
command, the interrupt handler still attempted to invoke
ata_qc_complete(), triggering assert()/BUG_ON() behaviour
elsewhere in libata. Fixed.
The driver still has issues with confusion after error-recovery,
but should now be reliable in the absence of drive errors.
I will be looking more into the error-handling bugs next.
Signed-Off-By: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Albert Lee [Mon, 27 Mar 2006 08:39:18 +0000 (16:39 +0800)]
[PATCH] libata: ata_dev_init_params() fixes
ata_dev_init_params() fixes:
- Get the "heads" and "sectors" parameters from caller instead of implicitly from dev->id[].
- Return AC_ERR_INVALID instead of 0 if an invalid parameter is found
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Alan Cox [Mon, 27 Mar 2006 18:01:32 +0000 (19:01 +0100)]
[PATCH] libata: Fix interesting use of "extern" and also some bracketing
Signed-off-by: Alan Cox <alan@redhat.com>
Last of the set, just clean up some oddments. Assuming the whole set is
now ok then the remaining differences are the setup of PIO_0 at reset
and the ->data_xfer method.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Alan Cox [Mon, 27 Mar 2006 17:58:20 +0000 (18:58 +0100)]
[PATCH] libata: Simplex and other mode filtering logic
Add a field to the host_set called 'flags' (was host_set_flags changed
to suit Jeff)
Add a simplex_claimed field so we can remember who owns the DMA channel
Add a ->mode_filter() hook to allow drivers to filter modes
Add docs for mode_filter and set_mode
Filter according to simplex state
Filter cable in core
This provides the needed framework to support all the mode rules found
in the PATA world. The simplex filter deals with 'to spec' simplex DMA
systems found in older chips. The cable filter avoids duplicating the
same rules in each chip driver with PATA. Finally the mode filter is
neccessary because drive/chip combinations have errata that forbid
certain modes with some drives or types of ATA object.
Drive speed setup remains per channel for now and the filters now use
the framework Tejun put into place which cleans them up a lot from the
older libata-pata patches.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Alan Cox [Mon, 27 Mar 2006 17:49:19 +0000 (18:49 +0100)]
[PATCH] libata - ATA is both ATA and CFA
I think this is still needed with the new probe code (which btw seems to
be missing docs in upstream ?).
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Alan Cox [Mon, 27 Mar 2006 17:46:37 +0000 (18:46 +0100)]
[PATCH] libata: Add ->set_mode hook for odd drivers
Some hardware doesn't want the usual mode setup logic running. This
allows the hardware driver to replace it for special cases in the least
invasive way possible.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Alan Cox [Mon, 27 Mar 2006 17:42:40 +0000 (18:42 +0100)]
[PATCH] libata: BMDMA handling updates
This is the minimal patch set to enable the current code to be used with
a controller following SFF (ie any PATA and early SATA controllers)
safely without crashes if there is no BMDMA area or if BMDMA is not
assigned by the BIOS for some reason.
Simplex status is recorded but not acted upon in this change, this isn't
a problem with the current drivers as none of them are for simplex
hardware. A following diff will deal with that.
The flags in the probe structure remain ->host_set_flags although Jeff
asked me to rename them, simply because the rename would break the usual
Linux rules that old code should break when there are changes. not
compile and run and then blow up/eat your computer/etc. Renaming this
later is a trivial exercise once a better name is chosen.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Lennert Buytenhek [Wed, 29 Mar 2006 13:12:44 +0000 (15:12 +0200)]
[PATCH] ixp2000: fix gcc4 breakage
gcc4 doesn't like us declaring a static function inside another
function. We can do away with this construct altogether and use
BUILD_BUG_ON() instead (idea from Andi Kleen.)
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Gary Zambrano [Tue, 28 Mar 2006 22:57:38 +0000 (14:57 -0800)]
[PATCH] b44: ensure valid mac addr
Added code to check for invalid MAC address from eeprom or user input.
Signed-off-by: Gary Zambrano <zambrano@broadcom.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Linas Vepstas [Tue, 28 Mar 2006 22:36:23 +0000 (16:36 -0600)]
[PATCH] Janitor: drivers/net/pcnet32: fix incorrect comments
The comments concerning how the pcnet32 ethernet device driver selects
the MAC addr to use are incorrect. A recent patch (in the last 3 months)
changed how the code worked, but did not change the comments.
Side comment: the new behaviour is good; I've got a pcnet32 card which
powers up with garbage in the CSR's, and a good MAC addr in the PROM.
Signed-off-by: Linas Vepstas <linas@linas.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Arthur Othieno [Tue, 28 Mar 2006 22:09:01 +0000 (14:09 -0800)]
[PATCH] net: remove CONFIG_NET_CBUS conditional for NS8390
Don't bother testing for CONFIG_NET_CBUS ("NEC PC-9800 C-bus cards"); it went
out with the rest of PC98 subarch.
Signed-off-by: Arthur Othieno <apgo@patchbomb.org>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Mark Brown [Tue, 28 Mar 2006 22:08:55 +0000 (14:08 -0800)]
[PATCH] natsemi: Support oversized EEPROMs
The natsemi chip can have a larger EEPROM attached than it itself uses for
configuration. This patch adds support for user space access to such an
EEPROM.
Signed-off-by: Mark Brown <broonie@sirena.org.uk>
Cc: Tim Hockin <thockin@hockin.org>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jens Osterkamp [Mon, 27 Mar 2006 15:16:36 +0000 (17:16 +0200)]
[PATCH] spidernet : enable tx checksum offloading by default
This enables TX checksum offloading for the spidernet driver by default.
Signed-off-by: Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Mon, 27 Mar 2006 21:27:43 +0000 (13:27 -0800)]
[PATCH] bonding: support carrier state for master
Add support for the bonding master to specify its carrier state
based upon the state of the slaves. For 802.3ad, the bond is up if
there is an active, parterned aggregator. For other modes, the bond is
up if any slaves are up. Updates driver version to 3.0.3.
Based on a patch by jamal <hadi@cyberus.ca>.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jeff Garzik [Wed, 29 Mar 2006 22:30:19 +0000 (17:30 -0500)]
Merge branch 'upstream' of git://git./linux/kernel/git/linville/wireless-2.6
Randy Dunlap [Mon, 27 Mar 2006 20:26:12 +0000 (12:26 -0800)]
[PATCH] acenic: fix section mismatches
Fix section mismatches in acenic driver:
WARNING: drivers/net/acenic.o - Section mismatch: reference to .init.data:tigon2FwText from .text between 'acenic_probe_one' (at offset 0x2409) and 'ace_interrupt'
WARNING: drivers/net/acenic.o - Section mismatch: reference to .init.data:tigon2FwRodata from .text between 'acenic_probe_one' (at offset 0x2422) and 'ace_interrupt'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jens Osterkamp [Tue, 28 Mar 2006 13:59:55 +0000 (15:59 +0200)]
[PATCH] spidernet : reduce console spam
This patch reduces the message level of the RX ram full messages
from err to debug to prevent spamming the console leaving it in the
logfiles though.
From: Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Roger Luethi [Tue, 28 Mar 2006 18:53:56 +0000 (20:53 +0200)]
[PATCH] via-rhine: link state fix
Problems with link state detection have been reported several times in the
past months.
Denis Vlasenko did all the work tracking it down. Jeff Garzik suggested the
proper place for the fix.
When using the mii library, the driver needs to check mii->force_media
and set dev->state accordingly.
Signed-off-by: Roger Luethi <rl@hellgate.ch>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Komuro [Sat, 25 Mar 2006 22:31:55 +0000 (07:31 +0900)]
[PATCH] axnet_cs.c : add hardware multicast support
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Sat, 25 Mar 2006 05:28:57 +0000 (14:28 +0900)]
[PATCH] libata: kill trailing whitespace
Kill trailing whitespace.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jeff Garzik [Wed, 29 Mar 2006 22:18:49 +0000 (17:18 -0500)]
Merge branch 'master'
Gary Zambrano [Wed, 29 Mar 2006 22:12:05 +0000 (17:12 -0500)]
b44: fix force mac address before ifconfig up
Initializing the b44 MAC & PCI functional blocks in the controller must
occur inside init_one(). This will allow access to the MAC registers.
The controller was being powered up in b44_open() which would not allow
access to the registers before ifconfig was up.
Philip Kohlbecher found this bug.
Signed-off-by: Gary Zambrano <zambrano@broadcom.com>
Linus Torvalds [Wed, 29 Mar 2006 19:29:33 +0000 (11:29 -0800)]
Merge /pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NETFILTER]: Rename init functions.
[TCP]: Fix RFC2465 typo.
[INET]: Introduce tunnel4/tunnel6
[NET]: deinline 200+ byte inlines in sock.h
[ECONET]: Convert away from SOCKOPS_WRAPPED
[NET]: Fix ipx/econet/appletalk/irda ioctl crashes
[NET]: Kill Documentation/networking/TODO
[TG3]: Update version and reldate
[TG3]: Skip timer code during full lock
[TG3]: Speed up SRAM access
[TG3]: Fix PHY loopback on 5700
[TG3]: Fix bug in 40-bit DMA workaround code
[TG3]: Fix probe failure due to invalid MAC address
Linus Torvalds [Wed, 29 Mar 2006 19:28:30 +0000 (11:28 -0800)]
Merge git://git./linux/kernel/git/paulus/powerpc
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (67 commits)
[PATCH] powerpc: Remove oprofile spinlock backtrace code
[PATCH] powerpc: Add oprofile calltrace support to all powerpc cpus
[PATCH] powerpc: Add oprofile calltrace support
[PATCH] for_each_possible_cpu: ppc
[PATCH] for_each_possible_cpu: powerpc
[PATCH] lock PTE before updating it in 440/BookE page fault handler
[PATCH] powerpc: Kill _machine and hard-coded platform numbers
ppc: Fix compile error in arch/ppc/lib/strcase.c
[PATCH] git-powerpc: WARN was a dumb idea
[PATCH] powerpc: a couple of trivial compile warning fixes
powerpc: remove OCP references
powerpc: Make uImage default build output for MPC8540 ADS
powerpc: move math-emu over to arch/powerpc
powerpc: use memparse() for mem= command line parsing
ppc: fix strncasecmp prototype
[PATCH] powerpc: make ISA floppies work again
[PATCH] powerpc: Fix some initcall return values
[PATCH] powerpc: Workaround for pSeries RTAS bug
[PATCH] spufs: fix __init/__exit annotations
[PATCH] powerpc: add hvc backend for rtas
...
Russ Anderson [Wed, 29 Mar 2006 17:31:23 +0000 (11:31 -0600)]
[IA64] Move __mca_table out of the __init section
Move __mca_table out of the __init section.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Roland Dreier [Wed, 29 Mar 2006 17:36:46 +0000 (09:36 -0800)]
IB/mthca: Fix section mismatch problems
Quite a few cleanup functions in mthca were marked as __devexit.
However, they could also be called from error paths during
initialization, so they cannot be marked that way. Just delete all of
the incorrect annotations.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Wed, 29 Mar 2006 17:36:46 +0000 (09:36 -0800)]
IPoIB: Fix oops with raw sockets
ipoib_hard_header() needs to handle the case that daddr is NULL. This
can happen when packets are injected via a raw socket, and IPoIB
shouldn't oops in this case.
Reported by Anton Blanchard <anton@samba.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Jack Morgenstein [Sun, 26 Mar 2006 15:01:12 +0000 (17:01 +0200)]
IB/mthca: Fix check of size in SRQ creation
The previous patch for Tavor broke MemFree logic.
The driver should perform limit check only for Tavor. For MemFree,
the check is incorrect, since ds (WQE stride) is always a power-of-2
(although the max_desc_size may not be).
In Tavor, however, WQE stride and desc_size are the same, and are not
necessarily power-of-2. The check was really for the WQE stride (and
it Tavor, we use max_desc_size for the stride).
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Wed, 29 Mar 2006 17:36:45 +0000 (09:36 -0800)]
IB/srp: Fix unmapping of fake scatterlist
The recently merged patch to create a fake scatterlist for non-SG SCSI
commands had a bug: the driver ended up doing dma_unmap_sg() on a
scatterlist scmnd->request_buffer rather than the fake scatter list it
created. Fix this so that the driver unmaps the same thing it maps.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Linus Torvalds [Wed, 29 Mar 2006 16:55:36 +0000 (08:55 -0800)]
Merge git://oss.sgi.com:8090/oss/git/xfs-2.6
* git://oss.sgi.com:8090/oss/git/xfs-2.6:
[XFS] Cleanup in XFS after recent get_block_t interface tweaks.
[XFS] Remove unused/obsoleted function: xfs_bmap_do_search_extents()
[XFS] A change to inode chunk allocation to try allocating the new chunk
Fixes a regression from the recent "remove ->get_blocks() support"
[XFS] Fix compiler warning and small code inconsistencies in compat
[XFS] We really suck at spulling. Thanks to Chris Pascoe for fixing all
Linus Torvalds [Wed, 29 Mar 2006 02:48:03 +0000 (18:48 -0800)]
Merge branch 'for-linus' of /linux/kernel/git/scjody/ieee1394
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/scjody/ieee1394:
ohci1394: cleanup the "Unexpected PCI resource length" warning.
sbp2: misc debug logging cleanups
sbp2: proper treatment of DID_OK
ieee1394: set read permission for parameter disable_irm
sbp2: check for ARM failure
ohci1394: clean up asynchronous and physical request filters programming
ieee1394: remove amdtp remains from ieee1394_core.h
ieee1394: remove devfs support
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
sbp2: prevent unloading of 1394 low-level driver
Anton Blanchard [Mon, 27 Mar 2006 01:00:45 +0000 (12:00 +1100)]
[PATCH] powerpc: Remove oprofile spinlock backtrace code
Remove oprofile spinlock backtrace code now we have proper calltrace
support. Also make MMCRA sihv and sipr bits a variable since they may
change in future cpus. Finally, MMCRA should be a 64bit quantity.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Anton Blanchard [Mon, 27 Mar 2006 01:03:17 +0000 (12:03 +1100)]
[PATCH] powerpc: Add oprofile calltrace support to all powerpc cpus
Add calltrace support for other powerpc cpus. Tested on 7450.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Brian Rogan [Mon, 27 Mar 2006 00:57:01 +0000 (11:57 +1100)]
[PATCH] powerpc: Add oprofile calltrace support
Add oprofile calltrace support to powerpc. Disable spinlock backtracing
now we can use calltrace info.
(Updated to work on both 32bit and 64bit by me).
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
KAMEZAWA Hiroyuki [Tue, 28 Mar 2006 22:50:52 +0000 (14:50 -0800)]
[PATCH] for_each_possible_cpu: ppc
for_each_cpu() actually iterates across all possible CPUs. We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs. This is inefficient and
possibly buggy.
We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.
This patch replaces for_each_cpu with for_each_possible_cpu.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
KAMEZAWA Hiroyuki [Tue, 28 Mar 2006 22:50:51 +0000 (14:50 -0800)]
[PATCH] for_each_possible_cpu: powerpc
for_each_cpu() actually iterates across all possible CPUs. We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs. This is inefficient and
possibly buggy.
We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.
This patch replaces for_each_cpu with for_each_possible_cpu.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Eugene Surovegin [Tue, 28 Mar 2006 18:13:12 +0000 (10:13 -0800)]
[PATCH] lock PTE before updating it in 440/BookE page fault handler
Fix 44x and BookE page fault handler to correctly lock PTE before
trying to pte_update() it, otherwise this PTE might be swapped out
after pte_present() check but before pte_uptdate() call, resulting in
corrupted PTE. This can happen with enabled preemption and low memory
condition.
Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:30 +0000 (16:11 -0800)]
[PATCH] send_sigqueue: simplify and fix the race
send_sigqueue() checks PF_EXITING, then locks p->sighand->siglock. This is
unsafe: 'p' can exit in between and set ->sighand = NULL. The race is
theoretical, the window is tiny and irqs are disabled by the caller, so I
don't think we need the fix for -stable tree.
Convert send_sigqueue() to use lock_task_sighand() helper.
Also, delete 'p->flags & PF_EXITING' re-check, it is unneeded and the
comment is wrong.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:29 +0000 (16:11 -0800)]
[PATCH] do_notify_parent_cldstop: remove 'to_self' param
The previous patch has changed callsites of do_notify_parent_cldstop() so that
to_self == (->ptrace & PT_PTRACED) always (as it should be). We can remove
this parameter now.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:28 +0000 (16:11 -0800)]
[PATCH] finish_stop: don't check stop_count < 0
Remove an obscure 'stop_count < 0' check in finish_stop(). The previous patch
made this case impossible.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:28 +0000 (16:11 -0800)]
[PATCH] simplify do_signal_stop()
do_signal_stop() considers 'thread_group_empty()' as a special case.
This was needed to avoid taking tasklist_lock. Since this lock is
unneeded any longer, we can remove this special case and simplify
the code even more.
Also, before this patch, finish_stop() was called with stop_count == -1
for 'thread_group_empty()' case. This is not strictly wrong, but confusing
and unneeded.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:27 +0000 (16:11 -0800)]
[PATCH] cleanup __exit_signal->cleanup_sighand path
Move 'tsk->sighand = NULL' from cleanup_sighand() to __exit_signal(). This
makes the exit path more understandable and allows us to do
cleanup_sighand() outside of ->siglock protected section.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:26 +0000 (16:11 -0800)]
[PATCH] make fork() atomic wrt pgrp/session signals
Eric W. Biederman wrote:
>
> Ok. SUSV3/Posix is clear, fork is atomic with respect
> to signals. Either a signal comes before or after a
> fork but not during. (See the rationale section).
> http://www.opengroup.org/onlinepubs/
000095399/functions/fork.html
>
> The tasklist_lock does not stop forks from adding to a process
> group. The forks stall while the tasklist_lock is held, but a fork
> that began before we grabbed the tasklist_lock simply completes
> afterwards, and the child does not receive the signal.
This also means that SIGSTOP or sig_kernel_coredump() signal can't
be delivered to pgrp/session reliably.
With this patch copy_process() returns -ERESTARTNOINTR when it
detects a pending signal, fork() will be restarted transparently
after handling the signals.
This patch also deletes now unneeded "group_stop_count > 0" check,
copy_process() can no longer succeed while group stop in progress.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-By: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:25 +0000 (16:11 -0800)]
[PATCH] pids: kill PIDTYPE_TGID
This patch kills PIDTYPE_TGID pid_type thus saving one hash table in
kernel/pid.c and speeding up subthreads create/destroy a bit. It is also a
preparation for the further tref/pids rework.
This patch adds 'struct list_head thread_group' to 'struct task_struct'
instead.
We don't detach group leader from PIDTYPE_PID namespace until another
thread inherits it's ->pid == ->tgid, so we are safe wrt premature
free_pidmap(->tgid) call.
Currently there are no users of find_task_by_pid_type(PIDTYPE_TGID).
Should the need arise, we can use find_task_by_pid()->group_leader.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-By: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:24 +0000 (16:11 -0800)]
[PATCH] do_sigaction: don't take tasklist_lock
do_sigaction() does not need tasklist_lock anymore, we can simplify the code.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:23 +0000 (16:11 -0800)]
[PATCH] do_group_exit: don't take tasklist_lock
do_group_exit() takes tasklist_lock for zap_other_threads(), this is unneeded
now.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:22 +0000 (16:11 -0800)]
[PATCH] do_signal_stop: don't take tasklist_lock
do_signal_stop() does not need tasklist_lock anymore. So it does not need to
do misc re-checks, and we can simplify the code.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:22 +0000 (16:11 -0800)]
[PATCH] relax sig_needs_tasklist()
handle_stop_signal() does not need tasklist_lock for SIG_KERNEL_STOP_MASK
signals anymore.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:21 +0000 (16:11 -0800)]
[PATCH] sys_times: don't take tasklist_lock
sys_times: don't take tasklist_lock
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:20 +0000 (16:11 -0800)]
[PATCH] do __unhash_process() under ->siglock
This patch moves __unhash_process() call from realease_task() to
__exit_signal(), so __detach_pid() is called with ->siglock held.
This means we don't need tasklist_lock to iterate over thread group anymore:
copy_process() was already changed to do attach_pid()
under ->siglock.
Eric's "pidhash-kill-switch_exec_pids.patch" from -mm
changed de_thread() so it doesn't touch PIDTYPE_TGID.
NOTE: de_thread() still needs some attention. It still changes task->pid
lockless. Taking ->sighand.siglock here allows to do more tasklist_lock
removals.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:19 +0000 (16:11 -0800)]
[PATCH] revert "Optimize sys_times for a single thread process"
This patch reverts 'CONFIG_SMP && thread_group_empty()' optimization in
sys_times(). The reason is that the next patch breaks memory ordering which
is needed for that optimization.
tasklist_lock in sys_times() will be eliminated completely by further patch.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:18 +0000 (16:11 -0800)]
[PATCH] move __exit_signal() to kernel/exit.c
__exit_signal() is private to release_task() now. I think it is better to
make it static in kernel/exit.c and export flush_sigqueue() instead - this
function is much more simple and straightforward.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:17 +0000 (16:11 -0800)]
[PATCH] rename __exit_sighand to cleanup_sighand
Cosmetic, rename __exit_sighand to cleanup_sighand and move it close to
copy_sighand().
This matches copy_signal/cleanup_signal naming, and I think it is easier to
follow.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:17 +0000 (16:11 -0800)]
[PATCH] cleanup __exit_signal()
This patch factors out duplicated code under 'if' branches. Also, BUG_ON()
conversions and whitespace cleanups.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:16 +0000 (16:11 -0800)]
[PATCH] copy_process: cleanup bad_fork_cleanup_signal
__exit_signal() does important cleanups atomically under ->siglock. It is
also called from copy_process's error path. This is not good, for example we
can't move __unhash_process() under ->siglock for that reason.
We should not mix these 2 paths, just look at ugly 'if (p->sighand)' under
'bad_fork_cleanup_sighand:' label. For copy_process() case it is sufficient
to just backout copy_signal(), nothing more.
Again, nobody can see this task yet. For CLONE_THREAD case we just decrement
signal->count, otherwise nobody can see this ->signal and we can free it
lockless.
This patch assumes it is safe to do exit_thread_group_keys() without
tasklist_lock.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:14 +0000 (16:11 -0800)]
[PATCH] copy_process: cleanup bad_fork_cleanup_sighand
The only caller of exit_sighand(tsk) is copy_process's error path. We can
call __exit_sighand() directly and kill exit_sighand().
This 'tsk' was not yet registered in pid_hash[] or init_task.tasks, it has no
external references, nobody can see it, and
IF (clone_flags & CLONE_SIGHAND)
At least 'current' has a reference to ->sighand, this
means atomic_dec_and_test(sighand->count) can't be true.
ELSE
Nobody can see this ->sighand, this means we can free it
without any locking.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:14 +0000 (16:11 -0800)]
[PATCH] introduce sig_needs_tasklist() helper
In my opinion this patch cleans up the code.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:13 +0000 (16:11 -0800)]
[PATCH] introduce lock_task_sighand() helper
Add lock_task_sighand() helper and converts group_send_sig_info() to use
it. Hopefully we will have more users soon.
This patch also removes '!sighand->count' and '!p->usage' checks, I think
they both are bogus, racy and unneeded (but probably it makes sense to
restore them as BUG_ON()s).
->sighand is cleared and it's ->count is decremented in release_task() with
sighand->siglock held, so it is a bug to have '!p->usage || !->count' after
we already locked and verified it is the same. On the other hand, an
already dead task without ->sighand can have a non-zero ->usage due to
ptrace, for example.
If we read the stale value of ->sighand we must see the change after
spin_lock(), because that change was done while holding that same old
->sighand.siglock.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:12 +0000 (16:11 -0800)]
[PATCH] convert sighand_cache to use SLAB_DESTROY_BY_RCU
This patch borrows a clever Hugh's 'struct anon_vma' trick.
Without tasklist_lock held we can't trust task->sighand until we locked it
and re-checked that it is still the same.
But this means we don't need to defer 'kmem_cache_free(sighand)'. We can
return the memory to slab immediately, all we need is to be sure that
sighand->siglock can't dissapear inside rcu protected section.
To do so we need to initialize ->siglock inside ctor function,
SLAB_DESTROY_BY_RCU does the rest.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:11 +0000 (16:11 -0800)]
[PATCH] release_task: replace open-coded ptrace_unlink()
Use ptrace_unlink() instead of open-coding. No changes in kernel/exit.o
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:10 +0000 (16:11 -0800)]
[PATCH] wait_for_helper: trivial style cleanup
Use NULL instead of (... *)0
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:09 +0000 (16:11 -0800)]
[PATCH] reparent_thread: use remove_parent/add_parent
Use remove_parent/add_parent instead of open coding.
No changes in kernel/exit.o
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:09 +0000 (16:11 -0800)]
[PATCH] pidhash: don't use zero pids
daemonize() calls set_special_pids(1,1), while init and kernel threads spawned
from init/main.c:init() run with 0,0 special pids. This patch changes
INIT_SIGNALS() so that that they run with ->pgrp == ->session == 1 also. This
patch relies on fact that swapper's pid == 1.
Now we have no hashed zero pids in pid_hash[].
User-space visibible change is that now /sbin/init runs with (1,1) special
pids and becomes a session leader.
Quoting Eric W. Biederman:
>
> daemonize consuming pids (1,1) then consumes pgrp 1. So that when
> /sbin/init calls setsid() it thinks /sbin/init is a process group
> leader and setsid() fails. So /sbin/init wants pgrp 1 session 1
> but doesn't get it. I am pretty certain daemonize did not exist so
> /sbin/init got pgrp 1 session 1 in 2.4.
>
> That is the bug that is being fixed.
>
> This patch takes things one step farther and essentially calls
> setsid() for pid == 1 before init is execed. That is new behavior
> but it cleans up the kernel as we now do not need to support the
> case of a process without a process group or a session.
>
> The only process that could have possibly cared was /sbin/init
> and it already calls setsid() because it doesn't want that.
>
> If this was going to break anything noticeable the change in behavior
> from 2.4 to 2.6 would have already done that.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:07 +0000 (16:11 -0800)]
[PATCH] pidhash: don't count idle threads
fork_idle() does unhash_process() just after copy_process(). Contrary,
boot_cpu's idle thread explicitely registers itself for each pid_type with nr
= 0.
copy_process() already checks p->pid != 0 before process_counts++, I think we
can just skip attach_pid() calls and job control inits for idle threads and
kill unhash_process(). We don't need to cleanup ->proc_dentry in fork_idle()
because with this patch idle threads are never hashed in
kernel/pid.c:pid_hash[].
We don't need to hash pid == 0 in pidmap_init(). free_pidmap() is never
called with pid == 0 arg, so it will never be reused. So it is still possible
to use pid == 0 in any PIDTYPE_xxx namespace from kernel/pid.c's POV.
However with this patch we don't hash pid == 0 for PIDTYPE_PID case. We still
have have PIDTYPE_PGID/PIDTYPE_SID entries with pid == 0: /sbin/init and
kernel threads which don't call daemonize().
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:06 +0000 (16:11 -0800)]
[PATCH] kill SET_LINKS/REMOVE_LINKS
Both SET_LINKS() and SET_LINKS/REMOVE_LINKS() have exactly one caller, and
these callers already check thread_group_leader().
This patch kills theese macros, they mix two different things: setting
process's parent and registering it in init_task.tasks list. Callers are
updated to do these actions by hand.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:05 +0000 (16:11 -0800)]
[PATCH] don't use REMOVE_LINKS/SET_LINKS for reparenting
There are places where kernel uses REMOVE_LINKS/SET_LINKS while changing
process's ->parent. Use add_parent/remove_parent instead, they don't abuse
of global process list.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:05 +0000 (16:11 -0800)]
[PATCH] remove add_parent()'s parent argument
add_parent(p, parent) is always called with parent == p->parent, and it makes
no sense to do it differently. This patch removes this argument.
No changes in affected .o files.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:11:04 +0000 (16:11 -0800)]
[PATCH] choose_new_parent: remove unused arg, sanitize exit_state check
'child_reaper' arg is not used in choose_new_parent().
"->exit_state >= EXIT_ZOMBIE" check is a leftover, was
valid when EXIT_ZOMBIE lived in ->state var.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric W. Biederman [Wed, 29 Mar 2006 00:11:03 +0000 (16:11 -0800)]
[PATCH] pidhash: kill switch_exec_pids
switch_exec_pids is only called from de_thread by way of exec, and it is
only called when we are exec'ing from a non thread group leader.
Currently switch_exec_pids gives the leader the pid of the thread and
unhashes and rehashes all of the process groups. The leader is already in
the EXIT_DEAD state so no one cares about it's pids. The only concern for
the leader is that __unhash_process called from release_task will function
correctly. If we don't touch the leader at all we know that
__unhash_process will work fine so there is no need to touch the leader.
For the task becomming the thread group leader, we just need to give it the
pid of the old thread group leader, add it to the task list, and attach it
to the session and the process group of the thread group.
Currently de_thread is also adding the task to the task list which is just
silly.
Currently the only leader of __detach_pid besides detach_pid is
switch_exec_pids because of the ugly extra work that was being
performed.
So this patch removes switch_exec_pids because it is doing too much, it is
creating an unnecessary special case in pid.c, duing work duplicated in
de_thread, and generally obscuring what it is going on.
The necessary work is added to de_thread, and it seems to be a little
clearer there what is going on.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric W. Biederman [Wed, 29 Mar 2006 00:11:02 +0000 (16:11 -0800)]
[PATCH] do_SAK: don't depend on session ID 0
I'm not really certain what the thinking was but the code obviously wanted to
walk processes other than just those in it's session, for purposes of do_SAK.
Just walking those tasks that don't have a session assigned sounds at the very
least incomplete.
So modify the code to kill everything in the session and anything else that
might have the tty open. Hopefully this helps if the do_SAK functionality is
ever finished.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric W. Biederman [Wed, 29 Mar 2006 00:11:01 +0000 (16:11 -0800)]
[PATCH] do_tty_hangup: use group_send_sig_info not send_group_sig_info
We already have the tasklist_lock so there is no need for us to reacquire it
with send_group_sig_info. reader/writer locks allow multiple readers and thus
recursion so the old code was ok just wastful.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric W. Biederman [Wed, 29 Mar 2006 00:11:00 +0000 (16:11 -0800)]
[PATCH] Remove dead kill_sl prototype from sched.h
The kill_sl function doesn't exist in the kernel so a prototype is completely
unnecessary.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 29 Mar 2006 00:10:59 +0000 (16:10 -0800)]
[PATCH] simplify exec from init's subthread
I think it is enough to take tasklist_lock for reading while changing
child_reaper:
Reparenting needs write_lock(tasklist_lock)
Only one thread in a thread group can do exec()
sighand->siglock garantees that get_signal_to_deliver()
will not see a stale value of child_reaper.
This means that we can change child_reaper earlier, without calling
zap_other_threads() twice.
"child_reaper = current" is a NOOP when init does exec from main thread, we
don't care.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric W. Biederman [Wed, 29 Mar 2006 00:10:58 +0000 (16:10 -0800)]
[PATCH] exec: allow init to exec from any thread.
After looking at the problem of init calling exec some more I figured out
an easy way to make the code work.
The actual symptom without out this patch is that all threads will die
except pid == 1, and the thread calling exec. The thread calling exec will
wait forever for pid == 1 to die.
Since pid == 1 does not install a handler for SIGKILL it will never die.
This modifies the tests for init from current->pid == 1 to the equivalent
current == child_reaper. And then it causes exec in the ugly case to
modify child_reaper.
The only weird symptom is that you wind up with an init process that
doesn't have the oldest start time on the box.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Mackerras [Wed, 29 Mar 2006 02:24:50 +0000 (13:24 +1100)]
Merge ../linux-2.6