Linus Torvalds [Wed, 31 Dec 2008 01:32:25 +0000 (17:32 -0800)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
sata_sil: add Large Block Transfer support
[libata] ata_piix: cleanup dmi strings checking
DMI: add dmi_match
libata: blacklist NCQ on OCZ CORE 2 SSD (resend)
[libata] Update kernel-doc comments to match source code
libata: perform port detach in EH
libata: when restoring SControl during detach do the PMP links first
libata: beef up iterators
Linus Torvalds [Wed, 31 Dec 2008 01:31:25 +0000 (17:31 -0800)]
Merge branch 'oprofile-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
oprofile: select RING_BUFFER
ring_buffer: adding EXPORT_SYMBOLs
oprofile: fix lost sample counter
oprofile: remove nr_available_slots()
oprofile: port to the new ring_buffer
ring_buffer: add remaining cpu functions to ring_buffer.h
oprofile: moving cpu_buffer_reset() to cpu_buffer.h
oprofile: adding cpu_buffer_entries()
oprofile: adding cpu_buffer_write_commit()
oprofile: adding cpu buffer r/w access functions
ftrace: remove unused function arg in trace_iterator_increment()
ring_buffer: update description for ring_buffer_alloc()
oprofile: set values to default when creating oprofilefs
oprofile: implement switch/case in buffer_sync.c
x86/oprofile: cleanup IBS init/exit functions in op_model_amd.c
x86/oprofile: reordering IBS code in op_model_amd.c
oprofile: fix typo
oprofile: whitspace changes only
oprofile: update comment for oprofile_add_sample()
oprofile: comment cleanup
Linus Torvalds [Wed, 31 Dec 2008 01:28:09 +0000 (17:28 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/penberg/slab-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slub: avoid leaking caches or refcounts on sysfs error
slab: Fix comment on #endif
slab: remove GFP_THISNODE clearing from alloc_slabmgmt()
slub: Add might_sleep_if() to slab_alloc()
SLUB: failslab support
slub: Fix incorrect use of loose
slab: Update the kmem_cache_create documentation regarding the name parameter
slub: make early_kmem_cache_node_alloc void
slab: unsigned slabp->inuse cannot be less than 0
slub - fix get_object_page comment
SLUB: Replace __builtin_return_address(0) with _RET_IP_.
SLUB: cleanup - define macros instead of hardcoded numbers
Linus Torvalds [Wed, 31 Dec 2008 01:25:49 +0000 (17:25 -0800)]
Merge branch 'drm-next' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (37 commits)
drm/i915: fix modeset devname allocation + agp init return check.
drm/i915: Remove redundant test in error path.
drm: Add a debug node for vblank state.
drm: Avoid use-before-null-test on dev in drm_cleanup().
drm/i915: Don't print to dmesg when taking signal during object_pin.
drm: pin new and unpin old buffer when setting a mode.
drm/i915: un-EXPORT and make 'intelfb_panic' static
drm/i915: Delete unused, pointless i915_driver_firstopen.
drm/i915: fix sparse warnings: returning void-valued expression
drm/i915: fix sparse warnings: move 'extern' decls to header file
drm/i915: fix sparse warnings: make symbols static
drm/i915: fix sparse warnings: declare one-bit bitfield as unsigned
drm/i915: Don't double-unpin buffers if we take a signal in evict_everything().
drm/i915: Fix fbcon setup to align display pitch to 64b.
drm/i915: Add missing userland definitions for gem init/execbuffer.
i915/drm: provide compat defines for userspace for certain struct members.
drm: drop DRM_IOCTL_MODE_REPLACEFB, add+remove works just as well.
drm: sanitise drm modesetting API + remove unused hotplug
drm: fix allowing master ioctls on non-master fds.
drm/radeon: use locked rmmap to remove sarea mapping.
...
Linus Torvalds [Wed, 31 Dec 2008 01:25:29 +0000 (17:25 -0800)]
Merge branch 'agp-next' of git://git./linux/kernel/git/airlied/agp-2.6
* 'agp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
agp/intel: Fix broken ® symbol in device name.
agp/intel: add support for G41 chipset
Linus Torvalds [Wed, 31 Dec 2008 01:23:31 +0000 (17:23 -0800)]
Merge git://git./linux/kernel/git/davem/sparc-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (98 commits)
sparc: move select of ARCH_SUPPORTS_MSI
sparc: drop SUN_IO
sparc: unify sections.h
sparc: use .data.init_task section for init_thread_union
sparc: fix array overrun check in of_device_64.c
sparc: unify module.c
sparc64: prepare module_64.c for unification
sparc64: use bit neutral Elf symbols
sparc: unify module.h
sparc: introduce CONFIG_BITS
sparc: fix hardirq.h removal fallout
sparc64: do not export pus_fs_struct
sparc: use sparc64 version of scatterlist.h
sparc: Commonize memcmp assembler.
sparc: Unify strlen assembler.
sparc: Add asm/asm.h
sparc: Kill memcmp_32.S code which has been ifdef'd out for centuries.
sparc: replace for_each_cpu_mask_nr with for_each_cpu
sparc: fix sparse warnings in irq_32.c
sparc: add include guards to kernel.h
...
Linus Torvalds [Wed, 31 Dec 2008 01:20:05 +0000 (17:20 -0800)]
Merge branch 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block: (43 commits)
bio: get rid of bio_vec clearing
bounce: don't rely on a zeroed bio_vec list
cciss: simplify parameters to deregister_disk function
cfq-iosched: fix race between exiting queue and exiting task
loop: Do not call loop_unplug for not configured loop device.
loop: Flush possible running bios when loop device is released.
alpha: remove dead BIO_VMERGE_BOUNDARY
Get rid of CONFIG_LSF
block: make blk_softirq_init() static
block: use min_not_zero in blk_queue_stack_limits
block: add one-hit cache for disk partition lookup
cfq-iosched: remove limit of dispatch depth of max 4 times quantum
nbd: tell the block layer that it is not a rotational device
block: get rid of elevator_t typedef
aio: make the lookup_ioctx() lockless
bio: add support for inlining a number of bio_vecs inside the bio
bio: allow individual slabs in the bio_set
bio: move the slab pointer inside the bio_set
bio: only mempool back the largest bio_vec slab cache
block: don't use plugging on SSD devices
...
Linus Torvalds [Wed, 31 Dec 2008 00:20:19 +0000 (16:20 -0800)]
Merge branch 'irq-core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, sparseirq: clean up Kconfig entry
x86: turn CONFIG_SPARSE_IRQ off by default
sparseirq: fix numa_migrate_irq_desc dependency and comments
sparseirq: add kernel-doc notation for new member in irq_desc, -v2
locking, irq: enclose irq_desc_lock_class in CONFIG_LOCKDEP
sparseirq, xen: make sure irq_desc is allocated for interrupts
sparseirq: fix !SMP building, #2
x86, sparseirq: move irq_desc according to smp_affinity, v7
proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQ
sparse irqs: add irqnr.h to the user headers list
sparse irqs: handle !GENIRQ platforms
sparseirq: fix !SMP && !PCI_MSI && !HT_IRQ build
sparseirq: fix Alpha build failure
sparseirq: fix typo in !CONFIG_IO_APIC case
x86, MSI: pass irq_cfg and irq_desc
x86: MSI start irq numbering from nr_irqs_gsi
x86: use NR_IRQS_LEGACY
sparse irq_desc[] array: core kernel and x86 changes
genirq: record IRQ_LEVEL in irq_desc[]
irq.h: remove padding from irq_desc on 64bits
Linus Torvalds [Wed, 31 Dec 2008 00:16:21 +0000 (16:16 -0800)]
Merge branch 'timers-core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimers: fix warning in kernel/hrtimer.c
x86: make sure we really have an hpet mapping before using it
x86: enable HPET on Fujitsu u9200
linux/timex.h: cleanup for userspace
posix-timers: simplify de_thread()->exit_itimers() path
posix-timers: check ->it_signal instead of ->it_pid to validate the timer
posix-timers: use "struct pid*" instead of "struct task_struct*"
nohz: suppress needless timer reprogramming
clocksource, acpi_pm.c: put acpi_pm_read_slow() under CONFIG_PCI
nohz: no softirq pending warnings for offline cpus
hrtimer: removing all ur callback modes, fix
hrtimer: removing all ur callback modes, fix hotplug
hrtimer: removing all ur callback modes
x86: correct link to HPET timer specification
rtc-cmos: export second NVRAM bank
Fixed up conflicts in sound/drivers/pcsp/pcsp.c and sound/core/hrtimer.c
manually.
Linus Torvalds [Wed, 31 Dec 2008 00:10:19 +0000 (16:10 -0800)]
Merge branch 'core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
stacktrace: provide save_stack_trace_tsk() weak alias
rcu: provide RCU options on non-preempt architectures too
printk: fix discarding message when recursion_bug
futex: clean up futex_(un)lock_pi fault handling
"Tree RCU": scalable classic RCU implementation
futex: rename field in futex_q to clarify single waiter semantics
x86/swiotlb: add default swiotlb_arch_range_needs_mapping
x86/swiotlb: add default phys<->bus conversion
x86: unify pci iommu setup and allow swiotlb to compile for 32 bit
x86: add swiotlb allocation functions
swiotlb: consolidate swiotlb info message printing
swiotlb: support bouncing of HighMem pages
swiotlb: factor out copy to/from device
swiotlb: add arch hook to force mapping
swiotlb: allow architectures to override phys<->bus<->phys conversions
swiotlb: add comment where we handle the overflow of a dma mask on 32 bit
rcu: fix rcutorture behavior during reboot
resources: skip sanity check of busy resources
swiotlb: move some definitions to header
swiotlb: allow architectures to override swiotlb pool allocation
...
Fix up trivial conflicts in
arch/x86/kernel/Makefile
arch/x86/mm/init_32.c
include/linux/hardirq.h
as per Ingo's suggestions.
Robert Hancock [Thu, 25 Dec 2008 01:06:06 +0000 (19:06 -0600)]
sata_sil: add Large Block Transfer support
This implements support for the Large Block Transfer feature found in Silicon
Image 311x controllers. This allows transferring bigger contiguous chunks of
data from system memory and avoids the 64KB boundary restriction of standard
SFF controllers.
This is based on a patch from Jeff Garzik (from the sii-lbt branch of
libata-dev) but includes a few bug fixes: Since the bmdma2 register does not
implement the status bits, the original bmdma register must be used except
where the bmdma2 register is required. As well the DMA boundary should be
31-bit instead of 32-bit since the top bit of the length field is still
required for the PRD end-of-table flag.
Signed-off-by: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Jiri Slaby [Wed, 10 Dec 2008 13:07:22 +0000 (14:07 +0100)]
[libata] ata_piix: cleanup dmi strings checking
Commit
ATA: piix, fix pointer deref on suspend
fixed a possible oops in an ugly manner. Use newly introduced dmi_match()
to make the code pretty again.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Alexandru Romanescu <a_romanescu@yahoo.co.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Jiri Slaby [Wed, 10 Dec 2008 13:07:21 +0000 (14:07 +0100)]
DMI: add dmi_match
Add a wrapper for testing system_info which will handle also NULL
system infos.
This will be used by the ata PIIX driver.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Alexandru Romanescu <a_romanescu@yahoo.co.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Lubomir Bulej [Mon, 22 Dec 2008 10:35:22 +0000 (11:35 +0100)]
libata: blacklist NCQ on OCZ CORE 2 SSD (resend)
The patchlet below blacklists NCQ on OCZ CORE v2 SSD drive(s). Even
though the drive advertises NCQ support with queue depth 1, it responds
with all-zeroes FIS to NCQ commands which triggers ata error handling
several times before the kernel decides to disable NCQ on the drive.
Signed-off-by: Lubomir Bulej <lubomir.bulej@dsrg.mff.cuni.cz>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Pekka Enberg [Mon, 29 Dec 2008 09:47:05 +0000 (11:47 +0200)]
Merge branch 'topic/failslab' into for-linus
Conflicts:
mm/slub.c
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Pekka Enberg [Mon, 29 Dec 2008 09:45:47 +0000 (11:45 +0200)]
Merge branches 'topic/fixes', 'topic/cleanups' and 'topic/documentation' into for-linus
David Rientjes [Thu, 18 Dec 2008 06:09:46 +0000 (22:09 -0800)]
slub: avoid leaking caches or refcounts on sysfs error
If a slab cache is mergeable and the sysfs alias cannot be added, the
target cache shall have its refcount decremented. kmem_cache_create()
will return NULL, so if kmem_cache_destroy() is ever called on the target
cache, it will never be freed if the refcount has been leaked.
Likewise, if a slab cache is not mergeable and the sysfs link cannot be
added, the new cache shall be removed from the slab_caches list.
kmem_cache_create() will return NULL, so it will be impossible to call
kmem_cache_destroy() on it.
Both of these operations require slub_lock since refcount of all slab
caches and slab_caches are protected by the lock.
In the mergeable case, it would be better to restore objsize and offset
back to their original values, but this could race with another merge
since slub_lock was dropped.
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Pascal Terjan [Tue, 25 Nov 2008 14:08:19 +0000 (15:08 +0100)]
slab: Fix comment on #endif
This #endif in slab.h is described as closing the inner block while it's for
the big CONFIG_NUMA one. That makes reading the code a bit harder.
This trivial patch fixes the comment.
Signed-off-by: Pascal Terjan <pterjan@mandriva.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Pekka Enberg [Wed, 26 Nov 2008 08:01:31 +0000 (10:01 +0200)]
slab: remove GFP_THISNODE clearing from alloc_slabmgmt()
Commit
6cb062296f73e74768cca2f3eaf90deac54de02d ("Categorize GFP flags")
left one call-site in alloc_slabmgmt() to clear GFP_THISNODE instead of
GFP_CONSTRAINT_MASK. Unfortunately, that ends up clearing __GFP_NOWARN
and __GFP_NORETRY as well which is not what we want. As the only caller
of alloc_slabmgmt() already clears GFP_CONSTRAINT_MASK before passing
local_flags to it, we can just remove the clearing of GFP_THISNODE.
This patch should fix spurious page allocation failure warnings on the
mempool_alloc() path. See the following URL for the original discussion
of the bug:
http://lkml.org/lkml/2008/10/27/100
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
OGAWA Hirofumi [Wed, 19 Nov 2008 12:23:59 +0000 (21:23 +0900)]
slub: Add might_sleep_if() to slab_alloc()
Currently SLUB doesn't warn about __GFP_WAIT. Add it into slab_alloc().
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Akinobu Mita [Tue, 23 Dec 2008 10:37:01 +0000 (19:37 +0900)]
SLUB: failslab support
Currently fault-injection capability for SLAB allocator is only
available to SLAB. This patch makes it available to SLUB, too.
[penberg@cs.helsinki.fi: unify slab and slub implementations]
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Dave Airlie [Mon, 29 Dec 2008 06:35:02 +0000 (16:35 +1000)]
drm/i915: fix modeset devname allocation + agp init return check.
devname needs to be allocated before the irq is installed, so the
irq routines get the correct name in /proc.
Also check the return value from the AGP init function, and
fixup the exit points.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Julia Lawall [Sun, 21 Dec 2008 15:28:47 +0000 (16:28 +0100)]
drm/i915: Remove redundant test in error path.
The error path for object list being null is in the second goto target.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Eric Anholt [Sat, 20 Dec 2008 01:23:38 +0000 (17:23 -0800)]
drm: Add a debug node for vblank state.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Eric Anholt [Fri, 19 Dec 2008 23:07:11 +0000 (15:07 -0800)]
drm: Avoid use-before-null-test on dev in drm_cleanup().
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Eric Anholt [Fri, 19 Dec 2008 22:47:48 +0000 (14:47 -0800)]
drm/i915: Don't print to dmesg when taking signal during object_pin.
This showed up in logs where people had a hung chip, so pinning was blocked
on the chip unpinning other buffers, and the X Server took its scheduler
signal during that time.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Kristian Høgsberg [Thu, 18 Dec 2008 03:14:46 +0000 (22:14 -0500)]
drm: pin new and unpin old buffer when setting a mode.
This removes the requirement for user space to pin a buffer before
setting a mode that is backed by the pixels from that buffer.
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Hannes Eder [Fri, 19 Dec 2008 11:34:27 +0000 (12:34 +0100)]
drm/i915: un-EXPORT and make 'intelfb_panic' static
Fix this sparse warning:
drivers/gpu/drm/i915/intel_fb.c:417:5: warning: symbol 'intelfb_panic' was not declared. Should it be static?
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Eric Anholt [Fri, 19 Dec 2008 22:30:31 +0000 (14:30 -0800)]
drm/i915: Delete unused, pointless i915_driver_firstopen.
Thanks to Hannes Eder for pointing out that this code was dead according to
sparse.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Hannes Eder [Thu, 18 Dec 2008 20:24:18 +0000 (21:24 +0100)]
drm/i915: fix sparse warnings: returning void-valued expression
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Hannes Eder [Thu, 18 Dec 2008 20:22:24 +0000 (21:22 +0100)]
drm/i915: fix sparse warnings: move 'extern' decls to header file
Move 'extern'-decls from "intel_dvo.c" to "dvo.h", as "dvo.h" is
included by and only by files where the symbols are either defined or
used.
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Hannes Eder [Thu, 18 Dec 2008 20:18:47 +0000 (21:18 +0100)]
drm/i915: fix sparse warnings: make symbols static
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Hannes Eder [Thu, 18 Dec 2008 14:09:00 +0000 (15:09 +0100)]
drm/i915: fix sparse warnings: declare one-bit bitfield as unsigned
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Eric Anholt [Wed, 10 Dec 2008 18:09:41 +0000 (10:09 -0800)]
drm/i915: Don't double-unpin buffers if we take a signal in evict_everything().
We haven't seen this in practice, but it was visible when looking at a bug
report from when i915_gem_evict_everything() was broken and would always
return error.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Eric Anholt [Thu, 11 Dec 2008 01:23:00 +0000 (17:23 -0800)]
drm/i915: Fix fbcon setup to align display pitch to 64b.
This is required by the display plane, and fixes 1400x1050 laptop displays.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Eric Anholt [Thu, 18 Dec 2008 06:32:14 +0000 (22:32 -0800)]
drm/i915: Add missing userland definitions for gem init/execbuffer.
fdo bug #19132.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 19 Dec 2008 05:07:46 +0000 (15:07 +1000)]
i915/drm: provide compat defines for userspace for certain struct members.
Painfully userspace started using new names that were never actually to be
used from the external repo.
Also fill out the gaps in the structure for old/new userspace compat
Add compat defines for these structs.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Kristian H�gsberg [Thu, 18 Dec 2008 03:14:37 +0000 (13:14 +1000)]
drm: drop DRM_IOCTL_MODE_REPLACEFB, add+remove works just as well.
The replace fb ioctl replaces the backing buffer object for a modesetting
framebuffer object. This can be acheived by just creating a new
framebuffer backed by the new buffer object, setting that for the crtcs
in question and then removing the old framebuffer object.
Signed-off-by: Kristian Hogsberg <krh@redhat.com>
Acked-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jakob Bornecrantz [Fri, 19 Dec 2008 04:50:50 +0000 (14:50 +1000)]
drm: sanitise drm modesetting API + remove unused hotplug
The initially merged modesetting API has some uglies in it, this
cleans up the struct members and ioctl ordering for initial submission.
It also removes the unneeded hotplug infrastructure.
airlied:- I've pulled this patch in from git modesetting-gem tree.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 19 Dec 2008 01:00:46 +0000 (12:00 +1100)]
drm: fix allowing master ioctls on non-master fds.
The multi-master patches changed master to a pointer, and this fell out,
change to use is_master.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 18 Dec 2008 23:23:14 +0000 (10:23 +1100)]
drm/radeon: use locked rmmap to remove sarea mapping.
this exports the locked version of the symbol as struct_mutex locks it all.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 18 Dec 2008 23:22:02 +0000 (10:22 +1100)]
drm/radeon: fix missing hunk from the master changes.
Thanks to Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> for reporting
this.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 18 Dec 2008 06:59:02 +0000 (16:59 +1000)]
drm: fix useless gcc unused variable warning
the calling function doesn't call this function unless one of the two
states that sets the value is true.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 18 Dec 2008 06:56:11 +0000 (16:56 +1000)]
drm/radeon: fix warning due to PAGE_SIZE max
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 13 Dec 2008 08:21:08 +0000 (18:21 +1000)]
drm: kconfig have drm core select i2c for kms
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 11 Dec 2008 07:06:35 +0000 (17:06 +1000)]
drm: PAGE_CACHE_WC is x86 only so far
The page protections need to be checked whether they need to be more flexible.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 8 Dec 2008 04:55:27 +0000 (14:55 +1000)]
drm: pick an 800x600@60HZ mode by default for unknown CRT.
This is what X picks now, so we should do the same.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Eric Anholt [Thu, 4 Dec 2008 06:50:02 +0000 (22:50 -0800)]
drm/i915: Fix stolen memory detection on G45 and GM45.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Eric Anholt [Thu, 4 Dec 2008 06:43:14 +0000 (22:43 -0800)]
drm/i915: Register module dependencies for the modesetting code.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jesse Barnes [Fri, 7 Nov 2008 22:24:08 +0000 (14:24 -0800)]
DRM: i915: add mode setting support
This commit adds i915 driver support for the DRM mode setting APIs.
Currently, VGA, LVDS, SDVO DVI & VGA, TV and DVO LVDS outputs are
supported. HDMI, DisplayPort and additional SDVO output support will
follow.
Support for the mode setting code is controlled by the new 'modeset'
module option. A new config option, CONFIG_DRM_I915_KMS controls the
default behavior, and whether a PCI ID list is built into the module for
use by user level module utilities.
Note that if mode setting is enabled, user level drivers that access
display registers directly or that don't use the kernel graphics memory
manager will likely corrupt kernel graphics memory, disrupt output
configuration (possibly leading to hangs and/or blank displays), and
prevent panic/oops messages from appearing. So use caution when
enabling this code; be sure your user level code supports the new
interfaces.
A new SysRq key, 'g', provides emergency support for switching back to
the kernel's framebuffer console; which is useful for testing.
Co-authors: Dave Airlie <airlied@linux.ie>, Hong Liu <hong.liu@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 7 Nov 2008 22:05:41 +0000 (14:05 -0800)]
DRM: add mode setting support
Add mode setting support to the DRM layer.
This is a fairly big chunk of work that allows DRM drivers to provide
full output control and configuration capabilities to userspace. It was
motivated by several factors:
- the fb layer's APIs aren't suited for anything but simple
configurations
- coordination between the fb layer, DRM layer, and various userspace
drivers is poor to non-existent (radeonfb excepted)
- user level mode setting drivers makes displaying panic & oops
messages more difficult
- suspend/resume of graphics state is possible in many more
configurations with kernel level support
This commit just adds the core DRM part of the mode setting APIs.
Driver specific commits using these new structure and APIs will follow.
Co-authors: Jesse Barnes <jbarnes@virtuousgeek.org>, Jakob Bornecrantz <jakob@tungstengraphics.com>
Contributors: Alan Hourihane <alanh@tungstengraphics.com>, Maarten Maathuis <madman2003@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jesse Barnes [Wed, 12 Nov 2008 18:03:55 +0000 (10:03 -0800)]
drm/i915: add GEM GTT mapping support
Use the new core GEM object mapping code to allow GTT mapping of GEM
objects on i915. The fault handler will make sure a fence register is
allocated too, if the object in question is tiled.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jesse Barnes [Wed, 5 Nov 2008 18:31:53 +0000 (10:31 -0800)]
drm: GEM mmap support
Add core support for mapping of GEM objects. Drivers should provide a
vm_operations_struct if they want to support page faulting of objects.
The code for handling GEM object offsets was taken from TTM, which was
written by Thomas Hellström.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Eric Anholt [Wed, 5 Nov 2008 20:37:42 +0000 (12:37 -0800)]
drm/i915: Add /proc debugging entry for reading out the HWS.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 7 Dec 2008 02:02:21 +0000 (12:02 +1000)]
drm: reorganise start and load.
Make sure we have the primary node so the device can add maps.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Vegard Nossum [Tue, 2 Dec 2008 03:38:47 +0000 (13:38 +1000)]
drm: fix leak of uninitialized data to userspace
...so drm_getunique() is trying to copy some uninitialized data to
userspace. The ECX register contains the number of words that are
left to copy -- so there are 5 * 4 = 20 bytes left. The offset of the
first uninitialized byte (counting from the start of the string) is
also 20 (i.e. 0xf65d2294&((1 << 5)-1) == 20). So somebody tried to
copy 40 bytes when the string was only 19 long.
In drm_set_busid() we have this code:
dev->unique_len = 40;
dev->unique = drm_alloc(dev->unique_len + 1, DRM_MEM_DRIVER);
...
len = snprintf(dev->unique, dev->unique_len, pci:%04x:%02x:%02x.%d",
...so it seems that dev->unique is never updated to reflect the
actual length of the string. The remaining bytes (20 in this case)
are random uninitialized bytes that are copied into userspace.
This patch fixes the problem by setting dev->unique_len after the
snprintf().
airlied- I've had to fix this up to store the alloced size so
we have it for drm_free later.
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Vegard Nossum <vegardno@thuin.ifi.uio.no>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 28 Nov 2008 04:22:24 +0000 (14:22 +1000)]
drm: move to kref per-master structures.
This is step one towards having multiple masters sharing a drm
device in order to get fast-user-switching to work.
It splits out the information associated with the drm master
into a separate kref counted structure, and allocates this when
a master opens the device node. It also allows the current master
to abdicate (say while VT switched), and a new master to take over
the hardware.
It moves the Intel and radeon drivers to using the sarea from
within the new master structures.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 28 Nov 2008 03:43:47 +0000 (13:43 +1000)]
drm: cleanup exit path for module unload
The current sub-module unload exit path is a mess, it tries
to abuse the idr. Just keep a list of devices per driver struct
and free them in-order on rmmod.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jens Axboe [Tue, 23 Dec 2008 11:46:21 +0000 (12:46 +0100)]
bio: get rid of bio_vec clearing
We don't need to clear the memory used for adding bio_vec entries,
since nobody should be looking at members unitialized. Any valid
use should be below bio->bi_vcnt, and that members up until that count
must be valid since they were added through bio_add_page().
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Tue, 23 Dec 2008 11:44:19 +0000 (12:44 +0100)]
bounce: don't rely on a zeroed bio_vec list
__blk_queue_bounce() relies on a zeroed bio_vec list, since it looks
up arbitrary indexes in the allocated bio. The block layer only
guarentees that added entries are valid, so clear memory after alloc.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Stephen M. Cameron [Thu, 18 Dec 2008 13:55:51 +0000 (14:55 +0100)]
cciss: simplify parameters to deregister_disk function
Simplify parameters to deregister_disk function.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Mon, 15 Dec 2008 20:19:25 +0000 (21:19 +0100)]
cfq-iosched: fix race between exiting queue and exiting task
Original patch from Nikanth Karthikesan <knikanth@suse.de>
When a queue exits the queue lock is taken and cfq_exit_queue() would free all
the cic's associated with the queue.
But when a task exits, cfq_exit_io_context() gets cic one by one and then
locks the associated queue to call __cfq_exit_single_io_context. It looks like
between getting a cic from the ioc and locking the queue, the queue might have
exited on another cpu.
Fix this by rechecking the cfq_io_context queue key inside the queue lock
again, and not calling into __cfq_exit_single_io_context() if somebody
beat us to it.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Milan Broz [Fri, 12 Dec 2008 13:50:49 +0000 (14:50 +0100)]
loop: Do not call loop_unplug for not configured loop device.
In loop_unplug() function is expected that mapping is set
and lo->lo_backing_file is not NULL.
Unfortunately loop_set_fd() set the request queue unplug function,
but loop_clr_fd() doesn't clear that.
Loop device allows open of non-configured loop in some situations.
If the unplug on request queue is called, loop module oopses because
of missing lo_backing_file.
Simple reproducer:
losetup /dev/loop0 /xxx
losetup -d /dev/loop0
dmsetup create x --table "0 1 linear /dev/loop0 0"
EIP is at loop_unplug+0x1d/0x3b
...
Call Trace:
blk_unplug+0x57/0x5e
dm_table_unplug_all+0x34/0x77 [dm_mod]
destroy_inode+0x27/0x38
generic_delete_inode+0xd5/0xd9
iput+0x4b/0x4e
dm_resume+0xca/0xfe [dm_mod]
dev_suspend+0x143/0x165 [dm_mod]
dm_ctl_ioctl+0x18e/0x1cf [dm_mod]
dev_suspend+0x0/0x165 [dm_mod]
dm_ctl_ioctl+0x0/0x1cf [dm_mod]
vfs_ioctl+0x22/0x69
do_vfs_ioctl+0x39d/0x3c7
trace_hardirqs_on+0xb/0xd
remove_vma+0x50/0x56
do_munmap+0x21c/0x237
sys_ioctl+0x2c/0x45
sysenter_do_call+0x12/0x31
Several reports here
http://www.kerneloops.org/search.php?search=loop_unplug
Fix it by simply clear unplug function together with
removing of backing file.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Milan Broz [Fri, 12 Dec 2008 13:48:27 +0000 (14:48 +0100)]
loop: Flush possible running bios when loop device is released.
When there are still queued bios and reference count
drops to zero, loop device must flush all queued bios.
Otherwise it can lead to situation that caller
closes the device, but some bios are still running
and endio() function call later OOpses when uses
unallocated mempool.
This happens for example when running dm-crypt over loop,
here is typical oops backtrace:
Oops: 0000 [#1] PREEMPT SMP
EIP is at mempool_free+0x12/0x6b
...
crypt_dec_pending+0x50/0x54 [dm_crypt]
crypt_endio+0x9f/0xa7 [dm_crypt]
crypt_endio+0x0/0xa7 [dm_crypt]
bio_endio+0x2b/0x2e
loop_thread+0x37a/0x3b1
do_lo_send_aops+0x0/0x165
autoremove_wake_function+0x0/0x33
loop_thread+0x0/0x3b1
kthread+0x3b/0x61
kthread+0x0/0x61
kernel_thread_helper+0x7/0x10
(But crash is reproducible with different dm targets
running over loop device too.)
Patch fixes it by flushing the bios in release call,
reusing the flush mechanism for switching backing store.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
FUJITA Tomonori [Fri, 12 Dec 2008 08:52:35 +0000 (09:52 +0100)]
alpha: remove dead BIO_VMERGE_BOUNDARY
The block layer dropped the virtual merge feature
(
b8b3e16cfe6435d961f6aaebcfd52a1ff2a988c5). BIO_VMERGE_BOUNDARY
definition is meaningless now.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Fri, 12 Dec 2008 08:51:16 +0000 (09:51 +0100)]
Get rid of CONFIG_LSF
We have two seperate config entries for large devices/files. One
is CONFIG_LBD that guards just the devices, the other is CONFIG_LSF
that handles large files. This doesn't make a lot of sense, you typically
want both or none. So get rid of CONFIG_LSF and change CONFIG_LBD wording
to indicate that it covers both.
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Roel Kluin [Wed, 10 Dec 2008 14:47:33 +0000 (15:47 +0100)]
block: make blk_softirq_init() static
Sparse asked whether these could be static.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
FUJITA Tomonori [Thu, 4 Dec 2008 07:56:35 +0000 (08:56 +0100)]
block: use min_not_zero in blk_queue_stack_limits
zero is invalid for max_phys_segments, max_hw_segments, and
max_segment_size. It's better to use use min_not_zero instead of
min. min() works though (because the commit
0e435ac makes sure that
these values are set to the default values, non zero, if a queue is
initialized properly).
With this patch, blk_queue_stack_limits does the almost same thing
that dm's combine_restrictions_low() does. I think that it's easy to
remove dm's combine_restrictions_low.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Fri, 24 Oct 2008 10:52:42 +0000 (12:52 +0200)]
block: add one-hit cache for disk partition lookup
disk_map_sector_rcu() returns a partition from a sector offset,
which we use for IO statistics on a per-partition basis. The
lookup itself is an O(N) list lookup, where N is the number of
partitions. This actually hurts performance quite a bit, even
on the lower end partitions. On higher numbered partitions,
it can get pretty bad.
Solve this by adding a one-hit cache for partition lookup.
This makes the lookup O(1) for the case where we do most IO to
one partition. Even for mixed partition workloads, amortized cost
is pretty close to O(1) since the natural IO batching makes the
one-hit cache last for lots of IOs.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Mon, 20 Oct 2008 13:44:28 +0000 (15:44 +0200)]
cfq-iosched: remove limit of dispatch depth of max 4 times quantum
This basically limits the hardware queue depth to 4*quantum at any
point in time, which is 16 with the default settings. As CFQ uses
other means to shrink the hardware queue when necessary in the first
place, there's really no need for this extra heuristic. Additionally,
it ends up hurting performance in some cases.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Fri, 31 Oct 2008 09:06:37 +0000 (10:06 +0100)]
nbd: tell the block layer that it is not a rotational device
Then we can get rid of that manual elevator type fiddling.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Fri, 31 Oct 2008 09:05:07 +0000 (10:05 +0100)]
block: get rid of elevator_t typedef
Just use struct elevator_queue everywhere instead.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Tue, 9 Dec 2008 07:11:22 +0000 (08:11 +0100)]
aio: make the lookup_ioctx() lockless
The mm->ioctx_list is currently protected by a reader-writer lock,
so we always grab that lock on the read side for doing ioctx
lookups. As the workload is extremely reader biased, turn this into
an rcu hlist so we can make lookup_ioctx() lockless. Get rid of
the rwlock and use a spinlock for providing update side exclusion.
There's usually only 1 entry on this list, so it doesn't make sense
to look into fancier data structures.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Tue, 23 Dec 2008 11:42:54 +0000 (12:42 +0100)]
bio: add support for inlining a number of bio_vecs inside the bio
When we go and allocate a bio for IO, we actually do two allocations.
One for the bio itself, and one for the bi_io_vec that holds the
actual pages we are interested in.
This feature inlines a definable amount of io vecs inside the bio
itself, so we eliminate the bio_vec array allocation for IO's up
to a certain size. It defaults to 4 vecs, which is typically 16k
of IO.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Wed, 10 Dec 2008 14:35:05 +0000 (15:35 +0100)]
bio: allow individual slabs in the bio_set
Instead of having a global bio slab cache, add a reference to one
in each bio_set that is created. This allows for personalized slabs
in each bio_set, so that they can have bios of different sizes.
This means we can personalize the bios we return. File systems may
want to embed the bio inside another structure, to avoid allocation
more items (and stuffing them in ->bi_private) after the get a bio.
Or we may want to embed a number of bio_vecs directly at the end
of a bio, to avoid doing two allocations to return a bio. This is now
possible.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Wed, 22 Oct 2008 18:32:58 +0000 (20:32 +0200)]
bio: move the slab pointer inside the bio_set
In preparation for adding differently sized bios.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Thu, 11 Dec 2008 10:53:43 +0000 (11:53 +0100)]
bio: only mempool back the largest bio_vec slab cache
We only very rarely need the mempool backing, so it makes sense to
get rid of all but one of the mempool in a bio_set. So keep the
largest bio_vec count mempool so we can always honor the largest
allocation, and "upgrade" callers that fail.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Fri, 17 Oct 2008 11:58:29 +0000 (13:58 +0200)]
block: don't use plugging on SSD devices
We just want to hand the first bits of IO to the device as fast
as possible. Gains a few percent on the IOPS rate.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:07 +0000 (13:32 +0900)]
block: fix empty barrier on write-through w/ ordered tag
Empty barrier on write-through (or no cache) w/ ordered tag has no
command to execute and without any command to execute ordered tag is
never issued to the device and the ordering is never achieved. Force
draining for such cases.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:06 +0000 (13:32 +0900)]
block: simplify empty barrier implementation
Empty barrier required special handling in __elv_next_request() to
complete it without letting the low level driver see it.
With previous changes, barrier code is now flexible enough to skip the
BAR step using the same barrier sequence selection mechanism. Drop
the special handling and mask off q->ordered from start_ordered().
Remove blk_empty_barrier() test which now has no user.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:05 +0000 (13:32 +0900)]
block: make barrier completion more robust
Barrier completion had the following assumptions.
* start_ordered() couldn't finish the whole sequence properly. If all
actions are to be skipped, q->ordseq is set correctly but the actual
completion was never triggered thus hanging the barrier request.
* Drain completion in elv_complete_request() assumed that there's
always at least one request in the queue when drain completes.
Both assumptions are true but these assumptions need to be removed to
improve empty barrier implementation. This patch makes the following
changes.
* Make start_ordered() use blk_ordered_complete_seq() to mark skipped
steps complete and notify __elv_next_request() that it should fetch
the next request if the whole barrier has completed inside
start_ordered().
* Make drain completion path in elv_complete_request() check whether
the queue is empty. Empty queue also indicates drain completion.
* While at it, convert 0/1 return from blk_do_ordered() to false/true.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:04 +0000 (13:32 +0900)]
block: make every barrier action optional
In all barrier sequences, the barrier write itself was always assumed
to be issued and thus didn't have corresponding control flag. This
patch adds QUEUE_ORDERED_DO_BAR and unify action mask handling in
start_ordered() such that any barrier action can be skipped.
This patch doesn't introduce any visible behavior changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:03 +0000 (13:32 +0900)]
block: remove duplicate or unused barrier/discard error paths
* Because barrier mode can be changed dynamically, whether barrier is
supported or not can be determined only when actually issuing the
barrier and there is no point in checking it earlier. Drop barrier
support check in generic_make_request() and __make_request(), and
update comment around the support check in blk_do_ordered().
* There is no reason to check discard support in both
generic_make_request() and __make_request(). Drop the check in
__make_request(). While at it, move error action block to the end
of the function and add unlikely() to q existence test.
* Barrier request, be it empty or not, is never passed to low level
driver and thus it's meaningless to try to copy back req->sector to
bio->bi_sector on error. In addition, the notion of failed sector
doesn't make any sense for empty barrier to begin with. Drop the
code block from __end_that_request_first().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:02 +0000 (13:32 +0900)]
block: reorganize QUEUE_ORDERED_* constants
Separate out ordering type (drain,) and action masks (preflush,
postflush, fua) from visible ordering mode selectors
(QUEUE_ORDERED_*). Ordering types are now named QUEUE_ORDERED_BY_*
while action masks are named QUEUE_ORDERED_DO_*.
This change is necessary to add QUEUE_ORDERED_DO_BAR and make it
optional to improve empty barrier implementation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Richard Kennedy [Wed, 3 Dec 2008 11:41:40 +0000 (12:41 +0100)]
block: reorder struct bio to remove padding on 64bit
Remove 8 bytes of padding from struct bio which also removes 16 bytes from
struct bio_pair to make it 248 bytes. bio_pair then fits into one fewer
cache lines & into a smaller slab.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Cheng Renquan [Wed, 3 Dec 2008 11:41:39 +0000 (12:41 +0100)]
block: use cancel_work_sync() instead of kblockd_flush_work()
After many improvements on kblockd_flush_work, it is now identical to
cancel_work_sync, so a direct call to cancel_work_sync is suggested.
The only difference is that cancel_work_sync is a GPL symbol,
so no non-GPL modules anymore.
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Keith Mannthey [Tue, 25 Nov 2008 09:24:35 +0000 (10:24 +0100)]
block: Supress Buffer I/O errors when SCSI REQ_QUIET flag set
Allow the scsi request REQ_QUIET flag to be propagated to the buffer
file system layer. The basic ideas is to pass the flag from the scsi
request to the bio (block IO) and then to the buffer layer. The buffer
layer can then suppress needless printks.
This patch declutters the kernel log by removed the 40-50 (per lun)
buffer io error messages seen during a boot in my multipath setup . It
is a good chance any real errors will be missed in the "noise" it the
logs without this patch.
During boot I see blocks of messages like
"
__ratelimit: 211 callbacks suppressed
Buffer I/O error on device sdm, logical block
5242879
Buffer I/O error on device sdm, logical block
5242879
Buffer I/O error on device sdm, logical block
5242847
Buffer I/O error on device sdm, logical block 1
Buffer I/O error on device sdm, logical block
5242878
Buffer I/O error on device sdm, logical block
5242879
Buffer I/O error on device sdm, logical block
5242879
Buffer I/O error on device sdm, logical block
5242879
Buffer I/O error on device sdm, logical block
5242879
Buffer I/O error on device sdm, logical block
5242872
"
in my logs.
My disk environment is multipath fiber channel using the SCSI_DH_RDAC
code and multipathd. This topology includes an "active" and "ghost"
path for each lun. IO's to the "ghost" path will never complete and the
SCSI layer, via the scsi device handler rdac code, quick returns the IOs
to theses paths and sets the REQ_QUIET scsi flag to suppress the scsi
layer messages.
I am wanting to extend the QUIET behavior to include the buffer file
system layer to deal with these errors as well. I have been running this
patch for a while now on several boxes without issue. A few runs of
bonnie++ show no noticeable difference in performance in my setup.
Thanks for John Stultz for the quiet_error finalization.
Submitted-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Wu Fengguang [Tue, 25 Nov 2008 08:08:39 +0000 (09:08 +0100)]
block: don't take lock on changing ra_pages
There's no need to take queue_lock or kernel_lock when modifying
bdi->ra_pages. So remove them. Also remove out of date comment for
queue_max_sectors_store().
Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Nikanth Karthikesan [Mon, 24 Nov 2008 09:46:29 +0000 (10:46 +0100)]
Documentation: remove reference to ll_rw_blk.c and moved drivers/block/elevator.c
The drivers/block/ll_rw_block.c has been split and organized in the block/
directory, and also drivers/block/elevator.c has been moved to the block/
directory. Update Documentation/block/biodoc.txt accordingly
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Qinghuang Feng [Mon, 24 Nov 2008 09:43:36 +0000 (10:43 +0100)]
block/blk-tag.c: cleanup kernel-doc
There is no argument named @tags in blk_init_tags,
remove its' comment.
Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Thu, 20 Nov 2008 08:46:09 +0000 (09:46 +0100)]
cciss: switch to using hlist for command list management
This both cleans up the code and also helps detect the spurious case
of a command attempted being removed from a queue it doesn't belong
to.
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Nikanth Karthikesan [Wed, 19 Nov 2008 09:20:23 +0000 (10:20 +0100)]
Do not free io context when taking recursive faults in do_exit
When taking recursive faults in do_exit, if the io_context is not null,
exit_io_context() is being called. But it might decrement the refcount
more than once. It is better to leave this task alone.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Marcin Slusarz [Sun, 16 Nov 2008 18:06:37 +0000 (19:06 +0100)]
cdrom: reduce stack usage of mmc_ioctl_dvd_read_struct
1. kmalloc 192 bytes in dvd_read_bca (which is inlined into dvd_read_struct)
2. Pass struct packet_command to all dvd_read_* functions.
Checkstack output:
Before: mmc_ioctl_dvd_read_struct: 280
After: mmc_ioctl_dvd_read_struct: 56
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Marcin Slusarz [Sun, 16 Nov 2008 18:04:47 +0000 (19:04 +0100)]
cdrom: split mmc_ioctl to lower stack usage
Checkstack output:
Before:
mmc_ioctl: 584
After:
mmc_ioctl_dvd_read_struct: 280
mmc_ioctl_cdrom_subchannel: 152
mmc_ioctl_cdrom_read_data: 120
mmc_ioctl_cdrom_volume: 104
mmc_ioctl_cdrom_read_audio: 104
(mmc_ioctl is inlined into cdrom_ioctl - 104 bytes)
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Milton Miller [Mon, 17 Nov 2008 12:10:34 +0000 (13:10 +0100)]
scsi-ioctl: use clock_t <> jiffies
Convert the timeout ioctl scalling to use the clock_t functions
which are much more accurate with some USER_HZ vs HZ combinations.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Wed, 19 Nov 2008 13:38:39 +0000 (14:38 +0100)]
block: leave the request timeout timer running even on an empty list
For sync IO, we'll often do them serialized. This means we'll be touching
the queue timer for every IO, as opposed to only occasionally like we
do for queued IO. Instead of deleting the timer when the last request
is removed, just let continue running. If a new request comes up soon
we then don't have to readd the timer again. If no new requests arrive,
the timer will expire without side effect later.
This improves high iops sync IO by ~1%.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Thu, 30 Oct 2008 07:53:02 +0000 (08:53 +0100)]
block: add comment in blk_rq_timed_out() about why next can not be 0
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
malahal@us.ibm.com [Thu, 30 Oct 2008 07:51:58 +0000 (08:51 +0100)]
block: optimizations in blk_rq_timed_out_timer()
Now the rq->deadline can't be zero if the request is in the
timeout_list, so there is no need to have next_set. There is no need to
access a request's deadline field if blk_rq_timed_out is called on it.
Signed-off-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Fernando Luis Vázquez Cao [Mon, 27 Oct 2008 09:45:54 +0000 (18:45 +0900)]
xen-blkfront: set queue paravirt flag
Xen's blkfront sets noop as the default I/O scheduler at initialization
time to avoid elevator overheads such as idling, but with the advent of
basic disk profiling capabilities this is not necessary anymore. We
should just tell the block layer that we are a paravirt front-end driver
and the elevator will automatically make the necessary adjustments.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Fernando Luis Vázquez Cao [Mon, 27 Oct 2008 09:45:15 +0000 (18:45 +0900)]
virtio_blk: set queue paravirt flag
As a paravirt front-end driver, virtio_blk is not a rotational device so
we want do avoid idling in AS/CFQ. Tell the block layer about this.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>