Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Log machine checks from boot on Intel systems
The logging for boot errors was turned off because it was broken
on some AMD systems. But give Intel EM64T systems a chance because they are
supposed to be correct there.
The advantage is that there is a chance to actually log uncorrected
machine checks after the reset.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ravikiran G Thirumalai [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Make ACPI NUMA and NUMA emulation peers of K8_NUMA in Kconfig
On x86_64 arches, there is no way to choose ACPI_NUMA without having to choose
K8_NUMA. CONFIG_K8_NUMA is not needed for Intel EM64T NUMA boxes. It also
looks odd if you have to select ACPI_NUMA from the power management menu.
This patch fixes those oddities. Patch does the following:
1. Makes NUMA a config option like other arches
2. Makes topology detection options like K8_NUMA dependent on NUMA
3. Choosing ACPI NUMA detection can be done from the standard
"Processor type and features" menu
AK: I fixed up the dependencies and changed the help texts a bit
on top of Kiran's patch.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paolo 'Blaisorblade' Giarrusso [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Use common sys_time64
Keeping this function does not make sense because it is a (buggy) copy of
sys_time. The only difference is that now.tv_sec (which is a time_t, i.e. a
64-bit long) is copied (and truncated) into an int (32-bit).
The prototype is the same (they both take a long __user *), so let's drop
this and redirect it to sys_time (and make sure it exists by defining
__ARCH_WANT_SYS_TIME).
The only disadvantage is that the sys_stime definition is also compiled (this
may be fixed if needed by adding a separate __ARCH_WANT_SYS_STIME macro, and
defining it for all architectures defining __ARCH_WANT_SYS_TIME except x86_64).
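To make the truncation concrete, here is a minimal C sketch. It is not the kernel code: the helper names survives_int_copy and narrow_seconds are hypothetical and only model what happens when a 64-bit seconds value is stored in a 32-bit int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a kernel function: returns 1 when a 64-bit
 * seconds value fits in a 32-bit signed int, 0 when storing it in an
 * int would lose information. */
int survives_int_copy(int64_t tv_sec)
{
    return tv_sec >= INT32_MIN && tv_sec <= INT32_MAX;
}

/* Narrow only values that are known to fit; anything else is a bug. */
int32_t narrow_seconds(int64_t tv_sec)
{
    assert(survives_int_copy(tv_sec));
    return (int32_t)tv_sec;
}
```

A value such as 2147483648 (one past INT32_MAX) fails the check, which is the case the shared 64-bit sys_time path avoids.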
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paolo 'Blaisorblade' Giarrusso [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Set ____cacheline_maxaligned_in_smp alignment to 128 bytes
The current value was correct before the introduction of Intel EM64T support -
but now L1_CACHE_SHIFT_MAX can be less than L1_CACHE_SHIFT, which _is_ funny!
Among the few users of ____cacheline_maxaligned_in_smp we have (for
example) rcu_ctrlblk and struct zone, with zone->{lru_,}lock. I.e. we have
a lot of excess cacheline bouncing on them.
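As an illustration of why the alignment value matters, the following user-space C11 sketch aligns two contended fields to 128 bytes so that they cannot share a line. The structure two_hot_locks and the CACHELINE_MAX_* macros are hypothetical names for this example; the kernel uses its own attribute macros.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

/* Assumed value for illustration: 2^7 = 128 bytes, as in the patch title. */
#define CACHELINE_MAX_SHIFT 7
#define CACHELINE_MAX_BYTES (1 << CACHELINE_MAX_SHIFT)

/* Hypothetical structure with two independently contended locks. Aligning
 * each member to the maximum line size keeps them on separate lines. */
struct two_hot_locks {
    _Alignas(CACHELINE_MAX_BYTES) long lru_lock;
    _Alignas(CACHELINE_MAX_BYTES) long lock;
};

static_assert(offsetof(struct two_hot_locks, lock) == CACHELINE_MAX_BYTES,
              "second lock must start on its own 128-byte line");
```

With a 64-byte alignment on a CPU that transfers 128-byte lines, both members could land in the same line and bounce between CPUs together.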
No correctness issues, obviously. So this could even be merged for 2.6.14
(I'm not a fan of this idea, though).
CC: Andi Kleen <ak@suse.de>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Remove asm-x86_64/rwsem.h
Not needed since x86-64 always uses the spinlock based rwsems.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Remove optimization for B stepping AMD K8
B stepping parts were the first shipping Opterons. memcpy/memset/copy_page/
clear_page had specially optimized versions for them. These CPUs are really
old and in the minority now, and the difference from the generic versions
(using rep microcode) is not that big anyway. So just remove them.
TODO: figure out optimized versions for Intel Netburst based EM64T
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Reduce number of retries for reset through keyboard controller
The old code could retry for up to 10 seconds in the worst case. Only try
for one second now.
Suggested by Yinghai Lu
Cc: Yinghai.Lu@amd.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Siddha, Suresh B [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: x86_64/i386 fix Intel cache detection code assumption about threads sharing
Fix the Intel cache detection code assumption that number of threads
sharing the cache will either be equal to number of HT or core siblings.
This also cleans up the code in general a bit.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Siddha, Suresh B [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86-64/i386: Intel HT, Multi core detection fixes
Fields obtained through cpuid vector 0x1(ebx[16:23]) and
vector 0x4(eax[14:25], eax[26:31]) indicate the maximum values and might not
always be the same as what is available and what OS sees. So make sure
"siblings" and "cpu cores" values in /proc/cpuinfo reflect the values as seen
by OS instead of what cpuid instruction says. This will also fix the buggy BIOS
cases (for example, where cpuid on a single core cpu says there are "2" siblings
even when HT is disabled in the BIOS; see
http://bugzilla.kernel.org/show_bug.cgi?id=4359).
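The bit fields named above can be decoded as in the following C sketch. It takes raw register values as arguments rather than executing cpuid, and the function names are hypothetical. The "+1" on the leaf 4 fields reflects my understanding of the Intel SDM encoding (the fields hold the maximum count minus one); as the commit says, these are maximum values, not what the OS actually sees.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits lo..hi (inclusive) of a 32-bit register value. */
static uint32_t bits(uint32_t value, unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi < 32);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xffffffffu : ((1u << width) - 1u);
    return (value >> lo) & mask;
}

/* cpuid leaf 0x1, ebx[16:23]: maximum logical processors per package. */
uint32_t max_logical_per_package(uint32_t leaf1_ebx)
{
    return bits(leaf1_ebx, 16, 23);
}

/* cpuid leaf 0x4, eax[14:25]: maximum threads sharing this cache. */
uint32_t max_threads_sharing_cache(uint32_t leaf4_eax)
{
    return bits(leaf4_eax, 14, 25) + 1;
}

/* cpuid leaf 0x4, eax[26:31]: maximum cores per package. */
uint32_t max_cores_per_package(uint32_t leaf4_eax)
{
    return bits(leaf4_eax, 26, 31) + 1;
}
```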
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Fix NUMA node lookup debug code which had bitrotted
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Don't enable interrupt unconditionally in reboot path
When they were disabled before (e.g. after a panic) it's better
to keep them off; otherwise follow-on panics can happen from timer
interrupt handlers etc.
Drawback is that pageup in the console won't work anymore though.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Formatting fixes for arch/x86_64/kernel/process.c
No functional changes.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Allow modular build of ia32 aout loader
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shaohua Li [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Force correct address space size for MTRR on some 64bit Intel Xeons
They report 40bit, but only have 36bits of physical address space.
This caused problems with setting up the correct masks for MTRR.
CPUID workaround for steppings 0F33h (supporting x86) and 0F34h (supporting x86
and EM64T). Detailed information can be found at:
http://download.intel.com/design/Xeon/specupdt/30240216.pdf
http://download.intel.com/design/Pentium4/specupdt/30235221.pdf
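A rough sketch of why the reported width matters for the masks: an address mask built for 40 bits sets four bits that a 36-bit part does not implement. The helpers below are hypothetical and work on plain address masks; they do not model the actual MTRR register encoding.

```c
#include <assert.h>
#include <stdint.h>

/* All-ones mask covering the given number of physical address bits. */
uint64_t phys_addr_mask(unsigned bits)
{
    assert(bits >= 1 && bits < 64);
    return (UINT64_C(1) << bits) - 1;
}

/* Address mask selecting a naturally aligned, power-of-two sized range,
 * limited to the physical address width the CPU really has. */
uint64_t range_mask(uint64_t size, unsigned phys_bits)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return ~(size - 1) & phys_addr_mask(phys_bits);
}
```

For a 1GB range, range_mask gives 0xFC0000000 with 36 bits but 0xFFC0000000 with 40 bits; the difference is exactly bits 36 to 39.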
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] AGP: Make gart iterator in K8 AGP driver SMP safe
Ugh!
Cc: davej@redhat.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] AGP: Try unsupported AGP chipsets on x86-64 by default
So far all new ones have worked and there isn't much variation because
the CPU does all the interesting bits.
So enable try_unsupported by default.
It can still be disabled with try_unsupported=0 (module) or
amd64.try_unsupported=0 (boot option).
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] AGP: Support ULI/ALI 1689 bridge on AMD64
(no name because I'm not sure of the correct name)
Cc: davej@redhat.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric Dumazet [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Optimize NUMA node hash function
Compute the highest possible value for memnode_shift, in order to reduce
footprint of memnodemap[] to the minimum, thus making all users
(phys_to_nid(), kfree()), more cache friendly.
Before the patch:
Node 0 MemBase 0000000000000000 Limit 00000001ffffffff
Node 1 MemBase 0000000200000000 Limit 00000003ffffffff
Using 23 for the hash shift. Max adder is 3ffffffff
After the patch:
Node 0 MemBase 0000000000000000 Limit 00000001ffffffff
Node 1 MemBase 0000000200000000 Limit 00000003ffffffff
Using 33 for the hash shift.
In this case, only 2 bytes of memnodemap[] are used, instead of 2048
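The effect of the shift on the table size can be reproduced with a small brute-force sketch. This is not the algorithm from the patch, only an assumed illustration: it tries shifts from high to low and keeps the first one where no two nodes map to the same slot. The names node_range, highest_hash_shift and memnodemap_entries are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct node_range {
    uint64_t start;   /* first address of the node */
    uint64_t end;     /* one past the last address */
};

/* True when, at this shift, no slot is shared by two different nodes. */
static int shift_is_collision_free(const struct node_range *nodes, size_t n,
                                   unsigned shift)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            uint64_t i_first = nodes[i].start >> shift;
            uint64_t i_last = (nodes[i].end - 1) >> shift;
            uint64_t j_first = nodes[j].start >> shift;
            uint64_t j_last = (nodes[j].end - 1) >> shift;
            if (i_first <= j_last && j_first <= i_last)
                return 0;
        }
    }
    return 1;
}

/* Largest usable shift, or -1 if the node ranges overlap at every shift. */
int highest_hash_shift(const struct node_range *nodes, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        assert(nodes[i].end > nodes[i].start);
    for (int shift = 63; shift >= 0; shift--)
        if (shift_is_collision_free(nodes, n, (unsigned)shift))
            return shift;
    return -1;
}

/* Number of one-byte table entries needed to cover addresses up to max_addr. */
uint64_t memnodemap_entries(uint64_t max_addr, unsigned shift)
{
    return (max_addr >> shift) + 1;
}
```

For the two nodes in the log above this yields a shift of 33 and 2 entries, against 2048 entries with a shift of 23.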
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Bryan Ford [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Save/restore CS in 64bit signal handlers and force __USER_CS for CS
This allows 64bit signal handlers to run in 64bit processes that run small
code snippets in compat mode.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: New heuristics to find out hotpluggable CPUs.
With a NR_CPUS==128 kernel with CPU hotplug enabled we would waste 4MB
on per CPU data of all possible CPUs. The reason was that HOTPLUG
always set up the possible map to NR_CPUS cpus, and then we needed to allocate
that much (each per CPU data area is roughly 32k now).
The underlying problem is that ACPI didn't tell us how many hotplug CPUs
the platform supports. So the old code just assumed all, which would
lead to this memory wastage.
This implements some new heuristics:
- If the BIOS specified disabled CPUs in the ACPI/mptables assume they
can be enabled later (this is bending the ACPI specification a bit,
but seems like an obvious extension)
- The user can overwrite it with a new additional_cpus=NUM option
- Otherwise use half of the available CPUs or 2, whichever is more.
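The heuristics can be sketched as a small C function. The precedence used here (user option first, then BIOS-disabled CPUs, then the half-or-2 fallback) is my reading of the list above, not a copy of the kernel code, and all names are hypothetical. PER_CPU_BYTES uses the "roughly 32k" figure from the text.

```c
#include <assert.h>

#define PER_CPU_BYTES (32u * 1024u)   /* "roughly 32k" per CPU, from the text */

/* How many extra (hotpluggable) CPUs to plan for.
 * user_override < 0 means the user gave no option. */
unsigned additional_cpus_guess(unsigned available, unsigned bios_disabled,
                               int user_override)
{
    if (user_override >= 0)
        return (unsigned)user_override;
    if (bios_disabled > 0)
        return bios_disabled;
    unsigned half = available / 2;
    return half > 2 ? half : 2;
}

/* Size of the possible map, capped at the compile-time maximum. */
unsigned possible_cpus(unsigned available, unsigned bios_disabled,
                       int user_override, unsigned nr_cpus)
{
    assert(available >= 1 && available <= nr_cpus);
    unsigned total = available +
        additional_cpus_guess(available, bios_disabled, user_override);
    return total < nr_cpus ? total : nr_cpus;
}

/* Memory spent on per CPU data for that many possible CPUs. */
unsigned long per_cpu_footprint(unsigned possible)
{
    return (unsigned long)possible * PER_CPU_BYTES;
}
```

With 4 available CPUs and no other information this plans for 6 possible CPUs (192KB of per CPU data) instead of 128 (4MB).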
Cc: ashok.raj@intel.com
Cc: len.brown@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Use int operations in spinlocks to support more than 128 CPUs spinning.
Pointed out by Eric Dumazet
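A simplified model of the problem, assuming the lock word starts at 1 (free), every acquire attempt decrements it, and a value of zero or below means "held". The functions are hypothetical and only simulate the counter arithmetic; they are not the kernel's spinlock code.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit counter: after enough decrements the value wraps around to a
 * positive number and the lock wrongly looks free. */
int byte_counter_looks_held(unsigned decrements)
{
    uint8_t raw = (uint8_t)(1u - decrements);   /* 8-bit wraparound */
    return raw == 0 || (raw & 0x80u) != 0;      /* zero or sign bit set */
}

/* 32-bit counter: the same sequence stays negative. */
int int_counter_looks_held(unsigned decrements)
{
    assert(decrements <= (unsigned)INT32_MAX);
    int32_t value = 1 - (int32_t)decrements;
    return value <= 0;
}
```

One holder plus 128 spinners (129 decrements) still reads as held with the byte counter; one more spinner wraps it to 127, while the int counter is unaffected.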
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:54 +0000 (17:25 +0100)]
[PATCH] x86_64: Some clarifications for Documentation/x86_64/mm.txt
I got some questions on this, so just fix up the documentation.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Replace swiotlb extern with include
Minor victory in the continuous quest against all stray externs.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Replace cpu_pda extern with include
Minor cleanup - remove obsolete extern
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Only use asm/sections.h to declare section symbols
Adding __initdata_* to asm-generic/sections.h
Replaces a lot of open coded externs in arch/x86_64/*
I had to change __bss_end to __bss_stop to match the other architectures.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Don't apply __PHYSICAL_MASK to page frame numbers
It is for physical addresses, not for PFNs.
Pointed out by Tejun Heo.
Cc: htejun@gmail.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Siddha, Suresh B [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Unmap NULL during early bootup
We should zap the low mappings, as soon as possible, so that we can catch
kernel bugs more effectively. Previously early boot had NULL mapped
and didn't trap on NULL references.
This patch introduces boot_level4_pgt, which will always have low identity
addresses mapped. During boot, all the processors will use this as their
level4 pgt. On BP, we will switch to init_level4_pgt as soon as we enter C
code and zap the low mappings as soon as we are done with the usage of
identity low mapped addresses. On AP's we will zap the low mappings as
soon as we jump to C code.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Speed up numa_node_id by putting it directly into the PDA
Do not go from the CPU number through a mapping array.
The node number is often used in fast paths now.
This also adds a generic numa_node_id to all the topology includes.
Suggested by Eric Dumazet
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Fix gcc 4 warning in aperture.c
Fix:
arch/x86_64/kernel/aperture.c: In function 'iommu_hole_init':
arch/x86_64/kernel/aperture.c:199: warning: 'aper_order' may be used uninitialized in this function
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Suresh Siddha [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86-64/i386: Fix CPU model for family 6
According to cpuid instruction in IA32 SDM-Vol2, when computing cpu model,
we need to consider extended model ID for family 0x6 also.
AK: Also added fixes/simplification from Petr Vandrovec
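The rule can be sketched in C as follows. The function decodes a raw cpuid leaf 1 eax value passed as an argument; the structure and function names are hypothetical, and the bit positions follow the standard cpuid signature layout.

```c
#include <assert.h>
#include <stdint.h>

struct cpu_signature {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Decode family/model/stepping from cpuid leaf 1 eax. The extended model
 * is folded in for family 0x6 as well as family 0xf. */
struct cpu_signature decode_signature(uint32_t eax)
{
    unsigned base_family = (eax >> 8) & 0xf;
    unsigned base_model = (eax >> 4) & 0xf;
    unsigned ext_model = (eax >> 16) & 0xf;
    unsigned ext_family = (eax >> 20) & 0xff;
    struct cpu_signature sig;

    sig.stepping = eax & 0xf;
    sig.family = base_family;
    sig.model = base_model;
    if (base_family == 0xf)
        sig.family += ext_family;
    if (base_family == 0x6 || base_family == 0xf)
        sig.model += ext_model << 4;
    assert(sig.model <= 0xff);
    return sig;
}
```

For example, a signature of 0x000106A5 decodes to family 6, model 0x1A; without the family 6 case it would be reported as model 0xA.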
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ashok Raj [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Remove duplicate __cpuinit define
Remove duplicate __cpuinit in smp.c. Already defined in init.h which is
already included.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Use the DMA32 zone for dma_alloc_coherent()/pci_alloc_consistent
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Remove obsolete ARCH_HAS_ATOMIC_UNSIGNED and page_flags_t
These were introduced for x86-64 at some point to save memory
in struct page, but have been obsolete for some time. Just
remove them.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Fix up outdated pfn_to_page comment
pfn_to_page really requires pfn_valid to be true now, no question.
Some people stumbled over the old comment, which was misleading and wrong.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
James Cleverdon [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] i386/x86-64: Share interrupt vectors when there is a large number of interrupt sources
Here's a patch that builds on Natalie Protasevich's IRQ compression
patch and tries to work for MPS boots as well as ACPI. It is meant for
a 4-node IBM x460 NUMA box, which was dying because it had interrupt
pins with GSI numbers > NR_IRQS and thus overflowed irq_desc.
The problem is that this system has 270 GSIs (which are 1:1 mapped with
I/O APIC RTEs) and an 8-node box would have 540. This is much bigger
than NR_IRQS (224 for both i386 and x86_64). Also, there aren't enough
vectors to go around. There are about 190 usable vectors, not counting
the reserved ones and the unused vectors at 0x20 to 0x2F. So, my patch
attempts to compress the GSI range and share vectors by sharing IRQs.
Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jacob Shin [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Support for AMD specific MCE Threshold.
MC4_MISC - DRAM Errors Threshold Register realized under AMD K8 Rev F.
This register is used to count correctable and uncorrectable ECC errors that occur during DRAM read operations.
The user may interface through sysfs files in order to change the threshold configuration.
bank%d/error_count - reads current error count, write to clear.
bank%d/interrupt_enable - set/clear interrupt enable.
bank%d/threshold_limit - read/write the threshold limit.
APIC vector 0xF9 in hw_irq.h.
5 software defined bank ids in mce.h.
new apic.c function to setup threshold apic lvt.
defaults to interrupt off, count enabled, and threshold limit max.
sysfs interface created on /sys/devices/system/threshold.
AK: added some ifdefs to make it compile on UP
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jan Beulich [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Adjust, correct, and complete the HPET definitions for x86-64.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Account mem_map in VM holes accounting
The VM needs to know about lost memory in zones to accurately
balance dirty pages. This patch accounts mem_map in there too,
which fixes a constant error of a few percent. Also some
other misc mappings and the kernel text itself are accounted
too.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: When cpu_up fails clean up page allocator properly
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Make i386 compile again with fourth DMA32 zone
The code should deal with an additional empty zone, so fix up the
#error.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Set compatibility flag for 4GB zone on IA64
IA64 traditionally had a 4GB DMA32 zone. Set the compatibility flag
to keep old drivers working.
For new drivers it would be better to use ZONE_DMA32 now.
Cc: tony.luck@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Add 4GB DMA32 zone
Add a new 4GB GFP_DMA32 zone between the GFP_DMA and GFP_NORMAL zones.
As a bit of historical background: when the x86-64 port
was originally designed we had some discussion if we should
use a 16MB DMA zone like i386 or a 4GB DMA zone like IA64 or
both. Both was ruled out at this point because it was in early
2.4 when the VM was still quite shaky and had bad trouble even
dealing with one DMA zone. We settled on the 16MB DMA zone mainly
because we worried about older soundcards and the floppy.
But this has always caused problems since then because
device drivers had trouble getting enough DMA-able memory. These days
the VM works much better and the wide use of NUMA has proven
it can deal with many zones successfully.
So this patch adds both zones.
This helps drivers that need a lot of memory below 4GB because
their hardware cannot address more (graphic drivers - proprietary
and free ones, video frame buffer drivers, sound drivers etc.).
Previously they could only use IOMMU+16MB GFP_DMA, which
was not enough memory.
Another common problem is that hardware that has full memory
addressing for >4GB lacks it for some control structures in memory
(like transmit rings or other metadata). Such drivers tended to allocate memory
in the 16MB GFP_DMA or the IOMMU/swiotlb using pci_alloc_consistent,
but that can tie up a lot of precious 16MB GFP_DMA/IOMMU/swiotlb memory
(even on AMD systems the IOMMU tends to be quite small), especially if you have
many devices. With the new zone pci_alloc_consistent can just put
this stuff into memory below 4GB, which works better.
One argument was still if the zone should be 4GB or 2GB. The main
motivation for 2GB would be an unnamed not so unpopular hardware
raid controller (mostly found in older machines from a particular four letter
company) that has a strange 2GB restriction in firmware. But
that one works ok with swiotlb/IOMMU anyway, so it doesn't really
need GFP_DMA32. I chose 4GB to be compatible with IA64 and because
it seems to be the most common restriction.
The new zone is so far added only for x86-64.
For other architectures that don't set up this
new zone nothing changes. Architectures can set a compatibility
define in Kconfig, CONFIG_DMA_IS_DMA32, that will define GFP_DMA32
as GFP_DMA. Otherwise it's a nop, because on 32bit architectures
it's normally not needed: GFP_NORMAL (=0) is DMA-able
enough.
One problem is still that GFP_DMA means different things on different
architectures. e.g. some drivers used to have #ifdef ia64 use GFP_DMA
(trusting it to be 4GB) #elif __x86_64__ (use other hacks like
the swiotlb because 16MB is not enough) ... . This was quite
ugly and is now obsolete.
These should now be converted to use GFP_DMA32 unconditionally. I haven't done
this yet. Or, best, only use pci_alloc_consistent/dma_alloc_coherent,
which will use GFP_DMA32 transparently.
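As a rough illustration of how the three zones relate to a device's addressing limit, the sketch below picks the widest zone a device can safely use from its DMA mask. This is an assumed model for explanation only, not the logic of pci_alloc_consistent; the enum, macros and function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum dma_zone {
    ZONE_NONE,          /* device cannot even reach the 16MB zone */
    ZONE_DMA_16MB,      /* legacy zone below 16MB */
    ZONE_DMA32_4GB,     /* new zone below 4GB */
    ZONE_NORMAL_ALL     /* all of memory */
};

#define LIMIT_16MB ((UINT64_C(1) << 24) - 1)
#define LIMIT_4GB  ((UINT64_C(1) << 32) - 1)

/* Widest zone whose every address is reachable with the given DMA mask. */
enum dma_zone widest_zone_for(uint64_t dma_mask, uint64_t highest_phys_addr)
{
    assert(highest_phys_addr > 0);
    if (dma_mask >= highest_phys_addr)
        return ZONE_NORMAL_ALL;
    if (dma_mask >= LIMIT_4GB)
        return ZONE_DMA32_4GB;
    if (dma_mask >= LIMIT_16MB)
        return ZONE_DMA_16MB;
    return ZONE_NONE;
}
```

In this model a 32-bit device gets the 4GB zone instead of competing for the 16MB one, while a device with a 2GB limit still falls back to the small zone (or, as noted above, to swiotlb/IOMMU).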
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Sat, 5 Nov 2005 16:25:53 +0000 (17:25 +0100)]
[PATCH] x86_64: Update defconfig
Rerun and enable autofs 4, relayfs and softdog
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Garzik [Sat, 5 Nov 2005 03:08:00 +0000 (22:08 -0500)]
[libata] ATAPI pad allocation fixes/cleanup
Use ata_pad_{alloc,free} in two drivers, to factor out common code.
Add ata_pad_{alloc,free} to two other drivers, which needed the padding
but had not been updated.
Jeff Garzik [Sat, 5 Nov 2005 02:39:31 +0000 (21:39 -0500)]
Merge branch 'master'
Calin A. Culianu [Sat, 5 Nov 2005 01:38:04 +0000 (20:38 -0500)]
[PATCH] nvidiafb: Geforce 7800 series support added
This adds support for the Nvidia Geforce 7800 series of cards to the
nvidiafb framebuffer driver. All it does is add the PCI device id for
the 7800, 7800 GTX, 7800 GO, and 7800 GTX GO cards to the module device
table for the nvidiafb.ko driver, so that nvidiafb.ko will actually work
on these cards.
I also added the relevant PCI device ids to linux/pci_ids.h
I tested it on my 7800 GTX here and it works like a charm. I now can
get framebuffer support on this card! Woo hoo!! Nothing like 200x75 text
mode to make your eyes BLEED. ;)
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sat, 5 Nov 2005 00:32:36 +0000 (16:32 -0800)]
Merge branch 'srp' of /linux/kernel/git/roland/infiniband
Linus Torvalds [Sat, 5 Nov 2005 00:31:54 +0000 (16:31 -0800)]
Merge branch 'for-linus' of /linux/kernel/git/roland/infiniband
Linus Torvalds [Sat, 5 Nov 2005 00:27:50 +0000 (16:27 -0800)]
Merge /linux/kernel/git/paulus/powerpc-merge
Paul Mackerras [Fri, 4 Nov 2005 23:36:59 +0000 (10:36 +1100)]
powerpc: Fix vmlinux.lds.S for 32-bit
We can't currently use asm-ppc/page.h in vmlinux.lds.S, so until
we have a merged page.h, define PAGE_SIZE and KERNELBASE locally.
Also gets rid of some dynamic executable cruft that we had for
32-bit. With -Ttext=$(KERNELBASE) this didn't cause any problem,
but when we changed to putting . = KERNELBASE in the vmlinux.lds.S
this cruft caused the text to get linked at 0xa0 instead of
0xc0000000. Oops.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Paul Mackerras [Fri, 4 Nov 2005 23:33:55 +0000 (10:33 +1100)]
powerpc: Merge smp.c and smp.h
This also moves setup_cpu_maps to setup-common.c (calling it
smp_setup_cpu_maps) and uses it on both 32-bit and 64-bit.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Chuck Lever [Tue, 1 Nov 2005 21:53:32 +0000 (16:53 -0500)]
NFS,SUNRPC,NLM: fix unused variable warnings when CONFIG_SYSCTL is disabled
Fix some dprintk's so that NLM, NFS client, and RPC client compile
cleanly if CONFIG_SYSCTL is disabled.
Test plan:
Compile kernel with CONFIG_NFS enabled and CONFIG_SYSCTL disabled.
Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 1 Nov 2005 17:24:48 +0000 (12:24 -0500)]
SUNRPC: allow sunrpc.o to link when CONFIG_SYSCTL is disabled
The sunrpc module should build properly even when CONFIG_SYSCTL is
disabled.
Reported by Jan-Benedict Glaw.
Test plan:
Compile kernel with CONFIG_NFS as a module and built-in, and CONFIG_SYSCTL
enabled and disabled.
Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 4 Nov 2005 20:39:36 +0000 (15:39 -0500)]
NFSv4: Teach NFSv4 to cache locks when we hold a delegation
Now that we have a method of dealing with delegation recalls, actually
enable the caching of posix and BSD locks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 4 Nov 2005 20:38:11 +0000 (15:38 -0500)]
NFSv4: Recover locks too when returning a delegation
Delegations allow us to cache posix and BSD locks, however when the
delegation is recalled, we need to "flush the cache" and send
the cached LOCK requests to the server.
This patch sets up the mechanism for doing so.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 4 Nov 2005 20:35:30 +0000 (15:35 -0500)]
NFSv4: Fix recovery of flock() locks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 4 Nov 2005 20:35:02 +0000 (15:35 -0500)]
NFSv4: Return any delegations before sillyrenaming the file
I missed this one... Any form of rename will result in a delegation
recall, so it is more efficient to return the one we hold before
trying the rename.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 4 Nov 2005 20:33:50 +0000 (15:33 -0500)]
NFSv4: Fix the handling of the error NFS4ERR_OLD_STATEID
Ensure that we retry the failed operation...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 4 Nov 2005 20:33:38 +0000 (15:33 -0500)]
NFSv4: Fix problem with OPEN_DOWNGRADE
RFC 3530 states that for OPEN_DOWNGRADE "The share_access and share_deny
bits specified must be exactly equal to the union of the share_access and
share_deny bits specified for some subset of the OPENs in effect for
current openowner on the current file."
Setattr is currently violating the NFSv4 rules for OPEN_DOWNGRADE in that
it may cause a downgrade from OPEN4_SHARE_ACCESS_BOTH to
OPEN4_SHARE_ACCESS_WRITE despite the fact that there exists no open file
with O_WRONLY access mode.
Fix the problem by replacing nfs4_find_state() with a modified version of
nfs_find_open_context().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 4 Nov 2005 20:32:58 +0000 (15:32 -0500)]
NFSv4: Fix a race between open() and close()
We must not remove the nfs4_state structure from the inode open lists
before we are in sequence lock.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David S. Miller [Fri, 4 Nov 2005 19:17:24 +0000 (11:17 -0800)]
[USB]: Make early handoff a final fixup instead of a header one.
At header fixup time, it is not yet legal to ioremap() PCI
device registers, yet that is what this quirk code needs to
do.
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleg Nesterov [Fri, 4 Nov 2005 15:54:30 +0000 (18:54 +0300)]
[PATCH] improve scheduler fairness a bit
Do not transfer remaining time slice to another cpu on process exit.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Fri, 4 Nov 2005 18:42:53 +0000 (10:42 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-serial
Linus Torvalds [Fri, 4 Nov 2005 18:40:11 +0000 (10:40 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Fri, 4 Nov 2005 18:39:28 +0000 (10:39 -0800)]
Merge /pub/scm/linux/kernel/git/acme/net-2.6
Russell King [Fri, 4 Nov 2005 17:28:34 +0000 (17:28 +0000)]
[PATCH] ARM: Reverted 2918/1: [update] Base port of Comdial MP1000 platform
No longer maintained
Russell King [Fri, 4 Nov 2005 17:26:57 +0000 (17:26 +0000)]
[PATCH] ARM: Reverted 2921/1: Support for the RTC / nvram on the Comdial MP1000
No longer maintained
Russell King [Fri, 4 Nov 2005 17:26:56 +0000 (17:26 +0000)]
[PATCH] ARM: Reverted 2919/1: CS8900A ethernet driver modifications for the Comdial MP1000
No longer maintained
Nicolas Pitre [Fri, 4 Nov 2005 17:17:30 +0000 (17:17 +0000)]
[ARM] 3097/1: change library link ordering
Patch from Nicolas Pitre
We have an optimized sha1 routine (arch/arm/lib/sha1.S) meant to
override the generic one in lib/sha1.c.
Unfortunately lib/lib.a is listed _before_ arch/arm/lib/lib.a in the
link argument list and therefore the architecture specific lib functions
are not picked up before the generic versions.
This patch is a quick fix to change that ordering for ARM. Here's what
the kbuild maintainer had to say about it (was also CC'd on lkml):
On Wed, 2 Nov 2005, Sam Ravnborg wrote:
> This looks like an obvious way to achieve correct ordering.
> We could change it so arch defines always took precedence but
> the above is so simple that it is not worth the effort.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Todd Poynor [Fri, 4 Nov 2005 17:15:45 +0000 (17:15 +0000)]
[ARM] 3087/1: PXA2xx flash platform device conversion
Patch from Todd Poynor
Add platform devices for flash to Lubbock and Mainstone board files.
Once in place, the two existing mtd map drivers for the boards will be
converted to use a single pxa2xx map driver in the linux-mtd tree.
Take 4: flash_platform_data .map_name vs. .name cleaned up, resync with
merged irda patch context.
Signed-off-by: Todd Poynor <tpoynor@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Dave Jiang [Fri, 4 Nov 2005 17:15:44 +0000 (17:15 +0000)]
[ARM] 3086/1: ixp2xxx error irq handling
Patch from Dave Jiang
This provides support for IXP2xxx error interrupt handling. Previously there was a patch to remove this (although the original stuff was broken). Well, now the error bits are needed again. These are used extensively by the micro-engine drivers according to Deepak and also we will need it for the new EDAC code that Alan Cox is trying to push into the main kernel.
Re-submit of 3072/1, generated against git tree pulled today. AFAICT, this git tree pulled in all the ARM changes that's in arm.diff. Please let me know if there are additional changes. Thx!
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Nicolas Pitre [Fri, 4 Nov 2005 17:15:43 +0000 (17:15 +0000)]
[ARM] 3094/1: remove PLD stuff from old uaccess code
Patch from Nicolas Pitre
ARM processors that have pld instructions are not using those copy_user
implementations anymore. Let's remove the useless PLD lines which were
half wrong anyway.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Paul Mackerras [Fri, 4 Nov 2005 06:03:39 +0000 (17:03 +1100)]
Merge git://oak/home/sfr/kernels/iseries/work
Stephen Rothwell [Fri, 4 Nov 2005 05:58:59 +0000 (16:58 +1100)]
powerpc: merge tlbflush.h
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Paul Mackerras [Fri, 4 Nov 2005 05:17:32 +0000 (16:17 +1100)]
Merge branch 'for-paulus' of git://kernel/home/michael/src/work/
Paul Mackerras [Fri, 4 Nov 2005 02:28:58 +0000 (13:28 +1100)]
powerpc: Merge smp-tbsync.c (the generic timebase sync routine)
Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Fri, 4 Nov 2005 01:12:52 +0000 (12:12 +1100)]
Merge with Paulus
Michael Ellerman [Thu, 3 Nov 2005 10:10:48 +0000 (21:10 +1100)]
powerpc: Fix random memory corruption in merged elf.h
The merged version of ELF_CORE_COPY_REGS is basically the PPC64 version, with
a memset that came from PPC and a few types abstracted out into #defines. But
it's not _quite_ right.
The first problem is we calculate the number of registers with:
nregs = sizeof(struct pt_regs) / sizeof(ELF_GREG_TYPE)
For a 32-bit process on a 64-bit kernel that's bogus because the registers are
64 bits, but ELF_GREG_TYPE is u32, so nregs == 88 which is wrong.
The other problem is the memset, which assumes a struct pt_regs is smaller
than a struct elf_regs. For a 32-bit process on a 64-bit kernel that's false.
The fix is to calculate the number of regs using sizeof(unsigned long), which
should always be right, and just memset the whole damn thing _before_ copying
the registers in.
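The arithmetic above can be reproduced in user space. This is a minimal sketch, not the kernel macro: the 44-slot frame is inferred from nregs == 88 with 8-byte registers and a 4-byte ELF_GREG_TYPE, the destination length of 48 is an assumption, and every fake_* name is hypothetical. uint64_t stands in for unsigned long on a 64-bit kernel so the result does not depend on the build host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct pt_regs: 44 64-bit slots, as implied
 * by nregs == 88 when dividing by a 4-byte element type. */
#define FAKE_NSLOTS 44
struct fake_pt_regs { uint64_t slot[FAKE_NSLOTS]; };

/* Hypothetical stand-ins for a 32-bit process's ELF register set. */
typedef uint32_t fake_greg32_t;
#define FAKE_ELF_NGREG 48                 /* assumed destination length */
typedef fake_greg32_t fake_gregset32_t[FAKE_ELF_NGREG];

/* Buggy count: divides by the size of the destination element type. */
static size_t nregs_buggy(void)
{
    return sizeof(struct fake_pt_regs) / sizeof(fake_greg32_t);
}

/* Fixed count: divides by the size of one source register. */
static size_t nregs_fixed(void)
{
    return sizeof(struct fake_pt_regs) / sizeof(uint64_t);
}

/* Clear the whole destination first, then copy one register at a time. */
static void copy_regs(fake_gregset32_t dst, const struct fake_pt_regs *src)
{
    size_t i, n = nregs_fixed();

    memset(dst, 0, sizeof(fake_gregset32_t));
    assert(n <= FAKE_ELF_NGREG);
    for (i = 0; i < n; i++)
        dst[i] = (fake_greg32_t)src->slot[i];
}
```

With these sizes nregs_buggy() yields 88, which would overrun a 48-entry destination, while nregs_fixed() yields 44. Clearing before copying means the result no longer depends on which of the two structures is larger.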
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Michael Ellerman [Fri, 4 Nov 2005 01:09:42 +0000 (12:09 +1100)]
powerpc: Implement smp_release_cpus() in C not asm
There's no reason for smp_release_cpus() to be asm, and most people can make
more sense of C code. Add an extern declaration to smp.h and remove the custom
one in machine_kexec.c.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Linus Torvalds [Fri, 4 Nov 2005 00:25:58 +0000 (16:25 -0800)]
Merge git://oss.sgi.com:8090/oss/git/xfs-2.6
Nathan Scott [Thu, 3 Nov 2005 23:51:01 +0000 (10:51 +1100)]
[XFS] Remove no-longer-used qsort source.
Signed-off-by: Nathan Scott <nathans@sgi.com>
Stephen Rothwell [Thu, 3 Nov 2005 23:20:27 +0000 (10:20 +1100)]
powerpc: merge tlb.h
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Jack Morgenstein [Thu, 3 Nov 2005 22:58:33 +0000 (14:58 -0800)]
[IB] mthca: check P_Key index in modify QP
Make sure that the P_Key index passed into mthca_modify_qp() is
within the device's P_Key table.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Nathan Scott [Thu, 3 Nov 2005 22:49:07 +0000 (09:49 +1100)]
[XFS] Fix an inode32 regression - if no options are presented, must still
set default flags.
SGI-PV: 945242
SGI-Modid: xfs-linux-melb:xfs-kern:24292a
Signed-off-by: Nathan Scott <nathans@sgi.com>
Ben Dooks [Thu, 3 Nov 2005 21:07:37 +0000 (21:07 +0000)]
[SERIAL] 8250_early.c passing 0 instead of NULL
Fix sparse warning about passing `0` to simple_strtoul()
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Deepak Saxena [Thu, 3 Nov 2005 21:05:39 +0000 (21:05 +0000)]
[ARM] Fix IXDP2x01 config files
IXDP2401 config file has wrong baudrate and both boards have 3 UARTs.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Thu, 3 Nov 2005 21:02:39 +0000 (21:02 +0000)]
[ARM] Merge SMP tree
Nicolas Pitre [Thu, 3 Nov 2005 20:40:50 +0000 (20:40 +0000)]
[ARM] 3092/1: remove excessive print format padding
Patch from Nicolas Pitre
Using a llx format to print addresses that might possibly be (only) 36
bits wide makes sense. However making it a zero padded 16 char wide
field is a bit excessive and useless.
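The difference between the two formats is easy to see with snprintf(). This is a minimal user-space sketch; the function names and the "0x" prefix are illustrative assumptions, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Zero-padded, 16-digit field: always prints 16 hex digits. */
static int fmt_padded(char *buf, size_t len, unsigned long long addr)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%016llx", addr);
}

/* Plain llx: prints only as many hex digits as the value needs. */
static int fmt_plain(char *buf, size_t len, unsigned long long addr)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%llx", addr);
}
```

For the 36-bit value 0xfc0000000, fmt_padded() produces "0x0000000fc0000000" while fmt_plain() produces "0xfc0000000".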
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Thu, 3 Nov 2005 20:32:45 +0000 (20:32 +0000)]
[ARM SMP] Do not clear cpu_vm_mask for VIPT caches
Since we do not invalidate TLBs/caches on MM switches, we should not
clear the cpu_vm_mask for the CPU.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Roland Dreier [Thu, 3 Nov 2005 20:01:18 +0000 (12:01 -0800)]
[IB] umad: fix hot remove of IB devices
Fix hotplug of devices for ib_umad module: when a device goes away,
kill off all MAD agents for open files associated with that device,
and make sure that the device is not touched again after ib_umad
returns from its remove_one function.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Russell King [Thu, 3 Nov 2005 15:48:21 +0000 (15:48 +0000)]
[ARM SMP] Add configuration option for ARMv6K processors
The 'K' extension adds several new instructions to the ARMv6 ISA
which are primarily useful for SMP.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Thu, 3 Nov 2005 11:04:53 +0000 (11:04 +0000)]
[ARM] Fix another build error with IOP3xx platforms
ld doesn't like comments starting with // in its scripts
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Thu, 3 Nov 2005 10:17:44 +0000 (10:17 +0000)]
[ARM] Add Realview default configuration file
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Thu, 3 Nov 2005 10:06:35 +0000 (10:06 +0000)]
[ARM] Fix more 3016/1 breakage
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Michael Ellerman [Thu, 3 Nov 2005 08:34:38 +0000 (19:34 +1100)]
powerpc: Cleanup vpa code
register_vpa() doesn't actually do a VPA register call; it just uses the flags
you pass it, so rename it to vpa_call() to be clearer.
We can then define register_vpa() and unregister_vpa() which are both simple
wrappers around vpa_call(). (we'll need unregister_vpa() for kexec soon)
We can then cleanup vpa_init(), and because vpa_init() is only called from
platforms/pseries we remove the definition in asm-ppc64/smp.h.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Roland Dreier [Thu, 3 Nov 2005 06:59:37 +0000 (22:59 -0800)]
[IB] mthca: fix format of FW version
Mellanox has decided that the components of the firmware version are
really meant to be displayed in decimal, e.g. 0x000400070190 is
version 4.7.400. Change the format we use from "%x.%x.%x" to
"%d.%d.%d" to match this convention.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
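The firmware version example in the message above can be checked with a short user-space sketch. The layout of three 16-bit fields (major, minor, subminor) is an assumption inferred from the example value, and format_fw_ver() is a hypothetical helper, not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Split the version into three 16-bit fields (assumed layout) and print
 * each one in decimal rather than hexadecimal. */
static int format_fw_ver(char *buf, size_t len, uint64_t fw_ver)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%d.%d.%d",
                    (int)((fw_ver >> 32) & 0xffff),
                    (int)((fw_ver >> 16) & 0xffff),
                    (int)(fw_ver & 0xffff));
}
```

For 0x000400070190 this gives "4.7.400", because 0x190 is 400 in decimal; the old "%x.%x.%x" format would have shown the last field as 190.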
Michael Ellerman [Thu, 3 Nov 2005 06:57:53 +0000 (17:57 +1100)]
powerpc: Add helper functions for synthesising instructions at runtime
There are a few places already, and soon will be more, where we synthesise
branch instructions at runtime. Rather than doing it by hand in each case,
it would make sense to have one implementation.
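As an illustration of what such a helper does, here is a minimal sketch that encodes a PowerPC relative branch (I-form: primary opcode 18, a 24-bit word offset, and the LK bit). The name make_rel_branch() and its signature are hypothetical; the log does not give the names of the helpers the patch adds, and absolute branches are left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Encode "b target" (or "bl target" when link is non-zero) for an
 * instruction located at address 'from' that branches to 'to'. */
static uint32_t make_rel_branch(uint32_t from, uint32_t to, int link)
{
    int64_t offset = (int64_t)to - (int64_t)from;

    /* Targets are word aligned and must fit the signed 26-bit range. */
    assert((offset & 3) == 0);
    assert(offset >= -0x2000000 && offset <= 0x1fffffc);

    return 0x48000000u                        /* primary opcode 18 */
         | ((uint32_t)offset & 0x03fffffcu)   /* LI field, low 2 bits zero */
         | (link ? 1u : 0u);                  /* LK: save return address */
}
```

A forward branch of 8 bytes encodes as 0x48000008, and a backward branch of 4 bytes as 0x4bfffffc. Keeping the range check in one place is the main benefit of a shared implementation over hand-rolled copies.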
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Paul Mackerras [Thu, 3 Nov 2005 06:04:08 +0000 (17:04 +1100)]
Merge git://oak/home/sfr/kernels/iseries/work
Stephen Rothwell [Thu, 3 Nov 2005 05:59:17 +0000 (16:59 +1100)]
powerpc: merge ucontext.h
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
David Gibson [Wed, 2 Nov 2005 23:13:58 +0000 (10:13 +1100)]
[PATCH] powerpc: Keep fixing merged ipcbuf.h
Oops, replacing the two u64s in struct ipc64_perm with __u32s changed
the alignment of that structure, which could mess up userspace.
Revert to using two unsigned long longs (which is what ppc32 had
originally). ppc64 originally had two unsigned longs, but long long is
the same size on 64 bit, so this should be ok there too.
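The alignment effect can be demonstrated with two simplified structures. This is only a sketch: perm_u32 and perm_ull are hypothetical and much smaller than the real struct ipc64_perm, and the exact numbers depend on the ABI of the machine that compiles it.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant with each 64-bit field split into 32-bit words. */
struct perm_u32 {
    unsigned int seq;
    uint32_t unused1[2];
    uint32_t unused2[2];
};

/* Hypothetical variant with two unsigned long longs, as after the fix. */
struct perm_ull {
    unsigned int seq;
    unsigned long long unused1;
    unsigned long long unused2;
};

/* The 64-bit members can only raise the alignment, never lower it. */
static_assert(alignof(struct perm_ull) >= alignof(struct perm_u32),
              "long long members never reduce structure alignment");

/* Returns non-zero when the two variants differ in alignment or size,
 * which is the kind of change that would be visible to userspace. */
static int layout_differs(void)
{
    return alignof(struct perm_u32) != alignof(struct perm_ull) ||
           sizeof(struct perm_u32) != sizeof(struct perm_ull);
}
```

On ABIs where unsigned long long is 8-byte aligned, the second structure gains padding after seq and a stricter alignment, so layout_differs() returns non-zero there.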
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>