The current x86_64 NUMA memory code is inconsistent in how it handles node
memory ranges. The exact behaviour varies depending on which config option
is used.

setup_node_bootmem() takes start and end as arguments and uses them to
calculate the size of the node like this: (end - start). This is correct
only if end points to the first non-available byte. The problem is that the
current x86_64 code sometimes treats it as the last present byte and sometimes
as the first non-available byte. The result is that some configurations might
lose a page at the end of the range.

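The effect of the two conventions on the (end - start) calculation can be sketched as follows. This is only an illustration: node_pages() is a hypothetical helper, not a kernel function, and 4 KiB pages are assumed.

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/*
 * Hypothetical helper (not in the kernel): number of whole pages
 * computed from (end - start), the same way the node size is derived.
 */
static uint64_t node_pages(uint64_t start, uint64_t end)
{
    return (end - start) >> PAGE_SHIFT;
}
```

With end as the first non-available byte, node_pages(0x0, 0x10000) covers all 16 pages of a 64 KiB range. If a caller passes the last present byte instead, node_pages(0x0, 0xffff) comes out as 15, so the final page is lost.
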
This patch tries to fix CONFIG_ACPI_NUMA, CONFIG_K8_NUMA and CONFIG_NUMA_EMU
so they all treat the end variable as the first non-available byte. This
matches what the single node code already does.

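For the CONFIG_K8_NUMA case, the limit arithmetic touched by the patch can be sketched in isolation. k8_limit_exclusive() is a hypothetical wrapper written for illustration; it assumes the raw value carries the upper address bits above bit 16, as the shifts in the hunk suggest, and uses a 64-bit type to keep the example portable.

```c
#include <stdint.h>

/*
 * Hypothetical wrapper (not in the kernel) around the limit
 * arithmetic from the patch hunk.
 */
static uint64_t k8_limit_exclusive(uint64_t limit)
{
    limit >>= 16;                 /* drop the low 16 bits */
    limit <<= 24;                 /* scale to a byte address */
    limit |= (1ULL << 24) - 1;    /* last present byte of the 16 MiB block */
    limit++;                      /* added by the patch: first non-available byte */
    return limit;
}
```

Without the final increment, a raw value of 0x00010000 yields 0x1ffffff, the last present byte. With it, the result is 0x2000000, which is page aligned and can be used directly in (end - start).
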
The patch has been boot tested on dual x86_64 hardware with the above
configurations, but perhaps the removed code is needed as a workaround?

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
limit >>= 16;
limit <<= 24;
limit |= (1<<24)-1;
+ limit++;
if (limit > end_pfn << PAGE_SHIFT)
limit = end_pfn << PAGE_SHIFT;
if (i == numa_fake-1)
sz = (end_pfn<<PAGE_SHIFT) - nodes[i].start;
nodes[i].end = nodes[i].start + sz;
- if (i != numa_fake-1)
- nodes[i].end--;
printk(KERN_INFO "Faking node %d at %016Lx-%016Lx (%LuMB)\n",
i,
nodes[i].start, nodes[i].end,
nd->start = nd->end;
}
if (nd->end > end) {
- if (!(end & 0xfff))
- end--;
nd->end = end;
if (nd->start > nd->end)
nd->start = nd->end;
if (nd->end < end)
nd->end = end;
}
- if (!(nd->end & 0xfff))
- nd->end--;
printk(KERN_INFO "SRAT: Node %u PXM %u %Lx-%Lx\n", node, pxm,
nd->start, nd->end);
}