powerpc: Move 64bit VDSO to improve context switch performance

In 64bit applications the VDSO is the only thing in segment 0. Since the VDSO
is position independent we can remove the hint and let get_unmapped_area pick
an area. This means the VDSO will sit near other mmaps and will share an SLB
entry with them:
10000000-10001000 r-xp 00000000 08:06 5778459 /root/context_switch_64
10010000-10011000 r--p 00000000 08:06 5778459 /root/context_switch_64
10011000-10012000 rw-p 00001000 08:06 5778459 /root/context_switch_64
fffa92ae000-fffa92b0000 rw-p 00000000 00:00 0
fffa92b0000-fffa9453000 r-xp 00000000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa9453000-fffa9462000 ---p 001a3000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa9462000-fffa9466000 r--p 001a2000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa9466000-fffa947c000 rw-p 001a6000 08:06 4334051 /lib64/power6/libc-2.9.so
fffa947c000-fffa9480000 rw-p 00000000 00:00 0
fffa9480000-fffa94a8000 r-xp 00000000 08:06 4333852 /lib64/ld-2.9.so
fffa94b3000-fffa94b4000 rw-p 00000000 00:00 0
fffa94b4000-fffa94b7000 r-xp 00000000 00:00 0 [vdso] <----- here I am
fffa94b7000-fffa94b8000 r--p 00027000 08:06 4333852 /lib64/ld-2.9.so
fffa94b8000-fffa94bb000 rw-p 00028000 08:06 4333852 /lib64/ld-2.9.so
fffa94bb000-fffa94bc000 rw-p 00000000 00:00 0
fffe4c10000-fffe4c25000 rw-p 00000000 00:00 0 [stack]
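
The sharing can be checked with a little arithmetic on the start addresses
in the listing. The sketch below is illustrative only and rests on two
assumptions that this message does not state: 256MB segments (a shift of 28
bits) and an old fixed VDSO base of 0x100000. The names segment() and
segments_in_use() are made up for the example.

```python
# Assumption: 256MB segments, so the segment number is the address >> 28.
SEGMENT_SHIFT = 28
# Assumption: the fixed base the 64bit VDSO used before this change.
OLD_VDSO_BASE = 0x100000


def segment(addr):
    """Return the number of the 256MB segment that contains addr."""
    return addr >> SEGMENT_SHIFT


def segments_in_use(addrs):
    """Return the set of distinct segments touched by the given addresses."""
    return {segment(a) for a in addrs}


# Start addresses copied from the maps listing in this message.
mappings = {
    "binary": 0x10000000,
    "libc": 0xfffa92b0000,
    "ld.so": 0xfffa9480000,
    "vdso": 0xfffa94b4000,
    "stack": 0xfffe4c10000,
}

for name, start in mappings.items():
    print(f"{name:6s} {start:#x} -> segment {segment(start):#x}")

# Segment count with the VDSO at its new address, and with it moved back
# to the assumed old base.
new_layout = segments_in_use(mappings.values())
old_layout = segments_in_use(
    [a for n, a in mappings.items() if n != "vdso"] + [OLD_VDSO_BASE]
)
print(len(old_layout), "segments before,", len(new_layout), "after")
```

Under these assumptions the VDSO lands in the same segment as libc and
ld.so, so the process touches one segment fewer than it did with the VDSO
alone in segment 0.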
On a microbenchmark that bounces a token between two 64bit processes over pipes
and calls gettimeofday each iteration (to access the VDSO), our context switch
rate goes from 268k to 277k ctx switches/sec (tested on a 4GHz POWER6).
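
For readers who want to try something similar, here is a rough sketch of the
shape of such a benchmark. It is not the program used for the numbers above:
it is written in Python for brevity, bounce() is a hypothetical name, and
time.time() stands in for gettimeofday on the assumption that the runtime
reads the clock through the VDSO. It reports round trips per second rather
than context switches per second, and interpreter overhead will dominate the
absolute figures.

```python
import os
import time


def bounce(iterations):
    """Bounce a one-byte token between parent and child over two pipes.

    Both sides read the clock on every iteration. Returns the number of
    round trips per second measured by the parent.
    """
    p2c_r, p2c_w = os.pipe()  # parent -> child
    c2p_r, c2p_w = os.pipe()  # child -> parent

    pid = os.fork()
    if pid == 0:
        # Child: wait for the token, read the clock, send the token back.
        os.close(p2c_w)
        os.close(c2p_r)
        for _ in range(iterations):
            os.read(p2c_r, 1)
            time.time()
            os.write(c2p_w, b"x")
        os._exit(0)

    # Parent: send the token, wait for it to return, read the clock.
    os.close(p2c_r)
    os.close(c2p_w)
    start = time.time()
    for _ in range(iterations):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
        time.time()
    elapsed = time.time() - start

    os.waitpid(pid, 0)
    os.close(p2c_w)
    os.close(c2p_r)
    # Guard against a zero interval on very short runs.
    return iterations / max(elapsed, 1e-9)
```

Pinning both processes to the same CPU (for example with taskset) forces
each hop of the token to be a context switch, which is the case this change
targets.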
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>