x86: set X86_FEATURE_TSC_RELIABLE
author Ingo Molnar <mingo@elte.hu>
Thu, 26 Feb 2009 19:16:58 +0000 (20:16 +0100)
committer Ingo Molnar <mingo@elte.hu>
Thu, 26 Feb 2009 20:20:25 +0000 (21:20 +0100)
commit 83ce400928680a6c8123d492684b27857f5a2d95
tree 384dfa725400a13b335204baa819a8741c47e0c4
parent b342501cd31e5546d0c9ca8ceff5ded1832f9e5b

If the TSC is constant and non-stop, also set it reliable.

(We will turn this off in DMI quirks for multi-chassis systems)
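
As a rough illustration of the rule above (not the actual change to arch/x86/kernel/cpu/intel.c), the following user-space C sketch models the decision. It assumes that "constant and non-stop" is signalled by bit 8 of EDX from CPUID leaf 0x80000007 (the invariant-TSC bit). The `FEAT_*` flags and `tsc_caps_from_power()` are hypothetical stand-ins, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature flags for this sketch only; the kernel uses
 * X86_FEATURE_* bit numbers rather than these values. */
enum {
    FEAT_CONSTANT_TSC = 1u << 0,
    FEAT_NONSTOP_TSC  = 1u << 1,
    FEAT_TSC_RELIABLE = 1u << 2,
};

/* CPUID leaf 0x80000007, EDX bit 8: invariant TSC. */
#define INVARIANT_TSC_BIT (1u << 8)
static_assert(INVARIANT_TSC_BIT == 0x100u, "expected EDX bit 8");

/* Map the power-management EDX word to the TSC flags in this model. */
static uint32_t tsc_caps_from_power(uint32_t power_edx)
{
    uint32_t caps = 0;

    if (power_edx & INVARIANT_TSC_BIT) {
        caps |= FEAT_CONSTANT_TSC | FEAT_NONSTOP_TSC;
        /* The step described here: such a TSC is also marked reliable. */
        caps |= FEAT_TSC_RELIABLE;
    }
    return caps;
}
```

In this model, `tsc_caps_from_power(1u << 8)` yields all three flags, and any word with bit 8 clear yields none.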

The performance difference on a 16-way Nehalem system running
32 tasks that context-switch among each other is significant:

   sched_clock_stable=0         sched_clock_stable=1
   ....................         ....................
   22.456925 million/sec        24.306972 million/sec   [+8.2%]

lmbench's "lat_ctx -s 0 2" goes from 0.63 microseconds to
0.59 microseconds - a 6.7% increase in context-switching
performance.
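
The percentages quoted above can be reproduced with a few lines of arithmetic. The sketch below uses hypothetical helper names (`pct_gain`, `pct_gain_from_latency`) and only the rounded figures printed in this message, so the last digit may differ slightly from values derived from unrounded measurements.

```c
#include <assert.h>

/* Relative gain of a rate (higher is better), in percent. */
static double pct_gain(double before, double after)
{
    assert(before > 0.0);
    return (after / before - 1.0) * 100.0;
}

/* Throughput gain implied by a latency drop: the rate is 1/latency,
 * so the rate ratio is lat_before / lat_after. */
static double pct_gain_from_latency(double lat_before, double lat_after)
{
    assert(lat_after > 0.0);
    return (lat_before / lat_after - 1.0) * 100.0;
}
```

For example, `pct_gain(22.456925, 24.306972)` is about 8.2, and `pct_gain_from_latency(0.63, 0.59)` is about 6.8 with these rounded inputs.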

Perfstat of 1 million pipe context switches between two tasks:

 Performance counter stats for './pipe-test-1m':

       [before]           [after]
   ............      ............
   37621.421089      36436.848378    task clock ticks     (msecs)

              0                 0    CPU migrations       (events)
        2000274           2000189    context switches     (events)
            194               193    pagefaults           (events)
     8433799643        8171016416    CPU cycles           (events) -3.21%
     8370133368        8180999694    instructions         (events) -2.31%
        4158565           3895941    cache references     (events) -6.74%
          44312             46264    cache misses         (events)

    2349.287976       2279.362465    wall-time            (msecs)  -3.06%
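
One way to read the counters above is to compare instructions per cycle (IPC) for the two runs. The helper below is a hypothetical sketch; it only divides the instruction count by the cycle count taken from the table.

```c
#include <assert.h>
#include <stdint.h>

/* Instructions executed per CPU cycle. */
static double ipc(uint64_t instructions, uint64_t cycles)
{
    assert(cycles != 0);
    return (double)instructions / (double)cycles;
}
```

With the table's values, `ipc(8370133368ULL, 8433799643ULL)` and `ipc(8180999694ULL, 8171016416ULL)` both come out close to 1.0 (roughly 0.99 and 1.00), so the work done per cycle is nearly unchanged between the two runs.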

The speedup comes straight from the reduction in the instruction
count. sched_clock_cpu() got simpler and the whole workload thus
executes faster.
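
To illustrate why a stable clock shortens this path, here is a simplified user-space model, not the kernel's implementation. It assumes that a stable clock lets the per-CPU clock function return the raw timestamp directly, while the unstable case has to keep per-CPU state and filter each reading so that time never goes backwards or jumps too far. All names (`model_clock_cpu`, `struct cpu_clock_state`, `MAX_STEP_NS`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_STEP_NS 1000000ULL /* hypothetical cap on one step: 1 ms */

struct cpu_clock_state {
    uint64_t last_raw; /* raw timestamp seen at the previous call */
    uint64_t clock;    /* filtered, monotonic clock value */
};

/* Model of a per-CPU clock read; 'raw' is the current raw timestamp in ns. */
static uint64_t model_clock_cpu(struct cpu_clock_state *s, uint64_t raw,
                                bool stable)
{
    uint64_t delta;

    assert(s != 0);

    /* Fast path: a reliable source needs no filtering and no state update. */
    if (stable)
        return raw;

    /* Slow path: ignore backward jumps and cap forward jumps. */
    delta = raw > s->last_raw ? raw - s->last_raw : 0;
    if (delta > MAX_STEP_NS)
        delta = MAX_STEP_NS;

    s->last_raw = raw;
    s->clock += delta;
    return s->clock;
}
```

In this model the stable path is a single branch and return, whereas the unstable path adds a comparison, a clamp, and two stores on every call.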

Signed-off-by: Ingo Molnar <mingo@elte.hu>
arch/x86/kernel/cpu/intel.c