x86: Force inlining of atomic ops

With both gcc 4.7.2 and 4.9.2, sometimes gcc mysteriously
doesn't inline very small functions we expect to be inlined:

    $ nm --size-sort vmlinux | grep -iF ' t ' | uniq -c | grep -v '^ *1 ' | sort -rn
    473 000000000000000b t spin_unlock_irqrestore
    449 000000000000005f t rcu_read_unlock
    355 0000000000000009 t atomic_inc                  <== THIS
    353 000000000000006e t rcu_read_lock
    350 0000000000000075 t rcu_read_lock_sched_held
    291 000000000000000b t spin_unlock
    266 0000000000000019 t arch_local_irq_restore
    215 000000000000000b t spin_lock
    180 0000000000000011 t kzalloc
    165 0000000000000012 t list_add_tail
    161 0000000000000019 t arch_local_save_flags
    153 0000000000000016 t test_and_set_bit
    134 000000000000000b t spin_unlock_irq
    134 0000000000000009 t atomic_dec                  <== THIS
    130 000000000000000b t spin_unlock_bh
    122 0000000000000010 t brelse
    120 0000000000000016 t test_and_clear_bit
    120 000000000000000b t spin_lock_irq
    119 000000000000001e t get_dma_ops
    117 0000000000000053 t cpumask_next
    116 0000000000000036 t kref_get
    114 000000000000001a t schedule_work
    106 000000000000000b t spin_lock_bh
    103 0000000000000019 t arch_local_irq_disable
    ...

Note the sizes of the marked functions: they are merely 9 bytes long!

Selecting functions with 'atomic' in their names:

    355 0000000000000009 t atomic_inc
    134 0000000000000009 t atomic_dec
     98 0000000000000014 t atomic_dec_and_test
     31 000000000000000e t atomic_add_return
     27 000000000000000a t atomic64_inc
     26 000000000000002f t kmap_atomic
     24 0000000000000009 t atomic_add
     12 0000000000000009 t atomic_sub
     10 0000000000000021 t __atomic_add_unless
     10 000000000000000a t atomic64_add
      5 000000000000001f t __atomic_add_unless.constprop.7
      5 000000000000000a t atomic64_dec
      4 000000000000001f t __atomic_add_unless.constprop.18
      4 000000000000001f t __atomic_add_unless.constprop.12
      4 000000000000001f t __atomic_add_unless.constprop.10
      3 000000000000001f t __atomic_add_unless.constprop.13
      3 0000000000000011 t atomic64_add_return
      2 000000000000001f t __atomic_add_unless.constprop.9
      2 000000000000001f t __atomic_add_unless.constprop.8
      2 000000000000001f t __atomic_add_unless.constprop.6
      2 000000000000001f t __atomic_add_unless.constprop.5
      2 000000000000001f t __atomic_add_unless.constprop.3
      2 000000000000001f t __atomic_add_unless.constprop.22
      2 000000000000001f t __atomic_add_unless.constprop.14
      2 000000000000001f t __atomic_add_unless.constprop.11
      2 000000000000001e t atomic_dec_if_positive
      2 0000000000000014 t atomic_inc_and_test
      2 0000000000000011 t atomic_add_return.constprop.4
      2 0000000000000011 t atomic_add_return.constprop.17
      2 0000000000000011 t atomic_add_return.constprop.16
      2 000000000000000d t atomic_inc.constprop.4
      2 000000000000000c t atomic_cmpxchg

This patch fixes this for x86 atomic ops via
s/inline/__always_inline/. This decreases the size of an
allyesconfig kernel by about 25k:

        text     data      bss       dec     hex filename
    82399481 22255416 20627456 125282353 777a831 vmlinux.before
    82375570 22255544 20627456 125258570 7774b4a vmlinux

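For illustration only, the kind of change that s/inline/__always_inline/ makes can be sketched in userspace C. This is not the kernel source: the demo_always_inline macro, the *_hint/*_forced function names and the use of the compiler's __atomic builtins are all assumptions made so the sketch builds outside the kernel (the real x86 helpers use inline asm, and the kernel's __always_inline comes from its compiler headers).

```c
/*
 * Userspace sketch, NOT the kernel implementation.
 * demo_always_inline is a hypothetical stand-in for the kernel's
 * __always_inline; the GCC/Clang __atomic builtins stand in for the
 * kernel's x86 inline asm so this compiles on any architecture.
 */
#define demo_always_inline inline __attribute__((__always_inline__))

typedef struct {
	int counter;
} atomic_t;

/* Before: plain "inline" is only a hint, gcc may still emit a call. */
static inline void atomic_inc_hint(atomic_t *v)
{
	__atomic_fetch_add(&v->counter, 1, __ATOMIC_SEQ_CST);
}

/* After: the attribute asks gcc to expand the body in every caller. */
static demo_always_inline void atomic_inc_forced(atomic_t *v)
{
	__atomic_fetch_add(&v->counter, 1, __ATOMIC_SEQ_CST);
}

/* Small read helper, used only to check the counter in this sketch. */
static demo_always_inline int atomic_read_forced(const atomic_t *v)
{
	return __atomic_load_n(&v->counter, __ATOMIC_SEQ_CST);
}
```

Both variants behave identically; the only difference is whether the compiler is allowed to leave an out-of-line copy in each object file, which is what the nm listings above are counting.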
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1431080762-17797-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>