From 69102311a57d1fd65cdc4002c55c5d551c799044 Mon Sep 17 00:00:00 2001 From: Jiri Kosina Date: Wed, 6 Aug 2014 16:08:43 -0700 Subject: [PATCH] ./Makefile: tell gcc optimizer to never introduce new data races We have been chasing a memory corruption bug, which turned out to be caused by very old gcc (4.3.4), which happily turned conditional load into a non-conditional one, and that broke correctness (the condition was met only if lock was held) and corrupted memory. This particular problem with that particular code did not happen when never gccs were used. I've brought this up with our gcc folks, as I wanted to make sure that this can't really happen again, and it turns out it actually can. Quoting Martin Jambor : "More current GCCs are more careful when it comes to replacing a conditional load with a non-conditional one, most notably they check that a store happens in each iteration of _a_ loop but they assume loops are executed. They also perform a simple check whether the store cannot trap which currently passes only for non-const variables. A simple testcase demonstrating it on an x86_64 is for example the following: $ cat cond_store.c int g_1 = 1; int g_2[1024] __attribute__((section ("safe_section"), aligned (4096))); int c = 4; int __attribute__ ((noinline)) foo (void) { int l; for (l = 0; (l != 4); l++) { if (g_1) return l; for (g_2[0] = 0; (g_2[0] >= 26); ++g_2[0]) ; } return 2; } int main (int argc, char* argv[]) { if (mprotect (g_2, sizeof(g_2), PROT_READ) == -1) { int e = errno; error (e, e, "mprotect error %i", e); } foo (); __builtin_printf("OK\n"); return 0; } /* EOF */ $ ~/gcc/trunk/inst/bin/gcc cond_store.c -O2 --param allow-store-data-races=0 $ ./a.out OK $ ~/gcc/trunk/inst/bin/gcc cond_store.c -O2 --param allow-store-data-races=1 $ ./a.out Segmentation fault The testcase fails the same at least with 4.9, 4.8 and 4.7. Therefore I would suggest building kernels with this parameter set to zero. I also agree with Jikos that the default should be changed for -O2. I have run most of the SPEC 2k6 CPU benchmarks (gamess and dealII failed, at -O2, not sure why) compiled with and without this option and did not see any real difference between respective run-times" Hopefully the default will be changed in newer gccs, but let's force it for kernel builds so that we are on a safe side even when older gcc are used. The code in question was out-of-tree printk-in-NMI (yeah, surprise suprise, once again) patch written by Petr Mladek, let me quote his comment from our internal bugzilla: "I have spent few days investigating inconsistent state of kernel ring buffer. It went out that it was caused by speculative store generated by gcc-4.3.4. The problem is in assembly generated for make_free_space(). The functions is called the following way: + vprintk_emit(); + log = MAIN_LOG; // with logbuf_lock or log = NMI_LOG; // with nmi_logbuf_lock cont_add(log, ...); + cont_flush(log, ...); + log_store(log, ...); + log_make_free_space(log, ...); If called with log = NMI_LOG then only nmi_log_* global variables are safe to modify but the generated code does store also into (main_)log_* global variables: : 55 push %rbp 89 f6 mov %esi,%esi 48 8b 05 03 99 51 01 mov 0x1519903(%rip),%rax # ffffffff82620868 44 8b 1d ec 98 51 01 mov 0x15198ec(%rip),%r11d # ffffffff82620858 8b 35 36 60 14 01 mov 0x1146036(%rip),%esi # ffffffff8224cfa8 44 8b 35 33 60 14 01 mov 0x1146033(%rip),%r14d # ffffffff8224cfac 4c 8b 2d d0 98 51 01 mov 0x15198d0(%rip),%r13 # ffffffff82620850 4c 8b 25 11 61 14 01 mov 0x1146111(%rip),%r12 # ffffffff8224d098 49 89 c2 mov %rax,%r10 48 21 c2 and %rax,%rdx 48 8b 1d 0c 99 55 01 mov 0x155990c(%rip),%rbx # ffffffff826608a0 49 c1 ea 20 shr $0x20,%r10 48 89 55 d0 mov %rdx,-0x30(%rbp) 44 29 de sub %r11d,%esi 45 29 d6 sub %r10d,%r14d 4c 8b 0d 97 98 51 01 mov 0x1519897(%rip),%r9 # ffffffff82620840 eb 7e jmp ffffffff81107029 [...] 85 ff test %edi,%edi # edi = 1 for NMI_LOG 4c 89 e8 mov %r13,%rax 4c 89 ca mov %r9,%rdx 74 0a je ffffffff8110703d 8b 15 27 98 51 01 mov 0x1519827(%rip),%edx # ffffffff82620860 48 8b 45 d0 mov -0x30(%rbp),%rax 48 39 c2 cmp %rax,%rdx # end of loop 0f 84 da 00 00 00 je ffffffff81107120 [...] 85 ff test %edi,%edi # edi = 1 for NMI_LOG 4c 89 0d 17 97 51 01 mov %r9,0x1519717(%rip) # ffffffff82620840 ^^^^^^^^^^^^^^^^^^^^^^^^^^ KABOOOM 74 35 je ffffffff81107160 It stores log_first_seq when edi == NMI_LOG. This instructions are used also when edi == MAIN_LOG but the store is done speculatively before the condition is decided. It is unsafe because we do not have "logbuf_lock" in NMI context and some other process migh modify "log_first_seq" in parallel" I believe that the best course of action is both - building kernel (and anything multi-threaded, I guess) with that optimization turned off - persuade gcc folks to change the default for future releases Signed-off-by: Jiri Kosina Cc: Martin Jambor Cc: Petr Mladek Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Marek Polacek Cc: Jakub Jelinek Cc: Steven Noonan Cc: Richard Biener Cc: Dan Carpenter Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Makefile | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Makefile b/Makefile index af2f2fcfe01c..a897c50db515 100644 --- a/Makefile +++ b/Makefile @@ -621,6 +621,9 @@ else KBUILD_CFLAGS += -O2 endif +# Tell gcc to never replace conditional load with a non-conditional one +KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0) + ifdef CONFIG_READABLE_ASM # Disable optimizations that make assembler listings hard to read. # reorder blocks reorders the control in the function -- 2.20.1