sparc64: Convert BUG_ON to warning
Pagefault handling has a BUG_ON path that panics the system. Convert it to
a warning instead. There is no need to bring down the system for this kind
of failure.
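As a rough illustration of the change in behaviour, here is a minimal userspace
sketch, not the kernel patch itself. The struct fake_regs type, the
check_fault_address() helper and the warnings_emitted counter are hypothetical
names introduced only for this example; the message text mirrors the warning
quoted later in this description. The point is that a mismatch is reported and
execution continues, instead of the whole system being halted.

```c
#include <stdio.h>

/* Hypothetical stand-in for the saved trap registers; only the PC is modelled. */
struct fake_regs {
	unsigned long long tpc;
};

/* Counts warnings so that callers (and tests) can observe them. */
static int warnings_emitted;

/*
 * Warn-and-continue check, assuming the sanity test compares the faulting
 * address with the trap PC. Returns 1 when a mismatch was reported and
 * 0 otherwise. It never aborts, which is the difference from a BUG_ON-style
 * assertion.
 */
static int check_fault_address(unsigned long long address,
			       const struct fake_regs *regs)
{
	if (address != regs->tpc) {
		fprintf(stderr, "WARNING: address (%llx) != regs->tpc (%llx)\n",
			address, regs->tpc);
		warnings_emitted++;
		return 1;
	}
	return 0;
}
```

A caller can use the return value to decide how to fail the offending task
(for example by delivering a fault to it) while the rest of the system keeps
running.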
The following was hit while running:
perf sched record -g -- make -j 16
[3609412.782801] kernel BUG at /opt/dahern/linux.git/arch/sparc/mm/fault_64.c:416!
[3609412.782833] \|/ ____ \|/
[3609412.782833] "@'/ .. \`@"
[3609412.782833] /_| \__/ |_\
[3609412.782833]    \__U_/
[3609412.782870] cat(4516): Kernel bad sw trap 5 [#1]
[3609412.782889] CPU: 0 PID: 4516 Comm: cat Tainted: G E 4.1.0-rc8+ #6
[3609412.782909] task: fff8000126e31f80 ti: fff8000110d90000 task.ti: fff8000110d90000
[3609412.782931] TSTATE: 0000004411001603 TPC: 000000000096b164 TNPC: 000000000096b168 Y: 0000004e Tainted: G E
[3609412.782964] TPC: <do_sparc64_fault+0x5e4/0x6a0>
[3609412.782979] g0: 000000000096abe0 g1: 0000000000d314c4 g2: 0000000000000000 g3: 0000000000000001
[3609412.783009] g4: fff8000126e31f80 g5: fff80001302d2000 g6: fff8000110d90000 g7: 00000000000000ff
[3609412.783045] o0: 0000000000aff6a8 o1: 00000000000001a0 o2: 0000000000000001 o3: 0000000000000054
[3609412.783080] o4: fff8000100026820 o5: 0000000000000001 sp: fff8000110d935f1 ret_pc: 000000000096b15c
[3609412.783117] RPC: <do_sparc64_fault+0x5dc/0x6a0>
[3609412.783137] l0: 000007feff996000 l1: 0000000000030001 l2: 0000000000000004 l3: fff8000127bd0120
[3609412.783174] l4: 0000000000000054 l5: fff8000127bd0188 l6: 0000000000000000 l7: fff8000110d9dba8
[3609412.783210] i0: fff8000110d93f60 i1: fff8000110ca5530 i2: 000000000000003f i3: 0000000000000054
[3609412.783244] i4: fff800010000081a i5: fff8000100000398 i6: fff8000110d936a1 i7: 0000000000407c6c
[3609412.783286] I7: <sparc64_realfault_common+0x10/0x20>
[3609412.783308] Call Trace:
[3609412.783329] [0000000000407c6c] sparc64_realfault_common+0x10/0x20
[3609412.783353] Disabling lock debugging due to kernel taint
[3609412.783379] Caller[0000000000407c6c]: sparc64_realfault_common+0x10/0x20
[3609412.783449] Caller[fff80001002283e4]: 0xfff80001002283e4
[3609412.783471] Instruction DUMP: 921021a0 7feaff91 901222a8 <91d02005> 82086100 02f87f7b 808a2873 81cfe008 01000000
[3609412.783542] Kernel panic - not syncing: Fatal exception
[3609412.784605] Press Stop-A (L1-A) to return to the boot prom
[3609412.784615] ---[ end Kernel panic - not syncing: Fatal exception
With this patch, rather than a panic, I occasionally get a warning like the
one below when running
    perf sched record -g -m 1024 -- make -j N
where N is based on the number of CPUs (128 to 1024 for a T7-4 and 8 for an
8-CPU VM on a T5-2):
WARNING: CPU: 211 PID: 52565 at /opt/dahern/linux.git/arch/sparc/mm/fault_64.c:417 do_sparc64_fault+0x340/0x70c()
address (7feffcd6000) != regs->tpc (fff80001004873c0)
Modules linked in: ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 cdc_ether usbnet mii ixgbe mdio igb i2c_algo_bit i2c_core ptp crc32c_sparc64 camellia_sparc64 des_sparc64 des_generic md5_sparc64 sha512_sparc64 sha1_sparc64 uio_pdrv_genirq uio usb_storage mpt3sas scsi_transport_sas raid_class aes_sparc64 sunvnet sunvdc sha256_sparc64(E) sha256_generic(E)
CPU: 211 PID: 52565 Comm: ld Tainted: G W E 4.1.0-rc8+ #19
Call Trace:
 [000000000045ce30] warn_slowpath_common+0x7c/0xa0
 [000000000045ceec] warn_slowpath_fmt+0x30/0x40
 [000000000098ad64] do_sparc64_fault+0x340/0x70c
 [0000000000407c2c] sparc64_realfault_common+0x10/0x20
---[ end trace 62ee02065a01a049 ]---
ld[52565]: segfault at fff80001004873c0 ip fff80001004873c0 (rpc fff8000100158868) sp 000007feffcd70e1 error 30002 in libc-2.12.so[fff8000100410000+184000]
The segfault is horrible, but better than a system panic.
An 8-CPU VM on a T5-2 also showed the above traces from time to time,
so it is a general problem and not specific to the T7 or to bare metal.
Signed-off-by: David Ahern <david.ahern@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>