drm/radeon: add a force flush to delay work when radeon
[ Upstream commit f461950fdc374a3ada5a63c669d997de4600dffe ]
Although the radeon driver fences and waits for the GPU to finish processing the
current batch of rings, there is still a corner case in which the radeon lockup
work queue has not been fully flushed by the time radeon_suspend_kms() calls
pci_set_power_state() to put the device into the D3hot state.
Per PCI spec rev 4.0, section 5.3.1.4.1 (D3hot State):
> Configuration and Message requests are the only TLPs accepted by a Function in
> the D3hot state. All other received Requests must be handled as Unsupported Requests,
> and all received Completions may optionally be handled as Unexpected Completions.
The issue shows up in the following log:
Unable to handle kernel paging request at virtual address 00008800e0008010
CPU 0 kworker/0:3(131): Oops 0
pc = [<ffffffff811bea5c>] ra = [<ffffffff81240844>] ps = 0000 Tainted: G W
pc is at si_gpu_check_soft_reset+0x3c/0x240
ra is at si_dma_is_lockup+0x34/0xd0
v0 = 0000000000000000  t0 = fff08800e0008010  t1 = 0000000000010000
t2 = 0000000000008010  t3 = fff00007e3c00000  t4 = fff00007e3c00258
t5 = 000000000000ffff  t6 = 0000000000000001  t7 = fff00007ef078000
s0 = fff00007e3c016e8  s1 = fff00007e3c00000  s2 = fff00007e3c00018
s3 = fff00007e3c00000  s4 = fff00007fff59d80  s5 = 0000000000000000
s6 = fff00007ef07bd98
a0 = fff00007e3c00000  a1 = fff00007e3c016e8  a2 = 0000000000000008
a3 = 0000000000000001  a4 = 8f5c28f5c28f5c29  a5 = ffffffff810f4338
t8 = 0000000000000275  t9 = ffffffff809b66f8  t10 = ff6769c5d964b800
t11= 000000000000b886  pv = ffffffff811bea20  at = 0000000000000000
gp = ffffffff81d89690  sp = 00000000aa814126
Disabling lock debugging due to kernel taint
Trace:
[<ffffffff81240844>] si_dma_is_lockup+0x34/0xd0
[<ffffffff81119610>] radeon_fence_check_lockup+0xd0/0x290
[<ffffffff80977010>] process_one_work+0x280/0x550
[<ffffffff80977350>] worker_thread+0x70/0x7c0
[<ffffffff80977410>] worker_thread+0x130/0x7c0
[<ffffffff80982040>] kthread+0x200/0x210
[<ffffffff809772e0>] worker_thread+0x0/0x7c0
[<ffffffff80981f8c>] kthread+0x14c/0x210
[<ffffffff80911658>] ret_from_kernel_thread+0x18/0x20
[<ffffffff80981e40>] kthread+0x0/0x210
Code: ad3e0008 43f0074a ad7e0018 ad9e0020 8c3001e8 40230101 <88210000> 4821ed21
So force a flush of the lockup work queue before the device is powered down
to fix this problem.
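
The ordering problem can be illustrated with a small userspace model. This is
only a sketch under stated assumptions, not the driver change itself: every
name in it (model_dev, lockup_check, flush_lockup_work, model_suspend,
workqueue_tick, run_scenario) is hypothetical, and the work queue is reduced
to a single "pending" flag. In the kernel, the primitive that waits for a
pending delayed work item to finish is flush_delayed_work().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are radeon symbols. */
enum power_state { STATE_D0, STATE_D3HOT };

struct model_dev {
    enum power_state power;   /* current device power state */
    bool lockup_work_pending; /* a lockup check is queued but has not run */
    int faults;               /* register reads attempted outside D0 */
};

/* Stand-in for the lockup check: it reads a device register. */
static void lockup_check(struct model_dev *dev)
{
    assert(dev->lockup_work_pending); /* only called while work is queued */
    if (dev->power != STATE_D0)
        dev->faults++; /* models the faulting access once in D3hot */
    dev->lockup_work_pending = false;
}

/* Run the pending work now, if any, and return once it has finished. */
static void flush_lockup_work(struct model_dev *dev)
{
    if (dev->lockup_work_pending)
        lockup_check(dev);
}

/* Suspend path: optionally flush pending work before entering D3hot. */
static void model_suspend(struct model_dev *dev, bool flush_first)
{
    if (flush_first)
        flush_lockup_work(dev);
    dev->power = STATE_D3HOT;
}

/* The work queue firing later: runs whatever is still pending. */
static void workqueue_tick(struct model_dev *dev)
{
    if (dev->lockup_work_pending)
        lockup_check(dev);
}

/* Returns the number of faulting register reads for one suspend cycle. */
static int run_scenario(bool flush_first)
{
    struct model_dev dev = { STATE_D0, true, 0 };

    model_suspend(&dev, flush_first);
    workqueue_tick(&dev); /* late worker, after the power transition */
    return dev.faults;
}
```

In this model, run_scenario(false) counts one faulting read because the
lockup check runs after the transition to D3hot, while run_scenario(true)
counts none because the check completes while the device is still in D0.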
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zhenneng Li <lizhenneng@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>