An arbitrary amount of time can pass between spin_unlock and
fence_wait_any_timeout, so we need to ensure that nobody frees the
fences from under us.
A stress test (rapidly starting and killing hundreds of glxgears
instances) ran into a deadlock in fence_wait_any_timeout after
about an hour, and this race condition appears to be a plausible
cause.
v2: agd: rebase on upstream
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
for (i = 0, count = 0; i < AMDGPU_MAX_RINGS; ++i)
if (fences[i])
- fences[count++] = fences[i];
+ fences[count++] = fence_get(fences[i]);
if (count) {
spin_unlock(&sa_manager->wq.lock);
t = fence_wait_any_timeout(fences, count, false,
MAX_SCHEDULE_TIMEOUT);
+ for (i = 0; i < count; ++i)
+ fence_put(fences[i]);
+
r = (t > 0) ? 0 : t;
spin_lock(&sa_manager->wq.lock);
} else {