drm/vc4: improve throughput by pipelining binning and rendering jobs
The hardware provides us with separate threads for binning and
rendering, and the existing model waits for them both to complete
before submitting the next job.
Splitting the binning and rendering submissions reduces idle time and
gives us approx 20-30% speedup with some x11perf tests such as -line10
and -tilerect1. Improves openarena performance by 1.01897% +/-
0.247857% (n=16).
Thanks to anholt for suggesting this.
v2: Rebase on the spurious resets fix (change by anholt).
Signed-off-by: Varad Gautam <varadgautam@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>