usb: musb: ux500: optimize DMA callback routine
Skip the use of work queue and call musb_dma_completion() directly from
DMA callback context.
Here follows measurements on a Snowball board with ondemand governor active.
Performance using work queue:
(105 MB) copied, 6.23758 s, 16.8 MB/s
(105 MB) copied, 5.7151 s, 18.3 MB/s
(105 MB) copied, 5.83583 s, 18.0 MB/s
(105 MB) copied, 5.93611 s, 17.7 MB/s
Performance without work queue
(105 MB) copied, 5.62173 s, 18.7 MB/s
(105 MB) copied, 5.61811 s, 18.7 MB/s
(105 MB) copied, 5.57817 s, 18.8 MB/s
(105 MB) copied, 5.58549 s, 18.8 MB/s
Signed-off-by: Per Forlin <per.forlin@linaro.org>
Acked-by: Mian Yousaf Kaukab <mian.yousaf.kaukab@stericsson.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>