x86: sbrdsp: implement SSE qmf_deint_bfly
From 312 to 89/68 (sse/sse2) cycles on Arrandale and Win64.
Sandybridge: 68/47 cycles.
Having a loop counter is a 7 cycle gain.
Unrolling is another 7 cycle gain.
Working in reverse scan is another 6 cycles.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Showing
Please
register
or
sign in
to comment