x86: vc1: fix and enable optimised loop filter
The problem is that the ssse3 psign instruction does the wrong thing here. Commit ea60dfe2 incorrectly removed a macro emulating this instruction for pre-ssse3 code. However, the emulation is incorrect, and the code relies on the behaviour of the macro. Specifically, the psign sets destination elements to zero where the corresponding source element is zero, whereas the emulation only negates destination elements where the source is negative. Furthermore, the PSIGNW_MMX macro in x86util.asm is totally bogus, which is why the original VC-1 code had an additional right shift when using it. Since the psign instruction cannot be used here, skip all the macro hell and use the working instruction sequence directly. None of this was noticed due a stray return statement in ff_vc1dsp_init_mmx() which meant that only the mmx version of the loop filter was ever used (before being removed in ea60dfe2). Signed-off-by: Mans Rullgard <mans@mansr.com>
Showing
Please
register
or
sign in
to comment