- 28 Feb, 2014 10 commits
-
-
Christophe Gisquet authored
The vector dequantization has a test in a loop preventing effective SIMD implementation. By moving it out of the loop, this loop can be DSPized. Therefore, modify the current DSP implementation. In particular, the DSP implementation no longer has to handle null loop sizes.

The decode_hf implementations have the following timings on x86 Arrandale:
        C    SSE  SSE2  SSE4
win32:  260  162  119   104
win64:  242  N/A  89    72

The arm NEON optimizations follow in a later patch as external asm. The now unused check for the y modifier in arm inline asm is removed from configure.
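As a rough illustration of the idea of moving a test out of a loop, the sketch below shows a loop-invariant test being hoisted so that the inner loop becomes branch-free and the callee can assume a non-zero length. It is not the actual decode_hf code: the function names, the `use_scale` flag and the scaling operation are all hypothetical and chosen only to keep the example small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "before" shape: the test is evaluated on every iteration,
 * which gets in the way of a straightforward SIMD version of the loop. */
static void dequant_test_inside(float *dst, const int *src, size_t len,
                                float scale, int use_scale)
{
    for (size_t i = 0; i < len; i++) {
        if (use_scale)
            dst[i] = src[i] * scale;
        else
            dst[i] = (float)src[i];
    }
}

/* Branch-free inner loops; callers guarantee len > 0, so an optimized
 * version would not need to handle the empty case. */
static void scale_block(float *dst, const int *src, size_t len, float scale)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i] * scale;
}

static void copy_block(float *dst, const int *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = (float)src[i];
}

/* Hypothetical "after" shape: both the empty-size check and the
 * loop-invariant test are done once, outside the loop. */
static void dequant_hoisted(float *dst, const int *src, size_t len,
                            float scale, int use_scale)
{
    if (!len)
        return;
    if (use_scale)
        scale_block(dst, src, len, scale);
    else
        copy_block(dst, src, len);
}
```

Both variants produce the same output; only the second leaves a loop body that maps directly onto vector instructions.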
-
Janne Grunau authored
Based on a patch from Christophe Gisquet. Unrolling of the m == 0 case avoids a possible use of the uninitialized value sum when s->predictor_history is not set. I failed to find a sample for it. It also reduced the cycle count from 220 to 150 on sandy bridge, x86_64 linux, gcc 4.8.2 compared to his patch.
-
Christophe Gisquet authored
Timings for Arrandale:
        C     SSE
win32:  2108  334
win64:  1152  322

Factorizing the inner loop with a call/jmp costs more than 15 cycles, even with the jmp destination being aligned. Unrolling for ARCH_X86_64 gains 20 cycles.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
-
Christophe Gisquet authored
This change is inspired by x86 asm where it frees a register. Signed-off-by: Janne Grunau <janne-libav@jannau.net>
-
Christophe Gisquet authored
Results for Arrandale/Windows:
32: 1670 -> 316
64: 728 -> 298

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
-
Christophe Gisquet authored
The scaling factor is constant so it is faster to scale the FIR coefficients in the tables during compilation. Signed-off-by: Janne Grunau <janne-libav@jannau.net>
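A minimal sketch of that idea, assuming a made-up 4-tap filter and a made-up `SCALE` constant (neither is taken from the actual tables): the coefficient table is wrapped in a macro so the compiler folds the scaling into the constants, and the per-output multiply disappears at run time.

```c
#include <assert.h>

#define SCALE 0.5f               /* hypothetical constant scaling factor */
#define C(x)  ((x) * SCALE)      /* folded by the compiler into the table */

/* Unscaled coefficients: the result has to be scaled at run time. */
static const float fir_raw[4] = { 1.0f, 2.0f, -4.0f, 8.0f };

/* Same coefficients with the scaling applied during compilation. */
static const float fir_prescaled[4] = { C(1.0f), C(2.0f), C(-4.0f), C(8.0f) };

static float fir_runtime_scale(const float *in)
{
    float sum = 0.0f;
    for (int i = 0; i < 4; i++)
        sum += in[i] * fir_raw[i];
    return sum * SCALE;          /* extra multiply per output sample */
}

static float fir_prescaled_dot(const float *in)
{
    float sum = 0.0f;
    for (int i = 0; i < 4; i++)
        sum += in[i] * fir_prescaled[i];
    return sum;                  /* no run-time scaling needed */
}
```

With a power-of-two factor as used here the two functions agree exactly; with an arbitrary factor the results can differ in the last bits because the rounding happens at a different point.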
-
Diego Biurrun authored
-
Diego Biurrun authored
None of the encoder bits are arch-optimized.
-
Diego Biurrun authored
No permutation is necessary for the FDCT.
-
Diego Biurrun authored
-
- 27 Feb, 2014 1 commit
-
-
Diego Biurrun authored
This also avoids a macro name clash and related warning on ARM.
-
- 26 Feb, 2014 3 commits
-
-
Anton Khirnov authored
-
Diego Biurrun authored
These are already covered through dependencies specified in configure.
-
Andrew Kelley authored
Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
- 25 Feb, 2014 4 commits
-
-
Diego Biurrun authored
-
Anton Khirnov authored
Based on a patch by Andrew Kelley <superjoe30@gmail.com>

Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Luca Barbato authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Diego Biurrun authored
-
- 24 Feb, 2014 14 commits
-
-
Janne Grunau authored
-
Vittorio Giovara authored
-
Vittorio Giovara authored
-
Anton Khirnov authored
-
Anton Khirnov authored
-
Anton Khirnov authored
-
Derek Buitenhuis authored
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Derek Buitenhuis authored
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Derek Buitenhuis authored
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Derek Buitenhuis authored
Framerate is now a sane rational instead of an integer, and inputDepth is changed to what it actually is.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Anton Khirnov authored
-
Anton Khirnov authored
-
Anton Khirnov authored
-
Anton Khirnov authored
-
- 23 Feb, 2014 7 commits
-
-
James Almer authored
Based on x264 code.

Signed-off-by: James Almer <jamrial@gmail.com>
-
James Almer authored
Based on x264 code.

Signed-off-by: James Almer <jamrial@gmail.com>
-
James Almer authored
Signed-off-by: James Almer <jamrial@gmail.com>
-
Janne Grunau authored
Moving cpunop from the HAVE_LIST to the ARCH_EXT_LIST_X86 has the side effect of enabling it. The semantics of the check have to be changed from enable if successful to disable if unsuccessful. This was missing in 2b0bb699 causing build errors with nasm.
-
Luca Barbato authored
-
Luca Barbato authored
-
Luca Barbato authored
beta_offset is pre-multiplied by 2.
-
- 22 Feb, 2014 1 commit
-
-
Anton Khirnov authored
-