- 02 Dec, 2017 1 commit
-
-
Martin Vignali authored
-
- 25 Nov, 2017 1 commit
-
-
Mikulas Patocka authored
The commit b7c16a3f ("x86: fft: Port to cpuflags") breaks the opus decoder in ffmpeg when compiling for 3dnow. The output is audible, but there's a lot of noise. The reason for the breakage is that the commit unintentionally changed the INTERL macro so that it is empty when compiling for 3dnow. This patch fixes it. Signed-off-by:
Mikulas Patocka <mikulas@twibright.com> Signed-off-by:
James Almer <jamrial@gmail.com>
-
- 23 Nov, 2017 1 commit
-
-
Martin Vignali authored
Speed seems to be similar, but the code is simpler.
-
- 21 Nov, 2017 10 commits
-
-
James Almer authored
Remove the broadcast instructions as well now that they are wide enough. Signed-off-by:
James Almer <jamrial@gmail.com>
-
James Almer authored
Signed-off-by:
James Almer <jamrial@gmail.com>
-
Martin Vignali authored
-
Martin Vignali authored
-
Martin Vignali authored
-
Martin Vignali authored
-
Martin Vignali authored
-
Martin Vignali authored
Use a better function separator and add a comment for the restore rgb planes10 declaration.
-
Martin Vignali authored
-
Martin Vignali authored
-
- 20 Nov, 2017 1 commit
-
-
James Almer authored
jpeg2000_ict_float_c: 2296.0
jpeg2000_ict_float_sse: 628.0
jpeg2000_ict_float_avx: 317.0
jpeg2000_ict_float_fma3: 262.0
Signed-off-by:
James Almer <jamrial@gmail.com>
-
- 14 Nov, 2017 1 commit
-
-
Michael Niedermayer authored
Fixes: out of array read
Fixes: 3516/attachment-311488.dat
Found-by: Insu Yun, Georgia Tech.
Tested-by: wuninsu@gmail.com
Signed-off-by:
Michael Niedermayer <michael@niedermayer.cc>
-
- 13 Nov, 2017 1 commit
-
-
Thomas Köppe authored
Variables used in inline assembly need to be marked with __attribute__((used)). Static constants already were, via the define of DECLARE_ASM_CONST. But DECLARE_ALIGNED does not add this attribute, and some of the variables defined with it are constants used only in inline assembly, which therefore appeared dead. This change adds a macro, DECLARE_ASM_ALIGNED, that marks variables as used. This change makes FFmpeg work with Clang's ThinLTO. Signed-off-by:
Michael Niedermayer <michael@niedermayer.cc>
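The idea described in this entry can be sketched in C as follows. This is a minimal illustration, not the actual FFmpeg definitions: the macro names `MY_DECLARE_ALIGNED` and `MY_DECLARE_ASM_ALIGNED`, the table `my_pw_4` and the helper functions are hypothetical, and a GCC- or Clang-compatible compiler is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro names; assumes a GCC- or Clang-compatible compiler. */
#define MY_DECLARE_ALIGNED(n, t, v)     t __attribute__((aligned(n))) v
#define MY_DECLARE_ASM_ALIGNED(n, t, v) t __attribute__((used)) __attribute__((aligned(n))) v

/* In real code a table like this would be referenced only from inline
 * assembly, so without "used" the optimizer could treat it as dead. */
MY_DECLARE_ASM_ALIGNED(16, static const uint16_t, my_pw_4)[8] = {
    4, 4, 4, 4, 4, 4, 4, 4
};

/* Returns 1 if the table starts on a 16-byte boundary, 0 otherwise. */
int my_table_is_aligned(void)
{
    return ((uintptr_t)my_pw_4 % 16) == 0;
}

/* Small self-check that doubles as a usage example. */
void my_asm_aligned_self_check(void)
{
    assert(my_table_is_aligned());
    assert(my_pw_4[0] == 4);
}
```

Here the table is also read from C so that the sketch can be checked; in the situation the entry describes, the only reference would be inside an asm statement, which is where the `used` attribute matters.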
-
- 06 Nov, 2017 2 commits
-
-
Martin Vignali authored
libavcodec/lossless_video_dsp: cosmetic. Add a better separator for each function to make the asm file easier to read.
-
Martin Vignali authored
-
- 30 Oct, 2017 1 commit
-
-
James Almer authored
Fixes build with old nasm/yasm. Signed-off-by:
James Almer <jamrial@gmail.com>
-
- 29 Oct, 2017 1 commit
-
-
Martin Vignali authored
-
- 05 Oct, 2017 1 commit
-
-
James Almer authored
Fixes assembling with old yasm.
-
- 04 Oct, 2017 2 commits
-
-
Michael Niedermayer authored
Add parentheses to the regsize define. Suggested-by:
Henrik Gramner <henrik@gramner.com> Signed-off-by:
Michael Niedermayer <michael@niedermayer.cc>
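Why parentheses around a macro body matter can be shown with a small C sketch. The actual regsize definition is not shown in this entry, so every name and value below (`MY_MMSIZE`, `MY_REGSIZE_BAD`, `MY_REGSIZE_GOOD`) is a hypothetical stand-in chosen only to demonstrate the precedence problem.

```c
#include <assert.h>

#define MY_MMSIZE 16                     /* hypothetical vector width in bytes */
#define MY_REGSIZE_BAD  MY_MMSIZE / 2    /* unparenthesized body */
#define MY_REGSIZE_GOOD (MY_MMSIZE / 2)  /* parenthesized body */

/* Expands to w % 16 / 2, which parses as (w % 16) / 2. */
int my_offset_bad(int w)  { return w % MY_REGSIZE_BAD; }

/* Expands to w % (16 / 2), which is the intended w % 8. */
int my_offset_good(int w) { return w % MY_REGSIZE_GOOD; }

/* Small self-check that doubles as a usage example. */
void my_regsize_self_check(void)
{
    assert(my_offset_bad(20) == 2);   /* (20 % 16) / 2 */
    assert(my_offset_good(20) == 4);  /* 20 % 8 */
}
```

The same input gives different results because the unparenthesized body is spliced into the surrounding expression before operator precedence is applied.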
-
Michael Niedermayer authored
Fixes out of array access
Fixes: crash-huf.avi
Regression since: 6b41b441
This could also be fixed by adding checks in the C code that calls the dsp
Found-by:
Zhibin Hu and 连一汉 <lianyihan@360.cn> Signed-off-by:
Michael Niedermayer <michael@niedermayer.cc>
-
- 03 Oct, 2017 1 commit
-
-
Martin Vignali authored
Also change the required alignment from 16 to 32 for several codecs. Signed-off-by:
James Almer <jamrial@gmail.com>
-
- 01 Oct, 2017 1 commit
-
-
Martin Vignali authored
Signed-off-by:
James Almer <jamrial@gmail.com>
-
- 19 Sep, 2017 1 commit
-
-
Henrik Gramner authored
Tested with "checkasm --test=exrdsp -bench"

Before:
reorder_pixels_c: 5187.8
reorder_pixels_sse2: 377.0
reorder_pixels_avx2: 331.3

After:
reorder_pixels_c: 5181.5
reorder_pixels_sse2: 377.0
reorder_pixels_avx2: 313.8

Signed-off-by:
James Almer <jamrial@gmail.com>
-
- 17 Sep, 2017 2 commits
-
-
James Almer authored
Make dst be the first parameter and src const. It's more in line with the rest of the codebase. Signed-off-by:
James Almer <jamrial@gmail.com>
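The parameter convention described in this entry (destination first, source declared const) might look like the C sketch below. The entry does not name the function it changes, so `my_reorder_pixels` and its byte-interleaving body are hypothetical and serve only to illustrate the signature style.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: destination first, read-only source second.
 * Interleaves the first and second halves of src into dst. */
void my_reorder_pixels(uint8_t *dst, const uint8_t *src, ptrdiff_t size)
{
    const uint8_t *lo = src;             /* first half of the input */
    const uint8_t *hi = src + size / 2;  /* second half of the input */

    for (ptrdiff_t i = 0; i < size / 2; i++) {
        dst[2 * i]     = lo[i];
        dst[2 * i + 1] = hi[i];
    }
}

/* Small self-check that doubles as a usage example. */
void my_reorder_self_check(void)
{
    const uint8_t src[6] = { 1, 2, 3, 4, 5, 6 };
    uint8_t dst[6] = { 0 };

    my_reorder_pixels(dst, src, 6);
    assert(dst[0] == 1 && dst[1] == 4 && dst[2] == 2);
    assert(dst[3] == 5 && dst[4] == 3 && dst[5] == 6);
}
```

Declaring the source as const lets the compiler reject accidental writes to the input and matches the ordering used by standard functions such as memcpy.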
-
Martin Vignali authored
Signed-off-by:
James Almer <jamrial@gmail.com>
-
- 21 Aug, 2017 1 commit
-
-
Michael Niedermayer authored
Adds a diff_pixels_unaligned()
Fixes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=872503
Signed-off-by:
Michael Niedermayer <michael@niedermayer.cc>
-
- 19 Aug, 2017 1 commit
-
-
Ivan Kalvachev authored
opus_pvq_search: Restore the proper use of the conditional define and simplify the function name suffix handling. Using a named define properly documents the code paths. It also avoids passing additional numbered arguments through multiple levels of macro templates. The suffix handling is done by concatenation, as in other asm functions, and avoids having two separate "cglobal" defines. Signed-off-by:
Ivan Kalvachev <ikalvachev@gmail.com>
-
- 18 Aug, 2017 4 commits
-
-
Rostislav Pehlivanov authored
This splits the asm function into exact and non-exact versions. The exact version is as fast or faster on newer CPUs (which EXTERNAL_AVX_FAST describes well), whilst the non-exact version is faster than the exact one on older CPUs. Also fixes yasm compilation, which doesn't accept the !cpuflags(avx) syntax. Signed-off-by:
Rostislav Pehlivanov <atomnuker@gmail.com>
-
Rostislav Pehlivanov authored
Makes the search produce results identical to the C version. Signed-off-by:
Rostislav Pehlivanov <atomnuker@gmail.com>
-
Rostislav Pehlivanov authored
There's no point in toggling it, even for debugging. It's just worse. Signed-off-by:
Rostislav Pehlivanov <atomnuker@gmail.com>
-
Ivan Kalvachev authored
An explanation of the workings and methods used by the Pyramid Vector Quantization Search function can be found in the following Work-In-Progress mail threads:
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212146.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212816.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213030.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213436.html
Signed-off-by:
Ivan Kalvachev <ikalvachev@gmail.com>
-
- 30 Jul, 2017 1 commit
-
-
Rostislav Pehlivanov authored
2.5ms frames:
Before (c): 2638 decicycles in postrotate, 2097040 runs, 112 skips
After (sse3): 1467 decicycles in postrotate, 2097083 runs, 69 skips
After (avx2): 1244 decicycles in postrotate, 2097085 runs, 67 skips

5ms frames:
Before (c): 4987 decicycles in postrotate, 1048371 runs, 205 skips
After (sse3): 2644 decicycles in postrotate, 1048509 runs, 67 skips
After (avx2): 2031 decicycles in postrotate, 1048523 runs, 53 skips

10ms frames:
Before (c): 9153 decicycles in postrotate, 523575 runs, 713 skips
After (sse3): 5110 decicycles in postrotate, 523726 runs, 562 skips
After (avx2): 3738 decicycles in postrotate, 524223 runs, 65 skips

20ms frames:
Before (c): 17857 decicycles in postrotate, 261866 runs, 278 skips
After (sse3): 10041 decicycles in postrotate, 261746 runs, 398 skips
After (avx2): 7050 decicycles in postrotate, 262116 runs, 28 skips

Improves total decoding performance for real world content by 9% with avx2.
Signed-off-by:
Rostislav Pehlivanov <atomnuker@gmail.com>
-
- 21 Jul, 2017 1 commit
-
-
Wan-Teh Chang authored
This file already has #include "idctdsp.h", which is resolved to the idctdsp.h header in the directory where this file resides by compilers. Two other files in this directory, libavcodec/x86/idctdsp_init.c and libavcodec/x86/xvididct_init.c, also rely on #include "idctdsp.h" working this way. Signed-off-by:
Wan-Teh Chang <wtc@google.com> Signed-off-by:
Michael Niedermayer <michael@niedermayer.cc>
-
- 05 Jul, 2017 4 commits
-
-
James Almer authored
This reverts commit 24bb7db4. After all, noise has to be sign extended, not zero extended, in tests other than checkasm. Fixes most aac tests broken by the now reverted commit.
-
James Almer authored
noise needs to be zero extended and it can be done implicitly as a side effect in a subsequent instruction. Signed-off-by:
James Almer <jamrial@gmail.com>
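The difference between the two extensions discussed in this entry and in the revert above it can be sketched in C. This is a generic illustration of widening a 32-bit value to 64 bits, not the code from either commit; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Widen to 64 bits while preserving the sign of the 32-bit value. */
int64_t my_sign_extend32(int32_t v)
{
    return (int64_t)v;
}

/* Widen to 64 bits while filling the upper 32 bits with zeros. */
int64_t my_zero_extend32(int32_t v)
{
    return (int64_t)(uint32_t)v;
}

/* Small self-check that doubles as a usage example. */
void my_extend_self_check(void)
{
    /* Non-negative inputs give the same result either way. */
    assert(my_sign_extend32(5) == my_zero_extend32(5));

    /* Negative inputs differ: -1 versus 0xFFFFFFFF. */
    assert(my_sign_extend32(-1) == -1);
    assert(my_zero_extend32(-1) == INT64_C(4294967295));
}
```

The two operations agree for every non-negative input and diverge only for negative ones, so a test that never exercises negative values cannot tell them apart.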
-
James Almer authored
Tested-by:
Michael Niedermayer <michael@niedermayer.cc> Signed-off-by:
James Almer <jamrial@gmail.com>
-
James Almer authored
Reviewed-by:
Paul B Mahol <onemda@gmail.com> Signed-off-by:
James Almer <jamrial@gmail.com>
-