- 03 Nov, 2016 1 commit
-
-
Martin Storsjö authored
This makes it match the pattern already used for VP8 MC functions. This also makes the signature match ffmpeg's version of these functions, easing porting of code in both directions.
Signed-off-by: Martin Storsjö <martin@martin.st>
-
- 03 Aug, 2016 8 commits
-
-
Ronald S. Bultje authored
Also a slight change to the ssse3 code, which prevents a theoretical overflow in the sharp filter.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
James Almer authored
Roughly 25% faster MC than ssse3 for blocksizes 32 and 64.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
Clément Bœsch authored
Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
James Almer authored
pavgb is an SSE integer instruction, so the mmxext flag is enough.
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
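For readers unfamiliar with the instruction named in this commit, the following is a minimal scalar sketch of what pavgb computes on a 64-bit MMX register: a rounded-up average of each pair of unsigned bytes. The function names `avg_u8` and `pavgb_model` are hypothetical and chosen for illustration only; they do not appear in the codebase.

```c
#include <stdint.h>

/* Rounded-up average of two unsigned bytes: (a + b + 1) >> 1.
 * The sum is widened first so it cannot wrap at 255. */
static uint8_t avg_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* Hypothetical scalar model of pavgb on MMX-sized operands:
 * eight independent byte lanes, result written back to dst. */
static void pavgb_model(uint8_t dst[8], const uint8_t src[8])
{
    for (int i = 0; i < 8; i++)
        dst[i] = avg_u8(dst[i], src[i]);
}
```

Because the MMX-register form of this operation is part of the integer extensions covered by the mmxext flag, code that only uses it does not need to require a later instruction set.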
-
Clément Bœsch authored
Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
Ronald S. Bultje authored
Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
Anton Khirnov authored
It only contains the MC SIMD, other SIMD will go into different files.
-
Christophe Gisquet authored
Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
- 07 Dec, 2013 1 commit
-
-
Ronald S. Bultje authored
(And in future, loopfilter or intra pred could be put in their own respective files also.)
-
- 21 Nov, 2013 1 commit
-
-
Clément Bœsch authored
-
- 15 Nov, 2013 1 commit
-
-
Ronald S. Bultje authored
Originally written by Ronald S. Bultje <rsbultje@gmail.com> and Clément Bœsch <u@pkh.me>
Further contributions by:
Anton Khirnov <anton@khirnov.net>
Diego Biurrun <diego@biurrun.de>
Luca Barbato <lu_zero@gentoo.org>
Martin Storsjö <martin@martin.st>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
- 05 Nov, 2013 1 commit
-
-
Clément Bœsch authored
1789 decicycles in idct_idct_4x4_add_c, 262136 runs, 8 skips
1839 decicycles in idct_idct_4x4_add_c, 524270 runs, 18 skips
1864 decicycles in idct_idct_4x4_add_c, 1048548 runs, 28 skips
529 decicycles in ff_vp9_idct_idct_4x4_add_ssse3, 262138 runs, 6 skips
516 decicycles in ff_vp9_idct_idct_4x4_add_ssse3, 524282 runs, 6 skips
474 decicycles in ff_vp9_idct_idct_4x4_add_ssse3, 1048565 runs, 11 skips
(~3.9x faster)

7726 decicycles in idct_idct_8x8_add_c, 1048433 runs, 143 skips
7732 decicycles in idct_idct_8x8_add_c, 2096882 runs, 270 skips
7731 decicycles in idct_idct_8x8_add_c, 4193772 runs, 532 skips
1145 decicycles in ff_vp9_idct_idct_8x8_add_ssse3, 1048549 runs, 27 skips
1137 decicycles in ff_vp9_idct_idct_8x8_add_ssse3, 2097097 runs, 55 skips
1086 decicycles in ff_vp9_idct_idct_8x8_add_ssse3, 4194188 runs, 116 skips
(~7.1x faster)

Overall decode time before commit:
16.48s user 0.03s system 99% cpu 16.526 total
16.54s user 0.01s system 99% cpu 16.566 total
16.46s user 0.03s system 99% cpu 16.511 total

Overall decode time after commit:
16.34s user 0.02s system 99% cpu 16.378 total
16.28s user 0.02s system 99% cpu 16.315 total
16.32s user 0.03s system 99% cpu 16.366 total

Tested on i7 920 with 40s 1080p footage.
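The quoted speedups can be checked from the figures in this message. The sketch below assumes the last (longest) run of each group is the one the "~3.9x" and "~7.1x" figures were derived from, which the message does not state; the helper names `speedup` and `relative_saving` are hypothetical.

```c
/* Ratio of the C cost to the SIMD cost; both are decicycle counts. */
static double speedup(double c_decicycles, double simd_decicycles)
{
    return c_decicycles / simd_decicycles;
}

/* Fraction of wall-clock time saved, given totals in seconds. */
static double relative_saving(double before_s, double after_s)
{
    return (before_s - after_s) / before_s;
}
```

With the numbers above, `speedup(1864, 474)` is about 3.93 and `speedup(7731, 1086)` is about 7.12, while `relative_saving(16.526, 16.378)` is just under 1%. The gap between the per-function gain and the overall gain suggests these two IDCT sizes account for only a small share of total decode time on this sample.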
-
- 08 Oct, 2013 1 commit
-
-
Ronald S. Bultje authored
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 03 Oct, 2013 2 commits
-
-
Ronald S. Bultje authored
Decoding time of ped1080p.webm goes from 11.3sec to 11.1sec.
-
Ronald S. Bultje authored
Decoding time of ped1080p.webm goes from 20.7sec to 11.3sec.
-