- 01 Mar, 2017 12 commits
-
-
James Almer authored
Signed-off-by: James Almer <jamrial@gmail.com>
-
Ganesh Ajjanagadde authored
-
Diego Biurrun authored
libavutil uses pthreads in the buffer code (abstracted through a header).
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
None of them are specific to the YASM assembler.
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
Also drop stray duplicate OBJCC config.mak entry.
-
Diego Biurrun authored
This fixes several warnings of the sort "warning: label alone on a line without a colon might be in error".
-
Diego Biurrun authored
Previously, all link-time dependencies were added for all libraries, resulting in bogus link-time dependencies since not all dependencies are shared across libraries. Also, in some cases like libavutil, not all dependencies were taken into account, resulting in some cases of underlinking. To address this, machinery is added to track which dependency belongs to which library component; it is then used to determine the correct dependencies for each individual library.
-
Diego Biurrun authored
-
- 28 Feb, 2017 5 commits
-
-
Michael Niedermayer authored
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
-
Diego Biurrun authored
Leaving those variables in an undefined state allows them to be implicitly enabled when they are declared as weak dependencies of other components. In that case, the library check is not run and the required linker flags are not added, resulting in a failing build. Fixes linking when enabling libfreetype without libfontconfig.
-
Diego Biurrun authored
The codec used in those files is WMV3/WMV9, not WMV2/WMV8.
-
Luca Barbato authored
-
Ben Chang authored
The map is a sparse array and does not need an empty element to terminate it. The empty element is stored after the last one inserted in the list, overwriting whichever element was next with zeros.
Bug-Id: 1029
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
-
- 27 Feb, 2017 7 commits
-
-
Diego Biurrun authored
This allows dropping /dev/null as reference value when no output is generated.
-
Diego Biurrun authored
-
Luca Barbato authored
And use av_malloc_array.
-
Luca Barbato authored
-
Luca Barbato authored
-
Diego Biurrun authored
-
Diego Biurrun authored
This fixes the test with mmxext disabled because the current reference frame hashes correspond to the non-bitexact mmxext optimizations.
-
- 25 Feb, 2017 4 commits
-
-
Anton Khirnov authored
This error is treated specially by the API.
CC: libav-stable@libav.org
-
James Almer authored
The size field in the header/footer accounts for the entire APE tag structure except the 32 bytes of the header, for compatibility with APEv1.
Signed-off-by: James Almer <jamrial@gmail.com>
CC: libav-stable@libav.org
Signed-off-by: Anton Khirnov <anton@khirnov.net>
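As a rough illustration of the size accounting described above, the following C sketch derives the number of bytes a tag occupies from the size field. The helper and constant names are hypothetical and are not taken from the libavformat source; only the 32-byte header size and the rule itself come from the commit message.

```c
#include <stdint.h>

/* Size of an APE tag header in bytes, as stated in the commit message. */
#define APE_TAG_HEADER_BYTES 32

/*
 * Hypothetical helper: total bytes occupied by an APE tag.
 * size_field covers everything in the tag except the header.
 * has_header is nonzero when the tag carries a header.
 */
static int64_t ape_tag_total_bytes(uint32_t size_field, int has_header)
{
    int64_t total = size_field;

    if (has_header)
        total += APE_TAG_HEADER_BYTES;
    return total;
}
```

With a size field of 100, a tag with a header occupies 132 bytes, while a header-less (APEv1 style) tag occupies exactly 100.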
-
James Almer authored
According to the spec[1], a value of 0 means the footer is present and a value of 1 means it is absent, the exact opposite of the header presence flag, where 1 means present and 0 absent. The reason for this is compatibility with APEv1 tags, where there is no header, footer presence was mandatory for all files, and the flags field was a zeroed reserved field.
[1] http://wiki.hydrogenaud.io/index.php?title=Ape_Tags_Flags
Signed-off-by: James Almer <jamrial@gmail.com>
CC: libav-stable@libav.org
Signed-off-by: Anton Khirnov <anton@khirnov.net>
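The inverted sense of the two flags can be sketched in C as follows. The macro and function names are hypothetical, and the bit positions (31 for the header, 30 for the footer) are an assumption based on the flags page cited in the commit message, so they should be checked against the spec before being relied on.

```c
#include <stdint.h>

/* Assumed bit positions; names are illustrative only. */
#define APE_FLAG_CONTAINS_HEADER (1U << 31) /* 1 = header present */
#define APE_FLAG_LACKS_FOOTER    (1U << 30) /* 1 = footer absent  */

/* The header flag has the usual sense: a set bit means present. */
static int ape_has_header(uint32_t flags)
{
    return (flags & APE_FLAG_CONTAINS_HEADER) != 0;
}

/* The footer flag is inverted: a cleared bit means present. */
static int ape_has_footer(uint32_t flags)
{
    return (flags & APE_FLAG_LACKS_FOOTER) == 0;
}
```

A zeroed flags field, as found in APEv1 tags, therefore reads as "no header, footer present", which is what the compatibility argument requires.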
-
Anton Khirnov authored
Currently it incorrectly compares bits with bytes. Also, move the check right before where it's relevant, so that the correct number of remaining bits is used.
CC: libav-stable@libav.org
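The class of bug described here, comparing a bit count against a byte count, can be illustrated with a minimal C sketch. The function name and parameters are hypothetical and do not correspond to the code touched by this commit; the point is only that both sides of the comparison must use the same unit.

```c
#include <stdint.h>

/*
 * Hypothetical check: does a payload of payload_bytes fit into the
 * bits_left bits remaining in a bitstream reader?
 */
static int payload_fits(int64_t bits_left, int64_t payload_bytes)
{
    /* Convert bytes to bits first; comparing the raw values mixes units. */
    return payload_bytes * 8 <= bits_left;
}
```

For example, with 16 bits left and a 10-byte payload, a naive `payload_bytes <= bits_left` test would pass, while the unit-correct check rejects it.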
-
- 24 Feb, 2017 3 commits
-
-
John Stebbins authored
avio_skip() returns the file position, which can overflow an int.
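A minimal C sketch of the pattern behind this fix: a call that returns a 64-bit file position must have its result kept in a 64-bit variable. To stay self-contained, the sketch uses a hypothetical stand-in `fake_skip()` rather than the real libavformat call, and `skip_checked()` is likewise an illustrative name.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for a skip call that returns the new file position. */
static int64_t fake_skip(int64_t pos, int64_t offset)
{
    return pos + offset;
}

/* Returns 0 on success and -1 on error, keeping the position in 64 bits. */
static int skip_checked(int64_t pos, int64_t offset, int64_t *new_pos)
{
    /* Not int: positions in large files can exceed INT_MAX. */
    int64_t ret = fake_skip(pos, offset);

    if (ret < 0)
        return -1;
    *new_pos = ret;
    return 0;
}
```

Storing `ret` in an `int` instead would truncate any position past 2 GiB, which could turn a successful skip into an apparent error.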
-
John Stebbins authored
-
Diego Biurrun authored
-
- 23 Feb, 2017 9 commits
-
-
Martin Storsjö authored
This matches the order they are in the 16 bpp version. There they are in this order, to make sure we access them in the same order they are declared, easing loading only half of the coefficients at a time. This makes the 8 bpp version match the 16 bpp version better.
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
This matches the order they are in the 16 bpp version. There they are in this order, to make sure we access them in the same order they are declared, easing loading only half of the coefficients at a time. This makes the 8 bpp version match the 16 bpp version better.
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
All elements are used pairwise, except for the first one. Previously, the 16th element was unused. Move the unused element to the second slot, to make the later element pairs not split across registers. This simplifies loading only parts of the coefficients, reducing the difference to the 16 bpp version.
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
All elements are used pairwise, except for the first one. Previously, the 16th element was unused. Move the unused element to the second slot, to make the later element pairs not split across registers. This simplifies loading only parts of the coefficients, reducing the difference to the 16 bpp version.
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
The idct32x32 function actually pushed d8-d15 onto the stack even though it didn't clobber them; there are plenty of registers that can be used to allow keeping all the idct coefficients in registers without having to reload different subsets of them at different stages in the transform. After this, we still can skip pushing d12-d15.
Before:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8128.3
After:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8053.3
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
The idct32x32 function actually pushed q4-q7 onto the stack even though it didn't clobber them; there are plenty of registers that can be used to allow keeping all the idct coefficients in registers without having to reload different subsets of them at different stages in the transform. Since the idct16 core transform avoids clobbering q4-q7 (but clobbers q2-q3 instead, to avoid needing to back up and restore q4-q7 at all in the idct16 function), and the lanewise vmul needs a register in the q0-q3 range, we move the stored coefficients from q2-q3 into q4-q5 while doing idct16. While keeping these coefficients in registers, we still can skip pushing q7.
Before:                               Cortex A7       A8       A9      A53
vp9_inv_dct_dct_32x32_sub32_add_neon:   18553.8  17182.7  14303.3  12089.7
After:
vp9_inv_dct_dct_32x32_sub32_add_neon:   18470.3  16717.7  14173.6  11860.8
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
For this case, with 8 inputs but only changing 4 of them, we can fit all 16 input pixels into a q register, and still have enough temporary registers for doing the loop filter. The wd=8 filters would require too many temporary registers for processing all 16 pixels at once though.
Before:                            Cortex A7     A8     A9    A53
vp9_loop_filter_mix2_v_44_16_neon:     289.7  256.2  237.5  181.2
After:
vp9_loop_filter_mix2_v_44_16_neon:     221.2  150.5  177.7  138.0
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
This is one cycle faster in total, and three instructions fewer.
Before:
vp9_loop_filter_mix2_v_44_16_neon: 123.2
After:
vp9_loop_filter_mix2_v_44_16_neon: 122.2
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
The theoretical maximum value of E is 193, so we can just saturate the addition to 255.
Before:                       Cortex A7     A8     A9    A53  A53/AArch64
vp9_loop_filter_v_4_8_neon:       143.0  127.7  114.8   88.0         87.7
vp9_loop_filter_v_8_8_neon:       241.0  197.2  173.7  140.0        136.7
vp9_loop_filter_v_16_8_neon:      497.0  419.5  379.7  293.0        275.7
vp9_loop_filter_v_16_16_neon:     965.2  818.7  731.4  579.0        452.0
After:
vp9_loop_filter_v_4_8_neon:       136.0  125.7  112.6   84.0         83.0
vp9_loop_filter_v_8_8_neon:       234.0  195.5  171.5  136.0        133.7
vp9_loop_filter_v_16_8_neon:      490.0  417.5  377.7  289.0        271.0
vp9_loop_filter_v_16_16_neon:     951.2  814.7  732.3  571.0        446.7
Signed-off-by: Martin Storsjö <martin@martin.st>
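The reasoning that a saturating 8-bit addition is enough when E never exceeds 193 can be checked with a scalar C model. This is a sketch under assumptions: the commit message does not spell out the expression being compared, so the form `2*d0 + d1/2 <= E` (with `d0` and `d1` as absolute pixel differences) is assumed here, and all function names are hypothetical.

```c
#include <stdint.h>

/* Unsigned 8-bit saturating add, a scalar analogue of a SIMD saturating add. */
static uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;

    return sum > 255 ? 255 : (uint8_t)sum;
}

/* Assumed filter condition 2*d0 + d1/2 <= e, evaluated in 8 bits only. */
static int within_e_8bit(uint8_t d0, uint8_t d1, uint8_t e)
{
    uint8_t t = sat_add_u8(d0, d0);      /* 2*d0, clamped to 255 */

    t = sat_add_u8(t, (uint8_t)(d1 >> 1)); /* + d1/2, clamped to 255 */
    return t <= e;
}

/* Reference version using wider arithmetic that cannot overflow. */
static int within_e_wide(uint8_t d0, uint8_t d1, uint8_t e)
{
    return 2 * d0 + (d1 >> 1) <= e;
}
```

Whenever the exact sum exceeds 255, the 8-bit version clamps to 255, which is still greater than any E up to 193, so both versions reach the same decision without widening to 16 bits.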
-