- 26 Mar, 2014 40 commits
-
Michael Niedermayer authored
* commit 'fcf5fc44': truehd: tune VLC decoding for ARM. Conflicts: libavcodec/mlpdec.c See: e555e1bc Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Ben Avison authored
Verified with profiling that this doesn't have a measurable effect upon overall performance. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit '483321fe': truehd: add hand-scheduled ARM asm version of ff_mlp_rematrix_channel. See: 89135716 Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Ben Avison authored
Profiling results for overall audio decode and the rematrix_channels function in particular are as follows:

              Before          After
              Mean   StdDev   Mean   StdDev  Confidence  Change
6:2 total     370.8  17.0     348.8  20.1     99.9%       +6.3%
6:2 function   46.4   8.4      45.8   6.6     18.0%       +1.2%  (insignificant)
8:2 total     343.2  19.0     339.1  15.4     54.7%       +1.2%  (insignificant)
8:2 function   38.9   3.9      40.2   6.9     52.4%       -3.2%  (insignificant)
6:6 total     658.4  15.7     604.6  20.8    100.0%       +8.9%
6:6 function  109.0   8.7      59.5   5.4    100.0%      +83.3%
8:8 total     896.2  24.5     766.4  17.6    100.0%      +16.9%
8:8 function  223.4  12.8      93.8   5.0    100.0%     +138.3%

The assembly version has also been tested with a fuzz tester to ensure that any combinations of inputs not exercised by my available test streams still generate mathematically identical results to the C version.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
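The Change column in the table above appears to be the relative speedup, before / after - 1, assuming the Mean columns are costs where lower is better (the commit message does not state the unit). A minimal sketch that recomputes it from the rounded means; the function name `speedup_percent` is illustrative and not part of FFmpeg:

```python
def speedup_percent(before_mean: float, after_mean: float) -> float:
    """Relative speedup in percent when the mean cost drops from before to after.

    Assumption: lower means are better, so a positive result is an improvement.
    """
    return (before_mean / after_mean - 1.0) * 100.0


# A few (before, after) mean pairs copied from the rematrix_channels table.
rows = {
    "6:2 total": (370.8, 348.8),
    "8:2 function": (38.9, 40.2),
    "6:6 function": (109.0, 59.5),
    "8:8 function": (223.4, 93.8),
}

for name, (before, after) in rows.items():
    print(f"{name}: {speedup_percent(before, after):+.1f}%")
```

Because the inputs here are the rounded means from the table, the recomputed values can differ from the published column in the last digit.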
-
Michael Niedermayer authored
* commit '4e5aa080': truehd: break out part of rematrix_channels into platform-specific callback. See: 3f4e73af Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Ben Avison authored
Verified with profiling that this doesn't have a measurable effect upon overall performance. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit '15a29c39': truehd: add hand-scheduled ARM asm version of mlp_filter_channel. Conflicts: libavcodec/arm/Makefile libavcodec/arm/mlpdsp_init_arm.c See: 87b128d5 Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Ben Avison authored
Profiling results for overall audio decode and the mlp_filter_channel(_arm) function in particular are as follows:

              Before          After
              Mean   StdDev   Mean   StdDev  Confidence  Change
6:2 total     380.4  22.0     370.8  17.0     87.4%       +2.6%  (insignificant)
6:2 function   60.7   7.2      36.6   8.1    100.0%      +65.8%
8:2 total     357.0  17.5     343.2  19.0     97.8%       +4.0%  (insignificant)
8:2 function   60.3   8.8      37.3   3.8    100.0%      +61.8%
6:6 total     717.2  23.2     658.4  15.7    100.0%       +8.9%
6:6 function  140.4  12.9      81.5   9.2    100.0%      +72.4%
8:8 total     981.9  16.2     896.2  24.5    100.0%       +9.6%
8:8 function  193.4  15.0     103.3  11.5    100.0%      +87.2%

Experiments with adding preload instructions to this function yielded no useful benefit, so these have not been included.

The assembly version has also been tested with a fuzz tester to ensure that any combinations of inputs not exercised by my available test streams still generate mathematically identical results to the C version.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
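The fuzz testing mentioned above is a form of differential testing: feed the same random inputs to a reference routine and to an optimized one, and require identical outputs. The sketch below illustrates the idea only; it is not the author's fuzz tester, and both filter functions are hypothetical stand-ins (a plain dot product), not the MLP filter itself.

```python
import random


def filter_reference(coeffs, history):
    """Straightforward dot product, standing in for the portable version."""
    acc = 0
    for c, h in zip(coeffs, history):
        acc += c * h
    return acc


def filter_optimized(coeffs, history):
    """Hypothetical reordered version that handles taps in pairs,
    the way a manually unrolled loop might."""
    acc = 0
    paired = len(coeffs) - len(coeffs) % 2
    for i in range(0, paired, 2):
        acc += coeffs[i] * history[i] + coeffs[i + 1] * history[i + 1]
    if len(coeffs) % 2:
        acc += coeffs[-1] * history[-1]
    return acc


def fuzz_compare(trials=1000, seed=0):
    """Return the first input pair on which the two versions disagree, or None."""
    rng = random.Random(seed)  # fixed seed keeps any failure reproducible
    for _ in range(trials):
        order = rng.randint(0, 8)  # includes the empty and odd-length cases
        coeffs = [rng.randint(-2**15, 2**15 - 1) for _ in range(order)]
        history = [rng.randint(-2**23, 2**23 - 1) for _ in range(order)]
        if filter_reference(coeffs, history) != filter_optimized(coeffs, history):
            return coeffs, history
    return None


print("mismatch:", fuzz_compare())
```

Returning the failing input rather than just a boolean makes a mismatch easy to turn into a regression test.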
-
Michael Niedermayer authored
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Ben Avison authored
Profiling on a Raspberry Pi revealed the best performance to correspond with VLC_BITS = 5. Results for overall audio decode and the get_vlc2 function in particular are as follows:

              Before          After
              Mean   StdDev   Mean   StdDev  Confidence  Change
6:2 total     348.8  20.1     339.6  15.1     88.8%       +2.7%  (insignificant)
6:2 function   38.1   8.1      26.4   4.1    100.0%      +44.5%
8:2 total     339.1  15.4     324.5  15.5     99.4%       +4.5%
8:2 function   33.8   7.0      27.3   5.6     99.7%      +23.6%
6:6 total     604.6  20.8     572.8  20.6    100.0%       +5.6%
6:6 function   95.8   8.4      68.9   8.2    100.0%      +39.1%
8:8 total     766.4  17.6     741.5  21.2    100.0%       +3.4%
8:8 function  106.0  11.4      86.1   9.9    100.0%      +23.1%

Signed-off-by: Martin Storsjö <martin@martin.st>
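The trade-off behind a setting like VLC_BITS can be illustrated with a toy table-driven prefix-code decoder: a first-level table with 2**bits entries resolves every code of at most `bits` bits in one lookup, while longer codes need extra work. This is a simplified sketch, not FFmpeg's get_vlc2; the code table, the function names, and the dictionary fallback for long codes are all assumptions made for illustration.

```python
def build_table(codes, bits):
    """Build a 2**bits first-level table from a prefix-free code.

    codes maps symbol -> bit string. Entries are (symbol, length) for codes
    that fit in `bits` bits, or None where a longer code starts.
    """
    table = [None] * (1 << bits)
    long_codes = {}
    for sym, code in codes.items():
        if len(code) <= bits:
            pad = bits - len(code)
            base = int(code, 2) << pad
            for fill in range(1 << pad):  # every suffix maps to this symbol
                table[base + fill] = (sym, len(code))
        else:
            long_codes[code] = sym
    return table, long_codes


def decode(bitstring, codes, bits):
    """Decode bitstring; return (symbols, number of slow-path lookups)."""
    table, long_codes = build_table(codes, bits)
    max_len = max(len(c) for c in codes.values())
    out, pos, slow = [], 0, 0
    while pos < len(bitstring):
        window = bitstring[pos:pos + bits].ljust(bits, "0")
        entry = table[int(window, 2)]
        if entry is not None and entry[1] <= len(bitstring) - pos:
            sym, length = entry
        else:
            slow += 1  # code is longer than the first-level table covers
            for length in range(bits + 1, max_len + 1):
                sym = long_codes.get(bitstring[pos:pos + length])
                if sym is not None:
                    break
            else:
                raise ValueError("invalid or truncated code")
        out.append(sym)
        pos += length
    return out, slow


codes = {"a": "0", "b": "10", "c": "110", "d": "1110", "e": "1111"}
stream = "".join(codes[s] for s in "abcde")
for bits in (2, 4):
    symbols, slow = decode(stream, codes, bits)
    print(bits, 1 << bits, "".join(symbols), slow)
```

A larger `bits` removes slow-path lookups at the cost of a bigger table, which is presumably why the best value had to be found by profiling on the target hardware.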
-
Ben Avison authored
Profiling results for overall audio decode and the rematrix_channels function in particular are as follows:

              Before          After
              Mean   StdDev   Mean   StdDev  Confidence  Change
6:2 total     370.8  17.0     348.8  20.1     99.9%       +6.3%
6:2 function   46.4   8.4      45.8   6.6     18.0%       +1.2%  (insignificant)
8:2 total     343.2  19.0     339.1  15.4     54.7%       +1.2%  (insignificant)
8:2 function   38.9   3.9      40.2   6.9     52.4%       -3.2%  (insignificant)
6:6 total     658.4  15.7     604.6  20.8    100.0%       +8.9%
6:6 function  109.0   8.7      59.5   5.4    100.0%      +83.3%
8:8 total     896.2  24.5     766.4  17.6    100.0%      +16.9%
8:8 function  223.4  12.8      93.8   5.0    100.0%     +138.3%

The assembly version has also been tested with a fuzz tester to ensure that any combinations of inputs not exercised by my available test streams still generate mathematically identical results to the C version.

Signed-off-by: Martin Storsjö <martin@martin.st>
-
Ben Avison authored
Verified with profiling that this doesn't have a measurable effect upon overall performance. Signed-off-by: Martin Storsjö <martin@martin.st>
-
Ben Avison authored
Profiling results for overall audio decode and the mlp_filter_channel(_arm) function in particular are as follows:

              Before          After
              Mean   StdDev   Mean   StdDev  Confidence  Change
6:2 total     380.4  22.0     370.8  17.0     87.4%       +2.6%  (insignificant)
6:2 function   60.7   7.2      36.6   8.1    100.0%      +65.8%
8:2 total     357.0  17.5     343.2  19.0     97.8%       +4.0%  (insignificant)
8:2 function   60.3   8.8      37.3   3.8    100.0%      +61.8%
6:6 total     717.2  23.2     658.4  15.7    100.0%       +8.9%
6:6 function  140.4  12.9      81.5   9.2    100.0%      +72.4%
8:8 total     981.9  16.2     896.2  24.5    100.0%       +9.6%
8:8 function  193.4  15.0     103.3  11.5    100.0%      +87.2%

Experiments with adding preload instructions to this function yielded no useful benefit, so these have not been included.

The assembly version has also been tested with a fuzz tester to ensure that any combinations of inputs not exercised by my available test streams still generate mathematically identical results to the C version.

Signed-off-by: Martin Storsjö <martin@martin.st>
-
Michael Niedermayer authored
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
wm4 authored
The memory allocation for f->diffs was freed multiple times in some corner cases. Simplify the code so that this doesn't happen. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
Fixes h264_mp4toannexb_bsf_failure.mkv Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* qatar/master: x86: hpeldsp: Keep all rnd_template instantiations in hpeldsp_init Conflicts: libavcodec/x86/rnd_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit '55d7f26e': hpeldsp_template: Move content to hpeldsp Conflicts: libavcodec/hpeldsp_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit '09d4389d': hpeldsp_template: Drop av_unused attribute from *_no_rnd_pixels16_8_c functions Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit '92ba9651': dsputil: Move draw_edges and clear_block* out of dsputil_template Conflicts: libavcodec/dsputil.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit 'da5be235': dsputil: Move RV40-specific bits into rv40dsp Conflicts: libavcodec/dsputil.c libavcodec/rv40dsp.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit '8011ac91': hpeldsp_template: Detemplatize the code Conflicts: libavcodec/hpeldsp_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit '2c01ad8b': dsputil_template: Detemplatize the code Conflicts: libavcodec/dsputil.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit 'aba70bb5': Add missing headers to make template files compile (more) standalone Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit 'e7373585': dsputil_template: Move bits that are used templatized into separate file Conflicts: libavcodec/dsputil_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit 'd3c3c166': dsputil: Move hpel_template #include out of dsputil_template Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
* commit 'd0aabeab': x86: h264_qpel: Fix typo in CALL_2X_PIXELS macro invocation See: c8246d37 Merged-by: Michael Niedermayer <michaelni@gmx.at>
-
Diego Biurrun authored
There is no point in having a separate file just for the instantiation that provides the public functions.
-
Diego Biurrun authored
There is no point in having this separate; it is not used as a template.
-
Diego Biurrun authored
-
Diego Biurrun authored
The functions are not used templatized.
-
Diego Biurrun authored
-
Diego Biurrun authored
The indirection makes no sense without multiple instantiation.
-
Diego Biurrun authored
The indirection makes no sense without multiple instantiation.
-
Diego Biurrun authored
-
Diego Biurrun authored
This allows detemplatizing the bits that are not instantiated twice.
-
Diego Biurrun authored
Multiple inclusion makes no sense as it is only used in the 8-bit case.
-
Diego Biurrun authored
This fixes FATE with mmxext CPUFLAGS set.
-
Michael Niedermayer authored
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-