- 21 Jul, 2014 7 commits
-
-
Katerina Barone-Adesi authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Nidhi Makhijani authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Diego Biurrun authored
-
Janne Grunau authored
This reverts commit b31d76e4 as it uses an unknown pixel format.
-
- 20 Jul, 2014 5 commits
-
-
Vittorio Giovara authored
-
Ronald S. Bultje authored
Such files can be created using the --bff x264 option.
Sample-Id: h264_direct_temporal_mvs_bff.mkv
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
-
Carl Eugen Hoyos authored
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
-
Diego Biurrun authored
-
Nidhi Makhijani authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
- 19 Jul, 2014 3 commits
-
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
-
- 18 Jul, 2014 13 commits
-
-
Nidhi Makhijani authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Nidhi Makhijani authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Diego Biurrun authored
-
Diego Biurrun authored
Also rename the enum values to be consistent with other DCT permutations.
-
Diego Biurrun authored
Anonymous structs can cause trouble in header files, so try to avoid them altogether as a matter of good style.
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Martin Storsjö authored
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
Signed-off-by: Martin Storsjö <martin@martin.st>
-
- 17 Jul, 2014 9 commits
-
-
Ben Avison authored
I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in butterflies_float_c() / ff_butterflies_float_vfp() for the same sample AAC stream:

                     Before           After
                     Mean     StdDev  Mean     StdDev  Confidence  Change
Audio decode         1542.8   43.7    1470.5   41.5    100.0%      +4.9%
butterflies_float    130.0    11.9    70.2     12.1    100.0%      +85.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
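The Change column in these benchmark tables is consistent with the ratio of the mean sample counts, before / after - 1, expressed as a percentage. That formula is inferred from the published numbers rather than stated in the commit message, and the helper name `speedup_percent` is hypothetical. A minimal sketch under that assumption:

```python
def speedup_percent(before_mean: float, after_mean: float) -> float:
    """Relative speedup in percent.

    Assumes Change = before / after - 1, which is inferred from the
    table values and not stated in the commit message.
    """
    return (before_mean / after_mean - 1.0) * 100.0


# Mean gperftools sample counts taken from the table above.
print(f"Audio decode:      {speedup_percent(1542.8, 1470.5):+.1f}%")
print(f"butterflies_float: {speedup_percent(130.0, 70.2):+.1f}%")
```

Because the published means are already rounded to one decimal place, recomputing the Change value from them may differ from the printed figure in the last digit.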
-
Ben Avison authored
I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in vector_fmul_window_c() / ff_vector_fmul_window_vfp() for the same sample AAC stream:

                     Before           After
                     Mean     StdDev  Mean     StdDev  Confidence  Change
Audio decode         1598.2   47.4    1529.2   25.4    100.0%      +4.5%
vector_fmul_window   244.0    22.1    188.9    22.3    100.0%      +29.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
-
Ben Avison authored
The previous implementation targeted DTS Coherent Acoustics, which only requires nbits == 4 (fft16()). This case was (and still is) linked directly rather than being indirected through ff_fft_calc_vfp(), but now the full range from radix-4 up to radix-65536 is available. This benefits other codecs such as AAC and AC3.

The implementation is based upon the C version, with each routine larger than radix-16 calling a hierarchy of smaller FFT functions, then performing a post-processing pass. This pass benefits a lot from loop unrolling to counter the long pipelines in the VFP. A relaxed calling standard also reduces the overhead of the call hierarchy, and avoiding the excessive inlining performed by GCC probably helps with I-cache utilisation too.

I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in the FFT routines (fft4() to fft512() and pass()) for the same sample AAC stream:

                     Before           After
                     Mean     StdDev  Mean     StdDev  Confidence  Change
Audio decode         2245.5   53.1    1599.6   43.8    100.0%      +40.4%
FFT routines         940.6    22.0    348.1    20.8    100.0%      +170.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
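The call hierarchy described in this commit (each larger transform calling smaller FFT functions, then running a post-processing pass) can be illustrated with a split-radix recursion. This is a minimal Python sketch, not the VFP assembly or the C code it is based on. The names `split_radix_fft` and `combine_pass` are hypothetical, and the choice of a split-radix decomposition is an assumption made for illustration.

```python
import cmath


def combine_pass(u, z1, z3):
    """Post-processing pass: merge one half-size and two quarter-size FFTs.

    u  is the FFT of the even-indexed samples (length n/2),
    z1 is the FFT of samples 1, 5, 9, ...     (length n/4),
    z3 is the FFT of samples 3, 7, 11, ...    (length n/4).
    """
    quarter = len(z1)
    n = 4 * quarter
    out = [0j] * n
    for k in range(quarter):
        # Twiddle factors w^k and w^(3k), with w = exp(-2*pi*i/n).
        a = cmath.exp(-2j * cmath.pi * k / n) * z1[k]
        b = cmath.exp(-2j * cmath.pi * 3 * k / n) * z3[k]
        out[k] = u[k] + (a + b)
        out[k + 2 * quarter] = u[k] - (a + b)
        out[k + quarter] = u[k + quarter] - 1j * (a - b)
        out[k + 3 * quarter] = u[k + quarter] + 1j * (a - b)
    return out


def split_radix_fft(x):
    """Forward FFT of a sequence whose length is a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n == 2:
        return [x[0] + x[1], x[0] - x[1]]
    # A size-n transform calls one size-n/2 and two size-n/4 transforms,
    # then finishes with a single combining pass.
    u = split_radix_fft(x[0::2])
    z1 = split_radix_fft(x[1::4])
    z3 = split_radix_fft(x[3::4])
    return combine_pass(u, z1, z3)
```

In this sketch the recursion bottoms out at sizes 1 and 2 for brevity, whereas the commit describes dedicated routines up to radix-16 below the hierarchical ones. The combining loop is where the commit message says loop unrolling pays off.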
-
Ben Avison authored
The previous implementation targeted DTS Coherent Acoustics, which only requires mdct_bits == 6. This relatively small size lent itself to unrolling the loops a small number of times, and encoding offsets calculated at assembly time within the load/store instructions of each iteration.

In the more general case (codecs such as AAC and AC3) much larger arrays are used - mdct_bits == [8, 9, 11]. The old method does not scale for these cases, so more integer registers are used with non-unrolled versions of the loops (and with some stack spillage). The postrotation filter loop is still unrolled by a factor of 2 to permit the double-buffering of some VFP registers to facilitate overlap of neighbouring iterations.

I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same example AAC stream:

                     Before           After
                     Mean     StdDev  Mean     StdDev  Confidence  Change
aac_decode_frame     2368.1   35.8    2117.2   35.3    100.0%      +11.8%
ff_imdct_half_*      457.5    22.4    251.2    16.2    100.0%      +82.1%

Signed-off-by: Martin Storsjö <martin@martin.st>
-
Diego Biurrun authored
-
Martin Storsjö authored
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Diego Biurrun authored
This makes the init files match the structure of the dsputil split.
-
Nidhi Makhijani authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Luca Barbato authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
-
- 16 Jul, 2014 3 commits
-
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Vittorio Giovara authored
-