- 17 Jul, 2014 7 commits
-
-
Ben Avison authored
The previous implementation targeted DTS Coherent Acoustics, which only requires nbits == 4 (fft16()). This case was (and still is) linked directly rather than being indirected through ff_fft_calc_vfp(), but now the full range from radix-4 up to radix-65536 is available. This benefits other codecs such as AAC and AC3.

The implementation is based upon the C version, with each routine larger than radix-16 calling a hierarchy of smaller FFT functions, then performing a post-processing pass. This pass benefits a lot from loop unrolling to counter the long pipelines in the VFP. A relaxed calling standard also reduces the overhead of the call hierarchy, and avoiding the excessive inlining performed by GCC probably helps with I-cache utilisation too.

I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in the FFT routines (fft4() to fft512() and pass()) for the same sample AAC stream:

              Before          After
              Mean    StdDev  Mean    StdDev  Confidence  Change
Audio decode  2245.5  53.1    1599.6  43.8    100.0%      +40.4%
FFT routines  940.6   22.0    348.1   20.8    100.0%      +170.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
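As a reading aid for the Change column: the figures reported in this commit message appear consistent, within rounding, with computing Before / After - 1 from the mean sample counts. The message does not state the formula, so this is an assumption, and `change_percent` is a hypothetical helper written for illustration, not part of libav.

```c
/* Hypothetical helper, not part of libav.
 * Assumption: the "Change" column is the relative speed-up derived from the
 * mean sample counts, i.e. (before / after - 1), expressed as a percentage. */
double change_percent(double before_mean, double after_mean)
{
    return (before_mean / after_mean - 1.0) * 100.0;
}
```

With the "FFT routines" means from the commit message (940.6 before, 348.1 after) this gives roughly 170.2, matching the reported +170.2%.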
-
Ben Avison authored
The previous implementation targeted DTS Coherent Acoustics, which only requires mdct_bits == 6. This relatively small size lent itself to unrolling the loops a small number of times, and encoding offsets calculated at assembly time within the load/store instructions of each iteration.

In the more general case (codecs such as AAC and AC3) much larger arrays are used - mdct_bits == [8, 9, 11]. The old method does not scale for these cases, so more integer registers are used with non-unrolled versions of the loops (and with some stack spillage). The postrotation filter loop is still unrolled by a factor of 2 to permit the double-buffering of some VFP registers to facilitate overlap of neighbouring iterations.

I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same example AAC stream:

                  Before          After
                  Mean    StdDev  Mean    StdDev  Confidence  Change
aac_decode_frame  2368.1  35.8    2117.2  35.3    100.0%      +11.8%
ff_imdct_half_*   457.5   22.4    251.2   16.2    100.0%      +82.1%

Signed-off-by: Martin Storsjö <martin@martin.st>
-
Diego Biurrun authored
-
Martin Storsjö authored
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Diego Biurrun authored
This makes the init files match the structure of the dsputil split.
-
Nidhi Makhijani authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Luca Barbato authored
Signed-off-by: Diego Biurrun <diego@biurrun.de> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
-
- 16 Jul, 2014 3 commits
-
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Vittorio Giovara authored
-
- 14 Jul, 2014 1 commit
-
-
Martin Storsjö authored
This fixes running fate in configs where the samples are located in a different path on the target. Signed-off-by: Martin Storsjö <martin@martin.st>
-
- 13 Jul, 2014 2 commits
-
-
Diego Biurrun authored
The remaining dsputil bits are encoding-specific anyway.
-
Diego Biurrun authored
-
- 11 Jul, 2014 8 commits
-
-
Diego Biurrun authored
doc/examples/output.c:460:9: warning: unused variable ‘i’
-
Diego Biurrun authored
-
Luca Barbato authored
-
Luca Barbato authored
The specification requires at most 1 track enabled per alternate group.
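The constraint can be illustrated with a small check. This is a minimal sketch only: `TrackInfo` and `alternate_groups_valid` are hypothetical names, not libav's muxer structures, and treating group 0 as "not part of any alternate group" is an assumption taken from the ISO base media file format convention.

```c
#include <stddef.h>

/* Hypothetical, simplified track description (not libav's struct). */
typedef struct TrackInfo {
    int alternate_group; /* assumed: 0 means "not in any alternate group" */
    int enabled;         /* non-zero if the track is flagged as enabled */
} TrackInfo;

/* Returns 1 if every alternate group has at most one enabled track,
 * 0 if some group contains two or more enabled tracks. */
int alternate_groups_valid(const TrackInfo *tracks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!tracks[i].enabled || tracks[i].alternate_group == 0)
            continue;
        /* Look for a later enabled track in the same group. */
        for (size_t j = i + 1; j < n; j++) {
            if (tracks[j].enabled &&
                tracks[j].alternate_group == tracks[i].alternate_group)
                return 0;
        }
    }
    return 1;
}
```

The quadratic scan keeps the sketch short; it is adequate for the handful of tracks a typical file carries.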
-
Gildas Cocherel authored
Sample-Id: OPFLAG_B_Qualcomm_1.bit, OPFLAG_C_Qualcomm_1.bit Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
Mickaël Raulet authored
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
Mickaël Raulet authored
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
Anton Khirnov authored
-
- 10 Jul, 2014 2 commits
-
-
Nidhi Makhijani authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Alexander V. Lukyanov authored
AVFormatContext->priv_data is not always an MpegTSContext; it can be an RTSPState when decoding an RTP stream, so the MpegTSContext pointer must be passed explicitly. Within libav, the write_section_data function doesn't actually use the MpegTSContext at all, so this doesn't change anything at the moment (no memory was corrupted before), but it reduces the risk of anybody trying to touch the MpegTSContext via AVFormatContext->priv_data in the future. Signed-off-by: Martin Storsjö <martin@martin.st>
-
- 09 Jul, 2014 16 commits
-
-
Diego Biurrun authored
-
Vittorio Giovara authored
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
-
Vittorio Giovara authored
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
-
Vittorio Giovara authored
-
Andrew Kelley authored
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
-
Anton Khirnov authored
Use it for logging, instead of NULL or the stream codec context.
-
Anton Khirnov authored
Its contents are meaningful only if the stream codec context is the one actually used for encoding, which is often not the case (and is discouraged). Use AVCodecContext.field_order instead.
-
Anton Khirnov authored
-
Anton Khirnov authored
-
Anton Khirnov authored
-
Anton Khirnov authored
It is supposed to be set by decoders only.
-
Anton Khirnov authored
The only thing the demuxer needs is the sample rate to set the timebase, which can be simply read with AV_RB32.
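For readers unfamiliar with AV_RB32: it is libav's macro for reading a 32-bit big-endian value from a byte buffer. The sketch below shows an equivalent portable read. `read_be32` is a hypothetical stand-in written for illustration, not libav's implementation, and where the sample rate field sits in the header is not stated in the commit message, so the caller is assumed to pass a pointer to it.

```c
#include <stdint.h>

/* Hypothetical stand-in for illustration; libav itself provides AV_RB32.
 * Reads four bytes at p as an unsigned big-endian 32-bit integer,
 * independent of the host's byte order. */
uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |
            (uint32_t)p[3];
}
```

For example, the bytes 00 00 AC 44 decode to 44100, which a demuxer could then use as the denominator of a 1/sample_rate timebase.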
-
Anton Khirnov authored
-
Anton Khirnov authored
This is required by the new API.
-
Anton Khirnov authored
The callers should now set the stream timebase, not the codec one.
-
Anton Khirnov authored
Bug-Id: 55
-
- 08 Jul, 2014 1 commit
-
-
Martin Storsjö authored
This silences a warning with gcc. Signed-off-by: Martin Storsjö <martin@martin.st>
-