    armv6: Accelerate ff_fft_calc for general case (nbits != 4) · 87552d54
    Ben Avison authored
    The previous implementation targeted DTS Coherent Acoustics, which only
    requires nbits == 4 (fft16()). This case was (and still is) linked directly
    rather than being indirected through ff_fft_calc_vfp(), but now the full
    range from radix-4 up to radix-65536 is available. This benefits other codecs
    such as AAC and AC3.
    
    The implementation is based upon the C version, with each routine larger than
    radix-16 calling a hierarchy of smaller FFT functions, then performing a
    post-processing pass. This pass benefits a lot from loop unrolling to
    counter the long pipelines in the VFP. A relaxed calling standard also
    reduces the overhead of the call hierarchy, and avoiding the excessive
    inlining performed by GCC probably helps with I-cache utilisation too.
    
    I benchmarked the result by measuring the number of gperftools samples that
    hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
    specifically in the FFT routines (fft4() to fft512() and pass()) for the
    same sample AAC stream:
    
                  Before          After
                  Mean   StdDev   Mean   StdDev  Confidence  Change
    Audio decode  2245.5 53.1     1599.6 43.8    100.0%      +40.4%
    FFT routines  940.6  22.0     348.1  20.8    100.0%      +170.2%
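
The Change column is consistent with treating the sample counts as a cost (fewer samples means less time spent) and reporting the speedup as before / after - 1. The sketch below reproduces that arithmetic; the helper name speedup_percent() is hypothetical, and the formula is an inference from the figures in the table rather than something the commit message states.

```c
/* Hypothetical helper: percentage speedup implied by a drop in profiler
 * sample counts, assuming samples are proportional to time spent. */
static double speedup_percent(double before, double after)
{
    return (before / after - 1.0) * 100.0;
}
```

With the means from the table, speedup_percent(2245.5, 1599.6) is approximately 40.4 and speedup_percent(940.6, 348.1) is approximately 170.2, matching the reported values.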

    Signed-off-by: Martin Storsjö <martin@martin.st>
Name
compat
doc
libavcodec
libavdevice
libavfilter
libavformat
libavresample
libavutil
libswscale
presets
tests
tools
.gitignore
COPYING.GPLv2
COPYING.GPLv3
COPYING.LGPLv2.1
COPYING.LGPLv3
CREDITS
Changelog
INSTALL
LICENSE
Makefile
README
RELEASE
arch.mak
avconv.c
avconv.h
avconv_dxva2.c
avconv_filter.c
avconv_opt.c
avconv_vda.c
avconv_vdpau.c
avplay.c
avprobe.c
cmdutils.c
cmdutils.h
cmdutils_common_opts.h
common.mak
configure
library.mak
version.sh