Commits · a2ae381b5a6f50669bcbd37001c110567a61f446 · Linshizhi / ffmpeg.wasm-core

09 Dec, 2014 1 commit

arm: Use .data.rel.ro for const data with relocations · f963f803

Martin Storsjö authored 10 years ago
```
Signed-off-by: Martin Storsjö <martin@martin.st>
```
08 Dec, 2014 2 commits

arm: fft_vfp: Unify the behaviour in ff_fft_calc_vfp between arm/thumb · b280c620

Martin Storsjö authored 10 years ago

Don't include the function pointer table in the code segment
in arm mode.

This shouldn't have any significant performance effect. It does
end up as a few more instructions than before, for ARM, but
only at the entry to this function, not within the fft functions
themselves.
Signed-off-by: Martin Storsjö <martin@martin.st>


arm: fft_vfp: Add a missing "endconst" when building in thumb mode · ae815764
Martin Storsjö authored 10 years ago
```
Signed-off-by: Martin Storsjö <martin@martin.st>
```

17 Jul, 2014 1 commit

armv6: Accelerate ff_fft_calc for general case (nbits != 4) · 87552d54

Ben Avison authored 10 years ago

The previous implementation targeted DTS Coherent Acoustics, which only
requires nbits == 4 (fft16()). This case was (and still is) linked directly
rather than being indirected through ff_fft_calc_vfp(), but now the full
range from radix-4 up to radix-65536 is available. This benefits other codecs
such as AAC and AC3.

The implementation is based upon the C version, with each routine larger than
radix-16 calling a hierarchy of smaller FFT functions, then performing a
post-processing pass. This pass benefits a lot from loop unrolling to
counter the long pipelines in the VFP. A relaxed calling standard also
reduces the overhead of the call hierarchy, and avoiding the excessive
inlining performed by GCC probably helps with I-cache utilisation too.
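
The call hierarchy described above can be sketched as follows. This is a minimal illustration in Python, not the ARM VFP assembly or FFmpeg's C code. The names `fft_split_radix` and `combine_pass` are hypothetical, and the decomposition shown (one half-size transform plus two quarter-size transforms, i.e. a split-radix scheme) is an assumption about what "a hierarchy of smaller FFT functions" followed by "a post-processing pass" refers to.

```python
import cmath


def combine_pass(u, z, zp, n):
    """Post-processing pass: merge one n/2-point and two n/4-point results."""
    out = [0j] * n
    q = n // 4
    for k in range(q):
        # Twiddle factors w^k and w^(3k), with w = exp(-2*pi*i/n).
        a = cmath.exp(-2j * cmath.pi * k / n) * z[k]
        b = cmath.exp(-2j * cmath.pi * 3 * k / n) * zp[k]
        s = a + b
        d = -1j * (a - b)
        out[k] = u[k] + s
        out[k + 2 * q] = u[k] - s
        out[k + q] = u[k + q] + d
        out[k + 3 * q] = u[k + q] - d
    return out


def fft_split_radix(x):
    """Recursive split-radix FFT; len(x) must be a power of two."""
    n = len(x)
    assert n >= 1 and n & (n - 1) == 0, "length must be a power of two"
    if n == 1:
        return list(x)
    if n == 2:
        return [x[0] + x[1], x[0] - x[1]]
    # Hierarchy of smaller transforms: one half-size, two quarter-size.
    u = fft_split_radix(x[0::2])   # even-indexed samples, size n/2
    z = fft_split_radix(x[1::4])   # samples 4m+1, size n/4
    zp = fft_split_radix(x[3::4])  # samples 4m+3, size n/4
    return combine_pass(u, z, zp, n)
```

In this sketch the loop in `combine_pass` is the part that would correspond to the unrolled pass; a real implementation would precompute the twiddle factors rather than call `cmath.exp` in the loop.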

I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in the FFT routines (fft4() to fft512() and pass()) for the
same sample AAC stream:

              Before           After
              Mean    StdDev   Mean    StdDev  Confidence  Change
Audio decode  2245.5  53.1     1599.6  43.8    100.0%       +40.4%
FFT routines   940.6  22.0      348.1  20.8    100.0%      +170.2%
Signed-off-by: Martin Storsjö <martin@martin.st>
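
The Change column above is consistent with the ratio of the Before mean to the After mean, minus one. That reading is inferred from the numbers, not stated in the commit message, and the helper name `speedup_percent` is hypothetical. A short check under that assumption:

```python
def speedup_percent(before, after):
    """Relative gain when a sample count drops from `before` to `after`."""
    return (before / after - 1.0) * 100.0


# Mean sample counts taken from the table above: (before, after).
rows = {
    "Audio decode": (2245.5, 1599.6),
    "FFT routines": (940.6, 348.1),
}

for name, (before, after) in rows.items():
    print(f"{name}: {speedup_percent(before, after):+.1f}%")
```

This prints `+40.4%` for the audio decode row and `+170.2%` for the FFT routines row, matching the reported values.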


22 Jul, 2013 1 commit

arm: Add VFP-accelerated version of fft16 · 8b9eba66

Martin Storsjö authored 11 years ago

               Before            After
               Mean     StdDev   Mean     StdDev  Change
This function   1389.3    4.2      967.8   35.1   +43.6%
Overall        15577.5   83.2    15400.0  336.4    +1.2%
Signed-off-by: Martin Storsjö <martin@martin.st>
