Commits · 478a4b7e6d3ec51ba80e77f6dc3df75d9f6de66b · Linshizhi / ffmpeg.wasm-core

17 Jul, 2014 1 commit

armv6: Accelerate ff_imdct_half for general case (mdct_bits != 6) · 5c22e8e4

Ben Avison authored 10 years ago

The previous implementation targeted DTS Coherent Acoustics, which only
requires mdct_bits == 6. This relatively small size lent itself to
unrolling the loops a small number of times, and encoding offsets
calculated at assembly time within the load/store instructions of each
iteration.

In the more general case (codecs such as AAC and AC3) much larger arrays
are used - mdct_bits == [8, 9, 11]. The old method does not scale for
these cases, so more integer registers are used with non-unrolled versions
of the loops (and with some stack spillage). The postrotation filter loop
is still unrolled by a factor of 2 to permit the double-buffering of some
VFP registers to facilitate overlap of neighbouring iterations.
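
To give a sense of the scale involved, the sketch below lists the transform sizes implied by each mdct_bits value mentioned above. It assumes the usual libavcodec convention that the full MDCT length is n = 1 << mdct_bits, that ff_imdct_half produces n/2 output samples, and that the inner FFT runs over n/4 complex points. The helper name imdct_sizes is hypothetical and is not part of the commit.

```python
def imdct_sizes(mdct_bits):
    """Return the sizes implied by mdct_bits (assumed convention: n = 1 << mdct_bits)."""
    n = 1 << mdct_bits
    return {
        "n": n,                 # full MDCT length
        "half_output": n // 2,  # samples written by a half IMDCT
        "fft_points": n // 4,   # complex points in the inner FFT
    }

# 6 is the DTS case; 8, 9 and 11 are the sizes the message cites for AAC and AC3.
for bits in (6, 8, 9, 11):
    print(bits, imdct_sizes(bits))
```

Under these assumptions the largest case (mdct_bits == 11) works on arrays 32 times larger than the DTS case, which is why offsets fixed at assembly time and full unrolling stop being practical.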

I benchmarked the result by measuring the number of gperftools samples
that hit anywhere in the AAC decoder (starting from aac_decode_frame())
or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same
example AAC stream:

                  Before           After
                  Mean    StdDev   Mean    StdDev  Confidence  Change
aac_decode_frame  2368.1  35.8     2117.2  35.3    100.0%      +11.8%
ff_imdct_half_*    457.5  22.4      251.2  16.2    100.0%      +82.1%

Signed-off-by: Martin Storsjö <martin@martin.st>
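
The Change column is consistent with a speedup computed as Before / After - 1 on the mean sample counts. The commit message does not state the formula, so this is an inferred reading, and the helper name change_percent is hypothetical. The check below recomputes the figures from the rounded means in the table, so small differences in the last digit are expected.

```python
def change_percent(before_mean, after_mean):
    """Speedup of 'after' relative to 'before', in percent (assumed formula)."""
    return (before_mean / after_mean - 1.0) * 100.0

# Mean sample counts copied from the table above.
rows = {
    "aac_decode_frame": (2368.1, 2117.2),
    "ff_imdct_half_*": (457.5, 251.2),
}

for name, (before, after) in rows.items():
    print(f"{name}: {change_percent(before, after):+.2f}%")
```

Fewer profiler samples after the change means less time spent in the function, so a positive value here indicates an improvement.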

13 Jul, 2014 1 commit

armv6: Accelerate ff_imdct_half for general case (mdct_bits != 6) · 42c1cc35

Ben Avison authored 10 years ago

22 Jul, 2013 4 commits

arm: Mangle external symbols properly in new vfp assembly files · 69e6702c
Martin Storsjö authored 11 years ago
```
Signed-off-by: Martin Storsjö <martin@martin.st>
```
arm: Mangle external symbols properly in new vfp assembly files · 47d57f24
Martin Storsjö authored 11 years ago
```
Reviewed-by: Kostya Shishkov
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
```

arm: Add VFP-accelerated version of fft16 · 8b9eba66

Martin Storsjö authored 11 years ago

                Before            After
                Mean     StdDev   Mean     StdDev  Change
This function    1389.3    4.2      967.8   35.1   +43.6%
Overall         15577.5   83.2    15400.0  336.4    +1.2%
Signed-off-by: Martin Storsjö <martin@martin.st>

arm: Add VFP-accelerated version of imdct_half · b63bb251

Martin Storsjö authored 11 years ago

               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   2653.0  28.5     1108.8  51.4   +139.3%
Overall        17049.5 408.2    15973.0 223.2     +6.7%
Signed-off-by: Martin Storsjö <martin@martin.st>