    aarch64: vp8: Optimize vp8_idct_add_neon for aarch64 · 7e42d5f0
    Martin Storsjö authored
    The previous version was a pretty exact translation of the arm
    version. This version does do some unnecessary arithmetic (it does
    more operations on vectors that are only half filled; it does 4
    uaddw and 4 sqxtun instead of 2 of each), but it reduces the overhead
    of packing data together (which could be done for free in the arm
    version).
    
    This gives a decent speedup on Cortex A53 (about 15%), a minor speedup
    on A72 (about 4%) and a very minor slowdown on Cortex A73 (about 3%).
    
    Before:              Cortex A53    A72    A73
    vp8_idct_add_neon:         79.7   67.5   65.0
    After:
    vp8_idct_add_neon:         67.7   64.8   66.7
    Signed-off-by: Martin Storsjö <martin@martin.st>
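
The commit message names the `uaddw` and `sqxtun` instructions. As a rough illustration of what that pair computes per lane, here is a scalar C model of adding a 4x4 block of 16-bit residuals to 8-bit destination pixels with clamping. This is only a sketch: it is not the NEON code in `vp8dsp_neon.S`, the function names are hypothetical, and it assumes each sum fits in 16 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* One lane of UADDW: zero-extend the 8-bit pixel and add it to the 16-bit
 * residual. The result is kept in an int so the sketch cannot overflow;
 * the instruction itself wraps at 16 bits. */
static int uaddw_lane(int16_t residual, uint8_t pixel)
{
    return residual + pixel;
}

/* One lane of SQXTUN: treat the input as signed and saturate it to the
 * unsigned 8-bit range [0, 255]. */
static uint8_t sqxtun_lane(int value)
{
    if (value < 0)
        return 0;
    if (value > 255)
        return 255;
    return (uint8_t)value;
}

/* Hypothetical helper: add a 4x4 block of residuals to dst, clamping each
 * pixel. stride is the distance in bytes between rows of dst. */
static void add_residual_4x4(uint8_t *dst, ptrdiff_t stride,
                             const int16_t residual[16])
{
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            dst[y * stride + x] =
                sqxtun_lane(uaddw_lane(residual[y * 4 + x],
                                       dst[y * stride + x]));
}
```

In the vector code each instruction handles several lanes at once, which is why the message counts how many `uaddw` and `sqxtun` instructions are issued rather than how many pixels are processed.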
Name
Makefile
asm-offsets.h
cabac.h
dcadsp_init.c
dcadsp_neon.S
fft_init_aarch64.c
fft_neon.S
fmtconvert_init.c
fmtconvert_neon.S
h264chroma_init_aarch64.c
h264cmc_neon.S
h264dsp_init_aarch64.c
h264dsp_neon.S
h264idct_neon.S
h264pred_init.c
h264pred_neon.S
h264qpel_init_aarch64.c
h264qpel_neon.S
hpeldsp_init_aarch64.c
hpeldsp_neon.S
imdct15_init.c
imdct15_neon.S
mdct_init.c
mdct_neon.S
mpegaudiodsp_init.c
mpegaudiodsp_neon.S
neon.S
neontest.c
rv40dsp_init_aarch64.c
synth_filter_neon.S
vc1dsp_init_aarch64.c
videodsp.S
videodsp_init.c
vorbisdsp_init.c
vorbisdsp_neon.S
vp8dsp.h
vp8dsp_init_aarch64.c
vp8dsp_neon.S
vp9dsp_init_aarch64.c
vp9itxfm_neon.S
vp9lpf_neon.S
vp9mc_neon.S