    swscale/aarch64: use multiply accumulate and increase vector factor to 4 · bd831912
    Sebastian Pop authored
    This patch implements ff_hscale_8_to_15_neon with NEON fused multiply-accumulate
    instructions and raises the vectorization factor from 2 to 4.
    The speedup is 25% on Graviton1 A1 instances based on Cortex-A72 CPUs:
    
    $ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
    before: t:0.040303 avg:0.040287 max:0.040371 min:0.039214
    after:  t:0.032168 avg:0.032215 max:0.033081 min:0.032146
    
    The speedup is 39% on Graviton2 m6g instances based on Neoverse N1 CPUs:

    $ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
    before: t:0.019446 avg:0.019423 max:0.019493 min:0.019181
    after:  t:0.014015 avg:0.014096 max:0.015018 min:0.013971
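
The percentages quoted above appear to be throughput ratios (time before divided by time after, minus one) rather than reductions in run time. The following is a minimal sketch of that arithmetic, assuming the `t` column of the bench output is the value being compared; the function names are hypothetical and are not part of FFmpeg.

```c
#include <assert.h>

/* Hypothetical helper: speedup as a throughput gain, in percent.
 * before_s and after_s are per-frame times in seconds. */
double speedup_percent(double before_s, double after_s)
{
    assert(after_s > 0.0);
    return (before_s / after_s - 1.0) * 100.0;
}

/* Hypothetical helper: the same change expressed as run time saved, in percent. */
double time_saved_percent(double before_s, double after_s)
{
    assert(before_s > 0.0);
    return (1.0 - after_s / before_s) * 100.0;
}
```

With the `t` values above, `speedup_percent(0.040303, 0.032168)` rounds to 25 and `speedup_percent(0.019446, 0.014015)` rounds to 39, while the run time saved on Graviton1 is closer to 20%.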
    
    Tested with `make check` on aarch64-linux.
    Signed-off-by: Sebastian Pop <spop@amazon.com>
    Reviewed-by: Jean-Baptiste Kempf <jb@videolan.org>
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
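
To illustrate what the commit message means by multiply accumulate and a vectorization factor of 4, here is a scalar C model of a horizontal 8-bit to 15-bit scaler. It is only a sketch and not the NEON assembly from the commit: the function names, the argument layout, and the shift-and-clip step are assumptions made for illustration. The second function computes four output samples per iteration, each with its own accumulator, which is the access pattern a 4-wide vector implementation would exploit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed output stage: drop 7 fractional bits and clip to 15 bits. */
static int16_t clip15(int32_t v)
{
    v >>= 7;
    return (int16_t)(v > 32767 ? 32767 : v);
}

/* Reference: one output sample per iteration.
 * filter holds filter_size coefficients per output sample;
 * filter_pos[i] is the first source index used for output i. */
void hscale_8_to_15_ref(int16_t *dst, int dst_w, const uint8_t *src,
                        const int16_t *filter, const int32_t *filter_pos,
                        int filter_size)
{
    for (int i = 0; i < dst_w; i++) {
        int32_t acc = 0;
        for (int j = 0; j < filter_size; j++)
            acc += (int32_t)src[filter_pos[i] + j] * filter[filter_size * i + j];
        dst[i] = clip15(acc);
    }
}

/* Same result, but four output samples per iteration with four
 * independent accumulators (a scalar stand-in for 4-wide vector code). */
void hscale_8_to_15_x4(int16_t *dst, int dst_w, const uint8_t *src,
                       const int16_t *filter, const int32_t *filter_pos,
                       int filter_size)
{
    assert(filter_size > 0);
    int i = 0;
    for (; i + 4 <= dst_w; i += 4) {
        int32_t acc[4] = {0, 0, 0, 0};
        for (int j = 0; j < filter_size; j++)
            for (int k = 0; k < 4; k++)
                acc[k] += (int32_t)src[filter_pos[i + k] + j] *
                          filter[filter_size * (i + k) + j];
        for (int k = 0; k < 4; k++)
            dst[i + k] = clip15(acc[k]);
    }
    /* Tail: remaining samples when dst_w is not a multiple of 4. */
    for (; i < dst_w; i++) {
        int32_t acc = 0;
        for (int j = 0; j < filter_size; j++)
            acc += (int32_t)src[filter_pos[i] + j] * filter[filter_size * i + j];
        dst[i] = clip15(acc);
    }
}
```

Both functions should produce identical output for the same inputs; the four accumulators carry no dependency on each other, so a vector unit can update them in parallel.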
Name
aarch64
arm
ppc
tests
x86
Makefile
alphablend.c
bayer_template.c
gamma.c
hscale.c
hscale_fast_bilinear.c
input.c
libswscale.v
log2_tab.c
options.c
output.c
rgb2rgb.c
rgb2rgb.h
rgb2rgb_template.c
slice.c
swscale.c
swscale.h
swscale_internal.h
swscale_unscaled.c
swscaleres.rc
utils.c
version.h
vscale.c
yuv2rgb.c