    rv40: x86 SIMD for biweight · e5c9de2a
    Christophe Gisquet authored
    Provide MMX, SSE2 and SSSE3 versions, with a fast path when the weights are
    multiples of 512 (which is often the case when the values round up nicely).
    
    *_TIMER report for the 16x16 and 8x8 cases:
    C:
    9015 decicycles in 16, 524257 runs, 31 skips
    2656 decicycles in 8, 524271 runs, 17 skips
    MMX:
    4156 decicycles in 16, 262090 runs, 54 skips
    1206 decicycles in 8, 262131 runs, 13 skips
    MMX with fast path:
    2760 decicycles in 16, 524222 runs, 66 skips
    995 decicycles in 8, 524252 runs, 36 skips
    SSE2:
    2163 decicycles in 16, 262131 runs, 13 skips
    832 decicycles in 8, 262137 runs, 7 skips
    SSE2 with fast path:
    1783 decicycles in 16, 524276 runs, 12 skips
    711 decicycles in 8, 524283 runs, 5 skips
    SSSE3:
    2117 decicycles in 16, 262136 runs, 8 skips
    814 decicycles in 8, 262143 runs, 1 skips
    SSSE3 with fast path:
    1315 decicycles in 16, 524285 runs, 3 skips
    578 decicycles in 8, 524286 runs, 2 skips
    
    This means around a 4% speedup for some sequences.
    Signed-off-by: Diego Biurrun <diego@biurrun.de>
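
The fast path mentioned in the commit message can be illustrated with a scalar sketch. This is not the code from the commit. It assumes the per-pixel formula has the form (((w2 * a) >> 9) + ((w1 * b) >> 9) + 0x10) >> 5 and that the two weights are non-negative and sum to 16384; all function names below are hypothetical. Under those assumptions, when both weights are multiples of 512 the shift by 9 can be applied to the weights once, up front, because (512 * k * a) >> 9 equals k * a exactly. The pre-shifted weights are at most 32, which is what makes narrower SIMD multiplies attractive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: true when both weights are multiples of 512,
 * i.e. when their low 9 bits are all zero. */
static int weights_allow_fast_path(int w1, int w2)
{
    return ((w1 | w2) & 0x1FF) == 0;
}

/* General path (assumed formula): each product is shifted right by 9
 * before the two terms are summed, rounded and scaled down by 32. */
static uint8_t biweight_general(uint8_t a, uint8_t b, int w1, int w2)
{
    return (uint8_t)((((w2 * a) >> 9) + ((w1 * b) >> 9) + 0x10) >> 5);
}

/* Fast path: shift the weights once, then use small multipliers.
 * Only valid when weights_allow_fast_path() holds. */
static uint8_t biweight_fast(uint8_t a, uint8_t b, int w1, int w2)
{
    assert(weights_allow_fast_path(w1, w2));
    int s1 = w1 >> 9;
    int s2 = w2 >> 9;
    return (uint8_t)((s2 * a + s1 * b + 0x10) >> 5);
}

/* Counts pixel/weight combinations where the two paths disagree,
 * over every weight pair that is a multiple of 512 and sums to 16384. */
static long count_mismatches(void)
{
    long mismatches = 0;
    for (int w1 = 0; w1 <= 16384; w1 += 512) {
        int w2 = 16384 - w1;
        for (int a = 0; a < 256; a++)
            for (int b = 0; b < 256; b++)
                if (biweight_general((uint8_t)a, (uint8_t)b, w1, w2) !=
                    biweight_fast((uint8_t)a, (uint8_t)b, w1, w2))
                    mismatches++;
    }
    return mismatches;
}
```

A caller would test weights_allow_fast_path() once per block and then pick one of the two routines for all pixels in that block, so the check costs nothing per pixel.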
Name
doc
libavcodec
libavdevice
libavfilter
libavformat
libavutil
libpostproc
libswscale
presets
tests
tools
.gitignore
COPYING.GPLv2
COPYING.GPLv3
COPYING.LGPLv2.1
COPYING.LGPLv3
CREDITS
Changelog
Doxyfile
INSTALL
LICENSE
Makefile
README
RELEASE
avconv.c
avplay.c
avprobe.c
avserver.c
cmdutils.c
cmdutils.h
cmdutils_common_opts.h
common.mak
configure
library.mak
version.sh