    x86 dsputil: provide SSE2/SSSE3 versions of bswap_buf · 6b039003
    Christophe Gisquet authored
    While pshufb allows emulating bswap on XMM registers for SSSE3, more
    shuffling is needed for SSE2. Alignment is critical, so separate codepaths
    are provided for aligned and unaligned buffers.
    
    For the huffyuv sequence "angels_480-huffyuvcompress.avi":
    C (using bswap instruction): ~ 55k cycles
    SSE2:                        ~ 40k cycles
    SSSE3 using unaligned loads: ~ 35k cycles
    SSSE3 using aligned loads:   ~ 30k cycles
    Signed-off-by: Diego Biurrun <diego@biurrun.de>
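
The shuffling described in the commit message can be sketched with C intrinsics. This is only an illustration under stated assumptions, not the assembly from the commit: the function names `bswap32_buf_sketch` and `bswap32_scalar` are hypothetical, 32-bit elements are assumed, and unaligned loads are used throughout for simplicity, whereas the commit provides separate aligned and unaligned paths. The SSE2 path swaps the two 16-bit halves of each dword and then the two bytes of each 16-bit word; the SSSE3 path does the whole reversal with a single pshufb.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif
#if defined(__SSSE3__)
#include <tmmintrin.h>   /* SSSE3 intrinsics (pshufb) */
#endif

/* Portable fallback: reverse the four bytes of one 32-bit value. */
static uint32_t bswap32_scalar(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Hypothetical helper: byte-swap n 32-bit words from src into dst. */
static void bswap32_buf_sketch(uint32_t *dst, const uint32_t *src, size_t n)
{
    size_t i = 0;

#if defined(__SSSE3__)
    /* pshufb control mask: output byte k takes input byte mask[k],
     * i.e. bytes 3,2,1,0 / 7,6,5,4 / ... within each dword. */
    const __m128i mask = _mm_set_epi8(12, 13, 14, 15,  8,  9, 10, 11,
                                       4,  5,  6,  7,  0,  1,  2,  3);
    for (; i + 4 <= n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        v = _mm_shuffle_epi8(v, mask);
        _mm_storeu_si128((__m128i *)(dst + i), v);
    }
#elif defined(__SSE2__)
    for (; i + 4 <= n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        /* Swap the 16-bit halves of each dword (low, then high 64 bits). */
        v = _mm_shufflelo_epi16(v, 0xB1);
        v = _mm_shufflehi_epi16(v, 0xB1);
        /* Swap the two bytes inside every 16-bit word. */
        v = _mm_or_si128(_mm_slli_epi16(v, 8), _mm_srli_epi16(v, 8));
        _mm_storeu_si128((__m128i *)(dst + i), v);
    }
#endif

    /* Scalar tail, or the whole buffer when no SIMD path is compiled in. */
    for (; i < n; i++)
        dst[i] = bswap32_scalar(src[i]);
}
```

Calling `bswap32_buf_sketch(dst, src, n)` on a buffer containing `0x11223344` should yield `0x44332211` in the corresponding output slot. The SSE2 path needs three extra operations per vector compared with the single shuffle of the SSSE3 path, which is consistent with the ordering of the cycle counts quoted above.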
Name
doc
libavcodec
libavdevice
libavfilter
libavformat
libavutil
libpostproc
libswscale
presets
tests
tools
.gitignore
COPYING.GPLv2
COPYING.GPLv3
COPYING.LGPLv2.1
COPYING.LGPLv3
CREDITS
Changelog
Doxyfile
INSTALL
LICENSE
Makefile
README
RELEASE
avconv.c
avplay.c
avprobe.c
avserver.c
cmdutils.c
cmdutils.h
cmdutils_common_opts.h
common.mak
configure
library.mak
version.sh