    vp9: add 32x32 idct AVX2 implementation. · 726501a3
    Ronald S. Bultje authored
    About 1.8x speedup compared to AVX version for full IDCT. Other
    sub-IDCT scenarios also see speedups. Full --bench output for
    idct_32x32_add_${bpp}_${subidct}_${opt} (50k cycles):
    
    nop: 16.5
    vp9_inv_dct_dct_32x32_add_8_1_c: 2284.4
    vp9_inv_dct_dct_32x32_add_8_1_sse2: 145.0
    vp9_inv_dct_dct_32x32_add_8_1_ssse3: 137.4
    vp9_inv_dct_dct_32x32_add_8_1_avx: 137.1
    vp9_inv_dct_dct_32x32_add_8_1_avx2: 73.2
    vp9_inv_dct_dct_32x32_add_8_2_c: 14680.8
    vp9_inv_dct_dct_32x32_add_8_2_sse2: 2617.2
    vp9_inv_dct_dct_32x32_add_8_2_ssse3: 982.9
    vp9_inv_dct_dct_32x32_add_8_2_avx: 958.5
    vp9_inv_dct_dct_32x32_add_8_2_avx2: 704.2
    vp9_inv_dct_dct_32x32_add_8_4_c: 14443.1
    vp9_inv_dct_dct_32x32_add_8_4_sse2: 2717.1
    vp9_inv_dct_dct_32x32_add_8_4_ssse3: 965.7
    vp9_inv_dct_dct_32x32_add_8_4_avx: 1000.7
    vp9_inv_dct_dct_32x32_add_8_4_avx2: 717.1
    vp9_inv_dct_dct_32x32_add_8_8_c: 14436.4
    vp9_inv_dct_dct_32x32_add_8_8_sse2: 2671.8
    vp9_inv_dct_dct_32x32_add_8_8_ssse3: 1038.5
    vp9_inv_dct_dct_32x32_add_8_8_avx: 983.0
    vp9_inv_dct_dct_32x32_add_8_8_avx2: 729.4
    vp9_inv_dct_dct_32x32_add_8_16_c: 14614.7
    vp9_inv_dct_dct_32x32_add_8_16_sse2: 2701.7
    vp9_inv_dct_dct_32x32_add_8_16_ssse3: 1334.4
    vp9_inv_dct_dct_32x32_add_8_16_avx: 1276.7
    vp9_inv_dct_dct_32x32_add_8_16_avx2: 719.5
    vp9_inv_dct_dct_32x32_add_8_32_c: 14363.6
    vp9_inv_dct_dct_32x32_add_8_32_sse2: 2575.6
    vp9_inv_dct_dct_32x32_add_8_32_ssse3: 2633.9
    vp9_inv_dct_dct_32x32_add_8_32_avx: 2539.6
    vp9_inv_dct_dct_32x32_add_8_32_avx2: 1395.0
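The "about 1.8x" figure can be checked against the numbers above. The following is a minimal sketch, not part of the commit: it copies the AVX and AVX2 cycle counts from the benchmark output and computes the raw ratio for each `${subidct}` value, without subtracting the `nop` timing overhead. The names `BENCH` and `avx2_speedup` are hypothetical and chosen only for illustration.

```python
# Cycle counts copied from the --bench output above.
# Keys are the ${subidct} values; each tuple is (avx_cycles, avx2_cycles).
BENCH = {
    1: (137.1, 73.2),
    2: (958.5, 704.2),
    4: (1000.7, 717.1),
    8: (983.0, 729.4),
    16: (1276.7, 719.5),
    32: (2539.6, 1395.0),
}


def avx2_speedup(bench):
    """Return {subidct: avx_cycles / avx2_cycles} (hypothetical helper)."""
    return {n: avx / avx2 for n, (avx, avx2) in bench.items()}


speedups = avx2_speedup(BENCH)
for n, ratio in speedups.items():
    print(f"subidct {n:>2}: {ratio:.2f}x")
```

For the full IDCT (`${subidct}` = 32) the ratio is 2539.6 / 1395.0, which rounds to 1.82 and matches the figure quoted in the commit message.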
Name
compat
doc
libavcodec
libavdevice
libavfilter
libavformat
libavresample
libavutil
libpostproc
libswresample
libswscale
presets
tests
tools
.gitattributes
.gitignore
.travis.yml
COPYING.GPLv2
COPYING.GPLv3
COPYING.LGPLv2.1
COPYING.LGPLv3
CREDITS
Changelog
INSTALL.md
LICENSE.md
MAINTAINERS
Makefile
README.md
RELEASE
arch.mak
cmdutils.c
cmdutils.h
cmdutils_common_opts.h
cmdutils_opencl.c
common.mak
configure
ffmpeg.c
ffmpeg.h
ffmpeg_cuvid.c
ffmpeg_dxva2.c
ffmpeg_filter.c
ffmpeg_opt.c
ffmpeg_qsv.c
ffmpeg_vaapi.c
ffmpeg_vdpau.c
ffmpeg_videotoolbox.c
ffplay.c
ffprobe.c
ffserver.c
ffserver_config.c
ffserver_config.h
library.mak
version.sh