• Ganesh Ajjanagadde's avatar
    avutil/mathematics: speed up av_gcd by using Stein's binary GCD algorithm · 971d12b7
    Ganesh Ajjanagadde authored
    This uses Stein's binary GCD algorithm:
    https://en.wikipedia.org/wiki/Binary_GCD_algorithm
    to get a roughly 4x speedup over Euclidean GCD on standard architectures
    with a compiler intrinsic for ctzll, and a roughly 2x speedup otherwise.
    At the moment, the compiler intrinsic is used on GCC and Clang due to
    its easy availability.
    
    Quick note regarding overflow: yes, subtractions on int64_t can, but the
    llabs takes care of that. The llabs is also guaranteed to be safe, with
    no annoying INT64_MIN business since INT64_MIN being a power of 2, is
    shifted down before being sent to llabs.
    
    The binary GCD needs ff_ctzll, an extension of ff_ctz for long long (int64_t). On
    GCC, this is provided by a built-in. On Microsoft, there is a
    BitScanForward64 analog of BitScanForward that should work; but I can't confirm.
    Apparently it is not available on 32 bit builds; so this may or may not
    work correctly. On Intel, per the documentation there is only an
    intrinsic for _bit_scan_forward and people have posted on forums
    regarding _bit_scan_forward64, but often their documentation is
    woeful. Again, I don't have it, so I can't test.
    
    As such, to be safe, for now only the GCC/Clang intrinsic is added, the rest
    use a compiled version based on the De-Bruijn method of Leiserson et al:
    http://supertech.csail.mit.edu/papers/debruijn.pdf.
    
    Tested with FATE, sample benchmark (x86-64, GCC 5.2.0, Haswell)
    with a START_TIMER and STOP_TIMER in libavutil/rationsl.c, followed by a
    make fate.
    
    aac-am00_88.err:
    builtin:
    714 decicycles in av_gcd,    4095 runs,      1 skips
    
    de-bruijn:
    1440 decicycles in av_gcd,    4096 runs,      0 skips
    
    previous:
    2889 decicycles in av_gcd,    4096 runs,      0 skips
    Signed-off-by: 's avatarGanesh Ajjanagadde <gajjanagadde@gmail.com>
    Signed-off-by: 's avatarMichael Niedermayer <michael@niedermayer.cc>
    971d12b7
Name
Last commit
Last update
..
aarch64 Loading commit data...
arm Loading commit data...
avr32 Loading commit data...
bfin Loading commit data...
mips Loading commit data...
ppc Loading commit data...
sh4 Loading commit data...
tomi Loading commit data...
x86 Loading commit data...
Makefile Loading commit data...
adler32.c Loading commit data...
adler32.h Loading commit data...
aes.c Loading commit data...
aes.h Loading commit data...
atomic.c Loading commit data...
atomic.h Loading commit data...
atomic_gcc.h Loading commit data...
atomic_suncc.h Loading commit data...
atomic_win32.h Loading commit data...
attributes.h Loading commit data...
audio_fifo.c Loading commit data...
audio_fifo.h Loading commit data...
avassert.h Loading commit data...
avstring.c Loading commit data...
avstring.h Loading commit data...
avutil.h Loading commit data...
avutilres.rc Loading commit data...
base64.c Loading commit data...
base64.h Loading commit data...
blowfish.c Loading commit data...
blowfish.h Loading commit data...
bprint.c Loading commit data...
bprint.h Loading commit data...
bswap.h Loading commit data...
buffer.c Loading commit data...
buffer.h Loading commit data...
buffer_internal.h Loading commit data...
camellia.c Loading commit data...
camellia.h Loading commit data...
cast5.c Loading commit data...
cast5.h Loading commit data...
channel_layout.c Loading commit data...
channel_layout.h Loading commit data...
color_utils.c Loading commit data...
color_utils.h Loading commit data...
colorspace.h Loading commit data...
common.h Loading commit data...
cpu.c Loading commit data...
cpu.h Loading commit data...
cpu_internal.h Loading commit data...
crc.c Loading commit data...
crc.h Loading commit data...
des.c Loading commit data...
des.h Loading commit data...
dict.c Loading commit data...
dict.h Loading commit data...
display.c Loading commit data...
display.h Loading commit data...
downmix_info.c Loading commit data...
downmix_info.h Loading commit data...
dynarray.h Loading commit data...
error.c Loading commit data...
error.h Loading commit data...
eval.c Loading commit data...
eval.h Loading commit data...
fifo.c Loading commit data...
fifo.h Loading commit data...
file.c Loading commit data...
file.h Loading commit data...
file_open.c Loading commit data...
fixed_dsp.c Loading commit data...
fixed_dsp.h Loading commit data...
float_dsp.c Loading commit data...
float_dsp.h Loading commit data...
frame.c Loading commit data...
frame.h Loading commit data...
hash.c Loading commit data...
hash.h Loading commit data...
hmac.c Loading commit data...
hmac.h Loading commit data...
imgutils.c Loading commit data...
imgutils.h Loading commit data...
integer.c Loading commit data...
integer.h Loading commit data...
internal.h Loading commit data...
intfloat.h Loading commit data...
intmath.c Loading commit data...
intmath.h Loading commit data...
intreadwrite.h Loading commit data...
lfg.c Loading commit data...
lfg.h Loading commit data...
libavutil.v Loading commit data...
libm.h Loading commit data...
lls.c Loading commit data...
lls.h Loading commit data...
log.c Loading commit data...
log.h Loading commit data...
log2_tab.c Loading commit data...
lzo.c Loading commit data...
lzo.h Loading commit data...
macros.h Loading commit data...
mathematics.c Loading commit data...
mathematics.h Loading commit data...
md5.c Loading commit data...
md5.h Loading commit data...
mem.c Loading commit data...
mem.h Loading commit data...
mem_internal.h Loading commit data...
motion_vector.h Loading commit data...
murmur3.c Loading commit data...
murmur3.h Loading commit data...
opencl.c Loading commit data...
opencl.h Loading commit data...
opencl_internal.c Loading commit data...
opencl_internal.h Loading commit data...
opt.c Loading commit data...
opt.h Loading commit data...
parseutils.c Loading commit data...
parseutils.h Loading commit data...
pca.c Loading commit data...
pca.h Loading commit data...
pixdesc.c Loading commit data...
pixdesc.h Loading commit data...
pixelutils.c Loading commit data...
pixelutils.h Loading commit data...
pixfmt.h Loading commit data...
qsort.h Loading commit data...
random_seed.c Loading commit data...
random_seed.h Loading commit data...
rational.c Loading commit data...
rational.h Loading commit data...
rc4.c Loading commit data...
rc4.h Loading commit data...
replaygain.h Loading commit data...
reverse.c Loading commit data...
ripemd.c Loading commit data...
ripemd.h Loading commit data...
samplefmt.c Loading commit data...
samplefmt.h Loading commit data...
sha.c Loading commit data...
sha.h Loading commit data...
sha512.c Loading commit data...
sha512.h Loading commit data...
softfloat.c Loading commit data...
softfloat.h Loading commit data...
softfloat_tables.h Loading commit data...
stereo3d.c Loading commit data...
stereo3d.h Loading commit data...
tea.c Loading commit data...
tea.h Loading commit data...
thread.h Loading commit data...
threadmessage.c Loading commit data...
threadmessage.h Loading commit data...
time.c Loading commit data...
time.h Loading commit data...
time_internal.h Loading commit data...
timecode.c Loading commit data...
timecode.h Loading commit data...
timer.h Loading commit data...
timestamp.h Loading commit data...
tree.c Loading commit data...
tree.h Loading commit data...
twofish.c Loading commit data...
twofish.h Loading commit data...
utf8.c Loading commit data...
utils.c Loading commit data...
version.h Loading commit data...
wchar_filename.h Loading commit data...
x86_cpu.h Loading commit data...
xga_font_data.c Loading commit data...
xga_font_data.h Loading commit data...
xtea.c Loading commit data...
xtea.h Loading commit data...