• Ganesh Ajjanagadde's avatar
    avutil/mathematics: speed up av_gcd by using Stein's binary GCD algorithm · 971d12b7
    Ganesh Ajjanagadde authored
    This uses Stein's binary GCD algorithm:
    https://en.wikipedia.org/wiki/Binary_GCD_algorithm
    to get a roughly 4x speedup over Euclidean GCD on standard architectures
    with a compiler intrinsic for ctzll, and a roughly 2x speedup otherwise.
    At the moment, the compiler intrinsic is used on GCC and Clang due to
    its easy availability.
    
    Quick note regarding overflow: yes, subtractions on int64_t can, but the
    llabs takes care of that. The llabs is also guaranteed to be safe, with
    no annoying INT64_MIN business since INT64_MIN being a power of 2, is
    shifted down before being sent to llabs.
    
    The binary GCD needs ff_ctzll, an extension of ff_ctz for long long (int64_t). On
    GCC, this is provided by a built-in. On Microsoft, there is a
    BitScanForward64 analog of BitScanForward that should work; but I can't confirm.
    Apparently it is not available on 32 bit builds; so this may or may not
    work correctly. On Intel, per the documentation there is only an
    intrinsic for _bit_scan_forward and people have posted on forums
    regarding _bit_scan_forward64, but often their documentation is
    woeful. Again, I don't have it, so I can't test.
    
    As such, to be safe, for now only the GCC/Clang intrinsic is added, the rest
    use a compiled version based on the De-Bruijn method of Leiserson et al:
    http://supertech.csail.mit.edu/papers/debruijn.pdf.
    
    Tested with FATE, sample benchmark (x86-64, GCC 5.2.0, Haswell)
    with a START_TIMER and STOP_TIMER in libavutil/rationsl.c, followed by a
    make fate.
    
    aac-am00_88.err:
    builtin:
    714 decicycles in av_gcd,    4095 runs,      1 skips
    
    de-bruijn:
    1440 decicycles in av_gcd,    4096 runs,      0 skips
    
    previous:
    2889 decicycles in av_gcd,    4096 runs,      0 skips
    Signed-off-by: 's avatarGanesh Ajjanagadde <gajjanagadde@gmail.com>
    Signed-off-by: 's avatarMichael Niedermayer <michael@niedermayer.cc>
    971d12b7
mathematics.c 5.52 KB