    md5: optimize second round by using 4-operation form of G() · d0a34aee
    Giorgio Vazzana authored
    4-operation form is preferred over 3-operation because it breaks a long
    dependency chain, thus allowing a superscalar processor to execute more
    operations in parallel.
    The idea was taken from: http://www.zorinaq.com/papers/md5-amd64.html
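
As an illustration of the point above, here is a minimal sketch of the two forms of the MD5 round-2 function G(b, c, d). The names `md5_g3`, `md5_g4`, `rotl32` and `md5_round2_step` are hypothetical and are not taken from md5.c; the actual macro in the commit may be written differently. In the 3-operation form, every operation waits on the previous one, and `b` (the value produced by the preceding step) enters at the start of the chain. In the 4-operation form, `~d & c` does not depend on `b`, so it can be computed while `b` is still in flight.

```c
#include <stdint.h>

/* 3-operation form: xor, and, xor. Each operation consumes the result
 * of the previous one, so all three sit on the chain that starts at b. */
static inline uint32_t md5_g3(uint32_t b, uint32_t c, uint32_t d)
{
    return c ^ (d & (b ^ c));
}

/* 4-operation form: and, not, and, or. (~d & c) is independent of b,
 * so only (d & b) and the final or have to wait for b. */
static inline uint32_t md5_g4(uint32_t b, uint32_t c, uint32_t d)
{
    return (d & b) | (~d & c);
}

/* Rotate left; s must be in 1..31 (MD5 round 2 uses 5, 9, 14, 20). */
static inline uint32_t rotl32(uint32_t v, unsigned s)
{
    return (v << s) | (v >> (32 - s));
}

/* One round-2 step, a = b + rotl(a + G(b,c,d) + x + t, s), written with
 * the 4-operation form. x is the message word and t the step constant. */
static inline uint32_t md5_round2_step(uint32_t a, uint32_t b, uint32_t c,
                                       uint32_t d, uint32_t x, uint32_t t,
                                       unsigned s)
{
    a += x + t;          /* independent of b */
    a += ~d & c;         /* independent of b */
    a += d & b;          /* the two terms share no set bits, so + acts as | */
    return b + rotl32(a, s);
}
```

Both forms return the same value for every input; only the shape of the dependency graph differs, which is why the benefit depends on the processor being able to issue the independent operations in parallel.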
    
    AMD Athlon(tm) II X3 450 Processor, x86_64
    
    Before this change:
    $ for i in $(seq 1 4); do ./avutil_md5_test2; done
    size: 1048576  runs: 1024  time:    5.821 +- 0.019
    size: 1048576  runs: 1024  time:    5.822 +- 0.019
    size: 1048576  runs: 1024  time:    5.841 +- 0.018
    size: 1048576  runs: 1024  time:    5.821 +- 0.018
    
    After this change:
    $ for i in $(seq 1 4); do ./avutil_md5_test2; done
    size: 1048576  runs: 1024  time:    5.646 +- 0.019
    size: 1048576  runs: 1024  time:    5.646 +- 0.018
    size: 1048576  runs: 1024  time:    5.642 +- 0.019
    size: 1048576  runs: 1024  time:    5.641 +- 0.019
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
md5.c 6.52 KB