    swresample/resample: improve bessel function accuracy and speed · a5202bc9
    Ganesh Ajjanagadde authored
    This improves the accuracy of the Bessel function at large arguments, which
    in turn should improve the quality of the Kaiser window. It also speeds up
    the Bessel function, and hence build_filter, by ~20%.
    Details are given below.
    
    Algorithm: taken from the Boost project, which has investigated the
    accuracy of its method in detail and compared it with e.g. the
    GNU Scientific Library (GSL):
    http://www.boost.org/doc/libs/1_52_0/libs/math/doc/sf_and_dist/html/math_toolkit/special/bessel/mbessel.html.
    Boost source code (also cited and licensed in the code):
    https://searchcode.com/codesearch/view/14918379/.
    
    Accuracy: sample values are given below. i0 denotes the old bessel code,
    i0_boost the approach here, and i0_real an arbitrary precision result
    (truncated) from Wolfram Alpha: type "bessel i0(6.0)" to reproduce.
    These are evaluation points that occur for the default kaiser_beta = 9.
    
    Some illustrations:
    bessel(8.0)
    i0      (8.000000) = 427.564115721804739678191254
    i0_boost(8.000000) = 427.564115721804796521610115
    i0_real (8.000000) = 427.564115721804785177396791
    
    bessel(6.0)
    i0      (6.000000) = 67.234406976477956163762428
    i0_boost(6.000000) = 67.234406976477970374617144
    i0_real (6.000000) = 67.234406976477975326188025
    
    Reason for accuracy: the main accuracy benefit comes at larger Bessel
    arguments, where the Taylor-Maclaurin method does poorly. Because the
    series is expanded about 0, large arguments need 23+ iterations, and
    these can accumulate significant floating point error.
    
    Benchmarks: Obtained on x86-64, Haswell, GNU/Linux via a loop calling
    build_filter 1000 times:
    test: fate-swr-resample-dblp-44100-2626
    
    new:
    995894468 decicycles in build_filter(loop 1000),     256 runs,      0 skips
    1029719302 decicycles in build_filter(loop 1000),     512 runs,      0 skips
    984101131 decicycles in build_filter(loop 1000),    1024 runs,      0 skips
    
    old:
    1250020763 decicycles in build_filter(loop 1000),     256 runs,      0 skips
    1246353282 decicycles in build_filter(loop 1000),     512 runs,      0 skips
    1220017565 decicycles in build_filter(loop 1000),    1024 runs,      0 skips
    
    A further ~5% may be squeezed out by enabling -ftree-vectorize. However,
    that is a separate issue from this patch.
    Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
    Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
resample.c 19.9 KB