    arm64: port synth_filter_float_neon from arm · 705f5e5e
    Janne Grunau authored
    ~25% faster DTS decoding overall. The checkasm CPU cycle counts are
    of limited use because synth_filter_float() calls
    FFTContext.imdct_half(), so each measurement also includes the IMDCT.
    
                             cortex-a57   cortex-a53
    synth_filter_float_c:    1866.2       3490.9
    synth_filter_float_neon:  915.0       1531.5
    
    With fftc.imdct_half forced to imdct_half_neon:
                             cortex-a57   cortex-a53
    synth_filter_float_c:    1718.4       3025.3
    synth_filter_float_neon:  926.2       1530.1
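
The cycle counts above can be turned into speedup ratios with a few lines of C. This is only an illustrative sketch and not part of the commit: the names `speedup_ratio`, `print_speedups`, and `struct bench_row` are hypothetical, and the only inputs are the four C/NEON pairs quoted in the tables.

```c
#include <stdio.h>

/* Hypothetical helper: speedup of the NEON version over the C version,
 * computed as C cycles divided by NEON cycles. */
double speedup_ratio(double cycles_c, double cycles_neon)
{
    return cycles_c / cycles_neon;
}

/* One row per (configuration, core) pair from the commit message. */
struct bench_row {
    const char *label;
    double cycles_c;
    double cycles_neon;
};

/* Print the speedup for each measurement pair quoted in the tables. */
void print_speedups(void)
{
    static const struct bench_row rows[] = {
        { "default imdct_half, cortex-a57", 1866.2,  915.0 },
        { "default imdct_half, cortex-a53", 3490.9, 1531.5 },
        { "imdct_half_neon,    cortex-a57", 1718.4,  926.2 },
        { "imdct_half_neon,    cortex-a53", 3025.3, 1530.1 },
    };
    size_t i;

    for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
        printf("%s: %.2fx\n", rows[i].label,
               speedup_ratio(rows[i].cycles_c, rows[i].cycles_neon));
}
```

Calling `print_speedups()` from a `main()` gives ratios of roughly 2.0x and 2.3x in the default configuration, and roughly 1.9x and 2.0x once both versions use imdct_half_neon. The second pair is the closer estimate of what the NEON synth filter itself contributes, since the IMDCT cost is then the same on both sides.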
synth_filter.c 2.48 KB