libavcodec/arm/fmtconvert_init_arm.c · 8d918a98aa24134a043d578ef45bae363dbed9db · Linshizhi / ffmpeg.wasm-core

arm: add ff_int32_to_float_fmul_array8_neon · 90b1b935

Janne Grunau authored Dec 03, 2015

Quite a bit faster than int32_to_float_fmul_array8_c calling
ff_int32_to_float_fmul_scalar_neon through FmtConvertContext.
Number of cycles per int32_to_float_fmul_array8 call while decoding
padded.dts on exynos5422:

               before  after   change
cortex-a7:     1270     951    -25%
cortex-a15:     434     285    -34%

checkasm --bench cycle counts:     cortex-a15   cortex-a7
int32_to_float_fmul_array8_c:      1730.4       4384.5
int32_to_float_fmul_array8_neon_c:  571.5       1694.3
int32_to_float_fmul_array8_neon:    374.0       1448.8

The interesting comparison is between
int32_to_float_fmul_array8_neon_c and int32_to_float_fmul_array8_neon.
The former is the current behaviour of calling
ff_int32_to_float_fmul_scalar_neon repeatedly from the C function;
the latter is the new NEON implementation added by this commit.
The raw numbers differ since checkasm uses different lengths than the
DCA decoder.
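
For readers unfamiliar with the "before" path, here is a minimal C sketch of
what calling a scalar routine repeatedly from the C function looks like. It
is an illustration, not the libavcodec source: the sketch_* names are
hypothetical, the FmtConvertContext indirection is left out, and the
one-multiplier-per-8-samples layout is assumed from the array8 name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the scalar routine: dst[i] = src[i] * mul. */
static void sketch_fmul_scalar(float *dst, const int32_t *src,
                               float mul, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = (float)src[i] * mul;
}

/* Hypothetical stand-in for the array8 C path: every block of 8 samples
 * gets its own multiplier, and each block costs one scalar call. */
static void sketch_fmul_array8(float *dst, const int32_t *src,
                               const float *mul, int len)
{
    assert(len % 8 == 0); /* assumption: len is a multiple of 8 */
    for (int i = 0; i < len; i += 8)
        sketch_fmul_scalar(&dst[i], &src[i], *mul++, 8);
}
```

With len = 16 the loop issues two scalar calls, one per multiplier. A
dedicated array8 routine can handle all blocks within a single call, which
is one plausible source of the gap between the _neon_c and _neon rows above.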


fmtconvert_init_arm.c 2.14 KB
