    aarch64: vp9itxfm: Make the larger core transforms standalone functions · 11547601
    Martin Storsjö authored
    This work is sponsored by, and copyright, Google.
    
    This reduces the code size of libavcodec/aarch64/vp9itxfm_neon.o from
    19496 to 14740 bytes.
    
    This gives a small slowdown of roughly 14 to 30 cycles per call, but
    makes it more feasible to add more optimized versions of these
    transforms.
    
    Before:
    vp9_inv_dct_dct_16x16_sub4_add_neon:    1036.7
    vp9_inv_dct_dct_16x16_sub16_add_neon:   1372.2
    vp9_inv_dct_dct_32x32_sub4_add_neon:    5180.0
    vp9_inv_dct_dct_32x32_sub32_add_neon:   8095.7
    
    After:
    vp9_inv_dct_dct_16x16_sub4_add_neon:    1051.0
    vp9_inv_dct_dct_16x16_sub16_add_neon:   1390.1
    vp9_inv_dct_dct_32x32_sub4_add_neon:    5199.9
    vp9_inv_dct_dct_32x32_sub32_add_neon:   8125.8
    Signed-off-by: Martin Storsjö <martin@martin.st>
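
The trade-off described in the message can be checked with a little arithmetic. The following is a minimal sketch, not part of the commit: it copies the object sizes and benchmark figures quoted above and derives the size saving and the per-function slowdown. The variable and function names are hypothetical, and the benchmark figures are assumed to be cycles per call, as the message describes them.

```python
# Figures copied from the commit message; names here are illustrative only.
SIZE_BEFORE = 19496  # bytes, vp9itxfm_neon.o before the change
SIZE_AFTER = 14740   # bytes, vp9itxfm_neon.o after the change

# Benchmark figures, assumed to be cycles per call.
BEFORE = {
    "vp9_inv_dct_dct_16x16_sub4_add_neon": 1036.7,
    "vp9_inv_dct_dct_16x16_sub16_add_neon": 1372.2,
    "vp9_inv_dct_dct_32x32_sub4_add_neon": 5180.0,
    "vp9_inv_dct_dct_32x32_sub32_add_neon": 8095.7,
}
AFTER = {
    "vp9_inv_dct_dct_16x16_sub4_add_neon": 1051.0,
    "vp9_inv_dct_dct_16x16_sub16_add_neon": 1390.1,
    "vp9_inv_dct_dct_32x32_sub4_add_neon": 5199.9,
    "vp9_inv_dct_dct_32x32_sub32_add_neon": 8125.8,
}


def slowdown(before, after):
    """Return {name: (extra cycles, relative slowdown in percent)}."""
    result = {}
    for name, old in before.items():
        delta = after[name] - old
        result[name] = (round(delta, 1), round(100.0 * delta / old, 2))
    return result


size_saved = SIZE_BEFORE - SIZE_AFTER
size_saved_pct = round(100.0 * size_saved / SIZE_BEFORE, 1)
deltas = slowdown(BEFORE, AFTER)

if __name__ == "__main__":
    print(f"object size: -{size_saved} bytes ({size_saved_pct}%)")
    for name, (cycles, pct) in deltas.items():
        print(f"{name}: +{cycles} cycles ({pct}%)")
```

On these figures the object file shrinks by about a quarter, while each benchmarked function slows down by well under 2%.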
vp9itxfm_neon.S 47.2 KB