    aarch64: vp9itxfm: Avoid reloading the idct32 coefficients · 65aa002d
    Martin Storsjö authored
    The idct32x32 function actually pushed d8-d15 onto the stack even
    though it didn't clobber them; there are plenty of registers that
    can be used to allow keeping all the idct coefficients in registers
    without having to reload different subsets of them at different
    stages in the transform.
    
    After this, we still can skip pushing d12-d15.
    
    Before:
    vp9_inv_dct_dct_32x32_sub32_add_neon: 8128.3
    After:
    vp9_inv_dct_dct_32x32_sub32_add_neon: 8053.3
    Signed-off-by: Martin Storsjö <martin@martin.st>
Name
avbuild
avtools
compat
doc
libavcodec
libavdevice
libavfilter
libavformat
libavresample
libavutil
libswscale
presets
tests
tools
.gitattributes
.gitignore
.travis.yml
COPYING.GPLv2
COPYING.GPLv3
COPYING.LGPLv2.1
COPYING.LGPLv3
CREDITS
Changelog
INSTALL
LICENSE
Makefile
README
README.md
RELEASE
configure