h264/aarch64: add intra loop filter neon asm
Add my neon asm from x264 relicensed under the LGPL 2.1 or later. Ported (x264 uses nv12 chroma) and optimized. Cycle count for checkasm --bench on a Snapdragon 820e: h264_h_loop_filter_luma_intra_8bpp_c: 60.0 h264_h_loop_filter_luma_intra_8bpp_neon: 54.2 h264_v_loop_filter_luma_intra_8bpp_c: 148.3 h264_v_loop_filter_luma_intra_8bpp_neon: 73.8 h264_h_loop_filter_chroma_intra_8bpp_c: 27.8 h264_h_loop_filter_chroma_intra_8bpp_neon: 21.4 h264_h_loop_filter_chroma_mbaff_intra_8bpp_c: 15.8 h264_h_loop_filter_chroma_mbaff_intra_8bpp_neon: 15.7 h264_v_loop_filter_chroma_intra_8bpp_c: 45.8 h264_v_loop_filter_chroma_intra_8bpp_neon: 17.3
Showing
This diff is collapsed.
Please
register
or
sign in
to comment