1. 03 Feb, 2015 14 commits
  2. 02 Feb, 2015 23 commits
  3. 01 Feb, 2015 3 commits
    • hevc/sao: use aligned copies · 6a6aeb53
      Christophe Gisquet authored
      For band filter, source and destination are aligned (except for 16x16 ctbs),
      and otherwise, they are most often aligned. Overall, the total width is also
      too small for amortizing memcpy.
      
      Timings (using an intrinsic version of edge filters):
                B/32     B/64     E/32     E/64
      Before:  32045    93952    38925    126896
      After:   26772    83803    33942    117182
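
The idea in this message can be sketched in portable C. This is only an illustration, not the code from the commit: the helper name `copy_rows_aligned16` is hypothetical, and it assumes both pointers, both strides and the width are multiples of 16 bytes. A constant-size `memcpy` stands in for a fixed-width vector copy, since compilers typically inline it instead of calling the library routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the commit): copy `height` rows of `width`
 * bytes in fixed 16-byte chunks. Assumes src, dst, both strides and width
 * are all multiples of 16, as stated in the lead-in. */
static void copy_rows_aligned16(uint8_t *dst, ptrdiff_t dst_stride,
                                const uint8_t *src, ptrdiff_t src_stride,
                                int width, int height)
{
    assert((uintptr_t)dst % 16 == 0 && (uintptr_t)src % 16 == 0);
    assert(dst_stride % 16 == 0 && src_stride % 16 == 0 && width % 16 == 0);

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x += 16) {
            /* The size is a compile-time constant, so this is usually
             * lowered to one 128-bit load/store pair rather than a call. */
            memcpy(dst + x, src + x, 16);
        }
        dst += dst_stride;
        src += src_stride;
    }
}
```

For rows as short as 32 or 64 bytes, the fixed per-call cost of a general-purpose `memcpy` (size dispatch, alignment checks) is a large share of the work, which is the trade-off the message describes.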
    • x86: hevc/sao: aligned source buffers · bff7feb3
      Christophe Gisquet authored
      Useful for at least the band filter, for which:
      - Band filter call only:
                 32      64
      Before:  16556    54015
      After:   16497    52355
      - Whole case:
                 32      64
      Before:  37031   103008
      After:   32045    93952
    • x86/hevc: add ff_hevc_sao_band_filter_{8,10,12}_{sse2,avx,avx2} · fa3eccb4
      James Almer authored
      Original x86 intrinsics code and initial 8bit yasm port by Pierre-Edouard Lepere.
      10/12bit yasm ports, refactoring and optimizations by James Almer
      
      Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
      
      width 32
      40338 decicycles in sao_band_filter_0_8, 2048 runs, 0 skips
      8056 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 2048 runs, 0 skips
      7458 decicycles in ff_hevc_sao_band_filter_8_32_avx, 2048 runs, 0 skips
      4504 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 2048 runs, 0 skips
      
      width 64
      136046 decicycles in sao_band_filter_0_8, 16384 runs, 0 skips
      28576 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 16384 runs, 0 skips
      26707 decicycles in ff_hevc_sao_band_filter_8_32_avx, 16384 runs, 0 skips
      14387 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 16384 runs, 0 skips
      Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
      Signed-off-by: James Almer <jamrial@gmail.com>
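
For context on what the benchmarked functions compute, here is a minimal scalar sketch of an 8-bit SAO band filter. It is written from the general HEVC band-offset scheme and is not the FFmpeg C reference or the yasm code. The function name, the parameter layout and the 4-entry `offsets` array are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate result to the 8-bit sample range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical scalar sketch (names and signature are illustrative only).
 * The 8-bit sample range is split into 32 bands of 8 values each; four
 * consecutive bands starting at `band_position` (wrapping modulo 32)
 * receive an offset, every other band is left unchanged. */
static void sao_band_filter_8bit_sketch(uint8_t *dst, ptrdiff_t dst_stride,
                                        const uint8_t *src, ptrdiff_t src_stride,
                                        const int16_t offsets[4], int band_position,
                                        int width, int height)
{
    int offset_table[32] = { 0 };

    assert(band_position >= 0 && band_position < 32);
    for (int k = 0; k < 4; k++)
        offset_table[(band_position + k) & 31] = offsets[k];

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            /* Band index is the top 5 bits of the sample: 8 - 5 = 3. */
            dst[x] = clip_u8(src[x] + offset_table[src[x] >> 3]);
        }
        dst += dst_stride;
        src += src_stride;
    }
}
```

Each output sample depends only on the input sample at the same position, so many samples can be processed per instruction, which is consistent with the gap between the scalar and SSE2/AVX/AVX2 timings above.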