1. 14 Feb, 2015 1 commit
  2. 12 Feb, 2015 1 commit
  3. 09 Feb, 2015 1 commit
  4. 06 Feb, 2015 2 commits
  5. 05 Feb, 2015 2 commits
    • x86/hevcdsp: add ff_hevc_sao_edge_filter_{10,12}_{sse2,avx2} · 15574c50
      James Almer authored
      Original x86 intrinsics code by Pierre-Edouard Lepere.
      Yasm port, refactoring and optimizations by James Almer.
      
      Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
      
      Width 32
      342694 decicycles in sao_edge_filter_10, 16384 runs, 0 skips
      29476 decicycles in ff_hevc_sao_edge_filter_32_10_sse2, 16384 runs, 0 skips
      13996 decicycles in ff_hevc_sao_edge_filter_32_10_avx2, 16381 runs, 3 skips
      
      Width 64
      581163 decicycles in sao_edge_filter_10, 8192 runs, 0 skips
      59774 decicycles in ff_hevc_sao_edge_filter_64_10_sse2, 8192 runs, 0 skips
      28383 decicycles in ff_hevc_sao_edge_filter_64_10_avx2, 8191 runs, 1 skips
      Signed-off-by: James Almer <jamrial@gmail.com>
    • x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3,avx2} · 042c1159
      James Almer authored
      Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
      Refactoring and optimizations by James Almer.
      
      Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
      
      Width 32
      158583 decicycles in sao_edge_filter_8, 0 skips
      5205 decicycles in ff_hevc_sao_edge_filter_32_8_ssse3, 32767 runs, 1 skips
      2942 decicycles in ff_hevc_sao_edge_filter_32_8_avx2, 32767 runs, 1 skips
      
      Width 64
      705639 decicycles in sao_edge_filter_8, 262144 runs, 0 skips
      19224 decicycles in ff_hevc_sao_edge_filter_64_8_ssse3, 262111 runs, 33 skips
      10433 decicycles in ff_hevc_sao_edge_filter_64_8_avx2, 262115 runs, 29 skips
      Signed-off-by: James Almer <jamrial@gmail.com>
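      The benchmark lines above follow the "N decicycles in NAME, R runs, S skips" layout, where a decicycle is a tenth of a CPU cycle. As a rough aid for reading them, the sketch below parses such lines and reports how much faster each optimized function is than the first (reference) entry. The helper names `parse_timer_line` and `speedups` are made up for this illustration and are not part of FFmpeg; the figures are the Width 64, 8-bit numbers quoted above.

```python
import re

# Matches lines such as:
#   "19224 decicycles in ff_hevc_sao_edge_filter_64_8_ssse3, 262111 runs, 33 skips"
LINE_RE = re.compile(r"^\s*(\d+) decicycles in (\w+), (\d+) runs, (\d+) skips$")


def parse_timer_line(line):
    """Return (name, decicycles, runs, skips), or None if the line is malformed."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    decicycles, name, runs, skips = m.groups()
    return name, int(decicycles), int(runs), int(skips)


def speedups(lines):
    """Treat the first parsable line as the reference and return
    {function name: reference cost / function cost} for the remaining lines."""
    parsed = [p for p in map(parse_timer_line, lines) if p is not None]
    _, ref_cost, _, _ = parsed[0]
    return {name: ref_cost / cost for name, cost, _, _ in parsed[1:]}


# Width 64, 8-bit results copied from the commit message above.
width64_8bit = [
    "705639 decicycles in sao_edge_filter_8, 262144 runs, 0 skips",
    "19224 decicycles in ff_hevc_sao_edge_filter_64_8_ssse3, 262111 runs, 33 skips",
    "10433 decicycles in ff_hevc_sao_edge_filter_64_8_avx2, 262115 runs, 29 skips",
]

for name, ratio in speedups(width64_8bit).items():
    print(f"{name}: {ratio:.1f}x faster than the reference")
```

      The ratios compare average cost per call only; they say nothing about overall decoding speed, which depends on how much of the total time is spent in SAO filtering.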
  6. 01 Feb, 2015 1 commit
    • x86/hevc: add ff_hevc_sao_band_filter_{8,10,12}_{sse2,avx,avx2} · fa3eccb4
      James Almer authored
      Original x86 intrinsics code and initial 8bit yasm port by Pierre-Edouard Lepere.
      10/12bit yasm ports, refactoring and optimizations by James Almer.
      
      Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
      
      width 32
      40338 decicycles in sao_band_filter_0_8, 2048 runs, 0 skips
      8056 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 2048 runs, 0 skips
      7458 decicycles in ff_hevc_sao_band_filter_8_32_avx, 2048 runs, 0 skips
      4504 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 2048 runs, 0 skips
      
      width 64
      136046 decicycles in sao_band_filter_0_8, 16384 runs, 0 skips
      28576 decicycles in ff_hevc_sao_band_filter_8_64_sse2, 16384 runs, 0 skips
      26707 decicycles in ff_hevc_sao_band_filter_8_64_avx, 16384 runs, 0 skips
      14387 decicycles in ff_hevc_sao_band_filter_8_64_avx2, 16384 runs, 0 skips
      Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
      Signed-off-by: James Almer <jamrial@gmail.com>
  7. 04 Sep, 2014 1 commit
  8. 24 Aug, 2014 1 commit
    • x86: hevc_mc: split differently calls · 3e892b2b
      Christophe Gisquet authored
      In some cases, 2 or 3 calls are performed to functions for unusual
      widths. Instead, perform 2 calls for different widths to split the
      workload.
      
      The 8+16 and 4+8 widths, for 8 bits and more than 8 bits respectively,
      can't be processed that way without modifications: some calls use
      unaligned buffers, and adding branches to handle this yielded no
      micro-benchmark benefit.
      
      For block_w == 12 (around 1% of the pixels of the sequence):
      Before:
      12758 decicycles in epel_uni, 4093 runs, 3 skips
      19389 decicycles in qpel_uni, 8187 runs, 5 skips
      22699 decicycles in epel_bi, 32743 runs, 25 skips
      34736 decicycles in qpel_bi, 32733 runs, 35 skips
      
      After:
      11929 decicycles in epel_uni, 4096 runs, 0 skips
      18131 decicycles in qpel_uni, 8184 runs, 8 skips
      20065 decicycles in epel_bi, 32750 runs, 18 skips
      31458 decicycles in qpel_bi, 32753 runs, 15 skips
      Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
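      The call-splitting idea described in 3e892b2b can be modelled roughly as follows. This is an illustrative Python sketch, not the FFmpeg code: the helper names are hypothetical, and the 8 + 4 decomposition of a 12-pixel-wide block is an assumption chosen for the example, since the commit message does not spell out which widths are paired.

```python
def repeated_calls(block_w, kernel_w):
    """Cover a block with repeated calls to a single fixed-width kernel.

    Returns one (x_offset, width) pair per call.
    """
    if block_w % kernel_w != 0:
        raise ValueError("block width must be a multiple of the kernel width")
    return [(x, kernel_w) for x in range(0, block_w, kernel_w)]


def mixed_calls(block_w, first_w, second_w):
    """Cover a block with exactly two calls to kernels of different widths."""
    if first_w + second_w != block_w:
        raise ValueError("the two kernel widths must add up to the block width")
    return [(0, first_w), (first_w, second_w)]


# Hypothetical example: a 12-pixel-wide block.
print(repeated_calls(12, 4))  # three calls to a width-4 kernel
print(mixed_calls(12, 8, 4))  # one width-8 call plus one width-4 call
```

      In the mixed scheme the second call starts at an offset equal to the first kernel's width. Whether that offset is suitably aligned depends on the widths and the sample size, which is presumably why the commit message excludes some width pairs.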
  9. 22 Aug, 2014 2 commits
  10. 21 Aug, 2014 1 commit
  11. 20 Aug, 2014 1 commit
  12. 19 Aug, 2014 1 commit
  13. 29 Jul, 2014 1 commit
  14. 26 Jul, 2014 4 commits
  15. 25 Jul, 2014 2 commits
  16. 23 Jul, 2014 1 commit
  17. 22 Jul, 2014 1 commit
  18. 13 Jul, 2014 1 commit
  19. 25 Jun, 2014 1 commit
  20. 17 Jun, 2014 1 commit
  21. 01 Jun, 2014 1 commit
  22. 16 May, 2014 1 commit
  23. 09 May, 2014 1 commit
  24. 08 May, 2014 1 commit
  25. 06 May, 2014 3 commits