1. 20 Mar, 2019 2 commits
  2. 05 Feb, 2019 1 commit
    • libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX · 8522d219
      Lauri Kasanen authored
      ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p16be \
      -s 1920x1728 -f null -vframes 100 -v error -nostats -
      
      The 9-14 bit functions get about a 6x speedup; the 16-bit function gets about 15x.
      FATE passes, and each format was tested with an image-to-video conversion.
      
      Only POWER8 includes 32-bit vector multiplies, so POWER7 is locked out
      of the 16-bit function. This applies to the 32-bit vec_mulo/vec_mule
      functions too, not just vmuluwm.
      
      With TIMER_REPORT skips disabled:
      yuv420p9le
        12412 UNITS in planarX,  131072 runs,      0 skips
        73136 UNITS in planarX,  131072 runs,      0 skips
      yuv420p9be
        12481 UNITS in planarX,  131072 runs,      0 skips
        73410 UNITS in planarX,  131072 runs,      0 skips
      yuv420p10le
        12322 UNITS in planarX,  131072 runs,      0 skips
        72546 UNITS in planarX,  131072 runs,      0 skips
      yuv420p10be
        12291 UNITS in planarX,  131072 runs,      0 skips
        72935 UNITS in planarX,  131072 runs,      0 skips
      yuv420p12le
        12316 UNITS in planarX,  131072 runs,      0 skips
        72708 UNITS in planarX,  131072 runs,      0 skips
      yuv420p12be
        12319 UNITS in planarX,  131072 runs,      0 skips
        72577 UNITS in planarX,  131072 runs,      0 skips
      yuv420p14le
        12259 UNITS in planarX,  131072 runs,      0 skips
        72516 UNITS in planarX,  131072 runs,      0 skips
      yuv420p14be
        12440 UNITS in planarX,  131072 runs,      0 skips
        72962 UNITS in planarX,  131072 runs,      0 skips
      yuv420p16le
        10548 UNITS in planarX,  131072 runs,      0 skips
        73429 UNITS in planarX,  131072 runs,      0 skips
      yuv420p16be
        10634 UNITS in planarX,  131072 runs,      0 skips
       150959 UNITS in planarX,  131072 runs,      0 skips
      Signed-off-by: Lauri Kasanen <cand@gmx.com>
  3. 04 Dec, 2018 1 commit
  4. 26 Nov, 2018 1 commit
    • swscale/output: Altivec-optimize yuv2plane1_8 · 46c5693e
      Lauri Kasanen authored
      ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p \
      -f null -vframes 100 -v error -nostats -
      
      1158 UNITS in planar1,   65528 runs,      8 skips
      
      With -cpuflags 0:
      
      19082 UNITS in planar1,   65533 runs,      3 skips
      
      That is a 16.48x speedup; on x86, SSE2 gives about 7x. Curiously, the Power C
      version takes as many cycles as the x86 SSE2 version; yikes, it's fast.
      
      Note that this function uses VSX instructions, but is not marked so.
      This is because several existing functions also make that mistake.
      I'll submit a patch moving them once this is reviewed.
      Signed-off-by: Lauri Kasanen <cand@gmx.com>
      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
  5. 14 Aug, 2018 1 commit
  6. 09 Nov, 2016 1 commit
  7. 12 Oct, 2016 1 commit
  8. 27 Sep, 2016 1 commit
  9. 31 Mar, 2016 1 commit
  10. 31 May, 2015 1 commit
  11. 27 Apr, 2015 1 commit
  12. 14 Mar, 2015 1 commit
  13. 12 Nov, 2014 1 commit
  14. 29 Aug, 2013 1 commit
  15. 08 Oct, 2012 1 commit
  16. 05 Oct, 2012 1 commit
  17. 22 Jul, 2012 1 commit
  18. 04 Jul, 2012 1 commit
  19. 06 Mar, 2012 1 commit
  20. 21 Feb, 2012 1 commit
  21. 25 Jan, 2012 1 commit
  22. 22 Oct, 2011 2 commits
  23. 25 Sep, 2011 1 commit
  24. 18 Aug, 2011 1 commit
  25. 12 Aug, 2011 2 commits
  26. 11 Jul, 2011 1 commit
  27. 01 Jul, 2011 1 commit
  28. 30 Jun, 2011 1 commit
  29. 29 Jun, 2011 1 commit
  30. 28 Jun, 2011 3 commits
  31. 26 Jun, 2011 1 commit
  32. 07 Jun, 2011 2 commits
  33. 03 Jun, 2011 2 commits
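
The speedup figures quoted in the two expanded commit messages can be re-derived from the timer lines they include. The following is a minimal sketch, not part of FFmpeg: the helper names `parse_units` and `speedup` are hypothetical, and it assumes that in each pair of timer lines the smaller figure is the optimized build and the larger one is the plain C build. The planar1 commit says so via `-cpuflags 0`; the planarX commit does not label its pairs, so that reading is an assumption there.

```python
import re

# Matches one timer report line as quoted in the commit messages, e.g.
# "1158 UNITS in planar1,   65528 runs,      8 skips".
TIMER_LINE = re.compile(r"(\d+)\s+UNITS in (\w+),\s*(\d+) runs,\s*(\d+) skips")


def parse_units(line):
    """Return the UNITS figure (first number) from one timer report line."""
    match = TIMER_LINE.search(line)
    if match is None:
        raise ValueError(f"not a timer report line: {line!r}")
    return int(match.group(1))


def speedup(optimized_line, baseline_line):
    """Ratio of baseline UNITS to optimized UNITS (higher means faster)."""
    return parse_units(baseline_line) / parse_units(optimized_line)


if __name__ == "__main__":
    # Pairs copied from the commit messages: (assumed optimized, assumed C).
    samples = {
        "yuv2plane1_8 (planar1)": (
            "1158 UNITS in planar1,   65528 runs,      8 skips",
            "19082 UNITS in planar1,   65533 runs,      3 skips",
        ),
        "yuv420p9le (planarX)": (
            "12412 UNITS in planarX,  131072 runs,      0 skips",
            "73136 UNITS in planarX,  131072 runs,      0 skips",
        ),
        "yuv420p16be (planarX)": (
            "10634 UNITS in planarX,  131072 runs,      0 skips",
            "150959 UNITS in planarX,  131072 runs,      0 skips",
        ),
    }
    for name, (fast, slow) in samples.items():
        print(f"{name}: {speedup(fast, slow):.2f}x")
```

For the planar1 pair this reproduces the 16.48 ratio stated in the commit message. The planarX ratios depend on the pairing assumption above.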