1. 05 Feb, 2019 1 commit
    • libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX · 8522d219
      Lauri Kasanen authored
      ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p16be \
      -s 1920x1728 -f null -vframes 100 -v error -nostats -
      
      9-14 bit funcs get about 6x speedup, 16-bit gets about 15x.
      Fate passes, each format tested with an image to video conversion.
      
      Only POWER8 includes 32-bit vector multiplies, so POWER7 is locked out
      of the 16-bit function. This includes the vec_mulo/mule functions too,
      not just vmuluwm.
      
      With TIMER_REPORT skips disabled:
      yuv420p9le
        12412 UNITS in planarX,  131072 runs,      0 skips
        73136 UNITS in planarX,  131072 runs,      0 skips
      yuv420p9be
        12481 UNITS in planarX,  131072 runs,      0 skips
        73410 UNITS in planarX,  131072 runs,      0 skips
      yuv420p10le
        12322 UNITS in planarX,  131072 runs,      0 skips
        72546 UNITS in planarX,  131072 runs,      0 skips
      yuv420p10be
        12291 UNITS in planarX,  131072 runs,      0 skips
        72935 UNITS in planarX,  131072 runs,      0 skips
      yuv420p12le
        12316 UNITS in planarX,  131072 runs,      0 skips
        72708 UNITS in planarX,  131072 runs,      0 skips
      yuv420p12be
        12319 UNITS in planarX,  131072 runs,      0 skips
        72577 UNITS in planarX,  131072 runs,      0 skips
      yuv420p14le
        12259 UNITS in planarX,  131072 runs,      0 skips
        72516 UNITS in planarX,  131072 runs,      0 skips
      yuv420p14be
        12440 UNITS in planarX,  131072 runs,      0 skips
        72962 UNITS in planarX,  131072 runs,      0 skips
      yuv420p16le
        10548 UNITS in planarX,  131072 runs,      0 skips
        73429 UNITS in planarX,  131072 runs,      0 skips
      yuv420p16be
        10634 UNITS in planarX,  131072 runs,      0 skips
       150959 UNITS in planarX,  131072 runs,      0 skips
      Signed-off-by: Lauri Kasanen <cand@gmx.com>
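
The multiply-accumulate that this commit vectorizes can be pictured with a scalar sketch. This is not the code from the commit: the function name, the 15-bit intermediate samples and the 12-bit filter coefficients are assumptions made for illustration, and the store is in host byte order. In the 9-14 bit case each product is 16 bit x 16 bit; a 16-bit output path working from wider intermediate samples would need the 32-bit multiplies mentioned above.

```c
#include <stdint.h>

/* Clamp v to the unsigned range [0, 2^bits - 1]. */
static int clip_uintp2(int v, int bits)
{
    const int max = (1 << bits) - 1;
    if (v < 0)
        return 0;
    if (v > max)
        return max;
    return v;
}

/*
 * Hypothetical scalar reference for a 9-14 bit vertical scaler.
 * Assumes 15-bit intermediate samples in src and filter coefficients
 * that sum to 1 << 12.
 */
void sketch_yuv2planeX_nbps(const int16_t *filter, int filter_size,
                            const int16_t **src, uint16_t *dest,
                            int dst_w, int output_bits)
{
    const int shift = 11 + 16 - output_bits;

    for (int i = 0; i < dst_w; i++) {
        int val = 1 << (shift - 1);          /* rounding bias */

        for (int j = 0; j < filter_size; j++)
            val += src[j][i] * filter[j];    /* 16x16 -> 32-bit product */

        dest[i] = (uint16_t)clip_uintp2(val >> shift, output_bits);
    }
}
```

With a single tap of 4096 and 10-bit output, an input sample of 32736 (1023 << 5) maps back to 1023, and anything larger is clamped to 1023.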
  2. 01 Jan, 2019 1 commit
  3. 26 Dec, 2018 1 commit
    • swscale/output: Altivec-optimize float yuv2plane1 · 8dd9df9e
      Lauri Kasanen authored
      This function wouldn't benefit from VSX instructions, so I put it
      under altivec.
      
      ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt grayf32le \
      -f null -vframes 100 -v error -nostats -
      
      3743 UNITS in planar1,   65495 runs,     41 skips
      
      -cpuflags 0
      
      23511 UNITS in planar1,   65530 runs,      6 skips
      
      grayf32be
      
      4647 UNITS in planar1,   65449 runs,     87 skips
      
      -cpuflags 0
      
      28608 UNITS in planar1,   65530 runs,      6 skips
      
      The native speedup is 6.28133, and the bswapping one 6.15623.
      Fate passes, each format tested with an image to video conversion.
      Signed-off-by: Lauri Kasanen <cand@gmx.com>
      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
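
The quoted speedups are the ratio of the `-cpuflags 0` timing to the optimized timing. A minimal sketch of that arithmetic, with a hypothetical helper name:

```c
/*
 * Ratio of the scalar (-cpuflags 0) timing to the SIMD timing,
 * both in TIMER_REPORT units. Hypothetical helper for illustration.
 */
double sketch_speedup(double scalar_units, double simd_units)
{
    return scalar_units / simd_units;
}
```

Calling `sketch_speedup(23511, 3743)` reproduces the native figure and `sketch_speedup(28608, 4647)` the byte-swapping one.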
  4. 14 Dec, 2018 1 commit
  5. 12 Dec, 2018 1 commit
    • swscale/output: VSX-optimize nbps yuv2plane1 · 1046cba2
      Lauri Kasanen authored
      ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p9le \
      -f null -vframes 100 -v error -nostats -
      
      Speedups:
      yuv2plane1_9BE_vsx	11.2042
      yuv2plane1_9LE_vsx	11.156
      yuv2plane1_10BE_vsx	9.89428
      yuv2plane1_10LE_vsx	10.3637
      yuv2plane1_12BE_vsx	9.71923
      yuv2plane1_12LE_vsx	11.0404
      yuv2plane1_14BE_vsx	10.1763
      yuv2plane1_14LE_vsx	11.2728
      
      Fate passes, each format tested with an image to video conversion.
      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
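
The BE and LE variants listed above differ only in how each 16-bit result is stored. A scalar sketch of an unscaled 9-14 bit output stage, assuming 15-bit intermediate samples; the function names are hypothetical and this is not the VSX code from the commit:

```c
#include <stdint.h>

/* Store a 16-bit value with an explicit byte order, independent of the host. */
static void sketch_store16(uint8_t *p, unsigned v, int big_endian)
{
    if (big_endian) {
        p[0] = (uint8_t)(v >> 8);
        p[1] = (uint8_t)(v & 0xFF);
    } else {
        p[0] = (uint8_t)(v & 0xFF);
        p[1] = (uint8_t)(v >> 8);
    }
}

/*
 * Hypothetical scalar reference: round each 15-bit sample down to
 * output_bits, clamp it, and write it in the requested byte order.
 */
void sketch_yuv2plane1_nbps(const int16_t *src, uint8_t *dest, int dst_w,
                            int big_endian, int output_bits)
{
    const int shift = 15 - output_bits;
    const int max   = (1 << output_bits) - 1;

    for (int i = 0; i < dst_w; i++) {
        int val = (src[i] + (1 << (shift - 1))) >> shift;  /* round */

        if (val < 0)
            val = 0;
        if (val > max)
            val = max;
        sketch_store16(dest + 2 * i, (unsigned)val, big_endian);
    }
}
```

On a host whose byte order matches the target format the store is a plain 16-bit write; the other variant needs a byte swap per sample.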
  6. 04 Dec, 2018 1 commit
  7. 26 Nov, 2018 1 commit
    • swscale/output: Altivec-optimize yuv2plane1_8 · 46c5693e
      Lauri Kasanen authored
      ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p \
      -f null -vframes 100 -v error -nostats -
      
      1158 UNITS in planar1,   65528 runs,      8 skips
      
      -cpuflags 0
      
      19082 UNITS in planar1,   65533 runs,      3 skips
      
      16.48 speedup ratio. On x86, SSE2 is ~7. Curiously, the Power C version
      takes as many cycles as the x86 SSE2 version, yikes it's fast.
      
      Note that this function uses VSX instructions, but is not marked so.
      This is because several existing functions also make that mistake.
      I'll submit a patch moving them once this is reviewed.
      Signed-off-by: Lauri Kasanen <cand@gmx.com>
      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
  8. 24 Nov, 2018 1 commit
  9. 01 Nov, 2018 2 commits
  10. 24 Oct, 2018 3 commits
  11. 18 Oct, 2018 2 commits
  12. 13 Oct, 2018 2 commits
  13. 09 Sep, 2018 1 commit
  14. 22 Aug, 2018 2 commits
  15. 14 Aug, 2018 1 commit
  16. 10 Jun, 2018 1 commit
    • lsws/rgb2rgb_template: Do not compile unneeded shuffle functions on big-endian. · 3a56ade1
      Carl Eugen Hoyos authored
      Fixes the following warnings:
      In file included from libswscale/rgb2rgb.c:128:0:
      libswscale/rgb2rgb_template.c:346:13: warning: 'shuffle_bytes_3210_c' defined but not used
      libswscale/rgb2rgb_template.c:346:13: warning: 'shuffle_bytes_3012_c' defined but not used
      libswscale/rgb2rgb_template.c:346:13: warning: 'shuffle_bytes_1230_c' defined but not used
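
A "defined but not used" warning of this kind goes away when the definition is compiled under the same condition as its only users. The sketch below shows the pattern; `SKETCH_BIGENDIAN` and all function names are hypothetical stand-ins, not the identifiers used in libswscale.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a build-system endianness macro. */
#ifndef SKETCH_BIGENDIAN
#define SKETCH_BIGENDIAN 0
#endif

typedef void (*sketch_shuffle_fn)(const uint8_t *src, uint8_t *dst,
                                  int src_size);

#if !SKETCH_BIGENDIAN
/* Reverse the byte order inside every 4-byte pixel (0123 -> 3210). */
static void sketch_shuffle_bytes_3210(const uint8_t *src, uint8_t *dst,
                                      int src_size)
{
    for (int i = 0; i + 3 < src_size; i += 4) {
        dst[i + 0] = src[i + 3];
        dst[i + 1] = src[i + 2];
        dst[i + 2] = src[i + 1];
        dst[i + 3] = src[i + 0];
    }
}
#endif

/* The only reference to the static function sits behind the same guard. */
sketch_shuffle_fn sketch_select_shuffle_3210(void)
{
#if !SKETCH_BIGENDIAN
    return sketch_shuffle_bytes_3210;
#else
    return NULL;
#endif
}
```

Building with `-DSKETCH_BIGENDIAN=1` drops both the definition and its use, so the compiler has nothing unused to warn about.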
  17. 05 May, 2018 1 commit
  18. 22 Apr, 2018 1 commit
  19. 16 Apr, 2018 2 commits
  20. 03 Apr, 2018 1 commit
    • avutil/pixdesc: deprecate AV_PIX_FMT_FLAG_PSEUDOPAL · d6fc031c
      wm4 authored
      PSEUDOPAL pixel formats are not paletted, but carried a palette with the
      intention of allowing code to treat unpaletted formats as paletted. The
      palette simply mapped the byte values to the resulting RGB values,
      making it some sort of LUT for RGB conversion.
      
      It was used for 1 byte formats only: RGB4_BYTE, BGR4_BYTE, RGB8, BGR8,
      GRAY8. The first 4 are awfully obscure, used only by some ancient bitmap
      formats. The last one, GRAY8, is more common, but its treatment is
      grossly incorrect. It considers full range GRAY8 only, so GRAY8 coming
      from typical Y video planes was not mapped to the correct RGB values.
      This cannot be fixed, because AVFrame.color_range can be freely changed
      at runtime, and there is nothing to ensure the pseudo palette is
      updated.
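
The range problem can be made concrete with a small sketch. The table below assumes the gray pseudo palette is the identity mapping R = G = B = i in packed ARGB; the function names are hypothetical and the range expansion uses simple truncating integer math.

```c
#include <stdint.h>

/*
 * Fill a 256-entry packed ARGB table the way a full-range gray
 * pseudo palette would: entry i is opaque with R = G = B = i.
 */
void sketch_fill_gray8_palette(uint32_t pal[256])
{
    for (uint32_t i = 0; i < 256; i++)
        pal[i] = 0xFF000000u | (i * 0x010101u);
}

/* Expand a limited-range luma sample (16..235) to full range (0..255). */
uint8_t sketch_limited_to_full(uint8_t y)
{
    int v = ((int)y - 16) * 255 / 219;

    if (v < 0)
        v = 0;
    if (v > 255)
        v = 255;
    return (uint8_t)v;
}
```

For a limited-range sample of 16 (video black) the table yields gray level 16, whereas the expansion yields 0. A table built once cannot follow a later change of the frame's color range.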
      
      Also, nothing actually used the PSEUDOPAL palette data, except xwdenc
      (trivially changed in the previous commit). All other code had to treat
      it as a special case, just to ignore or to propagate palette data.
      
      In conclusion, this was just a very strange old mechanism that has no
      real justification to exist anymore (although it may have been nice and
      useful in the past). Now it's an artifact that makes the API harder to
      use: API users who allocate their own pixel data have to be aware that
      they need to allocate the palette, or FFmpeg will crash on them in
      _some_ situations. On top of this, there was no API to allocate the
      pseudo palette outside of av_frame_get_buffer().
      
      This patch not only deprecates AV_PIX_FMT_FLAG_PSEUDOPAL, but also makes
      the pseudo palette optional. Nothing accesses it anymore, though if it's
      set, it's propagated. It's still allocated and initialized for
      compatibility with API users that rely on this feature. But new API
      users do not need to allocate it. This was an explicit goal of this
      patch.
      
      Most changes replace AV_PIX_FMT_FLAG_PSEUDOPAL with FF_PSEUDOPAL. I
      first tried #ifdefing all code, but it was a mess. The FF_PSEUDOPAL
      macro reduces the mess, and still allows defining FF_API_PSEUDOPAL to 0.
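
A rough sketch of how such a macro avoids `#ifdef` blocks at every use site. All `SKETCH_` names and the flag values are made up for illustration; the real definitions live in FFmpeg's headers and may differ.

```c
/* Hypothetical API switch; 0 simulates a build with the feature removed. */
#ifndef SKETCH_API_PSEUDOPAL
#define SKETCH_API_PSEUDOPAL 1
#endif

/* Illustrative flag values only. */
#define SKETCH_FLAG_PAL       (1 << 1)
#define SKETCH_FLAG_PSEUDOPAL (1 << 6)

/* Expands to the flag while the API exists, and to 0 afterwards. */
#if SKETCH_API_PSEUDOPAL
#define SKETCH_PSEUDOPAL SKETCH_FLAG_PSEUDOPAL
#else
#define SKETCH_PSEUDOPAL 0
#endif

/* Nonzero when a format with these flags is treated as carrying a palette. */
int sketch_has_palette(unsigned flags)
{
    return (flags & (SKETCH_FLAG_PAL | SKETCH_PSEUDOPAL)) != 0;
}
```

With the switch set to 0 the pseudo-palette term becomes a constant zero, so the same expression compiles unchanged and only real paletted formats match.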
      
      Passes FATE with FF_API_PSEUDOPAL enabled and disabled. In addition,
      FATE passes with FF_API_PSEUDOPAL set to 1, but with allocation
      functions manually changed to not allocate a palette.
  21. 31 Mar, 2018 1 commit
  22. 24 Mar, 2018 4 commits
  23. 03 Mar, 2018 1 commit
  24. 02 Mar, 2018 1 commit
  25. 13 Nov, 2017 1 commit
    • Fix missing used attribute for inline assembly variables · 43171a2a
      Thomas Köppe authored
      Variables used in inline assembly need to be marked with attribute((used)).
      Static constants already were, via the define of DECLARE_ASM_CONST.
      But DECLARE_ALIGNED does not add this attribute, and some of the variables
      defined with it are const only used in inline assembly, and therefore
      appeared dead. This change adds a macro DECLARE_ASM_ALIGNED that marks
      variables as used.
      
      This change makes FFmpeg work with Clang's ThinLTO.
      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
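
A minimal sketch of what such a macro could look like, assuming GCC/Clang attribute syntax. The `SKETCH_` macro and the variable are hypothetical; the actual `DECLARE_ASM_ALIGNED` definition in FFmpeg may be spelled differently.

```c
#include <stdint.h>

/*
 * Hypothetical macro in the spirit of the commit: declare an aligned
 * variable and mark it as used, so the compiler and linker keep it
 * even when the only reference is by name from inline assembly.
 */
#define SKETCH_DECLARE_ASM_ALIGNED(n, t, v) \
    t __attribute__((used)) __attribute__((aligned(n))) v

/* A constant that, in real code, would only be named inside an asm block. */
SKETCH_DECLARE_ASM_ALIGNED(16, static const uint64_t, sketch_mask)[2] = {
    0x00FF00FF00FF00FFULL, 0x00FF00FF00FF00FFULL
};
```

Without the `used` attribute, an optimizer that cannot see into the assembly text may conclude the constant is dead and discard it, which then surfaces as an undefined symbol at link time.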
  26. 29 Oct, 2017 1 commit
  27. 25 Oct, 2017 1 commit
  28. 23 Oct, 2017 1 commit
  29. 11 Oct, 2017 1 commit
  30. 10 Oct, 2017 1 commit