1. 11 Apr, 2019 3 commits
    • avcodec/agm: add support for non-dct coding · 7be8f7ac
      Paul B Mahol authored
      7be8f7ac
    • Paul B Mahol's avatar
      0f283559
    • swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_2 · ce92ee4b
      Lauri Kasanen authored
      ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
              -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
              -cpuflags 0 -v error -
      
      32-bit mul, power8 only.
      
      ~2x speedup:
      
      rgb24
        24431 UNITS in yuv2packed2,   16384 runs,      0 skips
        13783 UNITS in yuv2packed2,   16383 runs,      1 skips
      bgr24
        24396 UNITS in yuv2packed2,   16384 runs,      0 skips
        14059 UNITS in yuv2packed2,   16384 runs,      0 skips
      rgba
        26815 UNITS in yuv2packed2,   16383 runs,      1 skips
        12797 UNITS in yuv2packed2,   16383 runs,      1 skips
      bgra
        27060 UNITS in yuv2packed2,   16384 runs,      0 skips
        13138 UNITS in yuv2packed2,   16384 runs,      0 skips
      argb
        26998 UNITS in yuv2packed2,   16384 runs,      0 skips
        12728 UNITS in yuv2packed2,   16381 runs,      3 skips
      bgra
        26651 UNITS in yuv2packed2,   16384 runs,      0 skips
        13124 UNITS in yuv2packed2,   16384 runs,      0 skips
      
      This is a low speedup, but the x86 mmx version also gets only ~2x. The mmx version
      is also heavily inaccurate, while the vsx version has high accuracy.
      ce92ee4b
  2. 10 Apr, 2019 5 commits
  3. 09 Apr, 2019 5 commits
    • aarch64/opusdsp: implement NEON accelerated postfilter and deemphasis · 4d2f6215
      Lynne authored
      153372 UNITS in postfilter_c,   65536 runs,      0 skips
      73164 UNITS in postfilter_neon,   65536 runs,      0 skips -> 2.1x speedup
      
      80591 UNITS in deemphasis_c,  131072 runs,      0 skips
      43969 UNITS in deemphasis_neon,  131072 runs,      0 skips -> 1.83x speedup
      
      Total decoder speedup: ~15% on a Raspberry Pi 3 (from 28.1x to 33.5x realtime)
      
      Deemphasis SIMD based on the following unrolling:
      const float c1 = CELT_EMPH_COEFF, c2 = c1*c1, c3 = c2*c1, c4 = c3*c1;
      float state = coeff;
      
      for (int i = 0; i < len; i += 4) {
          y[0] = x[0] + c1*state;
          y[1] = x[1] + c2*state + c1*x[0];
          y[2] = x[2] + c3*state + c1*x[1] + c2*x[0];
          y[3] = x[3] + c4*state + c1*x[2] + c2*x[1] + c3*x[0];
      
          state = y[3];
          y += 4;
          x += 4;
      }
      
      Unlike the x86 version, duplication is used instead of pslldq so
      the structure and tables are different.
      4d2f6215
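The four-way unrolling quoted in the commit message above can be checked against the plain one-sample recursion y[n] = x[n] + c1*y[n-1]. The following is a minimal sketch, not the NEON code: `EMPH`, `deemphasis_ref`, `deemphasis_unrolled` and `max_abs_diff` are hypothetical names, and the coefficient value 0.85 is an assumption chosen for illustration (the commit only names `CELT_EMPH_COEFF`).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed coefficient for illustration only. */
#define EMPH 0.85f

/* Scalar reference: y[n] = x[n] + c1 * y[n-1], with y[-1] = state. */
static float deemphasis_ref(float *y, const float *x, float state, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        state = x[i] + EMPH * state;
        y[i] = state;
    }
    return state;
}

/* Unrolled by four: each output depends only on the inputs of the current
 * block and on the state carried over from the previous block.
 * x and y must not alias, because x[0..2] are read after y[0..2] are written. */
static float deemphasis_unrolled(float *y, const float *x, float state, size_t len)
{
    const float c1 = EMPH, c2 = c1 * c1, c3 = c2 * c1, c4 = c3 * c1;

    assert(len % 4 == 0);
    for (size_t i = 0; i < len; i += 4) {
        y[0] = x[0] + c1 * state;
        y[1] = x[1] + c2 * state + c1 * x[0];
        y[2] = x[2] + c3 * state + c1 * x[1] + c2 * x[0];
        y[3] = x[3] + c4 * state + c1 * x[2] + c2 * x[1] + c3 * x[0];

        state = y[3];
        y += 4;
        x += 4;
    }
    return state;
}

/* Largest absolute difference between two buffers. */
static float max_abs_diff(const float *a, const float *b, size_t len)
{
    float max = 0.0f;
    for (size_t i = 0; i < len; i++) {
        float d = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
        if (d > max)
            max = d;
    }
    return max;
}
```

Running both functions on the same input and comparing the buffers with `max_abs_diff` should give a difference at the level of float rounding, since the unrolled form is the recursion expanded by substitution.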
    • libavutil/hwcontext_opencl: Fix channel order in format support check · 1c50d61a
      Jarek Samic authored
      The `opencl_get_plane_format` function was incorrectly determining the
      value used to set the image channel order. This resulted in all RGB
      pixel formats being set to the `CL_RGBA` pixel format, regardless of
      whether or not they actually *were* RGBA.
      
      This patch fixes the issue by using the `offset` and depth of components
      rather than the loop index to determine the value of `order`.
      Signed-off-by: Jarek Samic <cldfire3@gmail.com>
      Signed-off-by: Mark Thompson <sw@jkqxz.net>
      1c50d61a
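The idea described in the commit message above, deriving the channel order from each component's offset and depth instead of from the loop index, can be sketched roughly as follows. This is an illustration under stated assumptions, not the code of `opencl_get_plane_format`: `struct component`, `enum channel_order` and `order_from_components` are hypothetical stand-ins, and components are assumed to be listed in R, G, B, A order with byte offsets inside the pixel.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real channel-order constants. */
enum channel_order {
    ORDER_UNSUPPORTED,
    ORDER_R,
    ORDER_RG,
    ORDER_RGBA,
    ORDER_BGRA,
    ORDER_ARGB,
    ORDER_ABGR
};

/* Hypothetical component description: byte offset in the pixel, bit depth. */
struct component {
    int offset;
    int depth;
};

/* Build a decimal key whose digits are the 1-based memory positions of the
 * components, taken in R, G, B, A order, then map the key to an order. */
static enum channel_order order_from_components(const struct component *comp, int nb)
{
    int key = 0;

    assert(nb >= 1 && nb <= 4);
    for (int c = 0; c < nb; c++) {
        int bytes = (comp[c].depth + 7) / 8;   /* bytes per component */
        assert(bytes > 0);
        key = key * 10 + comp[c].offset / bytes + 1;
    }

    switch (key) {
    case 1:    return ORDER_R;
    case 12:   return ORDER_RG;
    case 1234: return ORDER_RGBA;
    case 3214: return ORDER_BGRA;   /* R is third in memory, B is first */
    case 2341: return ORDER_ARGB;   /* A is first in memory */
    case 4321: return ORDER_ABGR;
    default:   return ORDER_UNSUPPORTED;
    }
}
```

With a key built from the loop index alone, every four-component format would produce 1234, which matches the symptom described in the commit message: all RGB formats being treated as RGBA.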
    • avformat/matroskaenc: fix leak on error · 1ec777dc
      Tristan Matthews authored
      Signed-off-by: James Almer <jamrial@gmail.com>
      1ec777dc
    • Carl Eugen Hoyos's avatar
      d6a83922
    • lavf/matroskaenc: Fix memory leak after write trailer · 0a347ff4
      Jun Zhao authored
      Fix memory leak after write trailer for #7827: only store an audio
      packet in cur_audio_pkt if its buffer has size greater than zero.
      
      Audio packets with size zero but with side data currently lead to
      memleaks in the Matroska muxer because they are not properly freed.
      
      They are currently put into an AVPacket in the MatroskaMuxContext to
      ensure that the necessary audio is always available for a new cluster,
      but are only written and freed when their size is > 0.
      
      As the only use we have for such packets consists in updating the
      CodecPrivate, it makes no sense to store these packets at all, and this
      is how this commit solves the memleak.
      Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com>
      Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
      0a347ff4
  4. 08 Apr, 2019 1 commit
  5. 07 Apr, 2019 8 commits
    • Paul B Mahol's avatar
      ecdaa4b4
    • Paul B Mahol's avatar
      3a2adeed
    • avfilter/af_asetnsamples: fix sample queuing. · 4c8e3725
      Nikolas Bowe via ffmpeg-devel authored
      When asetnsamples uses output samples < input samples, remaining samples build up in the fifo over time.
      Fix this by marking the filter as ready again if there are enough samples.
      
      Regression since ef3babb2
      Reviewed-by: Paul B Mahol <onemda@gmail.com>
      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
      4c8e3725
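The build-up described in the commit message above can be modelled with a simple counter. This is only a toy model under assumptions, not the filter's code: `leftover_single` and `leftover_drained` are hypothetical names, each activation is assumed to receive one input frame of `in` samples, and "marking the filter as ready again" is modelled as draining the queue in a loop.

```c
#include <assert.h>

/* At most one output frame of `out` samples is emitted per activation,
 * so the surplus stays queued and accumulates when in > out. */
static long leftover_single(int frames, int in, int out)
{
    long fifo = 0;

    assert(in > 0 && out > 0);
    for (int i = 0; i < frames; i++) {
        fifo += in;
        if (fifo >= out)
            fifo -= out;
    }
    return fifo;
}

/* Output frames are emitted as long as enough samples are queued, which
 * is the effect of rescheduling the filter while samples remain. */
static long leftover_drained(int frames, int in, int out)
{
    long fifo = 0;

    assert(in > 0 && out > 0);
    for (int i = 0; i < frames; i++) {
        fifo += in;
        while (fifo >= out)
            fifo -= out;
    }
    return fifo;
}
```

With 1024-sample input frames and 256-sample output frames, the first variant keeps 768 more samples queued after every activation, while the second never holds more than one partial output frame.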
    • swscale/ppc: VSX-optimize yuv2rgb_full_X · 8607e29f
      Lauri Kasanen authored
      ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
                      -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
                      -cpuflags 0 -v error -
      
      32-bit mul, power8 only.
      
      ~6.4x speedup:
      
      rgb24
       214278 UNITS in yuv2packedX,   16384 runs,      0 skips
        33249 UNITS in yuv2packedX,   16384 runs,      0 skips
      bgr24
       214616 UNITS in yuv2packedX,   16384 runs,      0 skips
        33233 UNITS in yuv2packedX,   16384 runs,      0 skips
      rgba
       214517 UNITS in yuv2packedX,   16384 runs,      0 skips
        33271 UNITS in yuv2packedX,   16384 runs,      0 skips
      bgra
       214973 UNITS in yuv2packedX,   16384 runs,      0 skips
        33397 UNITS in yuv2packedX,   16384 runs,      0 skips
      argb
       214613 UNITS in yuv2packedX,   16384 runs,      0 skips
        33310 UNITS in yuv2packedX,   16384 runs,      0 skips
      bgra
       214637 UNITS in yuv2packedX,   16384 runs,      0 skips
        33330 UNITS in yuv2packedX,   16384 runs,      0 skips
      8607e29f
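The speedup figures quoted in these benchmark listings follow from dividing the two UNITS readings of each pair. A minimal sketch, assuming the first line of each pair is the unoptimized run and the second the VSX run (the commit message does not label them); `speedup` is a hypothetical helper.

```c
#include <assert.h>

/* Ratio of timer units before and after the optimization. */
static double speedup(double units_before, double units_after)
{
    assert(units_after > 0.0);
    return units_before / units_after;
}
```

For the rgb24 pair above, `speedup(214278, 33249)` comes out between 6.4 and 6.5, in line with the "~6.4x" stated in the commit message.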
    • swscale/ppc: VSX-optimize yuv2rgb_full_2 · 3256e949
      Lauri Kasanen authored
      ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags area \
                  -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
                  -cpuflags 0 -v error -
      
      32-bit mul, power8 only.
      
      ~4x speedup:
      
      rgb24
        52763 UNITS in yuv2packed2,   16384 runs,      0 skips
        13453 UNITS in yuv2packed2,   16384 runs,      0 skips
      bgr24
        53144 UNITS in yuv2packed2,   16384 runs,      0 skips
        13616 UNITS in yuv2packed2,   16384 runs,      0 skips
      rgba
        52796 UNITS in yuv2packed2,   16384 runs,      0 skips
        12904 UNITS in yuv2packed2,   16384 runs,      0 skips
      bgra
        52732 UNITS in yuv2packed2,   16384 runs,      0 skips
        13262 UNITS in yuv2packed2,   16384 runs,      0 skips
      argb
        52661 UNITS in yuv2packed2,   16384 runs,      0 skips
        12879 UNITS in yuv2packed2,   16384 runs,      0 skips
      bgra
        52662 UNITS in yuv2packed2,   16384 runs,      0 skips
        12932 UNITS in yuv2packed2,   16384 runs,      0 skips
      3256e949
    • swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_1 · 50e672bc
      Lauri Kasanen authored
      ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
              -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \
              -cpuflags 0 -v error -
      
      32-bit mul, power8 only.
      
      1.8-2.3x speedup:
      
      rgb24
        18192 UNITS in yuv2packed1,   32767 runs,      1 skips
         9983 UNITS in yuv2packed1,   32760 runs,      8 skips
      bgr24
        18665 UNITS in yuv2packed1,   32766 runs,      2 skips
         9925 UNITS in yuv2packed1,   32763 runs,      5 skips
      rgba
        20239 UNITS in yuv2packed1,   32767 runs,      1 skips
         8794 UNITS in yuv2packed1,   32759 runs,      9 skips
      bgra
        20354 UNITS in yuv2packed1,   32768 runs,      0 skips
         8770 UNITS in yuv2packed1,   32761 runs,      7 skips
      argb
        20185 UNITS in yuv2packed1,   32768 runs,      0 skips
         8761 UNITS in yuv2packed1,   32761 runs,      7 skips
      bgra
        20360 UNITS in yuv2packed1,   32766 runs,      2 skips
         8759 UNITS in yuv2packed1,   32764 runs,      4 skips
      
      This is a low speedup, but the x86 mmx version also gets only ~2x. The mmx version
      is also heavily inaccurate, while the vsx version has high accuracy.
      50e672bc
    • doc/examples/metadata: fix the example can't dump FLV metadata · 7c187514
      Jun Zhao authored
      Fix the example so that it can dump FLV metadata.
      Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
      7c187514
    • Carl Eugen Hoyos's avatar
  6. 06 Apr, 2019 2 commits
    • lavf/flvdec: added support for KUX container · 208ae228
      Swaraj Hota authored
      Fixes ticket #4519.
      
      The metadata starting at 0xe00004 is encrypted
      with the password "meta" but zlib does not
      support decryption, so no kux metadata is read.
      208ae228
    • lavd/x11grab: fix vertical repositioning · f4f40cbb
      Octavio Alvarez authored
      There is a calculation error in xcbgrab_reposition() that breaks
      vertical repositioning on follow_mouse. It made the bottom
      reposition occur only once the mouse had moved more than N pixels
      below the bottom edge of the capture, instead of N pixels before it.
      
      This commit fixes the calculation to match the documentation.
      
      follow_mouse: centered or number of pixels. The documentation says:
      
      When it is specified with "centered", the grabbing region follows
      the mouse pointer and keeps the pointer at the center of region;
      otherwise, the region follows only when the mouse pointer reaches
      within PIXELS (greater than zero) to the edge of region.
      f4f40cbb
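The documented behaviour quoted above, where the region follows once the pointer comes within PIXELS of an edge, can be sketched for a single axis as follows. This is an illustration under assumptions, not the code of xcbgrab_reposition(): `reposition_axis` is a hypothetical helper, and clamping the region to the screen is left out.

```c
#include <assert.h>

/* Return the new origin of a region [origin, origin + size) on one axis.
 * The region moves as soon as the pointer is closer than `border` pixels
 * to either edge, measured from inside the region. */
static int reposition_axis(int origin, int size, int pos, int border)
{
    int near_edge = origin + border;          /* threshold at the top/left */
    int far_edge  = origin + size - border;   /* threshold at the bottom/right */

    assert(size > 0 && border > 0 && 2 * border < size);
    if (pos > far_edge)
        origin += pos - far_edge;
    else if (pos < near_edge)
        origin -= near_edge - pos;
    return origin;
}
```

Using `origin + size + border` for the far threshold would delay the move until the pointer is already outside the region, which is the kind of error the commit message describes for the bottom edge.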
  7. 05 Apr, 2019 3 commits
  8. 04 Apr, 2019 1 commit
  9. 03 Apr, 2019 7 commits
  10. 02 Apr, 2019 5 commits