1. 22 Aug, 2019 2 commits
    • avutil/imgutils: remove dead assignment · b2e37e3e
      Marton Balint authored
      Signed-off-by: Marton Balint <cus@passwd.hu>
      b2e37e3e
    • Add assembly support for -fsanitize=hwaddress tagged globals. · 9bcb1cb6
      Peter Collingbourne authored
      As of LLVM r368102, Clang will set a pointer tag in bits 56-63 of the
      address of a global when compiling with -fsanitize=hwaddress. This requires
      an adjustment to assembly code that takes the address of such globals: the
      code cannot use the regular R_AARCH64_ADR_PREL_PG_HI21 relocation to refer
      to the global, since the tag would take the address out of range. Instead,
      the code must use the non-checking (_NC) variant of the relocation (the
      link-time check is substituted by a runtime check).
      
      This change makes the necessary adjustment in the movrel macro, where it is
      needed when compiling with -fsanitize=hwaddress.
      Signed-off-by: Peter Collingbourne <pcc@google.com>
      Reviewed-by: Martin Storsjö
      Reviewed-by: Janne Grunau
      9bcb1cb6
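The range problem described in this commit can be illustrated with a small numeric sketch. It assumes the checking relocation requires the page delta between the symbol and the referencing instruction to fit in a signed range of about ±4 GiB; the function names and the example addresses are hypothetical and are not taken from the patch.

```python
PAGE_MASK = 0xFFF  # low 12 bits select the byte within a 4 KiB page


def page(addr):
    """Return the address rounded down to its 4 KiB page."""
    return addr & ~PAGE_MASK


def tag_address(addr, tag):
    """Place an 8-bit tag in bits 56-63 of an address (illustrative model)."""
    return (addr & ((1 << 56) - 1)) | ((tag & 0xFF) << 56)


def page_delta_in_range(symbol, place):
    """Model of the link-time overflow check: the page delta must fit in +/-4 GiB."""
    delta = page(symbol) - page(place)
    return -(1 << 32) <= delta < (1 << 32)


# Hypothetical addresses: an instruction and a nearby global.
code_addr = 0x400000
global_addr = 0x412340
tagged_addr = tag_address(global_addr, 0x2A)
```

With these values the untagged global passes the check, while the tagged address fails it even though its low 56 bits are unchanged, which is why a non-checking variant is needed.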
  2. 14 Aug, 2019 1 commit
  3. 13 Aug, 2019 1 commit
    • avutil/mips: refine msa macros CLIP_*. · a3e572d9
      gxw authored
      Details of the changes:
      1. Remove the local variable 'out_m' in 'CLIP_SH' and store the result in
         the source vector.
      2. Refine the implementation of the macros 'CLIP_SH_0_255' and 'CLIP_SW_0_255'.
         VP8 decoding is about 1.1% faster (from 7.03x to 7.11x).
         H264 decoding is about 0.5% faster (from 4.35x to 4.37x).
         Theora decoding is about 0.7% faster (from 5.79x to 5.83x).
      3. Remove the redundant macros 'CLIP_SH/Wn_0_255_MAX_SATU' and use 'CLIP_SH/Wn_0_255'
         instead, because the two sets of macros have the same effect.
      Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
      a3e572d9
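The claim that the two macro families have the same effect can be checked with a scalar model. The macro bodies are not shown here, so the semantics below are an assumption inferred from the macro names: one variant clamps each lane to 0..255 directly, the other takes the maximum with zero and then saturates to an unsigned 8-bit range. The function names are hypothetical.

```python
def clip_0_255(lanes):
    """Clamp every signed lane to the range 0..255."""
    return [min(max(x, 0), 255) for x in lanes]


def max_then_saturate_u8(lanes):
    """Clamp negatives to zero, then saturate each lane as an unsigned 8-bit value."""
    non_negative = [max(x, 0) for x in lanes]
    return [x if x < 256 else 255 for x in non_negative]


# Every value a signed 16-bit lane can hold.
all_int16 = list(range(-32768, 32768))
```

Under this model both functions agree on every signed 16-bit input, which is consistent with removing one of the two variants.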
  4. 02 Aug, 2019 2 commits
  5. 30 Jul, 2019 1 commit
  6. 21 Jul, 2019 3 commits
  7. 18 Jul, 2019 1 commit
    • avutil/mips: refactor msa load and store macros. · 153c6075
      Shiyou Yin authored
      Replace STnxm_UB and LDnxm_SH with new macros ST_{H/W/D}{1/2/4/8}.
      The old macros are difficult to use because they don't follow the same parameter passing rules.
      Details of the changes:
      1. remove LD4x4_SH.
      2. replace ST2x4_UB with ST_H4.
      3. replace ST4x2_UB with ST_W2.
      4. replace ST4x4_UB with ST_W4.
      5. replace ST4x8_UB with ST_W8.
      6. replace ST6x4_UB with ST_W2 and ST_H2.
      7. replace ST8x1_UB with ST_D1.
      8. replace ST8x2_UB with ST_D2.
      9. replace ST8x4_UB with ST_D4.
      10. replace ST8x8_UB with ST_D8.
      11. replace ST12x4_UB with ST_D4 and ST_W4.
      
      Example of a new macro: ST_H4(in, idx0, idx1, idx2, idx3, pdst, stride)
      ST_H4 stores four half-word elements of vector 'in' to pdst with stride.
      About the macro name:
      1) 'ST' means store operation.
      2) 'H/W/D' means the type of the vector element is 'half-word/word/double-word'.
      3) The number '1/2/4/8' means how many elements will be stored.
      About the macro parameters:
      1) 'in0, in1...' 128-bit vectors.
      2) 'idx0, idx1...' element indices.
      3) 'pdst' destination pointer to store to.
      4) 'stride' stride of each store operation.
      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
      153c6075
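The parameter convention described for ST_H4 can be modelled in scalar form. This is only a sketch of the described behaviour, not the MSA macro itself: it assumes a vector of eight signed 16-bit lanes, little-endian storage, and a stride measured in bytes between consecutive stores. The extra `offset` argument stands in for the `pdst` pointer.

```python
import struct


def st_h4(vec, idx0, idx1, idx2, idx3, dst, offset, stride):
    """Store four selected half-word lanes of vec into dst, one per stride step."""
    for row, idx in enumerate((idx0, idx1, idx2, idx3)):
        # Each store writes one 16-bit element, then advances by `stride` bytes.
        struct.pack_into("<h", dst, offset + row * stride, vec[idx])


# Hypothetical data: an 8-lane vector and a 32-byte destination buffer.
vector = [10, 11, 12, 13, 14, 15, 16, 17]
buffer = bytearray(32)
st_h4(vector, 0, 1, 2, 3, buffer, 0, 8)
stored = [struct.unpack_from("<h", buffer, row * 8)[0] for row in range(4)]
```

Passing the lane indices explicitly is what lets one macro family replace the older shape-specific names such as ST2x4_UB.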
  8. 11 Jul, 2019 1 commit
  9. 10 Jul, 2019 1 commit
  10. 07 Jul, 2019 1 commit
  11. 29 Jun, 2019 1 commit
  12. 16 Jun, 2019 1 commit
  13. 02 Jun, 2019 5 commits
    • hwcontext_qsv: Try to select a matching VAAPI device by default · 468f0038
      Mark Thompson authored
      Tries to find a device backed by the i915 kernel driver and loads the iHD
      VAAPI driver to use with it.  This reduces confusion on machines with
      multiple DRM devices and removes the surprising requirement to set the
      LIBVA_DRIVER_NAME environment variable to use libmfx at all.
      468f0038
    • hwcontext_vaapi: Try to create devices via DRM before X11 · 0b4696fb
      Mark Thompson authored
      Opening the device via X11 (DRI2/DRI3) rather than opening a DRM render
      node directly is only useful if you intend to use the legacy X11 interop
      functions.  That's never true for the ffmpeg utility, and a library user
      who does want this will likely provide their own display instance rather
      than making a new one here.
      0b4696fb
    • hwcontext_vaapi: Add option to set driver name · 7f3f5a24
      Mark Thompson authored
      For example: -init_hw_device vaapi:/dev/dri/renderD128,driver=foo
      
      This may be more convenient than using the environment variable, and allows
      loading different drivers for different devices in the same process.
      7f3f5a24
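The shape of the device string in the example can be shown with a small parser. This is a simplified sketch, not FFmpeg's own option parsing: it handles only the `type:device,key=value` form seen in these commit messages and ignores other variants of the syntax. The function name is hypothetical.

```python
def parse_hw_device_spec(spec):
    """Split 'type:device,key=value,...' into (type, device or None, options)."""
    dev_type, _, rest = spec.partition(":")
    device, *pairs = rest.split(",")
    # Each remaining field is expected to look like key=value.
    options = dict(pair.split("=", 1) for pair in pairs)
    return dev_type, device or None, options


with_driver = parse_hw_device_spec("vaapi:/dev/dri/renderD128,driver=foo")
no_device = parse_hw_device_spec("vaapi:,kernel_driver=amdgpu")
```

An empty device field, as in the second call, leaves the device unset so that only the options constrain the selection.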
    • hwcontext_vaapi: Make default DRM device selection more helpful · 6b6b8a63
      Mark Thompson authored
      Iterate over available render devices and pick the first one which looks
      usable.  Adds an option to specify the name of the kernel driver associated
      with the desired device, so that it is possible to select a specific type
      of device in a multiple-device system without knowing the card numbering.
      
      For example: -init_hw_device vaapi:,kernel_driver=amdgpu will select only
      devices using the "amdgpu" driver (as used with recent AMD graphics cards).
      
      Kernel driver selection requires libdrm to work.
      6b6b8a63
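The selection rule described in this commit can be sketched as a filter over candidate render nodes. The real code queries the system through libdrm and checks whether each device is usable; here the candidates are plain `(path, kernel_driver)` pairs supplied by the caller, and both the function name and the device list are hypothetical.

```python
def pick_render_device(devices, kernel_driver=None):
    """Return the first device path whose kernel driver matches, or None."""
    for path, driver in devices:
        # Without a requested driver, the first candidate is accepted.
        if kernel_driver is None or driver == kernel_driver:
            return path
    return None


# Hypothetical two-GPU system.
candidates = [
    ("/dev/dri/renderD128", "i915"),
    ("/dev/dri/renderD129", "amdgpu"),
]
```

Selecting by driver name rather than by node path is what avoids depending on the card numbering.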
    • hwcontext_vaapi: Add option to specify connection type · d2141a9b
      Mark Thompson authored
      Can be set to "drm" or "x11" to force a specific connection type.
      d2141a9b
  14. 01 Jun, 2019 1 commit
  15. 16 May, 2019 2 commits
  16. 15 May, 2019 2 commits
    • libavutil: add an FFT & MDCT implementation · b79b29dd
      Lynne authored
      This commit adds a new API to libavutil to allow for arbitrary transformations
      on various types of data.
      This is a partly new implementation, with the power of two transforms taken
      from libavcodec/fft_template, the 5 and 15-point FFT taken from mdct15, while
      the 3-point FFT was written from scratch.
      The (i)mdct folding code is taken from mdct15 as well, as the mdct_template
      code was somewhat old, messy and not easy to separate.
      
      A notable feature of this implementation is that it allows for 3xM and 5xM
      based transforms, where M is a power of two, e.g. 384, 640, 768, 1280, etc.
      AC-4 uses 3xM transforms while Siren uses 5xM transforms, so the code will
      allow for decoding of such streams.
      A non-exhaustive list of supported sizes:
      4, 8, 12, 16, 20, 24, 32, 40, 48, 60, 64, 80, 96, 120, 128, 160, 192, 240,
      256, 320, 384, 480, 512, 640, 768, 960, 1024, 1280, 1536, 1920, 2048, 2560...
      
      The API was designed such that it allows for not only 1D transforms but also
      2D transforms of certain block sizes. This was partly by accident, as the stride
      argument is required for Opus MDCTs, but can be used in the context of a 2D
      transform as well.
      Other data types, such as "double" and "int32_t", are expected to be
      implemented eventually as well.
      
      Some performance comparisons with libfftw3f (SIMD disabled for both):
      120:
        22353 decicycles in     fftwf_execute,     1024 runs,      0 skips
        21836 decicycles in compound_fft_15x8,     1024 runs,      0 skips
      
      128:
        22003 decicycles in       fftwf_execute,   1024 runs,      0 skips
        23132 decicycles in monolithic_fft_ptwo,   1024 runs,      0 skips
      
      384:
        75939 decicycles in      fftwf_execute,    1024 runs,      0 skips
        73973 decicycles in compound_fft_3x128,    1024 runs,      0 skips
      
      640:
       104354 decicycles in       fftwf_execute,   1024 runs,      0 skips
       149518 decicycles in compound_fft_5x128,    1024 runs,      0 skips
      
      768:
       109323 decicycles in      fftwf_execute,    1024 runs,      0 skips
       164096 decicycles in compound_fft_3x256,    1024 runs,      0 skips
      
      960:
       186210 decicycles in      fftwf_execute,    1024 runs,      0 skips
       215256 decicycles in compound_fft_15x64,    1024 runs,      0 skips
      
      1024:
       163464 decicycles in       fftwf_execute,   1024 runs,      0 skips
       199686 decicycles in monolithic_fft_ptwo,   1024 runs,      0 skips
      
      With SIMD we should be faster than fftw for 15xM transforms as our fft15 SIMD
      is around 2x faster than theirs, even if our ptwo SIMD is slightly slower.
      
      The goal is to remove the libavcodec/mdct15 code and deprecate the
      libavcodec/avfft interface once aarch64 and x86 SIMD code has been ported.
      New code throughout the project should use this API.
      
      The implementation passes fate when used in Opus, AAC and Vorbis, and the output
      is identical with ATRAC9 as well.
      b79b29dd
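The list of supported sizes in this commit message follows a simple pattern that can be reproduced in a few lines. The sketch assumes that a size is a power of two M, optionally multiplied by 3, 5 or 15, and that the smallest M is 4. The lower bound is inferred from the listed sizes, not stated in the message, and the function name is hypothetical.

```python
def supported_sizes(limit):
    """List transform lengths of the form factor * M up to limit, M a power of two >= 4."""
    sizes = set()
    for factor in (1, 3, 5, 15):
        m = 4
        while factor * m <= limit:
            sizes.add(factor * m)
            m *= 2
    return sorted(sizes)


sizes_up_to_2560 = supported_sizes(2560)
```

Under these assumptions the result up to 2560 matches the sizes listed in the commit message, for example 384 = 3x128 and 960 = 15x64.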
  17. 12 May, 2019 1 commit
    • avutil: Add NV24 and NV42 pixel formats · 5de4f1d8
      Philip Langdale authored
      These are the 4:4:4 variants of the semi-planar NV12/NV21 formats.
      
      These formats are not used much, so we've never had a reason to add
      them until now. VDPAU recently added support for HEVC 4:4:4 content
      and when you use the OpenGL interop, the returned surfaces are in
      NV24 format, so we need the pixel format for media players, even
      if there's no direct use within ffmpeg.
      
      Separately, there are apparently webcams that use NV24, but I've
      never seen one.
      5de4f1d8
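The difference between the 4:2:0 and 4:4:4 semi-planar layouts can be seen from the plane sizes. This is a sketch assuming 8-bit samples and no row padding; the function names are hypothetical. NV24 stores chroma as interleaved U,V pairs, and NV42 swaps the order to V,U, in the same way that NV21 relates to NV12.

```python
def semi_planar_plane_sizes(width, height, shift_w, shift_h):
    """Return (luma_bytes, chroma_bytes) for an 8-bit semi-planar format."""
    luma = width * height
    # Chroma dimensions are rounded up after subsampling.
    chroma_w = (width + (1 << shift_w) - 1) >> shift_w
    chroma_h = (height + (1 << shift_h) - 1) >> shift_h
    # Two chroma samples (U and V) are interleaved per chroma position.
    return luma, 2 * chroma_w * chroma_h


def interleave_chroma(u, v, swapped=False):
    """Interleave U and V samples; swapped=True gives the V,U order."""
    first, second = (v, u) if swapped else (u, v)
    return [sample for pair in zip(first, second) for sample in pair]


nv12_sizes = semi_planar_plane_sizes(1920, 1080, 1, 1)
nv24_sizes = semi_planar_plane_sizes(1920, 1080, 0, 0)
```

At 1920x1080 the NV24 chroma plane is twice the size of the luma plane, compared with half the size for NV12.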
  18. 05 May, 2019 1 commit
    • avutil/hwcontext_vdpau: Map 444 pix fmts to new VdpYCbCr types · d617d54e
      ManojGuptaBonda authored
      The new VdpYCbCr formats VDP_YCBCR_FORMAT_Y_U_V_444 and
      VDP_YCBCR_FORMAT_Y_UV_444 have been added to VDPAU with libvdpau-1.2,
      to be used in get/putbits for YUV 4:4:4 surfaces. The earlier mapping of
      AV_PIX_FMT_YUV444P to VDP_YCBCR_FORMAT_YV12 is not valid.
      
      Hence this change maps AV_PIX_FMT_YUV444P to VDP_YCBCR_FORMAT_Y_U_V_444
      to access the YUV 4:4:4 surface via the read-back APIs of VDPAU.
      d617d54e
  19. 30 Apr, 2019 1 commit
  20. 24 Apr, 2019 1 commit
  21. 19 Apr, 2019 2 commits
    • lavu/hwcontext_d3d: Cast src pointers calling av_image_copy*(). · a24a1523
      Carl Eugen Hoyos authored
      Silences several warnings:
      libavutil/hwcontext_d3d11va.c:413:49: warning: passing argument 3 of ‘av_image_copy’ from incompatible pointer type
      libavutil/hwcontext_d3d11va.c:425:47: warning: passing argument 3 of ‘av_image_copy’ from incompatible pointer type
      libavutil/hwcontext_dxva2.c:351:45: warning: passing argument 3 of ‘av_image_copy’ from incompatible pointer type
      libavutil/hwcontext_dxva2.c:382:52: warning: passing argument 3 of ‘av_image_copy_uc_from’ from incompatible pointer type
      a24a1523
    • Gyan Doshi
      3bef1dab
  22. 16 Apr, 2019 4 commits
  23. 09 Apr, 2019 1 commit
  24. 30 Mar, 2019 1 commit
  25. 22 Mar, 2019 1 commit
  26. 17 Mar, 2019 1 commit