Commits · 5231e89eb9eedc119d4f762469355f83e3628f20 · Linshizhi / ffmpeg.wasm-core

22 Aug, 2019 2 commits

avutil/imgutils: remove dead assignment · b2e37e3e
Marton Balint authored 5 years ago
```
Signed-off-by: Marton Balint <cus@passwd.hu>
```
b2e37e3e

Add assembly support for -fsanitize=hwaddress tagged globals. · 9bcb1cb6

Peter Collingbourne authored 5 years ago

As of LLVM r368102, Clang will set a pointer tag in bits 56-63 of the
address of a global when compiling with -fsanitize=hwaddress. This requires
an adjustment to assembly code that takes the address of such globals: the
code cannot use the regular R_AARCH64_ADR_PREL_PG_HI21 relocation to refer
to the global, since the tag would take the address out of range. Instead,
the code must use the non-checking (_NC) variant of the relocation (the
link-time check is substituted by a runtime check).

This change makes the necessary adjustment in the movrel macro, where it is
needed when compiling with -fsanitize=hwaddress.
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Martin Storsjö
Reviewed-by: Janne Grunau
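
To make the range problem concrete, the following C sketch (not FFmpeg or LLVM code) models a tag in bits 56-63 and the +/-4 GiB page-relative reach that the checked relocation enforces. The helper names and the sample addresses are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of FFmpeg or LLVM. */

/* Place an 8-bit tag in bits 56-63 of a 64-bit address. */
static uint64_t tag_address(uint64_t addr, uint8_t tag)
{
    return (addr & 0x00FFFFFFFFFFFFFFULL) | ((uint64_t)tag << 56);
}

/* Recover the tag from a tagged address. */
static uint8_t address_tag(uint64_t addr)
{
    return (uint8_t)(addr >> 56);
}

/* Strip the tag, leaving the low 56 address bits. */
static uint64_t untag_address(uint64_t addr)
{
    return addr & 0x00FFFFFFFFFFFFFFULL;
}

/* ADRP reaches targets whose page lies within +/-4 GiB of the current page. */
static int within_adrp_range(uint64_t pc, uint64_t target)
{
    int64_t page_delta = (int64_t)((target & ~0xFFFULL) - (pc & ~0xFFFULL));
    return page_delta >= -(1LL << 32) && page_delta < (1LL << 32);
}
```

With a nonzero tag, the tagged address fails within_adrp_range() even though the untagged address is close by, which is why the link-time range check has to be dropped for tagged globals.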

9bcb1cb6

14 Aug, 2019 1 commit
- avutil/mips: remove redundant code in TRANSPOSE16x8_UB_UB. · e1039b09
  Shiyou Yin authored 5 years ago
```
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
  e1039b09
13 Aug, 2019 1 commit

avutil/mips: refine msa macros CLIP_*. · a3e572d9

gxw authored 5 years ago

Details of the changes:
1. Remove the local variable 'out_m' in 'CLIP_SH' and store the result in
the source vector.
2. Refine the implementation of the macros 'CLIP_SH_0_255' and 'CLIP_SW_0_255'.
VP8 decoding is about 1.1% faster (from 7.03x to 7.11x).
H264 decoding is about 0.5% faster (from 4.35x to 4.37x).
Theora decoding is about 0.7% faster (from 5.79x to 5.83x).
3. Remove the redundant macros 'CLIP_SH/Wn_0_255_MAX_SATU' and use 'CLIP_SH/Wn_0_255'
instead, because the two sets of macros have the same effect.
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
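
As a rough scalar model of what a 0..255 clip does to each 16-bit lane, and of storing the result back into the source, consider the C sketch below. It is not the MSA macro itself; the function names are hypothetical, and the real macros operate on 128-bit vectors.

```c
#include <stdint.h>

/* Scalar stand-in for one lane of a 0..255 clip; the name is hypothetical. */
static int16_t clip_lane_0_255(int16_t v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return v;
}

/* Clip eight 16-bit lanes in place, writing the result back to the source
 * (mirroring the "store the result in the source vector" change). */
static void clip_sh_0_255_model(int16_t lanes[8])
{
    for (int i = 0; i < 8; i++)
        lanes[i] = clip_lane_0_255(lanes[i]);
}
```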

a3e572d9

02 Aug, 2019 2 commits

avutil/mips: Avoid instruction exception caused by gssqc1/gslqc1. · 11f99a9a
Shiyou Yin authored 5 years ago
```
Ensure the addresses accessed by gssqc1/gslqc1 are 16-byte aligned.
```
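
A 16-byte alignment requirement of this kind can be checked in C as sketched below. The helper name is hypothetical and this is not the check used in the patch.

```c
#include <stdint.h>

/* Returns nonzero when ptr sits on a 16-byte boundary. */
static int is_aligned_16(const void *ptr)
{
    return ((uintptr_t)ptr & 15) == 0;
}
```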
11f99a9a

lavu/tx: add support for double precision FFT and MDCT · 42e2319b

Lynne authored 5 years ago

Simply moves and templates the actual transforms to support an
additional data type.
Unlike the float version, which is equal to or better than libfftw3f,
double precision output is bit identical with libfftw3.

42e2319b

30 Jul, 2019 1 commit

lavu/hwcontext_qsv: fix the memory leak · b3b7523f

Linjie Fu authored 5 years ago

Free child_device_opts with av_dict_free() to fix the memory leak.
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Zhong Li <zhong.li@intel.com>

b3b7523f

21 Jul, 2019 3 commits
- Bump minor versions again on master to keep 4.2 versions separate from master · 80bb65fa
  Michael Niedermayer authored 5 years ago
```
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
  80bb65fa
- Bump minor versions to separate 4.2 from master · 22db337a
  Michael Niedermayer authored 5 years ago
```
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
  22db337a
- avutil/softfloat_ieee754: Fix odd bit position for exponent and sign in av_bits2sf_ieee754() · 82e389d0
  Michael Niedermayer authored 5 years ago
```
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
  82e389d0
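
For reference, the standard 32-bit IEEE 754 layout puts the sign in bit 31, the exponent in bits 23-30 and the mantissa in bits 0-22. The C sketch below extracts those fields; it is not the code of av_bits2sf_ieee754(), and the helper names are hypothetical.

```c
#include <stdint.h>

/* Field extraction for a 32-bit IEEE 754 (binary32) bit pattern. */

/* Sign: bit 31. */
static uint32_t ieee754_sign(uint32_t bits)
{
    return bits >> 31;
}

/* Biased exponent: bits 23-30. */
static uint32_t ieee754_exponent(uint32_t bits)
{
    return (bits >> 23) & 0xFF;
}

/* Mantissa (fraction): bits 0-22. */
static uint32_t ieee754_mantissa(uint32_t bits)
{
    return bits & 0x7FFFFF;
}
```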
18 Jul, 2019 1 commit

avutil/mips: refactor msa load and store macros. · 153c6075

Shiyou Yin authored 5 years ago

Replace STnxm_UB and LDnxm_SH with new macros ST_{H/W/D}{1/2/4/8}.
The old macros are difficult to use because they don't follow the same parameter passing rules.
Details of the changes:
1. remove LD4x4_SH.
2. replace ST2x4_UB with ST_H4.
3. replace ST4x2_UB with ST_W2.
4. replace ST4x4_UB with ST_W4.
5. replace ST4x8_UB with ST_W8.
6. replace ST6x4_UB with ST_W2 and ST_H2.
7. replace ST8x1_UB with ST_D1.
8. replace ST8x2_UB with ST_D2.
9. replace ST8x4_UB with ST_D4.
10. replace ST8x8_UB with ST_D8.
11. replace ST12x4_UB with ST_D4 and ST_W4.

Example of a new macro: ST_H4(in, idx0, idx1, idx2, idx3, pdst, stride)
ST_H4 stores four half-word elements of vector 'in' to pdst with the given stride.
About the macro names:
1) 'ST' means a store operation.
2) 'H/W/D' means the vector element type is 'half-word/word/double-word'.
3) The number '1/2/4/8' is how many elements are stored.
About the macro parameters:
1) 'in0, in1...': 128-bit vectors.
2) 'idx0, idx1...': element indices.
3) 'pdst': destination pointer to store to.
4) 'stride': stride of each store operation.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
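
The behaviour described for ST_H4 can be modelled in plain C as below. This is a scalar sketch, not the MSA macro: the vec8h_model type and st_h4_model name are hypothetical, and treating 'stride' as a byte count is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of a 128-bit vector viewed as eight half-word elements. */
typedef struct {
    uint16_t h[8];
} vec8h_model;

/* Hypothetical model of ST_H4: store half-word elements idx0..idx3 of 'in'
 * to pdst, advancing by 'stride' bytes between stores (assumed unit). */
static void st_h4_model(const vec8h_model *in, int idx0, int idx1, int idx2,
                        int idx3, uint8_t *pdst, int stride)
{
    const int idx[4] = { idx0, idx1, idx2, idx3 };

    for (int i = 0; i < 4; i++)
        memcpy(pdst + i * stride, &in->h[idx[i]], sizeof(uint16_t));
}
```

Under this model, ST_W2 or ST_D4 would differ only in the element width and the number of stores.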

153c6075

11 Jul, 2019 1 commit
- avutil/hwcontext_vaapi: move kernel_driver into CONFIG_LIBDRM · 1498e394
  Steven Liu authored 5 years ago
```
Reviewed-by: Zhong Li <zhong.li@intel.com>
Signed-off-by: Steven Liu <lq@onvideo.cn>
```
  1498e394
10 Jul, 2019 1 commit

avutil/mips: optimize UNPCK&SAD macros with MSA2.0 instruction. · a45e8ade

Shiyou Yin authored 5 years ago

Loongson 3A4000 and 2k1000 support MSA2.0.
This patch optimizes SAD_UB2_UH, UNPCK_R_SH_SW, UNPCK_SB_SH and UNPCK_SH_SW with MSA2.0 instructions.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>

a45e8ade

07 Jul, 2019 1 commit

lavu/frame: Improve ROI documentation · 451a5112

Mark Thompson authored 5 years ago

Clarify and add examples for the behaviour of the quantisation offset,
and define how multiple ranges should be handled.

451a5112

29 Jun, 2019 1 commit
- avutil: add FF_DECODE_ERROR_DECODE_SLICES for AVFrame.decode_error_flags · a30e4409
  Amir Pauker authored 5 years ago
```
Signed-off-by: Amir Pauker <amir@livelyvideo.tv>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
  a30e4409
16 Jun, 2019 1 commit

avutil: add FF_DECODE_ERROR_CONCEALMENT_ACTIVE flag for AVFrame.decode_error_flags · edfced8c

Amir Pauker authored 5 years ago

FF_DECODE_ERROR_CONCEALMENT_ACTIVE is set when the decoded frame has error(s) but
avcodec_receive_frame returns zero, i.e. the errors were concealed.
Signed-off-by: Amir Pauker <amir@livelyvideo.tv>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
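
A caller might test for this condition roughly as follows. The sketch is self-contained, so it uses a local stand-in for the flag; the numeric value 4 and the helper name are assumptions, and real code should include libavutil/frame.h and use FF_DECODE_ERROR_CONCEALMENT_ACTIVE directly.

```c
/* Stand-in for the flag from libavutil/frame.h. The value 4 is an
 * assumption made for this sketch; real code should include the header. */
#define DEMO_DECODE_ERROR_CONCEALMENT_ACTIVE 4

/* Nonzero when a frame was returned successfully (receive_ret == 0) but its
 * decode_error_flags report that errors were concealed. */
static int frame_has_concealed_errors(int receive_ret, int decode_error_flags)
{
    return receive_ret == 0 &&
           (decode_error_flags & DEMO_DECODE_ERROR_CONCEALMENT_ACTIVE) != 0;
}
```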

edfced8c

02 Jun, 2019 5 commits

hwcontext_qsv: Try to select a matching VAAPI device by default · 468f0038

Mark Thompson authored 5 years ago

Tries to find a device backed by the i915 kernel driver and loads the iHD
VAAPI driver to use with it. This reduces confusion on machines with
multiple DRM devices and removes the surprising requirement to set the
LIBVA_DRIVER_NAME environment variable to use libmfx at all.

468f0038

hwcontext_vaapi: Try to create devices via DRM before X11 · 0b4696fb

Mark Thompson authored 5 years ago

Opening the device via X11 (DRI2/DRI3) rather than opening a DRM render
node directly is only useful if you intend to use the legacy X11 interop
functions. That's never true for the ffmpeg utility, and a library user
who does want this will likely provide their own display instance rather
than making a new one here.

0b4696fb

hwcontext_vaapi: Add option to set driver name · 7f3f5a24

Mark Thompson authored 6 years ago

For example: -init_hw_device vaapi:/dev/dri/renderD128,driver=foo

This may be more convenient than using the environment variable, and allows
loading different drivers for different devices in the same process.

7f3f5a24

hwcontext_vaapi: Make default DRM device selection more helpful · 6b6b8a63

Mark Thompson authored 5 years ago

Iterate over available render devices and pick the first one which looks
usable. Adds an option to specify the name of the kernel driver associated
with the desired device, so that it is possible to select a specific type
of device in a multiple-device system without knowing the card numbering.

For example: -init_hw_device vaapi:,kernel_driver=amdgpu will select only
devices using the "amdgpu" driver (as used with recent AMD graphics cards).

Kernel driver selection requires libdrm to work.

6b6b8a63

hwcontext_vaapi: Add option to specify connection type · d2141a9b
Mark Thompson authored 5 years ago
```
Can be set to "drm" or "x11" to force a specific connection type.
```
d2141a9b

01 Jun, 2019 1 commit
- avutil/dynarry.h: fix comment grammar mistakes of FF_DYNARRAY_ADD · 76ef18fd
  Steven Liu authored 5 years ago
```
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
```
  76ef18fd
16 May, 2019 2 commits

avutil/tx: should check against (*ctx) · 65646db8

Ruiling Song authored 5 years ago

ctx is a pointer to pointer here.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>

65646db8

avutil/tx: fix forward compound non-mod-15 based MDCTs · 60445349

Lynne authored 5 years ago

There was a hardcoded value left. Wasn't caught earlier as no code uses
compound forward mod-3/5 MDCTs yet.

60445349

15 May, 2019 2 commits

lavu: bump minor and update APIchanges for the new transform API · 87ee9d58
Lynne authored 5 years ago

87ee9d58

libavutil: add an FFT & MDCT implementation · b79b29dd

Lynne authored 5 years ago

This commit adds a new API to libavutil to allow for arbitrary transformations
on various types of data.
This is a partly new implementation, with the power of two transforms taken
from libavcodec/fft_template, the 5 and 15-point FFT taken from mdct15, while
the 3-point FFT was written from scratch.
The (i)mdct folding code is taken from mdct15 as well, as the mdct_template
code was somewhat old, messy and not easy to separate.

A notable feature of this implementation is that it allows for 3xM and 5xM
based transforms, where M is a power of two, e.g. 384, 640, 768, 1280, etc.
AC-4 uses 3xM transforms while Siren uses 5xM transforms, so the code will
allow for decoding of such streams.
A non-exhaustive list of supported sizes:
4, 8, 12, 16, 20, 24, 32, 40, 48, 60, 64, 80, 96, 120, 128, 160, 192, 240,
256, 320, 384, 480, 512, 640, 768, 960, 1024, 1280, 1536, 1920, 2048, 2560...
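
The listed sizes all have the form N x M with N in {1, 3, 5, 15} and M a power of two. The C sketch below performs that split; it is an illustration only, the function name is hypothetical, and it does not model any minimum or maximum size the real implementation may impose.

```c
/* Splits len into factor * 2^k with factor in {1, 3, 5, 15}.
 * On success returns the factor and writes 2^k to *pow2.
 * Returns 0 (and leaves *pow2 untouched) if no such split exists. */
static int compound_factor(int len, int *pow2)
{
    int m = 1;

    if (len <= 0)
        return 0;
    while (len % 2 == 0) {
        len /= 2;
        m *= 2;
    }
    if (len != 1 && len != 3 && len != 5 && len != 15)
        return 0;
    *pow2 = m;
    return len;
}
```

For example, 384 splits into 3 x 128 and 960 into 15 x 64, while a length such as 28 (7 x 4) has no split of this form.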

The API was designed such that it allows for not only 1D transforms but also
2D transforms of certain block sizes. This was partly by accident, as the stride
argument is required for Opus MDCTs, but it can be used in the context of a 2D
transform as well.
Support for other data types, such as "double" and "int32_t", is expected to
be implemented eventually as well.

Some performance comparisons with libfftw3f (SIMD disabled for both):
120:
  22353 decicycles in     fftwf_execute,     1024 runs,      0 skips
  21836 decicycles in compound_fft_15x8,     1024 runs,      0 skips

128:
  22003 decicycles in       fftwf_execute,   1024 runs,      0 skips
  23132 decicycles in monolithic_fft_ptwo,   1024 runs,      0 skips

384:
  75939 decicycles in      fftwf_execute,    1024 runs,      0 skips
  73973 decicycles in compound_fft_3x128,    1024 runs,      0 skips

640:
 104354 decicycles in       fftwf_execute,   1024 runs,      0 skips
 149518 decicycles in compound_fft_5x128,    1024 runs,      0 skips

768:
 109323 decicycles in      fftwf_execute,    1024 runs,      0 skips
 164096 decicycles in compound_fft_3x256,    1024 runs,      0 skips

960:
 186210 decicycles in      fftwf_execute,    1024 runs,      0 skips
 215256 decicycles in compound_fft_15x64,    1024 runs,      0 skips

1024:
 163464 decicycles in       fftwf_execute,   1024 runs,      0 skips
 199686 decicycles in monolithic_fft_ptwo,   1024 runs,      0 skips

With SIMD we should be faster than fftw for 15xM transforms as our fft15 SIMD
is around 2x faster than theirs, even if our ptwo SIMD is slightly slower.

The goal is to remove the libavcodec/mdct15 code and deprecate the
libavcodec/avfft interface once aarch64 and x86 SIMD code has been ported.
New code throughout the project should use this API.

The implementation passes fate when used in Opus, AAC and Vorbis, and the output
is identical with ATRAC9 as well.

b79b29dd

12 May, 2019 1 commit

avutil: Add NV24 and NV42 pixel formats · 5de4f1d8

Philip Langdale authored 5 years ago

These are the 4:4:4 variants of the semi-planar NV12/NV21 formats.

These formats are not used much, so we've never had a reason to add
them until now. VDPAU recently added support for HEVC 4:4:4 content
and when you use the OpenGL interop, the returned surfaces are in
NV24 format, so we need the pixel format for media players, even
if there's no direct use within ffmpeg.

Separately, there are apparently webcams that use NV24, but I've
never seen one.
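
To illustrate the difference from NV12/NV21, the C sketch below computes tightly packed 8-bit buffer sizes and the position of a U sample in the interleaved chroma plane. It assumes no row padding and, for NV12, even dimensions; the function names are hypothetical.

```c
#include <stddef.h>

/* Bytes for a tightly packed 8-bit NV12/NV21 image (4:2:0):
 * full-size luma plane plus one interleaved chroma pair per 2x2 block. */
static size_t nv12_size(size_t w, size_t h)
{
    return w * h + 2 * (w / 2) * (h / 2);
}

/* Bytes for a tightly packed 8-bit NV24/NV42 image (4:4:4):
 * full-size luma plane plus one interleaved chroma pair per pixel. */
static size_t nv24_size(size_t w, size_t h)
{
    return w * h + 2 * w * h;
}

/* Byte offset of the U sample for pixel (x, y) inside the chroma plane.
 * NV24 stores U then V; NV42 stores V then U. */
static size_t nv24_u_offset(size_t w, size_t x, size_t y, int is_nv42)
{
    return y * 2 * w + 2 * x + (is_nv42 ? 1 : 0);
}
```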

5de4f1d8

05 May, 2019 1 commit

avutil/hwcontext_vdpau: Map 444 pix fmts to new VdpYCbCr types · d617d54e

ManojGuptaBonda authored 5 years ago

New VdpYCbCr formats, VDP_YCBCR_FORMAT_Y_U_V_444 and
VDP_YCBCR_FORMAT_Y_UV_444, were added to VDPAU in libvdpau-1.2
for use in get/putbits on YUV 4:4:4 surfaces. The earlier mapping of
AV_PIX_FMT_YUV444P to VDP_YCBCR_FORMAT_YV12 is not valid.

Hence this change maps AV_PIX_FMT_YUV444P to VDP_YCBCR_FORMAT_Y_U_V_444
to access YUV 4:4:4 surfaces via the read-back APIs of VDPAU.

d617d54e

30 Apr, 2019 1 commit

lavu/hwcontext_qsv: Fix the realign check for hwupload · 2d81acaa

Linjie Fu authored 5 years ago

Fix the alignment check in hwupload: the input surface should be 16-aligned
too.

Partly fix #7830.
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Zhong Li <zhong.li@intel.com>

2d81acaa

24 Apr, 2019 1 commit

avutil/avstring: Fix bug and undefined behavior in av_strncasecmp() · 6f0e9a86

Michael Niedermayer authored 5 years ago

With n=0, the function would read more than 0 bytes.
The end pointer could also point beyond the allocated space, which
is undefined behavior.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
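
One way to avoid both problems is to count down n instead of forming an end pointer, so that n=0 reads nothing. The C sketch below shows the idea; it is not FFmpeg's implementation, the function name is hypothetical, and it uses the locale-dependent tolower() from the C library rather than av_tolower().

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive comparison of at most n bytes.
 * Counts down n rather than computing a + n, so no out-of-bounds pointer
 * is formed and n == 0 returns 0 without reading either string. */
static int strncasecmp_sketch(const char *a, const char *b, size_t n)
{
    while (n-- > 0) {
        unsigned char c1 = (unsigned char)tolower((unsigned char)*a++);
        unsigned char c2 = (unsigned char)tolower((unsigned char)*b++);

        if (c1 != c2)
            return c1 - c2;
        if (c1 == 0)
            return 0;
    }
    return 0;
}
```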

6f0e9a86

19 Apr, 2019 2 commits

lavu/hwcontext_d3d: Cast src pointers calling av_image_copy*(). · a24a1523

Carl Eugen Hoyos authored 5 years ago

Silences several warnings:
libavutil/hwcontext_d3d11va.c:413:49: warning: passing argument 3 of ‘av_image_copy’ from incompatible pointer type
libavutil/hwcontext_d3d11va.c:425:47: warning: passing argument 3 of ‘av_image_copy’ from incompatible pointer type
libavutil/hwcontext_dxva2.c:351:45: warning: passing argument 3 of ‘av_image_copy’ from incompatible pointer type
libavutil/hwcontext_dxva2.c:382:52: warning: passing argument 3 of ‘av_image_copy_uc_from’ from incompatible pointer type

a24a1523

avutil/colorspace: add macros for RGB->YUV BT.709 · 3bef1dab
Gyan Doshi authored 5 years ago
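
For orientation, BT.709 derives luma from RGB with the weights 0.2126, 0.7152 and 0.0722. The C sketch below applies them in normalized floating point; it is not the fixed-point macros added by this commit, and the type and function names are hypothetical.

```c
/* Normalized Y'CbCr triple: y in 0..1, cb and cr in -0.5..0.5. */
typedef struct {
    double y, cb, cr;
} ycbcr_demo;

/* Convert normalized R'G'B' (each 0..1) to Y'CbCr using BT.709 weights. */
static ycbcr_demo rgb_to_ycbcr_bt709(double r, double g, double b)
{
    ycbcr_demo out;

    out.y  = 0.2126 * r + 0.7152 * g + 0.0722 * b;
    out.cb = (b - out.y) / 1.8556; /* 1.8556 = 2 * (1 - 0.0722) */
    out.cr = (r - out.y) / 1.5748; /* 1.5748 = 2 * (1 - 0.2126) */
    return out;
}
```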

3bef1dab

16 Apr, 2019 4 commits

lavu/hwcontext_qsv: Mark a pointer as const. · 5ba76921

Carl Eugen Hoyos authored 5 years ago

Silences a warning:
libavutil/hwcontext_qsv.c:912:15: warning: assignment discards 'const' qualifier from pointer target type

5ba76921

time_internal: Prefix fallback versions of gmtime_r/localtime_r with ff_ · c4642788

Martin Storsjö authored 5 years ago

Use a macro to redirect calling code from the official name to the
ff_ prefixed one.

Detecting these functions in configure can be tricky (on mingw, they
are conditionally available depending on posix feature defines).
If configure didn't detect them, but they still are visible at
compile time (due to an unrelated header defining the posix feature
defines), providing the local fallback versions with a prefixed
name is safer.
Signed-off-by: Martin Storsjö <martin@martin.st>
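
The pattern described above can be sketched in C as follows. DEMO_HAVE_GMTIME_R is a hypothetical stand-in for the configure result, and the extra check for an existing gmtime_r macro is included as an assumption about how the guard would look.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the configure result: 0 means "not detected". */
#define DEMO_HAVE_GMTIME_R 0

#if !DEMO_HAVE_GMTIME_R && !defined(gmtime_r)
/* Fallback with an ff_ prefix, so it cannot clash with a system gmtime_r
 * that is visible at compile time but was not detected by configure. */
static struct tm *ff_gmtime_r(const time_t *clock, struct tm *result)
{
    struct tm *ptr = gmtime(clock);

    if (!ptr)
        return NULL;
    *result = *ptr;
    return result;
}
/* Redirect callers from the official name to the prefixed fallback. */
#define gmtime_r ff_gmtime_r
#endif
```

Calling code keeps using gmtime_r(); only the preprocessor decides whether that resolves to the system function or to the fallback.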

c4642788

time_internal: Do not attempt to override *time_r() macros · 9485cce6

Michael Niedermayer authored 10 years ago

In case these already are defined as macros, we shouldn't try to
redefine them.
Signed-off-by: Martin Storsjö <martin@martin.st>

9485cce6

avcodec/videotoolbox: add support for 10bit pixel format · 036b4b0f

fumoboy007 authored 5 years ago

This patch was originally posted on issue #7704 and was slightly
adjusted to check for the availability of the pixel format.

036b4b0f

09 Apr, 2019 1 commit

libavutil/hwcontext_opencl: Fix channel order in format support check · 1c50d61a

Jarek Samic authored 5 years ago

The `opencl_get_plane_format` function was incorrectly determining the
value used to set the image channel order. This resulted in all RGB
pixel formats being set to the `CL_RGBA` pixel format, regardless of
whether or not they actually *were* RGBA.

This patch fixes the issue by using the `offset` and depth of components
rather than the loop index to determine the value of `order`.
Signed-off-by: Jarek Samic <cldfire3@gmail.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>

1c50d61a

30 Mar, 2019 1 commit

avutil/hcontext_cuda: Remove unnecessary stream synchronisation · 52d8f35b

Philip Langdale authored 5 years ago

Similarly to the previous changes, we don't need to synchronise
after a memcpy to device memory. On the other hand, we need to
keep synchronising after a copy to host memory, otherwise there's
no guarantee that subsequent host reads will return valid data.

52d8f35b

22 Mar, 2019 1 commit

lavu/opencl: replace va_ext.h with standard name · 61cb505d

Ruiling Song authored 6 years ago

Khronos OpenCL header (https://github.com/KhronosGroup/OpenCL-Headers)
uses cl_va_api_media_sharing_intel.h. And Intel's official OpenCL driver
for Intel GPU (https://github.com/intel/compute-runtime) was compiled
against Khronos OpenCL header. So it's better to align with Khronos.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>

61cb505d

17 Mar, 2019 1 commit

lavu/qsv: allow surface size larger than requirement · 15d016be

Zhong Li authored 6 years ago

As in commit 6829a079,
a surface size larger than required should not be treated as an error.
Signed-off-by: Zhong Li <zhong.li@intel.com>

15d016be