Commits · 0a35f128f3c6e0ae9a0a2236c557602c108da269 · Linshizhi / ffmpeg.wasm-core

01 Dec, 2016 1 commit
- cabac: x86: Give optimizations header a more meaningful name · 0a35f128
  Diego Biurrun authored Mar 09, 2016
  
  0a35f128
30 Nov, 2016 7 commits

aarch64: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32 · cad42fad

Martin Storsjö authored Nov 18, 2016

This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

vp9_inv_dct_dct_16x16_sub16_add_neon:   1373.2
vp9_inv_dct_dct_32x32_sub32_add_neon:   8089.0

By skipping individual 8x16 or 8x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     235.3
vp9_inv_dct_dct_16x16_sub2_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub8_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   1372.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   1372.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     555.1
vp9_inv_dct_dct_32x32_sub2_add_neon:    5190.2
vp9_inv_dct_dct_32x32_sub4_add_neon:    5180.0
vp9_inv_dct_dct_32x32_sub8_add_neon:    5183.1
vp9_inv_dct_dct_32x32_sub12_add_neon:   6161.5
vp9_inv_dct_dct_32x32_sub16_add_neon:   6155.5
vp9_inv_dct_dct_32x32_sub20_add_neon:   7136.3
vp9_inv_dct_dct_32x32_sub24_add_neon:   7128.4
vp9_inv_dct_dct_32x32_sub28_add_neon:   8098.9
vp9_inv_dct_dct_32x32_sub32_add_neon:   8098.8

I.e. in general a very minor overhead for the full subpartition case due
to the additional cmps, but a significant speedup for the cases when we
only need to process a small part of the actual input data.
Signed-off-by: Martin Storsjö <martin@martin.st>

cad42fad
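The idea behind this commit can be sketched in plain C. This is only an illustration under stated assumptions, not the NEON code from the commit: `toy_transform_1d`, `first_pass`, and the `nonzero_slices` parameter are hypothetical names, the running sum merely stands in for the real 1-D inverse DCT, and treating a slice as a group of 8 rows is an assumed layout. It shows that a first-pass slice known to hold only zero coefficients can be zero-filled instead of transformed, at the cost of one extra comparison per slice.

```c
#include <assert.h>
#include <string.h>

#define N     16  /* block is N x N coefficients */
#define SLICE 8   /* rows handled per first-pass slice (assumed layout) */

/* Toy stand-in for the real 1-D inverse DCT: a running sum. Like any linear
 * transform, it maps an all-zero input to an all-zero output. */
static void toy_transform_1d(const int *in, int *out)
{
    int acc = 0;
    for (int i = 0; i < N; i++) {
        acc += in[i];
        out[i] = acc;
    }
}

/* First pass over the rows of coef into tmp. Slices with index >=
 * nonzero_slices are known to hold only zeros, so they are zero-filled
 * instead of being transformed. */
static void first_pass(const int *coef, int *tmp, int nonzero_slices)
{
    assert(nonzero_slices >= 0 && nonzero_slices <= N / SLICE);
    for (int s = 0; s < N / SLICE; s++) {
        for (int r = s * SLICE; r < (s + 1) * SLICE; r++) {
            if (s < nonzero_slices)
                toy_transform_1d(coef + r * N, tmp + r * N);
            else
                memset(tmp + r * N, 0, N * sizeof(*tmp));
        }
    }
}
```

In a real decoder the number of nonzero slices would presumably be derived cheaply from the end-of-block position rather than by scanning the coefficients, since a scan would cost more than the comparison the commit message mentions.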

arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32 · 9c8bc74c

Martin Storsjö authored Nov 18, 2016

This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

                                      Cortex A7        A8        A9       A53
vp9_inv_dct_dct_16x16_sub16_add_neon:    3188.1    2435.4    2499.0    1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:   18531.7   16582.3   14207.6   12000.3

By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:      274.6     189.5     211.7     235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:     2064.0    1534.8    1719.4    1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:     2135.0    1477.2    1736.3    1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:     2446.7    1828.7    1993.6    1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:    2832.4    2118.3    2266.5    1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:    3211.7    2475.3    2523.5    1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:      756.2     456.7     862.0     553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:    10682.2    8190.4    8539.2    6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:    10813.5    8014.9    8518.3    6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:    11859.6    9313.0    9347.4    7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:   12946.6   10752.4   10192.2    8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:   14074.6   11946.5   11001.4    9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:   15269.9   13662.7   11816.1    9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:   16327.9   14940.1   12626.7   10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:   17462.7   15776.1   13446.2   11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:   18575.5   17157.0   14249.3   12015.1

I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.

In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.
Signed-off-by: Martin Storsjö <martin@martin.st>

9c8bc74c

arm: vp9itxfm: Only reload the idct coeffs for the iadst_idct combination · 3c87039a

Martin Storsjö authored Nov 28, 2016

This avoids reloading them if they haven't been clobbered, if the
first pass also was idct.

This is similar to what was done in the aarch64 version.
Signed-off-by: Martin Storsjö <martin@martin.st>

3c87039a

vp9dsp: add DC only versions for idct/idct. · c4c5f538

Clément Bœsch authored Nov 22, 2013

before:

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m11.125s
user    0m11.059s
sys     0m0.050s

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m10.944s
user    0m10.819s
sys     0m0.064s

after:

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m8.153s
user    0m8.034s
sys     0m0.050s

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m8.038s
user    0m7.980s
sys     0m0.039s
Signed-off-by: Martin Storsjö <martin@martin.st>

c4c5f538
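A DC-only path is faster because a block with only a DC coefficient inverse-transforms to a constant, so the whole transform collapses to adding one value to every pixel. The sketch below is a hedged illustration, not the code from this commit: `clip_u8` and `dc_only_add` are hypothetical names, and the codec-specific scaling that turns the DC coefficient into the per-pixel value `dc` is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Add the same residual value dc to every pixel of a size x size block.
 * dc is assumed to be already scaled and rounded by the caller. */
static void dc_only_add(uint8_t *dst, ptrdiff_t stride, int size, int dc)
{
    assert(size > 0);
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(dst[x] + dc);
        dst += stride;
    }
}
```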

hevc: Eliminate pointless variable indirection · e4382a4a
Diego Biurrun authored Jan 11, 2016

e4382a4a
hevc: Drop pointless av_unused attribute · 5c890225
Diego Biurrun authored Nov 17, 2016

5c890225
metasound: Drop unused tables · 0983f911
Diego Biurrun authored Jan 03, 2016

0983f911

29 Nov, 2016 12 commits
- configure: Integrate X11 checks into vaapi/vdpau checks · c21d78a9
  Diego Biurrun authored Nov 23, 2016
  
  c21d78a9
- configure: Do not add newlines in filter()/filter_out() functions · 8b56dbe7
  Diego Biurrun authored Nov 09, 2016
  
  8b56dbe7
- configure: Move hardware-accelerated codec deps out of hwaccel section · 9254344e
  Diego Biurrun authored Nov 16, 2016
  
  9254344e
- configure: MMAL-related decoders should depend on, not select, mmal · d4f2a681
  Diego Biurrun authored Nov 10, 2016
  
  d4f2a681
- mjpegdec: Check return values of functions that may fail · 212c6a1d
  Diego Biurrun authored May 11, 2016
  
  212c6a1d
- dxva2: Adjust printf length modifiers where appropriate · 3ee5f25d
  Diego Biurrun authored Nov 24, 2016
  
  3ee5f25d
- avisynth: Cast to the right type when loading avisynth library functions · 239d02ef
  Diego Biurrun authored Nov 24, 2016
```
Fixes a number of related warnings.
```
  239d02ef
- lavc: move decoding-related code from utils.c to a new file · 3fe2a01d
  Anton Khirnov authored Oct 26, 2016
  
  3fe2a01d
- lavc: move encoding-related code from utils.c to a new file · 328cd2b5
  Anton Khirnov authored Oct 26, 2016
  
  328cd2b5
- aac_adtstoasc_bsf: validate and forward extradata if the stream is already ASC · 45d199d5
  James Almer authored Nov 25, 2016
```
Fixes AAC AudioSpecificConfig passthrough.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
```
  45d199d5
- mss2: only use error correction for matching block counts · 1762a39e
  Andreas Cadhalpun authored Nov 24, 2016
```
This fixes a heap-buffer-overflow in ff_er_frame_end when decoding mss2
with coded_width/coded_height larger than width/height.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
```
  1762a39e
- avconv: Fix the audio next dts computation · d0c84c41
  Luca Barbato authored Nov 28, 2016
```
Use the correct timebase.

CC: libav-stable@libav.org
```
  d0c84c41
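The commit message does not say which timebase was wrong, so the following is only a generic sketch of why the unit matters when advancing a timestamp. `Rational` and `rescale_ts` are hypothetical names introduced for illustration; they mirror the general idea of rescaling between timebases but are not the libavutil API, and the arithmetic has no overflow protection.

```c
#include <assert.h>
#include <stdint.h>

/* A timebase: one tick lasts num/den seconds. */
typedef struct Rational {
    int num;
    int den;
} Rational;

/* Convert a non-negative timestamp from src_tb units to dst_tb units,
 * rounding to the nearest tick. Plain 64-bit math, no overflow checks. */
static int64_t rescale_ts(int64_t ts, Rational src_tb, Rational dst_tb)
{
    assert(ts >= 0);
    assert(src_tb.num > 0 && src_tb.den > 0);
    assert(dst_tb.num > 0 && dst_tb.den > 0);
    int64_t num = (int64_t)src_tb.num * dst_tb.den;
    int64_t den = (int64_t)src_tb.den * dst_tb.num;
    return (ts * num + den / 2) / den;
}
```

For instance, 1024 samples counted in a 1/48000 timebase correspond to 21333 ticks of a microsecond timebase, while the same count read as 1/44100 would give 23220, so mixing up the source timebase shifts every following timestamp.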
28 Nov, 2016 3 commits
- ac3enc: Avoid unnecessary macro indirections · eb135516
  Diego Biurrun authored Jan 04, 2016
  
  eb135516
- ac3enc: Reshuffle functions to avoid forward declarations · f0d3e43b
  Diego Biurrun authored Jan 04, 2016
  
  f0d3e43b
- ac3enc: Reshuffle some float/fixed-mode ifdefs to avoid a dummy function · e22c63ac
  Diego Biurrun authored Jan 04, 2016
  
  e22c63ac
26 Nov, 2016 1 commit
- hwcontext_vaapi: Don't abort on failing to allocate from a fixed-size pool · d30719e6
  Mark Thompson authored Nov 25, 2016
  
  d30719e6
25 Nov, 2016 7 commits
- tta: avoid undefined shifts · 4adbb44a
  Anton Khirnov authored Nov 23, 2016
```
Signed-off-by: Diego Biurrun <diego@biurrun.de>
```
  4adbb44a
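The commit message does not list the affected shifts, so this is a generic sketch of one common way to avoid undefined shifts in C rather than the actual change. Left-shifting a negative signed value, or shifting by at least the width of the type, is undefined behavior; doing the shift on the unsigned type is well defined. `shl_s32` is a hypothetical helper, and the conversion back to `int32_t` is implementation-defined for out-of-range results (two's-complement wrap on common compilers).

```c
#include <assert.h>
#include <stdint.h>

/* Left-shift a signed 32-bit value without invoking undefined behavior.
 * The shift is performed on uint32_t, where it wraps modulo 2^32. */
static int32_t shl_s32(int32_t v, unsigned n)
{
    assert(n < 32);  /* shifting by >= the type width is undefined as well */
    return (int32_t)((uint32_t)v << n);
}
```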
- tta: use get_unary() instead of a custom implementation · dc4b6250
  Anton Khirnov authored Nov 23, 2016
```
Signed-off-by: Diego Biurrun <diego@biurrun.de>
```
  dc4b6250
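For readers unfamiliar with unary codes, the sketch below shows what such a read does: count bits until a stop bit appears, up to a maximum length. It is a self-contained illustration, not libavcodec's `get_unary()`; `BitReader`, `read_bit`, and `read_unary` are hypothetical names, and the MSB-first bit order is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal MSB-first bit reader over a byte buffer (hypothetical type). */
typedef struct BitReader {
    const uint8_t *buf;
    size_t size_bits;  /* total number of readable bits */
    size_t pos;        /* index of the next bit to read */
} BitReader;

/* Return the next bit, or -1 once the buffer is exhausted. */
static int read_bit(BitReader *br)
{
    if (br->pos >= br->size_bits)
        return -1;
    int bit = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
    br->pos++;
    return bit;
}

/* Count bits that differ from stop, consuming the stop bit itself.
 * Stops early after max_len counted bits or at the end of the buffer. */
static int read_unary(BitReader *br, int stop, int max_len)
{
    assert(stop == 0 || stop == 1);
    int n = 0;
    while (n < max_len) {
        int bit = read_bit(br);
        if (bit < 0 || bit == stop)
            break;
        n++;
    }
    return n;
}
```

Using one shared helper like this instead of a per-decoder loop keeps the length limit and end-of-buffer handling in a single place.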
- build: Drop gcrypt support · e122b12c
  Diego Biurrun authored Nov 23, 2016
```
GnuTLS in combination with gcrypt has been deprecated since 2010.
```
  e122b12c
- configure: Use correct libm linker flag during math function checks · bf2f748f
  Diego Biurrun authored Nov 23, 2016
  
  bf2f748f
- configure: Add missing asyncts filter, movie filter, and output example deps · ce6f780b
  Diego Biurrun authored Nov 22, 2016
```
Also add a missing avcodec.h #include in the movie filter.
```
  ce6f780b
- configure: Use correct variable name in libsnappy test · 04698d52
  Diego Biurrun authored Nov 23, 2016
  
  04698d52
- configure: Remove old avisynth support leftover · 30f0d1b9
  Diego Biurrun authored Nov 22, 2016
  
  30f0d1b9
24 Nov, 2016 9 commits
- arm: warn/error on movrelx usage problematic with PIC on ELF · 6a1ea4ec
  Janne Grunau authored Nov 18, 2016
```
The warning has false positives but our asm does not trigger it. For
new code false positives can only be avoided by changing the register
allocation.
```
  6a1ea4ec
- configure: Disable warning C4703 with MSVC · 5bcc6f76
  Diego Biurrun authored Nov 22, 2016
```
This disables warnings about potentially uninitialized local pointer
variables.  Disabling the warning is in line with what we do for gcc.
```
  5bcc6f76
- w32pthreads: Fix function pointer casts · bd9cd046
  Diego Biurrun authored Nov 22, 2016
```
This eliminates a handful of warnings at every inclusion of the header.
```
  bd9cd046
- qt-faststart: Do not try to use fancy 64-bit seeking functions on mingw32ce · 233d50b2
  Martin Storsjö authored Sep 13, 2012
```
These functions are not available on mingw32ce.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
```
  233d50b2
- rtmpdh: Do global initialization before running the test · 537b5b77
  Martin Storsjö authored Nov 23, 2016
```
The rtmpdh code can use crypto libraries which may require
a process global init. (gcrypt is one of the libraries
where the rtmpdh test code can fail if global init hasn't been
done, depending on gcrypt version.)
Signed-off-by: Martin Storsjö <martin@martin.st>
```
  537b5b77
- aarch64: vp9itxfm: Don't repeatedly set x9 when nothing overwrites it · 2f99117f
  Martin Storsjö authored Nov 22, 2016
```
Signed-off-by: Martin Storsjö <martin@martin.st>
```
  2f99117f
- rdt: Convert to the new bitstream reader · 2dbe2aa2
  Alexandra Hájková authored Apr 16, 2016
  
  2dbe2aa2
- ogg: Convert to the new bitstream reader · 2cef81a8
  Alexandra Hájková authored Apr 16, 2016
  
  2cef81a8
- mpegts: Convert to the new bitstream reader · 8d1997ad
  Alexandra Hájková authored Apr 16, 2016
  
  8d1997ad