Commits · 9e1ddc08208d7b484d5d97d4e169c75b91e3ff21 · Linshizhi / ffmpeg.wasm-core

18 Nov, 2016 1 commit

Hendrik Leppkes authored Nov 18, 2016

* commit '390b95b8':
  fate: Add a mixed NAL coding sample
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

9e1ddc08

17 Nov, 2016 22 commits

doc/bsfs: various improvements · c493a531

Moritz Barsnick authored Nov 15, 2016

- Restored alphabetical order.
- Enhanced sections aac_adtstoasc, dca_core, h264_mp4toannexb.
- Added sections hevc_mp4toannexb and vp9_superframe.
- Renamed (if required) and filled previously empty sections
  mjpegadump, mov2textsub/text2movsub, mp3decomp, and
  remove_extra.
- Fixes ticket #3198.
Signed-off-by: Moritz Barsnick <barsnick@gmx.net>
Signed-off-by: Lou Logan <lou@lrcd.com>

c493a531

ffprobe: fix crash in case -of is specified with an empty string · 427a47ab
Stefano Sabatini authored Nov 17, 2016
```
Fix trac issue #5957.
```
427a47ab

avformat/movenc: Check frame rate before use. · 709c8710

Michael Niedermayer authored Nov 17, 2016

Fixes division by 0
This is similar to how avg_frame_rate is checked elsewhere
Fixes: 6d24add0455f41b1b45b7ba615cd46f3/asan_generic_dc34c3_5480_0a2ef411cae999b9871ed71a2e481b71.mov

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>

709c8710
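
A minimal sketch of the kind of guard 709c8710 describes, not the actual movenc code: the frame rate is validated before it is used as a divisor. The `Rational` struct and `frame_duration_ticks()` are hypothetical names introduced for illustration only.

```c
#include <stdint.h>

/* Hypothetical stand-in for a rational frame rate; not FFmpeg's own type. */
typedef struct Rational {
    int num;
    int den;
} Rational;

/*
 * Return the duration of one frame in timescale ticks, or 0 when the
 * frame rate is unset or invalid, so the division never sees a zero
 * (or negative) divisor.
 */
static int64_t frame_duration_ticks(Rational frame_rate, int timescale)
{
    if (frame_rate.num <= 0 || frame_rate.den <= 0)
        return 0;
    return (int64_t)timescale * frame_rate.den / frame_rate.num;
}
```

A caller would treat a return value of 0 as "frame rate unknown" and fall back to whatever default it already uses.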

avcodec/ass_split: Change order of operations in ass_split_section() · ae514b12

Michael Niedermayer authored Nov 17, 2016

This matches the other branch
Fixes out of array read
Fixes: 4d142ca76d39fe685effcf5017098723/asan_heap-oob_31ae824_8611_348fdb64f9009b63c8a8eae9a0e497c5.mkv

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>

ae514b12

Merge commit '' · 1398ded7

Hendrik Leppkes authored Nov 17, 2016

* commit 'cbbb4040':
  fate: Restore order of h264 entries
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

1398ded7

Merge commit '' · 2f1a539d

Hendrik Leppkes authored Nov 17, 2016

* commit '61bd0ed7':
  h264: Log more information about invalid NALu size
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

2f1a539d

Merge commit '' · 286d8bae

Hendrik Leppkes authored Nov 17, 2016

* commit '7b1ae0e7':
  checkasm/arm: preserve the stack alignment checkasm_checked_call
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

286d8bae

Merge commit '' · c0af1ee9

Hendrik Leppkes authored Nov 17, 2016

* commit '80fbb7be':
  checkasm: vp8.mc: initialize the full src buffer after ec325742
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

c0af1ee9

Merge commit '' · 62d9b7a6

Hendrik Leppkes authored Nov 17, 2016

* commit '17c99b61':
  h2645_parse: handle embedded Annex B NAL units in size prefixed NAL units

This commit is a noop, see a9bb4cf8
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

62d9b7a6

Merge commit '' · cca4fd47

Hendrik Leppkes authored Nov 17, 2016

* commit 'a8cbe5a0':
  h264_ps: export actual height in MBs as SPS.mb_height
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

cca4fd47

Merge commit '' · 4c5c522f

Hendrik Leppkes authored Nov 17, 2016

* commit '99cf9433':
  d3d11va: don't keep the context lock while waiting for a frame
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

4c5c522f

Merge commit '' · e999a4ed

Hendrik Leppkes authored Nov 17, 2016

* commit '2866d108':
  vp8dsp: Remove the comment saying that the height is equal to the width
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

e999a4ed

Merge commit '' · 90b72f6b

Hendrik Leppkes authored Nov 17, 2016

* commit '8c816c0c':
  checkasm/arm: align the clobber check data properly for ldrd
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

90b72f6b

Merge commit '' · 4fe013fc

Hendrik Leppkes authored Nov 17, 2016

* commit 'ec325742':
  checkasm: vp8: mc: test unequal width/height for partitions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

4fe013fc

Merge commit '' · 2818aaab

Hendrik Leppkes authored Nov 17, 2016

* commit '5f74bd31':
  vp8/armv6: mc: avoid boolean expression in calculation
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

2818aaab

Merge commit '' · 711b7b77

Hendrik Leppkes authored Nov 17, 2016

* commit 'fc5cdc0d':
  doc: escape left brace in texi2pod.pl regex

This commit is a noop, see e43ea1cb
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

711b7b77

Merge commit '' · da97b244

Hendrik Leppkes authored Nov 17, 2016

* commit 'd825b1a5':
  libopenh264: Support building with the 1.6 release

This commit is a noop, see 293676c4
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

da97b244

Merge commit '' · 4a485daa

Hendrik Leppkes authored Nov 17, 2016

* commit '4f7723cb':
  movenc: Add an option for skipping writing the mfra/tfra/mfro trailer
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>

4a485daa

lavc/ffv1dec: Scale output for msb-packed compression to full 16bit. · 55a424c5
Carl Eugen Hoyos authored Nov 17, 2016
```
2% slowdown for existing decode-line timer.
```
55a424c5
lavc/ffv1enc: Support pix_fmt GRAY10. · f8247c0c
Carl Eugen Hoyos authored Nov 14, 2016

f8247c0c
avcodec/mpeg4videodec: Workaround interlaced mpeg4 edge MC bug · 2c910625
Michael Niedermayer authored Nov 12, 2016
```
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
2c910625
avcodec/mpegvideo: Fix edge emu buffer overlap with interlaced mpeg4 · 85407c7e
Michael Niedermayer authored Nov 12, 2016
```
Fixes Ticket5936
Regression since c5fc8ae1
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
85407c7e

16 Nov, 2016 13 commits

libavcodec/exr : fix channel size calculation for uint32 channel · 52da3f6f

Martin Vignali authored Nov 16, 2016

uint32 needs 4 bytes, not 1.
Fix decoding when there are half/float and uint32 channels.

This fixes crashes due to pointer corruption caused by invalid writes.

The problem was introduced in commit
03152e74.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>

52da3f6f
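
The size calculation in 52da3f6f can be illustrated with a small sketch. It assumes the three OpenEXR pixel types (32-bit uint, 16-bit half, 32-bit float); the enum and function names are hypothetical and do not mirror the decoder's own identifiers.

```c
#include <stddef.h>

/* Hypothetical enum for the three OpenEXR pixel types. */
enum PixelType { PIXEL_UINT, PIXEL_HALF, PIXEL_FLOAT };

/* Bytes per sample: half is 2 bytes, uint32 and float are 4 bytes each. */
static size_t channel_size(enum PixelType type)
{
    switch (type) {
    case PIXEL_HALF:
        return 2;
    case PIXEL_UINT:
    case PIXEL_FLOAT:
        return 4;
    }
    return 0;
}

/* Bytes needed for one scanline: width times the summed sample sizes. */
static size_t scanline_size(const enum PixelType *channels,
                            size_t nb_channels, size_t width)
{
    size_t bytes_per_pixel = 0;
    for (size_t i = 0; i < nb_channels; i++)
        bytes_per_pixel += channel_size(channels[i]);
    return bytes_per_pixel * width;
}
```

Counting a uint32 channel as 1 byte would make the computed scanline size too small whenever such a channel is mixed with half or float channels, which is the kind of mismatch that leads to invalid writes.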

exr: reindent after previous commit · ce3147eb
Andreas Cadhalpun authored Nov 16, 2016
```
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
```
ce3147eb

exr: fix out-of-bounds read · ffdc5d09

Andreas Cadhalpun authored Nov 16, 2016

channel_index can be -1.

This problem was introduced in commit
2dd7b461.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>

ffdc5d09
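
A sketch of the general pattern behind ffdc5d09, under the assumption that a lookup reports "not found" as -1: the index is checked before it is used as an array subscript. `find_channel()` and `channel_value()` are hypothetical helpers, not functions from the EXR decoder.

```c
#include <string.h>

/* Return the index of `wanted` in `names`, or -1 when it is absent. */
static int find_channel(const char *const *names, int nb_names,
                        const char *wanted)
{
    for (int i = 0; i < nb_names; i++)
        if (strcmp(names[i], wanted) == 0)
            return i;
    return -1;
}

/*
 * Look up a per-channel value. The -1 "not found" index is rejected
 * before it can be used to read outside the array.
 */
static int channel_value(const int *values, const char *const *names,
                         int nb_names, const char *wanted, int fallback)
{
    int idx = find_channel(names, nb_names, wanted);
    if (idx < 0)
        return fallback;
    return values[idx];
}
```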

avutil/frame: fix indention after last commit · 721c90f0
Michael Niedermayer authored Nov 16, 2016
```
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
721c90f0

avutil/frame: Copy size=0 side data in ff_init_buffer_info() · 2acee08a

Michael Niedermayer authored Nov 16, 2016

Fixes null pointer dereference
Fixes: 189/FOO

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>

2acee08a

libschroedingerdec: fix leaking of framewithpts · 3c0328d5

Andreas Cadhalpun authored Nov 13, 2016

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>

3c0328d5

libschroedingerdec: don't produce empty frames · a86ebbf7

Andreas Cadhalpun authored Nov 13, 2016

They are not valid and can cause problems/crashes for API users.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>

a86ebbf7

dds: limit 4 bpp handling to AV_PIX_FMT_PAL8 · 90ebf3c4

Andreas Cadhalpun authored Nov 15, 2016

This fixes NULL pointer dereferencing for formats, where frame->data[1]
is not allocated.

The problem was introduced in commit
257fbc3a.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>

90ebf3c4

doc/filters: adds recently added -vf colorspace options · 605f3084
kieranjol authored Nov 16, 2016
```
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
605f3084
cmdutils: remove duplicate windows.h include · d79d8ef9
Michael Niedermayer authored Nov 08, 2016
```
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
d79d8ef9

configure: properly add dxva2 link dependencies · 99218ee3

Hendrik Leppkes authored Nov 12, 2016

Fixes building with --disable-everything --enable-shared --enable-dxva2

The hwcontext DXVA2 implementation in avutil needs this library now, instead
of just the ffmpeg program.

99218ee3

fate: Add h264 extradata reload tests · 00c80798
Vittorio Giovara authored Nov 15, 2016
```
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
00c80798
Fix -Werror=parentheses error · c5125466
Thierry Foucu authored Nov 15, 2016
```
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
c5125466

15 Nov, 2016 4 commits

avcodec/rv40: Test remaining space in loop of get_dimension() · 1546d487

Michael Niedermayer authored Nov 15, 2016

Fixes infinite loop
Fixes: 178/fuzz-3-ffmpeg_VIDEO_AV_CODEC_ID_RV40_fuzzer

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>

1546d487
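
A minimal sketch of testing the remaining input inside a read loop, as the title of 1546d487 describes. It assumes an escape-coded value where a 0xFF byte means "more follows"; `ByteReader` and `read_escaped()` are hypothetical and operate on whole bytes rather than on FFmpeg's bit reader.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte reader used only for this illustration. */
typedef struct ByteReader {
    const uint8_t *buf;
    size_t size;
    size_t pos;
} ByteReader;

/*
 * Accumulate an escape-coded value: each byte contributes (byte << 2),
 * and a byte of 0xFF signals that another byte follows.
 * Returns -1 when the input ends before the value is complete, so the
 * loop always terminates on truncated data.
 */
static int read_escaped(ByteReader *r)
{
    int val = 0;
    uint8_t t;

    do {
        if (r->pos >= r->size)  /* test remaining space on every iteration */
            return -1;
        t = r->buf[r->pos++];
        val += t << 2;
    } while (t == 0xFF);

    return val;
}
```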

mlz: limit next_code to data buffer size · 1abcd972

Andreas Cadhalpun authored Nov 14, 2016

This fixes a heap-buffer-overflow detected by AddressSanitizer.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>

1abcd972
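
A sketch of limiting a dictionary's next free code to the size of its backing buffer, in the spirit of 1abcd972. The table size, the `Dict` struct and `dict_add()` are assumptions made for illustration and are not taken from the MLZ decoder.

```c
/* Hypothetical, deliberately small table size for the example. */
#define TABLE_SIZE 8

typedef struct Dict {
    int parent[TABLE_SIZE];
    int next_code;          /* index of the next free entry */
} Dict;

/*
 * Append an entry. Returns 0 on success, or -1 when next_code would
 * index past the end of the table, so no write lands outside the buffer.
 */
static int dict_add(Dict *d, int parent)
{
    if (d->next_code < 0 || d->next_code >= TABLE_SIZE)
        return -1;
    d->parent[d->next_code++] = parent;
    return 0;
}
```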

aarch64: vp9: Implement NEON loop filters · f1212e47

Martin Storsjö authored Nov 14, 2016

This work is sponsored by, and copyright, Google.

These are ported from the ARM version; thanks to the larger
number of registers available, we can do the loop filters with
16 pixels at a time. The implementation is fully templated, with
a single macro which can generate versions for both 8 and
16 pixels wide, for 4, 8 and 16 pixel loop filters
(and the 4/8 mixed versions as well).

For the 8 pixel wide versions, it is pretty close in speed (the
v_4_8 and v_8_8 filters are the best examples of this; the h_4_8
and h_8_8 filters seem to get some gain in the load/transpose/store
part). For the 16 pixels wide ones, we get a speedup of around
1.2-1.4x compared to the 32 bit version.

Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                       ARM AArch64
vp9_loop_filter_h_4_8_neon:          144.0   127.2
vp9_loop_filter_h_8_8_neon:          207.0   182.5
vp9_loop_filter_h_16_8_neon:         415.0   328.7
vp9_loop_filter_h_16_16_neon:        672.0   558.6
vp9_loop_filter_mix2_h_44_16_neon:   302.0   203.5
vp9_loop_filter_mix2_h_48_16_neon:   365.0   305.2
vp9_loop_filter_mix2_h_84_16_neon:   365.0   305.2
vp9_loop_filter_mix2_h_88_16_neon:   376.0   305.2
vp9_loop_filter_mix2_v_44_16_neon:   193.2   128.2
vp9_loop_filter_mix2_v_48_16_neon:   246.7   218.4
vp9_loop_filter_mix2_v_84_16_neon:   248.0   218.5
vp9_loop_filter_mix2_v_88_16_neon:   302.0   218.2
vp9_loop_filter_v_4_8_neon:           89.0    88.7
vp9_loop_filter_v_8_8_neon:          141.0   137.7
vp9_loop_filter_v_16_8_neon:         295.0   272.7
vp9_loop_filter_v_16_16_neon:        546.0   453.7

The speedup vs C code in checkasm tests is around 2-7x, which is
pretty much the same as for the 32 bit version. Even if these functions
are faster than their 32 bit equivalent, the C version that we compare
to also became around 1.3-1.7x faster than the C version in 32 bit.

Based on START_TIMER/STOP_TIMER wrapping around a few individual
functions, the speedup vs C code is around 4-5x.

Examples of runtimes vs C on a Cortex A57 (for a slightly older version
of the patch):
                         A57 gcc-5.3  neon
loop_filter_h_4_8_neon:        256.6  93.4
loop_filter_h_8_8_neon:        307.3 139.1
loop_filter_h_16_8_neon:       340.1 254.1
loop_filter_h_16_16_neon:      827.0 407.9
loop_filter_mix2_h_44_16_neon: 524.5 155.4
loop_filter_mix2_h_48_16_neon: 644.5 173.3
loop_filter_mix2_h_84_16_neon: 630.5 222.0
loop_filter_mix2_h_88_16_neon: 697.3 222.0
loop_filter_mix2_v_44_16_neon: 598.5 100.6
loop_filter_mix2_v_48_16_neon: 651.5 127.0
loop_filter_mix2_v_84_16_neon: 591.5 167.1
loop_filter_mix2_v_88_16_neon: 855.1 166.7
loop_filter_v_4_8_neon:        271.7  65.3
loop_filter_v_8_8_neon:        312.5 106.9
loop_filter_v_16_8_neon:       473.3 206.5
loop_filter_v_16_16_neon:      976.1 327.8

The speed-up compared to the C functions is 2.5 to 6 and the cortex-a57
is again 30-50% faster than the cortex-a53.

This is an adapted cherry-pick from libav commits
9d2afd1e and
31756abe.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>

f1212e47
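
The speedup factors quoted in f1212e47 are ratios of the runtimes in its tables. A small sketch of that arithmetic follows; `speedup()` is a hypothetical helper, and the numbers mentioned afterwards are copied from the Cortex A53 table.

```c
/*
 * Speedup of `optimized` over `baseline`, both given as runtimes in the
 * same unit. Returns 0 when the optimized runtime is not positive.
 */
static double speedup(double baseline, double optimized)
{
    if (optimized <= 0.0)
        return 0.0;
    return baseline / optimized;
}
```

For vp9_loop_filter_h_16_16_neon, speedup(672.0, 558.6) is about 1.20, and for vp9_loop_filter_mix2_v_88_16_neon, speedup(302.0, 218.2) is about 1.38, both inside the 1.2-1.4x range given for the 16 pixel wide filters.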

aarch64: vp9: Add NEON itxfm routines · f43079e1

Martin Storsjö authored Nov 14, 2016

This work is sponsored by, and copyright, Google.

These are ported from the ARM version; thanks to the larger
number of registers available, we can do the 16x16 and 32x32
transforms in slices 8 pixels wide instead of 4. This gives
a speedup of around 1.4x compared to the 32 bit version.

The fact that aarch64 doesn't have the same d/q register
aliasing makes some of the macros quite a bit simpler as well.

Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                       ARM  AArch64
vp9_inv_adst_adst_4x4_add_neon:       90.0     87.7
vp9_inv_adst_adst_8x8_add_neon:      400.0    354.7
vp9_inv_adst_adst_16x16_add_neon:   2526.5   1827.2
vp9_inv_dct_dct_4x4_add_neon:         74.0     72.7
vp9_inv_dct_dct_8x8_add_neon:        271.0    256.7
vp9_inv_dct_dct_16x16_add_neon:     1960.7   1372.7
vp9_inv_dct_dct_32x32_add_neon:    11988.9   8088.3
vp9_inv_wht_wht_4x4_add_neon:         63.0     57.7

The speedup vs C code (2-4x) is smaller than in the 32 bit case,
mostly because the C code ends up significantly faster (around
1.6x faster, with GCC 5.4) when built for aarch64.

Examples of runtimes vs C on a Cortex A57 (for a slightly older version
of the patch):
                                A57 gcc-5.3   neon
vp9_inv_adst_adst_4x4_add_neon:       152.2   60.0
vp9_inv_adst_adst_8x8_add_neon:       948.2  288.0
vp9_inv_adst_adst_16x16_add_neon:    4830.4 1380.5
vp9_inv_dct_dct_4x4_add_neon:         153.0   58.6
vp9_inv_dct_dct_8x8_add_neon:         789.2  180.2
vp9_inv_dct_dct_16x16_add_neon:      3639.6  917.1
vp9_inv_dct_dct_32x32_add_neon:     20462.1 4985.0
vp9_inv_wht_wht_4x4_add_neon:          91.0   49.8

The asm is around factor 3-4 faster than C on the cortex-a57 and the asm
is around 30-50% faster on the a57 compared to the a53.

This is an adapted cherry-pick from libav commit
3c9546df.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>

f43079e1