- 29 Jan, 2017 2 commits
-
-
Andreas Cadhalpun authored
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
-
Andreas Cadhalpun authored
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
-
- 28 Jan, 2017 6 commits
-
-
Marijn Meijles authored
avformat/ac3dec: Prevent runaway AC3 detection by examining the current frame rather than the first detected frame.

When a swapped AC3 marker is detected, the data of the frame is byte-swapped. For subsequent frames, however, the swapped data was taken from the first frame rather than from the current frame.

Signed-off-by: Marijn Meijles <marijn@bitpit.net>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
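The idea behind this fix can be illustrated with a minimal sketch. It is not the actual ac3dec probing code: the fixed frame size, `load_frame()` and `count_frames()` are hypothetical names introduced here, and a real AC3 parser reads the frame size from the header. The point shown is only that each frame is normalized from its own offset, so the scan stops as soon as the data no longer starts with a sync word.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed frame size for the sketch; real AC3 frames carry
 * their size in the header. */
#define SKETCH_FRAME_SIZE 8

/* Copy one frame into out, swapping each byte pair when the sync word
 * 0x0B77 appears byte-swapped (0x77 0x0B).
 * Returns 1 if the frame starts with a sync word in either order, else 0. */
static int load_frame(const uint8_t *frame, uint8_t *out)
{
    size_t i;

    if (frame[0] == 0x0B && frame[1] == 0x77) {
        for (i = 0; i < SKETCH_FRAME_SIZE; i++)
            out[i] = frame[i];
        return 1;
    }
    if (frame[0] == 0x77 && frame[1] == 0x0B) {
        for (i = 0; i + 1 < SKETCH_FRAME_SIZE; i += 2) {
            out[i]     = frame[i + 1];
            out[i + 1] = frame[i];
        }
        return 1;
    }
    return 0;
}

/* Count consecutive frames at the start of buf. */
static int count_frames(const uint8_t *buf, size_t len)
{
    uint8_t frame[SKETCH_FRAME_SIZE];
    size_t off;
    int n = 0;

    for (off = 0; off + SKETCH_FRAME_SIZE <= len; off += SKETCH_FRAME_SIZE) {
        /* Normalize from buf + off (the current frame). Passing buf here
         * would re-check the first frame forever and never stop early. */
        if (!load_frame(buf + off, frame))
            break;
        n++;
    }
    return n;
}
```

With two swapped frames followed by unrelated bytes, `count_frames()` stops after the second frame instead of running to the end of the buffer.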
-
Paul Arzelier authored
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
-
James Almer authored
Found-by: Aaron Colwell <acolwell@google.com>
Signed-off-by: James Almer <jamrial@gmail.com>
-
Paul B Mahol authored
Signed-off-by: Paul B Mahol <onemda@gmail.com>
-
Chris Moeller authored
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
-
Aaron Colwell authored
Signed-off-by: James Almer <jamrial@gmail.com>
-
- 27 Jan, 2017 7 commits
-
-
Michael Niedermayer authored
Fixes CID1396252

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
-
Sasi Inguva authored
Signed-off-by: Sasi Inguva <isasi@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
-
Sasi Inguva authored
Signed-off-by: Sasi Inguva <isasi@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
-
Paul B Mahol authored
Signed-off-by: Paul B Mahol <onemda@gmail.com>
-
Paul B Mahol authored
Make sure no division by zero is done.
Make sure there are actually samples available.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
-
Paul B Mahol authored
Signed-off-by: Paul B Mahol <onemda@gmail.com>
-
Carl Eugen Hoyos authored
Ensures that probing doesn't finish prematurely for small files.
-
- 26 Jan, 2017 4 commits
-
-
Michael Niedermayer authored
Fixes using freed memory
Introduced in 74480198

Fixes: 471/fuzz-1-ffmpeg_VIDEO_AV_CODEC_ID_H264_fuzzer

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
-
James Almer authored
This reflects a recent change to the spec draft.

Signed-off-by: James Almer <jamrial@gmail.com>
-
Joel Cunningham authored
From e24d95c0e06a878d401ee34fd6742fcaddeeb95f Mon Sep 17 00:00:00 2001
From: Joel Cunningham <joel.cunningham@me.com>
Date: Mon, 9 Jan 2017 13:37:51 -0600
Subject: [PATCH] tcp: set socket buffer sizes before listen/connect/accept

Attempting to set SO_RCVBUF and SO_SNDBUF on TCP sockets after connection establishment is incorrect, and some stacks ignore the set call on the socket at this point. This has been observed on MacOS/iOS. Windows 7 has some peculiar behavior where setting SO_RCVBUF after connection establishment applies only if the buffer is increasing from the default, while decreases are ignored. This is possibly how the incorrect usage has gone unnoticed.

Unix Network Programming Vol. 1: The Sockets Networking API (3rd edition, section 7.5):

"When setting the size of the TCP socket receive buffer, the ordering of the function calls is important. This is because of TCP's window scale option, which is exchanged with the peer on SYN segments when the connection is established. For a client, this means the SO_RCVBUF socket option must be set before calling connect. For a server, this means the socket option must be set for the listening socket before calling listen. Setting this option for the connected socket will have no effect whatsoever on the possible window scale option because accept does not return with the connected socket until TCP's three-way handshake is complete. This is why the option must be set on the listening socket. (The sizes of the socket buffers are always inherited from the listening socket by the newly created connected socket)"

Signed-off-by: Joel Cunningham <joel.cunningham@me.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
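The ordering described above can be sketched with plain POSIX sockets. This is an illustration only, not the code from the patch: `tcp_socket_with_buffers()` and `tcp_connect_with_buffers()` are hypothetical helper names, and a non-positive size is assumed to mean "keep the system default".

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Create a TCP socket and apply the buffer sizes before any
 * connect()/listen() call. Sizes <= 0 leave the system default in place.
 * Returns the socket descriptor, or -1 on failure. */
static int tcp_socket_with_buffers(int rcvbuf, int sndbuf)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (rcvbuf > 0 &&
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf)) < 0)
        goto fail;
    if (sndbuf > 0 &&
        setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf)) < 0)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}

/* Client side: the buffers are configured first, and only then is the
 * connection established, so the window scale option sent on the SYN
 * can reflect the requested receive buffer. */
static int tcp_connect_with_buffers(const struct sockaddr *addr,
                                    socklen_t addrlen,
                                    int rcvbuf, int sndbuf)
{
    int fd = tcp_socket_with_buffers(rcvbuf, sndbuf);

    if (fd < 0)
        return -1;
    if (connect(fd, addr, addrlen) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server would follow the same pattern: call `tcp_socket_with_buffers()`, then `bind()` and `listen()`, and rely on accepted sockets inheriting the sizes from the listening socket, as the quoted passage describes.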
-
Paul B Mahol authored
Signed-off-by: Paul B Mahol <onemda@gmail.com>
-
- 25 Jan, 2017 7 commits
-
-
Frank Liberato authored
Return AVERROR_INVALIDDATA if all four bytes aren't present.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
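A length check of this kind might look like the sketch below. The commit does not say which field is being read, so the big-endian layout, the function name `read_be32_checked()` and the stand-in error constant are all assumptions made for illustration; FFmpeg itself would return `AVERROR_INVALIDDATA`.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for FFmpeg's AVERROR_INVALIDDATA so the sketch compiles
 * without the FFmpeg headers. */
#define SKETCH_ERROR_INVALIDDATA (-1)

/* Read a 32-bit big-endian value, refusing to read when fewer than
 * four bytes are available. Returns 0 on success. */
static int read_be32_checked(const uint8_t *buf, size_t size, uint32_t *out)
{
    if (size < 4)
        return SKETCH_ERROR_INVALIDDATA;

    *out = ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] <<  8) |
            (uint32_t)buf[3];
    return 0;
}
```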
-
Sasi Inguva authored
Signed-off-by: Sasi Inguva <isasi@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
-
Paul B Mahol authored
Signed-off-by: Paul B Mahol <onemda@gmail.com>
-
Paul B Mahol authored
Fixes #4767.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
-
Paul B Mahol authored
Signed-off-by: Paul B Mahol <onemda@gmail.com>
-
Carl Eugen Hoyos authored
-
compn authored
-
- 24 Jan, 2017 14 commits
-
-
Michael Niedermayer authored
Fixes out of array access

Fixes: 452/fuzz-1-ffmpeg_VIDEO_AV_CODEC_ID_INTERPLAY_VIDEO_fuzzer

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
-
Carl Eugen Hoyos authored
-
Carl Eugen Hoyos authored
-
Carl Eugen Hoyos authored
When bytes_read overflowed, last_bytes_read had not yet overflowed, so no bytes-read report was created, leading to a timeout.

Analyzed-by: Thomas Bernhard
Fixes ticket #5836.
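The failure mode can be reproduced with 32-bit counters. The sketch below is an assumption-laden illustration and not the code from the commit: the function names and the `window` parameter are hypothetical, and the wrap-safe variant shown is one common way to compare wrapping counters, which may differ from the fix that was applied.

```c
#include <stdint.h>

/* Naive check: once bytes_read wraps around to a small value while
 * last_report is still large, this stays false and no report is sent. */
static int report_due_naive(uint32_t bytes_read, uint32_t last_report,
                            uint32_t window)
{
    return bytes_read > (uint32_t)(last_report + window);
}

/* Wrap-safe check: the unsigned difference is computed modulo 2^32, so it
 * still equals the number of bytes received since the last report. */
static int report_due_wrapsafe(uint32_t bytes_read, uint32_t last_report,
                               uint32_t window)
{
    return (uint32_t)(bytes_read - last_report) > window;
}
```

For example, with `last_report = 0xFFFF0000`, `window = 0x8000` and a wrapped `bytes_read = 0x00007000`, the naive check reports nothing, while the wrap-safe check sees a difference of `0x17000` bytes and triggers a report.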
-
Marton Balint authored
Current code returned the number of channels as channel layout in that case, and if nret is not set then unknown layouts are typically not supported. Also use the common parsing code.

Use a temporary workaround to parse an unknown channel layout such as '13c'; after a 1-year grace period only '13C' will work.

Signed-off-by: Marton Balint <cus@passwd.hu>
-
Marton Balint authored
Return a channel layout and the number of channels based on the specified name.

This function is similar to av_get_channel_layout(), but can also parse unknown channel layout specifications. An unknown channel layout specification is a decimal number followed by a capital 'C' suffix. The capital letter is used so as not to break compatibility with the lowercase 'c' suffix, which is used for a guessed channel layout with the specified number of channels.

Signed-off-by: Marton Balint <cus@passwd.hu>
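The suffix rule can be illustrated with a small standalone parser. This is not the libavutil function added by the commit: `parse_channel_count_spec()`, its output parameters and the cap of 64 channels are hypothetical choices made for the sketch.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse "<decimal>C" (unknown layout with that many channels) or
 * "<decimal>c" (guessed default layout for that many channels).
 * Returns 0 on success and -1 if the string has any other form. */
static int parse_channel_count_spec(const char *name, int *nb_channels,
                                    int *unknown_layout)
{
    char *end;
    long n;

    /* Require a leading digit so that signs and whitespace are rejected. */
    if (name[0] < '0' || name[0] > '9')
        return -1;

    errno = 0;
    n = strtol(name, &end, 10);
    if (errno || n <= 0 || n > 64)      /* 64 is an arbitrary sketch limit */
        return -1;

    /* Exactly one suffix character must follow the number. */
    if ((end[0] != 'C' && end[0] != 'c') || end[1] != '\0')
        return -1;

    *nb_channels    = (int)n;
    *unknown_layout = (end[0] == 'C');
    return 0;
}
```

Under these assumptions, "13C" yields 13 channels with the unknown-layout flag set, "13c" yields 13 channels with the flag clear, and a bare "13" is rejected.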
-
Marton Balint authored
Signed-off-by: Marton Balint <cus@passwd.hu>
-
Carl Eugen Hoyos authored
Tested-by: ami_stuff
Fixes a part of ticket #6094.
-
Michael Niedermayer authored
Fixes timeout

Fixes: 446/fuzz-3-ffmpeg_VIDEO_AV_CODEC_ID_VP6_fuzzer

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
-
Martin Storsjö authored
This work is sponsored by, and copyright, Google.

This is similar to the arm version, but due to the larger registers on aarch64, we can do 8 pixels at a time for all filter sizes.

Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                              ARM   AArch64
vp9_loop_filter_h_4_8_10bpp_neon:           213.2     172.6
vp9_loop_filter_h_8_8_10bpp_neon:           281.2     244.2
vp9_loop_filter_h_16_8_10bpp_neon:          657.0     444.5
vp9_loop_filter_h_16_16_10bpp_neon:        1280.4     877.7
vp9_loop_filter_mix2_h_44_16_10bpp_neon:    397.7     358.0
vp9_loop_filter_mix2_h_48_16_10bpp_neon:    465.7     429.0
vp9_loop_filter_mix2_h_84_16_10bpp_neon:    465.7     428.0
vp9_loop_filter_mix2_h_88_16_10bpp_neon:    533.7     499.0
vp9_loop_filter_mix2_v_44_16_10bpp_neon:    271.5     244.0
vp9_loop_filter_mix2_v_48_16_10bpp_neon:    330.0     305.0
vp9_loop_filter_mix2_v_84_16_10bpp_neon:    329.0     306.0
vp9_loop_filter_mix2_v_88_16_10bpp_neon:    386.0     365.0
vp9_loop_filter_v_4_8_10bpp_neon:           150.0     115.2
vp9_loop_filter_v_8_8_10bpp_neon:           209.0     175.5
vp9_loop_filter_v_16_8_10bpp_neon:          492.7     345.2
vp9_loop_filter_v_16_16_10bpp_neon:         951.0     682.7

This is significantly faster than the ARM version in almost all cases except for the mix2 functions.

Based on START_TIMER/STOP_TIMER wrapping around a few individual functions, the speedup vs C code is around 2-3x.

Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
This work is sponsored by, and copyright, Google.

Compared to the arm version, on aarch64 we can keep the full 8x8 transform in registers, and for 16x16 and 32x32, we can process it in slices of 4 pixels instead of 2.

Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                                 ARM    AArch64
vp9_inv_adst_adst_4x4_sub4_add_10_neon:        111.0      109.7
vp9_inv_adst_adst_8x8_sub8_add_10_neon:        914.0      733.5
vp9_inv_adst_adst_16x16_sub16_add_10_neon:    5184.0     3745.7
vp9_inv_dct_dct_4x4_sub1_add_10_neon:           65.0       65.7
vp9_inv_dct_dct_4x4_sub4_add_10_neon:          100.0       96.7
vp9_inv_dct_dct_8x8_sub1_add_10_neon:          111.0      119.7
vp9_inv_dct_dct_8x8_sub8_add_10_neon:          618.0      494.7
vp9_inv_dct_dct_16x16_sub1_add_10_neon:        295.1      284.6
vp9_inv_dct_dct_16x16_sub2_add_10_neon:       2303.2     1883.9
vp9_inv_dct_dct_16x16_sub8_add_10_neon:       2984.8     2189.3
vp9_inv_dct_dct_16x16_sub16_add_10_neon:      3890.0     2799.4
vp9_inv_dct_dct_32x32_sub1_add_10_neon:       1044.4     1012.7
vp9_inv_dct_dct_32x32_sub2_add_10_neon:      13333.7     9695.1
vp9_inv_dct_dct_32x32_sub16_add_10_neon:     18531.3    12459.8
vp9_inv_dct_dct_32x32_sub32_add_10_neon:     24470.7    16160.2
vp9_inv_wht_wht_4x4_sub4_add_10_neon:           83.0       79.7

The larger transforms are significantly faster than the corresponding ARM versions.

The speedup vs C code is smaller than in 32 bit mode, probably because the 64 bit intermediates in the C code can be expressed more efficiently in aarch64.

Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
This work is sponsored by, and copyright, Google.

This has mostly got the same differences to the 8 bit version as in the arm version. For the horizontal filters, we do 16 pixels in parallel as well. For the 8 pixel wide vertical filters, we can accumulate 4 rows before storing, just as in the 8 bit version.

Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                           ARM    AArch64
vp9_avg4_10bpp_neon:                      35.7       30.7
vp9_avg8_10bpp_neon:                      93.5       84.7
vp9_avg16_10bpp_neon:                    324.4      296.6
vp9_avg32_10bpp_neon:                   1236.5     1148.2
vp9_avg64_10bpp_neon:                   4639.6     4571.1
vp9_avg_8tap_smooth_4h_10bpp_neon:       130.0      128.0
vp9_avg_8tap_smooth_4hv_10bpp_neon:      440.0      440.5
vp9_avg_8tap_smooth_4v_10bpp_neon:       114.0      105.5
vp9_avg_8tap_smooth_8h_10bpp_neon:       327.0      314.0
vp9_avg_8tap_smooth_8hv_10bpp_neon:      918.7      865.4
vp9_avg_8tap_smooth_8v_10bpp_neon:       330.0      300.2
vp9_avg_8tap_smooth_16h_10bpp_neon:     1187.5     1155.5
vp9_avg_8tap_smooth_16hv_10bpp_neon:    2663.1     2591.0
vp9_avg_8tap_smooth_16v_10bpp_neon:     1107.4     1078.3
vp9_avg_8tap_smooth_64h_10bpp_neon:    17754.6    17454.7
vp9_avg_8tap_smooth_64hv_10bpp_neon:   33285.2    33001.5
vp9_avg_8tap_smooth_64v_10bpp_neon:    16066.9    16048.6
vp9_put4_10bpp_neon:                      25.5       21.7
vp9_put8_10bpp_neon:                      56.0       52.0
vp9_put16_10bpp_neon/armv8:              183.0      163.1
vp9_put32_10bpp_neon/armv8:              678.6      563.1
vp9_put64_10bpp_neon/armv8:             2679.9     2195.8
vp9_put_8tap_smooth_4h_10bpp_neon:       120.0      118.0
vp9_put_8tap_smooth_4hv_10bpp_neon:      435.2      435.0
vp9_put_8tap_smooth_4v_10bpp_neon:       107.0       98.2
vp9_put_8tap_smooth_8h_10bpp_neon:       303.0      290.0
vp9_put_8tap_smooth_8hv_10bpp_neon:      893.7      828.7
vp9_put_8tap_smooth_8v_10bpp_neon:       305.5      263.5
vp9_put_8tap_smooth_16h_10bpp_neon:     1089.1     1059.2
vp9_put_8tap_smooth_16hv_10bpp_neon:    2578.8     2452.4
vp9_put_8tap_smooth_16v_10bpp_neon:     1009.5      933.5
vp9_put_8tap_smooth_64h_10bpp_neon:    16223.4    15918.6
vp9_put_8tap_smooth_64hv_10bpp_neon:   32153.0    31016.2
vp9_put_8tap_smooth_64v_10bpp_neon:    14516.5    13748.1

These are generally about as fast as the corresponding ARM routines on the same CPU (at least on the A53), in most cases marginally faster.

The speedup vs C code is around 4-9x.

Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
This work is sponsored by, and copyright, Google.

This is more in line with how it will be extended for more bitdepths.

Signed-off-by: Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
This work is sponsored by, and copyright, Google.

This is pretty much similar to the 8 bpp version, but in some senses simpler. All input pixels are 16 bits, and all intermediates also fit in 16 bits, so there's no lengthening/narrowing in the filter at all.

For the full 16 pixel wide filter, we can only process 4 pixels at a time (using an implementation very much similar to the one for 8 bpp), but we can do 8 pixels at a time for the 4 and 8 pixel wide filters with a different implementation of the core filter.

Examples of relative speedup compared to the C version, from checkasm:
Cortex                                        A7     A8     A9    A53
vp9_loop_filter_h_4_8_10bpp_neon:           1.83   2.16   1.40   2.09
vp9_loop_filter_h_8_8_10bpp_neon:           1.39   1.67   1.24   1.70
vp9_loop_filter_h_16_8_10bpp_neon:          1.56   1.47   1.10   1.81
vp9_loop_filter_h_16_16_10bpp_neon:         1.94   1.69   1.33   2.24
vp9_loop_filter_mix2_h_44_16_10bpp_neon:    2.01   2.27   1.67   2.39
vp9_loop_filter_mix2_h_48_16_10bpp_neon:    1.84   2.06   1.45   2.19
vp9_loop_filter_mix2_h_84_16_10bpp_neon:    1.89   2.20   1.47   2.29
vp9_loop_filter_mix2_h_88_16_10bpp_neon:    1.69   2.12   1.47   2.08
vp9_loop_filter_mix2_v_44_16_10bpp_neon:    3.16   3.98   2.50   4.05
vp9_loop_filter_mix2_v_48_16_10bpp_neon:    2.84   3.64   2.25   3.77
vp9_loop_filter_mix2_v_84_16_10bpp_neon:    2.65   3.45   2.16   3.54
vp9_loop_filter_mix2_v_88_16_10bpp_neon:    2.55   3.30   2.16   3.55
vp9_loop_filter_v_4_8_10bpp_neon:           2.85   3.97   2.24   3.68
vp9_loop_filter_v_8_8_10bpp_neon:           2.27   3.19   1.96   3.08
vp9_loop_filter_v_16_8_10bpp_neon:          3.42   2.74   2.26   4.40
vp9_loop_filter_v_16_16_10bpp_neon:         2.86   2.44   1.93   3.88

The speedup vs C code measured in checkasm is around 1.1-4x. These numbers are quite inconclusive though, since the checkasm test runs multiple filterings on top of each other, so later rounds might end up with different codepaths (different decisions on which filter to apply, based on input pixel differences).

Based on START_TIMER/STOP_TIMER wrapping around a few individual functions, the speedup vs C code is around 2-4x.

Signed-off-by: Martin Storsjö <martin@martin.st>
-