Commits · 8ba694548782c1821ab119c18fe02360a81c6768 · Linshizhi / ffmpeg.wasm-core

25 Sep, 2014 1 commit
- avcodec/idctdsp: change {put,add}_pixels_clamped to ptrdiff_t line_size · c99a8828
  James Almer authored 10 years ago
```
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
```
  c99a8828
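
A minimal sketch of what a `ptrdiff_t` stride looks like in a clamped pixel store. The function name `put_pixels_clamped_sketch` and the helper `clip_uint8` are illustrative assumptions, not the code touched by this commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a signed value to the 0..255 range of an 8-bit pixel. */
static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Store an 8x8 block of 16-bit coefficients as clamped 8-bit pixels.
 * line_size is the distance in bytes between two rows. ptrdiff_t has the
 * width of a pointer difference, so the row step uses the same type as
 * the pointer arithmetic it feeds.
 */
static void put_pixels_clamped_sketch(const int16_t *block, uint8_t *pixels,
                                      ptrdiff_t line_size)
{
    for (int i = 0; i < 8; i++) {
        for (int j = 0; j < 8; j++)
            pixels[j] = clip_uint8(block[j]);
        pixels += line_size;   /* advance one row in the destination */
        block  += 8;           /* advance one row in the coefficient block */
    }
}
```

A caller would pass the row pitch of the destination picture as `line_size`; bytes between the end of a row and the next row start are left untouched.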
23 Sep, 2014 1 commit

Fix compile error on arm4/arm5 platform · 6b733be7

Bernd Kuhls authored 10 years ago

Since these commits
http://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=adf8227cf4e7b4fccb2ad88e1e09b6dc00dd00ed
http://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=db7f1c7c5a1d37e7f4da64a79a97bea1c4b6e9f8

compilation on arm4/arm5 fails:

libavcodec/libavcodec.so: undefined reference to
`ff_startcode_find_candidate_armv6'

Because libavcodec/arm/Makefile contains
ARMV6-OBJS-$(CONFIG_STARTCODE)         += arm/startcode_armv6.o
the function ff_startcode_find_candidate_armv6 is not built for older ARM
architectures. The bug was found during automatic buildroot builds:

http://autobuild.buildroot.net/results/ec7/ec71e4f16ee9106747dff5f15999cbd17903e76f//build-end.log
Quote from configure summary:
ARCH                      arm (armv4t)
big-endian                no
runtime cpu detection     yes
ARMv5TE enabled           no
ARMv6 enabled             no
ARMv6T2 enabled           no

http://autobuild.buildroot.net/results/be7/be72eb182eaccf0064a32c9dfc2ac1c0d6555506/build-end.log
ARCH                      arm (armv5te)
big-endian                no
runtime cpu detection     yes
ARMv5TE enabled           yes
ARMv6 enabled             no
ARMv6T2 enabled           no

This patch provides the necessary #if clauses as discussed with Michael:
https://ffmpeg.org/pipermail/ffmpeg-devel/2014-September/163329.html

Signed-off-by: Bernd Kuhls <bernd.kuhls@t-online.de>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

6b733be7
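
The kind of `#if` guard described above can be sketched as follows. This is an illustration under assumptions, not the patch itself: the `HAVE_ARMV6` default, the `startcode_fn` type, and the helpers `find_candidate_c` and `select_startcode_fn` are hypothetical, and the real code may additionally check CPU flags at run time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the HAVE_ARMV6 value that configure generates. */
#ifndef HAVE_ARMV6
#define HAVE_ARMV6 0
#endif

typedef int (*startcode_fn)(const uint8_t *buf, int size);

/* Portable fallback: index of the first zero byte, or size if there is none. */
static int find_candidate_c(const uint8_t *buf, int size)
{
    int i = 0;
    while (i < size && buf[i])
        i++;
    return i;
}

#if HAVE_ARMV6
/* Declared only when the ARMv6 object file is part of the build. */
int ff_startcode_find_candidate_armv6(const uint8_t *buf, int size);
#endif

/* Pick an implementation without ever naming a symbol that was not built. */
static startcode_fn select_startcode_fn(void)
{
#if HAVE_ARMV6
    return ff_startcode_find_candidate_armv6;
#else
    return find_candidate_c;
#endif
}
```

With the guard in place, an armv4t or armv5te build never references the ARMv6 symbol, so the link step has nothing left undefined.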

02 Sep, 2014 1 commit

idctdsp: Add global function pointers for {add|put}_pixels_clamped functions · 95c0cec0

Diego Biurrun authored 10 years ago

These function pointers already existed in the ARM code. Adding them globally
allows calls to the function pointers to access arch-optimized versions of the
functions transparently.

95c0cec0
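
A rough sketch of the dispatch pattern this commit describes: generic code calls through a function pointer, and an init step may swap in an optimized routine. All names here (`pixels_clamped_fn`, `add_pixels_clamped_ptr`, `idctdsp_init_sketch`, `idct_add_sketch`) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*pixels_clamped_fn)(const int16_t *block, uint8_t *pixels,
                                  ptrdiff_t line_size);

/* Generic C version: add residuals to an 8x8 block, clamping to 0..255. */
static void add_pixels_clamped_c(const int16_t *block, uint8_t *pixels,
                                 ptrdiff_t line_size)
{
    for (int i = 0; i < 8; i++) {
        for (int j = 0; j < 8; j++) {
            int v = pixels[j] + block[j];
            pixels[j] = v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
        }
        pixels += line_size;
        block  += 8;
    }
}

/* Shared pointer (file scope in this single-file sketch): callers go
 * through it and never name a specific variant. */
static pixels_clamped_fn add_pixels_clamped_ptr = add_pixels_clamped_c;

/* Hypothetical init hook: an arch init function would pass its optimized
 * routine here; NULL keeps the C version. */
static void idctdsp_init_sketch(pixels_clamped_fn arch_version)
{
    add_pixels_clamped_ptr = arch_version ? arch_version : add_pixels_clamped_c;
}

/* Caller in generic code: apply a residual block to the destination. */
static void idct_add_sketch(uint8_t *dest, ptrdiff_t line_size,
                            const int16_t *block)
{
    add_pixels_clamped_ptr(block, dest, line_size);
}
```

The caller stays identical on every architecture; only the value stored in the pointer changes.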

15 Aug, 2014 2 commits
- build: Add explanatory comments to (optimization) blocks in the Makefiles · efd26bed
  Diego Biurrun authored 10 years ago
  
  efd26bed
- mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes · 835f798c
  Diego Biurrun authored 10 years ago
  
  835f798c
12 Aug, 2014 1 commit

avcodec/idctdsp: make add/put_pixels_clamped_c internal functions · a8592db9

James Almer authored 10 years ago

This reduces code duplication and differences with the fork.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

a8592db9

06 Aug, 2014 1 commit
- avcodec: Change get_pixels() to ptrdiff_t linesize · 305f72ae
  Michael Niedermayer authored 10 years ago
```
Found-by: ubitux
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
```
  305f72ae
04 Aug, 2014 2 commits

vc-1: Add platform-specific start code search routine to VC1DSPContext. · adf8227c

Ben Avison authored 10 years ago

Initialise VC1DSPContext for parser as well as for decoder.
Note, the VC-1 code doesn't actually use the function pointer yet.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>

adf8227c

h264: Move start code search functions into separate source files. · db7f1c7c

Ben Avison authored 10 years ago

This permits re-use with parsers for codecs which use similar start codes.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>

db7f1c7c
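
For orientation, a start code search of the kind these two commits factor out might look like the sketch below. It assumes the common three-byte prefix 00 00 01 and a helper that skips to the next zero byte; `find_zero_candidate` and `find_start_code` are hypothetical names, not the functions moved by the commit.

```c
#include <stdint.h>

/* Index of the first zero byte in buf, or size if there is none. */
static int find_zero_candidate(const uint8_t *buf, int size)
{
    int i = 0;
    while (i < size && buf[i] != 0)
        i++;
    return i;
}

/* Offset of the first 00 00 01 prefix, or -1 if none is present. */
static int find_start_code(const uint8_t *buf, int size)
{
    int pos = 0;
    while (pos + 3 <= size) {
        /* Skip quickly to the next byte that could begin a prefix. */
        pos += find_zero_candidate(buf + pos, size - pos);
        if (pos + 3 > size)
            break;
        if (buf[pos + 1] == 0 && buf[pos + 2] == 1)
            return pos;
        pos++;   /* this zero byte did not start a prefix; move on */
    }
    return -1;
}
```

Keeping the zero-byte scan in its own function is what makes a platform-specific replacement possible: only the inner helper needs an optimized version, and any parser that uses similar prefixes can share it.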

27 Jul, 2014 1 commit
- avcodec/arm/idctdsp_init_arm*: Only select non bitexact IDCTs by default when bitexact is not set · b051a1bb
  Michael Niedermayer authored 10 years ago
```
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
```
  b051a1bb
25 Jul, 2014 1 commit
- qpeldsp: Mark source pointer in qpel_mc_func function pointer const · 7fb993d3
  Diego Biurrun authored 10 years ago
  
  7fb993d3
21 Jul, 2014 2 commits
- arm: Macroize the test for 'setend' CPU instruction support · 6869612f
  Ben Avison authored 10 years ago
```
Signed-off-by: Diego Biurrun <diego@biurrun.de>
```
  6869612f
- dct-test: Move arch-specific bits into arch-specific subdirectories · 81b9bf31
  Diego Biurrun authored 10 years ago
  
  81b9bf31
20 Jul, 2014 1 commit
- idct: Move arm-specific declarations to a header in the arm directory · 4de8b606
  Diego Biurrun authored 10 years ago
  
  4de8b606
18 Jul, 2014 4 commits
- idctdsp: prettyprinting cosmetics · 8b0dd494
  Diego Biurrun authored 10 years ago
  
  8b0dd494
- idct: Convert IDCT permutation #defines to an enum · b4987f72
  Diego Biurrun authored 10 years ago
```
Also rename the enum values to be consistent with other DCT permutations.
```
  b4987f72
- arm: cosmetics: Consistently use lowercase for shift operators · 7e18a727
  Martin Storsjö authored 10 years ago
```
Signed-off-by: Martin Storsjö <martin@martin.st>
```
  7e18a727
- arm: cosmetics: Fix a misaligned asm operand · fe67f3fb
  Martin Storsjö authored 10 years ago
```
Signed-off-by: Martin Storsjö <martin@martin.st>
```
  fe67f3fb
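
The conversion of IDCT permutation #defines to an enum, listed above, follows a familiar C pattern. The sketch below uses made-up names (`idct_perm_sketch`, `PERM_SKETCH_*`, `init_perm_sketch`) and only two permutation kinds; it does not reproduce the identifiers from the commit.

```c
/*
 * Before (hypothetical names): loose integer constants.
 *   #define NO_PERM_SKETCH        1
 *   #define TRANSPOSE_PERM_SKETCH 2
 */

/* After: one named type groups the permutation kinds. */
enum idct_perm_sketch {
    PERM_SKETCH_NONE,
    PERM_SKETCH_TRANSPOSE,
};

/* Fill perm[64] with the coefficient order for the given kind. */
static void init_perm_sketch(unsigned char perm[64], enum idct_perm_sketch type)
{
    for (int i = 0; i < 64; i++) {
        switch (type) {
        case PERM_SKETCH_NONE:
            perm[i] = (unsigned char)i;                      /* identity */
            break;
        case PERM_SKETCH_TRANSPOSE:
            /* swap the 3-bit row and column indices of an 8x8 block */
            perm[i] = (unsigned char)(((i & 7) << 3) | (i >> 3));
            break;
        }
    }
}
```

With an enum, a `switch` that misses a value can be flagged by the compiler, and the parameter type documents which constants are valid.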
17 Jul, 2014 3 commits

armv6: Accelerate ff_fft_calc for general case (nbits != 4) · 87552d54

Ben Avison authored 10 years ago

The previous implementation targeted DTS Coherent Acoustics, which only
requires nbits == 4 (fft16()). This case was (and still is) linked directly
rather than being indirected through ff_fft_calc_vfp(), but now the full
range from radix-4 up to radix-65536 is available. This benefits other codecs
such as AAC and AC3.

The implementation is based upon the C version, with each routine larger than
radix-16 calling a hierarchy of smaller FFT functions, then performing a
post-processing pass. This pass benefits a lot from loop unrolling to
counter the long pipelines in the VFP. A relaxed calling standard also
reduces the overhead of the call hierarchy, and avoiding the excessive
inlining performed by GCC probably helps with I-cache utilisation too.

I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in the FFT routines (fft4() to fft512() and pass()) for the
same sample AAC stream:

                  Before           After
                  Mean    StdDev   Mean    StdDev   Confidence  Change
Audio decode      2245.5  53.1     1599.6  43.8     100.0%      +40.4%
FFT routines      940.6   22.0     348.1   20.8     100.0%      +170.2%
Signed-off-by: Martin Storsjö <martin@martin.st>

87552d54
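
The Change column in the benchmark above appears consistent with the ratio of the Before mean to the After mean, minus one. That reading is inferred from the published numbers and is not stated in the commit message; `change_percent` is a hypothetical helper.

```c
/* Speed-up implied by two mean sample counts: (before / after - 1) * 100. */
static double change_percent(double before, double after)
{
    return (before / after - 1.0) * 100.0;
}
```

Under that assumption, `change_percent(2245.5, 1599.6)` comes to about 40.4 and `change_percent(940.6, 348.1)` to about 170.2, matching the two rows. Fewer profiler samples means less time spent, so a positive value is an improvement.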

armv6: Accelerate ff_imdct_half for general case (mdct_bits != 6) · 5c22e8e4

Ben Avison authored 10 years ago

The previous implementation targeted DTS Coherent Acoustics, which only
requires mdct_bits == 6. This relatively small size lent itself to
unrolling the loops a small number of times, and encoding offsets
calculated at assembly time within the load/store instructions of each
iteration.

In the more general case (codecs such as AAC and AC3) much larger arrays
are used - mdct_bits == [8, 9, 11]. The old method does not scale for
these cases, so more integer registers are used with non-unrolled versions
of the loops (and with some stack spillage). The postrotation filter loop
is still unrolled by a factor of 2 to permit the double-buffering of some
VFP registers to facilitate overlap of neighbouring iterations.

I benchmarked the result by measuring the number of gperftools samples
that hit anywhere in the AAC decoder (starting from aac_decode_frame())
or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same
example AAC stream:

                  Before           After
                  Mean    StdDev   Mean    StdDev   Confidence  Change
aac_decode_frame  2368.1  35.8     2117.2  35.3     100.0%      +11.8%
ff_imdct_half_*   457.5   22.4     251.2   16.2     100.0%      +82.1%
Signed-off-by: Martin Storsjö <martin@martin.st>

5c22e8e4

dsputil: Split motion estimation compare bits off into their own context · 2d604443

Diego Biurrun authored 10 years ago

2d604443

16 Jul, 2014 1 commit
- arm: dsputil: Coalesce all init files · adff0a81
  Diego Biurrun authored 10 years ago
  
  adff0a81
13 Jul, 2014 1 commit

armv6: Accelerate ff_imdct_half for general case (mdct_bits != 6) · 42c1cc35

Ben Avison authored 10 years ago

42c1cc35

11 Jul, 2014 1 commit
- dsputil: Drop unused bit_depth parameter from all init functions · 11733202
  Diego Biurrun authored 10 years ago
  
  11733202
09 Jul, 2014 1 commit
- dsputil: Split off pixel block routines into their own context · f46bb608
  Diego Biurrun authored 11 years ago
  
  f46bb608
08 Jul, 2014 1 commit

arm: Avoid using the 'setend' instruction on ARMv7 and newer · 79fce1ec

Martin Storsjö authored 10 years ago

This instruction is deprecated on ARMv8, and it is serializing on
some ARMv7 cores as well [1].

[1] http://article.gmane.org/gmane.linux.ports.arm.kernel/339293

CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>

79fce1ec

06 Jul, 2014 1 commit
- dsputil: Move pix_sum, pix_norm1, shrink function pointers to mpegvideoenc · c1661484
  Diego Biurrun authored 11 years ago
  
  c1661484
30 Jun, 2014 1 commit
- dsputil: Split off IDCT bits into their own context · e3fcb143
  Diego Biurrun authored 11 years ago
  
  e3fcb143
23 Jun, 2014 1 commit
- h264: avoid using uninitialized memory in NEON chroma mc · f23d26a6
  Janne Grunau authored 10 years ago
```
Adapt commit 982b596e for the arm and
aarch64 NEON asm. 5-10% faster on Cortex-A9.
```
  f23d26a6
22 Jun, 2014 1 commit
- dsputil: Split audio operations off into a separate context · 9a9e2f1c
  Diego Biurrun authored 11 years ago
  
  9a9e2f1c
19 Jun, 2014 1 commit

avcodec: add simpleauto idct · 08c5859f

Michael Niedermayer authored 10 years ago

This will pick the "best" IDCT that is compatible with the simple IDCT.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

08c5859f

18 Jun, 2014 1 commit
- dsputil: Split clear_block*/fill_block* off into a separate context · e74433a8
  Diego Biurrun authored 11 years ago
  
  e74433a8
05 Jun, 2014 1 commit

apedsp: move to llauddsp · ccff45a0

Christophe Gisquet authored 10 years ago

APE is not the sole codec using scalarproduct_and_madd_int16.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

ccff45a0
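
As a rough illustration of why such a routine is useful beyond APE, here is what a combined scalar product and multiply-add over 16-bit vectors can look like in plain C. The semantics are an assumption read off the function name (dot product of `v1` and `v2`, then `v1 += mul * v3`), and the `_sketch` suffix marks this as hypothetical rather than the library's code.

```c
#include <stdint.h>

/*
 * Return the dot product of v1 and v2 over `order` elements, and update
 * v1 in place with v1[i] += mul * v3[i]. The product for element i is
 * taken before that element is modified.
 */
static int32_t scalarproduct_and_madd_int16_sketch(int16_t *v1,
                                                   const int16_t *v2,
                                                   const int16_t *v3,
                                                   int order, int mul)
{
    int32_t res = 0;
    for (int i = 0; i < order; i++) {
        res   += v1[i] * v2[i];
        v1[i]  = (int16_t)(v1[i] + mul * v3[i]);
    }
    return res;
}
```

Fusing both steps into one pass means each element of `v1` is loaded once, which is the sort of loop that adaptive filters in lossless audio codecs tend to run repeatedly.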

03 Jun, 2014 1 commit

arm: check if AS supports .dn · 896a5bff

Janne Grunau authored 10 years ago

Move the GNU as check before the arch-specific asm checks, since the .dn
check requires a gas-compatible assembler.

Disable the VC-1 motion compensation NEON asm, which is the only part
using that directive. The integrated assembler in the upcoming clang 3.5
does not support .dn/.qn and there are no plans to change that: it is
considered too much effort to implement for a rarely used directive.

http://llvm.org/bugs/show_bug.cgi?id=18199

896a5bff

29 May, 2014 1 commit
- dsputil: Move APE-specific bits into apedsp · 054013a0
  Diego Biurrun authored 11 years ago
  
  054013a0
29 Apr, 2014 1 commit
- mpegvideo: move the MpegEncContext fields used from arm asm to the beginning · 6a13505c
  Anton Khirnov authored 11 years ago
```
This should reduce the frequency with which the offsets need to be
updated.
```
  6a13505c
25 Apr, 2014 2 commits

vc-1: Add platform-specific start code search routine to VC1DSPContext. · 9d8ecdd8

Ben Avison authored 10 years ago

Initialise VC1DSPContext for parser as well as for decoder.
Note, the VC-1 code doesn't actually use the function pointer yet.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

9d8ecdd8

h264: Move start code search functions into separate source files. · 270cede3

Ben Avison authored 10 years ago

This permits re-use with parsers for codecs which use similar start codes.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

270cede3

24 Apr, 2014 1 commit
- lavu: add CHK_OFFS as AV_CHECK_OFFSET to check struct member offsets · a88e1d1c
  Janne Grunau authored 10 years ago
  
  a88e1d1c
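
A compile-time check of struct member offsets, as named in the commit title above, can be sketched with C11 `_Static_assert`. The macro `CHECK_OFFSET_SKETCH`, the struct `ctx_sketch`, and the offset constants are hypothetical; this is not the definition of AV_CHECK_OFFSET.

```c
#include <stddef.h>

/* Hypothetical macro: fail the build if a member is not where asm expects it. */
#define CHECK_OFFSET_SKETCH(type, member, expected)                  \
    _Static_assert(offsetof(type, member) == (expected),             \
                   "offset of " #member " differs from the value used in asm")

/* Example context whose leading fields are read from assembly. */
struct ctx_sketch {
    int   width;
    int   height;
    short block[64];
};

/* Offsets that a hand-written asm file would hard-code (illustrative). */
#define CTX_WIDTH_OFFSET  0
#define CTX_HEIGHT_OFFSET sizeof(int)

CHECK_OFFSET_SKETCH(struct ctx_sketch, width,  CTX_WIDTH_OFFSET);
CHECK_OFFSET_SKETCH(struct ctx_sketch, height, CTX_HEIGHT_OFFSET);
```

If someone inserts a field before `height`, the build stops with the message above instead of the assembly silently reading the wrong memory. Keeping asm-visible fields at the start of a struct makes such offsets change less often.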
20 Apr, 2014 1 commit
- avcodec/arm/vc1dsp_init_neon: fix code so it compiles and passes fate-vc1 · af89a685
  Michael Niedermayer authored 10 years ago
```
The original patch seems to be missing a 16x16 function, though.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
```
  af89a685