- 07 Aug, 2012 1 commit
-
-
Anton Khirnov authored
-
- 03 Aug, 2012 2 commits
-
-
Mans Rullgard authored
In the GNU assembler, a relational expression, bizarrely, has the value -1 if true, whereas in Apple's it is +1. This patch makes sure the correct expression is used in both cases. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
Mans Rullgard authored
The clang integrated assembler does not support pre-UAL syntax, while gcc requires pre-UAL syntax for ARM code. A patch[1] for clang to support the old syntax as well has been ignored since January. This patch chooses the syntax appropriate for each compiler, allowing both to build the code. Notably, this change allows building for iPhone with the latest Apple Xcode update. [1] http://llvm.org/bugs/show_bug.cgi?id=11855 Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
- 01 Aug, 2012 2 commits
-
-
Mans Rullgard authored
Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
Mans Rullgard authored
The standard syntax requires two destination registers for LDRD/STRD instructions. Some versions of the GNU assembler allow using only one with the second implicit, others are more strict. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
- 18 Jul, 2012 2 commits
-
-
Mans Rullgard authored
This moves all VP3-specific function pointers from dsputil to a new vp3dsp context. There is no reason to ever use the VP3 IDCT where an MPEG2 IDCT is expected or vice versa. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
Mans Rullgard authored
Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
- 01 Jul, 2012 1 commit
-
-
Mans Rullgard authored
This creates proper position independent code when accessing data symbols if CONFIG_PIC is set. References to external symbols should now use the movrelx macro. Some additional code changes are required since this macro may need a register to hold the GOT pointer. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
- 18 Jun, 2012 1 commit
-
-
Justin Ruggles authored
-
- 08 Jun, 2012 2 commits
-
-
Justin Ruggles authored
Move vector_fmul() from DSPContext to AVFloatDSPContext.
-
Justin Ruggles authored
This will allow for easier implementation of ARM-optimized functions in libraries other than libavcodec.
-
- 10 May, 2012 3 commits
-
-
Mans Rullgard authored
Change the size specifiers to match the actual element sizes of the data. This makes no practical difference with strict alignment checking disabled (the default) other than somewhat documenting the code. With strict alignment checking on, it avoids trapping the unaligned loads. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
Mans Rullgard authored
Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
Mans Rullgard authored
The vertically interpolating variants of these functions read ahead one line to optimise the loop. On the last line processed, this might be outside the buffer. Fix these invalid reads by processing the last line outside the loop. Signed-off-by:
Mans Rullgard <mans@mansr.com>
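The read-ahead fix described above can be illustrated in C. This is a minimal sketch under stated assumptions, not the NEON code from the commit: `put_pixels_y2_sketch` and `avg2` are hypothetical names, and rounded averaging of two rows stands in for whatever the real functions compute.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rounded average of two pixel values. */
static uint8_t avg2(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/*
 * Vertical interpolation of an h-row block, w pixels wide (h >= 1).
 * Output row y is the average of source rows y and y+1, so only source
 * rows 0..h may be read.
 */
static void put_pixels_y2_sketch(uint8_t *dst, const uint8_t *src,
                                 ptrdiff_t stride, int w, int h)
{
    const uint8_t *cur  = src;          /* source row y     */
    const uint8_t *next = src + stride; /* source row y + 1 */
    int x, y;

    for (y = 0; y < h - 1; y++) {
        /* Address of row y+2, which an optimised loop would load here for
         * the next iteration. Since y <= h-2, this is at most row h. */
        const uint8_t *ahead = next + stride;

        for (x = 0; x < w; x++)
            dst[x] = avg2(cur[x], next[x]);

        dst += stride;
        cur  = next;
        next = ahead;
    }

    /* Last output row, processed outside the loop: no read-ahead happens,
     * so row h+1 is never touched. */
    for (x = 0; x < w; x++)
        dst[x] = avg2(cur[x], next[x]);
}
```

Running the loop for all `h` rows instead would compute the address of row `h+1` on the final iteration, and a version that loads it eagerly would read past the buffer.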
-
- 05 May, 2012 1 commit
-
-
Mans Rullgard authored
Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
- 25 Apr, 2012 4 commits
-
-
Mans Rullgard authored
The assembler may fail to place literal pools close enough to instructions referencing them. An explicit .ltorg directive fixes this. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
Mans Rullgard authored
Based on patch by Ronald S. Bultje <rsbultje@gmail.com>, partially ported from libvpx. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
Mans Rullgard authored
This is a preparation for complete ARMv6 optimisations. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
Mans Rullgard authored
This adds some macros simplifying Thumb and pre-v6T2 compatibility. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
- 22 Apr, 2012 1 commit
-
-
Mans Rullgard authored
This allows masking CPU features with the -cpuflags avconv option which is useful for testing different optimisations without rebuilding. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
- 21 Apr, 2012 1 commit
-
-
Mans Rullgard authored
This feature is complex, of questionable utility, and slows down normal decoding. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
- 12 Apr, 2012 1 commit
-
-
Diego Biurrun authored
-
- 10 Apr, 2012 1 commit
-
-
Christophe GISQUET authored
Quite often, the original weights are multiples of 512. By prescaling them by 1/512 when they are computed (once per frame), no intermediate shifting is needed, and no prescaling on each call either. The x86 code already used that trick. Signed-off-by:
Ronald S. Bultje <rsbultje@gmail.com>
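The arithmetic behind the prescaling trick can be sketched as follows. The function names are hypothetical, and the exact weighting formula, rounding, and final scaling used by the codec are not shown in this log, so a plain two-term weighted sum with a 9-bit shift (512 = 2^9) is assumed here.

```c
#include <stdint.h>

/* Assumed per-call form: each product is shifted down by 9 bits. */
static unsigned blend_shift_each_call(unsigned w1, unsigned w2,
                                      uint8_t a, uint8_t b)
{
    return ((w1 * a) >> 9) + ((w2 * b) >> 9);
}

/* Prescaled form: the weights were already divided by 512, so the
 * intermediate shifts disappear. */
static unsigned blend_prescaled(unsigned pw1, unsigned pw2,
                                uint8_t a, uint8_t b)
{
    return pw1 * a + pw2 * b;
}

/*
 * Once-per-frame step (hypothetical): if both weights are multiples of
 * 512, store them divided by 512 and return 1; otherwise return 0 so the
 * caller can fall back to the shifting version.
 */
static int prescale_weights(unsigned w1, unsigned w2,
                            unsigned *pw1, unsigned *pw2)
{
    if ((w1 | w2) & 511)
        return 0;
    *pw1 = w1 >> 9;
    *pw2 = w2 >> 9;
    return 1;
}
```

When a weight is exactly `512 * k`, the product `512 * k * a` shifted right by 9 is `k * a` with no bits lost, so both forms give the same result.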
-
- 04 Apr, 2012 1 commit
-
-
Diego Biurrun authored
-
- 12 Mar, 2012 1 commit
-
-
Janne Grunau authored
They were broken since August of 2010 without anyone noticing until three weeks ago. Nobody cares about it anymore and hopefully Marvell will support NEON like in the PXA978 from now on.
-
- 07 Mar, 2012 1 commit
-
-
Christophe GISQUET authored
There is only one caller, which does not need the shifting. Other use cases are situations where different roundings would be needed. The x86 and neon versions are modified accordingly. Signed-off-by:
Ronald S. Bultje <rsbultje@gmail.com>
-
- 02 Mar, 2012 1 commit
-
-
Ronald S. Bultje authored
On 64-bit platforms with 32-bit int, this means we won't have to sign-extend the integer anymore.
-
- 23 Feb, 2012 1 commit
-
-
Christophe GISQUET authored
Signed-off-by:
Ronald S. Bultje <rsbultje@gmail.com>
-
- 20 Feb, 2012 1 commit
-
-
Ronald S. Bultje authored
This prevents having to sign-extend on 64-bit systems with 32-bit ints, such as x86-64. Also fixes crashes on systems where we don't do it and arguments are not in registers, such as Win64 for all weight functions.
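As an illustration of the sign-extension point, here is a minimal C sketch. It assumes the argument in question is a row stride passed as a pointer-sized signed type; the commit title is not shown in this log, and `copy_rows_sketch` is a hypothetical function, not one from the codebase.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical row copy. The stride is ptrdiff_t, which on the usual
 * 64-bit ABIs is as wide as a pointer, so it can be used in address
 * arithmetic directly. With a 32-bit int stride, the value would first
 * have to be sign-extended to 64 bits, and hand-written assembly that
 * skips this step would see garbage in the upper half whenever the
 * caller did not extend it.
 */
static void copy_rows_sketch(uint8_t *dst, const uint8_t *src,
                             ptrdiff_t stride, int w, int h)
{
    int x, y;

    for (y = 0; y < h; y++) {
        /* Row offset computed in ptrdiff_t; negative strides
         * (bottom-up images) work as well. */
        ptrdiff_t off = (ptrdiff_t)y * stride;

        for (x = 0; x < w; x++)
            dst[off + x] = src[off + x];
    }
}
```

A caller walking an image bottom-up would pass a pointer to the last row together with a negative stride.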
-
- 15 Feb, 2012 2 commits
-
-
Martin Storsjö authored
Signed-off-by:
Martin Storsjö <martin@martin.st>
-
Martin Storsjö authored
Signed-off-by:
Martin Storsjö <martin@martin.st>
-
- 09 Feb, 2012 1 commit
-
-
Diego Biurrun authored
-
- 06 Feb, 2012 1 commit
-
-
Diego Biurrun authored
-
- 02 Feb, 2012 1 commit
-
-
Mans Rullgard authored
This function was broken when the start bin was not at the start of a band. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
- 28 Jan, 2012 1 commit
-
-
Mans Rullgard authored
Overall speedup of HE-AAC decoding 2.3x on Cortex-A8, 1.2x on A9. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
- 20 Jan, 2012 1 commit
-
-
Felipe Contreras authored
Signed-off-by:
Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
- 16 Jan, 2012 2 commits
-
-
Janne Grunau authored
Overall almost 4% faster, idct_add down from 350 to 85 cycles, idct_dc_add down from 83 to 30 cycles. squash: rv34 idct rearrange partial register loads
-
Christophe GISQUET authored
Implement 1-pass inverse transform and reconstruction for inter blocks.
-
- 13 Jan, 2012 2 commits
-
-
Mans Rullgard authored
The alignment directive must obviously precede the label. This was never noticed in ARM mode since the location is already aligned there. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-
Mans Rullgard authored
Due to apparent bugs in the GNU assembler and/or linker, relocations can be incorrectly processed if the alignment of a Thumb instruction is changed in the output file compared to the input object. This fixes crashes in h264 decoding with Thumb enabled. No effect in ARM mode since everything is 4-byte aligned there. Signed-off-by:
Mans Rullgard <mans@mansr.com>
-