- 28 Mar, 2012 7 commits
-
-
Ronald S. Bultje authored
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
- 26 Mar, 2012 1 commit
-
-
Diego Biurrun authored
-
- 25 Mar, 2012 4 commits
-
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
This makes them safe to use in non-fully braced if-blocks and similar.
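A minimal sketch of why this matters, assuming the usual `do { ... } while (0)` wrapper (the commit text does not say which construct or which macros were involved; `SWAP_INT` and `sort_pair` are hypothetical names used only for illustration):

```c
/* Hypothetical multi-statement macro, not taken from the commit.
 * The do { ... } while (0) wrapper makes the expansion a single
 * statement that still requires the trailing semicolon. */
#define SWAP_INT(a, b)      \
    do {                    \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)

/* Puts the pair in ascending order; returns 1 if a swap happened. */
static int sort_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);  /* one statement, so the else below still pairs with this if */
    else
        return 0;
    return 1;
}
```

With a bare `{ ... }` block instead of the wrapper, the semicolon after the macro call would end the `if` statement and the following `else` would fail to compile.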
-
- 24 Mar, 2012 1 commit
-
-
Carl Eugen Hoyos authored
-
- 23 Mar, 2012 2 commits
-
-
Ronald S. Bultje authored
Prevents a sign flip in the counter, and a subsequent crash because of overreads/overwrites.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
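The affected code is not shown here, so the following is only a generic sketch of this class of bug, not the actual fix. It assumes a hypothetical run-length expander (`expand_runs` and its parameters are illustrative names) in which a signed "remaining" counter would go negative if a run were larger than the space left:

```c
#include <stdint.h>

/* Hypothetical run-length expander, for illustration only.
 * runs[] holds (length, value) pairs. Returns the number of bytes
 * written, or -1 if a run does not fit in the output buffer. */
static int expand_runs(uint8_t *dst, int dst_size,
                       const uint8_t *runs, int nb_runs)
{
    int remaining = dst_size;
    int written   = 0;

    for (int i = 0; i < nb_runs; i++) {
        int     run   = runs[2 * i];
        uint8_t value = runs[2 * i + 1];

        /* Without this check, remaining -= run could turn negative,
         * and later writes would land past the end of dst. */
        if (run > remaining)
            return -1;

        for (int j = 0; j < run; j++)
            dst[written++] = value;
        remaining -= run;
    }
    return written;
}
```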
-
Reimar Döffinger authored
They were moved into code under HAVE_YASM and most of them even into completely disabled code with no reason given for that in the commit message.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
-
- 22 Mar, 2012 1 commit
-
-
ami_stuff authored
Fixes an AAC decoding issue with the sample from ticket #213 on machines with SSE but without SSE2. Based on 89411a by Reimar.
-
- 21 Mar, 2012 1 commit
-
-
Reimar Döffinger authored
This is even potentially faster in this use-case. Should fix AAC SBR decoding on machines with SSE but not SSE2, fixing Trac issue #1041.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
-
- 17 Mar, 2012 1 commit
-
-
Michael Niedermayer authored
Fixes Ticket1068
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 15 Mar, 2012 2 commits
-
-
Michael Niedermayer authored
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 12 Mar, 2012 1 commit
-
-
Nico Weber authored
Yasm creates an implicit unaligned text section if "struc" is used outside of any section: http://tortall.lighthouseapp.com/projects/78676-yasm/tickets/247 Since yasm only honors the "align" annotation on the first declaration of a section, this implicit text section causes all text section alignments to be ignored. Also fixes a yasm warning about it ignoring alignment.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 10 Mar, 2012 2 commits
-
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
- 07 Mar, 2012 3 commits
-
-
Reimar Döffinger authored
Since the values are floats, using the float operations makes sense, improves performance on some CPUs and makes the code SSE compatible instead of needing SSE2. Based on suggestion by Jason.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
-
Christophe GISQUET authored
There is only one caller, which does not need the shifting. Other use cases are situations where different roundings would be needed. The x86 and neon versions are modified accordingly.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
-
Diego Biurrun authored
-
- 06 Mar, 2012 1 commit
-
-
Reimar Döffinger authored
movq from SSE register _to_ memory is an SSE2 instruction. Use the SSE movlps instruction instead, which does the same thing.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
-
- 05 Mar, 2012 1 commit
-
-
Mans Rullgard authored
This splits ff_dsputil_init_mmx() into multiple functions, one for each MMX/SSE level, somewhat simplifying the nested conditions.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
- 04 Mar, 2012 5 commits
-
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
- 02 Mar, 2012 3 commits
-
-
Ronald S. Bultje authored
This prevents us from reading before the start of the buffer, and thus prevents crashes resulting from this behaviour. Fixes bug 237.
-
Ronald S. Bultje authored
x86-64 is guaranteed to have at least SSE2, therefore the MMX/MMX2 functions will never be used in practice.
-
Ronald S. Bultje authored
On 64bit platforms with 32bit int, this means we won't have to sign-extend the integer anymore.
-
- 27 Feb, 2012 1 commit
-
-
Ronald S. Bultje authored
-
- 23 Feb, 2012 2 commits
-
-
Christophe GISQUET authored
Unrolling the main loop to process, instead of 4 elements:
- 8: minor gain of 2 cycles (not worth the extra object size)
- 2: loss of 8 cycles.
Assigning STEP to a register is a loss.
Output address (Y) is almost always unaligned.
Timings:
- C (32/64 bits): 117/109 cycles
- SSE: 57 cycles
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
-
Christophe GISQUET authored
The 32bits targets have been compiled with -mfpmath=sse for proper reference.
sbr_sum_square
  C  /32bits: 82c (unrolled)/102c
  C  /64bits: 69c (unrolled)/82c
  SSE/32bits: 42c
  SSE/64bits: 31c
Use of SSE4.1 dpps to perform the final sum is slower.
Not unrolling to perform 8 operations in a loop yields 10 more cycles.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
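For orientation, a sketch of what a C reference for a sum-of-squares over complex samples could look like, in a simple and an unrolled form. The function names, the `float (*x)[2]` layout (real and imaginary part per sample) and the unroll factor of two are assumptions made for illustration; the commit itself only gives the timings above.

```c
/* Simple form: one complex sample (re, im) per iteration. */
static float sum_square_simple(float (*x)[2], int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; i++)
        sum += x[i][0] * x[i][0] + x[i][1] * x[i][1];
    return sum;
}

/* Unrolled by two with independent accumulators, which shortens the
 * dependency chain on the adds. Assumes n is even. */
static float sum_square_unrolled(float (*x)[2], int n)
{
    float sum0 = 0.0f, sum1 = 0.0f;
    for (int i = 0; i < n; i += 2) {
        sum0 += x[i    ][0] * x[i    ][0] + x[i    ][1] * x[i    ][1];
        sum1 += x[i + 1][0] * x[i + 1][0] + x[i + 1][1] * x[i + 1][1];
    }
    return sum0 + sum1;
}
```

Because the two forms add in a different order, their results can differ in the last bits for general floating-point input.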
-
- 20 Feb, 2012 1 commit
-
-
Ronald S. Bultje authored
This prevents having to sign-extend on 64-bit systems with 32-bit ints, such as x86-64. Also fixes crashes on systems where we don't do it and arguments are not in registers, such as Win64 for all weight functions.
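A small sketch of the idea, assuming the change amounts to declaring stride-like arguments as `ptrdiff_t` rather than `int` (the commit text does not name the type; `sum_column` is a hypothetical helper, not one of the affected functions):

```c
#include <stddef.h>
#include <stdint.h>

/* Sums one column of an image. The stride is pointer-sized, so it can be
 * used in address arithmetic directly, without first widening a 32-bit
 * int to 64 bits. Negative strides (bottom-up images) work as well. */
static int sum_column(const uint8_t *src, ptrdiff_t stride, int height)
{
    int sum = 0;
    for (ptrdiff_t y = 0; y < height; y++)
        sum += src[y * stride];
    return sum;
}
```

In hand-written assembly the same applies: a pointer-sized stride argument can be added to an address register as-is, wherever the calling convention places it.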
-