- 13 May, 2017 1 commit
-
-
James Almer authored
-
- 12 Apr, 2017 1 commit
-
-
James Almer authored
~20% faster than AVX. Signed-off-by: James Almer <jamrial@gmail.com>
-
- 10 Apr, 2017 1 commit
-
-
James Almer authored
-
- 08 Jan, 2016 3 commits
-
-
James Almer authored
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
-
James Almer authored
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
-
James Almer authored
The function documentation explicitly mentions it needs to be a multiple of 4. Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
-
- 26 Jul, 2015 1 commit
-
-
James Almer authored
Silences warnings with NASM. Signed-off-by: James Almer <jamrial@gmail.com>
-
- 08 Jun, 2014 2 commits
-
-
James Almer authored
It was lost during the port. Should fix fate on 3dnowext machines. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
James Almer authored
Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 19 Apr, 2014 1 commit
-
-
James Almer authored
Use the xm# and ym# aliases as they remain in sync with m# after a SWAP. No actual changes to the assembly. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 16 Apr, 2014 2 commits
-
-
James Almer authored
~6% faster SSE2 performance. AVX/FMA3 are unaffected. Signed-off-by: James Almer <jamrial@gmail.com> Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
James Almer authored
The mova is unnecessary. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 13 Mar, 2014 1 commit
-
-
James Almer authored
~7% faster than AVX. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 20 Feb, 2014 1 commit
-
-
Christophe Gisquet authored
vector_fmul and vector_fmac_scalar are guaranteed to be able to process batches of 16 elements, but their SSE versions only do 8 at a time. Therefore, unroll them a bit. From 299 to 261 cycles for 256 elements in vector_fmac_scalar on Arrandale/Win64. Signed-off-by: Janne Grunau <janne-libav@jannau.net>
-
- 15 Feb, 2014 1 commit
-
-
Christophe Gisquet authored
vector_fmul and vector_fmac_scalar are guaranteed to be able to process batches of 16 elements, but their SSE versions only do 8 at a time. Therefore, unroll them a bit. From 299 to 261 cycles for 256 elements in vector_fmac_scalar on Arrandale/Win64. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 03 May, 2013 1 commit
-
-
Christophe Gisquet authored
97c -> 49c. Some codecs could benefit from more unrolling, but AAC doesn't.
-
- 16 Apr, 2013 2 commits
-
-
Michael Niedermayer authored
Adds are simpler instructions and should be at least as fast on all CPUs. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Christophe Gisquet authored
97c -> 49c. Some codecs could benefit from more unrolling, but AAC doesn't. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 22 Jan, 2013 3 commits
-
-
Ronald S. Bultje authored
Now, nellymoserenc and aacenc no longer depend on dsputil. Independent of this patch, wmaprodec also does not depend on dsputil, so I removed it from there also.
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
This makes the aac decoder and all voice codecs independent of dsputil.
-
- 08 Dec, 2012 1 commit
-
-
Justin Ruggles authored
-
- 06 Dec, 2012 1 commit
-
-
Justin Ruggles authored
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
-
- 05 Dec, 2012 1 commit
-
-
Justin Ruggles authored
Include x86-optimized versions for SSE2 and AVX.
-
- 26 Nov, 2012 1 commit
-
-
Justin Ruggles authored
-
- 11 Nov, 2012 1 commit
-
-
Diego Biurrun authored
An assembler able to cope with AVX instructions is now required.
-
- 30 Oct, 2012 1 commit
-
-
Diego Biurrun authored
This is necessary to allow refactoring some x86util macros with cpuflags.
-
- 07 Sep, 2012 1 commit
-
-
Justin Ruggles authored
The SWAP macro does not work for explicit xmm/ymm usage, so instead just move the scalar value from xmm2 to xmm0.
-
- 30 Aug, 2012 1 commit
-
-
Diego Biurrun authored
-
- 07 Aug, 2012 1 commit
-
-
Mans Rullgard authored
nasm prints a warning if the colon is missing. Signed-off-by: Mans Rullgard <mans@mansr.com>
-
- 26 Jul, 2012 1 commit
-
-
Ronald S. Bultje authored
-
- 18 Jun, 2012 1 commit
-
-
Justin Ruggles authored
-
- 09 Jun, 2012 1 commit
-
-
Michael Niedermayer authored
The attribution was removed by libav while moving the code to libavutil. The original code is from commit eb4825b5 (Author: Loren Merritt <lorenm@u.washington.edu>; Date: Thu Aug 10 19:06:25 2006 +0000; "sse and 3dnow implementations of float->int conversion and mdct windowing. 15% faster vorbis.") and commit 06972056 (Author: Loren Merritt <lorenm@u.washington.edu>; Date: Fri Aug 11 18:19:37 2006 +0000; "vorbis simd tweaks"). Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 08 Jun, 2012 1 commit
-
-
Justin Ruggles authored
Move vector_fmul() from DSPContext to AVFloatDSPContext.
-
- 29 May, 2012 1 commit
-
-
Justin Ruggles authored
-