- 24 Feb, 2014 1 commit
-
-
James Almer authored
We need the emulation to support the cases where the first argument is the same as the fourth. To achieve this, a fifth argument working as a temporary may be needed. Emulation that doesn't obey the original instruction semantics can't be in x86inc. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
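The aliasing problem described above can be sketched with a scalar C model. This is only an illustration, not the actual x86inc macro: the function names, the four 32-bit lanes, and the multiply-then-add split are assumptions chosen to resemble a multiply-accumulate such as `dst = a * b + acc`.

```c
#include <stdint.h>

#define LANES 4  /* assumed: four 32-bit lanes, as in one XMM register */

/* Naive two-step emulation: the product is written to dst first.
 * If dst and acc are the same register, acc is overwritten before it is read. */
static void macs_naive(uint32_t *dst, const uint32_t *a,
                       const uint32_t *b, const uint32_t *acc)
{
    for (int i = 0; i < LANES; i++)
        dst[i] = a[i] * b[i];      /* models "multiply into dst" */
    for (int i = 0; i < LANES; i++)
        dst[i] += acc[i];          /* models "add acc to dst" */
}

/* Emulation with a fifth operand used as a temporary: the product goes to
 * tmp, so acc is still intact when it is added, even if dst == acc. */
static void macs_tmp(uint32_t *dst, const uint32_t *a, const uint32_t *b,
                     const uint32_t *acc, uint32_t *tmp)
{
    for (int i = 0; i < LANES; i++)
        tmp[i] = a[i] * b[i];      /* models "multiply into tmp" */
    for (int i = 0; i < LANES; i++)
        dst[i] = tmp[i] + acc[i];  /* models "add tmp and acc into dst" */
}
```

Calling `macs_naive(acc, a, b, acc)` returns twice the product instead of product plus accumulator, while `macs_tmp(acc, a, b, acc, tmp)` matches the non-aliased result.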
-
- 23 Feb, 2014 3 commits
-
-
James Almer authored
Based on x264 code Signed-off-by: James Almer <jamrial@gmail.com>
-
James Almer authored
Based on x264 code Signed-off-by: James Almer <jamrial@gmail.com>
-
James Almer authored
Signed-off-by: James Almer <jamrial@gmail.com>
-
- 22 Feb, 2014 2 commits
-
-
James Almer authored
Based on x264 code Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
James Almer authored
Based on x264 code Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 20 Feb, 2014 1 commit
-
-
Christophe Gisquet authored
vector_fmul and vector_fmac_scalar are guaranteed to be able to process in batches of 16 elements, but their SSE versions only do 8 at a time. Therefore, unroll them a bit. 299 to 261c for 256 elements in vector_fmac_scalar on Arrandale/Win64. Signed-off-by: Janne Grunau <janne-libav@jannau.net>
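As a rough illustration of the unrolling idea, here is a scalar C sketch. It is not the SSE code from the commit: the function names are hypothetical, and the only property taken from the message is that the length can be assumed to be a multiple of 16, which lets the loop body handle 16 elements with no tail handling.

```c
#include <stddef.h>

/* Reference version: one element per iteration. */
static void fmac_scalar_ref(float *dst, const float *src, float mul, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] += src[i] * mul;
}

/* Unrolled version: len is assumed to be a multiple of 16, so each outer
 * iteration handles 16 floats (four groups of 4, mirroring four XMM
 * registers) and no remainder loop is needed. */
static void fmac_scalar_unrolled16(float *dst, const float *src, float mul,
                                   size_t len)
{
    for (size_t i = 0; i < len; i += 16) {
        for (size_t j = i; j < i + 16; j += 4) {
            dst[j + 0] += src[j + 0] * mul;
            dst[j + 1] += src[j + 1] * mul;
            dst[j + 2] += src[j + 2] * mul;
            dst[j + 3] += src[j + 3] * mul;
        }
    }
}
```

Fewer loop iterations mean fewer counter updates and branches per element, which is the kind of saving the cycle counts above reflect.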
-
- 15 Feb, 2014 1 commit
-
-
Christophe Gisquet authored
vector_fmul and vector_fmac_scalar are guaranteed to be able to process in batches of 16 elements, but their SSE versions only do 8 at a time. Therefore, unroll them a bit. 299 to 261c for 256 elements in vector_fmac_scalar on Arrandale/Win64. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 13 Feb, 2014 1 commit
-
-
James Almer authored
Support the cases where the first and last operand of the XOP instruction are the same. Also add vpmacsdql emulation. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 11 Feb, 2014 1 commit
-
-
James Almer authored
Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 26 Jan, 2014 1 commit
-
-
Loren Merritt authored
Work around Yasm's inefficiency with handling large numbers of variables in the global scope. Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
- 18 Jan, 2014 1 commit
-
-
Loren Merritt authored
Work around yasm's inefficiency with handling large numbers of variables in the global scope.
-
- 17 Nov, 2013 2 commits
-
-
Michael Niedermayer authored
Also remove the failed attempt at a compatibility layer; the code simply cannot work. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Michael Niedermayer authored
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 26 Oct, 2013 1 commit
-
-
Kieran Kunhya authored
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 25 Oct, 2013 1 commit
-
-
Kieran Kunhya authored
Patch based on x264's AVX2 detection Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
- 14 Oct, 2013 4 commits
-
-
Jason Garrett-Glaser authored
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Jason Garrett-Glaser authored
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Derek Buitenhuis authored
This is so we can sync to x264's version of FMA4 support. This partially reverts commit 79687079. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Henrik Gramner authored
Automatically use VEX-encoding in AVX/AVX2/XOP/FMA3/FMA4 functions for all instructions that exist in a VEX-encoded version. This change makes it easier to extend existing code to use AVX2. Also add support for AVX emulation of a few instructions that were missing before. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
- 09 Oct, 2013 1 commit
-
-
Henrik Gramner authored
The Mach-O bug was fixed in yasm 0.8.0 and we don't support versions that old anymore. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
- 07 Oct, 2013 9 commits
-
-
Henrik Gramner authored
Prevents a crash if the misaligned exception mask bit is cleared for some reason. Misaligned SSE functions are only used on AMD Phenom CPUs and the benefit is minuscule. They also require modifying the MXCSR control register, and by removing those functions we can get rid of that complexity altogether. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Jason Garrett-Glaser authored
Small backports that sneaked into other asm commits in x264. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Derek Buitenhuis authored
This is also a valid value for WIN64. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Henrik Gramner authored
Store XMM6 and XMM7 in the shadow space in functions that clobber them. This way we don't have to adjust the stack pointer as often, reducing the number of instructions as well as code size. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Loren Merritt authored
For when we want to mix simd sizes within one function. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Loren Merritt authored
SWAP with >=3 named (rather than numbered) args, and PERMUTE followed by SWAP with 2 named args, used to produce the wrong permutation. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Henrik Gramner authored
Reduces code size because movaps/movups is one byte shorter than movdqa/movdqu. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
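The one-byte difference comes from the mandatory prefix on the integer moves. The sketch below lists the register-to-register encodings for `xmm0, xmm1` as byte arrays; the array names are made up for illustration, and only the non-VEX (legacy SSE) encodings are shown.

```c
/* Legacy SSE encodings of "op xmm0, xmm1" (ModRM byte 0xC1).
 * The float moves have no mandatory prefix; the integer moves need one. */
static const unsigned char enc_movaps[] = { 0x0F, 0x28, 0xC1 };       /* movaps */
static const unsigned char enc_movups[] = { 0x0F, 0x10, 0xC1 };       /* movups */
static const unsigned char enc_movdqa[] = { 0x66, 0x0F, 0x6F, 0xC1 }; /* movdqa */
static const unsigned char enc_movdqu[] = { 0xF3, 0x0F, 0x6F, 0xC1 }; /* movdqu */
```

Each float move is three bytes here against four for its integer counterpart.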
-
Henrik Gramner authored
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Loren Merritt authored
Now RET checks whether it immediately follows a branch, so the programmer doesn't have to keep track of that condition. REP_RET is still needed manually when it's a branch target, but that's much rarer. The implementation involves lots of spurious labels, but that's OK because we strip them. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
- 03 Oct, 2013 1 commit
-
-
Ronald S. Bultje authored
Decoding time of ped1080p.webm goes from 20.7sec to 11.3sec.
-
- 21 Sep, 2013 1 commit
-
-
Alex Smith authored
Because of -Werror=implicit-function-declaration the build will fail. Signed-off-by: Martin Storsjö <martin@martin.st>
-
- 30 Aug, 2013 1 commit
-
-
Thilo Borgmann authored
-
- 29 Aug, 2013 1 commit
-
-
Diego Biurrun authored
-
- 28 Aug, 2013 2 commits
-
-
Diego Biurrun authored
-
Diego Biurrun authored
-
- 17 Jul, 2013 1 commit
-
-
Diego Biurrun authored
-
- 02 Jul, 2013 2 commits
-
-
Michael Niedermayer authored
The bug has been fixed in c8b920a9 by Loren Merritt. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Loren Merritt authored
Fixes build with yasm-1.1 Signed-off-by: Anton Khirnov <anton@khirnov.net>
-
- 01 Jul, 2013 1 commit
-
-
Michael Niedermayer authored
This reverts commit 24742524.
-