- 24 Feb, 2014 1 commit
-
-
James Almer authored
We need the emulation to support the cases where the first argument is the same as the fourth. To achieve this a fifth argument working as a temporary may be needed. Emulation that doesn't obey the original instruction semantics can't be in x86inc. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
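The aliasing problem described above can be sketched with a toy scalar model. This is only an illustration under stated assumptions: the register names (`m0` to `m3`) and the helper names are hypothetical, plain Python integers stand in for packed SIMD lanes, and the two-step multiply-then-add shape of the emulation is assumed rather than taken from the commit.

```python
# Toy scalar model of a four-operand multiply-accumulate:
#   dst = src1 * src2 + acc
# All names here are hypothetical; real instructions work on packed lanes.

def macs_native(regs, dst, src1, src2, acc):
    """Reference semantics: every source is read before dst is written."""
    regs[dst] = regs[src1] * regs[src2] + regs[acc]

def macs_naive(regs, dst, src1, src2, acc):
    """Two-step emulation without a temporary: multiply into dst, then add."""
    regs[dst] = regs[src1] * regs[src2]  # overwrites acc when dst is acc
    regs[dst] = regs[dst] + regs[acc]

def macs_with_tmp(regs, dst, src1, src2, acc, tmp):
    """Two-step emulation that multiplies into a scratch register first."""
    regs[tmp] = regs[src1] * regs[src2]
    regs[dst] = regs[acc] + regs[tmp]

def run(emulation, *operands):
    """Run one variant on a fresh register file and return the value of m0."""
    regs = {"m0": 10, "m1": 3, "m2": 4, "m3": 0}
    emulation(regs, *operands)
    return regs["m0"]
```

Calling `run(macs_naive, "m0", "m1", "m2", "m0")` uses `m0` as both destination and accumulator, so the product overwrites the accumulator before it is added. Passing a scratch register, as in `run(macs_with_tmp, "m0", "m1", "m2", "m0", "m3")`, reproduces the result of `macs_native`.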
-
- 13 Feb, 2014 1 commit
-
-
James Almer authored
Support the cases where the first and last operand of the XOP instruction are the same. Also add vpmacsdql emulation. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 26 Jan, 2014 1 commit
-
-
Loren Merritt authored
Work around Yasm's inefficiency with handling large numbers of variables in the global scope. Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
- 18 Jan, 2014 1 commit
-
-
Loren Merritt authored
Work around yasm's inefficiency with handling large numbers of variables in the global scope.
-
- 14 Oct, 2013 4 commits
-
-
Jason Garrett-Glaser authored
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Jason Garrett-Glaser authored
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Derek Buitenhuis authored
This is so we can sync to x264's version of FMA4 support. This partially reverts commit 79687079. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Henrik Gramner authored
Automatically use VEX-encoding in AVX/AVX2/XOP/FMA3/FMA4 functions for all instructions that exist in a VEX-encoded version. This change makes it easier to extend existing code to use AVX2. Also add support for AVX emulation of a few instructions that were missing before. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
- 09 Oct, 2013 1 commit
-
-
Henrik Gramner authored
The Mach-O bug was fixed in yasm 0.8.0 and we don't support versions that old anymore. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
- 07 Oct, 2013 9 commits
-
-
Henrik Gramner authored
Prevents a crash if the misaligned exception mask bit is cleared for some reason. Misaligned SSE functions are only used on AMD Phenom CPUs and the benefit is minuscule. They also require modifying the MXCSR control register, and by removing those functions we can get rid of that complexity altogether. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Jason Garrett-Glaser authored
Small backports that sneaked into other asm commits in x264. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Derek Buitenhuis authored
This is also a valid value for WIN64. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Henrik Gramner authored
Store XMM6 and XMM7 in the shadow space in functions that clobber them. This way we don't have to adjust the stack pointer as often, reducing the number of instructions as well as code size. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Loren Merritt authored
For when we want to mix simd sizes within one function. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Loren Merritt authored
Two cases used to produce the wrong permutation: SWAP with >=3 named (rather than numbered) args, and PERMUTE followed by SWAP with 2 named args. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Henrik Gramner authored
Reduces code size because movaps/movups is one byte shorter than movdqa/movdqu. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
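The one-byte difference can be checked from the legacy (non-VEX) SSE encodings. The sketch below lists the register-to-register form `op xmm0, xmm1` for each instruction; the table and the helper name `size_saving` are illustrative and not part of the commit.

```python
# Legacy SSE encodings of "op xmm0, xmm1" (ModRM byte 0xC1).
# The integer moves carry a mandatory prefix byte (0x66 or 0xF3)
# that the float moves do not need.
ENCODINGS = {
    "movaps": bytes([0x0F, 0x28, 0xC1]),
    "movups": bytes([0x0F, 0x10, 0xC1]),
    "movdqa": bytes([0x66, 0x0F, 0x6F, 0xC1]),
    "movdqu": bytes([0xF3, 0x0F, 0x6F, 0xC1]),
}

def size_saving(int_op, float_op):
    """Bytes saved by using the float move in place of the integer move."""
    return len(ENCODINGS[int_op]) - len(ENCODINGS[float_op])

for int_op, float_op in [("movdqa", "movaps"), ("movdqu", "movups")]:
    print(f"{int_op} -> {float_op}: {size_saving(int_op, float_op)} byte saved")
```

The saving applies only to the legacy encodings shown here; the VEX-encoded forms are not covered by this table.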
-
Henrik Gramner authored
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
-
Loren Merritt authored
Now RET checks whether it immediately follows a branch, so the programmer doesn't have to keep track of that condition. REP_RET is still needed manually when it's a branch target, but that's much rarer. The implementation involves lots of spurious labels, but that's OK because we strip them. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
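The rule that RET applies can be modelled in a few lines. This is a simplified sketch and not the macro itself: the real check happens at assembly time using labels, whereas the hypothetical `expand_ret` helper below walks a list of instruction strings, and `BRANCHES` is an illustrative subset of conditional jumps.

```python
# Simplified model: emit "rep ret" when RET directly follows a branch,
# and a plain "ret" otherwise. Names and the branch list are illustrative.
BRANCHES = {"jz", "jnz", "je", "jne", "jl", "jle", "jg", "jge",
            "ja", "jae", "jb", "jbe", "jc", "jnc", "js", "jns"}

def expand_ret(instructions):
    """Replace each RET pseudo-instruction based on the preceding mnemonic."""
    out = []
    prev = None
    for ins in instructions:
        mnemonic = ins.split()[0]
        if mnemonic == "RET":
            out.append("rep ret" if prev in BRANCHES else "ret")
        else:
            out.append(ins)
        prev = mnemonic
    return out

print(expand_ret(["dec r0", "jg .loop", "RET"]))
print(expand_ret(["jg .loop", "add r0, 1", "RET"]))
```

A return that is itself a branch target is not detected by this model, which matches the note above that REP_RET is still written by hand in that case.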
-
- 03 Oct, 2013 1 commit
-
-
Ronald S. Bultje authored
Decoding time of ped1080p.webm goes from 20.7sec to 11.3sec.
-
- 09 Apr, 2013 1 commit
-
-
Christophe Gisquet authored
cmp{p,s}{s,d} instructions do take an imm8 operand. Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
- 05 Apr, 2013 1 commit
-
-
Christophe Gisquet authored
cmp{p,s}{s,d} instructions do take an imm8 operand. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 19 Feb, 2013 1 commit
-
-
Ronald S. Bultje authored
The "CentaurHauls family 6 model 9 stepping 8" family of CPUs (flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse up rng rng_en ace ace_en) SIGILLs on long nop codes. Signed-off-by: Martin Storsjö <martin@martin.st>
-
- 11 Feb, 2013 1 commit
-
-
Ronald S. Bultje authored
The "CPU: CentaurHauls family 6 model 9 stepping 8" family of CPUs (flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse up rng rng_en ace ace_en) SIGILLs on long nop codes. Change-Id: I7e7c52a2191006df30a9aadbc40d481a1db89106
-
- 18 Jan, 2013 2 commits
-
-
Diego Biurrun authored
This allows defining externally visible library symbols. Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Diego Biurrun authored
The new name is more descriptive and will allow defining a separate public prefix for externally visible library symbols. Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
- 19 Dec, 2012 1 commit
-
-
Ronald S. Bultje authored
Unbreak NASM support. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
-
- 13 Dec, 2012 1 commit
-
-
Janne Grunau authored
Fixes build errors with nasm introduced in 6f40e9f0 for stack memory alignment. Noticed by BugMaster.
-
- 12 Dec, 2012 3 commits
-
-
Ronald S. Bultje authored
Signed-off-by: Martin Storsjö <martin@martin.st>
-
Ronald S. Bultje authored
Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Ronald S. Bultje authored
Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
-
- 05 Dec, 2012 1 commit
-
-
Justin Ruggles authored
-
- 11 Nov, 2012 1 commit
-
-
Diego Biurrun authored
This reduces the local difference to the x264 upstream version.
-
- 02 Nov, 2012 1 commit
-
-
Diego Biurrun authored
This allows overriding the value from outside of the file.
-
- 29 Oct, 2012 1 commit
-
-
Ronald S. Bultje authored
-
- 26 Aug, 2012 1 commit
-
-
Loren Merritt authored
13% faster on penryn, 16% on sandybridge, 15% on bulldozer. Not simd; a compiler should have generated this, but gcc didn't.
-
- 07 Aug, 2012 4 commits
-
-
Michael Niedermayer authored
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
Mans Rullgard authored
It appears that something goes wrong in old nasm versions when the %+ operator is used in the last argument of a macro invocation and this argument is tested with %ifdef within the macro. This patch rearranges the macro arguments such that the %+ operator is never used in the last argument.
-
Mans Rullgard authored
nasm does not support 'CPU foonop' directives. This adds a configure test for the directive and uses it only if supported. Signed-off-by: Mans Rullgard <mans@mansr.com>
-
Mans Rullgard authored
For some reason, nasm requires this. No harm done to yasm. Signed-off-by: Mans Rullgard <mans@mansr.com>
-
- 03 Aug, 2012 1 commit
-
-
Diego Biurrun authored
Currently there is a wild mix of 3dn2/3dnow2/3dnowext. Switching to "3dnowext", which is a more common name of the CPU flag, as reported e.g. by the Linux kernel, unifies this.
-