Commits · 94f084324e648876508bed546d950762f10b875e · Linshizhi / ffmpeg.wasm-core

01 Jul, 2014 1 commit
- Update Fiona's name in copyright statements. · 79793f83
  Diego Biurrun authored 10 years ago
  
23 Feb, 2014 3 commits
- x86: add detection for Bit Manipulation Instruction sets · d59fcdaf
  James Almer authored 10 years ago
```
Based on x264 code
Signed-off-by: James Almer <jamrial@gmail.com>
```
- x86: add detection for FMA3 instruction set · 1b932eb1
  James Almer authored 10 years ago
```
Based on x264 code
Signed-off-by: James Almer <jamrial@gmail.com>
```
- x86: add missing XOP checks and macros · 10b0161d
  James Almer authored 10 years ago
```
Signed-off-by: James Almer <jamrial@gmail.com>
```
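The three detection commits above come down to testing individual CPUID feature bits. A minimal sketch in C of that mapping, assuming the caller has already executed CPUID and passes in the raw register values. The `SKETCH_FLAG_*` names and `sketch_decode_cpuid` are hypothetical and are not libavutil's `AV_CPU_FLAG_*` API. A real detector must also confirm OS support for the YMM state (OSXSAVE/XGETBV) before reporting FMA3 or XOP, which is omitted here.

```c
#include <stdint.h>

/* Hypothetical flag values for this sketch only. */
enum {
    SKETCH_FLAG_FMA3 = 1 << 0,
    SKETCH_FLAG_BMI1 = 1 << 1,
    SKETCH_FLAG_BMI2 = 1 << 2,
    SKETCH_FLAG_XOP  = 1 << 3,
};

/*
 * Map raw CPUID register values to feature flags.
 *   leaf1_ecx: ECX returned by CPUID leaf 1
 *   leaf7_ebx: EBX returned by CPUID leaf 7, subleaf 0
 *   ext1_ecx:  ECX returned by CPUID leaf 0x80000001
 */
int sketch_decode_cpuid(uint32_t leaf1_ecx, uint32_t leaf7_ebx, uint32_t ext1_ecx)
{
    int flags = 0;

    if (leaf1_ecx & (1u << 12)) flags |= SKETCH_FLAG_FMA3; /* FMA3 */
    if (leaf7_ebx & (1u << 3))  flags |= SKETCH_FLAG_BMI1; /* BMI1 */
    if (leaf7_ebx & (1u << 8))  flags |= SKETCH_FLAG_BMI2; /* BMI2 */
    if (ext1_ecx  & (1u << 11)) flags |= SKETCH_FLAG_XOP;  /* XOP (AMD) */

    return flags;
}
```

Keeping the decoding separate from the CPUID instruction itself makes the bit positions easy to check on any host.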
20 Feb, 2014 1 commit

x86: float dsp: unroll SSE versions · 996697e2

Christophe Gisquet authored 11 years ago

vector_fmul and vector_fmac_scalar are guaranteed that they can process in
batches of 16 elements, but their SSE versions only do 8 at a time.

Therefore, unroll them a bit.
299 to 261 cycles for 256 elements in vector_fmac_scalar on Arrandale/Win64.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
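
The unrolling idea can be sketched in portable C, assuming the length is a positive multiple of 16 as the commit message states. The names `fmac4` and `sketch_vector_fmac_scalar` are hypothetical. `fmac4` stands in for one 4-float SSE multiply-add, and the real change is in x86 assembly, not C.

```c
#include <assert.h>

/* Plain-C stand-in for one 4-float SSE step: dst[0..3] += src[0..3] * mul. */
static void fmac4(float *dst, const float *src, float mul)
{
    for (int j = 0; j < 4; j++)
        dst[j] += src[j] * mul;
}

/* dst[i] += src[i] * mul, handling 16 elements per loop iteration. */
void sketch_vector_fmac_scalar(float *dst, const float *src, float mul, int len)
{
    /* Callers guarantee batches of 16, so no tail loop is needed. */
    assert(len > 0 && len % 16 == 0);

    for (int i = 0; i < len; i += 16) {
        fmac4(dst + i,      src + i,      mul);
        fmac4(dst + i + 4,  src + i + 4,  mul);
        fmac4(dst + i + 8,  src + i + 8,  mul);
        fmac4(dst + i + 12, src + i + 12, mul);
    }
}
```

Four independent steps per iteration halve the loop overhead relative to an 8-element body.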

26 Jan, 2014 1 commit

x86inc: Speed up assembling with Yasm · b7d0d10a

Loren Merritt authored 11 years ago

Work around Yasm's inefficiency with handling large numbers of variables
in the global scope.
Signed-off-by: Diego Biurrun <diego@biurrun.de>

25 Oct, 2013 1 commit

libavutil: x86: Add AVX2 capable CPU detection. · 4d6ee072

Kieran Kunhya authored 11 years ago

Patch based on x264's AVX2 detection
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>

14 Oct, 2013 4 commits

x86: more AVX2 framework · a3fabc6c
Jason Garrett-Glaser authored 11 years ago
```
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
```
x86inc: FMA3/4 Support · c6908d6b
Jason Garrett-Glaser authored 12 years ago
```
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
```

x86inc: Remove our FMA4 support · 20689570

Derek Buitenhuis authored 11 years ago

This is so we can sync to x264's version of FMA4 support.

This partially reverts commit 79687079.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>

x86inc: Use VEX-encoded instructions in AVX functions · c108ba01

Henrik Gramner authored 12 years ago

Automatically use VEX-encoding in AVX/AVX2/XOP/FMA3/FMA4
functions for all instructions that exist in a VEX-encoded
version.

This change makes it easier to extend existing code to use AVX2.

Also add support for AVX emulation of a few instructions that
were missing before.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>

09 Oct, 2013 1 commit

x86inc: Remove .rodata kludges · ad7d7d4f

Henrik Gramner authored 11 years ago

The Mach-O bug was fixed in yasm 0.8.0 and we don't
support versions that old anymore.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>

07 Oct, 2013 9 commits

x86inc: remove misaligned cpu flag · 3e2fa991

Henrik Gramner authored 11 years ago

Prevents a crash if the misaligned exception mask bit is
cleared for some reason.

Misaligned SSE functions are only used on AMD Phenom CPUs
and the benefit is minuscule. They also require modifying
the MXCSR control register and by removing those functions
we can get rid of that complexity altogether.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>

x86inc: various minor backports from x264 · 71155665

Jason Garrett-Glaser authored 11 years ago

Small backports that sneaked into other asm commits in x264.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>

x86inc: Check for __OUTPUT_FORMAT__ having a value of "x64" · 47f9d7ce
Derek Buitenhuis authored 11 years ago
```
This is also a valid value for WIN64.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
```

x86inc: Utilize the shadow space on 64-bit Windows · bbe4a6db

Henrik Gramner authored 11 years ago

Store XMM6 and XMM7 in the shadow space in functions that
clobber them. This way we don't have to adjust the stack
pointer as often, reducing the number of instructions as
well as code size.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>

x86inc: create xm# and ym#, analogous to m# · 3fb78e99

Loren Merritt authored 11 years ago

For when we want to mix simd sizes within one function.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>

x86inc: fix some corner cases of SWAP · 49ebe3f9

Loren Merritt authored 11 years ago

The following cases used to produce the wrong permutation:
- SWAP with >=3 named (rather than numbered) args
- PERMUTE followed by SWAP with 2 named args
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>

x86inc: Use SSE instead of SSE2 for copying data · 63f0d623

Henrik Gramner authored 11 years ago

Reduces code size because movaps/movups is one byte
shorter than movdqa/movdqu.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
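
The one-byte difference can be seen in the instruction encodings. A small sketch, assuming a register-to-register copy from xmm1 to xmm0; the array and function names are hypothetical. movdqa carries a 0x66 prefix that movaps does not need.

```c
#include <stddef.h>

/* "xmm0 <- xmm1" encoded by hand: opcode bytes followed by ModRM 0xC1. */
static const unsigned char movaps_xmm0_xmm1[] = { 0x0F, 0x28, 0xC1 };       /* SSE  */
static const unsigned char movdqa_xmm0_xmm1[] = { 0x66, 0x0F, 0x6F, 0xC1 }; /* SSE2 */

/* Bytes saved per copy by choosing the SSE form over the SSE2 form. */
size_t sketch_bytes_saved_per_copy(void)
{
    return sizeof(movdqa_xmm0_xmm1) - sizeof(movaps_xmm0_xmm1);
}
```

The same relationship holds for the unaligned pair, where movdqu needs an 0xF3 prefix that movups does not.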

x86inc: Set ELF hidden visibility for global constants · ad76e6e7
Henrik Gramner authored 11 years ago
```
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
```

x86inc: activate REP_RET automatically · 25cb0c1a

Loren Merritt authored 11 years ago

Now RET checks whether it immediately follows a branch, so the
programmer doesn't have to keep track of that condition. REP_RET
is still needed manually when it's a branch target, but that's
much rarer.

The implementation involves lots of spurious labels, but that's OK
because we strip them.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>

21 Sep, 2013 1 commit

avutil: Fix compilation with inline asm disabled on mingw · 08fa828b

Alex Smith authored 11 years ago

Because of -Werror=implicit-function-declaration the build will fail.
Signed-off-by: Martin Storsjö <martin@martin.st>

29 Aug, 2013 1 commit
- x86: Add and use more convenience macros to check CPU extension availability · 79aec43c
  Diego Biurrun authored 11 years ago
  
28 Aug, 2013 2 commits
- avutil: Refactor CPU extension availability macros · 8410d6e9
  Diego Biurrun authored 11 years ago
  
- avutil: Move internal CPU detection function declarations to private header · b78b10c4
  Diego Biurrun authored 11 years ago
  
17 Jul, 2013 1 commit
- Consistently use "cpu_flags" as variable/parameter name for CPU flags · 3ac7fa81
  Diego Biurrun authored 11 years ago
  
02 Jul, 2013 1 commit
- lls/x86: use 3-operand vaddpd in ADDPD_MEM · c8b920a9
  Loren Merritt authored 11 years ago
```
Fixes build with yasm-1.1
Signed-off-by: Anton Khirnov <anton@khirnov.net>
```
30 Jun, 2013 1 commit
- x86: lpc: fix a segfault in av_evaluate_lls_sse2() · 1221bb62
  Loren Merritt authored 11 years ago
  
29 Jun, 2013 2 commits
- x86: lpc: simd av_evaluate_lls · b545179f
  Loren Merritt authored 11 years ago
```
1.5x-1.8x faster on sandybridge
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
```
- x86: lpc: simd av_update_lls · 502ab21a
  Loren Merritt authored 11 years ago
```
4x-6x faster on sandybridge
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
```
04 May, 2013 1 commit
- avutil: Add av_cold attributes to init functions missing them · 1fda184a
  Diego Biurrun authored 11 years ago
  
03 May, 2013 1 commit
- x86: float dsp: butterflies_float SSE · 566b7a20
  Christophe Gisquet authored 11 years ago
```
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
```
10 Apr, 2013 1 commit
- dsputil: Make dsputil selectable · b93b27ed
  Ronald S. Bultje authored 11 years ago
```
Signed-off-by: Martin Storsjö <martin@martin.st>
```
09 Apr, 2013 1 commit
- x86inc: Fix number of operands for cmp* instructions · 2e81acc6
  Christophe Gisquet authored 11 years ago
```
cmp{p,s}{s,d} instructions do take an imm8 operand.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
```
27 Mar, 2013 1 commit
- cosmetics: Remove unnecessary extern keywords from function declarations · b6649ab5
  Diego Biurrun authored 11 years ago
  
19 Feb, 2013 1 commit

x86: Use simple nop codes for <= sse (rather than <= mmx) · 0c0828ec

Ronald S. Bultje authored 12 years ago

The "CentaurHauls family 6 model 9 stepping 8" family of CPUs
(flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse
up rng rng_en ace ace_en) SIGILLs on long nop codes.
Signed-off-by: Martin Storsjö <martin@martin.st>

14 Feb, 2013 2 commits
- avutil: Ensure that emms_c is always defined, even on non-x86 · 4db96649
  Diego Biurrun authored 12 years ago
  
- avutil: Move emms code to x86-specific header · ab441e20
  Diego Biurrun authored 12 years ago
  
22 Jan, 2013 2 commits
- floatdsp: move scalarproduct_float from dsputil to avfloatdsp. · d56668bd
  Ronald S. Bultje authored 12 years ago
```
This makes the aac decoder and all voice codecs independent of dsputil.
```
- floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp. · 42d32469
  Ronald S. Bultje authored 12 years ago
```
Now, nellymoserenc and aacenc no longer depend on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
```