Commits · 7e98da9c4f423a26929188aa0aba5b9b3c5989b0 · Linshizhi / ffmpeg.wasm-core

13 May, 2017 1 commit
- x86/float_dsp: remove usage of integer instructions · 0fbc7a21
  James Almer authored 7 years ago
  
  0fbc7a21
12 Apr, 2017 1 commit
- x86/float_dsp: add ff_vector_fmul_reverse_avx2 · f1d80bc6
  James Almer authored 7 years ago
```
~20% faster than AVX.
Signed-off-by: James Almer <jamrial@gmail.com>
```
  f1d80bc6
10 Apr, 2017 1 commit
- x86/float_dsp: add ff_vector_dmac_scalar_{sse2,avx,fma3} · ed9b25a1
  James Almer authored 7 years ago
  
  ed9b25a1
08 Jan, 2016 3 commits

x86/float_dsp: zero extend offset from ff_scalarproduct_float_sse · dc79824d

James Almer authored 9 years ago

Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>

dc79824d

x86/float_dsp: zero extend len from ff_butterflies_float_sse implicitly · 4ee38ed7
James Almer authored 9 years ago
```
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
```
4ee38ed7

x86/float_dsp: remove len check from ff_butterflies_float_sse · 7f520524

James Almer authored 9 years ago

The function documentation explicitly states that len must be a multiple of 4.
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>

7f520524
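
The commit above drops a run-time length check because the caller contract already guarantees a multiple of 4. A minimal scalar sketch of that contract in C follows. It assumes the usual butterfly semantics (v1 receives the sum, v2 the difference). The name `butterflies_float_ref` is hypothetical, and the `assert` only illustrates the documented precondition; it is not part of the real assembly.

```c
#include <assert.h>

/* Scalar sketch of a float butterfly: v1[i] becomes v1[i] + v2[i],
 * v2[i] becomes v1[i] - v2[i] (computed from the original values).
 * Assumption: len is a positive multiple of 4, per the function
 * documentation mentioned in the commit message. */
void butterflies_float_ref(float *v1, float *v2, int len)
{
    assert(len > 0 && len % 4 == 0);
    /* Step by 4 to mirror one 128-bit SSE register of floats per pass. */
    for (int i = 0; i < len; i += 4) {
        for (int j = i; j < i + 4; j++) {
            float t = v1[j] - v2[j];
            v1[j] += v2[j];
            v2[j] = t;
        }
    }
}
```

Because the loop always consumes four floats per pass, no tail handling is needed once the precondition holds, which is why a separate check inside the SIMD routine is redundant.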

26 Jul, 2015 1 commit
- x86/float_dsp: add missing colon to labels · 4d2c014a
  James Almer authored 9 years ago
```
Silences warnings with Nasm
Signed-off-by: James Almer <jamrial@gmail.com>
```
  4d2c014a
08 Jun, 2014 2 commits

x86/float_dsp: add missing femms · 85065d2a

James Almer authored 10 years ago

It was lost during the port.
Should fix fate on 3dnowext machines.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

85065d2a

x86/float_dsp: port vector_fmul_window to yasm · dcaf9660

James Almer authored 10 years ago

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

dcaf9660

19 Apr, 2014 1 commit

x86/float_dsp: remove duplicated code from vector_dmul_scalar · 3b06208a

James Almer authored 10 years ago

Use the xm# and ym# aliases as they remain in sync with m# after a SWAP.
No actual changes to the assembly.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

3b06208a

16 Apr, 2014 2 commits

x86/float_dsp: unroll loop in vector_fmac_scalar · 11b36b1e

James Almer authored 10 years ago

~6% faster SSE2 performance. AVX/FMA3 are unaffected.
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

11b36b1e

x86/float_dsp: use SWAP in vector_fmac_scalar Win64 · 3b808900

James Almer authored 10 years ago

The mova is unnecessary.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

3b808900

13 Mar, 2014 1 commit

x86/float_dsp: add ff_vector_{fmul_add, fmac_scalar}_fma3 · 7d7487e8

James Almer authored 10 years ago

~7% faster than AVX
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

7d7487e8

20 Feb, 2014 1 commit

x86: float dsp: unroll SSE versions · 996697e2

Christophe Gisquet authored 10 years ago

vector_fmul and vector_fmac_scalar are guaranteed to process elements in
batches of 16, but their SSE versions only do 8 at a time.

Therefore, unroll them a bit.
299 to 261c for 256 elements in vector_fmac_scalar on Arrandale/Win64.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>

996697e2
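
The unrolling described above can be sketched in plain C. This is an illustration under stated assumptions, not the actual SSE assembly: the function name `vector_fmac_scalar_unrolled` is hypothetical, and the operation is taken to be `dst[i] += src[i] * mul` with `len` a positive multiple of 16, as the commit message guarantees.

```c
#include <assert.h>

/* Scalar sketch: dst[i] += src[i] * mul, 16 elements per outer iteration.
 * Sixteen floats correspond to four 128-bit SSE registers, so one outer
 * pass stands in for the unrolled loop body of the SSE version.
 * Assumption: len is a positive multiple of 16. */
void vector_fmac_scalar_unrolled(float *dst, const float *src,
                                 float mul, int len)
{
    assert(len > 0 && len % 16 == 0);
    for (int i = 0; i < len; i += 16) {
        for (int j = 0; j < 16; j++)
            dst[i + j] += src[i + j] * mul;
    }
}
```

Processing 16 elements per pass halves the number of loop-counter updates and branches compared with 8 per pass, which is the kind of overhead the quoted cycle counts reflect.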

15 Feb, 2014 1 commit

x86: float dsp: unroll SSE versions · 133b3420

Christophe Gisquet authored 10 years ago

vector_fmul and vector_fmac_scalar are guaranteed to process elements in
batches of 16, but their SSE versions only do 8 at a time.

Therefore, unroll them a bit.
299 to 261c for 256 elements in vector_fmac_scalar on Arrandale/Win64.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

133b3420

03 May, 2013 1 commit
- x86: float dsp: butterflies_float SSE · 566b7a20
  Christophe Gisquet authored 11 years ago
```
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
```
  566b7a20
16 Apr, 2013 2 commits

butterflies_float: replace 2 lea by 2 add · 92218aad

Michael Niedermayer authored 11 years ago

Adds are simpler instructions and should be as fast or faster
on all CPUs.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

92218aad

x86: float dsp: butterflies_float SSE · 1a400796

Christophe Gisquet authored 11 years ago

97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

1a400796

22 Jan, 2013 3 commits
- floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp. · 42d32469
  Ronald S. Bultje authored 12 years ago
```
Now, nellymoserenc and aacenc no longer depend on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
```
  42d32469
- floatdsp: move vector_fmul_add from dsputil to avfloatdsp. · 55aa03b9
  Ronald S. Bultje authored 12 years ago
  
  55aa03b9
- floatdsp: move scalarproduct_float from dsputil to avfloatdsp. · d56668bd
  Ronald S. Bultje authored 12 years ago
```
This makes the aac decoder and all voice codecs independent of dsputil.
```
  d56668bd
08 Dec, 2012 1 commit
- x86: float_dsp: fix loading of the len parameter on x86-32 · 1c012e6b
  Justin Ruggles authored 12 years ago
  
  1c012e6b
06 Dec, 2012 1 commit
- x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32 · ecc8b021
  Justin Ruggles authored 12 years ago
```
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
```
  ecc8b021
05 Dec, 2012 1 commit
- float_dsp: add vector_dmul_scalar() to multiply a vector of doubles · ac7eb4cb
  Justin Ruggles authored 12 years ago
```
Include x86-optimized versions for SSE2 and AVX.
```
  ac7eb4cb
26 Nov, 2012 1 commit
- x86: float_dsp: add SSE version of vector_fmul_scalar() · 947f9336
  Justin Ruggles authored 12 years ago
  
  947f9336
11 Nov, 2012 1 commit
- build: Drop AVX assembly ifdefs · 2b479bca
  Diego Biurrun authored 12 years ago
```
An assembler able to cope with AVX instructions is now required.
```
  2b479bca
30 Oct, 2012 1 commit
- x86: include x86inc.asm in x86util.asm · 6860b408
  Diego Biurrun authored 12 years ago
```
This is necessary to allow refactoring some x86util macros with cpuflags.
```
  6860b408
07 Sep, 2012 1 commit

x86: float_dsp: fix ff_vector_fmac_scalar_avx() on Win64 · 73275259

Justin Ruggles authored 12 years ago

The SWAP macro does not work for explicit xmm/ymm usage, so instead just move
the scalar value from xmm2 to xmm0.

73275259

30 Aug, 2012 1 commit
- x86: Split inline and external assembly #ifdefs · 17337f54
  Diego Biurrun authored 12 years ago
  
  17337f54
07 Aug, 2012 1 commit

x86: add colons after labels · a3df4781

Mans Rullgard authored 12 years ago

nasm prints a warning if the colon is missing.
Signed-off-by: Mans Rullgard <mans@mansr.com>

a3df4781

26 Jul, 2012 1 commit
- x86inc: automatically insert vzeroupper for YMM functions. · 30b45d9c
  Ronald S. Bultje authored 12 years ago
  
  30b45d9c
18 Jun, 2012 1 commit
- float_dsp: add x86-optimized functions for vector_fmac_scalar() · 82b2df97
  Justin Ruggles authored 12 years ago
  
  82b2df97
09 Jun, 2012 1 commit

x86/float_dsp.asm: restore author attribution · f0313e90

Michael Niedermayer authored 12 years ago

The attribution was removed by libav while moving the code to libavutil.

The original code is from
commit eb4825b5
Author: Loren Merritt <lorenm@u.washington.edu>
Date:   Thu Aug 10 19:06:25 2006 +0000

    sse and 3dnow implementations of float->int conversion and mdct windowing.
    15% faster vorbis.

and

commit 06972056
Author: Loren Merritt <lorenm@u.washington.edu>
Date:   Fri Aug 11 18:19:37 2006 +0000

    vorbis simd tweaks

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

f0313e90

08 Jun, 2012 1 commit
- Add a float DSP framework to libavutil · d5a7229b
  Justin Ruggles authored 12 years ago
```
Move vector_fmul() from DSPContext to AVFloatDSPContext.
```
  d5a7229b
29 May, 2012 1 commit
- lavr: add x86-optimized functions for mixing 2 to 1 s16p with float coeffs · c140fb2c
  Justin Ruggles authored 12 years ago
  
  c140fb2c