- 31 Mar, 2017 1 commit
-
-
- 24 Mar, 2017 2 commits
-
-
James Almer authored
Unrolling the loops triples the size of the assembled output without any gain in performance.
-
Clément Bœsch authored
This will simplify the incoming merge.
-
- 07 Feb, 2015 1 commit
-
-
Christophe Gisquet authored
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-
- 05 Sep, 2014 1 commit
-
-
James Almer authored
Should fix compilation with old Yasm/Nasm versions. Signed-off-by: James Almer <jamrial@gmail.com>
-
- 04 Sep, 2014 1 commit
-
-
James Almer authored
~20% faster than AVX. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>
-
- 21 Aug, 2014 1 commit
-
-
James Almer authored
* Reduced xmm register count to 7 (As such they are now enabled for x86_32).
* Removed four movdqa (affects the sse2 version only).
* pxor is now used to clear m0 only once.
~5% faster.
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
-
- 20 Aug, 2014 2 commits
-
-
James Almer authored
~15% faster than sse2. Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
-
James Almer authored
Signed-off-by: James Almer <jamrial@gmail.com>
-
- 19 Aug, 2014 1 commit
-
-
Pierre Edouard Lepere authored
Reviewed-by: James Almer <jamrial@gmail.com> Approved-by: Ronald S. Bultje Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
-