- 26 Jul, 2010 5 commits
-
Jason Garrett-Glaser authored
5-10% faster or more on Phenom, Athlon 64, and some others. Helps some on pre-SSSE3 Intel chips as well, but not as much. Originally committed as revision 24513 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
Originally committed as revision 24511 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
mbedge loopfilter functions, by re-using space that holds a variable that we no longer need. Originally committed as revision 24510 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
construct was always enabled, even for pre-ssse3 versions). Originally committed as revision 24509 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
future new optimizations (imagine an sse5) much easier. Also fix a bug where we used the direction (%2) rather than the optimization (%1) to enable this, which means it was never actually used. Originally committed as revision 24507 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 25 Jul, 2010 1 commit
-
Ronald S. Bultje authored
Originally committed as revision 24489 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 24 Jul, 2010 1 commit
-
Ronald S. Bultje authored
splits it into small optimization-specific macros which are selected for each DSP function. The advantage of this approach is that the sse4 functions now also use the ssse3 codepath, without needing an explicit sse4 codepath. Originally committed as revision 24487 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 23 Jul, 2010 4 commits
-
Jason Garrett-Glaser authored
Add MMX idct_dc_add4uv function for this case. ~40% faster chroma idct. Originally committed as revision 24455 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Jason Garrett-Glaser authored
Originally committed as revision 24453 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Jason Garrett-Glaser authored
Take shortcuts based on statistically common situations. Add 4-at-a-time idct_dc function (mmx and sse2) since rows of 4 DC-only DCT blocks are common. TODO: tie this more directly into the MB mode, since the DC-level transform is only used for non-splitmv blocks? Originally committed as revision 24452 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Jason Garrett-Glaser authored
~0.3% faster overall. Originally committed as revision 24448 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 22 Jul, 2010 2 commits
-
Ronald S. Bultje authored
CPUs supporting it. Originally committed as revision 24437 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
Originally committed as revision 24409 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 21 Jul, 2010 3 commits
-
Jason Garrett-Glaser authored
Originally committed as revision 24405 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Jason Garrett-Glaser authored
SSSE3 versions, improve SSE2 versions a bit. SSE2/SSSE3 mbedge h functions are currently broken, so explicitly disable them. Originally committed as revision 24403 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Jason Garrett-Glaser authored
Avoid pextrw, since it's slow on many older CPUs. Now it doesn't require mmxext either. Originally committed as revision 24397 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 20 Jul, 2010 2 commits
-
Ronald S. Bultje authored
and chroma (width=8). Originally committed as revision 24378 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
Originally committed as revision 24377 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 19 Jul, 2010 4 commits
-
Ronald S. Bultje authored
wrong with it tomorrow or so, then re-submit. Originally committed as revision 24341 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
Originally committed as revision 24339 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
for x86-32, or 2 MM registers on x86-64. Originally committed as revision 24338 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
so that it does both U and V planes at the same time. This will have speed advantages when using SSE2 (or higher) optimizations, since we can do both the U and V rows together in a single xmm register. This also renames filter16 to filter16y and filter8 to filter8uv so that it's more obvious what each function is used for. Originally committed as revision 24337 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 16 Jul, 2010 6 commits
-
Ronald S. Bultje authored
Originally committed as revision 24275 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
Originally committed as revision 24272 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
Originally committed as revision 24271 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
Originally committed as revision 24270 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
inner loopfilter, and it also allows us to save one register on x86-64/sse2. Originally committed as revision 24269 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
sse2) doesn't actually loop, so REP_RET isn't necessary. Originally committed as revision 24268 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 15 Jul, 2010 1 commit
-
Ronald S. Bultje authored
Originally committed as revision 24250 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 03 Jul, 2010 2 commits
-
Ronald S. Bultje authored
Originally committed as revision 24029 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Jason Garrett-Glaser authored
Originally committed as revision 24013 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 02 Jul, 2010 1 commit
-
Jason Garrett-Glaser authored
Also make some small changes to saturation order of 4-tap SSSE3 MC to fix a non-bitexactness bug. Patch mostly by Eli Friedman <eli.friedman AT gmail DOT com>. Originally committed as revision 23965 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 29 Jun, 2010 4 commits
-
Jason Garrett-Glaser authored
Originally committed as revision 23891 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
Originally committed as revision 23890 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Ronald S. Bultje authored
Originally committed as revision 23886 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Jason Garrett-Glaser authored
Originally committed as revision 23878 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 28 Jun, 2010 3 commits
-
Jason Garrett-Glaser authored
Originally committed as revision 23872 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Jason Garrett-Glaser authored
Originally committed as revision 23858 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
Jason Garrett-Glaser authored
Originally committed as revision 23857 to svn://svn.ffmpeg.org/ffmpeg/trunk
-
- 27 Jun, 2010 1 commit
-
Jason Garrett-Glaser authored
MMXEXT, SSE2 and SSSE3 MC functions; MMX and SSE4 IDCT dc_add functions. Patch by Jason Garrett-Glaser <darkshikari gmail com> and myself. Originally committed as revision 23815 to svn://svn.ffmpeg.org/ffmpeg/trunk