Commit b4fe9769 authored by Michael Niedermayer

reorder paddws to reduce dependency chain

put_h264_chroma_mc2_mmx2() 927 -> 902 dezicycles on Duron

Originally committed as revision 8097 to svn://svn.ffmpeg.org/ffmpeg/trunk
parent 9ff77d17
@@ -284,9 +284,9 @@ static void H264_CHROMA_MC2_TMPL(uint8_t *dst/*align 2*/, uint8_t *src/*align 1*
 /* mm1 += C * src[0,1] + D * src[1,2] */
 "movq %%mm0, %%mm2\n\t"
 "pmaddwd %%mm6, %%mm0\n\t"
+"paddw %3, %%mm1\n\t"
 "paddw %%mm0, %%mm1\n\t"
 /* dst[0,1] = pack((mm1 + 32) >> 6) */
-"paddw %3, %%mm1\n\t"
 "psrlw $6, %%mm1\n\t"
 "packssdw %%mm7, %%mm1\n\t"
 "packuswb %%mm7, %%mm1\n\t"
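
The reorder is safe because lane-wise 16-bit addition wraps modulo 2^16, so the two paddw instructions can be applied in either order. Adding the rounding constant first means that add no longer has to wait for the pmaddwd result. The following is a minimal scalar sketch of a single 16-bit lane, not FFmpeg code: the function names are hypothetical, `acc` stands in for one lane of mm1, `prod` for the corresponding lane of mm0 after pmaddwd, and operand %3 is assumed to hold 32 in each lane, as the comment in the hunk suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane; illustration only. */

/* Old order: both adds sit behind the multiply result. */
static uint16_t old_order(uint16_t acc, uint16_t prod)
{
    acc = (uint16_t)(acc + prod); /* paddw %%mm0, %%mm1 */
    acc = (uint16_t)(acc + 32);   /* paddw %3, %%mm1 (assumed constant 32) */
    return (uint16_t)(acc >> 6);  /* psrlw $6, %%mm1 */
}

/* New order: the constant add does not depend on prod,
 * so it can issue while the multiply is still in flight. */
static uint16_t new_order(uint16_t acc, uint16_t prod)
{
    acc = (uint16_t)(acc + 32);   /* paddw %3, %%mm1 */
    acc = (uint16_t)(acc + prod); /* paddw %%mm0, %%mm1 */
    return (uint16_t)(acc >> 6);  /* psrlw $6, %%mm1 */
}

/* Samples the 16-bit input space and checks that both orders agree,
 * including cases where the sum wraps around. */
static void check_reorder_equivalence(void)
{
    for (uint32_t acc = 0; acc < 65536; acc += 257)
        for (uint32_t prod = 0; prod < 65536; prod += 263)
            assert(old_order((uint16_t)acc, (uint16_t)prod) ==
                   new_order((uint16_t)acc, (uint16_t)prod));
}
```

Calling `check_reorder_equivalence()` from a test harness exercises both orderings. The sketch only models the arithmetic; the timing change reported in the commit message comes from instruction scheduling, which it does not simulate.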