Reimar Döffinger authored
About 30% faster on 32 bit Atom, 120% faster on 64 bit Phenom2.

This is interesting because supporting P16 is easier in e.g. OpenGL (can misuse support for any 2-component 8 bit format), whereas supporting p9/p10 without conversion needs a texture format with at least 14 bits actual precision.

The shiftonly == 0 case is not optimized since the code is more complex and the speed gain less obvious.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
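
For illustration, the two conversion paths mentioned above can be sketched in plain C. This is a minimal sketch and not the code from this commit: `widen_to_16` is a hypothetical helper, and it assumes that `shiftonly` selects a plain left shift, while the other path additionally replicates the top bits into the low bits so that full scale maps to 0xFFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not taken from the commit): widen n samples of
 * depth-bit video, stored in 16-bit words, to 16 bits per sample. */
static void widen_to_16(uint16_t *dst, const uint16_t *src, size_t n,
                        int depth, int shiftonly)
{
    assert(depth >= 8 && depth <= 16);

    int up   = 16 - depth;      /* 6 for 10-bit input, 7 for 9-bit */
    int down = 2 * depth - 16;  /* 4 for 10-bit input, 2 for 9-bit */

    for (size_t i = 0; i < n; i++) {
        unsigned v = src[i];
        if (shiftonly) {
            /* Assumed meaning of shiftonly != 0: plain left shift,
             * low bits stay zero. */
            dst[i] = (uint16_t)(v << up);
        } else {
            /* Assumed meaning of shiftonly == 0: also copy the top
             * bits into the freed low bits (bit replication). */
            dst[i] = (uint16_t)((v << up) | (v >> down));
        }
    }
}
```

With 10-bit input, the maximum value 1023 becomes 0xFFC0 in the shift-only path and 0xFFFF with bit replication. The extra shift and OR per sample in the second path is one plausible reason it is harder to speed up.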
118bd609