Mans Rullgard authored
This gets rid of the variable-length scratch buffer by filtering 16 pixels at a time and writing directly to the destination. The extra loads needed to fetch the source values are offset by no longer doing a round trip to memory before shifting.

Signed-off-by: Mans Rullgard <mans@mansr.com>
07eb7e20
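
The change described in the commit message above can be illustrated with a rough scalar sketch. The commit's actual code is not shown here, so everything in the example is an assumption made for illustration: the function names `filter_row_scratch` and `filter_row_direct`, the 1-2-1 filter taps, the rounding shift by 2, and the use of plain C. The inner loop over 16 pixels only stands in for whatever per-block operation the real code performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * "Before" (hypothetical): filter the whole row into a variable-length
 * scratch buffer, then make a second pass that shifts and stores.
 * src must hold width + 2 readable pixels.
 */
static void filter_row_scratch(uint8_t *dst, const uint8_t *src, size_t width)
{
    assert(width > 0);      /* a zero-length VLA is undefined behaviour */
    int16_t tmp[width];     /* variable-length scratch buffer */

    for (size_t x = 0; x < width; x++)
        tmp[x] = (int16_t)(src[x] + 2 * src[x + 1] + src[x + 2]);

    /* Second pass: the sums make a round trip through memory first. */
    for (size_t x = 0; x < width; x++)
        dst[x] = clip_u8((tmp[x] + 2) >> 2);
}

/*
 * "After" (hypothetical): handle 16 pixels per step, shift while the
 * sum is still in a local variable, and write straight to dst.
 * No scratch buffer is needed, at the cost of re-reading source pixels
 * inside each block.
 */
static void filter_row_direct(uint8_t *dst, const uint8_t *src, size_t width)
{
    for (size_t x0 = 0; x0 < width; x0 += 16) {
        size_t n = width - x0 < 16 ? width - x0 : 16;

        for (size_t i = 0; i < n; i++) {
            const uint8_t *s = src + x0 + i;
            int sum = s[0] + 2 * s[1] + s[2];
            dst[x0 + i] = clip_u8((sum + 2) >> 2);
        }
    }
}
```

Both functions produce the same output for the same input. The second one simply has a fixed working-set size, so its stack use does not depend on the row width.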