    The official guide to swscale for confused developers.
   ========================================================

Current (simplified) Architecture:
---------------------------------
                        Input
                          v
                   _______OR_________
                 /                   \
               /                       \
       special converter     [Input to YUV converter]
              |                         |
              |         (8-bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:0:0 )
              |                         |
              |                         v
              |                  Horizontal scaler
              |                         |
              |     (15-bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:1:1 / 4:0:0 )
              |                         |
              |                         v
              |          Vertical scaler and output converter
              |                         |
              v                         v
                         output


Swscale has 2 scaler paths. Each path must be capable of handling
slices, that is, consecutive non-overlapping rectangles of dimension
(0,slice_top) - (picture_width, slice_bottom).
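
As a minimal sketch of the slice notion (not swscale code), the helper below
walks a picture in consecutive non-overlapping slices. The names slice_rect
and next_slice are hypothetical, and treating the bottom line as exclusive is
an assumption made for this illustration.

```c
/* Hypothetical description of one slice: the rectangle
 * (0, top) - (picture_width, bottom), with bottom exclusive. */
typedef struct slice_rect {
    int top;
    int bottom;
} slice_rect;

/* Compute the slice that starts where the previous one ended.
 * The slice is at most slice_height lines tall and is clipped to the
 * picture. Returns 1 if a slice was written to *out, 0 if no lines remain. */
int next_slice(int prev_bottom, int slice_height, int picture_height,
               slice_rect *out)
{
    if (slice_height <= 0 || prev_bottom >= picture_height)
        return 0;
    out->top    = prev_bottom;
    out->bottom = prev_bottom + slice_height;
    if (out->bottom > picture_height)
        out->bottom = picture_height;
    return 1;
}
```

Calling next_slice() repeatedly, starting with prev_bottom = 0 and feeding
each returned bottom back in, covers every line of the picture exactly once.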

special converter
    These are generally unscaled converters of common
    formats, like YUV 4:2:0/4:2:2 -> RGB12/15/16/24/32, though this path could also
    in principle contain scalers optimized for specific common cases.

Main path
    The main path is used when no special converter can be used. The code
    is designed as a destination line pull architecture. That is, for each
    output line the vertical scaler pulls lines from a ring buffer. When
    the ring buffer does not contain the wanted line, then it is pulled from
    the input slice through the input converter and horizontal scaler.
    The result is also stored in the ring buffer to serve future vertical
    scaler requests.
    When no more output can be generated because lines from a future slice
    would be needed, then all remaining lines in the current slice are
    converted, horizontally scaled and put in the ring buffer.
    [This is done for luma and chroma, each with possibly different numbers
     of lines per picture.]
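
    The pull logic described above can be sketched as follows. This is a toy
    model, not the swscale implementation: the names pull_state,
    convert_and_hscale and process_slice are hypothetical, the vertical
    "filter" simply sums source lines 2*y and 2*y+1 (a 2:1 vertical
    downscale), and only one plane is handled.

```c
#define RING_LINES 4 /* ring capacity; must cover the vertical filter length */
#define LINE_WIDTH 4

typedef struct pull_state {
    int ring[RING_LINES][LINE_WIDTH]; /* horizontally scaled lines */
    int next_src;   /* first source line not yet stored in the ring */
    int next_dst;   /* next output line to produce */
    int converted;  /* lines pushed through the input side so far */
} pull_state;

/* Stand-in for "input converter + horizontal scaler": it only fills the
 * ring slot with the source line number so the data flow can be checked. */
void convert_and_hscale(pull_state *st, int src_y)
{
    for (int x = 0; x < LINE_WIDTH; x++)
        st->ring[src_y % RING_LINES][x] = src_y;
    st->next_src = src_y + 1;
    st->converted++;
}

/* Consume the input slice ending at slice_bottom (exclusive).
 * Returns the number of output lines written to dst. */
int process_slice(pull_state *st, int slice_bottom, int dst_height, int *dst)
{
    int produced = 0;

    while (st->next_dst < dst_height) {
        int first = 2 * st->next_dst; /* source lines wanted by this output */
        int last  = first + 1;

        if (last >= slice_bottom)     /* would need a future slice: stop */
            break;
        while (st->next_src <= last)  /* pull missing lines into the ring */
            convert_and_hscale(st, st->next_src);

        dst[st->next_dst] = st->ring[first % RING_LINES][0] +
                            st->ring[last  % RING_LINES][0];
        st->next_dst++;
        produced++;
    }
    /* No more output possible: store the rest of the slice for later. */
    while (st->next_src < slice_bottom)
        convert_and_hscale(st, st->next_src);
    return produced;
}
```

    With a 6-line source fed as two slices ending at lines 3 and 6, the first
    call produces one output line and parks source line 2 in the ring buffer;
    the second call produces the remaining two output lines.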

Input to YUV Converter
    When the input to the main path is not planar 8 bits per component YUV or
    8-bit gray, it is converted to planar 8-bit YUV. Two sets of converters
    exist for this currently: one performs horizontal downscaling by 2
    before the conversion, the other leaves the full chroma resolution,
    but is slightly slower. The scaler will try to preserve full chroma
    when the output uses it. It is possible to force full chroma with
    SWS_FULL_CHR_H_INP even for cases where the scaler thinks it is useless.
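
    The difference between the two converter sets can be illustrated with a
    small sketch that derives one chroma plane (U) from a packed RGB row.
    Everything here is an assumption made for illustration: the function
    names are hypothetical, the integer formula is a common BT.601-style
    approximation rather than swscale's exact coefficients or rounding, and
    the downscaling is plain averaging of horizontal pixel pairs.

```c
#include <stdint.h>

/* Illustrative BT.601-style U (Cb) from 8-bit RGB. The +128 offset is
 * folded in before the shift so the shifted value is never negative. */
int rgb_to_u(int r, int g, int b)
{
    return (-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8;
}

/* Full chroma: one U sample per input pixel (rgb is packed R,G,B). */
void row_to_u_full(const uint8_t *rgb, int width, uint8_t *u)
{
    for (int x = 0; x < width; x++)
        u[x] = (uint8_t)rgb_to_u(rgb[3 * x], rgb[3 * x + 1], rgb[3 * x + 2]);
}

/* Half chroma: average each horizontal pixel pair first, then convert,
 * giving width / 2 samples. width is assumed to be even. */
void row_to_u_half(const uint8_t *rgb, int width, uint8_t *u)
{
    for (int x = 0; x < width / 2; x++) {
        const uint8_t *p = rgb + 6 * x;
        int r = (p[0] + p[3] + 1) >> 1;
        int g = (p[1] + p[4] + 1) >> 1;
        int b = (p[2] + p[5] + 1) >> 1;
        u[x] = (uint8_t)rgb_to_u(r, g, b);
    }
}
```

    The half variant does one conversion per pixel pair instead of one per
    pixel, which is where the speed difference in this sketch comes from.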

Horizontal scaler
    There are several horizontal scalers. A special case worth mentioning is
    the fast bilinear scaler that is made of runtime-generated MMXEXT code
    using specially tuned pshufw instructions.
    The remaining scalers are specially tuned for various filter lengths.
    They scale 8-bit unsigned planar data to 16-bit signed planar data.
    Future support for inputs with more than 8 bits per component will
    require a new horizontal scaler that preserves the input precision.
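
    A generic horizontal scaler of this kind can be sketched in plain C as
    below. The function name and parameter layout are hypothetical, and the
    shift by 7 is an assumption derived from the bit widths given in this
    document: an 8-bit sample times a coefficient with 1.0 at 1 << 14 yields
    a 22-bit sum, which is reduced to the 15-bit intermediate format.

```c
#include <stdint.h>

/* For each output sample i, apply filter_len coefficients starting at
 * source position filter_pos[i]. Coefficients use 1.0 == 1 << 14. */
void hscale_8_to_15(int16_t *dst, int dst_w, const uint8_t *src,
                    const int16_t *filter, const int32_t *filter_pos,
                    int filter_len)
{
    for (int i = 0; i < dst_w; i++) {
        int val = 0;

        for (int j = 0; j < filter_len; j++)
            val += src[filter_pos[i] + j] * filter[i * filter_len + j];

        /* 22-bit sum -> 15-bit result. For negative sums (filters with
         * negative lobes) this relies on an arithmetic right shift, which
         * is implementation-defined in C. */
        val >>= 7;
        if (val > (1 << 15) - 1) /* guard against filter overshoot */
            val = (1 << 15) - 1;
        dst[i] = (int16_t)val;
    }
}
```

    With this scaling an input value of 255 maps to 255 << 7 = 32640 when the
    coefficients sum to 1.0.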

Vertical scaler and output converter
    There is a large number of combined vertical scalers + output converters.
    Some are:
    * unscaled output converters
    * unscaled output converters that average 2 chroma lines
    * bilinear converters                (C, MMX and accurate MMX)
    * arbitrary filter length converters (C, MMX and accurate MMX)
    And
    * Plain C  8-bit 4:2:2 YUV -> RGB converters using LUTs
    * Plain C 17-bit 4:4:4 YUV -> RGB converters using multiplies
    * MMX     11-bit 4:2:2 YUV -> RGB converters
    * Plain C 16-bit Y -> 16-bit gray
      ...
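
    As a hedged sketch of the simplest case in the list above, an arbitrary
    filter length vertical scaler writing 8-bit planar output could look
    like this. The function name is hypothetical; the shift by 19 is an
    assumption derived from the bit widths in this document (15-bit
    intermediate times a coefficient with 1.0 at 1 << 12 gives 27 bits,
    reduced to 8), and simple rounding stands in for any dithering.

```c
#include <stdint.h>

/* lines[j] points to the j-th 15-bit intermediate line taken from the
 * ring buffer; filter[j] is its vertical coefficient (1.0 == 1 << 12). */
void vscale_15_to_8(uint8_t *dst, int width, const int16_t *const *lines,
                    const int16_t *filter, int filter_len)
{
    for (int x = 0; x < width; x++) {
        int val = 1 << 18; /* rounding term: half of 1 << 19 */

        for (int j = 0; j < filter_len; j++)
            val += lines[j][x] * filter[j];

        if (val < 0)       /* clip before shifting a negative value */
            val = 0;
        val >>= 19;        /* 27-bit sum -> 8-bit sample */
        if (val > 255)
            val = 255;
        dst[x] = (uint8_t)val;
    }
}
```

    The combined converters listed above perform this kind of accumulation
    and the output format conversion in a single pass; the sketch shows only
    the vertical filtering step.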

    RGB with less than 8 bits per component uses dither to improve the
    subjective quality and low-frequency accuracy.
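
    The effect on low-frequency accuracy can be seen in a minimal ordered
    dither sketch that reduces an 8-bit component to 5 bits. The 2x2
    threshold matrix and the function name are assumptions chosen for
    illustration; swscale uses its own dither tables.

```c
#include <stdint.h>

/* Thresholds spread over one 5-bit quantization step (8 input levels). */
static const uint8_t dither_2x2[2][2] = {
    { 0, 4 },
    { 6, 2 },
};

/* Quantize an 8-bit value to 5 bits, adding a position-dependent offset
 * before truncation so that the local average stays close to the input. */
uint8_t dither_8_to_5(uint8_t v, int x, int y)
{
    int t = v + dither_2x2[y & 1][x & 1];

    if (t > 255)
        t = 255;
    return (uint8_t)(t >> 3);
}
```

    For a flat area of value 100, plain truncation gives 12 everywhere
    (96 when expanded back to 8 bits), whereas the four dithered pixels of a
    2x2 block are 12, 13, 13 and 12, averaging 12.5 (100 when expanded).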


Filter coefficients:
--------------------
There are several different scalers (bilinear, bicubic, lanczos, area,
sinc, ...). Their coefficients are calculated in initFilter().
Horizontal filter coefficients have a 1.0 point at 1 << 14, vertical ones at
1 << 12. The 1.0 points have been chosen to maximize precision while leaving
a little headroom for convolutional filters like sharpening filters and
minimizing the number of SIMD instructions needed to apply them.
It would be trivial to use a different 1.0 point if some specific scaler
would benefit from it.
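
To make the notion of a 1.0 point concrete, the sketch below converts
floating-point filter weights to fixed-point coefficients whose sum is
exactly the chosen 1.0 point. The function name is hypothetical and the
error-carrying rounding is an assumption made for this illustration; it is
not a description of how initFilter() normalizes its coefficients.

```c
#include <stdint.h>

/* Scale len weights so that they sum to `one` (for example 1 << 14 for
 * horizontal or 1 << 12 for vertical filters). The rounding error of each
 * coefficient is carried into the next one so the total stays exact.
 * The weights must not sum to zero. */
void quantize_filter(const double *w, int len, int one, int16_t *out)
{
    double sum = 0.0, error = 0.0;

    for (int j = 0; j < len; j++)
        sum += w[j];

    for (int j = 0; j < len; j++) {
        double v = w[j] * one / sum + error;
        int q = (int)(v >= 0 ? v + 0.5 : v - 0.5); /* round to nearest */

        out[j] = (int16_t)q;
        error  = v - q;
    }
}
```

Three equal weights with a 1.0 point of 1 << 14 become 5461, 5462 and 5461,
which add up to 16384.
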
Also, as already hinted at, initFilter() accepts an optional convolutional
filter as input that can be used for contrast, saturation, blur, sharpening,
shift, chroma vs. luma shift, ...
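
How such an optional filter interacts with the scaling filter can be pictured
as a discrete convolution of the two coefficient sets. The sketch below is an
assumption made for illustration, with a hypothetical function name; the
exact way initFilter() combines the filters is not described here.

```c
/* Convolve coefficient sets a and b. out must have room for
 * a_len + b_len - 1 entries. */
void convolve(const double *a, int a_len, const double *b, int b_len,
              double *out)
{
    for (int i = 0; i < a_len + b_len - 1; i++)
        out[i] = 0.0;

    for (int i = 0; i < a_len; i++)
        for (int j = 0; j < b_len; j++)
            out[i + j] += a[i] * b[j];
}
```

Convolving a 2-tap bilinear filter {0.5, 0.5} with a 3-tap blur
{0.25, 0.5, 0.25} gives the 4-tap filter {0.125, 0.375, 0.375, 0.125}. The
result is longer than either input, which is one reason headroom in the
coefficient format is useful.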