1. 14 Feb, 2017 15 commits
    • HTTP: improve performance by reducing forward seeks · 8c8e5d52
      Joel Cunningham authored
      This commit optimizes HTTP performance by reducing forward seeks, instead
      favoring a read-ahead and discard on the current connection (referred to
      as a short seek) for seeks that are within a TCP window's worth of data.
      This improves performance because with TCP flow control, a window's worth
      of data will be in the local socket buffer already or in-flight from the
      sender once congestion control on the sender is fully utilizing the window.
      
      Note: this approach doesn't attempt to differentiate a newly opened
      connection, which may not be fully utilizing the window due to congestion
      control, from one that is. The receiver can't get at this information, so we
      assume the worst case: that the full window is in use (we did advertise it
      after all) and that data could be in flight.
      
      The previous behavior of closing the connection, then opening a new one
      with a new HTTP range value, results in massive amounts of discarded
      and re-sent data when large TCP windows are used.  This has been observed
      on MacOS/iOS, which starts with an initial window of 256KB and grows up to
      1MB depending on the bandwidth-delay product.
      
      When we seek within a window's worth of data, close the connection,
      and open a new one within that same window's worth of data, we discard
      everything from the current offset to the end of the window.  Then on the
      new connection the server ends up re-sending the data from the new
      offset to the end of the old window.
      
      Example (assumes full window utilization):
      
      TCP window size: 64KB
      Position: 32KB
      Forward seek position: 40KB
      
            *                      (Next window)
      32KB |--------------| 96KB |---------------| 160KB
              *
        40KB |---------------| 104KB
      
      Re-sent amount: 96KB - 40KB = 56KB
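
      The arithmetic in this example can be written as a small helper. This is
      only a sketch: the function name and parameters are hypothetical, and it
      models just the case described above, a forward seek that lands inside a
      fully utilized window.

```c
#include <stdint.h>

/*
 * Hypothetical helper: bytes the server sends a second time if we close
 * the connection at `pos` and reconnect at `target`, assuming the whole
 * window starting at `pos` was already sent or in flight (the worst-case
 * assumption made in the commit message).
 */
int64_t resent_on_reconnect(int64_t pos, int64_t target, int64_t window)
{
    int64_t window_end = pos + window;

    /* Only forward seeks that land inside the old window are modelled. */
    if (target < pos || target >= window_end)
        return 0;

    /* Everything from the new offset to the end of the old window. */
    return window_end - target;
}
```

      With the numbers from the example, resent_on_reconnect(32 * 1024,
      40 * 1024, 64 * 1024) gives 56 * 1024 bytes, i.e. 56KB.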
      
      For a real-world test example, I have an MP4 file of ~25MB, of which ffplay
      only reads ~16MB while performing 177 seeks. With current ffmpeg, this results
      in 177 HTTP GETs and ~73MB worth of TCP data communication.  With this
      patch, ffmpeg issues 4 HTTP GETs and 3 seeks for a total of ~22MB of TCP data
      communication.
      
      To support this feature, the short seek logic in avio_seek() has been
      extended to call a function to get the short seek threshold value.  This
      callback has been plumbed to the URLProtocol structure, which now has
      infrastructure in HTTP and TCP to get the underlying receiver window size
      via SO_RCVBUF.  If the underlying URL and protocol don't support returning
      a short seek threshold, the default s->short_seek_threshold is used.
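
      The paragraph above can be sketched roughly as follows. This is not the
      code from the patch: all three function names are hypothetical, the
      socket query uses the POSIX getsockopt() call, and the fallback rule
      (use the default when the protocol reports nothing usable) is an
      assumption drawn from the description.

```c
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical: receive buffer size of a socket in bytes, or -1 on error. */
static int query_recv_window(int fd)
{
    int size = 0;
    socklen_t len = sizeof(size);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) < 0)
        return -1;
    return size;
}

/* Hypothetical: prefer the protocol-reported value, else the default. */
static int effective_threshold(int reported, int default_threshold)
{
    return reported > 0 ? reported : default_threshold;
}

/*
 * Hypothetical: returns 1 when a forward seek is close enough to be served
 * by reading ahead and discarding on the current connection, 0 when a new
 * request would be issued instead.
 */
static int is_short_seek(int64_t pos, int64_t target, int threshold)
{
    return target >= pos && target - pos <= threshold;
}
```

      A caller would feed query_recv_window() into effective_threshold() and
      pass the result to is_short_seek() before deciding whether to reconnect.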
      
      This feature has been tested on Windows 7 and MacOS/iOS.  Windows support
      is slightly complicated by the fact that when TCP window auto-tuning is
      enabled, SO_RCVBUF doesn't report the real window size, but it does if
      SO_RCVBUF was manually set (disabling auto-tuning). So we can only use
      this optimization on Windows in the latter case.
      Signed-off-by: Joel Cunningham <joel.cunningham@me.com>
      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
    • b6f4f0b1
    • avcodec/cuvid: set width and height before calling get_format · ce79410b
      Timo Rothenpieler authored
      The external hw_frames_ctx is initialized in that callback, and needs
      that information to be accurate.
    • avcodec/nvenc: push cuda context before encoding a frame · be74ba64
      Timo Rothenpieler authored
      Thanks to Miroslav Slugeň for figuring out what was going on here.
    • opus: add a native Opus encoder · 5f47c85e
      Rostislav Pehlivanov authored
      This marks the first time anyone has written an Opus encoder without
      using any libopus code. The aim of the encoder is to prove how far
      the format can go by writing the craziest encoder for it.
      
      Right now the encoder is basic: it only supports CBR encoding. Internally,
      however, every single feature the CELT layer has is implemented
      (except the pitch pre-filter, which needs to work well with the rest of
      whatever gets implemented). Psychoacoustic and rate control systems are
      under development.
      
      The encoder takes in frames of 120 samples and depending on the value of
      opus_delay the plan is to use the extra buffered frames as lookahead.
      Right now the encoder will pick the nearest largest legal frame size and
      won't use the lookahead, but that'll change once there's a
      psychoacoustic system.
      
      Even though it's a pretty basic encoder, it's already outperforming
      any other native encoder FFmpeg has by a huge amount.
      
      The PVQ search algorithm is faster and more accurate than libopus's
      algorithm so the encoder's performance is close to that of libopus
      at zero complexity (libopus has more SIMD).
      The algorithm might be ported to libopus or other codecs using PVQ in
      the future.
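
      For readers unfamiliar with PVQ search, here is a minimal sketch of the
      textbook greedy approach. It is not the algorithm from this commit and
      not libopus's either; the function name is hypothetical. It places k
      unit pulses one at a time, each time picking the position that maximizes
      (x·y)^2 / (y·y), so that the absolute values of y sum to k.

```c
/*
 * Hypothetical greedy PVQ search: fills y[0..n-1] with integers whose
 * absolute values sum to k, approximating the direction of x.
 */
void pvq_greedy_search(const float *x, int *y, int n, int k)
{
    float rxy = 0.0f; /* running correlation between |x| and y */
    float ryy = 0.0f; /* running energy of y */

    for (int i = 0; i < n; i++)
        y[i] = 0;

    for (int p = 0; p < k; p++) {
        int best = 0;
        float best_num = -1.0f, best_den = 1.0f;

        for (int i = 0; i < n; i++) {
            float ax  = x[i] < 0 ? -x[i] : x[i];
            float num = (rxy + ax) * (rxy + ax);
            float den = ryy + 2.0f * y[i] + 1.0f;

            /* Compare num/den with best_num/best_den without dividing. */
            if (num * best_den > best_num * den) {
                best     = i;
                best_num = num;
                best_den = den;
            }
        }

        rxy += x[best] < 0 ? -x[best] : x[best];
        ryy += 2.0f * y[best] + 1.0f;
        y[best]++;
    }

    /* The search ran on magnitudes; restore the signs of x. */
    for (int i = 0; i < n; i++)
        if (x[i] < 0)
            y[i] = -y[i];
}
```

      For x = {0.5, -0.5} and k = 2 this yields y = {1, -1}.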
      
      The encoder still has a few minor bugs, like desyncs at ultra low
      bitrates (below 9kbps with 20ms frames).
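
      To put that bitrate in perspective, the byte budget of one CBR frame
      follows directly from the bitrate and frame duration. The helper below
      is a hypothetical illustration of that arithmetic, not code from the
      encoder.

```c
#include <stdint.h>

/*
 * Hypothetical helper: whole bytes available per frame at a constant
 * bitrate, given the bitrate in bits per second and the frame duration
 * in microseconds.
 */
int64_t cbr_frame_bytes(int64_t bitrate_bps, int64_t frame_us)
{
    /* bits per frame = bitrate * duration; divide by 8 for bytes. */
    return bitrate_bps * frame_us / 8000000;
}
```

      At 9kbps with 20ms frames, cbr_frame_bytes(9000, 20000) comes to 22
      bytes per frame.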
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
    • opus_celt: rename structures to better names and reorganize them · 07b78340
      Rostislav Pehlivanov authored
      This is meant to be applied on top of my previous patch which
      split PVQ into celt_pvq.c and made opus_celt.h
      
      Essentially nothing has been changed other than renaming CeltFrame
      to CeltBlock (CeltFrame had absolutely nothing at all to do with
      a frame) and CeltContext to CeltFrame.
      3 variables have been put in CeltFrame as they make more sense
      there rather than being passed around as arguments.
      The coefficients have been moved to the CeltBlock structure
      (why the hell were they in CeltContext and not in CeltFrame??).
      
      Now the encoder would be able to use the exact context the decoder
      uses (plus a couple of extra fields in there).
      
      FATE passes, no slowdowns, etc.
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
    • opus_celt: move quantization and band decoding to opus_pvq.c · e538108c
      Rostislav Pehlivanov authored
      A huge amount can be reused by the encoder, as the only thing
      which needs to be done would be to add a 10 line celt_icwrsi,
      a wrapper around it (celt_alg_quant) and templating the
      ff_celt_decode_band to replace entropy decoding functions
      with entropy encoding.
      
      There is no performance loss; in fact there is a performance gain of
      around 6%, caused by the compiler being able to optimize
      the decoding more efficiently.
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
    • imdct15: rename to mdct15 and add a forward transform · d2119f62
      Rostislav Pehlivanov authored
      Handles strides (needed for Opus transients), does pre-reindexing and folding
      without needing a copy.
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
    • opus_rc: add entropy encoding functions · 373ee2c6
      Rostislav Pehlivanov authored
      Mostly used the RFC document, the decoding functions and
      the reference encoder's implementation as a reference.
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
    • doc/ffmpeg: document trailing "?" in map option · 1c049d5f
      Lou Logan authored
      This feature was added in 2375a85c.
      Signed-off-by: Lou Logan <lou@lrcd.com>
  2. 13 Feb, 2017 11 commits
  3. 12 Feb, 2017 8 commits
  4. 11 Feb, 2017 6 commits