1. 27 Aug, 2018 4 commits
    • configure: speed up check_deps() · 45499e55
      Avi Halachmi (:avih) authored
      x4 - x25 faster.
      
      check_deps() recursively enables/disables components, and its loop is
      iterated nearly 6000 times. It's particularly slow in bash - currently
      consuming more than 50% of configure runtime, and about 20% with other
      shells.
      
      This commit applies a few local optimizations, most effective first:
      - Use $1 $2 ... instead of pushvar/popvar, and the same at enable_deep*
      - Abort early in one notable case - empty deps, to avoid costly no-op.
      - Smaller changes which do add up:
        - Handle ${cfg}_checking locally instead of via enable[d]/disable
        - ${cfg}_checking: test done before inprogress - x2 faster in 50%+
        - one eval instead of several at the empty-deps early abort path.
      
      - The "actual work" part is unmodified - just its surroundings.
      
      Biggest speedups (relative and absolute) are observed with bash.
      Tested-by: Michael Niedermayer <michael@niedermayer.cc>
      Tested-by: Helmut K. C. Tessarek <tessarek@evermeet.cx>
      Tested-by: Dave Yeo <daveryeo@telus.net>
      Tested-by: Reino Wijnsma <rwijnsma@xs4all.nl>
      Signed-off-by: James Almer <jamrial@gmail.com>
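
As an illustration of the first three points in this commit message, here is a minimal sketch of a recursive dependency walk in POSIX sh. It is not the code from configure: the function name visit, the *_deps table and the order variable are hypothetical. It relies on the fact that positional parameters are private to each function call, so `set --` can stand in for a hand-written variable stack.

```sh
#!/bin/sh
# Hypothetical dependency table, for illustration only.
a_deps="b c"
b_deps="d"
c_deps="d"
d_deps=""

order=""

# visit NAME...: post-order walk, so each name lands in $order after its deps.
visit() {
    for cfg; do
        eval "state=\${${cfg}_checking:-}"
        # Test the frequent case ("done") before the rare one ("inprogress").
        [ "$state" = done ] && continue
        if [ "$state" = inprogress ]; then
            echo "circular dependency: $cfg" >&2
            exit 1
        fi

        eval "deps=\${${cfg}_deps:-}"
        if [ -z "$deps" ]; then
            # Early exit: nothing to recurse into, so skip the save/restore.
            eval "${cfg}_checking=done"
            order="$order $cfg"
            continue
        fi

        eval "${cfg}_checking=inprogress"
        # Save the globals in this call's own positional parameters. The
        # "for cfg" word list was expanded when the loop started, so it is
        # not affected by this "set --".
        set -- "$cfg" "$deps"
        visit $deps
        # The recursive call clobbered the globals; restore them.
        cfg=$1 deps=$2

        eval "${cfg}_checking=done"
        order="$order $cfg"
    done
}

visit a
echo $order
```

With the table above this prints `d b c a`. The save and restore cost two plain assignments per level and no extra function calls.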
    • configure: speed up print_enabled_components() · 923586a5
      Avi Halachmi (:avih) authored
      x4 - x10 faster.
      
      Inside print_enabled_components(), the filter_list case invokes sed
      about 350 times to parse the same source file and extract different
      info for each arg. This is never instant, and on systems where fork is
      slow (notably MSYS2/Cygwin on Windows) it takes many seconds.
      
      Change it to use sed once on the source file and set env vars with the
      parse results, then use these results inside the loop.
      
      Additionally, the cases of indev_list and outdev_list are very
      infrequent, but nevertheless they're faster, and arguably cleaner, with
      shell parameter substitutions than with command substitutions.
      Tested-by: Michael Niedermayer <michael@niedermayer.cc>
      Tested-by: Helmut K. C. Tessarek <tessarek@evermeet.cx>
      Tested-by: Dave Yeo <daveryeo@telus.net>
      Tested-by: Reino Wijnsma <rwijnsma@xs4all.nl>
      Signed-off-by: James Almer <jamrial@gmail.com>
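
The "run sed once, then look results up in the loop" idea from this commit message can be sketched as follows. This is an illustrative example, not the code from configure: the temporary file contents, the full_name_ variable prefix and the regular expression are assumptions made for the sketch.

```sh
#!/bin/sh
# Hypothetical stand-in for the source file that gets parsed.
src=$(mktemp) || exit 1
cat > "$src" <<'EOF'
extern AVFilter ff_af_volume;
extern AVFilter ff_vf_scale;
extern AVFilter ff_vsrc_testsrc;
EOF

# A single sed process prints one shell assignment per declaration, e.g.
#   full_name_scale=vf_scale
# and eval turns all of them into variables at once. The pattern only
# admits [a-z0-9_], so the evaluated text cannot contain shell syntax.
eval "$(sed -n 's/^extern AVFilter ff_\([a-z]*\)_\([a-z0-9_]*\);$/full_name_\2=\1_\2/p' "$src")"
rm -f "$src"

# The loop now needs only variable lookups, with no process per item.
result=""
for name in scale volume testsrc; do
    eval "full=\$full_name_$name"
    result="$result $full"
done
echo $result

# Suffix stripping with a parameter substitution instead of a command
# substitution (the device name is illustrative).
dev=alsa_indev
short=${dev%_indev}
echo "$short"
```

This prints `vf_scale af_volume vsrc_testsrc` followed by `alsa`. The cost moves from one fork per loop iteration to one fork in total, which matters most where process creation is expensive.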
    • configure: speed up flatten_extralibs_wrapper() · 58b81ac6
      Avi Halachmi (:avih) authored
      x50 - x200 faster.
      
      Currently configure spends 50-70% of its runtime inside a single
      function: flatten_extralibs[_wrapper] - which does string processing.
      
      During its run, nearly 20K command substitutions (subshells) are used,
      including its callees unique() and resolve(), which is the reason
      for its lengthy run.
      
      This commit avoids all subshells during its execution, speeding it up
      by about two orders of magnitude, and reducing the overall configure
      runtime by 50-70%.
      
      resolve() is rewritten to avoid subshells, and in unique() and
      flatten_extralibs() we "inline" the filter[_out] functionality.
      
      Note that logically, "unique" functionality has more than one possible
      output (depending on which of the recurring items is kept). As it
      turns out, other parts expect the last recurring item to be kept
      (which was the original behavior of unique()). This patch preserves
      its output order.
      Tested-by: Michael Niedermayer <michael@niedermayer.cc>
      Tested-by: Helmut K. C. Tessarek <tessarek@evermeet.cx>
      Tested-by: Dave Yeo <daveryeo@telus.net>
      Tested-by: Reino Wijnsma <rwijnsma@xs4all.nl>
      Signed-off-by: James Almer <jamrial@gmail.com>
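
To make the "keep the last recurring item, without subshells" behavior concrete, here is a minimal sketch in POSIX sh. It is not the function from configure: the name unique_last and the ul_ prefixed helper variables are hypothetical, and it assumes the list items contain no whitespace or glob characters.

```sh
#!/bin/sh
# unique_last VAR: remove repeated words from the list stored in VAR, keeping
# the LAST occurrence of each word. Uses only builtins, so nothing is forked.
unique_last() {
    eval "ul_in=\$$1"
    ul_out=""
    for ul_tok in $ul_in; do
        # Inlined "filter out": rebuild the result without the current word...
        ul_keep=""
        for ul_v in $ul_out; do
            [ "$ul_v" = "$ul_tok" ] || ul_keep="$ul_keep $ul_v"
        done
        # ...then append the word, so a repeat moves to its latest position.
        ul_out="$ul_keep $ul_tok"
    done
    # Store the result back into VAR, dropping the leading space.
    eval "$1=\${ul_out# }"
}

libs="-lm -lz -lpthread -lm -lz"
unique_last libs
echo "$libs"
```

This prints `-lpthread -lm -lz`. Keeping the last occurrence matters for linker flags, where a library usually has to appear after the objects and libraries that depend on it. The nested loop is quadratic in the list length, but for short lists that is typically far cheaper than spawning a subshell per item.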
    • lavc/encode: fix frame_number double-counted · d91370e0
      Zhong Li authored
      Encoder frame_number may be double-counted if some frames are cached and then flushed.
      Take the qsv encoder (some frames are cached first for asynchronism) as an example:
      ./ffmpeg -loglevel verbose -hwaccel qsv -c:v h264_qsv -i in.mp4 -vframes 100 -c:v h264_qsv out.mp4
      The frame_number passed to the encoder is double-counted and larger than the accurate value.
      Libx264 encoding with B-frames can also reproduce it.
      Signed-off-by: Zhong Li <zhong.li@intel.com>
  2. 26 Aug, 2018 3 commits
  3. 25 Aug, 2018 11 commits
  4. 24 Aug, 2018 8 commits
  5. 23 Aug, 2018 5 commits
  6. 22 Aug, 2018 9 commits