1. 12 Oct, 2021 1 commit
    • [turbofan] Make GetCommonDominator faster by caching · a2ebdb15
      Jakob Kummerow authored
      Walking the dominator tree can be slow when that tree is very deep,
      and since it's typically done at least once for every BasicBlock,
      overall cost is approximately quadratic.
      With some (sparse) caching, we can get significant speedups for
      very little extra memory consumption.
      In the specific function I looked at, tree depth was around 11,500,
      and this patch speeds up the Scheduling phase from 42 seconds to 0.2
      seconds, while increasing its memory consumption from 113.1 to 113.4
      megabytes.
      
      Change-Id: Iaa32d249a30f62269858d090fbd8924d16d3a9f4
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3218157
      Commit-Queue: Jakob Kummerow <jkummerow@chromium.org>
      Reviewed-by: Maya Lekova <mslekova@chromium.org>
      Cr-Commit-Position: refs/heads/main@{#77356}
  2. 05 May, 2021 1 commit
    • [compiler] Simplify and optimize Scheduler::PrepareUses. · 3f28ca94
      Ross McIlroy authored
      Simplifies the traversal of nodes in Scheduler::PrepareUses to
      avoid having to carefully order stack traversal for pre/post
      ordering visits. Instead, simply pre-visit a node when pushing it
      onto the stack, then post-visit the node when popping it from
      the stack before visiting its inputs. This keeps the same
      invariants but reduces visit overhead.
      
      In addition, move checking for CoupledControlEdges out of
      Increment/DecrementUnscheduledUseCounts such that the
      coupled control edge calculation only needs to be done once
      per node, rather than once for every input of the node. Also
      remove unnecessary recursion from these functions.
      
      All told, these optimizations reduce the PrepareUses overhead
      by 40-50%.
      
      BUG=v8:9684
      
      Change-Id: I934523a732892a1f66d7e77f8d04e200169080f1
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2863602
      Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
      Reviewed-by: Nico Hartmann <nicohartmann@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#74373}
  3. 22 Jul, 2020 1 commit
    • Profile-guided optimization of builtins · 922983df
      Seth Brenith authored
      Design doc:
      https://docs.google.com/document/d/1szInbXZfaErWW70d30hJsOLL0Es-l5_g8d2rXm1ZBqI/edit?usp=sharing
      
      V8 can already collect data about how many times each basic block in the
      builtins is run. This change enables using that data for profile-guided
      optimization. New comments in BUILD.gn describe how to use this feature.
      
      A few implementation details worth mentioning, which aren't covered in
      the design doc:
      
      - BasicBlockProfilerData currently contains an array of RPO numbers.
        However, this array is always just [0, 1, 2, 3, ...], so this change
        removes that array. A new DCHECK in BasicBlockInstrumentor::Instrument
        ensures that the removal is valid.
      
      - RPO numbers, while useful for printing data that matches with the
        stringified schedule, are not useful for matching profiling data with
        blocks that haven't been scheduled yet. This change adds a new array
        of block IDs in BasicBlockProfilerData, so that block counters can be
        used for PGO.
      
      - Basic block counters need to be written to a file so that they can be
        provided to a subsequent run of mksnapshot, but the design doc doesn't
        specify the transfer format or what file is used. In this change, I
        propose using the existing v8.log file for that purpose. Block count
        records look like this:
      
        block,TestLessThanHandler,37,29405
      
        This line indicates that block ID 37 in TestLessThanHandler was run
        29405 times. If multiple lines refer to the same block, the reader
        adds them all together. I like this format because it's easy to use:
        - V8 already has robust logic for creating the log file, naming it to
          avoid conflicts in multi-process situations, etc.
        - Line order doesn't matter, and interleaved writes from various
          logging sources are fine, given that V8 writes each line atomically.
        - Combining multiple sources of profiling data is as simple as
          concatenating their v8.log files together.
      
      - It is a good idea to avoid making any changes based on profiling data
        if the function being compiled doesn't match the one that was
        profiled, since it is common to use profiling data downloaded from a
        central lab which is updated only periodically. To check whether a
        function matches, I propose using a hash of the Graph state right
        before scheduling. This might be stricter than necessary, as some
        changes to the function might be small enough that the profile data is
        still relevant, but I'd rather err on the side of not making incorrect
        changes. This hash is also written to the v8.log file, in a line that
        looks like this:
      
        builtin_hash,LdaZeroHandler,3387822046
      
      Bug: v8:10470
      Change-Id: I429e5ce5efa94e01e7489deb3996012cf860cf13
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2220765
      Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
      Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
      Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#69008}
  4. 28 Oct, 2019 1 commit
  5. 17 Jul, 2019 1 commit
  6. 24 May, 2019 1 commit
  7. 29 Mar, 2019 1 commit
  8. 26 Nov, 2018 1 commit
  9. 20 Nov, 2018 1 commit
  10. 04 Sep, 2017 1 commit
  11. 27 Mar, 2017 2 commits
    • [TurboFan] Reserve space in scheduler node data for split nodes. · bdb4a8d3
      Ross McIlroy authored
      When node splitting is enabled, new nodes can be created during scheduling.
      The Scheduler::node_data_ and Schedule::nodeid_to_block_ zone vectors
      reserve enough space for the pre-splitting node count, but must
      reallocate when node splitting occurs. Since the vectors double
      in capacity by default, peak zone usage is 3x the required amount
      for these vectors as soon as a single node is split. Avoid this in the
      common case by reserving 10% extra space for split nodes. The value
      10% was chosen because it covers 98.7% of the optimized functions in Octane.
      
      BUG=chromium:700364
      
      Change-Id: Ibabd8d04cffd1eb08cc3b8a12b76892208ef3288
      Reviewed-on: https://chromium-review.googlesource.com/458425
      Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
      Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#44153}
    • [TurboFan] Lazily allocate scheduled_nodes vectors since most remain empty. · a059e87e
      Ross McIlroy authored
      The scheduled_nodes_ vector is used to maintain a per-block list of
      non-fixed nodes. For most blocks this list remains empty, so lazily
      initialize it instead of pre-allocating to save memory.
      
      Also pre-reserve an extra 10% of blocks to avoid reallocating space in the
      vector when fusing floating control creates new basic blocks.
      
      BUG=chromium:700364
      
      Change-Id: I9876e6a42bc90c9bff5838620365c18609ed1ee9
      Reviewed-on: https://chromium-review.googlesource.com/458919
      Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
      Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#44152}
  12. 21 Mar, 2017 1 commit
  13. 17 Oct, 2016 1 commit
  14. 10 Oct, 2016 1 commit
  15. 07 Oct, 2016 1 commit
  16. 06 Oct, 2016 1 commit
  17. 23 Sep, 2016 1 commit
  18. 22 Sep, 2016 1 commit
  19. 20 Sep, 2016 1 commit
  20. 09 Feb, 2015 1 commit
  21. 03 Feb, 2015 1 commit
  22. 23 Jan, 2015 1 commit
  23. 22 Jan, 2015 1 commit
  24. 02 Dec, 2014 2 commits
  25. 27 Nov, 2014 1 commit
  26. 26 Nov, 2014 1 commit
  27. 11 Nov, 2014 1 commit
  28. 05 Nov, 2014 1 commit
  29. 29 Oct, 2014 2 commits
  30. 23 Oct, 2014 2 commits
  31. 21 Oct, 2014 3 commits
  32. 13 Oct, 2014 3 commits