1. 11 Aug, 2020 1 commit
  2. 05 Aug, 2020 1 commit
    • [turboprop] Change heuristics for OSRing in TurboProp · bd9609a0
      Mythri A authored
      Change the heuristics for OSRing in TurboProp. Currently we OSR if
      a function is already optimized / marked for optimization but is still
      running unoptimized code. Since TurboProp optimizes much earlier than
      TurboFan, using the same heuristics would cause us to OSR more often
      than required. This CL adds an additional check on the number of ticks
      to make sure the function is hot enough for OSRing.
      
      Bug: v8:9684
      Change-Id: I7a1c8229182a928fd85efb23e2d385413c5209ef
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2339098
      Commit-Queue: Mythri Alle <mythria@chromium.org>
      Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#69252}
      bd9609a0
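
A minimal sketch of the extra tick check described in the commit above. The struct, the function names, and the threshold value are hypothetical and are not taken from V8's sources; the sketch also assumes that OSR only applies while the frame is still executing unoptimized code.

```cpp
// Hypothetical, simplified view of the state the profiler looks at.
struct FunctionState {
  bool optimized_or_marked;  // already optimized or marked for optimization
  bool running_unoptimized;  // the current frame still runs unoptimized code
  int profiler_ticks;        // ticks accumulated by the runtime profiler
};

// Placeholder threshold; the real value is not stated in the commit message.
constexpr int kAssumedMinTicksForOsr = 5;

// Previous heuristic: OSR as soon as the function is optimized or marked
// but the frame is still running unoptimized code.
constexpr bool ShouldOsrOld(const FunctionState& f) {
  return f.optimized_or_marked && f.running_unoptimized;
}

// Heuristic with the additional tick check: the function must also have
// collected enough profiler ticks to count as hot.
constexpr bool ShouldOsrWithTickCheck(const FunctionState& f) {
  return ShouldOsrOld(f) && f.profiler_ticks >= kAssumedMinTicksForOsr;
}
```

With these assumed numbers, a function that was marked after a single tick would OSR under the old check but not under the new one.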
  3. 29 Jul, 2020 1 commit
    • [nci] Update interrupt budget from NCI code · 980e224a
      Jakob Gruber authored
      This is the first step towards implementing a tier-up mechanism from
      NCI code to TF. We will follow the existing Ignition-to-Turbofan
      mechanics, which are, roughly:
      
      1. Track a bytecode interrupt budget.
      2. When exhausted, call the runtime profiler, which increments
         profiler ticks for the top frame's function.
      3. When a function should tier up, it is marked as such using the
         FeedbackVector::optimized_code_weak_or_smi slot / the
         OptimizationMarker mechanism.
      4. The InterpreterEntryTrampoline checks this slot and calls into
         runtime to compile if needed.
      5. The finished code is also placed into this slot, as well as
         installed on the JSFunction.
      6. Again, the IET checks the slot and tail-calls the code object if it
         exists.
      
      This CL implements step 1 for NCI code by inserting the new simplified
      UpdateInterruptBudget operator at the same spots (and using the same
      offsets) as Ignition. When the budget is exhausted, we call a runtime
      function that currently does nothing and will be implemented in the
      next CL.
      
      Bug: v8:8888
      Change-Id: I98c0f8d96f32d515218dc2a76f961d44fe281c86
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2312778
      Commit-Queue: Jakob Gruber <jgruber@chromium.org>
      Reviewed-by: Georg Neis <neis@chromium.org>
      Reviewed-by: Mythri Alle <mythria@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#69124}
      980e224a
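
Steps 1 and 2 of the tier-up mechanics listed in the commit above can be sketched roughly as follows. This is a simplified model, not V8 code: the `Function` struct, the budget value, and resetting the budget after exhaustion are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical per-function bookkeeping.
struct Function {
  int interrupt_budget;  // remaining budget before the profiler is called
  int profiler_ticks;    // incremented each time the budget is exhausted
};

// Placeholder budget; the real value is not given in the commit message.
constexpr int kAssumedInitialBudget = 1000;

// Step 2: the runtime profiler increments the tick count for the function.
// Refilling the budget here is an assumption of this sketch.
void OnBudgetExhausted(Function& f) {
  ++f.profiler_ticks;
  f.interrupt_budget = kAssumedInitialBudget;
}

// Step 1: charge `weight` against the budget at a budget-update point and
// call into the profiler once the budget drops below zero.
void UpdateInterruptBudget(Function& f, int weight) {
  assert(weight >= 0);
  f.interrupt_budget -= weight;
  if (f.interrupt_budget < 0) OnBudgetExhausted(f);
}
```

Under these assumptions, two updates of weight 600 against a budget of 1000 produce exactly one profiler tick and a refilled budget.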
  4. 16 Mar, 2020 1 commit
  5. 10 Jan, 2020 1 commit
  6. 02 Dec, 2019 1 commit
  7. 04 Sep, 2019 1 commit
    • Revert "[compiler] improve inlining heuristics: call frequency per executed bytecodes" · eb443e1f
      Tobias Tebbi authored
      This reverts commit 352a154e.
      
      Reason for revert: https://crbug.com/999972
      
      Original change's description:
      > [compiler] improve inlining heuristics: call frequency per executed bytecodes
      > 
      > TLDR: Inline less, but more where it matters. ~10% decrease in Turbofan
      > compile time including off-thread, while improving Octane scores by ~2%.
      > 
      > How things used to work:
      > 
      > There is a flag FLAG_min_inlining_frequency that limits inlining by
      > the callsite being sufficiently frequently executed. This call frequency
      > was measured relative to invocations of the parent (= the function we
      > originally optimize). At the same time, the limit was very low (0.15),
      > meaning we mostly relied on the total amount of inlined code
      > (FLAG_max_inlined_bytecode_size_cumulative) to limit inlining.
      > 
      > How things work now:
      > 
      > Instead of measuring call frequency relative to parent invocations, we
      > should have a measure that predicts how often the callsite in question
      > will be executed in the future. An obvious attempt at that would be to
      > measure how often the callsite was executed in absolute numbers in the
      > past. But depending on how fast feedback stabilizes, it can take more
      > or less time until we optimize a function. If we just take the absolute
      > call frequency up to the point in time when we optimize, we would
      > inline more for functions that stabilize slowly, which doesn't make
      > sense. So instead, we measure absolute call count per KB of executed
      > bytecodes of the parent function.
      > Since inlining big functions is more expensive, this threshold is
      > additionally scaled linearly with the bytecode-size of the inlinee.
      > The resulting formula is:
      > call_frequency >
      > FLAG_min_inlining_frequency *
      >   (bytecode.length() - FLAG_max_inlined_bytecode_size_small) /
      >   (FLAG_max_inlined_bytecode_size - FLAG_max_inlined_bytecode_size_small)
      > 
      > The new threshold is chosen in a way that it effectively limits
      > inlining, which allows us to increase
      > FLAG_max_inlined_bytecode_size_cumulative without increasing inlining
      > in general.
      > 
      > The reduction in compile time (x64 build) of ~10% was observed in Octane,
      > ARES-6, web-tooling-benchmark, and the standalone TypeScript benchmark.
      > The hope is that this will reduce CPU-time in real-world situations
      > too.
      > The Octane improvements come from inlining more in places where it
      > matters.
      > 
      > Bug: v8:6682
      > 
      > Change-Id: I99baa17dec85b71616a3ab3414d7e055beca39a0
      > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1768366
      > Commit-Queue: Tobias Tebbi <tebbi@chromium.org>
      > Reviewed-by: Jakob Gruber <jgruber@chromium.org>
      > Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
      > Reviewed-by: Georg Neis <neis@chromium.org>
      > Reviewed-by: Maya Lekova <mslekova@chromium.org>
      > Cr-Commit-Position: refs/heads/master@{#63449}
      
      TBR=rmcilroy@chromium.org,neis@chromium.org,jgruber@chromium.org,tebbi@chromium.org,mslekova@chromium.org
      
      # Not skipping CQ checks because original CL landed > 1 day ago.
      
      Bug: v8:6682 chromium:999972
      Change-Id: Iffca63d4bef81afa0f66e34d35fb72f3b5baf517
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1784281
      Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
      Commit-Queue: Tobias Tebbi <tebbi@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#63554}
      eb443e1f
  8. 29 Aug, 2019 1 commit
    • [compiler] improve inlining heuristics: call frequency per executed bytecodes · 352a154e
      Tobias Tebbi authored
      TLDR: Inline less, but more where it matters. ~10% decrease in Turbofan
      compile time including off-thread, while improving Octane scores by ~2%.
      
      How things used to work:
      
      There is a flag FLAG_min_inlining_frequency that limits inlining by
      the callsite being sufficiently frequently executed. This call frequency
      was measured relative to invocations of the parent (= the function we
      originally optimize). At the same time, the limit was very low (0.15),
      meaning we mostly relied on the total amount of inlined code
      (FLAG_max_inlined_bytecode_size_cumulative) to limit inlining.
      
      How things work now:
      
      Instead of measuring call frequency relative to parent invocations, we
      should have a measure that predicts how often the callsite in question
      will be executed in the future. An obvious attempt at that would be to
      measure how often the callsite was executed in absolute numbers in the
      past. But depending on how fast feedback stabilizes, it can take more
      or less time until we optimize a function. If we just take the absolute
      call frequency up to the point in time when we optimize, we would
      inline more for functions that stabilize slowly, which doesn't make
      sense. So instead, we measure absolute call count per KB of executed
      bytecodes of the parent function.
      Since inlining big functions is more expensive, this threshold is
      additionally scaled linearly with the bytecode-size of the inlinee.
      The resulting formula is:
      call_frequency >
      FLAG_min_inlining_frequency *
        (bytecode.length() - FLAG_max_inlined_bytecode_size_small) /
        (FLAG_max_inlined_bytecode_size - FLAG_max_inlined_bytecode_size_small)
      
      The new threshold is chosen in a way that it effectively limits
      inlining, which allows us to increase
      FLAG_max_inlined_bytecode_size_cumulative without increasing inlining
      in general.
      
      The reduction in compile time (x64 build) of ~10% was observed in Octane,
      ARES-6, web-tooling-benchmark, and the standalone TypeScript benchmark.
      The hope is that this will reduce CPU-time in real-world situations
      too.
      The Octane improvements come from inlining more in places where it
      matters.
      
      Bug: v8:6682
      
      Change-Id: I99baa17dec85b71616a3ab3414d7e055beca39a0
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1768366
      Commit-Queue: Tobias Tebbi <tebbi@chromium.org>
      Reviewed-by: Jakob Gruber <jgruber@chromium.org>
      Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
      Reviewed-by: Georg Neis <neis@chromium.org>
      Reviewed-by: Maya Lekova <mslekova@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#63449}
      352a154e
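
The threshold formula from the commit above can be worked through in a small sketch. The struct and function names are hypothetical, and the flag values used in the usage note are made up for easy arithmetic; only the shape of the formula comes from the commit message.

```cpp
#include <cassert>

// Stand-ins for the FLAG_* values named in the commit message.
struct InliningFlags {
  double min_inlining_frequency;
  int max_inlined_bytecode_size_small;
  int max_inlined_bytecode_size;
};

// Minimum call frequency (calls per KB of executed parent bytecode) that a
// callsite needs, scaled linearly with the bytecode length of the inlinee.
double RequiredCallFrequency(const InliningFlags& flags, int bytecode_length) {
  assert(flags.max_inlined_bytecode_size >
         flags.max_inlined_bytecode_size_small);
  return flags.min_inlining_frequency *
         (bytecode_length - flags.max_inlined_bytecode_size_small) /
         (flags.max_inlined_bytecode_size -
          flags.max_inlined_bytecode_size_small);
}

// The check from the commit message: call_frequency > threshold.
bool PassesFrequencyCheck(const InliningFlags& flags, int bytecode_length,
                          double call_frequency) {
  return call_frequency > RequiredCallFrequency(flags, bytecode_length);
}
```

With the made-up flags `{1.0, 100, 500}`, an inlinee of length 300 needs a call frequency above 0.5. An inlinee shorter than the "small" size gets a negative threshold, so under this formula it passes the frequency check even at a call frequency of zero.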
  9. 22 Aug, 2019 1 commit
    • [testing] Prevent heuristics from triggering optimization in tests · 6d9b7988
      Sigurd Schneider authored
      This CL adds a mechanism that prevents the RuntimeProfiler from
      triggering optimization of a function after
      %PrepareFunctionForOptimization has been called. This is useful to
      prevent flakiness in tests, as sometimes a function that already
      got deoptimized would receive a new code object from a concurrent
      compile that was triggered by a heuristic just in the right moment
      for the assertUnoptimized test to fail. For example, the following
      was happening:
      
      PrepareFunctionForOptimization
      [marking `testAdd` for optimized recompilation, reason: small function]
      [concurrently compiling method `testAdd` using TurboFan]
      [manually marking `testAdd` for non-concurrent optimization]
      [synchronously compiling method `testAdd` using TurboFan]
      [synchronously optimizing `testAdd` produced code object 0xAAAA - took 1.638 ms]
      Runtime_GetOptimizationStatus OPTIMIZED `testAdd` (code object 0xAAAA)
      DeoptimizeFunction `testAdd` with Code Object 0xAAAA
      [concurrently optimizing `testAdd` produced code object 0xBBBB - took 3.377 ms]
      Runtime_GetOptimizationStatus OPTIMIZED `testAdd` (code object 0xBBBB)
      
      Bug: v8:9563
      Change-Id: Ia4c846aba95281589317d43b82383e70fe0a35f5
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1763546
      Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
      Reviewed-by: Yang Guo <yangguo@chromium.org>
      Commit-Queue: Sigurd Schneider <sigurds@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#63343}
      6d9b7988
  10. 23 May, 2019 2 commits
  11. 22 May, 2019 1 commit
  12. 21 May, 2019 1 commit
  13. 16 May, 2019 1 commit
  14. 25 Mar, 2019 3 commits
    • [cleanup] Remove obsolete --type_info_threshold flag. · 19dcbec8
      Benedikt Meurer authored
      The --type_info_threshold flag has not been supported for a long time
      and doesn't do anything useful nowadays, so there is no point in
      keeping it around.
      
      Drive-by-fix: Remove the FeedbackVector::ComputeCounts() logic, since
      it's dead code anyways by now.
      
      Bug: v8:8834
      Change-Id: I05f7517b3b82e34c0a83357337a456ab9c9f1f42
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1538128
      Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
      Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#60442}
      19dcbec8
    • [turbofan] Remove duplicated optimization limit. · 077e49a1
      Benedikt Meurer authored
      Before this change we had essentially two optimization limits, one hard
      limit in the TurboFan pipeline (128KiB), and a soft limit in the runtime
      profiler (60KiB). The hard limit was only relevant to --always-opt and
      other internal test infrastructure, and the soft limit was always
      enforced on regular JavaScript, but didn't properly disable further
      optimization for the function (so for example --trace-opt would
      continuously report attempts to optimize the function).
      
      Now with this change we only have the hard limit, set to 60KiB, in the
      TurboFan pipeline and use that consistently.
      
      Bug: v8:8598
      Change-Id: I9e2ae7cb67de4a2256d3a7b9c3aee3dab60c2ec1
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1538127
      Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
      Commit-Queue: Jaroslav Sevcik <jarin@chromium.org>
      Auto-Submit: Benedikt Meurer <bmeurer@chromium.org>
      Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#60436}
      077e49a1
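
As a rough illustration of the single limit described in the commit above, the check reduces to one comparison. The constant and function names are hypothetical, and two details are assumptions of this sketch: that the limit is measured against the function's bytecode size, and that a function exactly at the limit is still accepted.

```cpp
constexpr int kKiB = 1024;

// The single hard limit after the change, per the commit message: 60KiB.
constexpr int kAssumedMaxOptimizedSize = 60 * kKiB;

// Returns true if the pipeline should refuse to optimize the function.
// Treating the limit as inclusive is an assumption.
constexpr bool IsTooBigToOptimize(int bytecode_size) {
  return bytecode_size > kAssumedMaxOptimizedSize;
}
```

Under these assumptions a 100KiB function, which sat between the old 60KiB soft limit and the old 128KiB hard limit, is now consistently rejected in one place.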
    • [tracing] Properly trace stack guards and interrupts. · b8490293
      Benedikt Meurer authored
      Add tracing support for the %StackGuard() and %Interrupt() runtime calls
      and the individual actions performed in StackGuard::HandleInterrupts().
      This includes:
      
       - "V8.GCHandleGCRequest" (in "disabled-by-default-v8.gc") when the
         GC_REQUEST bit is set.
       - "V8.WasmGrowSharedMemory" (in "disabled-by-default-v8.wasm") when
         the GROW_SHARED_MEMORY bit is set.
       - "V8.TerminateExecution" (in "v8.execute") when the
         TERMINATE_EXECUTION bit is set.
       - "V8.GCDeoptMarkedAllocationSites" (in "disabled-by-default-v8.gc")
         when the DEOPT_MARKED_ALLOCATION_SITES bit is set.
       - "V8.InstallOptimizedFunctions" (in "disabled-by-default-v8.compile")
         when the INSTALL_CODE bit is set.
       - "V8.InvokeApiInterruptCallbacks" (in "v8.execute") when the
         API_INTERRUPT bit is set.
      
      Now we also emit a trace event "V8.MarkCandidatesForOptimization" (in
      "disabled-by-default-v8.compile") in addition to the above from the
      RuntimeProfiler when we mark candidates for optimization at the end
      of each stack check.
      
      An example of the "V8.InstallOptimizedFunctions" in action (in the
      trace viewer) can be seen here:
      
        https://i.paste.pics/094a04af035eedc0690cd4079afa28f1.png
      
      This supersedes the previously introduced --trace-interrupts CLI flag,
      which is thus removed as part of this change.
      
      Bug: v8:8598
      Change-Id: I3c3375d00b07cbe700b6912097d7264031ace802
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1538116
      Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
      Reviewed-by: Peter Marshall <petermarshall@chromium.org>
      Reviewed-by: Leszek Swirski <leszeks@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#60428}
      b8490293
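
The mapping from interrupt bits to trace events listed in the commit above can be summarized as a lookup. The enum and function below are hypothetical helpers written for this summary, not V8's implementation; only the event names and categories are taken from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum mirroring the interrupt bits named in the commit.
enum class InterruptFlag {
  kGcRequest,
  kGrowSharedMemory,
  kTerminateExecution,
  kDeoptMarkedAllocationSites,
  kInstallCode,
  kApiInterrupt,
};

struct TraceEvent {
  std::string category;
  std::string name;
};

// Returns the trace category and event name emitted for a given bit.
TraceEvent TraceEventFor(InterruptFlag flag) {
  switch (flag) {
    case InterruptFlag::kGcRequest:
      return {"disabled-by-default-v8.gc", "V8.GCHandleGCRequest"};
    case InterruptFlag::kGrowSharedMemory:
      return {"disabled-by-default-v8.wasm", "V8.WasmGrowSharedMemory"};
    case InterruptFlag::kTerminateExecution:
      return {"v8.execute", "V8.TerminateExecution"};
    case InterruptFlag::kDeoptMarkedAllocationSites:
      return {"disabled-by-default-v8.gc", "V8.GCDeoptMarkedAllocationSites"};
    case InterruptFlag::kInstallCode:
      return {"disabled-by-default-v8.compile", "V8.InstallOptimizedFunctions"};
    case InterruptFlag::kApiInterrupt:
      return {"v8.execute", "V8.InvokeApiInterruptCallbacks"};
  }
  assert(false && "unhandled interrupt flag");
  return {"", ""};
}
```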
  15. 08 Dec, 2018 1 commit
  16. 07 Dec, 2018 3 commits
  17. 30 Nov, 2018 1 commit
  18. 28 Nov, 2018 1 commit
  19. 24 Nov, 2018 1 commit
  20. 20 Nov, 2018 1 commit
  21. 02 Nov, 2018 2 commits
    • Reland "Get BytecodeArray via current frame where possible." · 3530998c
      Ross McIlroy authored
      This is a reland of 7350e7b2
      
      Disabled LayoutTest that was causing issues and will rebaseline once this has rolled.
      
      Original change's description:
      > Get BytecodeArray via current frame where possible.
      >
      > With BytecodeArray flushing the SFI->BytecodeArray pointer will become pseudo weak.
      > Instead of getting the bytecode array from the SFI, get it from the frame instead
      > (which is a strong pointer). Note: This won't actually change behaviour since the
      > fact that the bytecode array was on the frame will retain it strongly, however it
      > makes the contract that the BytecodeArray must exist at these points more explicit.
      >
      > Updates code in runtime-profiler.cc, frames.cc and runtime-test.cc to do this.
      >
      > BUG=v8:8395
      >
      > Cq-Include-Trybots: luci.chromium.try:linux_chromium_headless_rel;master.tryserver.blink:linux_trusty_blink_rel
      > Change-Id: Id7a3e6857abd0e89bf238e9b0b01de4461df54e1
      > Reviewed-on: https://chromium-review.googlesource.com/c/1310193
      > Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
      > Reviewed-by: Mythri Alle <mythria@chromium.org>
      > Cr-Commit-Position: refs/heads/master@{#57198}
      
      TBR=mythria@chromium.org
      
      Bug: v8:8395
      Change-Id: I63044138f876a1cdfb8bb71499732a257f30d29a
      Cq-Include-Trybots: luci.chromium.try:linux_chromium_headless_rel;master.tryserver.blink:linux_trusty_blink_rel
      Reviewed-on: https://chromium-review.googlesource.com/c/1314336
      Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
      Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#57219}
      3530998c
    • Revert "Get BytecodeArray via current frame where possible." · ea27a244
      Maya Lekova authored
      This reverts commit 7350e7b2.
      
      Reason for revert: Breaking layout test, blocking the roll, see
      https://bugs.chromium.org/p/v8/issues/detail?id=8405
      
      Original change's description:
      > Get BytecodeArray via current frame where possible.
      > 
      > With BytecodeArray flushing the SFI->BytecodeArray pointer will become pseudo weak.
      > Instead of getting the bytecode array from the SFI, get it from the frame instead
      > (which is a strong pointer). Note: This won't actually change behaviour since the
      > fact that the bytecode array was on the frame will retain it strongly, however it
      > makes the contract that the BytecodeArray must exist at these points more explicit.
      > 
      > Updates code in runtime-profiler.cc, frames.cc and runtime-test.cc to do this.
      > 
      > BUG=v8:8395
      > 
      > Cq-Include-Trybots: luci.chromium.try:linux_chromium_headless_rel;master.tryserver.blink:linux_trusty_blink_rel
      > Change-Id: Id7a3e6857abd0e89bf238e9b0b01de4461df54e1
      > Reviewed-on: https://chromium-review.googlesource.com/c/1310193
      > Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
      > Reviewed-by: Mythri Alle <mythria@chromium.org>
      > Cr-Commit-Position: refs/heads/master@{#57198}
      
      TBR=rmcilroy@chromium.org,mythria@chromium.org
      
      Change-Id: Ie5db0ec1d68ca01d62e9880a4476704ad4d013b5
      No-Presubmit: true
      No-Tree-Checks: true
      No-Try: true
      Bug: v8:8395
      Cq-Include-Trybots: luci.chromium.try:linux_chromium_headless_rel;master.tryserver.blink:linux_trusty_blink_rel
      Reviewed-on: https://chromium-review.googlesource.com/c/1314330
      Reviewed-by: Maya Lekova <mslekova@chromium.org>
      Commit-Queue: Maya Lekova <mslekova@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#57205}
      ea27a244
  22. 01 Nov, 2018 1 commit
    • Get BytecodeArray via current frame where possible. · 7350e7b2
      Ross McIlroy authored
      With BytecodeArray flushing the SFI->BytecodeArray pointer will become pseudo weak.
      Instead of getting the bytecode array from the SFI, get it from the frame instead
      (which is a strong pointer). Note: This won't actually change behaviour since the
      fact that the bytecode array was on the frame will retain it strongly, however it
      makes the contract that the BytecodeArray must exist at these points more explicit.
      
      Updates code in runtime-profiler.cc, frames.cc and runtime-test.cc to do this.
      
      BUG=v8:8395
      
      Cq-Include-Trybots: luci.chromium.try:linux_chromium_headless_rel;master.tryserver.blink:linux_trusty_blink_rel
      Change-Id: Id7a3e6857abd0e89bf238e9b0b01de4461df54e1
      Reviewed-on: https://chromium-review.googlesource.com/c/1310193
      Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
      Reviewed-by: Mythri Alle <mythria@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#57198}
      7350e7b2
  23. 10 Apr, 2018 1 commit
    • interpreter: make interpreted frames distinguishable in the native stack · ada64b58
      Matheus Marchini authored
      Before Turbofan/Ignition it was possible to use external profilers to
      sample running V8/Node.js processes and generate reports/FlameGraphs
      from that. It's still possible to do so, but non-optimized JavaScript
      functions appear in the stack as InterpreterEntryTrampoline. This commit
      adds a runtime flag which makes interpreted frames visible on the
      process' native stack as distinguishable functions, making the sampled
      data gathered by external profilers such as Linux perf and DTrace more
      useful.
      
      R=bmeurer@google.com, franzih@google.com, jarin@google.com, yangguo@google.com
      
      Bug: v8:7155
      Change-Id: I3dc8876aa3cd9f1b9766624842a7cc354ccca415
      Reviewed-on: https://chromium-review.googlesource.com/959081
      Commit-Queue: Yang Guo <yangguo@chromium.org>
      Reviewed-by: Leszek Swirski <leszeks@chromium.org>
      Reviewed-by: Yang Guo <yangguo@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#52533}
      ada64b58
  24. 17 Oct, 2017 1 commit
  25. 19 Sep, 2017 1 commit
    • Change runtime_profiler to use bytecode array length · 807d0abe
      Mythri authored
      Runtime profiler uses bytecode array size for the tiering up decisions.
      Bytecode array size includes the header size as well. Inlining
      heuristics use bytecode array length instead. Bytecode array length
      is just the size of the bytecode, not including any headers. This
      change makes the runtime profiler use the length as well, to keep both
      of them in sync and avoid confusion. Also, the header contains several
      pointers and hence the size changes depending on kPointerSize.
      
      Bug: 
      Change-Id: I22a9cf5e0bb9d6853c6a8be8d69c9ff459418a0d
      Reviewed-on: https://chromium-review.googlesource.com/670724
      Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
      Commit-Queue: Mythri Alle <mythria@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#48081}
      807d0abe
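
The difference between bytecode array size and length described in the commit above can be illustrated with a toy layout. The number of pointer-sized header fields below is an assumption chosen for the example and is not V8's real object layout.

```cpp
// Assumed number of pointer-sized fields in the header (illustrative only).
constexpr int kAssumedHeaderPointerFields = 4;

// Header size depends on the pointer width of the target.
constexpr int HeaderSize(int pointer_size) {
  return kAssumedHeaderPointerFields * pointer_size;
}

// "Size" includes the header; "length" counts only the bytecode bytes.
constexpr int BytecodeArraySize(int length, int pointer_size) {
  return HeaderSize(pointer_size) + length;
}
```

In this toy layout, 100 bytes of bytecode have a size of 116 with 4-byte pointers and 132 with 8-byte pointers, while the length stays 100 on both. A threshold compared against the length therefore behaves the same across pointer widths and matches what the inlining heuristics measure.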
  26. 13 Sep, 2017 1 commit
  27. 11 Sep, 2017 1 commit
  28. 05 Sep, 2017 2 commits
  29. 31 Aug, 2017 1 commit
  30. 10 Aug, 2017 1 commit
  31. 03 Aug, 2017 1 commit
  32. 28 Jul, 2017 1 commit
  33. 25 Jul, 2017 1 commit