    Revert "[compiler] improve inlining heuristics: call frequency per executed bytecodes" · eb443e1f
    Tobias Tebbi authored
    This reverts commit 352a154e.
    
    Reason for revert: https://crbug.com/999972
    
    Original change's description:
    > [compiler] improve inlining heuristics: call frequency per executed bytecodes
    > 
    > TLDR: Inline less, but more where it matters. ~10% decrease in Turbofan
    > compile time including off-thread, while improving Octane scores by ~2%.
    > 
    > How things used to work:
    > 
    > There is a flag FLAG_min_inlining_frequency that limits inlining by
    > the callsite being sufficiently frequently executed. This call frequency
    > was measured relative to invocations of the parent (= the function we
    > originally optimize). At the same time, the limit was very low (0.15),
    > meaning we mostly relied on the total amount of inlined code
    > (FLAG_max_inlined_bytecode_size_cumulative) to limit inlining.
    > 
    > How things work now:
    > 
    > Instead of measuring call frequency relative to parent invocations, we
    > should have a measure that predicts how often the callsite in question
    > will be executed in the future. An obvious attempt at that would be to
    > measure how often the callsite was executed in absolute numbers in the
    > past. But depending on how fast feedback stabilizes, it can take more
    > or less time until we optimize a function. If we just take the absolute
    > call frequency up to the point in time when we optimize, we would
    > inline more for functions that stabilize slowly, which doesn't make
    > sense. So instead, we measure absolute call count per KB of executed
    > bytecodes of the parent function.
    > Since inlining big functions is more expensive, this threshold is
    > additionally scaled linearly with the bytecode-size of the inlinee.
    > The resulting formula is:
    > call_frequency >
    > FLAG_min_inlining_frequency *
    >   (bytecode.length() - FLAG_max_inlined_bytecode_size_small) /
    >   (FLAG_max_inlined_bytecode_size - FLAG_max_inlined_bytecode_size_small)
    > 
    > The new threshold is chosen in a way that it effectively limits
    > inlining, which allows us to increase
    > FLAG_max_inlined_bytecode_size_cumulative without increasing inlining
    > in general.
    > 
    > The reduction in compile time (x64 build) of ~10% was observed in Octane,
    > ARES-6, web-tooling-benchmark, and the standalone TypeScript benchmark.
    > The hope is that this will reduce CPU-time in real-world situations
    > too.
    > The Octane improvements come from inlining more in places where it
    > matters.
    > 
    > Bug: v8:6682
    > 
    > Change-Id: I99baa17dec85b71616a3ab3414d7e055beca39a0
    > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1768366
    > Commit-Queue: Tobias Tebbi <tebbi@chromium.org>
    > Reviewed-by: Jakob Gruber <jgruber@chromium.org>
    > Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
    > Reviewed-by: Georg Neis <neis@chromium.org>
    > Reviewed-by: Maya Lekova <mslekova@chromium.org>
    > Cr-Commit-Position: refs/heads/master@{#63449}
    
    TBR=rmcilroy@chromium.org,neis@chromium.org,jgruber@chromium.org,tebbi@chromium.org,mslekova@chromium.org
    
    # Not skipping CQ checks because original CL landed > 1 day ago.
    
    Bug: v8:6682 chromium:999972
    Change-Id: Iffca63d4bef81afa0f66e34d35fb72f3b5baf517
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1784281
    Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
    Commit-Queue: Tobias Tebbi <tebbi@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#63554}
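
The threshold formula quoted in the original change's description can be sketched roughly as follows. This is an illustration only, not the code from the reverted CL: the `InliningFlags` struct and the functions `CallFrequency`, `FrequencyThreshold`, and `PassesFrequencyCheck` are hypothetical names, and the flag values are supplied by the caller rather than taken from V8's actual defaults.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the flags named in the description
// (FLAG_min_inlining_frequency, FLAG_max_inlined_bytecode_size,
// FLAG_max_inlined_bytecode_size_small). Values are chosen by the caller.
struct InliningFlags {
  double min_inlining_frequency;
  int max_inlined_bytecode_size;
  int max_inlined_bytecode_size_small;
};

// Absolute call count per KB of executed bytecodes of the parent function.
// Assumes "KB" means 1024 bytes; the description does not say which.
double CallFrequency(std::int64_t call_count,
                     std::int64_t executed_bytecode_bytes) {
  if (executed_bytecode_bytes <= 0) return 0.0;
  return static_cast<double>(call_count) /
         (static_cast<double>(executed_bytecode_bytes) / 1024.0);
}

// Minimum call frequency, scaled linearly with the inlinee's bytecode size:
//   min_frequency * (length - small) / (max - small)
double FrequencyThreshold(const InliningFlags& flags, int bytecode_length) {
  const int range =
      flags.max_inlined_bytecode_size - flags.max_inlined_bytecode_size_small;
  assert(range > 0);  // the formula is undefined if max == small
  return flags.min_inlining_frequency *
         static_cast<double>(bytecode_length -
                             flags.max_inlined_bytecode_size_small) /
         static_cast<double>(range);
}

// A callsite passes when its call frequency strictly exceeds the threshold.
bool PassesFrequencyCheck(const InliningFlags& flags, double call_frequency,
                          int bytecode_length) {
  return call_frequency > FrequencyThreshold(flags, bytecode_length);
}
```

Under this sketch the threshold is zero for an inlinee whose bytecode length equals the "small" size and reaches `min_inlining_frequency` at the maximum inlined size, so larger inlinees need proportionally hotter callsites.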