[compiler] improve inlining heuristics: call frequency per executed bytecodes
TLDR: Inline less, but more where it matters. ~10% decrease in Turbofan compile time including off-thread, while improving Octane scores by ~2%. How things used to work: There is a flag FLAG_min_inlining_frequency that limits inlining by the callsite being sufficiently frequently executed. This call frequency was measured relative to invocations of the parent (= the function we originally optimize). At the same time, the limit was very low (0.15), meaning we mostly relied on the total amount of inlined code (FLAG_max_inlined_bytecode_size_cumulative) to limit inlining. How things work now: Instead of measuring call frequency relative to parent invocations, we should have a measure that predicts how often the callsite in question will be executed in the future. An obvious attempt at that would be to measure how often the callsite was executed in absolute numbers in the past. But depending on how fast feedback stabilizes, it can take more or less time until we optimize a function. If we just take the absolute call frequency up to the point in time when we optimize, we would inline more for functions that stabilize slowly, which doesn't make sense. So instead, we measure absolute call count per KB of executed bytecodes of the parent function. Since inlining big functions is more expensive, this threshold is additionally scaled linearly with the bytecode-size of the inlinee. The resulting formula is: call_frequency > FLAG_min_inlining_frequency * (bytecode.length() - FLAG_max_inlined_bytecode_size_small) / (FLAG_max_inlined_bytecode_size - FLAG_max_inlined_bytecode_size_small) The new threshold is chosen in a way that it effectively limits inlining, which allows us to increase FLAG_max_inlined_bytecode_size_cumulative without increasing inlining in general. The reduction in compile time (x64 build) of ~10% was observed in Octane, ARES-6, web-tooling-benchmark, and the standalone TypeScript benchmark. The hope is that this will reduce CPU-time in real-world situations too. The Octane improvements come from inlining more in places where it matters. Bug: v8:6682 Change-Id: I99baa17dec85b71616a3ab3414d7e055beca39a0 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1768366 Commit-Queue: Tobias Tebbi <tebbi@chromium.org> Reviewed-by: Jakob Gruber <jgruber@chromium.org> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org> Reviewed-by: Georg Neis <neis@chromium.org> Reviewed-by: Maya Lekova <mslekova@chromium.org> Cr-Commit-Position: refs/heads/master@{#63449}
Showing
Please
register
or
sign in
to comment