Commits · e43118466fb6a485c9ae20d9f47496ce1ed3c10e · Linshizhi / V8

20 Jan, 2022 1 commit

Remove the turboprop implementation · 0a6c1a77

Jakob Gruber authored 3 years ago

Bug: v8:12552
Change-Id: I99e4d8e8aeba5460f11e54cc1b2bcaea98a5276d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3400964
Reviewed-by: Toon Verwaest <verwaest@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/main@{#78698}

09 Nov, 2021 1 commit

Refactor and remove dead code in runtime-profiler · 7d591d2b

Jakob Gruber authored 3 years ago

Change-Id: Id51910177ce1124b025af2ec36ab6d7c6b06937d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3268741
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77793}

07 Sep, 2021 1 commit

[profiler] Turn some runtime profiler static ints into flags · 4b1d972c

Toon Verwaest authored 3 years ago

That makes it easier to try various values.

Change-Id: I3f4784d148cd5c7524773972e72e1a37ce861210
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2972731
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76701}

24 Aug, 2021 1 commit

Fix a DCHECK failure with broken asm.js functions · a6f3fce3

Georg Neis authored 3 years ago

Fixed: chromium:1236286
Change-Id: I90106fce4d6e747f35c638ab00bf9a1696c8eb77
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3109668
Commit-Queue: Georg Neis <neis@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76462}

08 Jul, 2021 1 commit

[TurboProp] Don't scale OSR ticks. · 53574525

Ross McIlroy authored 3 years ago

Now that TurboProp doesn't have an earlier interrupt budget, we
should no longer be scaling the number of ticks required to
OSR to TurboProp.

BUG=v8:9684

Change-Id: Ie4d41e75df697e36e7fbc3f7bc8a8d0f24f6743a
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3014462
Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#75647}

15 Jun, 2021 1 commit

[TurboProp] Make TurboProp optimize later. · 7d468b70

Ross McIlroy authored 3 years ago

Moves TurboProp to optimize around the time of TurboFan right now, and
removes some of the special-case logic we had to avoid aggressive
early optimization of TurboProp.

BUG=v8:9684

Change-Id: I0299408891ff6fd57e6523ff309b5f16624466a9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2964814
Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#75163}

15 Apr, 2021 1 commit

[nci] Remove more NCI-specific logic · 5ecb5bd9

Jakob Gruber authored 3 years ago

Some logic still remains, notably in compiler/.

Bug: v8:8888
Change-Id: I7e7f10a487e1bc8b90bbbfedbc46bf09bae0717e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2825589
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Dominik Inführ <dinfuehr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#73969}

22 Feb, 2021 1 commit

[turboprop] Reduce BytecodeBudgetInterrupt overhead from Turboprop · 5b783479

Mythri A authored 3 years ago

Earlier we always used the same interrupt budget and waited for a higher
number of ticks when tiering up from Turboprop to TurboFan. On some
real-world pages this adds noticeable overhead for processing
these interrupts. This cl sets the interrupt budget to a higher value so
there are fewer interrupts. This cl:
1. Sets the interrupt budget on feedback cell to
FLAG_interrupt_budget * scale factor when we install optimized code.
2. Resets the budget to FLAG_interrupt_budget when there is a
deoptimization.
3. Updates the runtime profiler to remove the scaling of number of ticks
needed for optimization when tiering up from TP to TF.

On sheets benchmark, we spend 40-50ms when servicing interrupts from
Turboprop code. This change brings it down to ~7ms. We also see
improvements on other pages.
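
The budget handling in steps 1 and 2 can be sketched roughly as follows. This is an illustration only: `FeedbackCell`, `kInterruptBudget`, `kScaleFactor`, and the three helper functions are hypothetical names with made-up values, not V8's actual identifiers or defaults.

```cpp
#include <cstdint>

// Illustrative stand-ins; real V8 reads these values from flags.
constexpr int32_t kInterruptBudget = 1000;  // assumed base budget
constexpr int32_t kScaleFactor = 10;        // assumed scale factor

// Hypothetical per-function cell that holds the remaining budget.
struct FeedbackCell {
  int32_t interrupt_budget = kInterruptBudget;
};

// Step 1: installed optimized code gets a larger budget, so it
// triggers fewer interrupts.
inline void OnOptimizedCodeInstalled(FeedbackCell& cell) {
  cell.interrupt_budget = kInterruptBudget * kScaleFactor;
}

// Step 2: after a deoptimization, fall back to the base budget.
inline void OnDeoptimization(FeedbackCell& cell) {
  cell.interrupt_budget = kInterruptBudget;
}

// Charges executed bytecode against the budget. Returns true when the
// budget is exhausted and an interrupt would be serviced.
inline bool ConsumeBudget(FeedbackCell& cell, int32_t bytecode_size) {
  cell.interrupt_budget -= bytecode_size;
  return cell.interrupt_budget < 0;
}
```

With a larger budget the same amount of executed bytecode exhausts the budget less often, which is where the reduction in interrupt-servicing time would come from.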


Bug: v8:9684
Change-Id: Ia3e5e998d1fff44f2e08a240a8769b7ebe794da2
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2696661
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72906}

18 Feb, 2021 1 commit

Tweak OSR heuristic to fix gaussian-blur regression · 15891111

Seth Brenith authored 3 years ago

My recent change https://crrev.com/c/v8/v8/+/2698057 changed the size of
bytecode for most functions, and attempted to update other heuristic
values to match. However, it caused V8 to be slightly too eager to
perform on-stack replacement in JetStream 2's gaussian-blur test case,
so that the function got compiled separately for each of two nested
loops rather than just once for the outer loop. This is the smallest
change that restores the previous behavior in that benchmark.

Bug: chromium:1179571
Change-Id: I03e98d6bff7355b775c1fdaf495e7444e7c6f095
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2704882
Reviewed-by: Mythri Alle <mythria@chromium.org>
Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
Cr-Commit-Position: refs/heads/master@{#72849}

17 Feb, 2021 1 commit

Reland "[interpreter] Short Star bytecode" · 7be64db4

Seth Brenith authored 3 years ago

This is a reland of cf93071c

Original change's description:
> [interpreter] Short Star bytecode
>
> Design doc:
> https://docs.google.com/document/d/1g_NExMT78II_KnIYNa9MvyPYIj23qAiFUEsyemY5KRk/edit
>
> This change adds 16 new interpreter opcodes, kStar0 through kStar15, so
> that we can use a single byte to represent the common operation of
> storing to a low-numbered register. This generally reduces the quantity
> of bytecode generated on web sites by 8-9%.
>
> In order to not degrade speed, a couple of other changes are required:
>
> The existing lookahead logic to check for Star after certain other
> bytecode handlers is updated to check for these new short Star codes
> instead. Furthermore, that lookahead logic is updated to contain its own
> copy of the dispatch jump rather than merging control flow with the
> lookahead-failed case, to improve branch prediction.
>
> A bunch of constants use bytecode size in bytes as a proxy for the size
> or complexity of a function, and are adjusted downward proportionally to
> the decrease in generated bytecode size.
>
> Other small drive-by fix: update generate-bytecode-expectations to emit
> \n instead of \r\n on Windows.
>
> Change-Id: I6307c2b0f5794a3a1088bb0fb94f6e1615441ed5
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2641180
> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
> Cr-Commit-Position: refs/heads/master@{#72773}

Change-Id: I1afb670c25694498b3989de615858f984a8c7f6f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2698057
Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72821}

16 Feb, 2021 2 commits

Revert "[interpreter] Short Star bytecode" · 08a49bbe

Leszek Swirski authored 3 years ago

This reverts commit cf93071c.

Reason for revert: Speculative revert because of Mac64 GC stress failure: https://ci.chromium.org/ui/p/v8/builders/ci/V8%20Mac64%20GC%20Stress/16697/overview

Original change's description:
> [interpreter] Short Star bytecode
>
> Design doc:
> https://docs.google.com/document/d/1g_NExMT78II_KnIYNa9MvyPYIj23qAiFUEsyemY5KRk/edit
>
> This change adds 16 new interpreter opcodes, kStar0 through kStar15, so
> that we can use a single byte to represent the common operation of
> storing to a low-numbered register. This generally reduces the quantity
> of bytecode generated on web sites by 8-9%.
>
> In order to not degrade speed, a couple of other changes are required:
>
> The existing lookahead logic to check for Star after certain other
> bytecode handlers is updated to check for these new short Star codes
> instead. Furthermore, that lookahead logic is updated to contain its own
> copy of the dispatch jump rather than merging control flow with the
> lookahead-failed case, to improve branch prediction.
>
> A bunch of constants use bytecode size in bytes as a proxy for the size
> or complexity of a function, and are adjusted downward proportionally to
> the decrease in generated bytecode size.
>
> Other small drive-by fix: update generate-bytecode-expectations to emit
> \n instead of \r\n on Windows.
>
> Change-Id: I6307c2b0f5794a3a1088bb0fb94f6e1615441ed5
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2641180
> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
> Cr-Commit-Position: refs/heads/master@{#72773}

TBR=rmcilroy@chromium.org,mythria@chromium.org,seth.brenith@microsoft.com

Change-Id: I0162b9400861b90bacef27cca9aebc8ab9d74c10
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2697350
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72777}

[interpreter] Short Star bytecode · cf93071c

Seth Brenith authored 3 years ago

Design doc:
https://docs.google.com/document/d/1g_NExMT78II_KnIYNa9MvyPYIj23qAiFUEsyemY5KRk/edit

This change adds 16 new interpreter opcodes, kStar0 through kStar15, so
that we can use a single byte to represent the common operation of
storing to a low-numbered register. This generally reduces the quantity
of bytecode generated on web sites by 8-9%.

In order to not degrade speed, a couple of other changes are required:

The existing lookahead logic to check for Star after certain other
bytecode handlers is updated to check for these new short Star codes
instead. Furthermore, that lookahead logic is updated to contain its own
copy of the dispatch jump rather than merging control flow with the
lookahead-failed case, to improve branch prediction.

A bunch of constants use bytecode size in bytes as a proxy for the size
or complexity of a function, and are adjusted downward proportionally to
the decrease in generated bytecode size.

Other small drive-by fix: update generate-bytecode-expectations to emit
\n instead of \r\n on Windows.
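
To illustrate the size saving, here is a minimal sketch of how a short Star form might be emitted. The opcode values and the `EmitStar` helper are invented for this example and do not reflect V8's real bytecode numbering or its bytecode writer API.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical opcode values, chosen only for this sketch.
constexpr uint8_t kStar = 0x10;   // generic form: opcode + register operand
constexpr uint8_t kStar0 = 0x20;  // kStar0..kStar15 occupy 0x20..0x2F
constexpr int kShortStarCount = 16;

// Appends a "store accumulator to register" instruction to `out`.
inline void EmitStar(std::vector<uint8_t>& out, uint8_t reg) {
  if (reg < kShortStarCount) {
    // Short form: the register index is folded into the opcode itself,
    // so the whole instruction is a single byte.
    out.push_back(static_cast<uint8_t>(kStar0 + reg));
  } else {
    // Long form: opcode followed by a one-byte register operand.
    out.push_back(kStar);
    out.push_back(reg);
  }
}
```

Stores to registers 0 through 15 take one byte instead of two in this model, which is the effect the change relies on for low-numbered registers.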

Change-Id: I6307c2b0f5794a3a1088bb0fb94f6e1615441ed5
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2641180
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
Cr-Commit-Position: refs/heads/master@{#72773}

15 Feb, 2021 1 commit

[frames] Add UnoptimizedFrame · 053d1e0d

Leszek Swirski authored 3 years ago

Add a new StackFrame class for unoptimized frames (which are either
interpreted or baseline). BaselineFrame becomes a subclass of this
rather than InterpretedFrame, and the various frame constants helpers
are similarly amended.
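
The resulting hierarchy could look roughly like the sketch below. It is heavily simplified: the real classes have intermediate bases and many more members, and the `is_unoptimized` / `is_baseline` predicates here are hypothetical helpers added only to make the relationships visible.

```cpp
// Simplified stand-ins for the frame classes named in the message.
class StackFrame {
 public:
  virtual ~StackFrame() = default;
  virtual bool is_unoptimized() const { return false; }
};

// New common base for frames that run unoptimized code.
class UnoptimizedFrame : public StackFrame {
 public:
  bool is_unoptimized() const override { return true; }
  virtual bool is_baseline() const = 0;
};

class InterpretedFrame : public UnoptimizedFrame {
 public:
  bool is_baseline() const override { return false; }
};

// Previously a subclass of InterpretedFrame; now a sibling of it.
class BaselineFrame : public UnoptimizedFrame {
 public:
  bool is_baseline() const override { return true; }
};
```

Code that only cares whether a frame is unoptimized can work against `UnoptimizedFrame`, while a `BaselineFrame` is no longer treated as a kind of `InterpretedFrame`.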

Bug: v8:11420, v8:11429
Change-Id: I87e9368aef48ef06a39476bf826f379ce1441528
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2692208
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Auto-Submit: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72743}

12 Feb, 2021 1 commit

[sparkplug] Upstream Sparkplug · c053419e

Leszek Swirski authored 3 years ago

Sparkplug is a new baseline, non-optimising second-tier compiler,
designed to fit in the compiler trade-off space between Ignition and
TurboProp/TurboFan.

Design doc:
https://docs.google.com/document/d/13c-xXmFOMcpUQNqo66XWQt3u46TsBjXrHrh4c045l-A/edit?usp=sharing

Bug: v8:11420
Change-Id: Ideb7270db3d6548eedd8337a3f596eb6f8fea6b1
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2667514
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Michael Stanton <mvstanton@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Michael Achenbach <machenbach@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72686}

29 Jan, 2021 1 commit

[turboprop] Don't tier up small functions early from Turboprop · 7cadd21e

Mythri A authored 4 years ago

We use a heuristic that tiers up small functions at the first tick to
optimize the small functions early. When tiering up from Turboprop it
isn't important to tier up these functions quite early since they are
already executing optimized code.
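
A rough sketch of the decision described above, assuming a hypothetical `ShouldTierUp` helper; the size and tick thresholds are placeholders, not V8's real constants.

```cpp
enum class Tier { kIgnition, kTurboprop };

// Placeholder thresholds for illustration only.
constexpr int kSmallFunctionBytecodeSize = 100;
constexpr int kTicksForOptimization = 3;

inline bool ShouldTierUp(Tier current, int bytecode_size, int ticks) {
  const bool is_small = bytecode_size < kSmallFunctionBytecodeSize;
  // Small functions tier up at the first tick, but only when they are
  // still interpreted. Turboprop code is already optimized, so it waits
  // for the regular tick threshold.
  if (is_small && current == Tier::kIgnition && ticks >= 1) return true;
  return ticks >= kTicksForOptimization;
}
```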

Bug: v8:9684
Change-Id: Iaa647e0e03f0b4bf9cd0da7feb1e2d0e36004bc1
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2656258
Reviewed-by: Sathya Gunasekaran <gsathya@chromium.org>
Commit-Queue: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72425}

25 Jan, 2021 1 commit

[turboprop] Delay optimizing functions that get hot slower · 502419a8

Mythri A authored 4 years ago

Functions that get hot quickly are more likely to stay hot and stable,
so optimize these functions earlier than functions that become
hot more slowly. To measure how "soon" a function gets hot, this cl
introduces a global tick that is incremented whenever a function
registers a tick. We use the difference in the global tick between the
current tick and the last tick on that function to measure how soon
the function is becoming hot. We use the last tick to account for
functions that aren't used so much at the start but become hot
in a later phase. Currently we use this heuristic only for Turboprop
tierups. It is possible to extend this to Turbofan in the
future.
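
The global-tick measurement can be sketched as below. All names (`Profiler`, `FunctionProfile`, `RegisterTick`, `GetsHotQuickly`) and the threshold value are assumptions made for the example; the message does not give the actual field names or cut-off.

```cpp
#include <cstdint>

// Hypothetical per-function state.
struct FunctionProfile {
  // Value of the global tick counter at this function's previous tick.
  uint32_t last_global_tick = 0;
};

struct Profiler {
  uint32_t global_ticks = 0;

  // Registers a tick for `fn` and returns how many global ticks elapsed
  // since its previous tick. A small gap means the function is getting
  // hot quickly.
  uint32_t RegisterTick(FunctionProfile& fn) {
    ++global_ticks;
    const uint32_t gap = global_ticks - fn.last_global_tick;
    fn.last_global_tick = global_ticks;
    return gap;
  }
};

// Assumed threshold: functions that tick again within this gap would be
// optimized earlier.
constexpr uint32_t kHotQuicklyGap = 4;

inline bool GetsHotQuickly(uint32_t gap) { return gap <= kHotQuicklyGap; }
```

Measuring against the function's last tick rather than its first means a function that is idle at startup but busy later is still recognized as getting hot quickly.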

Bug: v8:9684
Change-Id: I8ef265c03520274c68d56a9d35429531a3ba3d1d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2627850
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72281}

18 Jan, 2021 1 commit

[turboprop][runtime-profiler] Fix a bug that disabled OSR · d7f767e1

Mythri A authored 4 years ago

This cl: https://chromium-review.googlesource.com/c/v8/v8/+/2632588
introduced a bug by bailing out early when top-tier code already exists.
However, we still need to check whether the frame is still interpreted
so that we can OSR. Neither the early bailout nor the DCHECK is
correct, so this cl removes both.

Bug: chromium:1167638, v8:9684
Change-Id: I5a4aa406b05b6cbb5f98b63e015298c5b45160eb
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2632696
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72128}

15 Jan, 2021 1 commit

Reland "[turboprop] Enable tierup to TurboFan with FLAG_turboprop" · 3a6920d2

Mythri A authored 4 years ago

This is a reland of e38cb757. This
was reverted as a potential culprit for a wasm failure. The
actual revert that fixed the bots is here:
https://chromium-review.googlesource.com/c/v8/v8/+/2630736.
This should be safe to reland. I verified locally that the test is
failing with or without this change.

Original change's description:
> [turboprop] Enable tierup to TurboFan with FLAG_turboprop
>
> FLAG_turboprop was used to test the turboprop compiler without any
> further tierup to TurboFan. This cl changes:
> - FLAG_turboprop to also tier up to TurboFan.
> - Introduces FLAG_turboprop_as_toptier to continue running the
>   configuration without tierup.
> - Removes FLAG_turboprop_as_midtier which is same as FLAG_turboprop.
>
> Bug: v8:9684
> Change-Id: I487bda13d226434837770ecc43b3ced7c31ccf19
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2622214
> Commit-Queue: Mythri Alle <mythria@chromium.org>
> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#72101}

Bug: v8:9684
Change-Id: I8b61fd8e562190c3c7bf5a003273f2a058542dad
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2632588
Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72110}

14 Jan, 2021 2 commits

Revert "[turboprop] Enable tierup to TurboFan with FLAG_turboprop" · 053b3c44

Francis McCabe authored 4 years ago

This reverts commit e38cb757.

Reason for revert: Test failing: https://logs.chromium.org/logs/v8/buildbucket/cr-buildbucket.appspot.com/8858103866497469056/+/steps/Check/0/logs/tier-down-to-liftoff/0

Original change's description:
> [turboprop] Enable tierup to TurboFan with FLAG_turboprop
>
> FLAG_turboprop was used to test the turboprop compiler without any
> further tierup to TurboFan. This cl changes:
> - FLAG_turboprop to also tier up to TurboFan.
> - Introduces FLAG_turboprop_as_toptier to continue running the
>   configuration without tierup.
> - Removes FLAG_turboprop_as_midtier which is same as FLAG_turboprop.
>
> Bug: v8:9684
> Change-Id: I487bda13d226434837770ecc43b3ced7c31ccf19
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2622214
> Commit-Queue: Mythri Alle <mythria@chromium.org>
> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#72101}

TBR=rmcilroy@chromium.org,mythria@chromium.org,jgruber@chromium.org

Change-Id: Ic3e87c311fba001460e4f1561a2e5f74391a06a7
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:9684
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2630526
Reviewed-by: Francis McCabe <fgm@chromium.org>
Commit-Queue: Francis McCabe <fgm@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72102}

[turboprop] Enable tierup to TurboFan with FLAG_turboprop · e38cb757

Mythri A authored 4 years ago

FLAG_turboprop was used to test the turboprop compiler without any
further tierup to TurboFan. This cl changes:
- FLAG_turboprop to also tier up to TurboFan.
- Introduces FLAG_turboprop_as_toptier to continue running the
  configuration without tierup.
- Removes FLAG_turboprop_as_midtier which is same as FLAG_turboprop.

Bug: v8:9684
Change-Id: I487bda13d226434837770ecc43b3ced7c31ccf19
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2622214
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72101}

17 Dec, 2020 1 commit

[TurboFan] Templatize GetBytecodeArray · d1226086

Nico Hartmann authored 4 years ago

This CL changes SharedFunctionInfo::GetBytecodeArray to a function
template, which is specialized for Isolate and LocalIsolate arguments.
This allows main thread only uses to avoid taking a lock.
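
The pattern might look like the following sketch. The class bodies are minimal stand-ins, the `int` return value is a placeholder for the real bytecode array, and `lock_acquisitions` exists only so the example can show which path takes the lock.

```cpp
#include <mutex>

// Minimal stand-ins; the real classes are far richer.
struct Isolate {};       // main-thread isolate
struct LocalIsolate {};  // per-thread isolate used off the main thread

class SharedFunctionInfo {
 public:
  // The primary template is only declared; each supported isolate type
  // gets its own explicit specialization below.
  template <typename IsolateT>
  int GetBytecodeArray(IsolateT* isolate);

  int lock_acquisitions = 0;  // instrumentation for this sketch only

 private:
  std::mutex mutex_;
  int bytecode_ = 42;  // placeholder for the real bytecode array
};

// Main-thread callers skip the lock.
template <>
inline int SharedFunctionInfo::GetBytecodeArray<Isolate>(Isolate*) {
  return bytecode_;
}

// Callers holding a LocalIsolate take the lock before reading.
template <>
inline int SharedFunctionInfo::GetBytecodeArray<LocalIsolate>(LocalIsolate*) {
  std::lock_guard<std::mutex> guard(mutex_);
  ++lock_acquisitions;
  return bytecode_;
}
```

Because the isolate type is a template parameter, the choice between the locked and lock-free path is made at compile time from the argument the caller already has.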

Bug: v8:7790, chromium:1154603
Change-Id: I3462c4e36b66073e09393c01c765dd8a018a98f0
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2595307
Commit-Queue: Nico Hartmann <nicohartmann@chromium.org>
Reviewed-by: Michael Stanton <mvstanton@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71833}

02 Dec, 2020 1 commit

[runtime-profiler] Cleanup MarkCandidatesForOptimization* functions · 58477dc3

Mythri A authored 4 years ago

MarkCandidatesForOptimizationFromBytecode/
MarkCandidatesForOptimizationFromCode are called when bytecode budget
interrupt occurs from interpreted / optimized code. The logic in these
two functions is very similar. This cl merges this logic into one
function.

This cl also removes FLAG_frame_count which specifies the
number of frames we need to look at for tiering up on a bytecode
budget interrupt. The default value is set to 1 and in its current
form it isn't very useful.

Bug: v8:9684
Change-Id: I9f56034f2857672921673b9b68b3615765c0ccfe
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2565514
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71561}

27 Nov, 2020 1 commit

[runtime-profiler] Update profiler ticks before tiering up decisions · ec71d2b9

Mythri A authored 4 years ago

We used to update profiler ticks after tiering-up decisions when tiering
up from Ignition, but before them when tiering up from mid-tier
optimized code. This meant we added special cases to account for
this difference. This cl makes updating the ticks uniform by always
updating the ticks before tiering up decisions. Also adjusts the
heuristics to take this into account.

Bug: v8:9684
Change-Id: I2c63ba3499c542bb4a69e55d6cc4bebe4612793f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2563659
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Commit-Queue: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71457}

26 Nov, 2020 1 commit

[turboprop] Fix Turboprop to Turbofan tiering heuristics · 3de12329

Mythri A authored 4 years ago

1. Don't optimize small functions early when tiering up from ignition
to Turboprop.
2. When tiering up from Turboprop to Turbofan scale the ticks so we
optimize small functions at roughly same time as default.
3. Adjust for the fact that profiler ticks are updated before performing
the ShouldOptimize check when tiering up from TP -> TF.

Bug: v8:9684
Change-Id: I6b68eed70abb9a86f9b99eac9c0b9a1fe6346027
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2560725
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71436}

24 Nov, 2020 1 commit

[cleanup] Replace all remaining Min/Max uses with std::min/max · 3836aeb0

Georg Neis authored 4 years ago

Apart from removing Min and Max (utils.h), this is mostly a renaming.

In a few cases I had to add a cast. In a bunch of cases I had to use
initializer lists to force call-by-value for static member constants
because call-by-reference wouldn't compile (like in the previous CL).
In a few places I used initializer lists in place of nested min/max
operations.
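
For readers unfamiliar with the initializer-list trick, here is a small self-contained example. `Limits`, `ClampToMax`, and `Smallest` are hypothetical names; only the `std::min` overloads are real.

```cpp
#include <algorithm>

struct Limits {
  // Declared and initialized in-class, but never defined at namespace
  // scope.
  static const int kMaxSize = 64;
};

inline int ClampToMax(int requested) {
  // std::min(Limits::kMaxSize, requested) would bind a const reference
  // to the constant, which odr-uses it and can fail to link when no
  // out-of-class definition exists. The initializer-list overload
  // copies the values instead.
  return std::min({Limits::kMaxSize, requested});
}

inline int Smallest(int a, int b, int c) {
  // Also replaces nested calls such as std::min(a, std::min(b, c)).
  return std::min({a, b, c});
}
```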

Bug: v8:11074
Change-Id: I53a5411be6334ff41e7a8517e6b87fb46f14d086
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2545523
Commit-Queue: Georg Neis <neis@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71380}

11 Nov, 2020 1 commit

[turboprop] Adjust OSR heuristics for Turboprop · 301b354e

Mythri A authored 4 years ago

Turboprop should tierup to OSR roughly at the same time as TurboFan,
so we wait for kProfilerTicksForTurboPropOSR ticks before OSRing. This
value was incorrect because we OSR after 4 ticks (we increment the ticks
after the tiering up decision). Also, we wait for additional ticks based
on function size. That should also adjust for the lower interrupt budget
on Turboprop.

Bug: v8:9684
Change-Id: I84c0afadd0562e598bbbe1c0cf904d7488c70261
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2532295
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71135}

05 Nov, 2020 1 commit

[turboprop] Tierup from turboprop with --turboprop-as-midtier · b022c448

Mythri A authored 4 years ago

This cl implements tiering up support from Turboprop to TurboFan behind
turboprop_as_midtier flag. More specifically:
1. Scales down the bytecode size when updating the interrupt budget in
optimized code (TP / NCI).
2. Runtime profiler tiers up from TP->TF with --turboprop-as-midtier
3. Looks for the correct code kind when looking for optimized code in
the feedback vector.
4. After servicing the optimization marker continues with mid-tier
optimized code if it exists

Bug: v8:9684
Change-Id: Iaf5783e75555c50c97901504fd122f62ff30be5c
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2480363
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70993}

02 Nov, 2020 1 commit

[turboprop] Add tiering up support for TurboProp · 804a612c

Mythri A authored 4 years ago

This cl adds support for tiering up in TurboProp. This cl makes
necessary changes to support tier up but doesn't tier up yet. More
specifically this cl:
1. Introduces a new flag for interrupt_budget_for_midtier and
updates code to use the correct interrupt_budget.
2. Introduces a flag turboprop_as_midtier and necessary support
to tier up. When this flag is enabled, we introduce checks for tierup
and updating interrupt budget.


Bug: v8:9684
Change-Id: I58785ce4b9de46488a22d3b4d0cebedac460a773
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2460822
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70915}

30 Sep, 2020 1 commit

[turboprop] Add TURBOPROP code kind · 75b8c238

Jakob Gruber authored 4 years ago

Turboprop-generated Code objects will now have the dedicated
TURBOPROP code kind instead of OPTIMIZED_FUNCTION. When possible,
the code kind is used as the source of truth instead of
FLAG_turboprop. This is the initial step towards implementing
tier-up from Turboprop to Turbofan.

Future work: Rename OPTIMIZED_FUNCTION to TURBOFAN, rename STUB to
DEOPT_ENTRIES_OR_FOR_TESTING, implement TP tier-up.

No-Try: true
Bug: v8:9684
Cq-Include-Trybots: luci.v8.try:v8_linux64_fyi_rel_ng
Change-Id: I3c9308718d7e9a2b7e6796e7ea94f17e5ff84c0a
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2424140
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Mythri Alle <mythria@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70213}

02 Sep, 2020 1 commit

Fix various typos (and add one DCHECK) · d4cf7d1f

Jakob Gruber authored 4 years ago

A random grab-bag of trivial fixes I came across while working on
another CL.

Bug: v8:8888
Change-Id: I6e46e1fe5a547854d8afbac19f7e049f1661c406
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2388113
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69675}

20 Aug, 2020 1 commit

[nci] Change testing mode to --turbo-nci-as-midtier · faed2986

Jakob Gruber authored 4 years ago

To properly test tier-up in the V8 test suite, change the test variant
previously called --turbo-nci-as-highest-tier to
--turbo-nci-as-midtier.  As a midtier (between ignition and turbofan),
all major parts of the NCI pipeline (codegen, caching inside the same
native context, tier-up) are exercised by the test suite.

Bug: v8:8888
Change-Id: Ic8ee2f3e3d72768c3869f5e0b25800dd0a5f25b7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2361462
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69501}

19 Aug, 2020 1 commit

[nci] Implement tier-up, part 2 (marking) · 1096e031

Jakob Gruber authored 4 years ago

This is part two of the implementation (part 1: heuristics in NCI code
to call the runtime profiler, part 2: heuristics in the runtime
profiler to mark the function for optimization, part 3: the final
part, recognizing and acting upon the marked function).

The runtime profiler heuristics added here remain very similar to what
we have for ignition, except that we now inspect optimized frames with
NCI code, and that we (currently) do not OSR from NCI to TF.

Bug: v8:8888
Change-Id: Ie88b0a0dcee16334cea585c771a4b505035f2291
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2358748
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69484}

11 Aug, 2020 1 commit

[js-function] Remove deprecated predicates · b3a6b586

Jakob Gruber authored 4 years ago

Updated:

IsOptimized -> HasAttachedOptimizedCode
HasOptimizedCode -> HasAvailableOptimizedCode
IsInterpreted -> ActiveTierIsIgnition

Bug: v8:8888
Change-Id: I96363622b67b53371a974f1c17cef387093f053c
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2346404
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Mythri Alle <mythria@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69326}

05 Aug, 2020 1 commit

[turboprop] Change heuristics for OSRing in TurboProp · bd9609a0

Mythri A authored 4 years ago

Change the heuristics for OSRing in TurboProp. Currently we OSR if
a function is already optimized / marked for optimization but is still
running interpreted code. Since TurboProp optimizes much earlier than
TurboFan using the same heuristics would cause us to OSR more often
than required. This cl adds an additional check on the number of ticks
to make sure the function is hot enough for OSRing.
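
A minimal sketch of the extra tick check, assuming a hypothetical `ShouldOsr` helper and a placeholder threshold; the struct fields are illustrative and do not mirror V8's actual data structures.

```cpp
// Placeholder threshold; the real value lives in the runtime profiler.
constexpr int kTicksForTurbopropOsr = 3;

struct FunctionState {
  bool marked_or_optimized;   // optimized code exists or was requested
  bool frame_is_interpreted;  // the active frame still runs bytecode
  int profiler_ticks;
};

inline bool ShouldOsr(const FunctionState& f, bool turboprop_enabled) {
  // OSR only makes sense when optimized code is wanted but the running
  // frame has not switched to it yet.
  if (!f.marked_or_optimized || !f.frame_is_interpreted) return false;
  // TurboProp optimizes early, so additionally require enough ticks to
  // show that the function is hot.
  if (turboprop_enabled) return f.profiler_ticks >= kTicksForTurbopropOsr;
  return true;
}
```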

Bug: v8:9684
Change-Id: I7a1c8229182a928fd85efb23e2d385413c5209ef
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2339098
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69252}

29 Jul, 2020 1 commit

[nci] Update interrupt budget from NCI code · 980e224a

Jakob Gruber authored 4 years ago

This is the first step towards implementing a tier-up mechanism from
NCI code to TF. We will follow the existing Ignition-to-Turbofan
mechanics, which are, roughly:

1. Track a bytecode interrupt budget.
2. When exhausted, call the runtime profiler, which increments
   profiler ticks for the top frame's function.
3. When a function should tier up, it is marked as such using the
   FeedbackVector::optimized_code_weak_or_smi slot / the
   OptimizationMarker mechanism.
4. The InterpreterEntryTrampoline checks this slot and calls into
   runtime to compile if needed.
5. The finished code is also placed into this slot, as well as
   installed on the JSFunction.
6. Again, the IET checks the slot and tail-calls the code object if it
   exists.

This CL implements step 1 for NCI code by inserting the new simplified
UpdateInterruptBudget operator at the same spots (and using the same
offsets) as Ignition. When the budget is exhausted, we call a runtime
function that currently does nothing and will be implemented in the
next CL.
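
The six steps listed above can be modelled as a tiny state machine. This is a toy sketch, not V8 code: the types, the budget and tick constants, and the function names are all invented, and compilation is reduced to flipping a flag.

```cpp
enum class Marker { kNone, kCompileOptimized };

// Toy model of the feedback vector slot and tick counter.
struct FeedbackVector {
  Marker marker = Marker::kNone;
  bool has_optimized_code = false;
  int profiler_ticks = 0;
};

constexpr int kInitialBudget = 100;  // assumed starting budget
constexpr int kTicksToTierUp = 2;    // assumed tick threshold

struct Function {
  FeedbackVector feedback;
  int interrupt_budget = kInitialBudget;
};

// Steps 1-3: charge the budget; on exhaustion, record a profiler tick
// and mark the function once it has ticked often enough.
inline void UpdateInterruptBudget(Function& f, int delta) {
  f.interrupt_budget -= delta;
  if (f.interrupt_budget >= 0) return;
  f.interrupt_budget = kInitialBudget;
  if (++f.feedback.profiler_ticks >= kTicksToTierUp) {
    f.feedback.marker = Marker::kCompileOptimized;
  }
}

// Steps 4-6: what the entry trampoline does on each call.
// Returns true if the call ends up running optimized code.
inline bool EnterFunction(Function& f) {
  if (f.feedback.has_optimized_code) return true;  // step 6
  if (f.feedback.marker == Marker::kCompileOptimized) {
    f.feedback.marker = Marker::kNone;     // step 4: compile via runtime
    f.feedback.has_optimized_code = true;  // step 5: install the result
    return true;
  }
  return false;  // keep running unoptimized code
}
```

In this model the CL described here corresponds to calling `UpdateInterruptBudget` from NCI code at the same points where Ignition would.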

Bug: v8:8888
Change-Id: I98c0f8d96f32d515218dc2a76f961d44fe281c86
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2312778
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Reviewed-by: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69124}

16 Mar, 2020 1 commit

[TurboFan] Redirect --trace-opt, --trace-deopt, --trace-osr to a file · b0bae6c7

Mythri A authored 4 years ago

With the current flow, it is difficult to get the output of
--trace-opt, --trace-deopt and --trace-osr from Android devices.
These flags log to stdout, and on Android it is hard to capture that
output with its formatting intact. This CL redirects them to a file
when --redirect-code-traces is specified.

Change-Id: I8ea1f083d0ee4577f9d70cfd2d7cb2823fd1a6c4
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2089931
Commit-Queue: Mythri Alle <mythria@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66722}

10 Jan, 2020 1 commit

Don't mark a function for optimization if feedback vector has optimized code · 4453f89c

Mythri A authored 5 years ago

If the feedback vector already contains optimized code, we don't have
to mark the closure for optimization. The optimized code will be
installed on the next execution.

Bug: chromium:1030415
Change-Id: Ifc6bbdf6f99ac835ace828fc812e89d1100622f9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1993293
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Commit-Queue: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65689}

02 Dec, 2019 1 commit

Don't try to optimize an already-optimized function · cab15c81

Georg Neis authored 5 years ago

Bug: chromium:1028208
Change-Id: I439cb5acf4487ab0e4af0dcd065f1ccb78b2e7a1
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1946351
Reviewed-by: Mythri Alle <mythria@chromium.org>
Commit-Queue: Georg Neis <neis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65282}

04 Sep, 2019 1 commit

Revert "[compiler] improve inlining heuristics: call frequency per executed bytecodes" · eb443e1f

Tobias Tebbi authored 5 years ago

This reverts commit 352a154e.

Reason for revert: https://crbug.com/999972

Original change's description:
> [compiler] improve inlining heuristics: call frequency per executed bytecodes
> 
> TLDR: Inline less, but more where it matters. ~10% decrease in Turbofan
> compile time including off-thread, while improving Octane scores by ~2%.
> 
> How things used to work:
> 
> There is a flag FLAG_min_inlining_frequency that limits inlining by
> the callsite being sufficiently frequently executed. This call frequency
> was measured relative to invocations of the parent (= the function we
> originally optimize). At the same time, the limit was very low (0.15),
> meaning we mostly relied on the total amount of inlined code
> (FLAG_max_inlined_bytecode_size_cumulative) to limit inlining.
> 
> How things work now:
> 
> Instead of measuring call frequency relative to parent invocations, we
> should have a measure that predicts how often the callsite in question
> will be executed in the future. An obvious attempt at that would be to
> measure how often the callsite was executed in absolute numbers in the
> past. But depending on how fast feedback stabilizes, it can take more
> or less time until we optimize a function. If we just take the absolute
> call frequency up to the point in time when we optimize, we would
> inline more for functions that stabilize slowly, which doesn't make
> sense. So instead, we measure absolute call count per KB of executed
> bytecodes of the parent function.
> Since inlining big functions is more expensive, this threshold is
> additionally scaled linearly with the bytecode-size of the inlinee.
> The resulting formula is:
> call_frequency >
> FLAG_min_inlining_frequency *
>   (bytecode.length() - FLAG_max_inlined_bytecode_size_small) /
>   (FLAG_max_inlined_bytecode_size - FLAG_max_inlined_bytecode_size_small)
> 
> The new threshold is chosen in a way that it effectively limits
> inlining, which allows us to increase
> FLAG_max_inlined_bytecode_size_cumulative without increasing inlining
> in general.
> 
> The reduction in compile time (x64 build) of ~10% was observed in Octane,
> ARES-6, web-tooling-benchmark, and the standalone TypeScript benchmark.
> The hope is that this will reduce CPU-time in real-world situations
> too.
> The Octane improvements come from inlining more in places where it
> matters.
> 
> Bug: v8:6682
> 
> Change-Id: I99baa17dec85b71616a3ab3414d7e055beca39a0
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1768366
> Commit-Queue: Tobias Tebbi <tebbi@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> Reviewed-by: Georg Neis <neis@chromium.org>
> Reviewed-by: Maya Lekova <mslekova@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#63449}

TBR=rmcilroy@chromium.org,neis@chromium.org,jgruber@chromium.org,tebbi@chromium.org,mslekova@chromium.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug: v8:6682 chromium:999972
Change-Id: Iffca63d4bef81afa0f66e34d35fb72f3b5baf517
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1784281
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Tobias Tebbi <tebbi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63554}

29 Aug, 2019 1 commit

[compiler] improve inlining heuristics: call frequency per executed bytecodes · 352a154e

Tobias Tebbi authored 5 years ago

TLDR: Inline less, but more where it matters. ~10% decrease in Turbofan
compile time including off-thread, while improving Octane scores by ~2%.

How things used to work:

There is a flag FLAG_min_inlining_frequency that limits inlining by
the callsite being sufficiently frequently executed. This call frequency
was measured relative to invocations of the parent (= the function we
originally optimize). At the same time, the limit was very low (0.15),
meaning we mostly relied on the total amount of inlined code
(FLAG_max_inlined_bytecode_size_cumulative) to limit inlining.

How things work now:

Instead of measuring call frequency relative to parent invocations, we
should have a measure that predicts how often the callsite in question
will be executed in the future. An obvious attempt at that would be to
measure how often the callsite was executed in absolute numbers in the
past. But depending on how fast feedback stabilizes, it can take more
or less time until we optimize a function. If we just take the absolute
call frequency up to the point in time when we optimize, we would
inline more for functions that stabilize slowly, which doesn't make
sense. So instead, we measure absolute call count per KB of executed
bytecodes of the parent function.
Since inlining big functions is more expensive, this threshold is
additionally scaled linearly with the bytecode-size of the inlinee.
The resulting formula is:
call_frequency >
FLAG_min_inlining_frequency *
  (bytecode.length() - FLAG_max_inlined_bytecode_size_small) /
  (FLAG_max_inlined_bytecode_size - FLAG_max_inlined_bytecode_size_small)

The new threshold is chosen in a way that it effectively limits
inlining, which allows us to increase
FLAG_max_inlined_bytecode_size_cumulative without increasing inlining
in general.

The reduction in compile time (x64 build) of ~10% was observed in Octane,
ARES-6, web-tooling-benchmark, and the standalone TypeScript benchmark.
The hope is that this will reduce CPU-time in real-world situations
too.
The Octane improvements come from inlining more in places where it
matters.
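
The threshold formula above can be sketched as a small standalone function. This is an illustrative model rather than the Turbofan implementation: the function and parameter names are hypothetical stand-ins for the flags named in the formula, and the numbers in the usage note are made up for easy arithmetic, not V8 defaults.

```python
def inlining_threshold(bytecode_length: int,
                       min_inlining_frequency: float,
                       max_size_small: int,
                       max_size: int) -> float:
    """Minimum call frequency required to inline, scaled by inlinee size."""
    if max_size == max_size_small:
        raise ValueError("max_size must differ from max_size_small")
    # Linear scale: 0 at the small-size limit, 1 at the regular size limit.
    scale = (bytecode_length - max_size_small) / (max_size - max_size_small)
    return min_inlining_frequency * scale


def should_inline(call_frequency: float,
                  bytecode_length: int,
                  min_inlining_frequency: float,
                  max_size_small: int,
                  max_size: int) -> bool:
    """Apply the rule `call_frequency > threshold` from the formula."""
    threshold = inlining_threshold(bytecode_length, min_inlining_frequency,
                                   max_size_small, max_size)
    return call_frequency > threshold
```

For example, with a hypothetical minimum frequency of 10.0, a small-size limit of 30 and a size limit of 500, an inlinee of 265 bytecodes sits halfway between the two limits, so its threshold is 5.0. Under the formula as written, an inlinee shorter than the small-size limit gets a negative threshold, so any non-negative call frequency passes.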

Bug: v8:6682

Change-Id: I99baa17dec85b71616a3ab3414d7e055beca39a0
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1768366
Commit-Queue: Tobias Tebbi <tebbi@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Reviewed-by: Maya Lekova <mslekova@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63449}
