Commits · e567dd3ab4b5ec1c72ff91716800b0fcdcd2c1b2 · Linshizhi / V8

05 Sep, 2016 1 commit

[turbofan] Readjust has_slot_use after splintering · f1a6e5e7

mtrofin authored 8 years ago

has_slot_use is computed early, and we need it to determine if we need
to generate SpillRanges. After splintering, however, the information may
be incorrect - e.g. just the splinter may have slot uses, and not the
original.

BUG=

Review-Url: https://codereview.chromium.org/2312523002
Cr-Commit-Position: refs/heads/master@{#39147}

f1a6e5e7

22 Jan, 2016 1 commit

Revert "Revert of [turbofan] optimize spills in defered blocks (patchset #3... · 17ea41ba

mtrofin authored 8 years ago

Revert "Revert of [turbofan] optimize spills in defered blocks (patchset #3 id:240001 of https://codereview.chromium.org/1551013002/ )"

This reverts commit 7f62e122.

The regressions reported in the bug below appear to be bogus, and
caused by chromium:579503

BUG=chromium:579900
LOG=N

Review URL: https://codereview.chromium.org/1612923003

Cr-Commit-Position: refs/heads/master@{#33454}

17ea41ba

21 Jan, 2016 1 commit

Revert of [turbofan] optimize spills in defered blocks (patchset #3 id:240001... · 7f62e122

jarin authored 8 years ago

Revert of [turbofan] optimize spills in defered blocks (patchset #3 id:240001 of https://codereview.chromium.org/1551013002/ )

Reason for revert:
Regresses lots of benchmarks: https://crbug.com/579900

Original issue's description:
> [turbofan] optimize spills in defered blocks
>
> Up to now, for ranges spilled in deferred blocks, we would spill every
> time a range would switch from using a register to spill slots. That can
> be redundant, leading to avoidable code size  cost.
>
> This change addresses this issue, by performing the spills as early as
> possible.
>
> BUG=
>
> Committed: https://crrev.com/7c54dc33855b8ac31f26b309671f9b5481a74376
> Cr-Commit-Position: refs/heads/master@{#33413}

TBR=bmeurer@chromium.org,mtrofin@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=

Review URL: https://codereview.chromium.org/1612013002

Cr-Commit-Position: refs/heads/master@{#33431}

7f62e122

20 Jan, 2016 1 commit

[turbofan] optimize spills in defered blocks · 7c54dc33

mtrofin authored 8 years ago

Up to now, for ranges spilled in deferred blocks, we would spill every
time a range would switch from using a register to spill slots. That can
be redundant, leading to avoidable code size  cost.

This change addresses this issue, by performing the spills as early as
possible.

BUG=

Review URL: https://codereview.chromium.org/1551013002

Cr-Commit-Position: refs/heads/master@{#33413}

7c54dc33

18 Jan, 2016 1 commit

[turbofan] Skip spill operand ranges from spilled in deferred. · 4ca809d7

mtrofin authored 8 years ago

We don't need to treat as "spills in deferred blocks" ranges that have
no spill range, or that have a spill operand.

Review URL: https://codereview.chromium.org/1597883002

Cr-Commit-Position: refs/heads/master@{#33349}

4ca809d7

14 Jan, 2016 1 commit

[turbofan] Splinter when range ends at hot block start · 83683e92

mtrofin authored 8 years ago

We were sometimes losing a splintering opportunity when a range was
ending at the beginning of a hot (==non-deferred) block, when giving
its value to some fixed range - i.e. a fixed operand of the first instruction
in that hot block.

Renamed 2 APIs to better reflect what their intent is.

Added self-checking when introducing moves connecting ranges, to
ensure we don't spill/fill in hot blocks ranges spilled only in deferred
blocks. Verified locally that these checks would have tripped in a few
cases before this change.

BUG=

Review URL: https://codereview.chromium.org/1564583002

Cr-Commit-Position: refs/heads/master@{#33301}

83683e92

10 Dec, 2015 1 commit

[turbofan] Make MachineType a pair of enums. · bb2a830d

jarin authored 9 years ago

MachineType is now a class with two enum fields:
- MachineRepresentation
- MachineSemantic

Both enums are usable on their own, and this change switches some places from using MachineType to use just MachineRepresentation. Most notably:
- register allocator now uses just the representation.
- Phi and Select nodes only refer to representations.

Review URL: https://codereview.chromium.org/1513543003

Cr-Commit-Position: refs/heads/master@{#32738}

bb2a830d

25 Nov, 2015 1 commit

A simpler way to determine if a range spills only in deferred blocks, by · be7e4361

mtrofin authored 9 years ago

validating that the hot path does not spill - somewhat simpler code.

Cleared the scenario where a range is defined in a deferred block. The
code before was slightly more complicated by not leveraging the
property that these sort of ranges would be completely contained within
deferred blocks.

Moved "spills in deferred blocks" marking to a more appropriate
location.

One thing this CL achieves is correct support for scenarios where a
range is spilled both on the deferred and then hot path, and the ranges
concatenate. I owe better unit testing, which I will add in a subsequent
CL.

BUG=

Review URL: https://codereview.chromium.org/1472803004

Cr-Commit-Position: refs/heads/master@{#32302}

be7e4361

26 Oct, 2015 1 commit

Reland of "[Turbofan] Fix perf regression (patchset #1 id:1 of · 109f8565

mtrofin authored 9 years ago

https://codereview.chromium.org/1412123009/ )"

This reverts commit 5308a999.

BUG=chromium:546416
LOG=N

Review URL: https://codereview.chromium.org/1424653004

Cr-Commit-Position: refs/heads/master@{#31581}

109f8565

23 Oct, 2015 2 commits

Revert of [Turbofan] Fix perf regression (patchset #1 id:1 of... · 5308a999

machenbach authored 9 years ago

Revert of [Turbofan] Fix perf regression (patchset #1 id:1 of https://codereview.chromium.org/1412123009/ )

Reason for revert:
[sheriff] breaks benchmarks:
http://build.chromium.org/p/client.v8/builders/V8%20Linux%20-%20debug/builds/4998

Original issue's description:
> [Turbofan] Fix perf regression introduced by per-range change.
>
> When the range ends just at the gap of a non-deferred block, the last
> instruction the range covers is in the predecessor. If that predecessor is
> a deferred block, before this CL, we would splinter the remainder of the
> range all the way to the end. That leads to inefficient codegen, because
> we still want a split inside the deferred block.
>
> Also, opportunistically added a trace before we splinter, for better
> diagnostics.
>
> BUG= chromium:546416
> LOG=N
>
> Committed: https://crrev.com/32b6e085e74a8fcf94a01d20740fe4fdede07a86
> Cr-Commit-Position: refs/heads/master@{#31529}

TBR=bmeurer@chromium.org,jarin@chromium.org,mtrofin@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG= chromium:546416

Review URL: https://codereview.chromium.org/1412893007

Cr-Commit-Position: refs/heads/master@{#31531}

5308a999

[Turbofan] Fix perf regression introduced by per-range change. · 32b6e085

mtrofin authored 9 years ago

When the range ends just at the gap of a non-deferred block, the last
instruction the range covers is in the predecessor. If that predecessor is
a deferred block, before this CL, we would splinter the remainder of the
range all the way to the end. That leads to inefficient codegen, because
we still want a split inside the deferred block.

Also, opportunistically added a trace before we splinter, for better
diagnostics.

BUG= chromium:546416
LOG=N

Review URL: https://codereview.chromium.org/1412123009

Cr-Commit-Position: refs/heads/master@{#31529}

32b6e085

21 Oct, 2015 1 commit

[Turbofan] Re-enable single splinter. · 3e2e2062

mtrofin authored 9 years ago

Revert "Revert of [turbofan] Splinter into one range.
(patchset #2 id:80001 of https://codereview.chromium.org/1391023007/ )"

This reverts commit 23a8837f.

Also added a CHECK in Merge to validate that splitting yields a different
range and thus advances the algorithm. Ran stress bots successfully. Likely my earlier change in Splintering addressed the stress test scenario
that was looping infinitely.

BUG=

Review URL: https://codereview.chromium.org/1406983004

Cr-Commit-Position: refs/heads/master@{#31430}

3e2e2062

20 Oct, 2015 1 commit

Instead of splintering by chunks of deferred blocks, irrespective of the · 27f51390

mtrofin authored 9 years ago

range's internal structure, we take a range at a time and splinter based on
the blocks it covers. This is no different in scenarios where a UseInterval
covers non-deferred then deferred blocks. However, in scenarios where
a deferred block jumps to another one, and there are no other blocks
covered by the range in between, this CL will treat the two such blocks
together, while the previous one would treat them separately. This matters
in cases such as deoptimization blocks preceded (not necessarily
consecutively) by a single instruction (jump) Merging block.

Review URL: https://codereview.chromium.org/1415833002

Cr-Commit-Position: refs/heads/master@{#31422}

27f51390

15 Oct, 2015 1 commit

Revert of [turbofan] Splinter into one range. (patchset #2 id:80001 of... · 23a8837f

bmeurer authored 9 years ago

Revert of [turbofan] Splinter into one range. (patchset #2 id:80001 of https://codereview.chromium.org/1391023007/ )

Reason for revert:
Weird endless loop in TopLevelLiveRange::Merge() due to always splitting first and not making progress. See comments, unfortunately no useable repro.

Original issue's description:
> [turbofan] Splinter into one range.
>
> Before this CL, we created one live range per successive set of
> deferred blocks. For scenarios with many such blocks, this creates
> an upfront pressure for the register allocator to deal with many ranges.
> Linear sorts ranges, which is a super-linear operation.
>
> The change places all deferred intervals into one range, meaning that,
> at most, there will be twice as many live ranges as the original set. In
> pathological cases (benchmarks/Compile/slow_nbody1.js), this change
> halves the compilation time. We see some improvements elsewhere,
> notably SQLite at ~4-5%.
>
> We may be able to avoid the subsequent merge. Its cost is the
> additional ranges it may need to create. The sole reason for the merge
> phase is to provide an unchanged view of the world to the subsequent
> phases. With the at-most-one splinter model, we may be able to teach
> the other phases about splintering - should we find perf hindrances
> due to merging.
>
> Committed: https://crrev.com/efdcd20267870276c5824f1ccf4e171ac378f7ae
> Cr-Commit-Position: refs/heads/master@{#31224}

TBR=jarin@chromium.org,mtrofin@google.com,mtrofin@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true

Review URL: https://codereview.chromium.org/1403163003

Cr-Commit-Position: refs/heads/master@{#31300}

23a8837f

13 Oct, 2015 1 commit

[turbofan] Splinter into one range. · efdcd202

mtrofin authored 9 years ago

Before this CL, we created one live range per successive set of
deferred blocks. For scenarios with many such blocks, this creates
an upfront pressure for the register allocator to deal with many ranges.
Linear sorts ranges, which is a super-linear operation.

The change places all deferred intervals into one range, meaning that,
at most, there will be twice as many live ranges as the original set. In
pathological cases (benchmarks/Compile/slow_nbody1.js), this change
halves the compilation time. We see some improvements elsewhere,
notably SQLite at ~4-5%.

We may be able to avoid the subsequent merge. Its cost is the
additional ranges it may need to create. The sole reason for the merge
phase is to provide an unchanged view of the world to the subsequent
phases. With the at-most-one splinter model, we may be able to teach
the other phases about splintering - should we find perf hindrances
due to merging.

Review URL: https://codereview.chromium.org/1391023007

Cr-Commit-Position: refs/heads/master@{#31224}

efdcd202

03 Sep, 2015 1 commit

[turbofan] Include individual deferred block ranges in splintering. · 5fc253a8

mtrofin authored 9 years ago

While we may have consecutive deferred blocks, they may be entered
from other blocks that are not predecessors of the first block. Same
for the last block and exiting. So the total set of live ranges within
the given set of blocks is the union of the individual in and out live
sets.

This is a performance bug, not a functional bug.

BUG=

Review URL: https://codereview.chromium.org/1322703003

Cr-Commit-Position: refs/heads/master@{#30564}

5fc253a8

31 Aug, 2015 1 commit

[turbofan] Break dependency on RegisterAllocationData from Merge. · 2f27911b

mtrofin authored 9 years ago

Found this while working on the unit tests for split/splinter/merge. The
dependency is unnecessary and hinders testability.

BUG=

Review URL: https://codereview.chromium.org/1325453005

Cr-Commit-Position: refs/heads/master@{#30465}

2f27911b

28 Aug, 2015 1 commit

[turbofan] Splintering: special case deoptimizing blocks. · bed054c4

mtrofin authored 9 years ago

This avoids a whole range traversal each time we encounter a deferred
block (or a succession of them). The traversal (in the removed
IsIntervalAlreadyExcluded) is unnecessary - an interval with a hole
where deferred blocks are shouldn't be listed in the in/out sets of
those blocks in the first place.

It turns out the root cause (that appeared like we had to special
case ranges with holes, as the comment described) was deferred
blocks with a deoptimization call. That would place the live range
in the in_set of the block, but then splitting would fail because the start
and split position would be the same - this is because everywhere else,
the deferred block would have at least a second instruction, other
than the use - like a jump - ahead of which we'd perform the lower
part of the splintering. In the usual case, this choice of a position
avoids moves on the hot path (because any moves will be before the
jump, but still in the deferred block).

With deoptimization calls, that's not the case, there is just one
instruction, the deoptimization call. So we perform the second cut of
the splintering right after the block. Since there is no control flow from
the deoptimization block to any functional block - the control flow
goes to the exit block - the range connector won't insert moves on the
hot path - although we may want to see what happens for the exit
block, and maybe teach the range connector to ignore control flow
appearing to come from blocks with deoptimization calls.

Review URL: https://codereview.chromium.org/1323473003

Cr-Commit-Position: refs/heads/master@{#30447}

bed054c4

27 Aug, 2015 1 commit

[turbofan] LiveRange splintering optimizations. · 2ba2f40c

mtrofin authored 9 years ago

Related to 1318893002 - another source of regressions in
benchmarks sensitive to compile time is the splintering
logic. This change addresses some, but not all, of that. In
particular, there are still some places (figuring out if a
range has a hole right where a deferred set of blocks is)
that need another look.

BUG=chromium:1318893002
LOG=n

Review URL: https://codereview.chromium.org/1319843002

Cr-Commit-Position: refs/heads/master@{#30425}

2ba2f40c

26 Aug, 2015 1 commit

[turbofan] Separate LiveRange and TopLevelLiveRange concepts · 0ee4b473

mtrofin authored 9 years ago

A TopLevelLiveRange is the live range of a virtual register. Through
register allocation, it may end up being split in a succession of child
live ranges, where data flow is handled through moves from
predecessor to successor child.

Today, the concepts of "top level" and "child" live ranges are conflated
under the LiveRange class. However, a good few APIs pertain solely
to TopLevelLiveRanges. This was communicated through comments or
DCHECKs - but this makes for poor code comprehensibility and maintainability.

For example, the worklist of the register allocator (live_ranges()) needs
to only contain TopLevelLiveRanges; spill range concerns are associated
only with the top range; phi-ness; certain phases in the allocation pipeline;
APIs on LiveRange used for initial construction - before splitting;
splintering - these are all responsibilities associated to TopLevelLiveRanges,
and not child live ranges.

This change separates the concepts.

An effect of this change is that child live range allocation need not involve
RegisterAllocationData. That's "a good thing" (lower coupling), but it has
the side-effect of not having a good way to construct unique identifiers for
child live ranges, relative to a given InstructionSequence.

LiveRange Id are used primarily for tracing/output-ing, and debugging.

I propose a 2-component identifier: a virtual register (vreg) number,
uniquely identifying TopLevelLiveRanges; and a relative identifier, which
uniquely identifies children of a given TopLevelLiveRange. "0" is reserved
for the TopLevel range. The relative identifier does not necessarily
indicate order in the child chain, which is no worse than the current state
of affairs.

I believe this change should make it easier to understand a trace output
(because the virtual register number is readily available). I plan to formalize
with a small structure the notion of live range id, and consolidate tracing
around that, as part of a separate CL. (there are seemingly disparate ways
to trace - printf or stream-based APIs - so this seems like an opportune
change to consolidate that)

Review URL: https://codereview.chromium.org/1311983002

Cr-Commit-Position: refs/heads/master@{#30370}

0ee4b473

25 Aug, 2015 1 commit

[turbofan] Deferred blocks splintering. · 5d954d65

mtrofin authored 9 years ago

This change encompasses what is necessary to enable stack checks in loops without suffering large regressions.

Primarily, it consists of a new mechanism for dealing with deferred blocks by "splintering", rather than splitting, inside deferred blocks.

My initial change was splitting along deferred block boundaries, but the regression introduced by stackchecks wasn't resolved conclusively. After investigation, it appears that just splitting ranges along cold block boundaries leads to a greater opportunity for moves on the hot path, hence the suboptimal outcome.

The alternative "splinters" ranges rather than splitting them. While splitting creates 2 ranges and links them (parent-child), in contrast, splintering creates a new independent range with no parent-child relation to the original. The original range appears as if it has a liveness hole in the place of the splintered one. All thus obtained ranges are then register allocated with no change to the register allocator.

The splinters (cold blocks) do not conflict with the hot path ranges, by construction. The hot path ones have less pressure to split, because we remove a source of conflicts. After allocation, we merge the splinters back to their original ranges and continue the pipeline. We leverage the previous changes made for deferred blocks (determining where to spill, for example).

Review URL: https://codereview.chromium.org/1305393003

Cr-Commit-Position: refs/heads/master@{#30357}

5d954d65