- 05 Sep, 2016 1 commit
-
-
mtrofin authored
has_slot_use is computed early, and we need it to determine if we need to generate SpillRanges. After splintering, however, the information may be incorrect - e.g. just the splinter may have slot uses, and not the original. BUG= Review-Url: https://codereview.chromium.org/2312523002 Cr-Commit-Position: refs/heads/master@{#39147}
-
- 22 Jan, 2016 1 commit
-
-
mtrofin authored
Revert "Revert of [turbofan] optimize spills in defered blocks (patchset #3 id:240001 of https://codereview.chromium.org/1551013002/ )" This reverts commit 7f62e122. The regressions reported in the bug below appear to be bogus, and caused by chromium:579503 BUG=chromium:579900 LOG=N Review URL: https://codereview.chromium.org/1612923003 Cr-Commit-Position: refs/heads/master@{#33454}
-
- 21 Jan, 2016 1 commit
-
-
jarin authored
Revert of [turbofan] optimize spills in defered blocks (patchset #3 id:240001 of https://codereview.chromium.org/1551013002/ ) Reason for revert: Regresses lots of benchmarks: https://crbug.com/579900 Original issue's description: > [turbofan] optimize spills in defered blocks > > Up to now, for ranges spilled in deferred blocks, we would spill every > time a range would switch from using a register to spill slots. That can > be redundant, leading to avoidable code size cost. > > This change addresses this issue, by performing the spills as early as > possible. > > BUG= > > Committed: https://crrev.com/7c54dc33855b8ac31f26b309671f9b5481a74376 > Cr-Commit-Position: refs/heads/master@{#33413} TBR=bmeurer@chromium.org,mtrofin@chromium.org # Skipping CQ checks because original CL landed less than 1 days ago. NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG= Review URL: https://codereview.chromium.org/1612013002 Cr-Commit-Position: refs/heads/master@{#33431}
-
- 20 Jan, 2016 1 commit
-
-
mtrofin authored
Up to now, for ranges spilled in deferred blocks, we would spill every time a range would switch from using a register to spill slots. That can be redundant, leading to avoidable code size cost. This change addresses this issue, by performing the spills as early as possible. BUG= Review URL: https://codereview.chromium.org/1551013002 Cr-Commit-Position: refs/heads/master@{#33413}
-
- 18 Jan, 2016 1 commit
-
-
mtrofin authored
We don't need to treat as "spills in deferred blocks" ranges that have no spill range, or that have a spill operand. Review URL: https://codereview.chromium.org/1597883002 Cr-Commit-Position: refs/heads/master@{#33349}
-
- 14 Jan, 2016 1 commit
-
-
mtrofin authored
We were sometimes losing a splintering opportunity when a range was ending at the beginning of a hot (==non-deferred) block, when giving its value to some fixed range - i.e. a fixed operand of the first instruction in that hot block. Renamed 2 APIs to better reflect what their intent is. Added self-checking when introducing moves connecting ranges, to ensure we don't spill/fill in hot blocks ranges spilled only in deferred blocks. Verified locally that these checks would have tripped in a few cases before this change. BUG= Review URL: https://codereview.chromium.org/1564583002 Cr-Commit-Position: refs/heads/master@{#33301}
-
- 10 Dec, 2015 1 commit
-
-
jarin authored
MachineType is now a class with two enum fields: - MachineRepresentation - MachineSemantic Both enums are usable on their own, and this change switches some places from using MachineType to use just MachineRepresentation. Most notably: - register allocator now uses just the representation. - Phi and Select nodes only refer to representations. Review URL: https://codereview.chromium.org/1513543003 Cr-Commit-Position: refs/heads/master@{#32738}
-
- 25 Nov, 2015 1 commit
-
-
mtrofin authored
validating that the hot path does not spill - somewhat simpler code. Cleared the scenario where a range is defined in a deferred block. The code before was slightly more complicated by not leveraging the property that these sort of ranges would be completely contained within deferred blocks. Moved "spills in deferred blocks" marking to a more appropriate location. One thing this CL achieves is correct support for scenarios where a range is spilled both on the deferred and then hot path, and the ranges concatenate. I owe better unit testing, which I will add in a subsequent CL. BUG= Review URL: https://codereview.chromium.org/1472803004 Cr-Commit-Position: refs/heads/master@{#32302}
-
- 26 Oct, 2015 1 commit
-
-
mtrofin authored
https://codereview.chromium.org/1412123009/ )" This reverts commit 5308a999. BUG=chromium:546416 LOG=N Review URL: https://codereview.chromium.org/1424653004 Cr-Commit-Position: refs/heads/master@{#31581}
-
- 23 Oct, 2015 2 commits
-
-
machenbach authored
Revert of [Turbofan] Fix perf regression (patchset #1 id:1 of https://codereview.chromium.org/1412123009/ ) Reason for revert: [sheriff] breaks benchmarks: http://build.chromium.org/p/client.v8/builders/V8%20Linux%20-%20debug/builds/4998 Original issue's description: > [Turbofan] Fix perf regression introduced by per-range change. > > When the range ends just at the gap of a non-deferred block, the last > instruction the range covers is in the predecessor. If that predecessor is > a deferred block, before this CL, we would splinter the remainder of the > range all the way to the end. That leads to inefficient codegen, because > we still want a split inside the deferred block. > > Also, opportunistically added a trace before we splinter, for better > diagnostics. > > BUG= chromium:546416 > LOG=N > > Committed: https://crrev.com/32b6e085e74a8fcf94a01d20740fe4fdede07a86 > Cr-Commit-Position: refs/heads/master@{#31529} TBR=bmeurer@chromium.org,jarin@chromium.org,mtrofin@chromium.org NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG= chromium:546416 Review URL: https://codereview.chromium.org/1412893007 Cr-Commit-Position: refs/heads/master@{#31531}
-
mtrofin authored
When the range ends just at the gap of a non-deferred block, the last instruction the range covers is in the predecessor. If that predecessor is a deferred block, before this CL, we would splinter the remainder of the range all the way to the end. That leads to inefficient codegen, because we still want a split inside the deferred block. Also, opportunistically added a trace before we splinter, for better diagnostics. BUG= chromium:546416 LOG=N Review URL: https://codereview.chromium.org/1412123009 Cr-Commit-Position: refs/heads/master@{#31529}
-
- 21 Oct, 2015 1 commit
-
-
mtrofin authored
Revert "Revert of [turbofan] Splinter into one range. (patchset #2 id:80001 of https://codereview.chromium.org/1391023007/ )" This reverts commit 23a8837f. Also added a CHECK in Merge to validate that splitting yields a different range and thus advances the algorithm. Ran stress bots successfully. Likely my earlier change in Splintering addressed the stress test scenario that was looping infinitely. BUG= Review URL: https://codereview.chromium.org/1406983004 Cr-Commit-Position: refs/heads/master@{#31430}
-
- 20 Oct, 2015 1 commit
-
-
mtrofin authored
range's internal structure, we take a range at a time and splinter based on the blocks it covers. This is no different in scenarios where a UseInterval covers non-deferred then deferred blocks. However, in scenarios where a deferred block jumps to another one, and there are no other blocks covered by the range in between, this CL will treat the two such blocks together, while the previous one would treat them separately. This matters in cases such as deoptimization blocks preceded (not necessarily consecutively) by a single instruction (jump) Merging block. Review URL: https://codereview.chromium.org/1415833002 Cr-Commit-Position: refs/heads/master@{#31422}
-
- 15 Oct, 2015 1 commit
-
-
bmeurer authored
Revert of [turbofan] Splinter into one range. (patchset #2 id:80001 of https://codereview.chromium.org/1391023007/ ) Reason for revert: Weird endless loop in TopLevelLiveRange::Merge() due to always splitting first and not making progress. See comments, unfortunately no useable repro. Original issue's description: > [turbofan] Splinter into one range. > > Before this CL, we created one live range per successive set of > deferred blocks. For scenarios with many such blocks, this creates > an upfront pressure for the register allocator to deal with many ranges. > Linear sorts ranges, which is a super-linear operation. > > The change places all deferred intervals into one range, meaning that, > at most, there will be twice as many live ranges as the original set. In > pathological cases (benchmarks/Compile/slow_nbody1.js), this change > halves the compilation time. We see some improvements elsewhere, > notably SQLite at ~4-5%. > > We may be able to avoid the subsequent merge. Its cost is the > additional ranges it may need to create. The sole reason for the merge > phase is to provide an unchanged view of the world to the subsequent > phases. With the at-most-one splinter model, we may be able to teach > the other phases about splintering - should we find perf hindrances > due to merging. > > Committed: https://crrev.com/efdcd20267870276c5824f1ccf4e171ac378f7ae > Cr-Commit-Position: refs/heads/master@{#31224} TBR=jarin@chromium.org,mtrofin@google.com,mtrofin@chromium.org NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true Review URL: https://codereview.chromium.org/1403163003 Cr-Commit-Position: refs/heads/master@{#31300}
-
- 13 Oct, 2015 1 commit
-
-
mtrofin authored
Before this CL, we created one live range per successive set of deferred blocks. For scenarios with many such blocks, this creates an upfront pressure for the register allocator to deal with many ranges. Linear sorts ranges, which is a super-linear operation. The change places all deferred intervals into one range, meaning that, at most, there will be twice as many live ranges as the original set. In pathological cases (benchmarks/Compile/slow_nbody1.js), this change halves the compilation time. We see some improvements elsewhere, notably SQLite at ~4-5%. We may be able to avoid the subsequent merge. Its cost is the additional ranges it may need to create. The sole reason for the merge phase is to provide an unchanged view of the world to the subsequent phases. With the at-most-one splinter model, we may be able to teach the other phases about splintering - should we find perf hindrances due to merging. Review URL: https://codereview.chromium.org/1391023007 Cr-Commit-Position: refs/heads/master@{#31224}
-
- 03 Sep, 2015 1 commit
-
-
mtrofin authored
While we may have consecutive deferred blocks, they may be entered from other blocks that are not predecessors of the first block. Same for the last block and exiting. So the total set of live ranges within the given set of blocks is the union of the individual in and out live sets. This is a performance bug, not a functional bug. BUG= Review URL: https://codereview.chromium.org/1322703003 Cr-Commit-Position: refs/heads/master@{#30564}
-
- 31 Aug, 2015 1 commit
-
-
mtrofin authored
Found this while working on the unit tests for split/splinter/merge. The dependency is unnecessary and hinders testability. BUG= Review URL: https://codereview.chromium.org/1325453005 Cr-Commit-Position: refs/heads/master@{#30465}
-
- 28 Aug, 2015 1 commit
-
-
mtrofin authored
This avoids a whole range traversal each time we encounter a deferred block (or a succession of them). The traversal (in the removed IsIntervalAlreadyExcluded) is unnecessary - an interval with a hole where deferred blocks are shouldn't be listed in the in/out sets of those blocks in the first place. It turns out the root cause (that appeared like we had to special case ranges with holes, as the comment described) was deferred blocks with a deoptimization call. That would place the live range in the in_set of the block, but then splitting would fail because the start and split position would be the same - this is because everywhere else, the deferred block would have at least a second instruction, other than the use - like a jump - ahead of which we'd perform the lower part of the splintering. In the usual case, this choice of a position avoids moves on the hot path (because any moves will be before the jump, but still in the deferred block). With deoptimization calls, that's not the case, there is just one instruction, the deoptimization call. So we perform the second cut of the splintering right after the block. Since there is no control flow from the deoptimization block to any functional block - the control flow goes to the exit block - the range connector won't insert moves on the hot path - although we may want to see what happens for the exit block, and maybe teach the range connector to ignore control flow appearing to come from blocks with deoptimization calls. Review URL: https://codereview.chromium.org/1323473003 Cr-Commit-Position: refs/heads/master@{#30447}
-
- 27 Aug, 2015 1 commit
-
-
mtrofin authored
Related to 1318893002 - another source of regressions in benchmarks sensitive to compile time is the splintering logic. This change addresses some, but not all, of that. In particular, there are still some places (figuring out if a range has a hole right where a deferred set of blocks is) that need another look. BUG=chromium:1318893002 LOG=n Review URL: https://codereview.chromium.org/1319843002 Cr-Commit-Position: refs/heads/master@{#30425}
-
- 26 Aug, 2015 1 commit
-
-
mtrofin authored
A TopLevelLiveRange is the live range of a virtual register. Through register allocation, it may end up being split in a succession of child live ranges, where data flow is handled through moves from predecessor to successor child. Today, the concepts of "top level" and "child" live ranges are conflated under the LiveRange class. However, a good few APIs pertain solely to TopLevelLiveRanges. This was communicated through comments or DCHECKs - but this makes for poor code comprehensibility and maintainability. For example, the worklist of the register allocator (live_ranges()) needs to only contain TopLevelLiveRanges; spill range concerns are associated only with the top range; phi-ness; certain phases in the allocation pipeline; APIs on LiveRange used for initial construction - before splitting; splintering - these are all responsibilities associated to TopLevelLiveRanges, and not child live ranges. This change separates the concepts. An effect of this change is that child live range allocation need not involve RegisterAllocationData. That's "a good thing" (lower coupling), but it has the side-effect of not having a good way to construct unique identifiers for child live ranges, relative to a given InstructionSequence. LiveRange Id are used primarily for tracing/output-ing, and debugging. I propose a 2-component identifier: a virtual register (vreg) number, uniquely identifying TopLevelLiveRanges; and a relative identifier, which uniquely identifies children of a given TopLevelLiveRange. "0" is reserved for the TopLevel range. The relative identifier does not necessarily indicate order in the child chain, which is no worse than the current state of affairs. I believe this change should make it easier to understand a trace output (because the virtual register number is readily available). I plan to formalize with a small structure the notion of live range id, and consolidate tracing around that, as part of a separate CL. (there are seemingly disparate ways to trace - printf or stream-based APIs - so this seems like an opportune change to consolidate that) Review URL: https://codereview.chromium.org/1311983002 Cr-Commit-Position: refs/heads/master@{#30370}
-
- 25 Aug, 2015 1 commit
-
-
mtrofin authored
This change encompasses what is necessary to enable stack checks in loops without suffering large regressions. Primarily, it consists of a new mechanism for dealing with deferred blocks by "splintering", rather than splitting, inside deferred blocks. My initial change was splitting along deferred block boundaries, but the regression introduced by stackchecks wasn't resolved conclusively. After investigation, it appears that just splitting ranges along cold block boundaries leads to a greater opportunity for moves on the hot path, hence the suboptimal outcome. The alternative "splinters" ranges rather than splitting them. While splitting creates 2 ranges and links them (parent-child), in contrast, splintering creates a new independent range with no parent-child relation to the original. The original range appears as if it has a liveness hole in the place of the splintered one. All thus obtained ranges are then register allocated with no change to the register allocator. The splinters (cold blocks) do not conflict with the hot path ranges, by construction. The hot path ones have less pressure to split, because we remove a source of conflicts. After allocation, we merge the splinters back to their original ranges and continue the pipeline. We leverage the previous changes made for deferred blocks (determining where to spill, for example). Review URL: https://codereview.chromium.org/1305393003 Cr-Commit-Position: refs/heads/master@{#30357}
-