[turbofan] fine grained in-block move optimization
So far, we've been moving gaps down wholesale. This change moves individual move operations instead. This improves some benchmarks and should reduce code size overall, because it raises the chance of reducing the number of moves. For example, there are improvements on x64 in Emscripten (Bullet, in particular), JetStream geomean, and Embenchen (zlib).

In the process of making this change, I noticed we can separate the tasks performed by the move optimizer, as follows:
- group gaps into one
- push gaps down, jumping over instructions (these 2 were together before)
- merge blocks (and then push gaps down)
- finalize

We can do without a finalization list. This avoids duplicating storage, since we already have the list of instructions; it also simplifies the logic, since, with this change, we may process an instruction's gap twice.

Compile time doesn't regress much (see pathological cases), but we may want to avoid the allocations of the few sets used in the new code. I'll do that in a subsequent change.

BUG=

Review URL: https://codereview.chromium.org/1634093002

Cr-Commit-Position: refs/heads/master@{#33715}
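To illustrate the difference between moving a gap wholesale and moving individual moves, here is a minimal toy model in Python. It is not the code from this change: the `Move`, `Instr`, and `split_gap` names, the register names, and the exact legality rules are assumptions made for the sketch. It only decides, per move, whether a move in the gap before an instruction could be delayed until after that instruction.

```python
from typing import NamedTuple


class Move(NamedTuple):
    """One move of a parallel-move gap: dst <- src (hypothetical model)."""
    dst: str
    src: str


class Instr(NamedTuple):
    """An instruction, reduced to the locations it reads and writes."""
    reads: frozenset
    writes: frozenset


def split_gap(gap, instr):
    """Split the gap before `instr` into (staying, pushable) moves.

    Assumed rules for this toy model: a move must stay if the instruction
    reads or writes its destination, or writes its source.
    """
    staying = [
        m for m in gap
        if m.dst in instr.reads      # instr needs the moved value
        or m.dst in instr.writes     # a delayed move would clobber instr's result
        or m.src in instr.writes     # instr overwrites the value to be copied
    ]
    pushable = [m for m in gap if m not in staying]

    # Moves in one gap execute in parallel. A delayed move must not read a
    # location that a staying move overwrites, so iterate to a fixed point.
    changed = True
    while changed:
        changed = False
        stay_dsts = {m.dst for m in staying}
        for m in list(pushable):
            if m.src in stay_dsts:
                pushable.remove(m)
                staying.append(m)
                changed = True
    return staying, pushable


# The instruction reads rax, so only the first move is pinned in place.
gap = [Move("rax", "rbx"), Move("rcx", "rdx")]
instr = Instr(reads=frozenset({"rax"}), writes=frozenset({"rsi"}))
staying, pushable = split_gap(gap, instr)
```

In this example a wholesale approach would have to keep both moves where they are, because one of them is pinned by the instruction. Deciding per move lets `rcx <- rdx` travel further down, where it has a chance of being merged with or cancelled by a later move.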