Ng Zhi An authored
Instead of putting the 16 immediate bytes on the stack, we move them into a temporary register. The instruction selector then has to change to ensure that the operands are distinct from the temporary.

Tested on the two workloads given in https://github.com/zeux/wasm-simd/issues/2#issuecomment-614399004

For slow, the row "filter:" oct12 goes from ~50ms to ~27ms; the rest of the figures look about the same or slightly faster.

For optimal, the same figure goes from ~25ms to ~24ms; the rest of the figures look slightly faster.

Raw outputs are uploaded to the bug.

Bug: v8:10117
Change-Id: I7f77a3066b5e24584f1c01574aa9311f56bd7fb4
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2152853
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Reviewed-by: Bill Budge <bbudge@chromium.org>
Reviewed-by: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67190}
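
As an illustration of the idea in the commit message, here is a minimal sketch and not the actual V8 code. It assumes the 16 immediate bytes are packed into two 64-bit halves before being materialized in a temporary 128-bit register, and it models the instruction-selector constraint as a simple check that the temporary differs from both operands. The names `PackedImm128`, `PackImmediate` and `TempIsDistinct` are hypothetical and do not come from the change itself.

```cpp
#include <array>
#include <cstdint>

// Hypothetical container for a 128-bit immediate split into two halves.
struct PackedImm128 {
  uint64_t low;   // bytes 0..7, byte 0 in the least significant position
  uint64_t high;  // bytes 8..15, byte 8 in the least significant position
};

// Packs 16 immediate bytes into two little-endian 64-bit values. A code
// generator could then move these into a temporary register instead of
// pushing them onto the stack (assumption; the real sequence may differ).
PackedImm128 PackImmediate(const std::array<uint8_t, 16>& bytes) {
  PackedImm128 out{0, 0};
  for (int i = 0; i < 8; ++i) {
    out.low |= static_cast<uint64_t>(bytes[i]) << (8 * i);
    out.high |= static_cast<uint64_t>(bytes[i + 8]) << (8 * i);
  }
  return out;
}

// Models the constraint from the commit message: the temporary register
// must not alias either operand, otherwise loading the immediate would
// overwrite an input. Registers are represented by plain integer codes.
bool TempIsDistinct(int temp, int src0, int src1) {
  return temp != src0 && temp != src1;
}
```

For the bytes 0 through 15 in order, `PackImmediate` yields `low == 0x0706050403020100` and `high == 0x0F0E0D0C0B0A0908`.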
a8b789fc