Zhi An Ng authored
Couple of optimizations for v128.bitselect on both ia32 and x64.

1. Remove an extra movaps when AVX is supported, since we have
   3-operand instructions.
2. Tweak the algorithm from:
     xor(and(xor(src1, src2), mask), src2)
   to:
     or(and(src1, mask), andnot(src2, mask))
   It is easier to read and understand, and also eliminates a
   dependency chain (on kScratchDoubleReg) in the older algorithm.
3. Use integer forms of the logical ops. Older processors have higher
   throughput on these, compared to the floating point ops. However,
   the integer forms are 1 byte longer, so on SSE, we stick to the
   floating point ops.

For AVX, this reduces instruction count from 9948 to 9868.

Change-Id: Idd5d26b99a76255dbfa63e2c304e6af3760c4ec6
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2591859
Reviewed-by: Bill Budge <bbudge@chromium.org>
Commit-Queue: Zhi An Ng <zhin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#71845}
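
The equivalence of the two bitselect formulations in point 2 can be
checked with a small scalar sketch. This is not the code generator
change itself: it models one 64-bit chunk of the 128-bit value with
plain integer operators, and the function names BitselectXor and
BitselectAndOr are hypothetical. It also assumes andnot(src2, mask)
means "src2 AND NOT mask", which is what bitselect requires (take bits
of src1 where mask is 1, bits of src2 where mask is 0).

```cpp
#include <cassert>
#include <cstdint>

// Older formulation: xor(and(xor(src1, src2), mask), src2).
// Each step consumes the result of the previous one.
constexpr uint64_t BitselectXor(uint64_t src1, uint64_t src2,
                                uint64_t mask) {
  return ((src1 ^ src2) & mask) ^ src2;
}

// Newer formulation: or(and(src1, mask), andnot(src2, mask)).
// The two AND terms do not depend on each other.
// Assumption: andnot(src2, mask) is read as src2 & ~mask.
constexpr uint64_t BitselectAndOr(uint64_t src1, uint64_t src2,
                                  uint64_t mask) {
  return (src1 & mask) | (src2 & ~mask);
}

// All-ones and all-zeros masks select src1 and src2 respectively.
static_assert(BitselectAndOr(0x1234, 0xABCD, ~uint64_t{0}) == 0x1234,
              "all-ones mask selects src1");
static_assert(BitselectAndOr(0x1234, 0xABCD, 0) == 0xABCD,
              "all-zeros mask selects src2");
static_assert(BitselectXor(0x1234, 0xABCD, 0x00FF) ==
                  BitselectAndOr(0x1234, 0xABCD, 0x00FF),
              "both formulations agree on a mixed mask");
```

Because every operator here is bitwise, agreement on each of the eight
(src1, src2, mask) single-bit combinations implies agreement for any
width, including the full 128-bit register.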
8e9ad4f8