    [wasm-simd][x64] Optimize f32x4 splat and extract lanes · 4068b3d2
    Ng Zhi An authored
    For splats, we can make use of vshufps to avoid a movss. Without
    AVX, specify dst to be the same as src in the instruction selector.
    
    For extract lane, we can use vshufps to extract a float into a dst xmm,
    and leave junk in the higher bits.
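
A minimal sketch of the two shuffles described above, written as a portable scalar model rather than the actual code-generator change. The names `F32x4`, `Shufps`, `SplatLane0`, and `ExtractLaneToLow` are hypothetical, and the immediates are an assumption about how such a shuffle could be encoded, not taken from the commit. It also illustrates the non-AVX constraint: SSE `shufps` has two operands, so dst doubles as the first source, whereas AVX `vshufps` takes a separate dst.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an xmm register holding four floats.
using F32x4 = std::array<float, 4>;

// Scalar model of (v)shufps dst, a, b, imm8: the two low result lanes are
// picked from a, the two high result lanes from b, two selector bits each.
F32x4 Shufps(const F32x4& a, const F32x4& b, uint8_t imm8) {
  return {a[imm8 & 3], a[(imm8 >> 2) & 3],
          b[(imm8 >> 4) & 3], b[(imm8 >> 6) & 3]};
}

// Splat: selector 0 copies lane 0 into every lane, so no separate movss
// is needed to move the scalar into dst first.
F32x4 SplatLane0(const F32x4& src) { return Shufps(src, src, 0); }

// Extract lane: the selector's low two bits move the requested lane into
// lane 0. The upper lanes hold whatever the remaining selector bits pick
// (here lane 0 of src), which the caller treats as junk.
F32x4 ExtractLaneToLow(const F32x4& src, uint8_t lane) {
  return Shufps(src, src, lane & 3);
}
```

Passing the same register as both sources is what allows dst to equal src in the non-AVX case: the single shuffle reads and overwrites one register.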
    
    On the meshopt_decoder.js benchmark in the linked bug, it removes about 7
    movss instructions that did nothing. Hardware can do register renaming,
    but let's not rely on that :)
    
    R=bbudge@chromium.org
    
    Bug: v8:10116
    Change-Id: I4d68c10536a79659de673060d537d58113308477
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2481473
    Commit-Queue: Zhi An Ng <zhin@chromium.org>
    Reviewed-by: Bill Budge <bbudge@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#70628}
code-generator-x64.cc 177 KB