    [wasm-simd][x64] Optimize some integer widen_high ops · b145152d
    Zhi An Ng authored
    Optimize:
    - i32x4.widen_high_i16x8_s
    - i32x4.widen_high_i16x8_u
    - i16x8.widen_high_i8x16_s
    - i16x8.widen_high_i8x16_u
    
    These optimizations were suggested in http://b/175364869.
    
    The main change is to move away from palignr, which has a dependency on
    dst and whose AVX encoding is 2 bytes longer than that of punpckhqdq.
    
    The signed and unsigned variants get slightly different optimizations.
    Unsigned variants can use a punpckh* instruction with a zeroed scratch
    register, which effectively zero-extends the high half. Signed variants
    use the movhlps instruction to move the high half to the low half of
    dst, then use packed sign-extension instructions.
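
To illustrate the two strategies, here is a minimal scalar sketch in portable C++ for the i8x16 to i16x8 case. It is not the V8 code: the helper names (ModelPunpckhbw, ModelMovhlps, ModelPmovsxbw, WidenHighU, WidenHighS) are hypothetical, and the functions only model the lane semantics of the x64 instructions, assuming little-endian lane layout.

```cpp
#include <array>
#include <cstdint>

using Bytes16 = std::array<uint8_t, 16>;

// Models punpckhbw a, b: interleave the high 8 bytes of a and b.
Bytes16 ModelPunpckhbw(const Bytes16& a, const Bytes16& b) {
  Bytes16 out{};
  for (int i = 0; i < 8; ++i) {
    out[2 * i] = a[8 + i];      // low byte of each 16-bit lane
    out[2 * i + 1] = b[8 + i];  // high byte of each 16-bit lane
  }
  return out;
}

// Models movhlps dst, src: the high 8 bytes of src replace the low 8 bytes
// of dst; the high 8 bytes of dst are left unchanged.
Bytes16 ModelMovhlps(const Bytes16& dst, const Bytes16& src) {
  Bytes16 out = dst;
  for (int i = 0; i < 8; ++i) out[i] = src[8 + i];
  return out;
}

// Models pmovsxbw: sign-extend the low 8 bytes into eight 16-bit lanes.
std::array<int16_t, 8> ModelPmovsxbw(const Bytes16& v) {
  std::array<int16_t, 8> out{};
  for (int i = 0; i < 8; ++i) {
    out[i] = static_cast<int16_t>(static_cast<int8_t>(v[i]));
  }
  return out;
}

// Unsigned widen_high: interleaving with a zeroed register puts 0x00 in the
// high byte of every lane, which is exactly zero extension.
std::array<uint16_t, 8> WidenHighU(const Bytes16& src) {
  const Bytes16 zero{};  // stands in for the zeroed scratch register
  const Bytes16 mixed = ModelPunpckhbw(src, zero);
  std::array<uint16_t, 8> out{};
  for (int i = 0; i < 8; ++i) {
    out[i] = static_cast<uint16_t>(mixed[2 * i] | (mixed[2 * i + 1] << 8));
  }
  return out;
}

// Signed widen_high: move the high half down, then sign-extend it.
std::array<int16_t, 8> WidenHighS(const Bytes16& src) {
  const Bytes16 dst{};  // initial dst contents do not affect the result
  return ModelPmovsxbw(ModelMovhlps(dst, src));
}
```

For an input byte of 0xF8 in the high half, WidenHighU yields 0x00F8 while WidenHighS yields -8, which is the difference between the unsigned and signed paths.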
    
    The common fallback for these instructions is to use pshufd, which does
    not have a dependency on dst, but is 1 byte longer than the punpckh*
    instructions.
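
As a rough illustration of why pshufd can serve as the fallback, the sketch below models its lane selection in scalar C++. The helper name ModelPshufd is hypothetical, and the immediate 0xEE is an assumption chosen for illustration (it selects source dwords 2, 3, 2, 3); the commit message does not state which immediate is used. The result is written to a fresh value computed only from src, mirroring the absence of a dependency on dst.

```cpp
#include <array>
#include <cstdint>

using Dwords4 = std::array<uint32_t, 4>;

// Models pshufd dst, src, imm8: destination dword i is the source dword
// selected by bits [2i+1 : 2i] of imm8. The output depends only on src.
Dwords4 ModelPshufd(const Dwords4& src, uint8_t imm8) {
  Dwords4 out{};
  for (int i = 0; i < 4; ++i) {
    out[i] = src[(imm8 >> (2 * i)) & 3];
  }
  return out;
}
```

With imm8 = 0xEE, the high 64 bits of src land in the low 64 bits of the result, where a packed extension instruction can then widen them.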
    
    FIXED=b/175364869
    
    Change-Id: If28da2aaa8f6e39a58e63b01cc9a81bbbb294606
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2591853
    Reviewed-by: Bill Budge <bbudge@chromium.org>
    Commit-Queue: Zhi An Ng <zhin@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#71856}
code-generator-x64.cc 184 KB