- 14 Oct, 2021 1 commit
-
-
Ng Zhi An authored
Four instructions: i8x16, i16x8, i32x4, and i64x2 relaxed lane select. These instructions only guarantee results when the entire lane is set or unset, so vpblendvb will give correct results for all of them. Bug: v8:12284 Change-Id: I76959a23f2d97de8ecc3bef43d138184484e3c4d Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3207006 Reviewed-by:
Deepti Gandluri <gdeepti@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#77401}
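The reasoning in this commit message can be checked with a small scalar model. The sketch below is not V8 code: `V128`, `ByteBlend`, `LaneSelect`, `FullLaneMask`, and `BlendMatchesLaneSelect` are hypothetical names, and `ByteBlend` only models the byte-wise, top-bit-of-mask selection that a vpblendvb-style blend performs. It illustrates that when every lane of the mask is entirely set or entirely unset, a byte-wise blend agrees with a lane-wise select for 1-, 2-, 4-, and 8-byte lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V128 = std::array<uint8_t, 16>;  // toy 128-bit vector, one entry per byte

// Byte-wise blend: the top bit of each mask byte picks `if_set` over `if_unset`.
V128 ByteBlend(const V128& if_unset, const V128& if_set, const V128& mask) {
  V128 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = (mask[i] & 0x80) ? if_set[i] : if_unset[i];
  }
  return out;
}

// Lane-wise reference: bit k of `lanes_set` picks lane k (lane_bytes wide).
V128 LaneSelect(const V128& if_unset, const V128& if_set,
                std::size_t lane_bytes, uint32_t lanes_set) {
  V128 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    const bool set = ((lanes_set >> (i / lane_bytes)) & 1u) != 0;
    out[i] = set ? if_set[i] : if_unset[i];
  }
  return out;
}

// Expands lane bits into a mask whose lanes are entirely 0xFF or entirely 0x00.
V128 FullLaneMask(std::size_t lane_bytes, uint32_t lanes_set) {
  V128 mask{};
  for (std::size_t i = 0; i < mask.size(); ++i) {
    const bool set = ((lanes_set >> (i / lane_bytes)) & 1u) != 0;
    mask[i] = set ? 0xFF : 0x00;
  }
  return mask;
}

// True if the byte-wise blend and the lane-wise select agree on test data.
bool BlendMatchesLaneSelect(std::size_t lane_bytes, uint32_t lanes_set) {
  V128 a{};
  V128 b{};
  for (std::size_t i = 0; i < a.size(); ++i) {
    a[i] = static_cast<uint8_t>(i);
    b[i] = static_cast<uint8_t>(0xF0 + i);
  }
  return ByteBlend(a, b, FullLaneMask(lane_bytes, lanes_set)) ==
         LaneSelect(a, b, lane_bytes, lanes_set);
}
```

Calling `BlendMatchesLaneSelect` with lane widths 1, 2, 4, and 8 and any lane bit pattern should return true in this model. Masks with partially set lanes are outside what the commit message says is guaranteed, so the model does not cover them.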
-
- 07 Oct, 2021 1 commit
-
-
Ng Zhi An authored
x64 already had logic to enable a lower CPU extension if a higher-level one was supported. Add this to ia32, and also add SSSE3->SSE3 logic. Drive-by cleanup to remove an extra CpuFeatureScope. Bug: v8:11154 Change-Id: I12e3aa990cc07149da213911c624468a39f4e1a3 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3212811 Reviewed-by:
Deepti Gandluri <gdeepti@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#77291}
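One way to picture "enable a lower CPU extension if a higher-level one was supported" is a chain of implications over a feature bitmask. This is a minimal sketch, not V8's actual feature-detection code: the `Feature` enum, its bit values, and `AddImpliedFeatures` are hypothetical, and the chain beyond the SSSE3->SSE3 step named in the commit message is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits, ordered from lowest to highest extension.
enum Feature : uint32_t {
  kSSE3 = 1u << 0,
  kSSSE3 = 1u << 1,
  kSSE4_1 = 1u << 2,
  kSSE4_2 = 1u << 3,
  kAVX = 1u << 4,
  kAVX2 = 1u << 5,
};

// Walks from the highest extension down, so each implied bit can in turn
// imply the next lower one in a single pass.
uint32_t AddImpliedFeatures(uint32_t supported) {
  if (supported & kAVX2) supported |= kAVX;        // assumed step
  if (supported & kAVX) supported |= kSSE4_2;      // assumed step
  if (supported & kSSE4_2) supported |= kSSE4_1;   // assumed step
  if (supported & kSSE4_1) supported |= kSSSE3;    // assumed step
  if (supported & kSSSE3) supported |= kSSE3;      // SSSE3->SSE3, per the commit
  return supported;
}
```

Ordering the checks from highest to lowest is the design point: a single pass suffices because every implication feeds the check that follows it.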
-
- 21 Sep, 2021 1 commit
-
-
Ng Zhi An authored
Drive-by edit to use ASM_CODE_COMMENT for better code comments for all the more complicated macro-assembler functions. Also undef macros (AVX_OP et al.) since they are no longer used outside of shared-macro-assembler. Bug: v8:11589 Change-Id: I424f27b5b742a8efb26ccef87dbffb01eae60335 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3173892 Reviewed-by:
Adam Klein <adamk@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#76973}
-
- 20 Sep, 2021 1 commit
-
-
Ng Zhi An authored
When dst != lhs, we moved lhs to dst, but dst can be == rhs, so we would overwrite rhs, and end up comparing lhs with itself, always returning false. We handle the different aliasing cases in the macro-assembler function I64x2GtS, to simplify the checks in Liftoff a little bit. TurboFan does not need to change as it will require dst == lhs when AVX is not supported. Bug: v8:12237 Change-Id: Icefa6eb79083c003e93dbbd11ccc419aae4b15d3 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3169312 Reviewed-by:
Clemens Backes <clemensb@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#76945}
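The aliasing hazard described above can be reproduced with a toy register file. This sketch is not the I64x2GtS implementation: `Regs`, `CmpGtInPlace`, `GtNaive`, and `GtAliasSafe` are hypothetical names, each register holds a single 64-bit lane, and the explicit scratch register is an assumption about how the dst == rhs case might be handled.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Regs = std::array<int64_t, 4>;  // toy register file, one lane per register

// Two-operand, SSE-style compare: r[dst] = (r[dst] > r[src]) ? all ones : 0.
void CmpGtInPlace(Regs& r, int dst, int src) {
  r[dst] = (r[dst] > r[src]) ? -1 : 0;
}

// Buggy pattern: if dst == rhs, the move clobbers rhs before the compare,
// so lhs ends up compared with itself and the result is always false.
void GtNaive(Regs& r, int dst, int lhs, int rhs) {
  if (dst != lhs) r[dst] = r[lhs];
  CmpGtInPlace(r, dst, rhs);
}

// Alias-aware pattern: handle dst == lhs, dst == rhs, and the general case.
void GtAliasSafe(Regs& r, int dst, int lhs, int rhs, int scratch) {
  if (dst == lhs) {
    CmpGtInPlace(r, dst, rhs);
  } else if (dst == rhs) {
    r[scratch] = r[lhs];            // keep rhs intact while comparing
    CmpGtInPlace(r, scratch, rhs);
    r[dst] = r[scratch];
  } else {
    r[dst] = r[lhs];
    CmpGtInPlace(r, dst, rhs);
  }
}
```

With register 0 holding 5 (lhs) and register 1 holding 3 (rhs and dst), `GtNaive` writes 0 to register 1 even though 5 > 3, while `GtAliasSafe` writes -1.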
-
- 17 Sep, 2021 1 commit
-
-
Ng Zhi An authored
Optimize i64x2mul when AVX is supported to elide some moves. Bug: v8:11589 Change-Id: Ide0bba502a35cbb632e3fc311c9697c5f54f9d82 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3163280 Reviewed-by:
Adam Klein <adamk@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#76889}
-
- 13 Sep, 2021 1 commit
-
-
Ng Zhi An authored
We move the implementation in Liftoff (which is the most general and handles AVX/SSE and also register aliasing) into shared-macro-assembler. Also consolidate SSE/AVX for ia32. No functionality change is expected. Bug: v8:11589 Bug: v8:11217 Change-Id: I64cc71791f04332dd3505055f4672430c2daf5ac Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3131373 Reviewed-by:
Deepti Gandluri <gdeepti@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#76805}
-
- 08 Sep, 2021 1 commit
-
-
Ng Zhi An authored
Change-Id: I8afa821412ae248ddea990755404a9bf5f33184e Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3125434 Reviewed-by:
Adam Klein <adamk@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#76736}
-
- 07 Sep, 2021 1 commit
-
-
Ng Zhi An authored
Bug: v8:12094 Change-Id: Ibefce881cbfcd4445485197a4a2615bdf0599ada Fixed: v8:12094 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3123638 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Deepti Gandluri <gdeepti@chromium.org> Cr-Commit-Position: refs/heads/main@{#76706}
-
- 26 Aug, 2021 2 commits
-
-
Ng Zhi An authored
Change-Id: I65128f04c86ae5332b4fc477ce3a131552932990 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3122567 Reviewed-by:
Deepti Gandluri <gdeepti@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#76519}
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: I7c97920d8ab94408b5cde4e90e7ff1aa9bcaeeba Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3119995 Reviewed-by:
Adam Klein <adamk@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#76511}
-
- 24 Aug, 2021 2 commits
-
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: Ie51cfd6cd6315f7f14f0c584f190a478ed565b0e Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3114603 Reviewed-by:
Adam Klein <adamk@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#76475}
-
Ng Zhi An authored
We were overwriting the shift register; we should be using the tmp_shift register instead. Bug: chromium:1242689 Change-Id: I732c9c1f8a43401ce003b22893db9e39dfac3817 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3116115 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Adam Klein <adamk@chromium.org> Cr-Commit-Position: refs/heads/main@{#76466}
-
- 19 Aug, 2021 3 commits
-
-
Ng Zhi An authored
Fixed: v8:12095 Bug: v8:12095 Change-Id: If2021397000958ccdd058b99ce8f4d6e8d4d2836 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3097106 Reviewed-by:
Deepti Gandluri <gdeepti@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#76398}
-
Ng Zhi An authored
liftoff-assembler-ia32.h can now use it. TurboFan ia32 doesn't use it because it generates different instruction codes (movlps, movhps). Bug: v8:11589 Change-Id: I07540814acff2d8ea48e06d1e00023d80b276a3d Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3095009 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Deepti Gandluri <gdeepti@chromium.org> Cr-Commit-Position: refs/heads/main@{#76373}
-
Ng Zhi An authored
Move optimized implementation (accounts for AVX2) into shared-macro-assembler, and use it everywhere. Drive-by fix in liftoff-assembler-ia32.h to use Movss and Movsd macro-assembler functions so that they emit AVX when supported. Bug: v8:11589 Change-Id: Ibc4f2709d323d5b835bcac175a32b422d47d3355 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3095008 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Deepti Gandluri <gdeepti@chromium.org> Cr-Commit-Position: refs/heads/main@{#76372}
-
- 17 Aug, 2021 2 commits
-
-
Ng Zhi An authored
Change i16x8.splat to use Punpcklqdq instead of Pshufd as the final step to move low 32 bits to all lanes. Move this implementation to shared-macro-assembler and use it everywhere. Bug: v8:11589,v8:12090 Change-Id: I968b1dca5a262e4e67875caea18c5c09828cb33a Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3092558 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Deepti Gandluri <gdeepti@chromium.org> Cr-Commit-Position: refs/heads/main@{#76353}
-
Ng Zhi An authored
The optimal implementation is in TurboFan x64 codegen; move it into shared-macro-assembler, and have TurboFan ia32 and Liftoff use it. The optimal implementation accounts for AVX2 support. We add a couple of AVX2 instructions to ia32 in sse-instr.h. Not all of them are used yet, but follow-up patches will use them, so we add support (including disassembly and tests) in this change. Drive-by cleanup to test-disasm-x64.cc to merge 2 AVX2 test sections. Bug: v8:11589 Change-Id: I1c8d7deb0f8bb70b29e7a680e5dbcfb09ca5505b Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3092555 Reviewed-by:
Clemens Backes <clemensb@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#76352}
-
- 13 Aug, 2021 1 commit
-
-
Ng Zhi An authored
Use movsd/vmovsd instead of pblendw/vpblendw. It is two bytes shorter, and avoids mixing integer and floating-point domain instructions. Bug: v8:12074 Change-Id: Ia41072fbf8da7d99618a55d59634f7399a7105ce Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3088358 Reviewed-by:
Deepti Gandluri <gdeepti@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#76287}
-
- 12 Aug, 2021 3 commits
-
-
Ng Zhi An authored
Move the implementation into shared macro-assembler. TurboFan and Liftoff for both ia32 and x64 can now share the implementation. No functionality change expected. Bug: v8:11589 Change-Id: Ia1f680ba139fca627e82e7dc0a9cf1c833e483cf Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3088513 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Clemens Backes <clemensb@chromium.org> Cr-Commit-Position: refs/heads/master@{#76268}
-
Ng Zhi An authored
Move the implementation into shared macro-assembler. TurboFan and Liftoff for both ia32 and x64 can now share the implementation. No functionality change expected. Bug: v8:11589 Change-Id: I8d3567ef6e4a430fe8e007e44d5d55cf8e8a6a7a Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3088273 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Clemens Backes <clemensb@chromium.org> Cr-Commit-Position: refs/heads/master@{#76264}
-
Ng Zhi An authored
Move I32x4SConvertF32x4 into the shared implementation, which takes care of both the AVX and non-AVX cases. Instruction selector still requires dst == src to save a move in codegen. Bug: v8:11589 Change-Id: Ie982682b3002192ab27700bf73f8c1e66aeba492 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3086732 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Clemens Backes <clemensb@chromium.org> Cr-Commit-Position: refs/heads/master@{#76243}
-
- 10 Aug, 2021 1 commit
-
-
Ng Zhi An authored
Use logical shifts to emulate arithmetic shift, by first adding a bias to make all signed values unsigned, then subtracting the shifted bias. Details are in code comments for SharedTurboAssembler::I64x2ShrS. Also refactor ia32 (which was already using this algorithm) to use the shared macro-assembler function. And convert Liftoff's implementation as well. Bug: v8:12058 Change-Id: Ia1fd5fe5a9a0b7a7f31c426d4112256c8bf7021b Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3083291 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Clemens Backes <clemensb@chromium.org> Cr-Commit-Position: refs/heads/master@{#76209}
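The bias trick reads more easily in scalar form. The sketch below is not the SharedTurboAssembler::I64x2ShrS code: `ShrSViaLogical` is a hypothetical helper that applies the same idea to one 64-bit lane, using only unsigned (logical) shifts. It assumes the bias is 2^63 and that the shift count is taken modulo 64, as Wasm shifts are.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic right shift of a signed 64-bit value using only logical shifts:
//   x >>s c == ((x + 2^63) >>u c) - (2^63 >>u c)
// Adding the bias maps the signed range onto the unsigned range, so the
// logical shift is exact; subtracting the shifted bias undoes the offset.
int64_t ShrSViaLogical(int64_t value, unsigned shift) {
  shift &= 63;  // shift count modulo the lane width
  const uint64_t bias = uint64_t{1} << 63;
  const uint64_t biased = static_cast<uint64_t>(value) + bias;  // wraps mod 2^64
  const uint64_t result = (biased >> shift) - (bias >> shift);
  // Conversion back to signed is modular in C++20 and on common compilers.
  return static_cast<int64_t>(result);
}
```

For example, `ShrSViaLogical(-8, 1)` gives -4 and `ShrSViaLogical(-9, 1)` gives -5, matching an arithmetic shift that rounds toward negative infinity.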
-
- 12 May, 2021 1 commit
-
-
Ng Zhi An authored
This is a reland of 3356078a. The fix is in PS2: - fix the DCHECK to be triggered only if dst != src; the DCHECK is meant to prevent rep from being overwritten, which happens only if dst != src - fix instruction selector for f64x2.replace_lane, require SameAsFirst only for non-AVX, which makes dst == src, saving a move - on x64 we also require all registers, since the macro-assembler helper only handles registers Original change's description: > [wasm-simd][x64][ia32] Factor f64x2.replace_lane into shared code > > This pblendw/movlhps combination has lower latency and requires fewer > uops than pinsrq (1 vs. 2). > > Bug: v8:11589 > Change-Id: I770b0c20a286774afefbac5ef0adffe463318f21 > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2828871 > Reviewed-by: Bill Budge <bbudge@chromium.org> > Commit-Queue: Zhi An Ng <zhin@chromium.org> > Cr-Commit-Position: refs/heads/master@{#74049} Bug: v8:11589 Change-Id: I51cba0539d5241242dc4d7d971ede1940b9ac1fd Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2842264 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Bill Budge <bbudge@chromium.org> Cr-Commit-Position: refs/heads/master@{#74545}
-
- 10 May, 2021 2 commits
-
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: I971003a41455d9594b9b98379e7976b75718d417 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2885738 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#74490}
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: I572dcc740f9974261521e239cd37c64af3bb0d7d Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2883484 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#74488}
-
- 22 Apr, 2021 1 commit
-
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: If92ef6d8ce49831818c797909a7655db8101d154 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2842263 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#74126}
-
- 20 Apr, 2021 2 commits
-
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: I90a0c9f8325eb56c607addf1adde60673dfbc9c7 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2840688 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#74076}
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: I13c57e1dcc77345bcc9d95a14cf878db6dd60e02 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2837589 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#74073}
-
- 19 Apr, 2021 3 commits
-
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: I871ec1aecbac065e80c05309e478d814675c0d44 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2828700 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Bill Budge <bbudge@chromium.org> Cr-Commit-Position: refs/heads/master@{#74052}
-
Zhi An Ng authored
This reverts commit b824d853. Reason for revert: https://ci.chromium.org/ui/p/v8/builders/ci/V8%20Linux64%20-%20debug/36784/overview Original change's description: > [wasm-simd][x64][ia32] Factor f64x2.replace_lane into shared code > > This pblendw/movlhps combination has lower latency and requires fewer > uops than pinsrq (1 vs. 2). > > Bug: v8:11589 > Change-Id: I770b0c20a286774afefbac5ef0adffe463318f21 > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2828871 > Reviewed-by: Bill Budge <bbudge@chromium.org> > Commit-Queue: Zhi An Ng <zhin@chromium.org> > Cr-Commit-Position: refs/heads/master@{#74049} Bug: v8:11589 Change-Id: I1be96e59fdb844db1e228be3a09d4a06798a16c3 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2837805 Auto-Submit: Zhi An Ng <zhin@chromium.org> Commit-Queue: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Cr-Commit-Position: refs/heads/master@{#74050}
-
Ng Zhi An authored
This pblendw/movlhps combination has lower latency and requires fewer uops than pinsrq (1 vs. 2). Bug: v8:11589 Change-Id: I770b0c20a286774afefbac5ef0adffe463318f21 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2828871 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#74049}
-
- 15 Apr, 2021 1 commit
-
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: I6f43e6382b3441adf59dbaea58d766013cf3793b Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2826712 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#73983}
-
- 06 Apr, 2021 1 commit
-
-
Ng Zhi An authored
These functions have the same signature for both SSE and AVX versions. We move them all into SharedTurboAssembler. We need to fix up a couple of call sites, since we now use a template helper to call the right function, whereas previously it was overloaded and there were implicit conversions from int to uint8_t. Bug: v8:11589 Change-Id: I8b4146ba1fb838f6b0d6f78f6b95495b8988fc4c Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2800569 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Bill Budge <bbudge@chromium.org> Cr-Commit-Position: refs/heads/master@{#73794}
-
- 01 Apr, 2021 1 commit
-
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: I3d5c72105d682913e192bcec340f16267b5707d2 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2797543 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#73778}
-
- 31 Mar, 2021 1 commit
-
-
Ng Zhi An authored
Move the helper class and some function definitions into SharedTurboAssembler. We leave most of the other function definitions inside of macro-assembler-x64, and will move them later. Also move i16x8.extmul_high as a check that this code movement works. Bug: v8:11589 Change-Id: I8ec1fa24cb93b4c4c8bd936a9df06cbf5328374f Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2792080 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#73750}
-
- 29 Mar, 2021 1 commit
-
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: Iaabea832006e68f9506c1e191d324cee46680e20 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2791766 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#73715}
-
- 26 Mar, 2021 1 commit
-
-
Ng Zhi An authored
Also clean up some comments in liftoff-assembler-x64.h. Bug: v8:11589 Change-Id: I47fe5c2c794c863be1afde86d289ea197219a4f8 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2787591 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Bill Budge <bbudge@chromium.org> Cr-Commit-Position: refs/heads/master@{#73692}
-
- 25 Mar, 2021 3 commits
-
-
Ng Zhi An authored
Left i16x8.extmul_low in the arch-specific macro-assemblers because they rely on other functions defined in the same file. We can come back and move it afterwards. Bug: v8:11589 Change-Id: I2ea81c50ed52cc3e59e001b5e80aaf6b93a6572c Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2786280 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#73688}
-
Ng Zhi An authored
Bug: v8:11589 Change-Id: I3f1c6d1ece6c634915358f30537c9bbabc0cd3b0 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2785818 Commit-Queue: Zhi An Ng <zhin@chromium.org> Reviewed-by:
Bill Budge <bbudge@chromium.org> Cr-Commit-Position: refs/heads/master@{#73678}
-
Ng Zhi An authored
The x64 and ia32 implementations are the same, modulo function signature. x64 has kScratchDoubleReg available, while ia32 takes the scratch register as an argument. We standardize on the ia32 function signature and have callers on x64 pass in the scratch register. Bug: v8:11589 Change-Id: I2f75705ed9c618d6e7a7e34ac96b78b772c4944d Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2786094 Reviewed-by:
Bill Budge <bbudge@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/master@{#73676}
-