- 29 Apr, 2020 1 commit
-
-
Jakob Gruber authored
This is a reland of 6a0e7224 Original change's description: > [regexp] Limit the size of inlined choice nodes > > Codegen for unicode property escapes (e.g.: /\p{L}/u) can produce huge > code objects. This effect can be further magnified through inlining, > leading to exponential code growth in the size of the pattern. > > This CL is a (fairly hacky) way to avoid exponential growth. We > recognize choice nodes with 'many' choices and disable inlining for > them. In the future we should fix this properly, either by using the > code size budget correctly, or by improving codegen for property > escapes. > > Bug: v8:10441 > Change-Id: I817f145251ec8b1b9906cc735c9e9bdb004c98ed > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2170229 > Commit-Queue: Jakob Gruber <jgruber@chromium.org> > Reviewed-by: Yang Guo <yangguo@chromium.org> > Cr-Commit-Position: refs/heads/master@{#67433} Tbr: yangguo@chromium.org Bug: v8:10441 Change-Id: I9a16cc9e8248cb46d3d16a4e2d250968cc1b7b39 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2172679Reviewed-by:
Jakob Gruber <jgruber@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#67462}
-
- 28 Apr, 2020 2 commits
-
-
Clemens Backes authored
This reverts commit 6a0e7224. Reason for revert: Fails noi18n: https://ci.chromium.org/p/v8/builders/ci/V8%20Linux%20-%20noi18n%20-%20debug/31513 Original change's description: > [regexp] Limit the size of inlined choice nodes > > Codegen for unicode property escapes (e.g.: /\p{L}/u) can produce huge > code objects. This effect can be further magnified through inlining, > leading to exponential code growth in the size of the pattern. > > This CL is a (fairly hacky) way to avoid exponential growth. We > recognize choice nodes with 'many' choices and disable inlining for > them. In the future we should fix this properly, either by using the > code size budget correctly, or by improving codegen for property > escapes. > > Bug: v8:10441 > Change-Id: I817f145251ec8b1b9906cc735c9e9bdb004c98ed > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2170229 > Commit-Queue: Jakob Gruber <jgruber@chromium.org> > Reviewed-by: Yang Guo <yangguo@chromium.org> > Cr-Commit-Position: refs/heads/master@{#67433} TBR=yangguo@chromium.org,jgruber@chromium.org Change-Id: I503b8b2be539468d86e4ec1ac13074cd1c06a5cb No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: v8:10441 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2169101Reviewed-by:
Clemens Backes <clemensb@chromium.org> Commit-Queue: Clemens Backes <clemensb@chromium.org> Cr-Commit-Position: refs/heads/master@{#67436}
-
Jakob Gruber authored
Codegen for unicode property escapes (e.g.: /\p{L}/u) can produce huge code objects. This effect can be further magnified through inlining, leading to exponential code growth in the size of the pattern. This CL is a (fairly hacky) way to avoid exponential growth. We recognize choice nodes with 'many' choices and disable inlining for them. In the future we should fix this properly, either by using the code size budget correctly, or by improving codegen for property escapes. Bug: v8:10441 Change-Id: I817f145251ec8b1b9906cc735c9e9bdb004c98ed Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2170229 Commit-Queue: Jakob Gruber <jgruber@chromium.org> Reviewed-by:
Yang Guo <yangguo@chromium.org> Cr-Commit-Position: refs/heads/master@{#67433}
-
- 31 Jul, 2019 1 commit
-
-
Seth Brenith authored
This is a reland of 4b15b984 Updates since original: fix an arithmetic overflow bug, remove an invalid DCHECK, add a unit test that would trigger that DCHECK. Original change's description: > [regexp] Better quick checks on loop entry nodes > > Like the predecessor change https://crrev.com/c/v8/v8/+/1702125 , this > change is inspired by attempting to exit earlier from generated RegExp > code, when no further matches are possible because any match would be > too long. The motivating example this time is the following expression, > which tests whether a string of Unicode playing cards has five of the > same suit in a row: > > /([🂡-🂮]{5})|([🂱-🂾]{5})|([🃁-🃎]{5})|([🃑-🃞]{5})/u > > A human reading this expression can readily see that any match requires > at least 10 characters (5 surrogate pairs), but the LoopChoiceNode for > each repeated option reports its minimum distance to the end of a match > as zero. This is correct, because the LoopChoiceNode's behavior depends > on additional state (the loop counter). However, the preceding node, a > SET_REGISTER action that initializes the loop counter, could confidently > state that it consumes at least 10 characters. Furthermore, when we try > to emit a quick check for that action, we could follow only paths from > the LoopChoiceNode that are possible based on the minimum iteration > count. This change implements both of those "could"s. > > I expect this improvement to apply pretty broadly to expressions that > use minimum repetition counts and that don't meet the criteria for > unrolling. In this particular case, I get about 12% improvement on the > overall UniPoker test, due to reducing the execution time of this > expression by 85% and the execution time of another similar expression > that checks for n-of-a-kind by 20%. > > Bug: v8:9305 > > Change-Id: I319e381743967bdf83324be75bae943fbb5dd496 > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1704941 > Commit-Queue: Seth Brenith <seth.brenith@microsoft.com> > Reviewed-by: Jakob Gruber <jgruber@chromium.org> > Cr-Commit-Position: refs/heads/master@{#62963} Bug: v8:9305 Change-Id: I992070d383009013881bf778242254c27134b650 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1726674Reviewed-by:
Jakob Gruber <jgruber@chromium.org> Commit-Queue: Seth Brenith <seth.brenith@microsoft.com> Cr-Commit-Position: refs/heads/master@{#63009}
-
- 30 Jul, 2019 1 commit
-
-
Leszek Swirski authored
This reverts commit 4b15b984. Reason for revert: UBSan failure (https://logs.chromium.org/logs/v8/buildbucket/cr-buildbucket.appspot.com/8906578530303352544/+/steps/Check/0/logs/regress-126412/0). Original change's description: > [regexp] Better quick checks on loop entry nodes > > Like the predecessor change https://crrev.com/c/v8/v8/+/1702125 , this > change is inspired by attempting to exit earlier from generated RegExp > code, when no further matches are possible because any match would be > too long. The motivating example this time is the following expression, > which tests whether a string of Unicode playing cards has five of the > same suit in a row: > > /([🂡-🂮]{5})|([🂱-🂾]{5})|([🃁-🃎]{5})|([🃑-🃞]{5})/u > > A human reading this expression can readily see that any match requires > at least 10 characters (5 surrogate pairs), but the LoopChoiceNode for > each repeated option reports its minimum distance to the end of a match > as zero. This is correct, because the LoopChoiceNode's behavior depends > on additional state (the loop counter). However, the preceding node, a > SET_REGISTER action that initializes the loop counter, could confidently > state that it consumes at least 10 characters. Furthermore, when we try > to emit a quick check for that action, we could follow only paths from > the LoopChoiceNode that are possible based on the minimum iteration > count. This change implements both of those "could"s. > > I expect this improvement to apply pretty broadly to expressions that > use minimum repetition counts and that don't meet the criteria for > unrolling. In this particular case, I get about 12% improvement on the > overall UniPoker test, due to reducing the execution time of this > expression by 85% and the execution time of another similar expression > that checks for n-of-a-kind by 20%. > > Bug: v8:9305 > > Change-Id: I319e381743967bdf83324be75bae943fbb5dd496 > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1704941 > Commit-Queue: Seth Brenith <seth.brenith@microsoft.com> > Reviewed-by: Jakob Gruber <jgruber@chromium.org> > Cr-Commit-Position: refs/heads/master@{#62963} TBR=jgruber@chromium.org,seth.brenith@microsoft.com Change-Id: Iac085b75e054fdf0d218987cfe449be1f1630545 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: v8:9305 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1725621Reviewed-by:
Leszek Swirski <leszeks@chromium.org> Commit-Queue: Leszek Swirski <leszeks@chromium.org> Cr-Commit-Position: refs/heads/master@{#62977}
-
- 29 Jul, 2019 1 commit
-
-
Seth Brenith authored
Like the predecessor change https://crrev.com/c/v8/v8/+/1702125 , this change is inspired by attempting to exit earlier from generated RegExp code, when no further matches are possible because any match would be too long. The motivating example this time is the following expression, which tests whether a string of Unicode playing cards has five of the same suit in a row: /([🂡-🂮]{5})|([🂱-🂾]{5})|([🃁-🃎]{5})|([🃑-🃞]{5})/u A human reading this expression can readily see that any match requires at least 10 characters (5 surrogate pairs), but the LoopChoiceNode for each repeated option reports its minimum distance to the end of a match as zero. This is correct, because the LoopChoiceNode's behavior depends on additional state (the loop counter). However, the preceding node, a SET_REGISTER action that initializes the loop counter, could confidently state that it consumes at least 10 characters. Furthermore, when we try to emit a quick check for that action, we could follow only paths from the LoopChoiceNode that are possible based on the minimum iteration count. This change implements both of those "could"s. I expect this improvement to apply pretty broadly to expressions that use minimum repetition counts and that don't meet the criteria for unrolling. In this particular case, I get about 12% improvement on the overall UniPoker test, due to reducing the execution time of this expression by 85% and the execution time of another similar expression that checks for n-of-a-kind by 20%. Bug: v8:9305 Change-Id: I319e381743967bdf83324be75bae943fbb5dd496 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1704941 Commit-Queue: Seth Brenith <seth.brenith@microsoft.com> Reviewed-by:
Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#62963}
-
- 25 Jul, 2019 1 commit
-
-
Seth Brenith authored
The motivating example is JetStream 2's UniPoker test, which tests whether a sorted string of Unicode playing cards contains a five-card straight using a regular expression. In the top-level generated loop for this RegExp, we see this loop exit condition: 00000350000C2067 27 83fffe cmpl rdi,0xfe 00000350000C206A 2a 0f8da8e40000 jge 00000350000D0518 <+0xe4d8> Meaning if the current position is pointing at the very last (16-bit) character, then we exit the loop. Otherwise we go on and try to find various matches starting at the current position. However, we can see in the original expression that any possible match is at least 10 characters (5 astral-plane Unicode values), so we're wasting a lot of time attempting to find matches in cases where we're too close to the end of the string for any match to succeed. This example might be a bit contrived, but I expect that an improvement in this bounds check would help a larger family of regular expressions, where the minimum match length is large relative to the string being matched and we don't meet the other necessary criteria for fast Boyer- Moore lookahead. To get the desired bounds check in this case, this patch does the following: 1. Compute accurate EatsAtLeast values for every node during the analysis phase. This could end up doing more work than the current implementation, but analysis already has to touch every node, so it seems like a cache-friendly time to compute these values. In some cases, this might be less total work than the current implementation, because the current implementation might recompute the same node multiple times. 2. When emitting a quick check, use the EatsAtLeast value from the predecessor ChoiceNode for the bounds check. This improves the UniPoker score on my machine by about 4%, because it cuts the time spent checking for straights roughly in half, and checking for straights originally accounted for about 8% of the total time. Bug: v8:9305 Change-Id: I110b190c2578f73b2263259d5aa5750e921b01be Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1702125 Commit-Queue: Seth Brenith <seth.brenith@microsoft.com> Reviewed-by:
Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#62919}
-
- 19 Jun, 2019 1 commit
-
-
Jakob Gruber authored
Bug: v8:9359 Change-Id: I237f16324ff036f2cbfb7ca97b4ac208442b06cc Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1664056 Auto-Submit: Jakob Gruber <jgruber@chromium.org> Commit-Queue: Sigurd Schneider <sigurds@chromium.org> Reviewed-by:
Sigurd Schneider <sigurds@chromium.org> Reviewed-by:
Peter Marshall <petermarshall@chromium.org> Cr-Commit-Position: refs/heads/master@{#62268}
-
- 18 Jun, 2019 1 commit
-
-
Jakob Gruber authored
Bug: v8:9359 Change-Id: I1b490c928ed884f4ad33e005699f98614be75233 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1662306 Commit-Queue: Jakob Gruber <jgruber@chromium.org> Auto-Submit: Jakob Gruber <jgruber@chromium.org> Reviewed-by:
Peter Marshall <petermarshall@chromium.org> Cr-Commit-Position: refs/heads/master@{#62242}
-
- 17 Jun, 2019 1 commit
-
-
Jakob Gruber authored
Bug: v8:9359 Change-Id: I06a4ccc53abff25237a1113774a0b17bdf861c86 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1658157Reviewed-by:
Peter Marshall <petermarshall@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#62198}
-