- 13 May, 2022 1 commit
-
-
Clemens Backes authored
Now that we require C++17 support, we can just use the standard static_assert without message, instead of our STATIC_ASSERT macro. R=leszeks@chromium.org Bug: v8:12425 Change-Id: I1d4e39c310b533bcd3a4af33d027827e6c083afe Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3647353Reviewed-by: Leszek Swirski <leszeks@chromium.org> Reviewed-by: Hannes Payer <hpayer@chromium.org> Commit-Queue: Clemens Backes <clemensb@chromium.org> Cr-Commit-Position: refs/heads/main@{#80524}
-
- 18 Jan, 2022 2 commits
-
-
Jakob Gruber authored
Apply case-insensitive comparisons not only for the initial character, but for the entire prefix. This avoids degenerate behavior for patterns like /aaaa|AAAA|AAAA/i (i.e. generate a single 4-char prefix instead of four 1-char prefixes). Bug: v8:12472 Change-Id: Ib2b49fe73ca846a1b7ec90056cc64bdf5cf33026 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3398114Reviewed-by: Patrick Thier <pthier@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/main@{#78668}
-
Jakob Gruber authored
Recursive ToNode node generation may overflow the stack for large graphs. As a quick fix, insert periodic stack overflow checks in selected ToNode methods. As a more permanent fix, in the future we could abort gracefully (instead of crashing on a CHECK), and/or refactor into iterative node generation. Bug: v8:12472 Change-Id: Ie5fbe838c5f6a5192d7d9b44bfe6f6c76a8d26e7 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3398112Reviewed-by: Leszek Swirski <leszeks@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/main@{#78667}
-
- 01 Dec, 2021 1 commit
-
-
Jakob Gruber authored
When emitting code, character ranges must only specify ranges which the actual subject string (one- or two-byte) may contain. This was not always the case, specifically for ranges with `from <= kMaxUint8` and `to > kMaxUint8`. The reason this is so tricky: 1. not all parts of the pipeline know whether we are compiling for one- or two-byte subjects; 2. for case-insensitive regexps, an out-of-bounds CharacterRange may have an in-bounds case equivalent (e.g. /[Ÿ]/i also matches 'ÿ' == \u{ff}), which only gets added somewhere in the middle of the pipeline. Our current solution is to clamp immediately before code emission. We also keep the existing handling/dchecks of the 0x10ffff marker value which may occur in the two-byte subject case. Bug: v8:11069 Change-Id: Ic7b34a13a900ea2aa3df032daac9236bf5682a42 Fixed: chromium:1275096 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3306569 Commit-Queue: Jakob Gruber <jgruber@chromium.org> Reviewed-by: Leszek Swirski <leszeks@chromium.org> Cr-Commit-Position: refs/heads/main@{#78186}
-
- 15 Nov, 2021 1 commit
-
-
Ng Zhi An authored
Bug: v8:12244,v8:12245 Change-Id: I38c9a767bd17f76bbf269ad79adc6798d94753a2 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3273529Reviewed-by: Jakob Gruber <jgruber@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#77914}
-
- 09 Nov, 2021 1 commit
-
-
Ng Zhi An authored
Bug: v8:12244,v8:12245 Change-Id: I5b908f056222c57e796fb76e86ceea9a77cde77f Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3265066Reviewed-by: Jakob Gruber <jgruber@chromium.org> Commit-Queue: Zhi An Ng <zhin@chromium.org> Cr-Commit-Position: refs/heads/main@{#77782}
-
- 03 Nov, 2021 1 commit
-
-
Jakob Gruber authored
Unfortunately, CharacterRanges may use 0x10ffff as a marker value signifying 'highest possible code unit' irrespective of whether the regexp instance has the unicode flag or not. This value makes it through RegExpCharacterClass::ToNode unmodified (since no surrogate desugaring takes place without /u). Correctly mask out the 0xffff value for purposes of building our uint16_t range array. Note: It'd be better to never introduce 0x10ffff in the first place, but given the irregexp pipeline's lack of hackability I hesitate to change this - we are sure to rely on it implicitly in other spots. Drive-by: Refactors. Fixed: chromium:1264508 Bug: v8:11069 Change-Id: Ib3c5780e91f682f1a6d15f26eb4cf03636d93c25 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3256549 Commit-Queue: Jakob Gruber <jgruber@chromium.org> Reviewed-by: Mathias Bynens <mathias@chromium.org> Cr-Commit-Position: refs/heads/main@{#77673}
-
- 19 Oct, 2021 1 commit
-
-
Jakob Gruber authored
Large character classes may easily be created when unicode properties (e.g.: /\p{L}/u and /\P{L}/u) are used - these are expanded internally into character classes that consist of hundreds of character ranges. Previously to this CL, we'd emit branching code for each of these ranges, leading to very large regexp code objects. This CL adds a new codegen mode for large character classes (where 'large' currently means > 16 ranges). Instead of emitting branching code inline, the ranges are written into a ByteArray and we call into the C function IsCharacterInRangeArray for the actual branching logic. The ByteArray is smaller than emitted code and is deduplicated if the same character class is matched repeatedly in the same pattern. Note this mode is *not* implemented for the interpreter, since we currently don't have a constant pool for irregexp bytecode, and thus cannot reference ByteArrays. Bug: v8:11069 Change-Id: I2d728e42d85114b796c637f791848731a104cd54 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3229377Reviewed-by: Patrick Thier <pthier@chromium.org> Auto-Submit: Jakob Gruber <jgruber@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/main@{#77463}
-
- 14 Oct, 2021 1 commit
-
-
Jakob Gruber authored
- Anonymous namespaces instead of static functions. - Comments. - Reserve enough space in the range ZoneList. Change-Id: Ie79fda770974796cd590a155dc5fd504472e5bc9 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3220341 Auto-Submit: Jakob Gruber <jgruber@chromium.org> Reviewed-by: Patrick Thier <pthier@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/main@{#77391}
-
- 12 Oct, 2021 1 commit
-
-
Jakob Gruber authored
.. instead of referring to them through magic chars {s,S,w,W,d,D,n,.,*}. Change-Id: Ib50937a2a7d4229a021377586a54be3db9ed8c1d Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3217196 Commit-Queue: Jakob Gruber <jgruber@chromium.org> Reviewed-by: Patrick Thier <pthier@chromium.org> Cr-Commit-Position: refs/heads/main@{#77337}
-
- 11 Oct, 2021 1 commit
-
-
Jakob Gruber authored
No functional changes. - Removed unused Isolate* argument from regexp extrefs. - Added const where possible. - Removed unused functions. - Shuffled declarations for better readability. - ... Change-Id: I6d9093052e8de4e33e9411541a691d0bab7b20c9 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3217193 Commit-Queue: Jakob Gruber <jgruber@chromium.org> Auto-Submit: Jakob Gruber <jgruber@chromium.org> Reviewed-by: Patrick Thier <pthier@chromium.org> Cr-Commit-Position: refs/heads/main@{#77316}
-
- 19 Aug, 2021 2 commits
-
-
Jakob Gruber authored
.. and decrease the include-ball size. Change-Id: Id35358a6882156f6684475b7f0b0193f8ca5eaf5 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3103313 Commit-Queue: Jakob Gruber <jgruber@chromium.org> Reviewed-by: Patrick Thier <pthier@chromium.org> Cr-Commit-Position: refs/heads/main@{#76386}
-
Jakob Gruber authored
The JSRegExp heap object should not be the source of truth for regexp flags, which are also relevant in places that don't need or want to care about the heap object layout (e.g.: the regexp parser). Introduce RegExpFlags as a new source of truth, and base everything else on these flags. As a first change, remove the js-regexp.h dependency from the regexp parser. Other files in src/regexp/ should be updated in follow-up work. Change-Id: Id9a6706c7f09e93f743b08b647b211d0cb0b9c76 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3103306Reviewed-by: Leszek Swirski <leszeks@chromium.org> Reviewed-by: Patrick Thier <pthier@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/main@{#76379}
-
- 27 Jul, 2021 1 commit
-
-
Jakob Gruber authored
The implementation came in with https://chromium-review.googlesource.com/758999. This feature was never enabled by default, is not used anywhere, and is not on any standardization path. Bug: v8:10953 Change-Id: Ia2b0a556c1fb504a4cd05bdfa9f0a9c5be608d26 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3053589Reviewed-by: Mathias Bynens <mathias@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#75934}
-
- 01 Jul, 2021 1 commit
-
-
Peter Kasting authored
There are still a few cases remaining that seem more controversial; I'll upload those separately. Bug: chromium:1066980 Change-Id: Iabbaf23f9bbe97781857c0c589f2b3db685dfdc2 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2994804 Commit-Queue: Peter Kasting <pkasting@chromium.org> Auto-Submit: Peter Kasting <pkasting@chromium.org> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org> Cr-Commit-Position: refs/heads/master@{#75494}
-
- 24 Jun, 2021 3 commits
-
-
Dan Elphick authored
This is a reland of 9701d4a4 with a small fix for some code landed in between the dry-run and submission. Original change's description: > [base] Move most of src/numbers into base > > Moves all but conversions.*, hash-seed-inl.h and math-random.* into > base, in preparation for moving the parts of conversions that don't > access HeapObjects. > > Also moves uc16 and uc32 out of commons/globals.h into base/strings.h. > > Bug: v8:11917 > Change-Id: Ife359148bb0961a63833aff40d26331454b6afb6 > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2979595 > Reviewed-by: Ross McIlroy <rmcilroy@chromium.org> > Reviewed-by: Clemens Backes <clemensb@chromium.org> > Commit-Queue: Ross McIlroy <rmcilroy@chromium.org> > Auto-Submit: Dan Elphick <delphick@chromium.org> > Cr-Commit-Position: refs/heads/master@{#75354} Bug: v8:11917 Change-Id: Ie1ec9032fe56646a7c7303185cecc70fce5694ae Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2982607Reviewed-by: Clemens Backes <clemensb@chromium.org> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org> Commit-Queue: Dan Elphick <delphick@chromium.org> Cr-Commit-Position: refs/heads/master@{#75368}
-
Nico Hartmann authored
This reverts commit 9701d4a4. Reason for revert: https://ci.chromium.org/ui/p/v8/builders/ci/V8%20Mac64/40802/overview Original change's description: > [base] Move most of src/numbers into base > > Moves all but conversions.*, hash-seed-inl.h and math-random.* into > base, in preparation for moving the parts of conversions that don't > access HeapObjects. > > Also moves uc16 and uc32 out of commons/globals.h into base/strings.h. > > Bug: v8:11917 > Change-Id: Ife359148bb0961a63833aff40d26331454b6afb6 > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2979595 > Reviewed-by: Ross McIlroy <rmcilroy@chromium.org> > Reviewed-by: Clemens Backes <clemensb@chromium.org> > Commit-Queue: Ross McIlroy <rmcilroy@chromium.org> > Auto-Submit: Dan Elphick <delphick@chromium.org> > Cr-Commit-Position: refs/heads/master@{#75354} Bug: v8:11917 Change-Id: Iacf796c95256016fa74f0a910c5bb1a86baa425a No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2982605 Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Reviewed-by: Nico Hartmann <nicohartmann@chromium.org> Commit-Queue: Nico Hartmann <nicohartmann@chromium.org> Cr-Commit-Position: refs/heads/master@{#75356}
-
Dan Elphick authored
Moves all but conversions.*, hash-seed-inl.h and math-random.* into base, in preparation for moving the parts of conversions that don't access HeapObjects. Also moves uc16 and uc32 out of commons/globals.h into base/strings.h. Bug: v8:11917 Change-Id: Ife359148bb0961a63833aff40d26331454b6afb6 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2979595Reviewed-by: Ross McIlroy <rmcilroy@chromium.org> Reviewed-by: Clemens Backes <clemensb@chromium.org> Commit-Queue: Ross McIlroy <rmcilroy@chromium.org> Auto-Submit: Dan Elphick <delphick@chromium.org> Cr-Commit-Position: refs/heads/master@{#75354}
-
- 18 Jun, 2021 1 commit
-
-
Dan Elphick authored
The adding of base:: was mostly prepared using git grep and sed: git grep -l <pattern> | grep -v base/vector.h | \ xargs sed -i 's/\b<pattern>\b/base::<pattern>/ with lots of manual clean-ups due to the resulting v8::internal::base::Vectors. #includes were fixed using: git grep -l "src/utils/vector.h" | \ axargs sed -i 's!src/utils/vector.h!src/base/vector.h!' Bug: v8:11879 Change-Id: I3e6d622987fee4478089c40539724c19735bd625 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2968412Reviewed-by: Clemens Backes <clemensb@chromium.org> Reviewed-by: Hannes Payer <hpayer@chromium.org> Commit-Queue: Dan Elphick <delphick@chromium.org> Cr-Commit-Position: refs/heads/master@{#75243}
-
- 09 Jun, 2021 1 commit
-
-
Iain Ireland authored
In issue 11290, we disabled the propagation of EAL data out of lookarounds, because it was incorrect for lookahead nodes in loops. This caused performance regressions: for example, `/^\P{Letter}+$/u` (matching only characters that are not in Unicode's Letter category) uses negative lookahead when matching lone surrogates, and became about 2x slower. I spent some time looking into fixes, and this is what I've settled on. Some background: the implementation of lookarounds in irregexp is split between positive and negative lookaheads. (Lookbehinds aren't relevant here, because backwards matches always have EAL=0.) Positive lookaheads are wrapped in BEGIN_SUBMATCH and POSITIVE_SUBMATCH_SUCCESS ActionNodes. BEGIN_SUBMATCH saves the current state. POSITIVE_SUBMATCH_SUCCESS restores the necessary state (while leaving any captures that occurred during the lookaround intact). Negative lookaheads also begin with a BEGIN_SUBMATCH node, but follow it with a NegativeLookaroundChoiceNode. This node has two successors: a lookaround node, and a continue node. It only executes the continue node if the lookaround node backtracks, which automatically restores the previous state. Negative lookarounds also can't update captures. This affects EAL calculations. It turns out that negative lookaheads are already doing the right thing: EatsAtLeastPropagator only propagates information from the continue node, ignoring the lookaround node. The same is true for quick checks (see the comment in RegExpLookaround:Builder::ForMatch). A BEGIN_SUBMATCH for a negative lookahead can simply propagate the EAL data from its successor like any other ActionNode, and everything works. Positive lookaheads are harder. I tried saving a pointer to the successor in BEGIN_SUBMATCH, but ran into problems in FillInBMInfo, because the EAL value corresponded to the nodes after the lookahead, but the analysis was still looking at the nodes inside. I fell back to a more modest approach: split BEGIN_SUBMATCH in two, and propagate EAL info for BEGIN_NEGATIVE_SUBMATCH while keeping the current behaviour for BEGIN_POSITIVE_SUBMATCH. This fixes the performance regression at hand. Two potential approaches for fixing EAL for positive lookahead are: 1. Handling positive lookahead with its own dedicated choice node, like NegativeLookaroundChoiceNode. 2. Adding an eats_at_least_inside_loop field to EatsAtLeastInfo, which is <= eats_at_least_from_possibly_start, and using that value in EatsAtLeastFromLoopEntry. Both of those approaches are more complex than I want to tackle right now, though. Bug: v8:11844 Change-Id: I2a43509c2c21194b8c18f0a587fa21c194db76c2 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2934858Reviewed-by: Jakob Gruber <jgruber@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#75031}
-
- 24 Nov, 2020 1 commit
-
-
Georg Neis authored
Apart from removing Min and Max (utils.h), this is mostly a renaming. In a few cases I had to add a cast. In a bunch of cases I had to use initializer lists to force call-by-value for static member constants because call-by-reference wouldn't compile (like in the previous CL). In a few places I used initializer lists in place of nested min/max operations. Bug: v8:11074 Change-Id: I53a5411be6334ff41e7a8517e6b87fb46f14d086 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2545523 Commit-Queue: Georg Neis <neis@chromium.org> Reviewed-by: Hannes Payer <hpayer@chromium.org> Reviewed-by: Jakob Gruber <jgruber@chromium.org> Reviewed-by: Ulan Degenbaev <ulan@chromium.org> Cr-Commit-Position: refs/heads/master@{#71380}
-
- 24 Sep, 2020 1 commit
-
-
Jakob Gruber authored
This fixes a case in which we forgot to assign flags to TextNodes created through AddBmpCharacters AddNonBmpSurrogatePairs AddLoneLeadSurrogates AddLoneTrailSurrogates functions. If these initially had a flag (e.g. case-insensitive 'i') set, that information was lost. This bug resulted in missing case folding in no_i18n builds (perhaps other things as well that just aren't covered by our test suite). Cq-Include-Trybots: luci.v8.try:v8_linux_noi18n_rel_ng Bug: v8:10131,v8:10120 Change-Id: Icef4f0dbd47971a538e07bab2f1067c383fd59c6 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2423718Reviewed-by: Leszek Swirski <leszeks@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#70106}
-
- 10 Jul, 2020 2 commits
-
-
Igor Sheludko authored
... by migrating old-style code MyObject* obj = new (zone) MyObject(...) to the new style MyObject* obj = zone->New<MyObject>(...) Bug: v8:10689 Change-Id: Icc60fdbf247ec05f9b5688b3d2d73d4fed06ea89 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2289770 Commit-Queue: Igor Sheludko <ishell@chromium.org> Reviewed-by: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#68784}
-
Igor Sheludko authored
... and introduce a bottleneck for collecting reusable zone memory statistics. Tbr: jgruber@chromium.org Bug: v8:10572 Change-Id: I418f8b495c0d89c0eb73f4e19bc4315acfadb480 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2287500Reviewed-by: Igor Sheludko <ishell@chromium.org> Reviewed-by: Toon Verwaest <verwaest@chromium.org> Commit-Queue: Igor Sheludko <ishell@chromium.org> Cr-Commit-Position: refs/heads/master@{#68776}
-
- 10 Jun, 2020 1 commit
-
-
Jakob Gruber authored
Prior to this change, uc16 was typedef'd to (unsigned) uint16_t while uc32 was typedef'd to (signed) int32_t. For consistency, and to avoid unexpected behavior around signed/unsigned comparisons, this changes uc32 to the unsigned uint32_t type. As part of this change, old-style error passing (return -1, check for negative return values) was updated to use named error values. Bug: v8:10568 Change-Id: I8524e66ee20e8738749cd34c4fe82c14e885dcb3 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2235533Reviewed-by: Leszek Swirski <leszeks@chromium.org> Reviewed-by: Clemens Backes <clemensb@chromium.org> Reviewed-by: Jakob Kummerow <jkummerow@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#68282}
-
- 29 Apr, 2020 1 commit
-
-
Jakob Gruber authored
This is a reland of 6a0e7224 Original change's description: > [regexp] Limit the size of inlined choice nodes > > Codegen for unicode property escapes (e.g.: /\p{L}/u) can produce huge > code objects. This effect can be further magnified through inlining, > leading to exponential code growth in the size of the pattern. > > This CL is a (fairly hacky) way to avoid exponential growth. We > recognize choice nodes with 'many' choices and disable inlining for > them. In the future we should fix this properly, either by using the > code size budget correctly, or by improving codegen for property > escapes. > > Bug: v8:10441 > Change-Id: I817f145251ec8b1b9906cc735c9e9bdb004c98ed > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2170229 > Commit-Queue: Jakob Gruber <jgruber@chromium.org> > Reviewed-by: Yang Guo <yangguo@chromium.org> > Cr-Commit-Position: refs/heads/master@{#67433} Tbr: yangguo@chromium.org Bug: v8:10441 Change-Id: I9a16cc9e8248cb46d3d16a4e2d250968cc1b7b39 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2172679Reviewed-by: Jakob Gruber <jgruber@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#67462}
-
- 28 Apr, 2020 2 commits
-
-
Clemens Backes authored
This reverts commit 6a0e7224. Reason for revert: Fails noi18n: https://ci.chromium.org/p/v8/builders/ci/V8%20Linux%20-%20noi18n%20-%20debug/31513 Original change's description: > [regexp] Limit the size of inlined choice nodes > > Codegen for unicode property escapes (e.g.: /\p{L}/u) can produce huge > code objects. This effect can be further magnified through inlining, > leading to exponential code growth in the size of the pattern. > > This CL is a (fairly hacky) way to avoid exponential growth. We > recognize choice nodes with 'many' choices and disable inlining for > them. In the future we should fix this properly, either by using the > code size budget correctly, or by improving codegen for property > escapes. > > Bug: v8:10441 > Change-Id: I817f145251ec8b1b9906cc735c9e9bdb004c98ed > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2170229 > Commit-Queue: Jakob Gruber <jgruber@chromium.org> > Reviewed-by: Yang Guo <yangguo@chromium.org> > Cr-Commit-Position: refs/heads/master@{#67433} TBR=yangguo@chromium.org,jgruber@chromium.org Change-Id: I503b8b2be539468d86e4ec1ac13074cd1c06a5cb No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: v8:10441 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2169101Reviewed-by: Clemens Backes <clemensb@chromium.org> Commit-Queue: Clemens Backes <clemensb@chromium.org> Cr-Commit-Position: refs/heads/master@{#67436}
-
Jakob Gruber authored
Codegen for unicode property escapes (e.g.: /\p{L}/u) can produce huge code objects. This effect can be further magnified through inlining, leading to exponential code growth in the size of the pattern. This CL is a (fairly hacky) way to avoid exponential growth. We recognize choice nodes with 'many' choices and disable inlining for them. In the future we should fix this properly, either by using the code size budget correctly, or by improving codegen for property escapes. Bug: v8:10441 Change-Id: I817f145251ec8b1b9906cc735c9e9bdb004c98ed Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2170229 Commit-Queue: Jakob Gruber <jgruber@chromium.org> Reviewed-by: Yang Guo <yangguo@chromium.org> Cr-Commit-Position: refs/heads/master@{#67433}
-
- 10 Mar, 2020 1 commit
-
-
Iain Ireland authored
Non-unicode, case-insensitive regexps (e.g. /foo/i, not foo/iu) use a case-folding algorithm that doesn't quite match the Unicode definition. There are two places in irregexp that need to do case-folding. Prior to this patch, neither of them quite matched the spec (https://tc39.es/ecma262/#sec-runtime-semantics-canonicalize-ch). This patch implements the "Canonicalize" algorithm in src/regexp/special-case.h, and uses it in the relevant places. It replaces special-case logic around upper-casing / ASCII characters with the following approach: 1. For most characters, calling UnicodeSet::closeOver on a set containing that character will produce the correct set of case-insensitive matches. 2. For a small handful of characters (like the sharp S that prompted this change), UnicodeSet::closeOver will include some characters that should be omitted. For example, although closeOver('ß') = "ßẞ", uppercase('ß') is "SS", so step 3.e means that 'ß' canonicalizes to itself, and should not match 'ẞ'. In these cases, we can skip the closeOver entirely, because it will never add an equivalent character. These characters are in the IgnoreSet. 3. For an even smaller handful of characters, UnicodeSet::closeOver will produce some characters that should be omitted, but also some characters that should be included. For example, closeOver('k') = "kKK" (lowercase k, uppercase K, U+212A KELVIN SIGN), but KELVIN SIGN should not match either of the other two (step 3.g). To handle this, we put such characters in the SpecialAddSet. In these cases, we closeOver the original character, but filter out the results that do not have the same canonical value. The computation of IgnoreSet and SpecialAddSet happens at build time, using the pre-existing gen-regexp-special-case.cc step. R=jgruber@chromium.org Bug: v8:10248 Change-Id: I00d48b180c83bb8e645cc59eda57b01eab134f0b Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2072858Reviewed-by: Frank Tang <ftang@chromium.org> Reviewed-by: Jakob Gruber <jgruber@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#66641}
-
- 31 Jul, 2019 1 commit
-
-
Seth Brenith authored
This is a reland of 4b15b984 Updates since original: fix an arithmetic overflow bug, remove an invalid DCHECK, add a unit test that would trigger that DCHECK. Original change's description: > [regexp] Better quick checks on loop entry nodes > > Like the predecessor change https://crrev.com/c/v8/v8/+/1702125 , this > change is inspired by attempting to exit earlier from generated RegExp > code, when no further matches are possible because any match would be > too long. The motivating example this time is the following expression, > which tests whether a string of Unicode playing cards has five of the > same suit in a row: > > /([🂡-🂮]{5})|([🂱-🂾]{5})|([🃁-🃎]{5})|([🃑-🃞]{5})/u > > A human reading this expression can readily see that any match requires > at least 10 characters (5 surrogate pairs), but the LoopChoiceNode for > each repeated option reports its minimum distance to the end of a match > as zero. This is correct, because the LoopChoiceNode's behavior depends > on additional state (the loop counter). However, the preceding node, a > SET_REGISTER action that initializes the loop counter, could confidently > state that it consumes at least 10 characters. Furthermore, when we try > to emit a quick check for that action, we could follow only paths from > the LoopChoiceNode that are possible based on the minimum iteration > count. This change implements both of those "could"s. > > I expect this improvement to apply pretty broadly to expressions that > use minimum repetition counts and that don't meet the criteria for > unrolling. In this particular case, I get about 12% improvement on the > overall UniPoker test, due to reducing the execution time of this > expression by 85% and the execution time of another similar expression > that checks for n-of-a-kind by 20%. > > Bug: v8:9305 > > Change-Id: I319e381743967bdf83324be75bae943fbb5dd496 > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1704941 > Commit-Queue: Seth Brenith <seth.brenith@microsoft.com> > Reviewed-by: Jakob Gruber <jgruber@chromium.org> > Cr-Commit-Position: refs/heads/master@{#62963} Bug: v8:9305 Change-Id: I992070d383009013881bf778242254c27134b650 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1726674Reviewed-by: Jakob Gruber <jgruber@chromium.org> Commit-Queue: Seth Brenith <seth.brenith@microsoft.com> Cr-Commit-Position: refs/heads/master@{#63009}
-
- 30 Jul, 2019 1 commit
-
-
Leszek Swirski authored
This reverts commit 4b15b984. Reason for revert: UBSan failure (https://logs.chromium.org/logs/v8/buildbucket/cr-buildbucket.appspot.com/8906578530303352544/+/steps/Check/0/logs/regress-126412/0). Original change's description: > [regexp] Better quick checks on loop entry nodes > > Like the predecessor change https://crrev.com/c/v8/v8/+/1702125 , this > change is inspired by attempting to exit earlier from generated RegExp > code, when no further matches are possible because any match would be > too long. The motivating example this time is the following expression, > which tests whether a string of Unicode playing cards has five of the > same suit in a row: > > /([🂡-🂮]{5})|([🂱-🂾]{5})|([🃁-🃎]{5})|([🃑-🃞]{5})/u > > A human reading this expression can readily see that any match requires > at least 10 characters (5 surrogate pairs), but the LoopChoiceNode for > each repeated option reports its minimum distance to the end of a match > as zero. This is correct, because the LoopChoiceNode's behavior depends > on additional state (the loop counter). However, the preceding node, a > SET_REGISTER action that initializes the loop counter, could confidently > state that it consumes at least 10 characters. Furthermore, when we try > to emit a quick check for that action, we could follow only paths from > the LoopChoiceNode that are possible based on the minimum iteration > count. This change implements both of those "could"s. > > I expect this improvement to apply pretty broadly to expressions that > use minimum repetition counts and that don't meet the criteria for > unrolling. In this particular case, I get about 12% improvement on the > overall UniPoker test, due to reducing the execution time of this > expression by 85% and the execution time of another similar expression > that checks for n-of-a-kind by 20%. > > Bug: v8:9305 > > Change-Id: I319e381743967bdf83324be75bae943fbb5dd496 > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1704941 > Commit-Queue: Seth Brenith <seth.brenith@microsoft.com> > Reviewed-by: Jakob Gruber <jgruber@chromium.org> > Cr-Commit-Position: refs/heads/master@{#62963} TBR=jgruber@chromium.org,seth.brenith@microsoft.com Change-Id: Iac085b75e054fdf0d218987cfe449be1f1630545 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: v8:9305 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1725621Reviewed-by: Leszek Swirski <leszeks@chromium.org> Commit-Queue: Leszek Swirski <leszeks@chromium.org> Cr-Commit-Position: refs/heads/master@{#62977}
-
- 29 Jul, 2019 1 commit
-
-
Seth Brenith authored
Like the predecessor change https://crrev.com/c/v8/v8/+/1702125 , this change is inspired by attempting to exit earlier from generated RegExp code, when no further matches are possible because any match would be too long. The motivating example this time is the following expression, which tests whether a string of Unicode playing cards has five of the same suit in a row: /([🂡-🂮]{5})|([🂱-🂾]{5})|([🃁-🃎]{5})|([🃑-🃞]{5})/u A human reading this expression can readily see that any match requires at least 10 characters (5 surrogate pairs), but the LoopChoiceNode for each repeated option reports its minimum distance to the end of a match as zero. This is correct, because the LoopChoiceNode's behavior depends on additional state (the loop counter). However, the preceding node, a SET_REGISTER action that initializes the loop counter, could confidently state that it consumes at least 10 characters. Furthermore, when we try to emit a quick check for that action, we could follow only paths from the LoopChoiceNode that are possible based on the minimum iteration count. This change implements both of those "could"s. I expect this improvement to apply pretty broadly to expressions that use minimum repetition counts and that don't meet the criteria for unrolling. In this particular case, I get about 12% improvement on the overall UniPoker test, due to reducing the execution time of this expression by 85% and the execution time of another similar expression that checks for n-of-a-kind by 20%. Bug: v8:9305 Change-Id: I319e381743967bdf83324be75bae943fbb5dd496 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1704941 Commit-Queue: Seth Brenith <seth.brenith@microsoft.com> Reviewed-by: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#62963}
-
- 02 Jul, 2019 1 commit
-
-
Frank Tang authored
This is a reland of f23f644f Fix the issue by wrap v8_executable("gen-regexp-special-case") inside if (current_toolchain == v8_generator_toolchain) { and change deps of action("run_gen-regexp-special-case") to ":gen-regexp-special-case($v8_generator_toolchain)", Original change's description: > Speed up CharacterRange::AddCaseEquivalents > > By using the lexCss("color:") to measure the performance > The change make the lexCss("color:") > x21 - x40 times faster than trunk. > x2.3 - x4.6 times faster than m74. > > Design Doc: http://shorturl.at/adfO5 > > Measured by out/x64.release/d8 reg977003.js > see reg977003.js attached to chromium:977003 > > Also see another cl of benchmark in > https://chromium-review.googlesource.com/c/v8/v8/+/1679651/ > > > Bug: chromium:977003 > Change-Id: Ie8518493d2c33df1594be1b4576bda715087b421 > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1674851 > Commit-Queue: Frank Tang <ftang@chromium.org> > Reviewed-by: Yang Guo <yangguo@chromium.org> > Cr-Commit-Position: refs/heads/master@{#62471} Bug: chromium:977003 Change-Id: Ie690810f596e9551b5765f422665c9617391bcf8 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1683706Reviewed-by: Frank Tang <ftang@chromium.org> Reviewed-by: Yang Guo <yangguo@chromium.org> Commit-Queue: Frank Tang <ftang@chromium.org> Cr-Commit-Position: refs/heads/master@{#62486}
-
- 01 Jul, 2019 2 commits
-
-
Maya Lekova authored
This reverts commit f23f644f. Reason for revert: Breaks arm debug builder - https://ci.chromium.org/p/v8/builders/ci/V8%20Arm%20-%20debug%20builder/22390 - missing file? Original change's description: > Speed up CharacterRange::AddCaseEquivalents > > By using the lexCss("color:") to measure the performance > The change make the lexCss("color:") > x21 - x40 times faster than trunk. > x2.3 - x4.6 times faster than m74. > > Design Doc: http://shorturl.at/adfO5 > > Measured by out/x64.release/d8 reg977003.js > see reg977003.js attached to chromium:977003 > > Also see another cl of benchmark in > https://chromium-review.googlesource.com/c/v8/v8/+/1679651/ > > > Bug: chromium:977003 > Change-Id: Ie8518493d2c33df1594be1b4576bda715087b421 > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1674851 > Commit-Queue: Frank Tang <ftang@chromium.org> > Reviewed-by: Yang Guo <yangguo@chromium.org> > Cr-Commit-Position: refs/heads/master@{#62471} TBR=adamk@chromium.org,jkummerow@chromium.org,yangguo@chromium.org,jshin@chromium.org,gsathya@chromium.org,ftang@chromium.org Change-Id: I780fac2cf5f4bae6846f8d5c8765cabd76637545 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: chromium:977003 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1684073Reviewed-by: Maya Lekova <mslekova@chromium.org> Commit-Queue: Maya Lekova <mslekova@chromium.org> Cr-Commit-Position: refs/heads/master@{#62472}
-
Frank Tang authored
By using the lexCss("color:") to measure the performance The change make the lexCss("color:") x21 - x40 times faster than trunk. x2.3 - x4.6 times faster than m74. Design Doc: http://shorturl.at/adfO5 Measured by out/x64.release/d8 reg977003.js see reg977003.js attached to chromium:977003 Also see another cl of benchmark in https://chromium-review.googlesource.com/c/v8/v8/+/1679651/ Bug: chromium:977003 Change-Id: Ie8518493d2c33df1594be1b4576bda715087b421 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1674851 Commit-Queue: Frank Tang <ftang@chromium.org> Reviewed-by: Yang Guo <yangguo@chromium.org> Cr-Commit-Position: refs/heads/master@{#62471}
-
- 18 Jun, 2019 2 commits
-
-
Jakob Gruber authored
This class used to be based on DispatchTable, which itself uses an interval tree to both categorize and canonicalize ranges (i.e. such that no overlap and all immediately adjacent ranges are merged). The produced ranges were then entered into lists for {bmp,lead_surrogate,trail_surrogate,non_bmp} splits. With this CL, we simplify to a plain loop over all character range kinds instead. The dispatch table (and ZoneSplayList, perhaps SplayList) can be removed in follow-ups. Bug: v8:9359 Change-Id: I9c6b72f3bc44d1557af7c74419709ae5662611f1 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1664053 Auto-Submit: Jakob Gruber <jgruber@chromium.org> Reviewed-by: Michael Lippautz <mlippautz@chromium.org> Reviewed-by: Peter Marshall <petermarshall@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#62260}
-
Jakob Gruber authored
This CL renames jsregexp.{h,cc} to regexp.{h,cc}, hides all non-public functions of RegExpImpl in the .cc file, and renames the public parts of RegExpImpl to just RegExp. Include directives from outside the src/regexp directory are limited to regexp.h, regexp-stack.h, and regexp-utils.h. We also expose all result codes that can be returned by irregexp code (including RETRY) on the public header since they are needed elsewhere, e.g. in builtins. Bug: v8:9359 Change-Id: Iae1a01ac9f6e1e4dc168f3fbe8fe8679cb6b1259 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1662297Reviewed-by: Michael Achenbach <machenbach@chromium.org> Reviewed-by: Leszek Swirski <leszeks@chromium.org> Reviewed-by: Ulan Degenbaev <ulan@chromium.org> Reviewed-by: Peter Marshall <petermarshall@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#62240}
-
- 17 Jun, 2019 2 commits
-
-
Jakob Gruber authored
The breaking change was https://chromium-review.googlesource.com/c/v8/v8/+/1658157 Bug: v8:9359 Change-Id: I6fa956631a8e475123cf6f8f44e66f2c499d47b7 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1660627 Auto-Submit: Jakob Gruber <jgruber@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Commit-Queue: Leszek Swirski <leszeks@chromium.org> Reviewed-by: Leszek Swirski <leszeks@chromium.org> Cr-Commit-Position: refs/heads/master@{#62207}
-
Jakob Gruber authored
RegExp assertions (e.g.: '^', '$', '\b', ...) sequences have certain properties that this rewriter exploits: 1. They are zero-width and order-independent, thus one can remove all duplicate assertions. 2. If a subsequence is guaranteed to fail, the entire sequence fails. Any sequence always known to fail (e.g. containing both '\b' and '\B') can be rewritten to a single node that triggers failure. This CL generalizes the previous optimization for repeated assertions to be order-independent, i.e. assertions only have to be in the same sequence but not next to each other. Bug: v8:6515, v8:6126 Change-Id: I3f92f081ce8a55ad8c34c269a09a6686e3b008f3 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1657925 Commit-Queue: Jakob Gruber <jgruber@chromium.org> Reviewed-by: Peter Marshall <petermarshall@chromium.org> Cr-Commit-Position: refs/heads/master@{#62201}
-
- 13 Jun, 2019 1 commit
-
-
Jakob Gruber authored
This is a reland of 811bfbbc Original change's description: > [regexp] Move AST-to-Node code to a dedicated file > > Prior to this CL, jsregexp contains a bunch of things that are slightly > related but would be cleaner in separate files, including: AST-to-Node > transformations, the compiler implementation, and a debugging printer. > > This CL extracts AST-to-Node transformations. > > Bug: v8:9359 > Change-Id: I030cfca5c40cfd72e3a7abe2188e4654cfe2277c > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1655303 > Auto-Submit: Jakob Gruber <jgruber@chromium.org> > Reviewed-by: Yang Guo <yangguo@chromium.org> > Commit-Queue: Jakob Gruber <jgruber@chromium.org> > Cr-Commit-Position: refs/heads/master@{#62148} Tbr: yangguo@chromium.org Bug: v8:9359 Change-Id: I68a16086dc56c9a059547033ca8bc1e9de1080db Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1658568Reviewed-by: Jakob Gruber <jgruber@chromium.org> Commit-Queue: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#62154}
-