Jakob Gruber authored
When emitting code, character ranges must only specify ranges which the actual subject string (one- or two-byte) may contain. This was not always the case, specifically for ranges with `from <= kMaxUint8` and `to > kMaxUint8`.

The reason this is so tricky:

1. not all parts of the pipeline know whether we are compiling for one- or two-byte subjects;
2. for case-insensitive regexps, an out-of-bounds CharacterRange may have an in-bounds case equivalent (e.g. /[Ÿ]/i also matches 'ÿ' == \u{ff}), which only gets added somewhere in the middle of the pipeline.

Our current solution is to clamp immediately before code emission.

We also keep the existing handling/dchecks of the 0x10ffff marker value which may occur in the two-byte subject case.

Bug: v8:11069
Change-Id: Ic7b34a13a900ea2aa3df032daac9236bf5682a42
Fixed: chromium:1275096
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3306569
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/main@{#78186}
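
For illustration only, the one-byte clamping described above could look roughly like the sketch below. It is not the code from this change: `CharRange` and `ClampToOneByte` are hypothetical names, `kMaxUint8` is redefined locally as 0xFF, and the two-byte case with its 0x10ffff marker is not modelled.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for illustration; these are not V8's actual types.
constexpr uint32_t kMaxUint8 = 0xFF;

struct CharRange {
  uint32_t from;
  uint32_t to;  // inclusive upper bound
};

// Clamps a range to the code units a one-byte subject string can contain.
// Returns std::nullopt when the whole range lies above kMaxUint8.
constexpr std::optional<CharRange> ClampToOneByte(CharRange r) {
  if (r.from > kMaxUint8) return std::nullopt;  // entirely out of bounds
  if (r.to > kMaxUint8) r.to = kMaxUint8;       // straddling range: cut the tail
  return r;
}

// A range with from <= kMaxUint8 and to > kMaxUint8 is cut back to 0xFF.
static_assert(ClampToOneByte({0xF0, 0x178})->to == 0xFF);
// [Ÿ] alone (U+0178) has no one-byte code units left after clamping.
static_assert(!ClampToOneByte({0x178, 0x178}).has_value());
```

The second assertion hints at why the clamp has to come late: if /[Ÿ]/i were clamped before its case equivalent 'ÿ' (\u{ff}) had been added, the range would be dropped and the in-bounds match lost.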
2e17aaca