    [regexp] Fix CharacterRange limits again again again · 2e17aaca
    Jakob Gruber authored
    When emitting code, character ranges must only specify ranges which
    the actual subject string (one- or two-byte) may contain.
    
    This was not always the case, specifically for ranges with
    `from <= kMaxUint8` and `to > kMaxUint8`.
    
    The reason this is so tricky: 1. not all parts of the pipeline know
    whether we are compiling for one- or two-byte subjects; 2. for
    case-insensitive regexps, an out-of-bounds CharacterRange may have an
    in-bounds case equivalent (e.g. /[Ÿ]/i also matches 'ÿ' == \u{ff}),
    which only gets added somewhere in the middle of the pipeline.
    
    Our current solution is to clamp immediately before code emission. We
    also keep the existing handling/dchecks of the 0x10ffff marker value
    which may occur in the two-byte subject case.
    
    Bug: v8:11069
    Change-Id: Ic7b34a13a900ea2aa3df032daac9236bf5682a42
    Fixed: chromium:1275096
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3306569
    Commit-Queue: Jakob Gruber <jgruber@chromium.org>
    Reviewed-by: Leszek Swirski <leszeks@chromium.org>
    Cr-Commit-Position: refs/heads/main@{#78186}
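
The clamping step described in the commit message can be illustrated with a minimal, standalone sketch. It is not V8's implementation: `Range`, `ClampRanges`, and the constant `kMaxOneByteChar` are hypothetical names introduced here, and the sketch deliberately ignores the 0x10ffff marker handling that the commit keeps for two-byte subjects.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; not V8's declarations.
constexpr uint32_t kMaxOneByteChar = 0xFF;  // plays the role of kMaxUint8

struct Range {
  uint32_t from;
  uint32_t to;  // inclusive upper bound
};

// Returns the ranges restricted to [0, max_char].
// Ranges entirely above max_char are dropped; ranges that straddle the
// limit (from <= max_char < to) are trimmed so that to == max_char.
std::vector<Range> ClampRanges(const std::vector<Range>& ranges,
                               uint32_t max_char) {
  std::vector<Range> result;
  for (const Range& r : ranges) {
    if (r.from > max_char) continue;  // nothing in bounds, drop it
    result.push_back({r.from, std::min(r.to, max_char)});
  }
  return result;
}
```

For a one-byte subject, calling `ClampRanges({{0xFF, 0xFF}, {0x178, 0x178}}, kMaxOneByteChar)` keeps only `{0xFF, 0xFF}`. This mirrors the `/[Ÿ]/i` case from the message: the out-of-bounds `Ÿ` (U+0178) has to survive until its in-bounds case equivalent `ÿ` (U+00FF) has been added, which is why the sketch assumes clamping runs only after case equivalents are present, immediately before code emission.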
Name
arm
arm64
experimental
ia32
loong64
mips
mips64
ppc
riscv64
s390
x64
DIR_METADATA
OWNERS
gen-regexp-special-case.cc
property-sequences.cc
property-sequences.h
regexp-ast.cc
regexp-ast.h
regexp-bytecode-generator-inl.h
regexp-bytecode-generator.cc
regexp-bytecode-generator.h
regexp-bytecode-peephole.cc
regexp-bytecode-peephole.h
regexp-bytecodes.cc
regexp-bytecodes.h
regexp-compiler-tonode.cc
regexp-compiler.cc
regexp-compiler.h
regexp-dotprinter.cc
regexp-dotprinter.h
regexp-error.cc
regexp-error.h
regexp-flags.h
regexp-interpreter.cc
regexp-interpreter.h
regexp-macro-assembler-arch.h
regexp-macro-assembler-tracer.cc
regexp-macro-assembler-tracer.h
regexp-macro-assembler.cc
regexp-macro-assembler.h
regexp-nodes.h
regexp-parser.cc
regexp-parser.h
regexp-stack.cc
regexp-stack.h
regexp-utils.cc
regexp-utils.h
regexp.cc
regexp.h
special-case.h