    [regexp] Handle marker value 0x10ffff in MakeRangeArray · bfa681ff
    Jakob Gruber authored
    Unfortunately, CharacterRanges may use 0x10ffff as a marker value
    signifying 'highest possible code unit' irrespective of whether the
    regexp instance has the unicode flag or not. This value makes it
    through RegExpCharacterClass::ToNode unmodified (since no surrogate
    desugaring takes place without /u). Correctly mask the marker down
    to 0xffff for purposes of building our uint16_t range array.
    
    Note: It'd be better to never introduce 0x10ffff in the first place,
    but given the irregexp pipeline's lack of hackability I hesitate to
    change this - we are sure to rely on it implicitly in other spots.
    
    Drive-by: Refactors.
    
    Fixed: chromium:1264508
    Bug: v8:11069
    Change-Id: Ib3c5780e91f682f1a6d15f26eb4cf03636d93c25
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3256549
    Commit-Queue: Jakob Gruber <jgruber@chromium.org>
    Reviewed-by: Mathias Bynens <mathias@chromium.org>
    Cr-Commit-Position: refs/heads/main@{#77673}
regexp-compiler-tonode.cc 63.5 KB
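
The masking described in the commit message can be sketched roughly as follows. This is a minimal standalone illustration, not the code from regexp-compiler-tonode.cc: the `Range` struct, the constant names, the helper `MaskEndOfRangeMarker`, and `MakeRangeArraySketch` are all hypothetical stand-ins, and the real range array layout in V8 may differ (for example in how range ends are encoded).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical names; assumed for illustration only.
constexpr uint32_t kEndOfRangeMarker = 0x10ffff;  // 'highest possible' marker
constexpr uint32_t kMaxCodeUnit = 0xffff;         // largest uint16_t code unit

// Simplified stand-in for a character range with inclusive bounds.
struct Range {
  uint32_t from;
  uint32_t to;
};

// Reduce a range bound to a 16-bit code unit. The only value above 0xffff
// that is expected here is the marker itself, which masks down to 0xffff.
inline uint16_t MaskEndOfRangeMarker(uint32_t c) {
  assert(c <= kMaxCodeUnit || c == kEndOfRangeMarker);
  return static_cast<uint16_t>(c & kMaxCodeUnit);
}

// Flatten ranges into a uint16_t array of (from, to) pairs.
// Assumption: inclusive pairs, chosen to keep the sketch short.
std::vector<uint16_t> MakeRangeArraySketch(const std::vector<Range>& ranges) {
  std::vector<uint16_t> out;
  out.reserve(ranges.size() * 2);
  for (const Range& r : ranges) {
    out.push_back(MaskEndOfRangeMarker(r.from));
    out.push_back(MaskEndOfRangeMarker(r.to));
  }
  return out;
}
```

Without the mask, a plain narrowing cast of 0x10ffff to uint16_t would also yield 0xffff on typical platforms, but making the mask explicit documents the intent and lets the assertion catch any other out-of-range value.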