    [regexp] Rewrite certain Assertion sequences · c51e4f3c
    Jakob Gruber authored
    Sequences of RegExp assertions (e.g. '^', '$', '\b', ...) have certain
    properties that this rewriter exploits:
    
    1. They are zero-width and order-independent, thus one can remove all
    duplicate assertions.
    2. If a subsequence is guaranteed to fail, the entire sequence fails.
    Any sequence always known to fail (e.g. containing both '\b' and '\B')
    can be rewritten to a single node that triggers failure.
    
    This CL generalizes the previous optimization for repeated assertions
    to be order-independent, i.e. assertions only have to be in the same
    sequence but not next to each other.
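
The two properties above can be illustrated with a minimal, standalone sketch. It is not the code from this CL: the `Assertion` enum, the `RewriteResult` struct, and `RewriteAssertions` are hypothetical names chosen for illustration, and only the '\b' / '\B' contradiction named in the message is modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical assertion kinds (illustrative names, not V8's actual types).
enum class Assertion : uint8_t {
  kStartOfInput,  // '^'
  kEndOfInput,    // '$'
  kBoundary,      // '\b'
  kNonBoundary,   // '\B'
};

// Hypothetical result type: either "always fails" or a deduplicated sequence.
struct RewriteResult {
  bool always_fails;
  std::vector<Assertion> assertions;
};

// Maps an assertion kind to a single bit in a 32-bit "seen" set.
inline uint32_t BitFor(Assertion a) {
  assert(static_cast<uint8_t>(a) < 32);
  return 1u << static_cast<uint8_t>(a);
}

// Rewrites one assertion sequence:
// 1. drops duplicates, whether or not they are adjacent (assertions are
//    zero-width and order-independent);
// 2. reports failure if the sequence contains both '\b' and '\B'.
RewriteResult RewriteAssertions(const std::vector<Assertion>& sequence) {
  uint32_t seen = 0;
  std::vector<Assertion> unique;
  for (Assertion a : sequence) {
    const uint32_t bit = BitFor(a);
    if (seen & bit) continue;  // Already present somewhere in the sequence.
    seen |= bit;
    unique.push_back(a);
  }

  const uint32_t contradiction =
      BitFor(Assertion::kBoundary) | BitFor(Assertion::kNonBoundary);
  if ((seen & contradiction) == contradiction) {
    // A position cannot be both a word boundary and a non-boundary.
    return RewriteResult{true, {}};
  }
  return RewriteResult{false, unique};
}
```

With this sketch, the sequence '^', '\b', '^' is reduced to '^', '\b' even though the two '^' assertions are not adjacent, and any sequence containing both '\b' and '\B' is reported as always failing.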
    
    Bug: v8:6515, v8:6126
    Change-Id: I3f92f081ce8a55ad8c34c269a09a6686e3b008f3
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1657925
    Commit-Queue: Jakob Gruber <jgruber@chromium.org>
    Reviewed-by: Peter Marshall <petermarshall@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#62201}
regexp-compiler-tonode.cc 60.8 KB