    Reland "[regexp] Better quick checks on loop entry nodes" · bea0ffd0
    Seth Brenith authored
    This is a reland of 4b15b984
    
    Updates since original: fix an arithmetic overflow bug, remove an invalid
    DCHECK, add a unit test that would trigger that DCHECK.
    
    Original change's description:
    > [regexp] Better quick checks on loop entry nodes
    >
    > Like the predecessor change https://crrev.com/c/v8/v8/+/1702125 , this
    > change is inspired by attempting to exit earlier from generated RegExp
    > code, when no further matches are possible because any match would be
    > too long. The motivating example this time is the following expression,
    > which tests whether a string of Unicode playing cards has five of the
    > same suit in a row:
    >
    > /([🂡-🂮]{5})|([🂱-🂾]{5})|([🃁-🃎]{5})|([🃑-🃞]{5})/u
    >
    > A human reading this expression can readily see that any match requires
    > at least 10 characters (5 surrogate pairs), but the LoopChoiceNode for
    > each repeated option reports its minimum distance to the end of a match
    > as zero. This is correct, because the LoopChoiceNode's behavior depends
    > on additional state (the loop counter). However, the preceding node, a
    > SET_REGISTER action that initializes the loop counter, could confidently
    > state that it consumes at least 10 characters. Furthermore, when we try
    > to emit a quick check for that action, we could follow only paths from
    > the LoopChoiceNode that are possible based on the minimum iteration
    > count. This change implements both of those "could"s.
    >
    > I expect this improvement to apply pretty broadly to expressions that
    > use minimum repetition counts and that don't meet the criteria for
    > unrolling. In this particular case, I get about 12% improvement on the
    > overall UniPoker test, due to reducing the execution time of this
    > expression by 85% and the execution time of another similar expression
    > that checks for n-of-a-kind by 20%.
    >
    > Bug: v8:9305
    >
    > Change-Id: I319e381743967bdf83324be75bae943fbb5dd496
    > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1704941
    > Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
    > Reviewed-by: Jakob Gruber <jgruber@chromium.org>
    > Cr-Commit-Position: refs/heads/master@{#62963}
    
    Bug: v8:9305
    Change-Id: I992070d383009013881bf778242254c27134b650
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1726674
    Reviewed-by: Jakob Gruber <jgruber@chromium.org>
    Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
    Cr-Commit-Position: refs/heads/master@{#63009}
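
The minimum-length reasoning in the original description can be illustrated with a small TypeScript sketch. This is only a toy model under stated assumptions, not V8's implementation: V8 does this analysis on its regexp node graph in C++ and exits early inside the generated code, whereas the sketch applies the bound as a length check before calling the engine. The names `RepeatedClass`, `minMatchLength`, and `testWithEarlyExit` are hypothetical and exist only for this illustration.

```typescript
// Toy model (hypothetical, not V8's data structures): each alternative is a
// character class repeated at least `min` times.
interface RepeatedClass {
  unitsPerChar: number; // UTF-16 code units per matched character (2 for a surrogate pair)
  min: number;          // minimum iteration count of the loop
}

// Lower bound on the length of any match of an alternation: the smallest
// bound among the alternatives, each of which needs min * unitsPerChar units.
function minMatchLength(alternatives: RepeatedClass[]): number {
  return Math.min(...alternatives.map((a) => a.min * a.unitsPerChar));
}

// The four suits from the commit message: five cards each, every card is a
// surrogate pair.
const suits: RepeatedClass[] = [
  { unitsPerChar: 2, min: 5 },
  { unitsPerChar: 2, min: 5 },
  { unitsPerChar: 2, min: 5 },
  { unitsPerChar: 2, min: 5 },
];
const bound = minMatchLength(suits);

// Same pattern as in the commit message, built with the constructor so the
// `u` flag does not depend on the compile target.
const fiveOfASuit = new RegExp(
  "([🂡-🂮]{5})|([🂱-🂾]{5})|([🃁-🃎]{5})|([🃑-🃞]{5})",
  "u"
);

// Skip the regexp entirely when the subject is shorter than the bound.
// This mimics the effect of the early exit, not where V8 performs it.
function testWithEarlyExit(re: RegExp, s: string, minLen: number): boolean {
  if (s.length < minLen) return false; // no match can fit
  return re.test(s);
}

console.log(bound);
console.log(testWithEarlyExit(fiveOfASuit, "🂡🂡🂡🂡🂡", bound)); // five spades
console.log(testWithEarlyExit(fiveOfASuit, "🂡🂡🂡🂡", bound));   // only four cards
console.log(testWithEarlyExit(fiveOfASuit, "🂡🂡🂡🂡🂱", bound)); // four spades, one heart
```

In this model the bound comes from the loop's minimum iteration count, which matches the role the description assigns to the SET_REGISTER action preceding the LoopChoiceNode: the loop node alone cannot promise any length, but the node that starts the count at zero can.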
regexp-nodes.h 29.9 KB