• Seth Brenith's avatar
    [regexp] Better quick checks on loop entry nodes · 4b15b984
    Seth Brenith authored
    Like the predecessor change https://crrev.com/c/v8/v8/+/1702125 , this
    change is inspired by attempting to exit earlier from generated RegExp
    code, when no further matches are possible because any match would be
    too long. The motivating example this time is the following expression,
    which tests whether a string of Unicode playing cards has five of the
    same suit in a row:
    
    /([🂡-🂮]{5})|([🂱-🂾]{5})|([🃁-🃎]{5})|([🃑-🃞]{5})/u
    
    A human reading this expression can readily see that any match requires
    at least 10 characters (5 surrogate pairs), but the LoopChoiceNode for
    each repeated option reports its minimum distance to the end of a match
    as zero. This is correct, because the LoopChoiceNode's behavior depends
    on additional state (the loop counter). However, the preceding node, a
    SET_REGISTER action that initializes the loop counter, could confidently
    state that it consumes at least 10 characters. Furthermore, when we try
    to emit a quick check for that action, we could follow only paths from
    the LoopChoiceNode that are possible based on the minimum iteration
    count. This change implements both of those "could"s.
    
    I expect this improvement to apply pretty broadly to expressions that
    use minimum repetition counts and that don't meet the criteria for
    unrolling. In this particular case, I get about 12% improvement on the
    overall UniPoker test, due to reducing the execution time of this
    expression by 85% and the execution time of another similar expression
    that checks for n-of-a-kind by 20%.
    
    Bug: v8:9305
    
    Change-Id: I319e381743967bdf83324be75bae943fbb5dd496
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1704941
    Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
    Reviewed-by: 's avatarJakob Gruber <jgruber@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#62963}
    4b15b984
Name
Last commit
Last update
..
api Loading commit data...
asmjs Loading commit data...
ast Loading commit data...
base Loading commit data...
builtins Loading commit data...
codegen Loading commit data...
common Loading commit data...
compiler Loading commit data...
compiler-dispatcher Loading commit data...
d8 Loading commit data...
date Loading commit data...
debug Loading commit data...
deoptimizer Loading commit data...
diagnostics Loading commit data...
execution Loading commit data...
extensions Loading commit data...
flags Loading commit data...
handles Loading commit data...
heap Loading commit data...
ic Loading commit data...
init Loading commit data...
inspector Loading commit data...
interpreter Loading commit data...
json Loading commit data...
libplatform Loading commit data...
libsampler Loading commit data...
logging Loading commit data...
numbers Loading commit data...
objects Loading commit data...
parsing Loading commit data...
profiler Loading commit data...
protobuf Loading commit data...
regexp Loading commit data...
roots Loading commit data...
runtime Loading commit data...
sanitizer Loading commit data...
snapshot Loading commit data...
strings Loading commit data...
tasks Loading commit data...
third_party Loading commit data...
torque Loading commit data...
tracing Loading commit data...
trap-handler Loading commit data...
utils Loading commit data...
wasm Loading commit data...
zone Loading commit data...
DEPS Loading commit data...
OWNERS Loading commit data...