• Seth Brenith's avatar
    [regexp] Better quick checks on loop entry nodes · 4b15b984
    Seth Brenith authored
    Like the predecessor change https://crrev.com/c/v8/v8/+/1702125 , this
    change is inspired by attempting to exit earlier from generated RegExp
    code, when no further matches are possible because any match would be
    too long. The motivating example this time is the following expression,
    which tests whether a string of Unicode playing cards has five of the
    same suit in a row:
    
    /([🂡-🂮]{5})|([🂱-🂾]{5})|([🃁-🃎]{5})|([🃑-🃞]{5})/u
    
    A human reading this expression can readily see that any match requires
    at least 10 characters (5 surrogate pairs), but the LoopChoiceNode for
    each repeated option reports its minimum distance to the end of a match
    as zero. This is correct, because the LoopChoiceNode's behavior depends
    on additional state (the loop counter). However, the preceding node, a
    SET_REGISTER action that initializes the loop counter, could confidently
    state that it consumes at least 10 characters. Furthermore, when we try
    to emit a quick check for that action, we could follow only paths from
    the LoopChoiceNode that are possible based on the minimum iteration
    count. This change implements both of those "could"s.
    
    I expect this improvement to apply pretty broadly to expressions that
    use minimum repetition counts and that don't meet the criteria for
    unrolling. In this particular case, I get about 12% improvement on the
    overall UniPoker test, due to reducing the execution time of this
    expression by 85% and the execution time of another similar expression
    that checks for n-of-a-kind by 20%.
    
    Bug: v8:9305
    
    Change-Id: I319e381743967bdf83324be75bae943fbb5dd496
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1704941
    Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
    Reviewed-by: 's avatarJakob Gruber <jgruber@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#62963}
    4b15b984
Name
Last commit
Last update
..
arm Loading commit data...
arm64 Loading commit data...
ia32 Loading commit data...
mips Loading commit data...
mips64 Loading commit data...
ppc Loading commit data...
s390 Loading commit data...
x64 Loading commit data...
OWNERS Loading commit data...
gen-regexp-special-case.cc Loading commit data...
property-sequences.cc Loading commit data...
property-sequences.h Loading commit data...
regexp-ast.cc Loading commit data...
regexp-ast.h Loading commit data...
regexp-bytecode-generator-inl.h Loading commit data...
regexp-bytecode-generator.cc Loading commit data...
regexp-bytecode-generator.h Loading commit data...
regexp-bytecodes.h Loading commit data...
regexp-compiler-tonode.cc Loading commit data...
regexp-compiler.cc Loading commit data...
regexp-compiler.h Loading commit data...
regexp-dotprinter.cc Loading commit data...
regexp-dotprinter.h Loading commit data...
regexp-interpreter.cc Loading commit data...
regexp-interpreter.h Loading commit data...
regexp-macro-assembler-arch.h Loading commit data...
regexp-macro-assembler-tracer.cc Loading commit data...
regexp-macro-assembler-tracer.h Loading commit data...
regexp-macro-assembler.cc Loading commit data...
regexp-macro-assembler.h Loading commit data...
regexp-nodes.h Loading commit data...
regexp-parser.cc Loading commit data...
regexp-parser.h Loading commit data...
regexp-stack.cc Loading commit data...
regexp-stack.h Loading commit data...
regexp-utils.cc Loading commit data...
regexp-utils.h Loading commit data...
regexp.cc Loading commit data...
regexp.h Loading commit data...
special-case.h Loading commit data...