    Reland "[regexp] Better quick checks on loop entry nodes" · bea0ffd0
    Seth Brenith authored
    This is a reland of 4b15b984
    
    Updates since original: fix an arithmetic overflow bug, remove an invalid
    DCHECK, add a unit test that would trigger that DCHECK.
    
    Original change's description:
    > [regexp] Better quick checks on loop entry nodes
    >
    > Like the predecessor change https://crrev.com/c/v8/v8/+/1702125 , this
    > change is inspired by attempting to exit earlier from generated RegExp
    > code, when no further matches are possible because any match would be
    > too long. The motivating example this time is the following expression,
    > which tests whether a string of Unicode playing cards has five of the
    > same suit in a row:
    >
    > /([🂡-🂮]{5})|([🂱-🂾]{5})|([🃁-🃎]{5})|([🃑-🃞]{5})/u
    >
    > A human reading this expression can readily see that any match requires
    > at least 10 characters (5 surrogate pairs), but the LoopChoiceNode for
    > each repeated option reports its minimum distance to the end of a match
    > as zero. This is correct, because the LoopChoiceNode's behavior depends
    > on additional state (the loop counter). However, the preceding node, a
    > SET_REGISTER action that initializes the loop counter, could confidently
    > state that it consumes at least 10 characters. Furthermore, when we try
    > to emit a quick check for that action, we could follow only paths from
    > the LoopChoiceNode that are possible based on the minimum iteration
    > count. This change implements both of those "could"s.
    >
    > I expect this improvement to apply pretty broadly to expressions that
    > use minimum repetition counts and that don't meet the criteria for
    > unrolling. In this particular case, I get about 12% improvement on the
    > overall UniPoker test, due to reducing the execution time of this
    > expression by 85% and the execution time of another similar expression
    > that checks for n-of-a-kind by 20%.
    >
    > Bug: v8:9305
    >
    > Change-Id: I319e381743967bdf83324be75bae943fbb5dd496
    > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1704941
    > Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
    > Reviewed-by: Jakob Gruber <jgruber@chromium.org>
    > Cr-Commit-Position: refs/heads/master@{#62963}
    
    Bug: v8:9305
    Change-Id: I992070d383009013881bf778242254c27134b650
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1726674
    Reviewed-by: Jakob Gruber <jgruber@chromium.org>
    Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
    Cr-Commit-Position: refs/heads/master@{#63009}
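The minimum-length reasoning in the original description can be sketched in a few lines of C++. This is a simplified illustration under stated assumptions, not V8's implementation: the `Node` struct, `MinLength`, the saturating helpers, and `PlayingCardsPattern` are all hypothetical names, and the sketch works on a plain tree rather than on V8's node graph with its SET_REGISTER action and loop counter. Saturating arithmetic stands in for the kind of overflow guard the reland mentions; the actual fix may look different.

```cpp
#include <algorithm>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical, simplified regexp tree for illustration only.
// These are NOT V8's RegExpNode / LoopChoiceNode classes.
struct Node {
  enum Kind { kChar, kSequence, kAlternation, kLoop };
  Kind kind = kChar;
  int width = 0;               // kChar: code units consumed (2 for a surrogate pair)
  int min_iterations = 0;      // kLoop: minimum repetition count
  std::vector<Node> children;  // operands; a kLoop has exactly one child (its body)
};

constexpr int kMaxLength = std::numeric_limits<int>::max();

// Saturating helpers for non-negative ints, so a huge repetition count
// clamps to kMaxLength instead of overflowing.
int SaturatedAdd(int a, int b) {
  return a > kMaxLength - b ? kMaxLength : a + b;
}

int SaturatedMul(int a, int b) {
  if (a == 0 || b == 0) return 0;
  return a > kMaxLength / b ? kMaxLength : a * b;
}

// Minimum number of code units any match of `node` must consume.
int MinLength(const Node& node) {
  switch (node.kind) {
    case Node::kChar:
      return node.width;
    case Node::kSequence: {
      int total = 0;
      for (const Node& child : node.children) {
        total = SaturatedAdd(total, MinLength(child));
      }
      return total;
    }
    case Node::kAlternation: {
      if (node.children.empty()) return 0;
      int best = kMaxLength;
      for (const Node& child : node.children) {
        best = std::min(best, MinLength(child));
      }
      return best;
    }
    case Node::kLoop:
      // The loop body must run at least min_iterations times.
      return SaturatedMul(node.min_iterations, MinLength(node.children[0]));
  }
  return 0;
}

Node Char(int width) {
  Node n;
  n.kind = Node::kChar;
  n.width = width;
  return n;
}

Node Loop(int min_iterations, Node body) {
  Node n;
  n.kind = Node::kLoop;
  n.min_iterations = min_iterations;
  n.children.push_back(std::move(body));
  return n;
}

Node Alternation(std::vector<Node> options) {
  Node n;
  n.kind = Node::kAlternation;
  n.children = std::move(options);
  return n;
}

// Models the four-suit expression: each alternative repeats a
// surrogate-pair character class (2 code units) exactly 5 times.
Node PlayingCardsPattern() {
  return Alternation({Loop(5, Char(2)), Loop(5, Char(2)),
                      Loop(5, Char(2)), Loop(5, Char(2))});
}
```

For the playing-card expression, `MinLength(PlayingCardsPattern())` evaluates to 10, matching the "5 surrogate pairs" observation in the description. A loop with a minimum count of zero contributes nothing, which is why a node that cannot see the counter has to report zero.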
Name
arm
arm64
ia32
mips
mips64
ppc
s390
x64
OWNERS
gen-regexp-special-case.cc
property-sequences.cc
property-sequences.h
regexp-ast.cc
regexp-ast.h
regexp-bytecode-generator-inl.h
regexp-bytecode-generator.cc
regexp-bytecode-generator.h
regexp-bytecodes.h
regexp-compiler-tonode.cc
regexp-compiler.cc
regexp-compiler.h
regexp-dotprinter.cc
regexp-dotprinter.h
regexp-interpreter.cc
regexp-interpreter.h
regexp-macro-assembler-arch.h
regexp-macro-assembler-tracer.cc
regexp-macro-assembler-tracer.h
regexp-macro-assembler.cc
regexp-macro-assembler.h
regexp-nodes.h
regexp-parser.cc
regexp-parser.h
regexp-stack.cc
regexp-stack.h
regexp-utils.cc
regexp-utils.h
regexp.cc
regexp.h
special-case.h