• Seth Brenith's avatar
    [regexp] Use stricter bounds check to avoid additional iteration · f42b1a5d
    Seth Brenith authored
    The motivating example is JetStream 2's UniPoker test, which tests
    whether a sorted string of Unicode playing cards contains a five-card
    straight using a regular expression. In the top-level generated loop for
    this RegExp, we see this loop exit condition:
    
    00000350000C2067    27  83fffe         cmpl rdi,0xfe
    00000350000C206A    2a  0f8da8e40000   jge 00000350000D0518  <+0xe4d8>
    
    Meaning if the current position is pointing at the very last (16-bit)
    character, then we exit the loop. Otherwise we go on and try to find
    various matches starting at the current position. However, we can see
    in the original expression that any possible match is at least 10
    characters (5 astral-plane Unicode values), so we're wasting a lot of
    time attempting to find matches in cases where we're too close to the
    end of the string for any match to succeed.
    
    This example might be a bit contrived, but I expect that an improvement
    in this bounds check would help a larger family of regular expressions,
    where the minimum match length is large relative to the string being
    matched and we don't meet the other necessary criteria for fast Boyer-
    Moore lookahead.
    
    To get the desired bounds check in this case, this patch does the
    following:
    1. Compute accurate EatsAtLeast values for every node during the
       analysis phase. This could end up doing more work than the current
       implementation, but analysis already has to touch every node, so it
       seems like a cache-friendly time to compute these values. In some
       cases, this might be less total work than the current implementation,
       because the current implementation might recompute the same node
       multiple times.
    2. When emitting a quick check, use the EatsAtLeast value from the
       predecessor ChoiceNode for the bounds check.
    
    This improves the UniPoker score on my machine by about 4%, because it
    cuts the time spent checking for straights roughly in half, and checking
    for straights originally accounted for about 8% of the total time.
    
    Bug: v8:9305
    Change-Id: I110b190c2578f73b2263259d5aa5750e921b01be
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1702125
    Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
    Reviewed-by: 's avatarJakob Gruber <jgruber@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#62919}
    f42b1a5d
Name
Last commit
Last update
..
api Loading commit data...
asmjs Loading commit data...
ast Loading commit data...
base Loading commit data...
builtins Loading commit data...
codegen Loading commit data...
common Loading commit data...
compiler Loading commit data...
compiler-dispatcher Loading commit data...
d8 Loading commit data...
date Loading commit data...
debug Loading commit data...
deoptimizer Loading commit data...
diagnostics Loading commit data...
execution Loading commit data...
extensions Loading commit data...
flags Loading commit data...
handles Loading commit data...
heap Loading commit data...
ic Loading commit data...
init Loading commit data...
inspector Loading commit data...
interpreter Loading commit data...
json Loading commit data...
libplatform Loading commit data...
libsampler Loading commit data...
logging Loading commit data...
numbers Loading commit data...
objects Loading commit data...
parsing Loading commit data...
profiler Loading commit data...
protobuf Loading commit data...
regexp Loading commit data...
roots Loading commit data...
runtime Loading commit data...
sanitizer Loading commit data...
snapshot Loading commit data...
strings Loading commit data...
tasks Loading commit data...
third_party Loading commit data...
torque Loading commit data...
tracing Loading commit data...
trap-handler Loading commit data...
utils Loading commit data...
wasm Loading commit data...
zone Loading commit data...
DEPS Loading commit data...
OWNERS Loading commit data...