    [regexp] Support capture groups in experimental engine · 98b8ca89
    Martin Bidlingmaier authored
    This commit adds support for capture groups (as in e.g. /x(123|abc)y/)
    in the experimental regexp engine.  Now every InterpreterThread owns a
    register array containing (sub)match boundaries. There is a new
    instruction to record the current input index in some register.
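
As a rough illustration of the paragraph above, the following sketch runs a single thread over a branch-free program and records input indices into that thread's register array. The opcode names, the `Instr` and `Thread` structs, and the register layout (0/1 for the whole match, 2/3 for group 1) are assumptions made for this example; they are not taken from the V8 sources.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative instruction set; the names are hypothetical, not V8's.
enum class Op { kConsume, kCapture, kAccept };

struct Instr {
  Op op;
  char ch;  // character expected by kConsume (unused otherwise)
  int reg;  // register written by kCapture (unused otherwise)
};

// One thread of execution: a program counter plus its own register array.
struct Thread {
  std::size_t pc;
  std::vector<int> regs;  // -1 means "not set"
};

// Runs one thread over a branch-free program, anchored at input index 0.
// On ACCEPT, copies the thread's registers to *regs_out and returns true.
bool RunLinear(const std::vector<Instr>& program, const std::string& input,
               int register_count, std::vector<int>* regs_out) {
  Thread t{0, std::vector<int>(register_count, -1)};
  std::size_t index = 0;
  while (t.pc < program.size()) {
    const Instr& in = program[t.pc++];
    switch (in.op) {
      case Op::kConsume:
        if (index >= input.size() || input[index] != in.ch) return false;
        ++index;
        break;
      case Op::kCapture:
        // Record the current input index in the given register.
        t.regs[in.reg] = static_cast<int>(index);
        break;
      case Op::kAccept:
        *regs_out = t.regs;
        return true;
    }
  }
  return false;
}

// Hand-compiled program for /x(abc)y/.
std::vector<Instr> ExampleProgram() {
  return {
      {Op::kCapture, 0, 0},   {Op::kConsume, 'x', 0}, {Op::kCapture, 0, 2},
      {Op::kConsume, 'a', 0}, {Op::kConsume, 'b', 0}, {Op::kConsume, 'c', 0},
      {Op::kCapture, 0, 3},   {Op::kConsume, 'y', 0}, {Op::kCapture, 0, 1},
      {Op::kAccept, 0, 0},
  };
}
```

On the input "xabcy" this program leaves the registers at 0, 5, 1, 4: the whole match spans [0, 5) and group 1 spans [1, 4).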
    
    Submatches in quantifier bodies should be reported only if they occur
    during the last repetition.  Thus we reset those registers before
    attempting to match the body of a quantifier.  This is implemented with
    another new instruction.
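
The effect of resetting registers can be seen in a hand-written simulation of /(?:(a)|b)*/. This is only a sketch: the function name, the `clear_before_body` switch and the register layout (0/1 for the whole match, 2/3 for group 1, -1 for unset) are assumptions for illustration, and the real engine expresses the same idea with bytecode and forked threads rather than a loop.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simulates /(?:(a)|b)*/ anchored at input index 0 and returns the registers.
// With clear_before_body set, the registers of group 1 are reset before each
// attempt to match the quantifier body.
std::vector<int> MatchStarOfAOrB(const std::string& input,
                                 bool clear_before_body) {
  std::vector<int> regs(4, -1);
  regs[0] = 0;
  std::size_t index = 0;
  while (index < input.size()) {
    // The thread entering the body works on its own copy of the registers;
    // the thread leaving the loop keeps the registers of the last iteration.
    std::vector<int> attempt = regs;
    if (clear_before_body) {
      attempt[2] = -1;
      attempt[3] = -1;
    }
    char c = input[index];
    if (c == 'a') {
      attempt[2] = static_cast<int>(index);
      attempt[3] = static_cast<int>(index + 1);
    } else if (c != 'b') {
      break;  // body failed to match; fall back to the exiting thread
    }
    ++index;
    regs = attempt;
  }
  regs[1] = static_cast<int>(index);
  return regs;
}
```

For the input "ab", the last repetition matches `b`, so group 1 should be unset. With clearing the registers end as 0, 2, -1, -1; without it, the stale submatch from the first repetition survives as 0, 2, 0, 1.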
    
    Because of concerns about the growing size of the NfaInterpreter object
    (which is allocated on the stack), this commit replaces the
    `SmallVector` members of the NfaInterpreter with zone-allocated arrays.
    Register arrays, which for a fixed regexp are all the same size, are
    allocated with a RecyclingZoneAllocator for cheap memory reclamation via
    a linked list of equally-sized free blocks.
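
The free-list idea can be sketched in isolation as follows. `RegisterArrayPool` is a hypothetical stand-in written for this example; it is not V8's RecyclingZoneAllocator and it uses plain `operator new` where V8 would use a zone. It relies on every block having the same size, so a freed block can be handed out again without any size lookup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hands out equally-sized int arrays and recycles freed ones through an
// intrusive singly linked list stored inside the free blocks themselves.
class RegisterArrayPool {
 public:
  explicit RegisterArrayPool(std::size_t registers_per_array)
      : block_bytes_(std::max(registers_per_array * sizeof(int),
                              sizeof(FreeBlock))) {}
  RegisterArrayPool(const RegisterArrayPool&) = delete;
  RegisterArrayPool& operator=(const RegisterArrayPool&) = delete;

  ~RegisterArrayPool() {
    for (void* block : blocks_) ::operator delete(block);
  }

  int* Allocate() {
    if (free_list_ != nullptr) {
      // Reuse the most recently freed block: O(1), no new memory.
      FreeBlock* block = free_list_;
      free_list_ = block->next;
      return reinterpret_cast<int*>(block);
    }
    void* raw = ::operator new(block_bytes_);
    blocks_.push_back(raw);
    return static_cast<int*>(raw);
  }

  void Free(int* registers) {
    // The block's contents are dead, so its storage can hold the list link.
    free_list_ = new (registers) FreeBlock{free_list_};
  }

  std::size_t blocks_owned() const { return blocks_.size(); }

 private:
  struct FreeBlock {
    FreeBlock* next;
  };

  std::size_t block_bytes_;
  FreeBlock* free_list_ = nullptr;
  std::vector<void*> blocks_;  // everything ever allocated, for cleanup
};
```

A thread that dies returns its register array with `Free`, and the next forked thread picks the same block up again from `Allocate`, so the number of underlying allocations is bounded by the peak number of live threads.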
    
    Possible optimizations for management of register array memory:
    1. If there are few registers per thread, then it is likely faster to
       store them inline in the InterpreterThread struct.
    2. re2 implements copy-on-write:  InterpreterThreads can share the same
       register array. If a thread attempts to write to a shared register
       array, the register array is cloned first.
    3. The register at index 1 contains the end of the match; this is only
       written to right before an ACCEPT statement.  We could make ACCEPT
       equivalent to what's currently CAPTURE 1 followed by ACCEPT.  We
       could then save the memory for register 1 for threads that haven't
       finished yet.  This is particularly interesting if optimization 1
       then kicks in.
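
Optimization 2 could look roughly like the sketch below. It is not a description of re2's or V8's code: the `CowRegisters` class is hypothetical, and `std::shared_ptr` stands in for whatever reference counting a real engine would use. The sketch also assumes single-threaded use, since `use_count()` is only a reliable sharing test in that setting.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Register array with copy-on-write semantics: copies share storage until
// one of them writes, at which point the writer clones the array first.
class CowRegisters {
 public:
  explicit CowRegisters(std::size_t count)
      : data_(std::make_shared<std::vector<int>>(count, -1)) {}

  int Get(std::size_t i) const { return (*data_)[i]; }

  void Set(std::size_t i, int value) {
    if (data_.use_count() > 1) {
      // Storage is shared with another thread: clone before writing.
      data_ = std::make_shared<std::vector<int>>(*data_);
    }
    (*data_)[i] = value;
  }

  bool SharesStorageWith(const CowRegisters& other) const {
    return data_ == other.data_;
  }

 private:
  std::shared_ptr<std::vector<int>> data_;
};
```

Forking a thread then only copies a pointer; the full register array is duplicated only if one of the two threads later records a capture.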
    
    Cq-Include-Trybots: luci.v8.try:v8_linux64_fyi_rel_ng
    Bug: v8:10765
    Change-Id: I2c0503206ce331e13ac9912945bb66736d740197
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2390770
    Commit-Queue: Martin Bidlingmaier <mbid@google.com>
    Reviewed-by: Jakob Gruber <jgruber@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#69929}
Name
api
asmjs
ast
base
builtins
codegen
common
compiler
compiler-dispatcher
d8
date
debug
deoptimizer
diagnostics
execution
extensions
flags
handles
heap
ic
init
inspector
interpreter
json
libplatform
libsampler
logging
numbers
objects
parsing
profiler
protobuf
regexp
roots
runtime
sanitizer
snapshot
strings
tasks
third_party
torque
tracing
trap-handler
utils
wasm
zone
DEPS
OWNERS