-
Martin Bidlingmaier authored
Assertions are implemented with the new ASSERTION instruction. The nfa interpreter evaluates the assertion based on the current context in the subject string every time a thread executes ASSERTION. This is analogous to what re2 and rust/regex do. Alternatives to this approach: - The interpreter could calculate eagerly for all assertion types whether they are satisfied whenever the current input position is advanced. This would make evaluating the ASSERTION instruction itself cheaper, but at the cost of making every advance in the input string more expensive. I suspect this would be slower on average because assertions are not that common that we typically evaluate >= 2 assertions at every input position. - Assertions in a regexp could be desugared into CONSUME_RANGE instructions, so that no new instruction would be necessary. For example, the word boundary assertion \b is satisfied at a given position/state if we have just consumed a word character and will consume a non-word character next, or vice-versa. The tricky part about this is that the assertion itself should not consume input, so we'd have to split (automaton) states according to whether we've arrived at them via a word character or not. The current compiler is not really equipped for this kind of transformation. For {start,end} of {line,file} assertions, we'd need to introduce dummy characters indicating start/end of input (say, 0x10000 and 0x10001) which we feed to the interpreter before respectively after the actual input. I suspect that this approach wouldn't make much of a difference for NFA execution. It would likely speed up (lazy) DFA execution though because assertions would be dealt with in the fast path. Cq-Include-Trybots: luci.v8.try:v8_linux64_fyi_rel_ng Bug: v8:10765 Change-Id: Ic2012c943e0ce54eb8662789fb3d4c1b6cd8d606 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2398644 Commit-Queue: Martin Bidlingmaier <mbid@google.com> Reviewed-by: Leszek Swirski <leszeks@chromium.org> Reviewed-by: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#70026}
e83511c2