    PPC/s390: [regexp] Compact codegen for large character classes · 841d33a5
    Milad Fa authored
    Port 8bbb44e5
    
    Original Commit Message:
    
        Large character classes may easily be created when unicode
        properties (e.g.: /\p{L}/u and /\P{L}/u) are used - these are
        expanded internally into character classes that consist of hundreds
        of character ranges. Prior to this CL, we'd emit branching code
        for each of these ranges, leading to very large regexp code objects.
    
        This CL adds a new codegen mode for large character classes (where
        'large' currently means > 16 ranges). Instead of emitting branching
        code inline, the ranges are written into a ByteArray and we call into
        the C function IsCharacterInRangeArray for the actual branching logic.
        The ByteArray is smaller than emitted code and is deduplicated if the
        same character class is matched repeatedly in the same pattern.
    
        Note this mode is *not* implemented for the interpreter, since we
        currently don't have a constant pool for irregexp bytecode, and thus
        cannot reference ByteArrays.
    
    R=jgruber@chromium.org, joransiu@ca.ibm.com, junyan@redhat.com, midawson@redhat.com
    BUG=
    LOG=N
    
    Change-Id: I2ded01fa2767e56e72be81b949eefb5fb85b7013
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3231981
    Reviewed-by: Junliang Yan <junyan@redhat.com>
    Commit-Queue: Milad Fa <mfarazma@redhat.com>
    Cr-Commit-Position: refs/heads/main@{#77473}
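
The range-array lookup described in the commit message can be illustrated with a minimal sketch. This is not V8's code: the flat boundary encoding, the `RangeArray` alias, and the function name `IsInRangeArraySketch` are assumptions made for illustration, standing in for the ByteArray and the C function IsCharacterInRangeArray named above. The idea is that a sorted list of range boundaries can be searched with a binary search, so a class with hundreds of ranges needs one small data array and one call rather than hundreds of inline branches.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical flat encoding (an assumption, not V8's actual layout):
// sorted boundaries [from0, to0, from1, to1, ...], where each pair
// describes the half-open interval [from, to).
using RangeArray = std::vector<uint16_t>;

// Returns true if `c` falls inside any of the encoded ranges.
bool IsInRangeArraySketch(uint16_t c, const RangeArray& boundaries) {
  // Find the first boundary strictly greater than c; its index equals the
  // number of boundaries that are <= c.
  auto it = std::upper_bound(boundaries.begin(), boundaries.end(), c);
  // An odd count means c is at or past some `from` but before its `to`.
  return ((it - boundaries.begin()) % 2) == 1;
}
```

With this encoding, a class such as [0-9A-Za-z] would be stored as the six boundaries {48, 58, 65, 91, 97, 123}, and each lookup costs a logarithmic number of comparisons in the number of ranges.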
Name
.github
bazel
build_overrides
custom_deps
docs
gni
include
infra
samples
src
test
testing
third_party
tools
.bazelrc
.clang-format
.clang-tidy
.editorconfig
.flake8
.git-blame-ignore-revs
.gitattributes
.gitignore
.gn
.mailmap
.vpython
.ycm_extra_conf.py
AUTHORS
BUILD.bazel
BUILD.gn
CODE_OF_CONDUCT.md
COMMON_OWNERS
DEPS
DIR_METADATA
ENG_REVIEW_OWNERS
INFRA_OWNERS
INTL_OWNERS
LICENSE
LICENSE.fdlibm
LICENSE.strongtalk
LICENSE.v8
LOONG_OWNERS
MIPS_OWNERS
OWNERS
PPC_OWNERS
PRESUBMIT.py
README.md
RISCV_OWNERS
S390_OWNERS
WATCHLISTS
WORKSPACE
codereview.settings