1. 19 Oct, 2021 1 commit
    • Jakob Gruber's avatar
      [regexp] Compact codegen for large character classes · 8bbb44e5
      Jakob Gruber authored
      Large character classes may easily be created when unicode
      properties (e.g.: /\p{L}/u and /\P{L}/u) are used - these are
      expanded internally into character classes that consist of hundreds
      of character ranges. Previously to this CL, we'd emit branching code
      for each of these ranges, leading to very large regexp code objects.
      
      This CL adds a new codegen mode for large character classes (where
      'large' currently means > 16 ranges). Instead of emitting branching
      code inline, the ranges are written into a ByteArray and we call into
      the C function IsCharacterInRangeArray for the actual branching logic.
      The ByteArray is smaller than emitted code and is deduplicated if the
      same character class is matched repeatedly in the same pattern.
      
      Note this mode is *not* implemented for the interpreter, since we
      currently don't have a constant pool for irregexp bytecode, and thus
      cannot reference ByteArrays.
      
      Bug: v8:11069
      Change-Id: I2d728e42d85114b796c637f791848731a104cd54
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3229377Reviewed-by: 's avatarPatrick Thier <pthier@chromium.org>
      Auto-Submit: Jakob Gruber <jgruber@chromium.org>
      Commit-Queue: Jakob Gruber <jgruber@chromium.org>
      Cr-Commit-Position: refs/heads/main@{#77463}
      8bbb44e5
  2. 12 Oct, 2021 1 commit
  3. 19 Aug, 2021 1 commit
  4. 24 Jun, 2021 3 commits
  5. 17 Mar, 2021 1 commit
  6. 03 Jun, 2020 1 commit
  7. 02 Mar, 2020 1 commit
  8. 29 Aug, 2019 1 commit
  9. 25 Jul, 2019 1 commit
    • Seth Brenith's avatar
      [regexp] Use stricter bounds check to avoid additional iteration · f42b1a5d
      Seth Brenith authored
      The motivating example is JetStream 2's UniPoker test, which tests
      whether a sorted string of Unicode playing cards contains a five-card
      straight using a regular expression. In the top-level generated loop for
      this RegExp, we see this loop exit condition:
      
      00000350000C2067    27  83fffe         cmpl rdi,0xfe
      00000350000C206A    2a  0f8da8e40000   jge 00000350000D0518  <+0xe4d8>
      
      Meaning if the current position is pointing at the very last (16-bit)
      character, then we exit the loop. Otherwise we go on and try to find
      various matches starting at the current position. However, we can see
      in the original expression that any possible match is at least 10
      characters (5 astral-plane Unicode values), so we're wasting a lot of
      time attempting to find matches in cases where we're too close to the
      end of the string for any match to succeed.
      
      This example might be a bit contrived, but I expect that an improvement
      in this bounds check would help a larger family of regular expressions,
      where the minimum match length is large relative to the string being
      matched and we don't meet the other necessary criteria for fast Boyer-
      Moore lookahead.
      
      To get the desired bounds check in this case, this patch does the
      following:
      1. Compute accurate EatsAtLeast values for every node during the
         analysis phase. This could end up doing more work than the current
         implementation, but analysis already has to touch every node, so it
         seems like a cache-friendly time to compute these values. In some
         cases, this might be less total work than the current implementation,
         because the current implementation might recompute the same node
         multiple times.
      2. When emitting a quick check, use the EatsAtLeast value from the
         predecessor ChoiceNode for the bounds check.
      
      This improves the UniPoker score on my machine by about 4%, because it
      cuts the time spent checking for straights roughly in half, and checking
      for straights originally accounted for about 8% of the total time.
      
      Bug: v8:9305
      Change-Id: I110b190c2578f73b2263259d5aa5750e921b01be
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1702125
      Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
      Reviewed-by: 's avatarJakob Gruber <jgruber@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#62919}
      f42b1a5d
  10. 23 May, 2019 1 commit
  11. 18 Sep, 2018 1 commit
  12. 09 Jan, 2017 1 commit
  13. 14 Oct, 2016 1 commit
  14. 27 Jan, 2016 1 commit
  15. 15 Dec, 2015 1 commit
  16. 26 Nov, 2015 1 commit
  17. 17 Nov, 2015 3 commits
  18. 14 Aug, 2015 1 commit
  19. 13 Aug, 2015 1 commit
  20. 01 Jun, 2015 1 commit
  21. 18 May, 2015 1 commit
  22. 23 Jan, 2015 1 commit
    • danno's avatar
      Remove the dependency of Zone on Isolate · c7b09aac
      danno authored
      Along the way:
      - Thread isolate parameter explicitly through code that used to
        rely on getting it from the zone.
      - Canonicalize the parameter position of isolate and zone for
        affected code
      - Change Hydrogen New<> instruction templates to automatically
        pass isolate
      
      R=mstarzinger@chromium.org
      LOG=N
      
      Review URL: https://codereview.chromium.org/868883002
      
      Cr-Commit-Position: refs/heads/master@{#26252}
      c7b09aac
  23. 04 Aug, 2014 1 commit
  24. 20 Jun, 2014 1 commit
  25. 03 Jun, 2014 1 commit
  26. 23 May, 2014 1 commit
  27. 09 May, 2014 1 commit
  28. 29 Apr, 2014 1 commit
  29. 21 Mar, 2014 1 commit
  30. 12 Feb, 2014 1 commit
  31. 09 Dec, 2013 1 commit
  32. 06 Jun, 2013 1 commit
  33. 11 Jun, 2012 1 commit
  34. 06 Jun, 2012 1 commit
  35. 22 May, 2012 1 commit
  36. 30 Mar, 2012 1 commit