1. 16 Dec, 2020 1 commit
    • PPC/s390: Reland "[Turboprop] Move dynamic check maps immediate args to deopt exit." · 07f0b7a4
      Milad Fa authored
      Port 7bdb0fbb
      
      Original Commit Message:
      
          This is a reland of b2a611d8
      
          Original change's description:
          > [Turboprop] Move dynamic check maps immediate args to deopt exit.
          >
          > Rather than loading the immediate arguments required by the
          > dynamic check maps builtin into registers in the fast-path,
          > instead insert them into the instruction stream in the deopt
          > exit and have the builtin load them into registers itself.
          >
          > BUG=v8:10582
          >
          > Change-Id: I66716570b408501374eed8f5e6432df64c6deb7c
          > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2589736
          > Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
          > Reviewed-by: Sathya Gunasekaran  <gsathya@chromium.org>
          > Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
          > Cr-Commit-Position: refs/heads/master@{#71790}
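
      A minimal, self-contained sketch of the idea described above (an editorial
      illustration, not V8's MacroAssembler or builtin code; EmbeddedArgs and the
      slot/handler layout are hypothetical):

          // Illustrative model: the deopt exit embeds its immediate arguments
          // directly after the call, and the callee recovers them from its
          // return address instead of receiving them in registers.
          #include <cstdint>
          #include <cstring>
          #include <iostream>
          #include <vector>

          // Hypothetical layout of the data a deopt exit would embed after the call.
          struct EmbeddedArgs {
            uint32_t slot;     // feedback slot index
            uint32_t handler;  // handler to validate against
          };

          // Stand-in for the builtin: given the "return address" (a pointer just
          // past the call), read the arguments the caller embedded in its code.
          EmbeddedArgs ReadArgsAtReturnAddress(const uint8_t* return_address) {
            EmbeddedArgs args;
            std::memcpy(&args, return_address, sizeof(args));
            return args;
          }

          int main() {
            // Fake "instruction stream": [call bytes][embedded args][next insn].
            std::vector<uint8_t> code(16, 0x90);  // padding stands in for real encoding
            EmbeddedArgs emitted{/*slot=*/7, /*handler=*/0x1234};
            std::memcpy(code.data() + 4, &emitted, sizeof(emitted));  // args after the call

            const uint8_t* return_address = code.data() + 4;  // points just past the call
            EmbeddedArgs seen = ReadArgsAtReturnAddress(return_address);
            std::cout << "slot=" << seen.slot << " handler=0x" << std::hex
                      << seen.handler << "\n";
          }

      This keeps the fast path free of argument moves; only the (rarely taken)
      deopt exit pays for the extra bytes in the instruction stream.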
      
      R=rmcilroy@chromium.org, joransiu@ca.ibm.com, junyan@redhat.com, midawson@redhat.com
      BUG=
      LOG=N
      
      Change-Id: I83fc0f3e3ebcf19ca4303e50aae94d7b353cd0ac
       Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2595708
       Reviewed-by: Joran Siu <joransiu@ca.ibm.com>
      Commit-Queue: Milad Fa <mfarazma@redhat.com>
      Cr-Commit-Position: refs/heads/master@{#71809}
  2. 02 Dec, 2020 1 commit
    • PPC/s390: [Turboprop] Move deoptimizations for dynamic map checks into builtin. · 2bc979aa
      Milad Fa authored
      Port b6643320
      
      Original Commit Message:
      
          In order to reduce the codegen size of dynamic map checks, add the
           ability to have an eager-with-resume deopt point, which can call
           a given builtin to perform a more detailed check than can be done
          in codegen, and then either deoptimizes itself (as if the calling
          code had performed an eager deopt) or resumes execution in the
          calling code after the check.
      
          In addition, support for adding extra arguments to a
          deoptimization continuation is added to enable us to pass the
          necessary arguments to the DynamicMapChecks builtin.
      
           Finally, a trampoline is added to the DynamicMapChecks builtin, which saves
          the registers that might be clobbered by that builtin, to avoid
          having to save them in the generated code. This trampoline also
          performs the deoptimization based on the result of the
          DynamicMapChecks builtin.
      
          In order to ensure both the trampoline and DynamicMapChecks
          builtin have the same call interface, and to limit the number
          of registers that need saving in the trampoline, the
          DynamicMapChecks builtin is moved to be a CSA builtin with a
          custom CallInterfaceDescriptor, that calls an exported Torque
          macro that implements the actual functionality.
      
          All told, this changes the codegen for a monomorphic dynamic
          map check from:
              movl rbx,<expected_map>
              cmpl [<object>-0x1],rbx
              jnz <deferred_call>
             resume_point:
              ...
             deferred_call:
              <spill registers>
              movl rax,<slot>
              movq rbx,<object>
              movq rcx,<handler>
              movq r10,<DynamicMapChecks>
              call r10
              cmpq rax,0x0
              jz <restore_regs>
              cmpq rax,0x1
              jz <deopt_point_1>
              cmpq rax,0x2
              jz <deopt_point_2>
              int3l
             restore_regs:
              <restore_regs>
              jmp <resume_point>
              ...
             deopt_point_1:
              call Deoptimization_Eager
             deopt_point_2:
               call Deoptimization_Bailout

           to:

              movl rcx,<expected_map>
              movq rdx,<handler>
              cmpl [<object>-0x1],rcx
              jnz <deopt_point>
             resume_point:
              ...
             deopt_point:
              call DynamicMapChecksTrampoline
              jmp <resume_point>
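
      A minimal sketch of the control flow the trampoline implements (editorial
      illustration only; the meaning of the result values, the CheckResult names
      and the simplified check are assumptions, not V8's actual interfaces):

          #include <cstdio>

          enum class CheckResult { kSuccess = 0, kDeoptEager = 1, kBailout = 2 };

          // Stand-in for the detailed map/handler check done by the builtin.
          CheckResult DynamicMapChecks(int actual_map, int expected_map, int handler) {
            if (actual_map == expected_map && handler != 0) return CheckResult::kSuccess;
            if (handler == 0) return CheckResult::kBailout;
            return CheckResult::kDeoptEager;
          }

          // Stand-in for the trampoline: the save/restore of clobbered registers
          // is implied; only the result-based dispatch is modelled.
          void Trampoline(int actual_map, int expected_map, int handler) {
            switch (DynamicMapChecks(actual_map, expected_map, handler)) {
              case CheckResult::kSuccess:
                std::puts("restore registers, resume at resume_point");
                break;
              case CheckResult::kDeoptEager:
                std::puts("call Deoptimization_Eager");
                break;
              case CheckResult::kBailout:
                std::puts("call Deoptimization_Bailout");
                break;
            }
          }

          int main() {
            Trampoline(/*actual_map=*/1, /*expected_map=*/1, /*handler=*/42);  // resumes
            Trampoline(/*actual_map=*/1, /*expected_map=*/2, /*handler=*/42);  // eager deopt
          }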
      
      R=rmcilroy@chromium.org, joransiu@ca.ibm.com, junyan@redhat.com, midawson@redhat.com
      BUG=v8:10582
      LOG=N
      
      Change-Id: I0739c1b40ed06bb22b73ebe1833ea648b540882a
       Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2569359
       Reviewed-by: Junliang Yan <junyan@redhat.com>
      Commit-Queue: Milad Fa <mfarazma@redhat.com>
      Cr-Commit-Position: refs/heads/master@{#71571}
  3. 20 Oct, 2020 1 commit
    • PPC/s390: [deoptimizer] Change deopt entries into builtins · 5d5ed19f
      Junliang Yan authored
      Port 7f58ced7
      
      Original Commit Message:
      
          While the overall goal of this commit is to change deoptimization
          entries into builtins, there are multiple related things happening:
      
          - Deoptimization entries, formerly stubs (i.e. Code objects generated
            at runtime, guaranteed to be immovable), have been converted into
            builtins. The major restriction is that we now need to preserve the
            kRootRegister, which was formerly used on most architectures to pass
            the deoptimization id. The solution differs based on platform.
          - Renamed DEOPT_ENTRIES_OR_FOR_TESTING code kind to FOR_TESTING.
          - Removed heap/ support for immovable Code generation.
          - Removed the DeserializerData class (no longer needed).
          - arm64: to preserve 4-byte deopt exits, introduced a new optimization
            in which the final jump to the deoptimization entry is generated
            once per Code object, and deopt exits can continue to emit a
            near-call.
          - arm,ia32,x64: change to fixed-size deopt exits. This reduces exit
            sizes by 4/8, 5, and 5 bytes, respectively.
      
          On arm the deopt exit size is reduced from 12 (or 16) bytes to 8 bytes
          by using the same strategy as on arm64 (recalc deopt id from return
          address). Before:
      
           e300a002       movw r10, <id>
           e59fc024       ldr ip, [pc, <entry offset>]
           e12fff3c       blx ip
      
          After:
      
           e59acb35       ldr ip, [r10, <entry offset>]
           e12fff3c       blx ip
      
           On arm64 the deopt exit size remains 4 bytes (or 8 bytes in some cases
          with CFI). Additionally, up to 4 builtin jumps are emitted per Code
          object (max 32 bytes added overhead per Code object). Before:
      
           9401cdae       bl <entry offset>
      
          After:
      
           # eager deoptimization entry jump.
           f95b1f50       ldr x16, [x26, <eager entry offset>]
           d61f0200       br x16
           # lazy deoptimization entry jump.
           f95b2b50       ldr x16, [x26, <lazy entry offset>]
           d61f0200       br x16
           # the deopt exit.
           97fffffc       bl <eager deoptimization entry jump offset>
      
          On ia32 the deopt exit size is reduced from 10 to 5 bytes. Before:
      
           bb00000000     mov ebx,<id>
           e825f5372b     call <entry>
      
          After:
      
           e8ea2256ba     call <entry>
      
          On x64 the deopt exit size is reduced from 12 to 7 bytes. Before:
      
           49c7c511000000 REX.W movq r13,<id>
           e8ea2f0700     call <entry>
      
          After:
      
           41ff9560360000 call [r13+<entry offset>]
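
      A minimal sketch of why fixed-size, contiguously emitted deopt exits let
      the id be dropped from the exit sequence (editorial illustration; the
      constants and names are hypothetical, not V8's):

          #include <cstdint>
          #include <iostream>

          constexpr uint64_t kDeoptExitSize = 5;          // e.g. a 5-byte call on ia32
          constexpr uint64_t kFirstExitAddress = 0x1000;  // start of the exit section

          // Given the return address of the call to the deopt entry builtin,
          // recover the deopt id from the exit's position in the section.
          uint64_t DeoptIdFromReturnAddress(uint64_t return_address) {
            uint64_t exit_address = return_address - kDeoptExitSize;
            return (exit_address - kFirstExitAddress) / kDeoptExitSize;
          }

          int main() {
            // The third exit starts at kFirstExitAddress + 2 * kDeoptExitSize,
            // so its call returns to the byte right after it.
            uint64_t ret = kFirstExitAddress + 3 * kDeoptExitSize;
            std::cout << "deopt id = " << DeoptIdFromReturnAddress(ret) << "\n";  // prints 2
          }

      This is the "recalc deopt id from return address" strategy the quoted
      message describes for arm and arm64.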
      
      R=jgruber@chromium.org, joransiu@ca.ibm.com, jyan@ca.ibm.com, michael_dawson@ca.ibm.com, miladfar@ca.ibm.com
      BUG=
      LOG=N
      
      Change-Id: I49e4c92759043e46beb3c76c97823285b16feeef
       Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2486225
       Reviewed-by: Milad Fa <mfarazma@redhat.com>
      Commit-Queue: Junliang Yan <junyan@redhat.com>
      Cr-Commit-Position: refs/heads/master@{#70637}
  4. 28 Apr, 2020 1 commit
  5. 17 Mar, 2020 1 commit
  6. 13 Feb, 2020 1 commit
    • Reland "[arm64] Protect return addresses stored on stack" · 73f88b5f
      Georgia Kouveli authored
      This is a reland of 137bfe47
      
      Original change's description:
      > [arm64] Protect return addresses stored on stack
      > 
      > This change uses the Arm v8.3 pointer authentication instructions in
      > order to protect return addresses stored on the stack.  The generated
      > code signs the return address before storing on the stack and
      > authenticates it after loading it. This also changes the stack frame
      > iterator in order to authenticate stored return addresses and re-sign
      > them when needed, as well as the deoptimizer in order to sign saved
      > return addresses when creating new frames. This offers a level of
      > protection against ROP attacks.
      > 
      > This functionality is enabled with the v8_control_flow_integrity flag
      > that this CL introduces.
      > 
      > The code size effect of this change is small for Octane (up to 2% in
      > some cases but mostly much lower) and negligible for larger benchmarks,
      > however code size measurements are rather noisy. The performance impact
      > on current cores (where the instructions are NOPs) is single digit,
      > around 1-2% for ARES-6 and Octane, and tends to be smaller for big
      > cores than for little cores.
      > 
      > Bug: v8:10026
      > Change-Id: I0081f3938c56e2f24d8227e4640032749f4f8368
      > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1373782
      > Commit-Queue: Georgia Kouveli <georgia.kouveli@arm.com>
      > Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
      > Reviewed-by: Georg Neis <neis@chromium.org>
      > Cr-Commit-Position: refs/heads/master@{#66239}
      
      Bug: v8:10026
      Change-Id: Id1adfa2e6c713f6977d69aa467986e48fe67b3c2
       Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2051958
       Reviewed-by: Georg Neis <neis@chromium.org>
       Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
      Commit-Queue: Georgia Kouveli <georgia.kouveli@arm.com>
      Cr-Commit-Position: refs/heads/master@{#66254}
  7. 12 Feb, 2020 2 commits
    • Revert "[arm64] Protect return addresses stored on stack" · 6a9a67d9
      Nico Hartmann authored
      This reverts commit 137bfe47.
      
      Reason for revert: https://ci.chromium.org/p/v8/builders/ci/V8%20Arm%20-%20debug/13072
      
      Original change's description:
      > [arm64] Protect return addresses stored on stack
      > 
      > This change uses the Arm v8.3 pointer authentication instructions in
      > order to protect return addresses stored on the stack.  The generated
      > code signs the return address before storing on the stack and
      > authenticates it after loading it. This also changes the stack frame
      > iterator in order to authenticate stored return addresses and re-sign
      > them when needed, as well as the deoptimizer in order to sign saved
      > return addresses when creating new frames. This offers a level of
      > protection against ROP attacks.
      > 
      > This functionality is enabled with the v8_control_flow_integrity flag
      > that this CL introduces.
      > 
      > The code size effect of this change is small for Octane (up to 2% in
      > some cases but mostly much lower) and negligible for larger benchmarks,
      > however code size measurements are rather noisy. The performance impact
      > on current cores (where the instructions are NOPs) is single digit,
      > around 1-2% for ARES-6 and Octane, and tends to be smaller for big
      > cores than for little cores.
      > 
      > Bug: v8:10026
      > Change-Id: I0081f3938c56e2f24d8227e4640032749f4f8368
      > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1373782
      > Commit-Queue: Georgia Kouveli <georgia.kouveli@arm.com>
      > Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
      > Reviewed-by: Georg Neis <neis@chromium.org>
      > Cr-Commit-Position: refs/heads/master@{#66239}
      
      TBR=rmcilroy@chromium.org,mstarzinger@chromium.org,neis@chromium.org,georgia.kouveli@arm.com
      
      Change-Id: I57d5928949b0d403774550b9bf7dc0b08ce4e703
      No-Presubmit: true
      No-Tree-Checks: true
      No-Try: true
      Bug: v8:10026
       Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2051952
       Reviewed-by: Nico Hartmann <nicohartmann@chromium.org>
      Commit-Queue: Nico Hartmann <nicohartmann@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#66242}
    • [arm64] Protect return addresses stored on stack · 137bfe47
      Georgia Kouveli authored
      This change uses the Arm v8.3 pointer authentication instructions in
      order to protect return addresses stored on the stack.  The generated
      code signs the return address before storing on the stack and
      authenticates it after loading it. This also changes the stack frame
      iterator in order to authenticate stored return addresses and re-sign
      them when needed, as well as the deoptimizer in order to sign saved
      return addresses when creating new frames. This offers a level of
      protection against ROP attacks.
      
      This functionality is enabled with the v8_control_flow_integrity flag
      that this CL introduces.
      
      The code size effect of this change is small for Octane (up to 2% in
      some cases but mostly much lower) and negligible for larger benchmarks,
      however code size measurements are rather noisy. The performance impact
      on current cores (where the instructions are NOPs) is single digit,
      around 1-2% for ARES-6 and Octane, and tends to be smaller for big
      cores than for little cores.
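
      A minimal model of the sign-then-authenticate flow described above
      (editorial illustration; ComputePac, the bit layout and the constants are
      invented stand-ins, not the Arm PAC cipher or V8's codegen):

          #include <cstdint>
          #include <cstdlib>
          #include <iostream>

          // Hypothetical stand-in for the PAC computation (real hardware uses an
          // implementation-defined cipher keyed by per-process key registers).
          uint64_t ComputePac(uint64_t address, uint64_t modifier) {
            uint64_t h = address ^ (modifier * 0x9E3779B97F4A7C15ull);
            h ^= h >> 29;
            return (h & 0xFFFFull) << 48;  // pretend the top 16 bits hold the PAC
          }

          // ~ paciasp: sign the return address with SP as the modifier before it
          // is stored on the stack.
          uint64_t SignReturnAddress(uint64_t address, uint64_t sp) {
            return address | ComputePac(address, sp);
          }

          // ~ autiasp: authenticate after loading; a mismatch means the stored
          // value was tampered with (here we simply abort).
          uint64_t AuthReturnAddress(uint64_t signed_addr, uint64_t sp) {
            uint64_t address = signed_addr & 0x0000FFFFFFFFFFFFull;
            if ((signed_addr & 0xFFFF000000000000ull) != ComputePac(address, sp)) {
              std::cerr << "return address authentication failed\n";
              std::abort();
            }
            return address;
          }

          int main() {
            uint64_t sp = 0x7ffdeadbeef0;
            uint64_t ra = 0x0000004012345678;
            uint64_t stored = SignReturnAddress(ra, sp);
            std::cout << std::hex << "stored:   " << stored << "\n";
            std::cout << std::hex << "restored: " << AuthReturnAddress(stored, sp) << "\n";
          }

      The frame iterator and deoptimizer changes follow the same pattern:
      authenticate and re-sign whenever frames are walked or rebuilt, so a
      forged return address on the stack fails authentication before use.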
      
      Bug: v8:10026
      Change-Id: I0081f3938c56e2f24d8227e4640032749f4f8368
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1373782
      Commit-Queue: Georgia Kouveli <georgia.kouveli@arm.com>
       Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
       Reviewed-by: Georg Neis <neis@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#66239}
  8. 25 Oct, 2019 1 commit
  9. 30 Sep, 2019 1 commit
  10. 20 Aug, 2019 1 commit
    • [deoptimizer] Extract frame layout calculation into helper classes · 81642fa6
      Jakob Gruber authored
      The deoptimizer calculates frame layout based on the translation's
      `height` field, together with additional data (e.g.: are we looking at
      the topmost frame? what kind of deopt are we in?). The result is the
      final deoptimized frame size in bytes, together with a bunch of
      intermediate results such as the variable frame size (= without the
      fixed-size portion).
      
      In order to consider the deoptimized frame size in optimized stack
      checks, we will need to calculate the frame layout during compilation
      in addition to what we currently do during deoptimization. This CL
      moves in that direction by extracting relevant parts of frame layout
      calculation into classes that can be reused by both compiler and
      deoptimizer.
      
      These helpers will support both precise and conservative modes; the
      deoptimizer will use the precise mode (since it has full information),
      while the instruction selector will use the conservative mode.
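
      A minimal sketch of what such a shared helper could look like (editorial
      illustration; the constants, fields and the topmost-frame adjustment are
      hypothetical, not V8's actual frame descriptions):

          #include <cstdint>
          #include <iostream>

          constexpr uint32_t kPointerSize = 8;
          constexpr uint32_t kFixedFrameSize = 4 * kPointerSize;  // fp, return addr, ...

          enum class Mode { kPrecise, kConservative };

          struct FrameInfo {
            uint32_t variable_size;  // locals + parameters, without the fixed part
            uint32_t total_size;     // final deoptimized frame size in bytes
          };

          FrameInfo ComputeFrameInfo(Mode mode, uint32_t locals, uint32_t parameters,
                                     bool is_topmost) {
            uint32_t variable = (locals + parameters) * kPointerSize;
            if (mode == Mode::kConservative) {
              // Without knowing whether this will be the topmost frame, assume the
              // larger layout so the bound stays safe.
              is_topmost = true;
            }
            uint32_t extra = is_topmost ? kPointerSize : 0;  // e.g. saved accumulator
            return {variable, kFixedFrameSize + variable + extra};
          }

          int main() {
            FrameInfo precise = ComputeFrameInfo(Mode::kPrecise, 3, 2, /*is_topmost=*/false);
            FrameInfo bound = ComputeFrameInfo(Mode::kConservative, 3, 2, false);
            std::cout << "precise: " << precise.total_size
                      << " bytes, conservative bound: " << bound.total_size << " bytes\n";
          }

      The deoptimizer would call the precise mode with full information, while
      the instruction selector only needs the conservative upper bound for its
      stack checks.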
      
      Bug: v8:9534
      Change-Id: I93d6c39f10d251733f4625d3cc161b2010652d02
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1760825
      Commit-Queue: Jakob Gruber <jgruber@chromium.org>
       Reviewed-by: Sigurd Schneider <sigurds@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#63279}
  11. 02 Aug, 2019 1 commit
  12. 31 Jul, 2019 1 commit
  13. 30 Jul, 2019 1 commit
    • [arm64] Reduce code size of deoptimization exits · 207d6b35
      Georgia Kouveli authored
      Do not pass the deoptimization index in a register, instead infer it
      from the address we made the deoptimization call from. This makes the
      deoptimization exit sequence one instruction long instead of two.
      
      This requires emitting all deoptimization exits at the end of the
      function in a contiguous block, making sure no constant or veneer
      pools are emitted in between. This means that soft deoptimizations
      require an additional branch to the end of the function, which
      counteracts the removal of the move instruction, however soft
      deoptimizations are rare compared to eager and lazy ones.
      
      This reduces the code size of optimised functions for benchmarks like
      Octane and ARES-6 by about 4%.
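
      A minimal sketch of deferring the exits into one contiguous block at the
      end of codegen (editorial illustration; the toy Assembler, its methods and
      the emitted mnemonics are hypothetical, not V8's):

          #include <cstdint>
          #include <iostream>
          #include <string>
          #include <vector>

          struct PendingExit {
            std::string kind;  // "eager", "lazy" or "soft"
            uint32_t deopt_id;
          };

          class Assembler {
           public:
            void EmitInstruction(const std::string& text) { body_.push_back(text); }
            // During codegen only record the exit; a soft deopt additionally pays
            // for the branch that jumps forward to the block at the end.
            void RecordDeoptExit(const std::string& kind, uint32_t deopt_id) {
              if (kind == "soft") body_.push_back("b exit_" + std::to_string(deopt_id));
              pending_.push_back({kind, deopt_id});
            }
            // At the end, emit every exit back to back (with constant and veneer
            // pools suppressed), so an exit's index follows from its address.
            void FinishCode() {
              for (const PendingExit& e : pending_)
                body_.push_back("exit_" + std::to_string(e.deopt_id) + ": bl " +
                                e.kind + "_deopt_entry");
            }
            void Print() const {
              for (const std::string& line : body_) std::cout << line << "\n";
            }

           private:
            std::vector<std::string> body_;
            std::vector<PendingExit> pending_;
          };

          int main() {
            Assembler masm;
            masm.EmitInstruction("cmp x0, x1");
            masm.EmitInstruction("b.ne exit_0");  // fast path branches to the exit block
            masm.RecordDeoptExit("eager", 0);
            masm.EmitInstruction("add x0, x0, #1");
            masm.RecordDeoptExit("soft", 1);      // unconditional, needs its own branch
            masm.FinishCode();
            masm.Print();
          }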
      
      Change-Id: I771f9104a07de7931a4bb9c5836e25fb55b1a2a4
      Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1714876
      Commit-Queue: Georgia Kouveli <georgia.kouveli@arm.com>
       Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
       Reviewed-by: Georg Neis <neis@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#62991}
  14. 17 Jul, 2019 1 commit
  15. 11 Jul, 2019 1 commit
  16. 28 May, 2019 1 commit