1. 15 Feb, 2019 1 commit
    • [isolate] Move ThreadLocalTop into IsolateData. · e17e46fd
      Benedikt Meurer authored
      This refactors the ThreadLocalTop into separate header and
      implementation files, and moves it from the Isolate to the
      IsolateData (with some tweaks to make the layout of the class
      predictable). This has the advantage that all external references
      referring to addresses in the ThreadLocalTop (like js_entry_sp,
      c_function, c_entry_fp, etc.) need only a single memory access
      to reach them. For example the CallApiCallback can now use
      
      ```
      mov %rbp,0x8e40(%r13)
      mov %rsi,0x8de0(%r13)
      mov %rbx,0x8e50(%r13)
      ```
      
      to set up the information about context, frame pointer, and C++
      function pointer in the ThreadLocalTop instead of the previously
      generated code
      
      ```
      mov 0x2e28(%r13),%r10
      mov %rbp,(%r10)
      mov 0x2e38(%r13),%r10
      mov %rsi,(%r10)
      mov 0x2e30(%r13),%r10
      mov %rbx,(%r10)
      ```
      
      which always had to load the scratch register %r10 with the actual
      address first. This has a measurable performance impact. On the
      test case mentioned in v8:8820 (with the `d8` patch applied), the
      performance goes from
      
      ```
      console.timeEnd: fnMono, 2290.012000
      console.timeEnd: fnCall, 2604.954000
      ```
      
      to
      
      ```
      console.timeEnd: fnMono, 2062.743000
      console.timeEnd: fnCall, 2477.556000
      ```
      
      which is roughly a **10%** improvement for the monomorphic API
      accessor case (fnMono), and a **5%** improvement for calling into
      the API accessor (fnCall).
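
      As a quick sanity check of the rounded percentages, the following
      C++ sketch recomputes them from the timings quoted above. The helper
      name ImprovementPercent and the constants are hypothetical and
      introduced only for illustration; they are not part of the change.

      ```cpp
      #include <cassert>

      // Relative improvement in percent when a timing drops from `before`
      // to `after` (hypothetical helper, for illustration only).
      constexpr double ImprovementPercent(double before, double after) {
        return (before - after) / before * 100.0;
      }

      // Timings copied from the console.timeEnd output quoted above.
      constexpr double kFnMonoPercent = ImprovementPercent(2290.012, 2062.743);
      constexpr double kFnCallPercent = ImprovementPercent(2604.954, 2477.556);

      // Both values land just under the rounded figures of 10% and 5%.
      static_assert(kFnMonoPercent > 9.9 && kFnMonoPercent < 10.0,
                    "fnMono improves by roughly 10%");
      static_assert(kFnCallPercent > 4.8 && kFnCallPercent < 5.0,
                    "fnCall improves by roughly 5%");
      ```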
      
      There may well be other places besides API callback calls that
      benefit from this change; I haven't tested those explicitly.
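
      The difference between the two addressing schemes described above
      can be sketched in plain C++. This is a simplified model, not the
      actual V8 classes: ThreadLocalTopSketch, IndirectData, EmbeddedData,
      the padding array, and both store helpers are hypothetical names
      chosen for illustration. Only the field names are taken from this
      message.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <cstdint>

      // Hypothetical stand-in for the per-thread state. The real class has
      // more fields and a different layout.
      struct ThreadLocalTopSketch {
        uintptr_t c_entry_fp;
        uintptr_t c_function;
        uintptr_t js_entry_sp;
      };

      // Old scheme (modelled): the block reachable from the root register
      // only holds the *address* of the field.
      struct IndirectData {
        uintptr_t* c_entry_fp_address;
      };

      // New scheme (modelled): the state is embedded, so each field sits at
      // a constant offset from the root register.
      struct EmbeddedData {
        uintptr_t other_roots[4];  // assumed padding, illustrative only
        ThreadLocalTopSketch thread_local_top;
      };

      inline void StoreIndirect(IndirectData* root, uintptr_t fp) {
        uintptr_t* slot = root->c_entry_fp_address;  // access 1: load address
        *slot = fp;                                  // access 2: store value
      }

      inline void StoreEmbedded(EmbeddedData* root, uintptr_t fp) {
        root->thread_local_top.c_entry_fp = fp;  // single store, fixed offset
      }

      // With a predictable layout the offset is a compile-time constant that
      // a code generator could embed directly into the store instruction.
      constexpr size_t kCEntryFpOffset =
          offsetof(EmbeddedData, thread_local_top) +
          offsetof(ThreadLocalTopSketch, c_entry_fp);
      static_assert(kCEntryFpOffset == 4 * sizeof(uintptr_t),
                    "field offset is known at compile time");
      ```

      In this model StoreIndirect corresponds to the load-then-store pairs
      through %r10, while StoreEmbedded corresponds to the single mov with
      a displacement off %r13.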
      
      Although this change is supposed to be as minimal as possible without
      any functional effects, some changes were necessary/logical. Eventually
      we should consider changing the layout and the types for the fields
      in the ThreadLocalTop to be more consistent with the other IsolateData
      entities. But this can be done in separate follow-up CLs, as this will
      be quite a bit of churn on the code base, depending on how we do that
      exactly, and is orthogonal to this optimization.
      
      Bug: v8:8820, v8:8848, chromium:913553
      Change-Id: I4732c8e60231f0312eb7767358c48bae0338220d
      Cq-Include-Trybots: luci.chromium.try:linux-blink-rel
      Reviewed-on: https://chromium-review.googlesource.com/c/1474230
      Reviewed-by: Yang Guo <yangguo@chromium.org>
      Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
      Cr-Commit-Position: refs/heads/master@{#59624}