    [isolate] Move ThreadLocalTop into IsolateData. · e17e46fd
    Benedikt Meurer authored
    This refactors the ThreadLocalTop into separate header and
    implementation files, and moves it from the Isolate to the
    IsolateData (with some tweaks to make the layout of the class
    predictable). This has the advantage that all external references
    to addresses in the ThreadLocalTop (like js_entry_sp, c_function,
    c_entry_fp, etc.) need only a single memory access to reach them.
    For example, the CallApiCallback can now use
    
    ```
    mov %rbp,0x8e40(%r13)
    mov %rsi,0x8de0(%r13)
    mov %rbx,0x8e50(%r13)
    ```
    
    to set up the information about context, frame pointer, and C++
    function pointer in the ThreadLocalTop instead of the previously
    generated code
    
    ```
    mov 0x2e28(%r13),%r10
    mov %rbp,(%r10)
    mov 0x2e38(%r13),%r10
    mov %rsi,(%r10)
    mov 0x2e30(%r13),%r10
    mov %rbx,(%r10)
    ```
    
    which always had to load the scratch register %r10 with the actual
    address first. This has a measurable performance impact. On the
    test case mentioned in v8:8820 (with the `d8` patch applied), the
    timings go from
    
    ```
    console.timeEnd: fnMono, 2290.012000
    console.timeEnd: fnCall, 2604.954000
    ```
    
    to
    
    ```
    console.timeEnd: fnMono, 2062.743000
    console.timeEnd: fnCall, 2477.556000
    ```
    
    which is a solid improvement of roughly **10%** for the monomorphic
    API accessor case (fnMono), and roughly **5%** for calling into the
    API accessor (fnCall).
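
As a quick check of the percentages quoted above, the relative improvement can be recomputed from the four timings. This is only an illustrative sketch: the helper name `ImprovementPercent` is hypothetical and not part of V8 or `d8`, and it assumes lower timings are better.

```cpp
// Relative improvement in percent between two timings (lower is better).
// Hypothetical helper for illustration only; not part of V8.
constexpr double ImprovementPercent(double before, double after) {
  return (before - after) / before * 100.0;
}

// Timings copied from the console.timeEnd output in this message.
constexpr double kFnMonoGain = ImprovementPercent(2290.012, 2062.743);  // about 9.9
constexpr double kFnCallGain = ImprovementPercent(2604.954, 2477.556);  // about 4.9
```

Both values round to the 10% and 5% figures stated above.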
    
    There may well be other places besides API callback calls that
    benefit from this change; I haven't tested those explicitly.
    
    Although this change is meant to be as minimal as possible, without
    any functional effects, some changes were necessary or logical.
    Eventually we should consider changing the layout and the types of
    the fields in the ThreadLocalTop to be more consistent with the other
    IsolateData entities. But this can be done in separate follow-up CLs,
    as it will cause quite a bit of churn in the code base, depending on
    how exactly we do it, and is orthogonal to this optimization.
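
The difference between the two addressing schemes shown in the assembly above can be sketched in C++. This is a simplified model, not the actual V8 classes: `ThreadLocalTopSketch`, `IndirectIsolateData`, `EmbeddedIsolateData`, the `preceding_fields` placeholder, and all helper functions are hypothetical, and the offsets have nothing to do with the real 0x8e40-style offsets. It only illustrates why a predictable (standard-layout) embedded struct lets a field be reached at a constant offset from the root pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in; field names follow those mentioned in this message.
struct ThreadLocalTopSketch {
  std::uintptr_t c_entry_fp;
  std::uintptr_t c_function;
  std::uintptr_t js_entry_sp;
};

// Old scheme (sketch): the root-relative data only stores the *address* of
// the field, so a store needs a load of that address followed by the store.
struct IndirectIsolateData {
  std::uintptr_t* c_entry_fp_address;
};

// New scheme (sketch): the struct is embedded by value, so each field sits
// at a compile-time constant offset from the root pointer.
struct EmbeddedIsolateData {
  std::uintptr_t preceding_fields[4];  // placeholder for earlier members
  ThreadLocalTopSketch thread_local_top;
};

// offsetof is only well-defined for standard-layout types, which is one way
// to read the "predictable layout" requirement.
static_assert(std::is_standard_layout<EmbeddedIsolateData>::value,
              "layout must be predictable");

constexpr std::size_t kCEntryFpOffset =
    offsetof(EmbeddedIsolateData, thread_local_top) +
    offsetof(ThreadLocalTopSketch, c_entry_fp);

// Two memory accesses: load the field address, then store through it.
inline void StoreIndirect(const IndirectIsolateData& root, std::uintptr_t fp) {
  *root.c_entry_fp_address = fp;
}

// One memory access: store at a fixed offset from the root.
inline void StoreDirect(EmbeddedIsolateData& root, std::uintptr_t fp) {
  root.thread_local_top.c_entry_fp = fp;
}

// Reads a word at a byte offset from the root, mimicking root-relative access.
inline std::uintptr_t LoadAtOffset(const EmbeddedIsolateData& root,
                                   std::size_t offset) {
  std::uintptr_t value;
  assert(offset + sizeof(value) <= sizeof(root));
  std::memcpy(&value, reinterpret_cast<const char*>(&root) + offset,
              sizeof(value));
  return value;
}
```

In this model `StoreDirect` corresponds to the single `mov` with a displacement, while `StoreIndirect` corresponds to the load into a scratch register followed by the store.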
    
    Bug: v8:8820, v8:8848, chromium:913553
    Change-Id: I4732c8e60231f0312eb7767358c48bae0338220d
    Cq-Include-Trybots: luci.chromium.try:linux-blink-rel
    Reviewed-on: https://chromium-review.googlesource.com/c/1474230
    Reviewed-by: Yang Guo <yangguo@chromium.org>
    Commit-Queue: Benedikt Meurer <bmeurer@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#59624}