• Seth Brenith's avatar
    Reland "Background merging of deserialized scripts" · 766b2a4d
    Seth Brenith authored
    This is a reland of commit e895b7af
    
    The unit test has been updated to work correctly when
    --stress-incremental-marking is enabled.
    
    Original change's description:
    > Background merging of deserialized scripts
    >
    > Recently, https://crrev.com/c/v8/v8/+/3681880 added new API functions
    > with which an embedder could request that V8 merge newly deserialized
    > script data into an existing Script from the Isolate's compilation
    > cache. This change implements those new functions. This functionality is
    > still disabled by default due to the flag
    > merge_background_deserialized_script_with_compilation_cache.
    >
    > The goal of this new functionality is to reduce memory usage when
    > multiple frames load the same script with a long delay between (long
    > enough for the script to have been evicted from Blink's in-memory cache
    > and for the top-level SharedFunctionInfo to be flushed). In that case,
    > there are two Script objects for the same script: one which was found in
    > the Isolate compilation cache (the "old" script), and one which was
    > recently deserialized (the "new" script). The new script's object graph
    > is essentially standalone: it may point to internalized strings and
    > readonly objects such as the empty feedback metadata, but otherwise
    > it is unconnected to the rest of the heap. The merging logic takes any
    > useful data from the new script's object graph and attaches it into the
    > old script's object graph, so that the new Script object and any other
    > duplicated objects can be discarded. More specifically:
    >
    > 1. If the new Script has a SharedFunctionInfo for a particular function
    >    literal, and the old Script does not, then the old Script is updated
    >    to refer to the new SharedFunctionInfo.
    > 2. If the new Script has a compiled SharedFunctionInfo for a particular
    >    function literal, and the old Script has an uncompiled
    >    SharedFunctionInfo, then the old SharedFunctionInfo is updated to
    >    point to the function_data and feedback_metadata from the new
    >    SharedFunctionInfo.
    > 3. If any used object from the new object graph points to a
    >    SharedFunctionInfo, where the old object graph contains a matching
    >    SharedFunctionInfo for the same function literal, then that pointer
    >    is updated to point to the old SharedFunctionInfo.
    >
    > The document at [0] includes diagrams showing an example merge on a very
    > small script.
    >
    > Steps 1 and 2 above are pretty simple, but step 3 requires walking a
    > possibly large set of objects, so this new API lets the embedder run
    > step 3 from a background thread. Steps 1 and 2 are performed later, on
    > the main thread.
    >
    > The next important question is: in what ways can the old script's object
    > graph be modified during the background execution of step 3, or during
    > the time after step 3 but before steps 1 and 2?
    >
    > A. SharedFunctionInfos can go from compiled to uncompiled due to
    >    flushing. This is okay; the worst outcome is that the function would
    >    need to be compiled again later. Such a risk is already present,
    >    since V8 doesn't keep IsCompiledScopes for every compiled function in
    >    a background-deserialized script.
    > B. SharedFunctionInfos can go from uncompiled to compiled due to lazy
    >    compilation. This is also okay; the merge completion logic on the
    >    main thread will just keep this lazily compiled data rather than
    >    inserting compiled data from the newly deserialized object graph.
    > C. SharedFunctionInfos can be cleared from the Script's weak array if
    >    they are no longer referenced. This is mostly okay, because any
    >    SharedFunctionInfo that is needed by the background merge is strongly
    >    referenced and therefore can't be cleared. The only problem arises if
    >    the top-level SharedFunctionInfo gets cleared, so the merge task must
    >    deliberately keep a reference to that one.
    > D. SharedFunctionInfos can be created if they are needed due to lazy
    >    compilation of a parent function. This change is somewhat troublesome
    >    because it invalidates the background thread's work and requires a
    >    re-traversal on the main thread to update any pointers that should
    >    point to this lazily compiled SharedFunctionInfo.
    >
    > At a high level, this change implements three previously unimplemented
    > functions in BackgroundDeserializeTask (in compiler.cc) and updates one:
    >
    > - BackgroundDeserializeTask::SourceTextAvailable, run on the main
    >   thread, checks whether there is a matching Script in the Isolate
    >   compilation cache which doesn't already have a top-level
    >   SharedFunctionInfo. If so, it saves that Script in a persistent
    >   handle.
    > - BackgroundDeserializeTask::ShouldMergeWithExistingScript checks
    >   whether the persistent handle from the first step exists (a fast
    >   operation which can be called from any thread).
    > - BackgroundDeserializeTask::MergeWithExistingScript, run on a
    >   background thread, performs step 3 of the merge described above and
    >   generates lists of persistent data describing how the main thread can
    >   complete the merge.
    > - BackgroundDeserializeTask::Finish is updated to perform the merge
    >   steps 1 and 2 listed above, as well as a possible re-traversal of the
    >   graph if required due to newly created SharedFunctionInfos in the old
    >   Script.
    >
    > The merge logic has nothing to do with deserialization, and indeed I
    > hope to reuse it for background compilation tasks as well, so it is all
    > contained within a new class BackgroundMergeTask (in compiler.h,cc). It
    > uses a second class, ForwardPointersVisitor (in compiler.cc) to perform
    > the object visitation that updates pointers to SharedFunctionInfos.
    >
    > [0] https://docs.google.com/document/d/1UksB5Vm7TT1-f3S9W1dK_rP9jKn_ly0WVm_UDPpWuBw/edit
    >
    > Bug: v8:12808
    > Change-Id: Id405869e9d5b106ca7afd9c4b08cb5813e6852c6
    > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3739232
    > Reviewed-by: Leszek Swirski <leszeks@chromium.org>
    > Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
    > Cr-Commit-Position: refs/heads/main@{#81941}
    
    Bug: v8:12808
    Change-Id: Id2036dfa4eba8670cac899773d7a906825fa2c50
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3787266Reviewed-by: 's avatarLeszek Swirski <leszeks@chromium.org>
    Commit-Queue: Seth Brenith <seth.brenith@microsoft.com>
    Cr-Commit-Position: refs/heads/main@{#82045}
    766b2a4d
code-serializer.cc 29.9 KB