    Reland^4 "[serializer] Allocate during deserialization" · 3c508b38
    Leszek Swirski authored
    This relands commit 3f4e9bbe,
    which was a reland of c4a062a9
    which was a reland of 28a30c57
    which was a reland of 5d7a29c9
    
    The change had an issue where embedders implementing heap tracing (e.g.
    Unified Heap with Blink) could be passed an uninitialized pointer if
    marking happened during deserialization of an object containing such a
    pointer. Because the uninitialized filler value was 0xdeadbed0, these
    embedders would then receive the value 0xdeadbed0deadbed0 as the
    'pointer', and crash on dereference.
    
    Heap tracing, however, already has special handling for null pointers,
    which also exists to deal with not-yet-initialized values. So we can
    make the uninitialized Smi filler 0x00000000, which gives such
    embedded fields a nullptr representation and makes them follow the
    normal uninitialized-value bailouts.
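
The effect of the filler change can be sketched as follows. This is a toy model, not V8 or Blink code: `Tracer`, `TraceEmbedderField`, `SlotAsPointer`, `TracedCountForFiller` and the two filler constants are hypothetical names chosen for illustration, and the only behaviour assumed is the one described above (a tracer that already bails out on null).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical filler values; the names are illustrative, not V8 identifiers.
constexpr uint64_t kOldFiller = 0xdeadbed0deadbed0ull;  // filler seen as a 'pointer'
constexpr uint64_t kNewFiller = 0;                      // all-zero Smi filler

// Stand-in for an embedder's tracing callback. It skips null pointers and
// records every other pointer it is asked to trace.
struct Tracer {
  std::vector<void*> traced;

  void TraceEmbedderField(void* field) {
    if (field == nullptr) return;  // existing bailout for not-yet-set fields
    traced.push_back(field);       // a real tracer would dereference here
  }
};

// Reinterprets a raw slot value as the pointer an embedder would receive.
inline void* SlotAsPointer(uint64_t raw) {
  return reinterpret_cast<void*>(static_cast<uintptr_t>(raw));
}

// Returns how many pointers the tracer would try to follow when it sees one
// uninitialized slot that still holds the given filler.
inline size_t TracedCountForFiller(uint64_t filler) {
  Tracer tracer;
  tracer.TraceEmbedderField(SlotAsPointer(filler));
  return tracer.traced.size();
}
```

In this model, `TracedCountForFiller(kNewFiller)` is 0 because the null check catches the slot, while `TracedCountForFiller(kOldFiller)` is 1, which corresponds to the bogus dereference described above.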
    
    In addition, this change relands the following dependent changes, which
    are follow-up performance improvements and are relanding unchanged.
    Relanding them in the same change should allow for cleaner reverts
    should they be needed.
    
    This relands commit 76ad3ab5
    [identity-map] Change resize heuristic
    
    This relands commit 77cc96aa
    [identity-map] Cache the calculated Hash
    
    This relands commit bee5b996
    [serializer] Remove Deserializer::Initialize
    
    This relands commit c8f73f22
    [serializer] Cache instance type in PostProcessNewObject
    
    This relands commit 4e7c99ab
    [identity-map] Remove double-lookups in IdentityMap
    
    Original change's description:
    > Reland^3 "[serializer] Allocate during deserialization"
    >
    > This is a reland of c4a062a9
    > which was a reland of 28a30c57
    > which was a reland of 5d7a29c9
    >
    > Fixes TSAN errors from non-atomic writes in the deserializer. Now all
    > writes are (relaxed) atomic.
    >
    > Original change's description:
    > > Reland^2 "[serializer] Allocate during deserialization"
    > >
    > > This is a reland of 28a30c57
    > > which was a reland of 5d7a29c9
    > >
    > > The crashes were from calling RegisterDeserializerFinished on a null
    > > Isolate pointer, for a deserializer that was never initialised
    > > (specifically, ReadOnlyDeserializer when ROHeap is shared).
    > >
    > > Original change's description:
    > > > Reland "[serializer] Allocate during deserialization"
    > > >
    > > > This is a reland of 5d7a29c9
    > > >
    > > > This reland shuffles around the order of checks in Heap::AllocateRawWith
    > > > to not check the new space addresses until it's known that this is a new
    > > > space allocation. This fixes an UBSan failure during read-only space
    > > > deserialization, which happens before the new space is initialized.
    > > >
    > > > It also fixes some issues discovered by --stress-snapshot, around
    > > > serializing ThinStrings (which are now elided as part of serialization),
    > > > handle counts (I bumped the maximum handle count in that check), and
    > > > clearing map transitions (the map backpointer field needed a Smi
    > > > uninitialized value check).
    > > >
    > > > Original change's description:
    > > > > [serializer] Allocate during deserialization
    > > > >
    > > > > This patch removes the concept of reservations and a specialized
    > > > > deserializer allocator, and instead makes the deserializer allocate
    > > > > directly with the Heap's Allocate method.
    > > > >
    > > > > The major consequence of this is that the GC can now run during
    > > > > deserialization, which means that:
    > > > >
    > > > >   a) Deserialized objects are visible to the GC, and
    > > > >   b) Objects that the deserializer/deserialized objects point to can
    > > > >      move.
    > > > >
    > > > > Point a) is mostly not a problem due to previous work in making
    > > > > deserialized objects "GC valid", i.e. making sure that they have a valid
    > > > > size before any subsequent allocation/safepoint. We now additionally
    > > > > have to initialize the allocated space with a valid tagged value -- this
    > > > > is a magic Smi value to keep "uninitialized" checks simple.
    > > > >
    > > > > Point b) is solved by Handlifying the deserializer. This involves
    > > > > changing any vectors of objects into vectors of Handles, and any object
    > > > > keyed map into an IdentityMap (we can't use Handles as keys because
    > > > > the object's address is no longer a stable hash).
    > > > >
    > > > > Back-references can no longer be direct chunk offsets, so instead the
    > > > > deserializer stores a Handle to each deserialized object, and the
    > > > > backreference is an index into this handle array. This encoding could
    > > > > be optimized in the future with e.g. a second pass over the serialized
    > > > > array which emits a different bytecode for objects that are and aren't
    > > > > back-referenced.
    > > > >
    > > > > Additionally, the slot-walk over objects to initialize them can no
    > > > > longer use absolute slot offsets, as again an object may move and its
    > > > > slot address would become invalid. Now, slots are walked as relative
    > > > > offsets to a Handle to the object, or as absolute slots for the case of
    > > > > root pointers. A concept of "slot accessor" is introduced to share the
    > > > > code between these two modes, and writing the slot (including write
    > > > > barriers) is abstracted into this accessor.
    > > > >
    > > > > Finally, the Code body walk is modified to deserialize all objects
    > > > > referred to by RelocInfos before doing the RelocInfo walk itself. This
    > > > > is because RelocInfoIterator uses raw pointers, so we cannot allocate
    > > > > during a RelocInfo walk.
    > > > >
    > > > > As a drive-by, the VariableRawData bytecode is tweaked to use tagged
    > > > > size rather than byte size -- the size is expected to be tagged-aligned
    > > > > anyway, so now we get an extra few bits in the size encoding.
    > > > >
    > > > > Bug: chromium:1075999
    > > > > Change-Id: I672c42f553f2669888cc5e35d692c1b8ece1845e
    > > > > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2404451
    > > > > Commit-Queue: Leszek Swirski <leszeks@chromium.org>
    > > > > Reviewed-by: Jakob Gruber <jgruber@chromium.org>
    > > > > Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
    > > > > Cr-Commit-Position: refs/heads/master@{#70229}
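
The back-reference scheme from the original description (a table of handles indexed by back-reference number, instead of direct chunk offsets) can be illustrated with a small model. This is a sketch under stated assumptions, not the deserializer's actual code: `ToyHeap`, `ToyDeserializer`, `MoveAll` and the other names are hypothetical, and a "handle" here is simply an index into a table that the simulated collector keeps up to date.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Toy object; a real heap object would carry a map, fields, and so on.
struct Object {
  int payload;
};

// Toy heap whose "handles" are indices into a table of object locations.
// The table is updated whenever the simulated GC moves objects.
class ToyHeap {
 public:
  using Handle = size_t;

  Handle Allocate(int payload) {
    slots_.push_back(std::make_unique<Object>(Object{payload}));
    return slots_.size() - 1;
  }

  Object* Deref(Handle handle) const { return slots_[handle].get(); }

  // Simulates a moving GC: every object is copied to a fresh address and the
  // handle table is updated to point at the copy.
  void MoveAll() {
    for (auto& slot : slots_) slot = std::make_unique<Object>(*slot);
  }

 private:
  std::vector<std::unique_ptr<Object>> slots_;
};

// Toy deserializer: it keeps one handle per deserialized object, and a
// back-reference is an index into that handle array.
class ToyDeserializer {
 public:
  explicit ToyDeserializer(ToyHeap* heap) : heap_(heap) {}

  // Allocates a new object and returns the back-reference index for it.
  size_t ReadNewObject(int payload) {
    back_refs_.push_back(heap_->Allocate(payload));
    return back_refs_.size() - 1;
  }

  // Resolves a back-reference through the handle, so the result is valid
  // even if the object has moved since it was deserialized.
  Object* ResolveBackReference(size_t index) const {
    return heap_->Deref(back_refs_[index]);
  }

 private:
  ToyHeap* heap_;
  std::vector<ToyHeap::Handle> back_refs_;
};
```

Because resolution always goes through the handle table, a back-reference recorded before `MoveAll()` still resolves to the right object afterwards, whereas a raw address or chunk offset captured at allocation time would not.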
    
    Bug: chromium:1075999
    Change-Id: Ib514a4ef16bd02bfb60d046ecbf8fae1ead64a98
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2452689
    Commit-Queue: Leszek Swirski <leszeks@chromium.org>
    Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
    Reviewed-by: Jakob Gruber <jgruber@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#70366}
code-serializer.h 4.19 KB