Commit ce538f70 authored by Wiktor Garbacz, committed by Commit Bot

[parser] Refactor streaming scanner streams.

Unify, simplify logic, reduce UTF8 specific handling.

The intent is also to have stream views.
Stream views can be used concurrently by multiple threads, but
only one thread may fetch new data from the underlying source.
This together with unified stream view creation is intended to be
used for parse tasks.
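
As a rough illustration of the stream-view idea described above, here is a minimal sketch. It is not the code from this commit: `SharedBuffer` and `StreamView` are hypothetical names, and the mutex-based locking is an assumption chosen for brevity. The point it shows is that each view owns only a cursor, while appending newly fetched data is left to the single thread that owns the source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; none of these are V8 classes.

// Holds the code units that have already been fetched from the source.
class SharedBuffer {
 public:
  // Appends newly fetched data. By convention, only the one thread that
  // owns the underlying source calls this.
  void Append(const std::vector<uint16_t>& chunk) {
    std::lock_guard<std::mutex> lock(mutex_);
    data_.insert(data_.end(), chunk.begin(), chunk.end());
  }

  // Copies out the code unit at `pos` if it has been fetched already.
  bool Get(size_t pos, uint16_t* out) const {
    std::lock_guard<std::mutex> lock(mutex_);
    if (pos >= data_.size()) return false;
    *out = data_[pos];
    return true;
  }

 private:
  mutable std::mutex mutex_;
  std::vector<uint16_t> data_;
};

// A view stores only its own position, so several views (possibly on
// different threads) can read the same buffered data independently.
class StreamView {
 public:
  explicit StreamView(std::shared_ptr<const SharedBuffer> buffer,
                      size_t start = 0)
      : buffer_(std::move(buffer)), pos_(start) {
    assert(buffer_ != nullptr);
  }

  // Returns the next code unit, or -1 if nothing more is buffered yet.
  int32_t Advance() {
    uint16_t unit;
    if (!buffer_->Get(pos_, &unit)) return -1;
    ++pos_;
    return unit;
  }

 private:
  std::shared_ptr<const SharedBuffer> buffer_;
  size_t pos_;
};
```

In this sketch a view never triggers a fetch itself; it simply reports that no more data is buffered, which keeps the "one fetching thread" rule easy to reason about.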

BUG=v8:6093

Change-Id: Ied8e93090c506d4735080298f0fdaeed32043915
Reviewed-on: https://chromium-review.googlesource.com/501789
Commit-Queue: Wiktor Garbacz <wiktorg@google.com>
Reviewed-by: Daniel Vogelheim <vogelheim@chromium.org>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45336}
parent e418a1e4
@@ -1266,11 +1266,6 @@ class V8_EXPORT ScriptCompiler {
      * length of the data returned. When the data ends, GetMoreData should
      * return 0. Caller takes ownership of the data.
      *
-     * When streaming UTF-8 data, V8 handles multi-byte characters split between
-     * two data chunks, but doesn't handle multi-byte characters split between
-     * more than two data chunks. The embedder can avoid this problem by always
-     * returning at least 2 bytes of data.
-     *
      * If the embedder wants to cancel the streaming, they should make the next
      * GetMoreData call return 0. V8 will interpret it as end of data (and most
      * probably, parsing will fail). The streaming task will return as soon as
......
@@ -435,6 +435,18 @@ TEST(CharacterStreams) {
   TestCharacterStreams(buffer, arraysize(buffer) - 1, 576, 3298);
 }
 
+TEST(Uft8MultipleBOMChunks) {
+  const char* chunks = "\xef\xbb\xbf\0\xef\xbb\xbf\0\xef\xbb\xbf\0a\0";
+  const uint16_t unicode[] = {0xFEFF, 0xFEFF, 97};
+  ChunkSource chunk_source(chunks);
+  std::unique_ptr<i::Utf16CharacterStream> stream(i::ScannerStream::For(
+      &chunk_source, v8::ScriptCompiler::StreamedSource::UTF8, nullptr));
+  for (size_t i = 0; i < arraysize(unicode); i++) {
+    CHECK_EQ(unicode[i], stream->Advance());
+  }
+  CHECK_EQ(i::Utf16CharacterStream::kEndOfInput, stream->Advance());
+}
+
 // Regression test for crbug.com/651333. Read invalid utf-8.
 TEST(Regress651333) {
   const uint8_t bytes[] =
......
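
The behaviour exercised by Uft8MultipleBOMChunks (only the leading BOM is dropped, later ones come through as U+FEFF) and the multi-chunk case mentioned in the GetMoreData comment can be sketched with a small incremental decoder. This is an illustrative sketch, not V8's scanner code: `ChunkedUtf8Decoder` and its methods are hypothetical, and validation is deliberately minimal (overlong forms and encoded surrogates are not rejected).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not V8 code: decodes UTF-8 delivered in arbitrary
// chunks into UTF-16 code units, carrying partial sequences across chunks.
class ChunkedUtf8Decoder {
 public:
  void Feed(const std::string& chunk, std::vector<uint16_t>* out) {
    assert(out != nullptr);
    for (unsigned char byte : chunk) {
      if (remaining_ == 0) {
        StartSequence(byte, out);
      } else if ((byte & 0xC0) == 0x80) {
        // Continuation byte: add six more bits to the pending code point.
        code_point_ = (code_point_ << 6) | (byte & 0x3F);
        if (--remaining_ == 0) Emit(code_point_, out);
      } else {
        // Broken sequence: emit U+FFFD, then reinterpret this byte.
        remaining_ = 0;
        Emit(0xFFFD, out);
        StartSequence(byte, out);
      }
    }
  }

  // Call once after the last chunk; flags a truncated trailing sequence.
  void Finish(std::vector<uint16_t>* out) {
    if (remaining_ != 0) {
      remaining_ = 0;
      Emit(0xFFFD, out);
    }
  }

 private:
  void StartSequence(unsigned char byte, std::vector<uint16_t>* out) {
    if (byte < 0x80) {
      Emit(byte, out);
    } else if ((byte & 0xE0) == 0xC0) {
      code_point_ = byte & 0x1F;
      remaining_ = 1;
    } else if ((byte & 0xF0) == 0xE0) {
      code_point_ = byte & 0x0F;
      remaining_ = 2;
    } else if ((byte & 0xF8) == 0xF0) {
      code_point_ = byte & 0x07;
      remaining_ = 3;
    } else {
      Emit(0xFFFD, out);  // Stray continuation or invalid lead byte.
    }
  }

  void Emit(uint32_t cp, std::vector<uint16_t>* out) {
    const bool first = !seen_output_;
    seen_output_ = true;
    if (first && cp == 0xFEFF) return;  // Drop only a leading BOM.
    if (cp > 0x10FFFF) cp = 0xFFFD;     // Outside the Unicode range.
    if (cp < 0x10000) {
      out->push_back(static_cast<uint16_t>(cp));
    } else {
      cp -= 0x10000;  // Encode as a surrogate pair.
      out->push_back(static_cast<uint16_t>(0xD800 + (cp >> 10)));
      out->push_back(static_cast<uint16_t>(0xDC00 + (cp & 0x3FF)));
    }
  }

  uint32_t code_point_ = 0;   // Bits collected for the pending sequence.
  int remaining_ = 0;         // Continuation bytes still expected.
  bool seen_output_ = false;  // True once any code point was decoded.
};
```

Because the pending code point and the count of expected continuation bytes live in the decoder rather than in a per-chunk buffer, a character split across any number of chunks (even one byte per chunk) decodes the same way as an unsplit one.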