Commits · 8e11cc395afeca1cd3f0c5038f00e734779f4a23 · Linshizhi / V8

28 May, 2019 1 commit

[cleanup] Remove 'typedef struct' and 'typedef enum' · 0b14b8a1

Just use standard C++ syntax to define structs and enums instead.

R=ahaas@chromium.org

Bug: v8:9183
Change-Id: Ibae1643bd1dc74267cdd14ec45a36fc65bf0ab4b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1631410Reviewed-by: Andreas Haas <ahaas@chromium.org>
Reviewed-by: Yang Guo <yangguo@chromium.org>
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Cr-Commit-Position: refs/heads/master@{#61889}

0b14b8a1

27 May, 2019 1 commit

Move unittest files · 24a51e1e

Yang Guo authored 5 years ago

R=sigurds@chromium.org

Bug: v8:9247
Change-Id: I25743f048e3e6cd22a18e003e77c8b78f147b630
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1630680Reviewed-by: Sigurd Schneider <sigurds@chromium.org>
Commit-Queue: Yang Guo <yangguo@chromium.org>
Cr-Commit-Position: refs/heads/master@{#61836}

24a51e1e

23 May, 2019 1 commit

Move utility code to src/utils · dec3298d

Yang Guo authored 5 years ago

NOPRESUBMIT=true
TBR=mstarzinger@chromium.org

Bug: v8:9247
Change-Id: I4cd6b79a1c2cba944f6f23caed59d4f1a4ee358b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1624217
Commit-Queue: Yang Guo <yangguo@chromium.org>
Reviewed-by: Igor Sheludko <ishell@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Reviewed-by: Jaroslav Sevcik <jarin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#61790}

dec3298d

21 May, 2019 1 commit

Yang Guo authored 5 years ago

Bug: v8:9247
Change-Id: I9bcf2694b449f79cdbe03f5fde59cb21b8cad418
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1619758
Commit-Queue: Yang Guo <yangguo@chromium.org>
Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#61676}

be014256

26 Apr, 2019 1 commit

[runtime] Simplify/unify utf8 handling · b7ed86ec

Toon Verwaest authored 5 years ago

- Removes Utf8Iterator
- Replaces Utf8Decoder with something based on ValueOfIncremental +
  NonAsciiStart and moves it into v8/internal.
- Internalizes utf8 strings by first converting them to one or two byte
- Removes IsUtf8EqualsTo and replaces current uses with IsOneByteEqualsTo

Tbr: jgruber@chromium.org
Change-Id: I16e08d910a745e78d6fd465718fc69ad731fd217
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1585840
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Reviewed-by: Igor Sheludko <ishell@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#61049}

b7ed86ec

06 Feb, 2019 1 commit

Fix & reland "[utf8] Rewrite NewStringFromUtf8 using Utf8::ValueOfIncremental" · e0f0d60c

Toon Verwaest authored 5 years ago

Change-Id: I2c8bd545dc606d76603bdf73f1ea54d4c04842c1
Reviewed-on: https://chromium-review.googlesource.com/c/1456101Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Cr-Commit-Position: refs/heads/master@{#59399}

e0f0d60c

05 Feb, 2019 1 commit

Revert "[utf8] Rewrite NewStringFromUtf8 using Utf8::ValueOfIncremental" · ec30cf47

Maya Lekova authored 5 years ago

This reverts commit 73dd9b55.

Reason for revert: Broke telemetry layout tests - https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7-rel/9936 as can be seen in this roll - https://chromium-review.googlesource.com/c/chromium/src/+/1454259

Original change's description:
> [utf8] Rewrite NewStringFromUtf8 using Utf8::ValueOfIncremental
> 
> This is 3-4x faster than using the Utf8Decoder. This matters for proper
> parse-time measurements using d8.
> 
> Change-Id: I9870e9fbe400ec022a6eeb20491c80a2a32f8519
> Reviewed-on: https://chromium-review.googlesource.com/c/1451827
> Commit-Queue: Toon Verwaest <verwaest@chromium.org>
> Reviewed-by: Leszek Swirski <leszeks@chromium.org>
> Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#59347}

TBR=ulan@chromium.org,leszeks@chromium.org,verwaest@chromium.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Change-Id: I3f8faebb61c19a41ee496a571228f53c0d5fc8dd
Reviewed-on: https://chromium-review.googlesource.com/c/1454495Reviewed-by: Maya Lekova <mslekova@chromium.org>
Commit-Queue: Yang Guo <yangguo@chromium.org>
Cr-Commit-Position: refs/heads/master@{#59378}

ec30cf47

04 Feb, 2019 1 commit

[utf8] Rewrite NewStringFromUtf8 using Utf8::ValueOfIncremental · 73dd9b55

Toon Verwaest authored 5 years ago

This is 3-4x faster than using the Utf8Decoder. This matters for proper
parse-time measurements using d8.

Change-Id: I9870e9fbe400ec022a6eeb20491c80a2a32f8519
Reviewed-on: https://chromium-review.googlesource.com/c/1451827
Commit-Queue: Toon Verwaest <verwaest@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#59347}

73dd9b55

15 Nov, 2018 1 commit

[base] Introduce VectorOf helper · 3ad032b7

Clemens Hammacher authored 6 years ago

We often need to create a {Vector} view of data owned by a container
like {std::vector}. The canonical way to do this is this:
Vector<T>{vec.data(), vec.size()}

This pattern is repeating information which can be deduced
automatically, like the type T.

This CL introduces a {VectorOf} helper which can construct a {Vector}
for any container providing a {data()} and {size()} accessor, and uses
it to replace the pattern above.

R=ishell@chromium.org

Bug: v8:8238
Change-Id: Ib3a11662acc82cb83f2b4afd07ba88e579d71dba
Reviewed-on: https://chromium-review.googlesource.com/c/1337584Reviewed-by: Igor Sheludko <ishell@chromium.org>
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Cr-Commit-Position: refs/heads/master@{#57538}

3ad032b7

20 Feb, 2018 1 commit

Consolidate UTF-8 Vector<char> to uc16 decoding into Iterator · f6b6f71b

Justin Ridgewell authored 6 years ago

Too many files know how to deal with decoding, counting, and splitting UTF-8
into uc16 chars. This consolidates several callers who deal with full
(Vector<char>, not streaming) bytes by using a UTF-8 Iterator to decode bytes
into individual uc16 chars.

R=marja@chromium.org

Bug: 
Change-Id: Ia36df3e8c1abd0398415ad23a474557c71c19a01
Reviewed-on: https://chromium-review.googlesource.com/831093Reviewed-by: Marja Hölttä <marja@chromium.org>
Commit-Queue: Justin Ridgewell <jridgewell@google.com>
Cr-Commit-Position: refs/heads/master@{#51405}

f6b6f71b

11 Dec, 2017 1 commit

Implement DFA Unicode Decoder · cedec225

Justin Ridgewell authored 7 years ago

This is a separation of the DFA Unicode Decoder from
https://chromium-review.googlesource.com/c/v8/v8/+/789560

I attempted to make the DFA's table a bit more explicit in this CL. Still, the
linter prevents me from letting me present the array as a "table" in source
code. For a better representation, please refer to
https://docs.google.com/spreadsheets/d/1L9STtkmWs-A7HdK5ZmZ-wPZ_VBjQ3-Jj_xN9c6_hLKA

- - - - -

Now for a big copy-paste from 789560:

Essentially, reworks a standard FSM (imagine an
array of structs) and flattens it out into a single-dimension array.
Using Table 3-7 of the Unicode 10.0.0 standard (page 126 of
http://www.unicode.org/versions/Unicode10.0.0/ch03.pdf), we can nicely
map all bytes into one of 12 character classes:

00. 0x00-0x7F
01. 0x80-0x8F (split from general continuation because this range is not
    valid after a 0xF0 leading byte)
02. 0x90-0x9F (split from general continuation because this range is not
    valid after a 0xE0 nor a 0xF4 leading byte)
03. 0xA0-0xBF (the rest of the continuation range)
04. 0xC0-0xC1, 0xF5-0xFF (the joined range of invalid bytes, notice this
    includes 255 which we use as a known bad byte during hex-to-int
        decoding)
05. 0xC2-0xDF (leading bytes which require any continuation byte
    afterwards)
06. 0xE0 (leading byte which requires a 0xA0-0xBF afterwards then any
    continuation byte after that)
07. 0xE1-0xEC, 0xEE-0xEF (leading bytes which requires any continuation
    afterwards then any continuation byte after that)
08. 0xED (leading byte which requires a 0x80-0x9F afterwards then any
    continuation byte after that)
09. 0xF1-F3 (leading bytes which requires any continuation byte
    afterwards then any continuation byte then any continuation byte)
10. 0xF0 (leading bytes which requires a 0x90-0xBF afterwards then any
    continuation byte then any continuation byte)
11. 0xF4 (leading bytes which requires a 0x80-0x8F afterwards then any
    continuation byte then any continuation byte)

Note that 0xF0 and 0xF1-0xF3 were swapped so that fewer bytes were
needed to represent the transition state ("9, 10, 10, 10" vs.
"10, 9, 9, 9").

Using these 12 classes as "transitions", we can map from one state to
the next. Each state is defined as some multiple of 12, so that we're
always starting at the 0th column of each row of the FSM. From each
state, we add the transition and get a index of the new row the FSM is
entering.

If at any point we encounter a bad byte, the state + bad-byte-transition
is guaranteed to map us into the first row of the FSM (which contains no
valid exiting transitions).

The key differences from Björn's original (or his self-modified) DFA is
the "bad" state is now mapped to 0 (or the first row of the FSM) instead
of 12 (the second row). This saves ~50 bytes when gzipping, and also
speeds up determining if a string is properly encoded (see his sample
code at http://bjoern.hoehrmann.de/utf-8/decoder/dfa/#performance).

Finally, I've replace his ternary check with an array access, to make
the algorithm branchless. This places a requirement on the caller to 0
out the code point between successful decodings, which it could always
have done because it's already branching.

R=marja@google.com

Bug: 
Change-Id: I574f208a84dc5d06caba17127b0d41f7ce1a3395
Reviewed-on: https://chromium-review.googlesource.com/805357
Commit-Queue: Justin Ridgewell <jridgewell@google.com>
Reviewed-by: Marja Hölttä <marja@chromium.org>
Reviewed-by: Mathias Bynens <mathias@chromium.org>
Cr-Commit-Position: refs/heads/master@{#50012}

cedec225

02 Dec, 2017 1 commit

Normalize casing of hexadecimal digits · 822be9b2

Mathias Bynens authored 7 years ago

This patch normalizes the casing of hexadecimal digits in escape
sequences of the form `\xNN` and integer literals of the form
`0xNNNN`.

Previously, the V8 code base used an inconsistent mixture of uppercase
and lowercase.

Google’s C++ style guide uses uppercase in its examples:
https://google.github.io/styleguide/cppguide.html#Non-ASCII_Characters

Moreover, uppercase letters more clearly stand out from the lowercase
`x` (or `u`) characters at the start, as well as lowercase letters
elsewhere in strings.

BUG=v8:7109
TBR=marja@chromium.org,titzer@chromium.org,mtrofin@chromium.org,mstarzinger@chromium.org,rossberg@chromium.org,yangguo@chromium.org,mlippautz@chromium.org
NOPRESUBMIT=true

Cq-Include-Trybots: master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.linux:linux_chromium_rel_ng
Change-Id: I790e21c25d96ad5d95c8229724eb45d2aa9e22d6
Reviewed-on: https://chromium-review.googlesource.com/804294
Commit-Queue: Mathias Bynens <mathias@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Cr-Commit-Position: refs/heads/master@{#49810}

822be9b2

29 Sep, 2017 1 commit

[unicode] Add tests for UTF-8 decoders + minor cleanups. · fcb89f55

Marja Hölttä authored 7 years ago

Verify that both UTF-8 decoders (incremental and non-incremental one) match the
expectations.

Also cleanup / harden the UTF-8 handling code, as suggested in
https://chromium-review.googlesource.com/c/v8/v8/+/671020/ .


BUG=chromium:765608

Change-Id: I6344d62ca15b75ac8e333421c94c4aa35ab8190d
Reviewed-on: https://chromium-review.googlesource.com/681217
Commit-Queue: Marja Hölttä <marja@chromium.org>
Reviewed-by: Camillo Bruni <cbruni@chromium.org>
Cr-Commit-Position: refs/heads/master@{#48229}

fcb89f55

22 Nov, 2016 1 commit

Fix out-of-range access in unibrow::Utf8::CalculateValue. · 9d524bd3

jbroman authored 8 years ago

This code should not access bytes out of the permitted range in order to check
the range of a possible UTF-8 value. Instead, the length check should occur
before such checks.

BUG=chromium:667260, chromium:662822

Review-Url: https://codereview.chromium.org/2520053003
Cr-Commit-Position: refs/heads/master@{#41165}

9d524bd3