Commits · 3fd1f8a7b4b9a1381f46f724973f14f11126765f · Linshizhi / V8

31 Jul, 2018 1 commit

Fix canonicalization of grandfathered tags · f24b575d

Jungshik Shin authored 6 years ago

ICU maps a few grandfathered tags to made-up values even when there
is no preferred value entry in the IANA language tag registry. [1]

1. Check for grandfathered tags without a preferred value upfront
   and return them as they are.
2. Lowercase the input before the structural validity check. This
   simplifies both the check for grandfathered tags without a preferred
   value and the regexps used in the structural validity check.

intl/general/grandfathered_tags_without_preferred_value is added and
intl/general/language_tags_with_preferred_values is changed to check
for case-insensitive matching of grandfathered tags.
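
The two steps above can be sketched roughly as follows. This is an
illustrative Python sketch, not the V8 code: the set of five tags is taken
from the IANA registry as understood here, `full_canonicalize` is a
hypothetical callback standing in for the ICU-backed path, and returning the
lowercased form is an assumption.

```python
# Grandfathered tags that carry no Preferred-Value in the IANA registry
# (assumed list; verify against the registry linked in [1]).
GRANDFATHERED_WITHOUT_PREFERRED_VALUE = frozenset(
    {"cel-gaulish", "i-default", "i-enochian", "i-mingo", "zh-min"}
)


def canonicalize_tag(tag, full_canonicalize):
    """Canonicalize `tag`; `full_canonicalize` is a hypothetical stand-in
    for the full (ICU-backed) canonicalization."""
    lowered = tag.lower()  # step 2: lowercase before any other check
    if lowered in GRANDFATHERED_WITHOUT_PREFERRED_VALUE:
        return lowered  # step 1: return upfront, never reaching ICU
    return full_canonicalize(lowered)
```

Because the input is lowercased first, "I-Default" and "i-default" take the
same early return.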

[1] https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry

Bug: v8:7669
Test: test262/intl402/Intl/getCanonicalLocales/preferred-grandfathered
Test: intl/general/grandfathered_tags_without_preferred_value
Cq-Include-Trybots: luci.v8.try:v8_linux_noi18n_rel_ng
Cq-Include-Trybots: luci.chromium.try:linux_chromium_rel_ng
Change-Id: Ie0520de8712928300fd71fe152909789483ec256
Reviewed-on: https://chromium-review.googlesource.com/1156529
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Reviewed-by: Sathya Gunasekaran <gsathya@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54829}

27 Jul, 2018 1 commit

[Intl] Add tests for duplicate subtag detection. · 47922400

Brian Stell authored 6 years ago

Also removed an obsolete test that is covered by test262/intl402
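
For context, BCP 47 forbids repeating a variant subtag or an extension
singleton within one tag. A minimal Python sketch of such a check is shown
below; it is not the test added by this commit, the function names are
hypothetical, and it assumes the tag is otherwise well-formed.

```python
def is_variant(subtag):
    """Variant shape: 5-8 alphanumerics, or 4 starting with a digit."""
    if not (subtag.isascii() and subtag.isalnum()):
        return False
    return 5 <= len(subtag) <= 8 or (len(subtag) == 4 and subtag[0].isdigit())


def has_duplicate_subtag(tag):
    """Return True if a variant or an extension singleton repeats."""
    seen_variants = set()
    seen_singletons = set()
    in_extension = False
    for subtag in tag.lower().split("-")[1:]:  # skip the primary language
        if subtag == "x":
            break  # private use: anything may follow
        if len(subtag) == 1:
            if subtag in seen_singletons:
                return True
            seen_singletons.add(subtag)
            in_extension = True
        elif not in_extension and is_variant(subtag):
            if subtag in seen_variants:
                return True
            seen_variants.add(subtag)
    return False
```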

Bug: v8:7954, v8:5751

Cq-Include-Trybots: luci.v8.try:v8_linux_noi18n_rel_ng
Change-Id: I41113653cd27c165e6f0a52e4b63bb9ddc553cba
Reviewed-on: https://chromium-review.googlesource.com/1150453
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Jungshik Shin <jshin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#54757}

11 Jun, 2018 1 commit

Add more testing of SupportedLocalesOf() · b365b641

Brian Stell authored 6 years ago

R=gsathya@chromium.org, littledan@chromium.org

Cq-Include-Trybots: luci.v8.try:v8_linux_noi18n_rel_ng
Change-Id: Ib3806f2b8d6f8adf61fe0dc8c327c461e1d20304
Reviewed-on: https://chromium-review.googlesource.com/1095558
Commit-Queue: Brian Stell <bstell@chromium.org>
Commit-Queue: Sathya Gunasekaran <gsathya@chromium.org>
Reviewed-by: Sathya Gunasekaran <gsathya@chromium.org>
Cr-Commit-Position: refs/heads/master@{#53653}

26 Apr, 2018 1 commit

Fix the fast path for locale canonicalization · 919270e0

Jungshik Shin authored 6 years ago

Not all 2 or 3 letter language codes are canonical. Some of them need
to be canonicalized.

Specifically, exclude {jw,ji,iw,in} and all three-letter codes from the
fast path except for 'fil'.

{jw,ji,iw,in} are deprecated ISO 639 codes for
{Javanese, Yiddish, Hebrew, Indonesian}. They should be
canonicalized to {jv,yi,he,id}. So, do not return early
in the fast path, but pass it down to the full canonicalization.

In addition, there are 70+ deprecated 3-letter codes that need to be
replaced by their modern equivalents. Instead of checking and replacing
in v8, just pass them to ICU to handle.
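
The fast-path rule described above can be sketched as below. This is a
Python illustration rather than the V8 implementation; the function name is
hypothetical and the input is assumed to be a bare, already-lowercased
language code.

```python
# Deprecated ISO 639 codes and their replacements, as listed above.
DEPRECATED_TWO_LETTER = {"jw": "jv", "ji": "yi", "iw": "he", "in": "id"}


def can_take_fast_path(tag):
    """Return True if `tag` can be returned as-is without full canonicalization."""
    if not (tag.isascii() and tag.isalpha() and tag.islower()):
        return False  # anything but a bare lowercase code goes to the full path
    if len(tag) == 2:
        return tag not in DEPRECATED_TWO_LETTER
    # Three-letter codes are left to ICU, except for 'fil'.
    return tag == "fil"
```

Everything for which this returns False is handed to the full
canonicalization, where ICU performs the replacement.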

Along with the following ICU change, two more tests will pass.

https://chromium-review.googlesource.com/c/chromium/deps/icu/+/1026797

These two tests still fail because of the disagreement between ICU and the test
expectations about 5 grandfathered tags with no preferred value (e.g.
i-default, zh-min, cel-gaulish).

'intl402/Intl/getCanonicalLocales/canonicalized-tags'
'intl402/Intl/getCanonicalLocales/preferred-grandfathered'

Bug: v8:5693, v8:7669
Test: test262/intl402/language-tags-canonicalized.js
Test: test262/intl402/Intl/preferred-variants.js
Test: intl/general/language_tags_with_preferred_values.js
Cq-Include-Trybots: luci.v8.try:v8_linux_noi18n_rel_ng
Change-Id: Ide7e9c90ac046859604c7b71c641f84ce9c64be5
Reviewed-on: https://chromium-review.googlesource.com/1023379
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#52823}

12 Oct, 2017 1 commit

Correct the misuse of uloc_{to,from}LanguageTag · 69bd294a

Jungshik Shin authored 7 years ago

- remove unused Runtime_GetLanguageTagVariants
- add a test for another related bug (chromium:770452) as well as for
  chromium:770450.

Bug: chromium:770450, chromium:770452
Test: intl/general/invalid-locale.js
Cq-Include-Trybots: master.tryserver.v8:v8_linux_noi18n_rel_ng
Change-Id: I4496a4a5421000faa0e37aed85fea21ceb487998
Reviewed-on: https://chromium-review.googlesource.com/710816
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#48483}

02 Aug, 2017 1 commit

Fix common misspellings · b41f857b

Julien Brianceau authored 7 years ago

Bug: chromium:750830
Cq-Include-Trybots: master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.linux:linux_chromium_rel_ng;master.tryserver.v8:v8_linux_noi18n_rel_ng
Change-Id: Icab7b5a1c469d5e77d04df8bfca8319784e92af4
Reviewed-on: https://chromium-review.googlesource.com/595655
Commit-Queue: Julien Brianceau <jbriance@cisco.com>
Reviewed-by: Yang Guo <yangguo@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Reviewed-by: Clemens Hammacher <clemensh@chromium.org>
Reviewed-by: Daniel Ehrenberg <littledan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#47072}

29 Jun, 2017 1 commit

Remove icu_case_mapping flag · 1163aba7

Jungshik Shin authored 7 years ago

icu-case-mapping was shipped a few months ago. By dropping
the flag, unibrow's case conversion code won't be included
by default because V8_INTL_SUPPORT is on by default.

BUG=v8:4477, v8:4476
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*,
     mjsunit/string-case, intl/general/case*

Cq-Include-Trybots: master.tryserver.v8:v8_linux_noi18n_rel_ng
Change-Id: I78be9cc64b4588bc5af79ecbbadf93af6e84a1df
Reviewed-on: https://chromium-review.googlesource.com/534541
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Reviewed-by: Benedikt Meurer <bmeurer@chromium.org>
Reviewed-by: Daniel Ehrenberg <littledan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#46304}

04 May, 2017 1 commit

Handle private / grandfathered tags gracefully for case-conversion · 6545911f

Jungshik Shin authored 7 years ago

Bug=v8:6083
Test=intl/general/case-mapping.js

Change-Id: I254c54520262298d6843948654d1dc4583b0c245
Reviewed-on: https://chromium-review.googlesource.com/496886
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45115}

13 Jan, 2017 1 commit

Fix two DCHECK failures in ICU case mapping code · ac9e6285

jshin authored 8 years ago

1. DCHECK in runtime-i18n.cc for case mapping was wrong to
   assume that the longest primary language tag is 3 characters.
   BCP 47 actually allows up to 8 characters.

2. GetFlatContent() was called on a string without flattening it first.
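
The length bound from item 1 can be illustrated with a short Python sketch.
It is not the V8 DCHECK; the helper name is hypothetical, and it deliberately
rejects private-use and grandfathered tags such as "x-foo" or "i-default",
whose first subtag is a single letter.

```python
import re

# BCP 47 primary language subtag: 2 to 8 ASCII letters.
PRIMARY_LANGUAGE = re.compile(r"[A-Za-z]{2,8}")


def primary_language_subtag(tag):
    """Return the lowercased primary language subtag of `tag`."""
    primary = tag.split("-", 1)[0]
    if not PRIMARY_LANGUAGE.fullmatch(primary):
        raise ValueError("not a 2-8 letter primary language subtag: %r" % primary)
    return primary.lower()
```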

BUG=680314,680464
TEST=intl/general/case-mapping (see also the bugs)

Review-Url: https://codereview.chromium.org/2629763003
Cr-Commit-Position: refs/heads/master@{#42343}

23 Dec, 2016 1 commit

[intl] Add new semantics + compat fallback to Intl constructor · b0a09d78

littledan authored 8 years ago

ECMA 402 v2 made Intl constructors more strict in terms of how they would
initialize objects, refusing to initialize objects which have already
been constructed. However, when Chrome tried to ship these semantics,
we ran into web compatibility issues.

This patch tries to square the circle and implement the simpler v2 object
semantics while including a compatibility workaround to allow objects to
sort of be initialized later, storing the real underlying Intl object
in a symbol-named property.

The new semantics are described in this PR against the ECMA 402 spec:
https://github.com/tc39/ecma402/pull/84

BUG=v8:4360, v8:4870
LOG=Y

Review-Url: https://codereview.chromium.org/2582993002
Cr-Commit-Position: refs/heads/master@{#41943}

19 Dec, 2016 1 commit

Optimize case conversion with icu_case_mapping · af38272d

jshin authored 8 years ago

Use FastAsciiConvert (as used by Unibrow) for i18n-aware
case conversion with --icu_case_mapping.

Move FastAsciiConvert to src/string-case.cc so that it can be used
by both runtime-{string,i18n}.

Add more tests.

BUG=v8:4477,v8:4476
TEST=intl/general/case*

Review-Url: https://codereview.chromium.org/2533983006
Cr-Commit-Position: refs/heads/master@{#41821}

28 Nov, 2016 1 commit

Fix the uppercasing of U+00E7(ç) and U+00F7(÷) · 2f5da9a5

jshin authored 8 years ago

Due to a typo in runtime-i18n.js, 'ç'(U+00E7) was not uppercased while
'÷'(U+00F7) was incorrectly uppercased to '×'(U+00D7).

Add a comprehensive test for the Latin-1 Supplement block (U+00A0 ~ U+00FF).
(These characters are special-cased for speed, so the whole range needs a test.)
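
A special-cased uppercasing of this block might look like the Python sketch
below. It is an illustration only, not the V8 fast path: the function name is
hypothetical, and the three characters whose uppercase form leaves Latin-1
are handled inline here, whereas a real fast path would more likely bail out
to the general code.

```python
def upper_latin1_supplement(ch):
    """Uppercase one character in U+00A0..U+00FF."""
    cp = ord(ch)
    if cp == 0xDF:
        return "SS"      # sharp s expands to two letters
    if cp == 0xFF:
        return "\u0178"  # y with diaeresis: uppercase is outside Latin-1
    if cp == 0xB5:
        return "\u039c"  # micro sign: uppercase is Greek capital mu
    # U+00E0..U+00FE map to U+00C0..U+00DE, except the division sign U+00F7,
    # which has no case and must not become the multiplication sign U+00D7.
    if 0xE0 <= cp <= 0xFE and cp != 0xF7:
        return chr(cp - 0x20)
    return ch
```

Comparing the result with Python's own `str.upper()` for every code point in
the range is one way to get the comprehensive coverage mentioned above.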

TEST=intl/general/case-mapping
BUG=v8:5681

Review-Url: https://codereview.chromium.org/2533033003
Cr-Commit-Position: refs/heads/master@{#41331}

15 Nov, 2016 1 commit

Use a regular ICU API for el-Upper · 4f224b39

jshin authored 8 years ago

ICU now supports uppercasing in Greek via its regular uppercasing API.
So, there's no need to use a slow transliteration API for uppercasing
in Greek.

This CL includes rolling ICU to ICU 58.1.

Besides, drop intl402/Intl/getCanonicalLocales/weird-cases from
test262.status because it passes now with ICU 58.1.

BUG=chromium:637001,v8:5012

Review-Url: https://codereview.chromium.org/2491333003
Cr-Commit-Position: refs/heads/master@{#41009}

18 Aug, 2016 1 commit

Expose getCanonicalLocales() for Intl object. · 520f38fc

jshin authored 8 years ago

Also add a test for the return object of getCanonicalLocaleList().

See https://github.com/tc39/test262/issues/745 for more details.

BUG=v8:5012
TEST=test262/intl402/Intl/getCanonicalLocales/*
TEST=intl/general/getCanonicalLocales

Review-Url: https://codereview.chromium.org/2239523002
Cr-Commit-Position: refs/heads/master@{#38733}

11 Aug, 2016 1 commit

Revert of Throw when case mapping result > max string length (patchset #3... · 08f7c10e

machenbach authored 8 years ago

Revert of Throw when case mapping result > max string length (patchset #3 id:40001 of https://codereview.chromium.org/2236593002/ )

Reason for revert:
The test is very flaky and, on many configurations, ranks among the top 10 slowest tests:

https://build.chromium.org/p/client.v8.ports/builders/V8%20Arm/builds/845
https://build.chromium.org/p/client.v8/builders/V8%20Win32%20-%20nosnap%20-%20shared/builds/15418
https://build.chromium.org/p/client.v8/builders/V8%20Linux/builds/12369/steps/Check/logs/durations

Original issue's description:
> Throw when case mapping result > max string length
>
> Throw 'Range Error: invalid string length' when the result of
> case mapping is longer than the max string length (kMaxLength in
> objects.h = (1 << 28) - 16).
>
> This is for case mapping with ICU.
>
> BUG=v8:5271
> TEST=intl/general/case-mapping.js with --icu_case_mapping
>
> Committed: https://crrev.com/c7a2046670468b900b9dbbb4ce45beb5e0e717fd
> Cr-Commit-Position: refs/heads/master@{#38565}

TBR=littledan@chromium.org,jshin@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=v8:5271

Review-Url: https://codereview.chromium.org/2236393002
Cr-Commit-Position: refs/heads/master@{#38582}

10 Aug, 2016 1 commit

Throw when case mapping result > max string length · c7a20466

jshin authored 8 years ago

Throw 'Range Error: invalid string length' when the result of
case mapping is longer than the max string length (kMaxLength in
objects.h = (1 << 28) - 16).

This is for case mapping with ICU.
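
The length check can be sketched in Python as below. This is not the V8
code: the function name is hypothetical, the constant is read as
(1 << 28) - 16 (an assumption about the intended grouping, since
1 << 28 - 16 would otherwise evaluate to 1 << 12), and Python counts code
points where V8 counts UTF-16 code units.

```python
K_MAX_LENGTH = (1 << 28) - 16  # 268435440


def checked_upper(text, max_length=K_MAX_LENGTH):
    """Uppercase `text`, refusing results longer than `max_length`."""
    result = text.upper()  # may be longer than the input, e.g. sharp s -> "SS"
    if len(result) > max_length:
        raise ValueError("invalid string length")
    return result
```

The check has to run on the result rather than the input because case
mapping can grow a string.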

BUG=v8:5271
TEST=intl/general/case-mapping.js with --icu_case_mapping

Review-Url: https://codereview.chromium.org/2236593002
Cr-Commit-Position: refs/heads/master@{#38565}

11 May, 2016 1 commit

Use ICU case conversion/transliterator for case conversion · b348d47b

jshin authored 8 years ago

When I18N is enabled, use ICU's case conversion API and transliteration
API [1] to implement String.prototype.to{Upper,Lower}Case and
String.prototype.toLocale{Upper,Lower}Case.

* ICU-based case conversion was implemented in runtime-i18n.cc/i18n.js
* The above 4 functions are overridden with those in i18n.js when
  --icu_case_mapping flag is turned on. To control the override by the flag,
  they're overridden in icu-case-mapping.js

Previously, toLocale{U,L}Case just called to{U,L}Case so that they didn't
support locale-sensitive case conversion for Turkic languages (az, tr),
Greek (el) and Lithuanian (lt).

Before ICU APIs for the most general case are called, a fast path for Latin-1
is tried. It's taken from Blink and adapted as necessary. This fast path
is always tried for to{U,L}Case. For toLocale{U,L}Case, it's only taken
when a locale (explicitly specified or default) is not in {az, el, lt, tr}.
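
The locale gate for the fast path, and the reason Turkic locales are
excluded, can be illustrated with the Python sketch below. The names are
hypothetical and this is not the V8 or ICU code; `turkic_upper` covers only
the dotted/dotless i rule of az and tr, not the Greek or Lithuanian rules.

```python
# Languages whose case conversion differs from the locale-independent rules.
SPECIAL_CASING_LANGUAGES = frozenset({"az", "el", "lt", "tr"})


def uses_latin1_fast_path(locale):
    """Return True if toLocale{Upper,Lower}Case may try the Latin-1 fast path."""
    language = locale.split("-", 1)[0].lower()
    return language not in SPECIAL_CASING_LANGUAGES


def turkic_upper(text):
    """Uppercase with the Turkic rule: dotted i becomes U+0130."""
    # Dotless i (U+0131) already maps to plain "I" under the default rules.
    return text.replace("i", "\u0130").upper()
```

With the default rules "istanbul" uppercases to "ISTANBUL", while the Turkic
rule keeps the dot on the initial letter.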

With these changes, a build with --icu_case_mapping=true passes a bunch
of tests in test262/intl402/Strings/* and intl/* that failed before.

Handling of pure ASCII strings (aligned at word boundary) is not as fast
as Unibrow's implementation that uses word-by-word case conversion. OTOH,
Latin-1 input handling is faster than Unibrow. General Unicode input
handling is slower but more accurate.

See https://docs.google.com/spreadsheets/d/1KJCJxKc1FxFXjwmYqABS0_2cNdPetvnd8gY8_HGSbrg/edit?usp=sharing for the benchmark.

This CL started with http://crrev.com/1544023002#ps200001 by littledan@,
but has changed significantly since.

[1] See why transliteration API is needed for uppercasing in Greek.
    http://bugs.icu-project.org/trac/ticket/10582

R=yangguo
BUG=v8:4476,v8:4477
LOG=Y
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*, mjsunit/string-case,
     intl/general/case*

Review-Url: https://codereview.chromium.org/1812673005
Cr-Commit-Position: refs/heads/master@{#36187}

10 Oct, 2014 1 commit

Allow identifier code points from supplementary multilingual planes. · 0dd69ec4

yangguo@chromium.org authored 10 years ago

ES5.1 section 6 ("Source Text"):
"Throughout the rest of this document, the phrase “code unit” and the
word “character” will be used to refer to a 16-bit unsigned value
used to represent a single 16-bit unit of text."

This changed in ES6 draft section 10.1 ("Source Text"):
"The ECMAScript code is expressed using Unicode, version 5.1 or later.
ECMAScript source text is a sequence of code points. All Unicode code
point values from U+0000 to U+10FFFF, including surrogate code points,
may occur in source text where permitted by the ECMAScript grammars."

This patch is to reflect this spec change.
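
The difference between the two readings of source text can be shown with a
small Python sketch that turns UTF-16 code units into code points. It is an
illustration only, not the V8 scanner, and the function name is hypothetical.

```python
def code_points_from_utf16(units):
    """Combine surrogate pairs in a list of 16-bit code units into code points."""
    points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_lead = 0xD800 <= unit <= 0xDBFF
        has_trail = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_lead and has_trail:
            # A lead/trail pair encodes one supplementary-plane code point.
            points.append(0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            # BMP unit or lone surrogate: kept as its own code point.
            points.append(unit)
            i += 1
    return points
```

Under the ES5.1 reading, the pair 0xD840 0xDC00 is two "characters"; under
the ES6 reading it is the single code point U+20000, which can then be
tested against the identifier grammar as a whole.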

BUG=v8:3617
LOG=Y
R=jochen@chromium.org

Review URL: https://codereview.chromium.org/640193002

git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24510 ce2b1a6d-e550-0410-aec6-3dcde31c8c00

01 Aug, 2013 1 commit

Remove test that v8Intl symbol exists, as we don't define it anymore. · 8bee9f0c

jochen@chromium.org authored 11 years ago

R=jkummerow@chromium.org

Review URL: https://codereview.chromium.org/21511002

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16013 ce2b1a6d-e550-0410-aec6-3dcde31c8c00

10 Jul, 2013 1 commit

Import intl test suite from v8-i18n project · c61c74d2

jochen@chromium.org authored 11 years ago

BUG=v8:2745
R=jkummerow@chromium.org

Review URL: https://codereview.chromium.org/18687003

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15584 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
