Commit ce9b1b2a authored by Frank Tang, committed by V8 LUCI CQ

[intl] Remove incorrect optimization for 0 length string

In collator and localeCompare, we have an incorrect optimization
for zero-length strings that compares only the lengths and ignores the
fact that a non-zero-length string can compare equal to a zero-length
string when all of its content is ignorable.

Remove this incorrect optimization and add test cases.
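
To illustrate why the length-only shortcut is wrong, here is a minimal JavaScript sketch that models the removed fast path next to a version that always defers to the collator. The function names `compareWithEmptyFastPath` and `compareWithoutFastPath` are hypothetical and do not appear in V8; the real change is in the C++ function `Intl::CompareStrings`.

```javascript
// Hypothetical model of the removed fast path; these names are
// illustrative only and are not part of V8.
function compareWithEmptyFastPath(collator, a, b) {
  // Removed logic: if either string is empty, order by length alone.
  if (a.length === 0 || b.length === 0) {
    return Math.sign(a.length - b.length);
  }
  return collator.compare(a, b);
}

// Model of the fixed behavior: always ask the collator.
function compareWithoutFastPath(collator, a, b) {
  return collator.compare(a, b);
}

const collator = new Intl.Collator('en');
const softHyphen = '\u00AD';

// The fast path never consults the collator, so it reports "less than".
console.log(compareWithEmptyFastPath(collator, '', softHyphen));

// The result here depends on the engine; the tests added by this commit
// expect 0 once the fix is present.
console.log(compareWithoutFastPath(collator, '', softHyphen));
```

The shortcut is only safe when no non-empty string can collate as equal to the empty string, which does not hold once ignorable characters are taken into account.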

The regression was introduced in
https://source.chromium.org/chromium/_/chromium/v8/v8.git/+/6fbb8bc806da7231ceb81e492d09abe3f43e320e which first appeared in 97.0.4665.0



Bug: chromium:1347690
Change-Id: Ie70feb9598b1842f8a8744c38f33b3397865abfd
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3832526
Reviewed-by: Shu-yu Guo <syg@chromium.org>
Reviewed-by: Jakob Linke <jgruber@chromium.org>
Commit-Queue: Frank Tang <ftang@chromium.org>
Cr-Commit-Position: refs/heads/main@{#82632}
parent 134ca75c
@@ -230,7 +230,6 @@ icu::StringPiece ToICUStringPiece(Isolate* isolate, Handle<String> string,
if (!flat.IsOneByte()) return icu::StringPiece();
int32_t length = string->length();
DCHECK_LT(offset, length);
const char* char_buffer =
reinterpret_cast<const char*>(flat.ToOneByteVector().begin());
if (!String::IsAscii(char_buffer, length)) {
@@ -1418,10 +1417,8 @@ int Intl::CompareStrings(Isolate* isolate, const icu::Collator& icu_collator,
     return UCollationResult::UCOL_EQUAL;
   }
-  // Early return for empty strings.
-  if (string1->length() == 0 || string2->length() == 0) {
-    return ToUCollationResult(string1->length() - string2->length());
-  }
+  // We cannot return early for 0-length strings because of Unicode
+  // ignorable characters. See also crbug.com/1347690.
   string1 = String::Flatten(isolate, string1);
   string2 = String::Flatten(isolate, string2);
......
// Copyright 2022 the V8 project authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// Comparison to the empty string can return zero for certain Unicode characters.
// In all locales, some Unicode characters are ignorable.
// C0 control character (U+0001)
assertEquals(0, (new Intl.Collator('en')).compare("","\u0001"));
// SOFT HYPHEN
assertEquals(0, (new Intl.Collator('en')).compare("","\u00AD"));
// ARABIC SIGN SAMVAT
assertEquals(0, (new Intl.Collator('en')).compare("","\u0604"));
assertEquals(0, (new Intl.Collator('en')).compare("","\u0001\u0002\u00AD\u0604"));
// Default Thai collation ignores punctuation.
assertEquals(0, (new Intl.Collator('th')).compare(""," "));
assertEquals(0, (new Intl.Collator('th')).compare("","*"));
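
The commit message also names localeCompare, which the assertions above do not call directly. As a rough sketch, the same samples can be run through `String.prototype.localeCompare`. The helper `compareEmptyAgainst` is hypothetical and not part of the V8 test suite, and the printed values depend on the engine and its ICU data, so none are asserted here.

```javascript
// Hypothetical helper (not part of the V8 test suite): compare the empty
// string against each sample via String.prototype.localeCompare.
function compareEmptyAgainst(locale, samples) {
  return samples.map((sample) => ({
    sample,
    result: ''.localeCompare(sample, locale),
  }));
}

const enSamples = ['\u0001', '\u00AD', '\u0604', '\u0001\u0002\u00AD\u0604'];
for (const { sample, result } of compareEmptyAgainst('en', enSamples)) {
  // Print code points, since these characters are invisible.
  const codePoints = [...sample].map(
    (c) => 'U+' + c.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
  console.log(codePoints.join(' '), '=>', result);
}
```

On an engine that still has the length-based shortcut, these calls would be expected to report a non-zero result, while the assertions in this test file expect 0 from the collator.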