Commit f3e41d96 authored by littledan, committed by Commit bot

Fix Unicode string normalization with null bytes

Previously, String.prototype.normalize passed its input to ICU as a
null-terminated string. That is a bug for strings containing a null
character (U+0000), which ECMAScript allows: ICU only sees the part
before the first null. This patch constructs the ICU string from the
buffer and its length so that the entire string is normalized.

R=jshin@chromium.org
BUG=v8:4654
LOG=Y

Review URL: https://codereview.chromium.org/1645223003

Cr-Commit-Position: refs/heads/master@{#33614}
parent 85aba7df
@@ -586,8 +586,9 @@ RUNTIME_FUNCTION(Runtime_StringNormalize) {
   // TODO(mnita): check Normalizer2 (not available in ICU 46)
   UErrorCode status = U_ZERO_ERROR;
+  icu::UnicodeString input(false, u_value, string_value.length());
   icu::UnicodeString result;
-  icu::Normalizer::normalize(u_value, normalizationForms[form_id], 0, result,
+  icu::Normalizer::normalize(input, normalizationForms[form_id], 0, result,
                              status);
   if (U_FAILURE(status)) {
     return isolate->heap()->undefined_value();

// Copyright 2016 the V8 project authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
assertEquals('hello\u0000foobar', 'hello\u0000foobar'.normalize('NFC'));
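
The difference between the two construction styles named in the commit message can be sketched without ICU. The snippet below is only an analogy, assuming std::u16string as a stand-in for icu::UnicodeString; the helper names from_null_terminated, from_pointer_and_length and demo_embedded_null are hypothetical and do not appear in V8 or ICU.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: stands in for an API that receives only a pointer
// and therefore scans for the terminating null to find the end.
std::u16string from_null_terminated(const char16_t* data) {
  return std::u16string(data);  // stops at the first u'\0'
}

// Hypothetical helper: stands in for an API that receives an explicit length.
std::u16string from_pointer_and_length(const char16_t* data,
                                       std::size_t length) {
  return std::u16string(data, length);  // copies exactly `length` code units
}

// Returns true if the pointer-only copy is cut short at the embedded null
// while the pointer-plus-length copy keeps every code unit.
bool demo_embedded_null() {
  // Same content as the regression test: "hello", U+0000, "foobar".
  const char16_t raw[] = u"hello\0foobar";
  // Drop the terminating null that the literal adds on its own.
  const std::size_t length = sizeof(raw) / sizeof(raw[0]) - 1;

  const std::u16string truncated = from_null_terminated(raw);
  const std::u16string complete = from_pointer_and_length(raw, length);

  return truncated.size() < complete.size() && complete.size() == length;
}
```

Calling demo_embedded_null() from a small main is enough to try it. The pointer-only helper yields just the code units before the embedded null, which mirrors why the regression test uses a string with U+0000 in the middle.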