    Return kBadChar for longest subpart of incomplete utf-8 character. · fd40ebb1
    vogelheim authored
    This brings the two UTF-8 decoders (bulk + incremental) in line.
    Technically, either behaviour was correct, since the UTF-8 spec
    demands that incomplete UTF-8 be handled, but does not specify how.
    Unicode recommends that "the maximal subpart at that offset
    should be replaced by a single U+FFFD," and with this change we
    consistently do that. More details + spec references in the bug.
    
    BUG=chromium:662822
    
    Review-Url: https://codereview.chromium.org/2493143003
    Cr-Commit-Position: refs/heads/master@{#41025}
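
The "maximal subpart" behaviour described in the commit message can be illustrated with a small standalone decoder. This is a minimal sketch and not the code from unicode.cc: the function name `DecodeUtf8` and the local `kBadChar` constant (set to U+FFFD here) are stand-ins chosen for illustration. It follows the well-formed byte ranges from the Unicode Standard: when a sequence breaks off, the bytes accepted so far produce one replacement character and the offending byte is re-examined as the start of a new sequence.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Illustrative stand-in for the replacement character (U+FFFD).
constexpr char32_t kBadChar = 0xFFFD;

// Hypothetical helper: decodes UTF-8 and emits one kBadChar per maximal
// subpart of an ill-formed sequence.
std::vector<char32_t> DecodeUtf8(const std::string& in) {
  std::vector<char32_t> out;
  std::size_t i = 0;
  while (i < in.size()) {
    const std::uint8_t b0 = static_cast<std::uint8_t>(in[i++]);
    if (b0 < 0x80) {  // ASCII
      out.push_back(b0);
      continue;
    }
    int extra = 0;                       // continuation bytes expected
    std::uint8_t lo = 0x80, hi = 0xBF;   // allowed range for the next byte
    char32_t cp = 0;
    if (b0 >= 0xC2 && b0 <= 0xDF) {
      extra = 1;
      cp = b0 & 0x1F;
    } else if (b0 >= 0xE0 && b0 <= 0xEF) {
      extra = 2;
      cp = b0 & 0x0F;
      if (b0 == 0xE0) lo = 0xA0;  // reject overlong forms
      if (b0 == 0xED) hi = 0x9F;  // reject surrogates
    } else if (b0 >= 0xF0 && b0 <= 0xF4) {
      extra = 3;
      cp = b0 & 0x07;
      if (b0 == 0xF0) lo = 0x90;  // reject overlong forms
      if (b0 == 0xF4) hi = 0x8F;  // reject values above U+10FFFF
    } else {
      // Stray continuation byte or a byte that can never start a sequence.
      out.push_back(kBadChar);
      continue;
    }
    bool ok = true;
    for (int k = 0; k < extra; ++k) {
      if (i >= in.size()) {  // input ended mid-character
        ok = false;
        break;
      }
      const std::uint8_t b = static_cast<std::uint8_t>(in[i]);
      if (b < lo || b > hi) {  // do not consume: it may start a new sequence
        ok = false;
        break;
      }
      cp = (cp << 6) | (b & 0x3F);
      ++i;
      lo = 0x80;  // only the first continuation byte has a restricted range
      hi = 0xBF;
    }
    // On failure, the bytes consumed so far form the maximal subpart.
    out.push_back(ok ? cp : kBadChar);
  }
  return out;
}
```

With this sketch, the truncated input `E2 82` (the first two bytes of U+20AC) decodes to a single U+FFFD rather than two, and a following ASCII byte is still decoded normally.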
unicode.cc 171 KB