    Script streaming: more UTF-8 handling fixes (again). · 394af55a
    marja@chromium.org authored
    1) Since we fill the output buffer both from the chunks and the conversion
    buffer, it's possible that we run out of space and call CopyCharsHelper with 0
    length. The underlying functions don't handle a zero length gracefully, so
    check for it in CopyCharsHelper.
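
A minimal sketch of the zero-length guard described in 1). CopyCharsSketch and its parameters are hypothetical stand-ins for illustration, not the actual CopyCharsHelper signature in V8; the only point shown is the early return before any underlying copy routine is reached.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a copy helper (not V8's real signature).
// Widens up to `length` one-byte characters from `src` into `dest` and
// returns the number of characters actually copied.
inline size_t CopyCharsSketch(uint16_t* dest, size_t length,
                              const uint8_t* src, size_t available) {
  // Guard: the output buffer may already be full, so a zero-length
  // request is legal and must not reach the underlying copy loop.
  if (length == 0 || available == 0) return 0;
  assert(dest != nullptr && src != nullptr);
  const size_t n = length < available ? length : available;
  for (size_t i = 0; i < n; ++i) dest[i] = src[i];
  return n;
}
```

With this shape, a caller that has run out of output space can still call the helper unconditionally and simply gets 0 back.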
    
    2) There was a bug where we tried to copy too many characters from the
    beginning of the data chunk into the conversion buffer. Continuation bytes in
    UTF-8 are of the form 0b10XXXXXX. If a byte is bigger than that (0xC0 or
    above), it's the first byte of a new UTF-8 character and we should not copy it.
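
The continuation-byte test in 2) can be sketched as follows. The function names and the max_bytes cap are assumptions made for illustration and are not taken from scanner-character-streams.cc. Note that the mask test below also rejects ASCII bytes (below 0x80), which is slightly stricter than a "bigger than 0b10XXXXXX" comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True for UTF-8 continuation bytes, i.e. bytes of the form 0b10XXXXXX
// (0x80 through 0xBF).
inline bool IsUtf8ContinuationByte(uint8_t b) { return (b & 0xC0) == 0x80; }

// Counts how many bytes at the start of a chunk still belong to a character
// that began in the previous chunk. Stops at the first byte that is not a
// continuation byte, or after `max_bytes` (hypothetical cap).
inline size_t CountLeadingContinuationBytes(const uint8_t* data, size_t length,
                                            size_t max_bytes) {
  assert(data != nullptr || length == 0);
  size_t n = 0;
  while (n < length && n < max_bytes && IsUtf8ContinuationByte(data[n])) ++n;
  return n;
}
```

For a chunk starting with 0x92 0x81 0xEC, only the first two bytes are counted; 0xEC starts a new character and stays in the chunk.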
    
    These two together (or maybe in combination with surrogates) are a probable
    reason for crbug.com/420932.
    
    3) The test data was off; \uc481 is \xec\x92\x81.
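
The byte sequence in 3) can be checked by hand: U+C481 is 0b1100'010010'000001, which maps to 0xEC (1110 + 1100), 0x92 (10 + 010010) and 0x81 (10 + 000001). A small encoder sketch for the same check follows; EncodeBmpToUtf8 is a hypothetical helper written for this illustration, and it treats surrogate code units like any other 16-bit value.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encodes one 16-bit code unit as UTF-8 (1 to 3 bytes). Surrogates are not
// given special treatment in this sketch.
inline std::string EncodeBmpToUtf8(uint16_t c) {
  std::string out;
  if (c < 0x80) {
    out += static_cast<char>(c);
  } else if (c < 0x800) {
    out += static_cast<char>(0xC0 | (c >> 6));
    out += static_cast<char>(0x80 | (c & 0x3F));
  } else {
    out += static_cast<char>(0xE0 | (c >> 12));
    out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (c & 0x3F));
  }
  assert(!out.empty());
  return out;
}
```

Calling EncodeBmpToUtf8(0xC481) yields the three bytes \xec\x92\x81 quoted above.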
    
    BUG=420932
    LOG=N
    R=yangguo@chromium.org
    
    Review URL: https://codereview.chromium.org/662003003
    
    git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24725 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
scanner-character-streams.cc 16.5 KB