java - Wrong Content-Length for response text with umlaut


There is a problem associated with an umlaut. A description is returned on request:

@RequestMapping(value = "/description", method = RequestMethod.POST, consumes = "application/json", produces = "text/plain;charset=utf-8")
@ResponseBody
private String getDescription() {
    return "ärchik";
}

On the frontend, response.responseText is missing the last letter: response.responseText = "ärchi"

I found that the problem is a wrong Content-Length of 7. If I set Content-Length: 8, it works and the full description "ärchik" is returned.

But I do not understand why it has to be 8.

"ärchik".getBytes("UTF-8").length = 7

Response headers:

Cache-Control: must-revalidate
Content-Length: 7
Content-Type: text/plain;charset=utf-8
Date: Mon, 14 Apr 2014 09:08:26 GMT
Server: Apache-Coyote/1.1

I'm turning the core of my comment into an answer, since it seems to be on the right track.

The reason the string is one byte longer than expected is that the 'ä' got encoded as three bytes, not two. This can happen if one uses not the precomposed code point U+00E4 (UTF-8: C3 A4) but instead the letter 'a' (a simple ASCII letter at U+0061) followed by a combining diaeresis U+0308, which together are encoded as 61 CC 88. There are several normal forms for Unicode, and the longer encoding would be the result of a conversion to NFD.
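The byte counts above can be checked with a small sketch using only the JDK. The class name `UmlautBytes` and the helper `utf8Length` are made up for illustration and are not part of the original question; the strings are written with Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

class UmlautBytes {
    // Hypothetical helper: number of bytes the string occupies in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // 'ä' as the single precomposed code point U+00E4 (UTF-8: C3 A4)
        String precomposed = "\u00E4rchik";
        // 'a' (U+0061) followed by combining diaeresis U+0308 (UTF-8: 61 CC 88)
        String decomposed = "a\u0308rchik";

        System.out.println(utf8Length(precomposed)); // 7
        System.out.println(utf8Length(decomposed));  // 8

        // Converting the precomposed form to NFD yields the decomposed form.
        String nfd = Normalizer.normalize(precomposed, Normalizer.Form.NFD);
        System.out.println(nfd.equals(decomposed));  // true
    }
}
```

Both strings render as "ärchik", which is why the difference is easy to miss when only looking at the text.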

Looking at your own answer, it seems that you did perform such a normalization, and that at that point the content length had already been determined from the un-normalized (or perhaps NFC-normalized) string.
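One way to avoid this kind of mismatch is to measure the length on exactly the bytes that will be sent, after any normalization has been applied. The following is only a sketch under that assumption: `BodyLength` and `encodeBody` are hypothetical names, the choice of NFC is an assumption, and nothing here reflects how the framework in the question computes the header.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

class BodyLength {
    // Hypothetical helper: normalize first (NFC is an assumption here),
    // then encode, so the returned array is the final response body.
    static byte[] encodeBody(String text) {
        String nfc = Normalizer.normalize(text, Normalizer.Form.NFC);
        return nfc.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Input in decomposed form: 'a' + combining diaeresis U+0308
        byte[] body = encodeBody("a\u0308rchik");

        // The header value is derived from the same bytes that get written,
        // so it cannot disagree with the body.
        System.out.println("Content-Length: " + body.length); // Content-Length: 7
    }
}
```

The point of the design is ordering: any step that can change the byte count (normalization, encoding) runs before the length is taken, never after.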

