Project Package Encoding
When building our project via maven, we need to specify charset encoding in the following way:
<build> <plugins> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-compiler-plugin</artifactId> <version>2.0.2</version> <configuration> <source>1.7</source> <target>1.7</target> <encoding>UTF-8</encoding> </configuration> </plugin> </plugins> </build>
Web Request Encoding
As for web request charset encoding, firstly we need to clarify that there are two kinds of request encoding, namely, 'URI encoding' and 'body encoding'. When we do GET operation upon an URL, all parameters in the URL will be applied 'URI encoding',whose default value is 'ISO-8859-1', whereas POST operation will go through 'body encoding'.
URI Encoding
If we don't intend to change the default charset encoding for URI encoding, we could retrieve the correct String parameter using the following code in SpringMVC controller.
new String(request.getParameter("cdc").getBytes("ISO-8859-1"), "utf-8");
Conversely, the way to change the default charset encoding is to edit '$TOMCAT_HOME/conf/server.xml' file, all <connector> tags in which need to be set `URIEncoding="UTF-8"` as an attribute.
<Connector port="8080" protocol="HTTP/1.1" connectionTimeout="20000" redirectPort="8443" URIEncoding="UTF-8" /> ... <Connector port="8009" protocol="AJP/1.3" redirectPort="8443" URIEncoding="UTF-8" /> ...
Body Encoding
This is set by adding filter in web.xml of your project:
<filter> <filter-name>utf8-encoding</filter-name> <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class> <init-param> <param-name>encoding</param-name> <param-value>utf-8</param-value> </init-param> <init-param> <param-name>forceEncoding</param-name> <param-value>true</param-value> </init-param> </filter> <filter-mapping> <filter-name>utf8-encoding</filter-name> <servlet-name>your_servlet_name</servlet-name> </filter-mapping>
HttpClient Request Encoding
When applying HttpClient for POST request, there are times that we need to instantiate a StringEntity object. Be sure that you always add encoding information explicitly.
StringEntity se = new StringEntity(xmlRequestData, CharEncoding.UTF_8);
Java String/Byte Conversion Encoding
Similarly to the above StringEntity scenarios in HttpClient Request Encoding, when it comes to talking about String and byte in Java, we always need to specify encoding voluntarily. String has the concept of charset encoding whereas byte does not.
byte[] bytes = "some_unicode_words".getBytes(CharEncoding.UTF_8); new String(bytes, CharEncoding.UTF_8);
Tomcat catalog.out Encoding
Tomcat's inner logging charset encoding could be set in '$TOMCAT_HOME/bin/catalina.sh'. Find the location of 'JAVA_OPTS' keyword, then append the following setting:
JAVA_OPTS="$JAVA_OPTS -Dfile.encoding=utf-8"
log4j Log File Encoding
Edit 'log4j.properties' file, add the following property under the corresponding appender.
log4j.appender.appender_name.encoding = UTF-8
OS Encoding
Operation System's charset encoding could be checked by `locale` command on Linux. Append the following content to '/etc/profile' will set encoding to UTF-8.# -- locale -- export LANG=en_US.UTF-8 export LC_CTYPE=en_US.UTF-8 export LC_NUMERIC=en_US.UTF-8 export LC_TIME=en_US.UTF-8 export LC_COLLATE=en_US.UTF-8 export LC_MONETARY=en_US.UTF-8 export LC_MESSAGES=en_US.UTF-8 export LC_PAPER=en_US.UTF-8 export LC_NAME=en_US.UTF-8 export LC_ADDRESS=en_US.UTF-8 export LC_TELEPHONE=en_US.UTF-8 export LC_MEASUREMENT=en_US.UTF-8 export LC_IDENTIFICATION=en_US.UTF-8 export LC_ALL=en_US.UTF-8
REFERENCE:
1. Get Parameter Encoding
2. Spring MVC UTF-8 Encoding
No comments:
Post a Comment