[IPT] GBIF Case 1773: UTF8

Burke Chih-Jen Ko (GBIF) bko at gbif.org
Thu Sep 8 21:42:14 CEST 2011


Hi Mickaël,

Glad to hear you're making progress, and sorry for the delayed response.

> For the record I did a mysqldump of the table using --default-character-set=latin1 (dumping in two files, one for the structure, the other for the data), converted the files with the forceUTF8 library and replaced occurrences of "latin1" with "utf8".

I assume the converted files were also saved in UTF-8?
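
A quick way to check this, in case it is useful, is the minimal Python sketch below. It only tests whether the bytes of a file decode as strict UTF-8. The function names are mine, and the file name in the usage note is a placeholder, not the name of your dump.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def file_is_valid_utf8(path: str) -> bool:
    """Read a file in binary mode and check its bytes."""
    with open(path, "rb") as handle:
        return is_valid_utf8(handle.read())
```

Calling file_is_valid_utf8("data.sql") returns False for a file that still holds latin1 bytes for characters such as "ä". Note that a pure ASCII file passes as well, so the check is only meaningful when the data contains accented characters.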

> The resulting data is displayed correctly both with IPT and with TapirLink, and with both latin1 and utf8 as the character set for the MySQL server. But this is on my own computer; testing on the server with MySQL/latin1 gives me errors with TapirLink. But never mind, I'll create a temporary database for the duration of the migration.

How's the situation now? Did you sort out why TapirLink had errors?
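
If the errors show up as garbled characters, a charset mismatch between the stored bytes and the declared encoding is a common cause. The Python sketch below is purely illustrative and is not taken from TapirLink or IPT. It shows what happens to the "Närke" value when UTF-8 bytes are read as cp1252, and that the damage can be reversed as long as no bytes were lost. Whether this is what happened on your server is only a guess on my part, and the function names are made up for the example.

```python
def misread_utf8_as_cp1252(text: str) -> str:
    """Encode as UTF-8, then decode those bytes with the wrong charset."""
    return text.encode("utf-8").decode("cp1252")


def repair_cp1252_mojibake(garbled: str) -> str:
    """Reverse the misreading: back to the raw bytes, then decode as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


garbled = misread_utf8_as_cp1252("Närke")
print(garbled)                          # NÃ¤rke
print(repair_cp1252_mojibake(garbled))  # Närke
```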

Thanks. We also learned a lot from working through this problem.

Burke





> Again, thanks a lot.
> 
> Cheers,
> Mickaël
> ________________________________________
> From: Burke Chih-Jen Ko (GBIF) [bko at gbif.org]
> Sent: Friday, August 26, 2011 11:06 AM
> To: Mickael Graf
> Cc: Johan Dunfalk; GBIF IPT mailing list; GBIF Helpdesk
> Subject: Re: GBIF Case 1773: UTF8
> 
> Hi Mickaël,
> 
> Yes, this time I have correct data to test. What I changed in your script was to re-save the file in latin1 and to add the drop-table statements from your previous dump. The file had been saved in UTF-8 even though the SQL settings in the script are all latin1. The revised file is attached.
> 
> I then reproduced your environment with the following steps:
> 
> 1) Adjust MySQL server and client encoding settings to match yours:
> The server and client connection show:
> Server characterset:    latin1
> Db     characterset:    latin1
> Client characterset:    latin1
> Conn.  characterset:    latin1
> 
> 2) Create a database from the attached script; the database encoding is latin1.
> 
> 3) Follow the normal procedure to create a SQL source in IPT. See the "settings" screenshot.
> 
> 4) Since the JDBC driver detects the source encoding automatically, the encoding setting at the bottom left doesn't matter for a SQL source. However, we are considering forcing the encoding as described on the MySQL JDBC connector page [1].
> 
> 5) The preview result on my side is attached as the result.png image. Närke is rendered correctly, whether your browser encoding is latin1 or UTF-8.
> 
> Since we assume everything on your side is latin1, if it still doesn't work, you can change a line in the jdbc.properties file of a *deployed* IPT, to force jdbc encoding:
> 
> 6) In [Tomcat root]/webapps/ipt/WEB-INF/classes you will find jdbc.properties. Line 7 reads:
> 
> mysql.url=jdbc:mysql://{host}/{database}
> 
> 7) Add the encoding setting to the connection URL, so that it reads:
> mysql.url=jdbc:mysql://{host}/{database}?characterEncoding=Cp1252
> 
> The encoding names used by the JDBC driver differ slightly from the MySQL ones (see [1] again).
> 
> Let me know whether this gives you a different result. If not, I suspect that UTF-8 encoding was involved at some step while you were setting up the database, so you might want to consider a clean start, using a small set of data or the script you gave me:
> 
> 1. Export your whole database as a SQL script, making sure the client you use (phpMyAdmin?) also honours latin1 at every step.
> 2. Check that the file encoding is right and that the contents were exported correctly.
> 3. Import the SQL and try from IPT again.
> 
> Hope this helps. Do let us know if your problem is resolved.
> 
> Thanks,
> 
> Burke
> 
> [1] http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-charsets.html
> 
> 
> 
> 
> 
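
P.S. On step 7 of my earlier message quoted above: the value after characterEncoding has to be the Java-style encoding name, not the MySQL charset name. The small Python sketch below only illustrates the naming difference and how the URL is put together. The lookup table lists just the two charsets discussed in this thread, as given on the connector page [1]. The function name, host and database values are placeholders of mine.

```python
# MySQL charset name -> Java-style encoding name used in the JDBC URL.
# Only the two charsets discussed in this thread are listed here.
MYSQL_TO_JAVA_ENCODING = {
    "latin1": "Cp1252",
    "utf8": "UTF-8",
}


def jdbc_url(host: str, database: str, mysql_charset: str) -> str:
    """Build a MySQL JDBC URL that forces the connection encoding."""
    encoding = MYSQL_TO_JAVA_ENCODING[mysql_charset]  # KeyError if unlisted
    return f"jdbc:mysql://{host}/{database}?characterEncoding={encoding}"


print(jdbc_url("localhost", "ipt_test", "latin1"))
# jdbc:mysql://localhost/ipt_test?characterEncoding=Cp1252
```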


