[IPT] GBIF Case 1773: UTF8

Mickael Graf Mickael.Graf at nrm.se
Mon Aug 15 15:19:11 CEST 2011


Hi Burke,

Changing my.cnf breaks everything, so I reverted the change. I need to study how to migrate my data correctly to a fully UTF-8 system.
MySQL still ships with latin1 as the default (I just checked on my Ubuntu 11.04!).

How can I test Tim's suggestion?

Cheers,
Mickaël

________________________________________
From: Burke Chih-Jen Ko (GBIF) [bko at gbif.org]
Sent: Friday, August 12, 2011 4:30 PM
To: Mickael Graf
Cc: Johan Dunfalk; GBIF IPT mailing list
Subject: Re: GBIF Case 1773: UTF8

Hi Mickaël,

I can see from your script that the database is created using UTF-8, but it could be that the connection character set interprets the UTF-8 data as ISO-8859-1. Indeed, forcing a UTF-8 text file containing Närke to open with the latin1 charset renders the text as NÃ¤rke.
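
The effect described above can be reproduced outside MySQL. The following is a minimal sketch using only Python's built-in codecs; it illustrates the byte-level mix-up and is not a test of the database or connection settings themselves.

```python
# Illustration only: reproduce the charset mix-up with Python's built-in codecs.
original = "Närke"

# The text is stored as UTF-8 bytes: b'N\xc3\xa4rke'
utf8_bytes = original.encode("utf-8")

# A client that assumes ISO-8859-1 (latin1) reads each byte as one character.
misread = utf8_bytes.decode("latin-1")
print(misread)

# Reversing the wrong step recovers the original text.
repaired = misread.encode("latin-1").decode("utf-8")
print(repaired)
```

The two bytes that encode "ä" in UTF-8 (0xC3 0xA4) come out as two separate latin1 characters, which is why a single accented letter turns into a two-character sequence.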

In the [mysqld] section of /etc/my.cnf, you can instruct the server to start with your preferred character set and collation:

character_set_server=utf8
default-character-set=utf8
character_set_client=utf8
collation_server=utf8_general_ci
skip-character-set-client-handshake

The last line forces the connection charset to be the one specified for the server.

So I suggest the following steps:
1. Add the lines above to your my.cnf.
2. Restart MySQL.
3. First check whether things still look the same on TapirLink.
4. Try exporting the same sample script you gave us earlier. If Närke shows up as it should, then it should be fine on IPT.
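
For step 4, the exported text can also be checked programmatically rather than by eye. The sketch below is a rough heuristic, not part of IPT or MySQL: the helper name `looks_double_encoded` is hypothetical, and it assumes the mix-up is strictly UTF-8 bytes read as ISO-8859-1.

```python
def looks_double_encoded(text: str) -> bool:
    """Heuristic: True if `text` looks like UTF-8 that was read as latin1.

    Hypothetical helper for illustration. It re-encodes the text as latin1
    and checks whether the resulting bytes form valid UTF-8 that differs
    from the input.
    """
    try:
        candidate = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside latin1, or the bytes are
        # not valid UTF-8: in both cases this mix-up did not happen.
        return False
    return candidate != text


# Example values taken from the discussion above.
for value in ["Närke", "NÃ¤rke", "Stockholm"]:
    print(value, looks_double_encoded(value))
```

Plain ASCII values always pass, so the check is only informative for records containing accented characters.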

Let me know if this setting works.

But if this change breaks TapirLink and other applications, you'll need to decide whether to reconfigure all of them or to see if Tim's suggestion works. (jdbc:mysql://localhost:3306/specimen_collections?autoReconnect=true&useUnicode=true&characterEncoding=UTF8&characterSetResults=UTF8)

Cheers,

Burke

