[IPT] GBIF Case 1773: UTF8

Mickael Graf Mickael.Graf at nrm.se
Tue Aug 16 09:26:04 CEST 2011


Hi Burke,

I tried UTF-8, Latin1 and Windows-1252, but the result always looks the same. This setting seems to have no influence on the final result, at least here.

/Mickaël
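
When every encoding setting gives the same output, it can help to look at the raw bytes of one affected value rather than at the rendered text. The following is a minimal sketch, not part of IPT or MySQL: the function name `describe_encoding` is hypothetical, and it assumes the raw bytes of a value such as Närke have already been obtained from the source (for example with a hex dump).

```python
def describe_encoding(raw: bytes) -> str:
    """Guess how the 'ä' in a value like 'Närke' was stored, from its raw bytes."""
    # 'ä' encoded as UTF-8, misread as Latin1, then encoded as UTF-8 again
    if b"\xc3\x83\xc2\xa4" in raw:
        return "double-encoded UTF-8"
    # 'ä' encoded once as UTF-8
    if b"\xc3\xa4" in raw:
        return "UTF-8"
    # 'ä' as a single byte; Latin1 and Windows-1252 agree on this byte
    if b"\xe4" in raw:
        return "Latin1 or Windows-1252"
    return "unknown"


if __name__ == "__main__":
    print(describe_encoding("Närke".encode("utf-8")))
    print(describe_encoding("Närke".encode("latin-1")))
```

If the stored bytes turn out to be double-encoded, no choice of source encoding on the reading side will display them correctly, which would be consistent with all settings looking the same.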

________________________________________
From: Burke Chih-Jen Ko (GBIF) [bko at gbif.org]
Sent: Monday, August 15, 2011 4:24 PM
To: Mickael Graf
Cc: Johan Dunfalk; GBIF IPT mailing list
Subject: Re: GBIF Case 1773: UTF8

Hi Mickaël,

Have you tried using Latin 1 as the character encoding in the source data editing page of IPT?

Burke

On Aug 15, 2011, at 3:19 PM, Mickael Graf wrote:

> Hi Burke,
>
> Changing my.cnf breaks everything, so I reverted it. I need to study how to correctly migrate my data to a complete utf8 system.
> MySQL still comes with latin1 as default (I just checked on my Ubuntu 11.04!)
>
> How can I test Tim's suggestion?
>
> Cheers,
> Mickaël
>
> ________________________________________
> From: Burke Chih-Jen Ko (GBIF) [bko at gbif.org]
> Sent: Friday, August 12, 2011 4:30 PM
> To: Mickael Graf
> Cc: Johan Dunfalk; GBIF IPT mailing list
> Subject: Re: GBIF Case 1773: UTF8
>
> Hi Mickaël,
>
> I can see from your script that the database is created using UTF-8, but it could be the connection character set that interprets the UTF-8 data as ISO-8859-1. Opening a UTF-8 text file containing Närke with the latin1 charset does indeed render the text as NÃ¤rke.
>
> In the [mysqld] section of /etc/my.cnf, you can instruct the server to start with the preferred character set and collation:
>
> character_set_server=utf8
> default-character-set=utf8
> character_set_client=utf8
> collation_server=utf8_general_ci
> skip-character-set-client-handshake
>
> The last line forces the connection charset to be the one specified for the server.
>
> So I suggest some steps:
> 1. Add the lines above to your my.cnf.
> 2. Restart MySQL.
> 3. First see if things still look the same on TapirLink.
> 4. Try exporting the same sample script you gave us earlier. If Närke shows as it should, then it should be fine on IPT.
>
> Let me know if this setting works.
>
> But if this change breaks TapirLink and other tools, you'll need to decide whether to reconfigure all of them or see if Tim's suggestion works. (jdbc:mysql://localhost:3306/specimen_collections?autoReconnect=true&useUnicode=true&characterEncoding=UTF8&characterSetResults=UTF8)
>
> Cheers,
>
> Burke
>
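
The Närke example in the quoted message can be reproduced in a few lines. This is a minimal sketch of the general effect only: the function names `misread_as_latin1` and `repair` are hypothetical, and nothing here touches MySQL, TapirLink or IPT. It shows what happens when UTF-8 bytes are interpreted as Latin1, and how that single misreading can be reversed.

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were Latin1."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo one Latin1 misreading of UTF-8 data.

    Raises UnicodeError if the input is not the result of such a misreading.
    """
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    garbled = misread_as_latin1("Närke")
    print(garbled)          # the two-character sequence replaces the single 'ä'
    print(repair(garbled))  # back to the original spelling
```

The repair step only works when the text was misread exactly once; data that was written back to the database in garbled form may need the same step applied before it is stored correctly.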


