Good, we're going further now. Here is what I get:
Connection id:          26948
Current database:       nrm
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.0.77 Source distribution
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    latin1
Db     characterset:    latin1
Client characterset:    latin1
Conn.  characterset:    latin1
UNIX socket:            /var/lib/mysql/mysql.sock
Uptime:                 50 days 2 hours 7 min 54 sec
Threads: 5 Questions: 3266242 Slow queries: 266 Opens: 3577 Flush tables: 1 Open tables: 64 Queries per second avg: 0.755
I haven't tried Tim's suggestion. I simply don't know where to look/test.
Cheers, Mickaël
________________________________________
From: Burke Chih-Jen Ko (GBIF) [bko@gbif.org]
Sent: Friday, August 12, 2011 2:59 PM
To: Mickael Graf
Cc: Johan Dunfalk; GBIF IPT mailing list; helpdesk@gbif.org
Subject: Re: GBIF Case 1773: UTF8
And have you tried Tim's suggestion?
Could you try issuing the \s command in the mysql client shell to show its character set settings? It prints information like this:
Connection id:          37141
Current database:
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.1.56 Source distribution
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    utf8
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8
UNIX socket:            /var/lib/mysql/mysql.sock
Uptime:                 41 days 22 hours 43 min 50 sec
Threads: 1 Questions: 597738 Slow queries: 20 Opens: 2328 Flush tables: 1 Open tables: 64 Queries per second avg: 0.164 --------------
which contains character set settings.
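To compare two such status dumps quickly, the four character set lines can be picked out with a short script. This is only a sketch: it assumes the status output has been pasted into a string, and the function name charset_settings is hypothetical, not part of mysql or the IPT.

```python
import re

def charset_settings(status_text):
    # Collect the Server/Db/Client/Conn. character set values from
    # pasted mysql status output (works on flattened text too).
    pattern = re.compile(r"(Server|Db|Client|Conn\.)\s+characterset:\s*(\S+)")
    return {name: value for name, value in pattern.findall(status_text)}

# Sample text shaped like the status output quoted in this thread.
sample = ("Server characterset: latin1 Db characterset: latin1 "
          "Client characterset: latin1 Conn. characterset: latin1")

settings = charset_settings(sample)
# List every setting that is not utf8.
mismatched = [name for name, value in settings.items() if value != "utf8"]

print(settings)
print("Not utf8:", mismatched)
```

Any setting listed as not utf8 is a candidate for where UTF-8 data could be misread on the way out of the database.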
Cheers,
Burke
On Aug 12, 2011, at 2:30 PM, Mickael Graf wrote:
That's possible. But the very same data is displayed correctly with TapirLink.
Yes, you can copy to the mailing list later.
Cheers, Mickaël
________________________________________
From: Burke Chih-Jen Ko (GBIF) [bko@gbif.org]
Sent: Friday, August 12, 2011 2:08 PM
To: Mickael Graf
Cc: helpdesk@gbif.org
Subject: Re: GBIF Case 1773: UTF8
Hi Mickael,
I can see that the accented characters are already wrong in the script. If you generated the script from the SQL client, perhaps the problem is on the DB side?
From the script I see now, I am sure the IPT won't read it correctly if the source is already "NÃ¤rke".
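As a rough illustration (not taken from the attached scripts), this is how UTF-8 text ends up garbled when its bytes are interpreted as latin1 somewhere between the database and the client. It assumes the garbling is a plain UTF-8/latin1 mix-up, which the thread has not confirmed.

```python
# Sketch: UTF-8 bytes read back as latin1 produce the typical mojibake.
original = "Närke"

utf8_bytes = original.encode("utf-8")    # 'ä' becomes the two bytes C3 A4
garbled = utf8_bytes.decode("latin-1")   # each byte is shown as its own character

# The damage is reversible as long as no bytes were lost along the way.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)
print(repaired)
```

If the exported script contains the two-character form, the misinterpretation happened before the IPT ever read the data.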
Would you mind if I copy the thread to the IPT mailing list later?
Burke
On Aug 12, 2011, at 1:51 PM, Mickael Graf wrote:
Hi Burke,
I am using a view. Here come some scripts for checking the data/IPT.
The statement in IPT is then simply 'select * from rcDwCIPT'.
Cheers, Mickaël
From: Burke Chih-Jen Ko (GBIF) [bko@gbif.org]
Sent: Thursday, August 11, 2011 9:24 AM
To: Mickael Graf
Cc: helpdesk@gbif.org
Subject: GBIF Case 1773: UTF8
Hi Mickaël,
Do you use an SQL view or a text file as the source for the IPT? May I have some sample records to test and reproduce your issue?
Thanks!
Burke
On Aug 10, 2011, at 4:25 PM, Mickael Graf wrote:
Hi,
I am testing IPT 2 and NRM RingedBirds is the guinea pig.
Well, I have an issue with the encoding: Swedish characters work fine with TapirLink (see http://www.gbif.se/tapir/tapir_client.php, choose NRM-RingedBirds and make an inventory over StateProvince), but they are a mess with IPT, both in the preview and in the zipped file. For instance, 'Närke' is displayed as 'NÃ¤rke'. This happens regardless of the character encoding chosen under /source.do. The original data is UTF8, but I don't know whether any settings in Tomcat need to be changed.
Do you have any knowledge of this issue? I am very bad at Java, so I don't know where to look (and issue 418 doesn't help).
Cheers, Mickaël
<View_RC_DwC_IPT.sql><rc_test.sql>