[API-users] funny characters in common names / vernaculars for species

jorrit poelen jhpoelen at xs4all.nl
Tue Nov 24 00:39:56 CET 2015


Ok. Sounds like we are on the same page. What do you think would be the most effective way to document this content issue?
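
One option might be to attach a small reproduction to the report, so the suspected mechanism can be checked against the names GBIF returns. Below is a minimal sketch, assuming the mangling is UTF-8 bytes decoded as Windows-1252 (one common meaning of "ANSI"). The thread does not establish that this is what actually happened, and the helper names `mangle` and `unmangle` are made up for illustration.

```python
# Sketch only: assumes UTF-8 text was decoded as Windows-1252 ("ANSI").
# Where (or whether) this happened upstream of the GBIF API is unknown.

def mangle(text: str) -> str:
    """Reproduce the suspected mangling: encode as UTF-8, decode as cp1252.

    May raise UnicodeDecodeError for the few byte values cp1252 leaves undefined.
    """
    return text.encode("utf-8").decode("cp1252")


def unmangle(text: str) -> str:
    """Try to reverse the mangling; return the input unchanged if it does not fit."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


original = "Wintergrün"
mangled = mangle(original)
print(mangled)                                  # WintergrÃ¼n
print(unmangle(mangled))                        # Wintergrün
print(mangle("Wintergreen") == "Wintergreen")   # True: pure ASCII is unaffected
```

If a vernacular name from the API changes when passed through `unmangle`, that is a hint (not proof) that it went through this kind of conversion.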

thx,
-jorrit

> On Nov 23, 2015, at 3:35 PM, Guido Sautter <sautter at ipd.uka.de> wrote:
> 
> Hi Jorrit,
>> Thanks for your reply.
> welcome as can be.
> 
>> Thanks for confirming that there’s a character conversion issue happening somewhere.
>> 
>> Since the mangled characters appear in both the HTML and the JSON provided by GBIF, I’d say it is probably a GBIF issue.
> Well, what we can say at this point is that GBIF _has_ mangled characters ... which doesn't mean the mangling necessarily happened at their facilities.
> 
>> Is there a way to find out whether the invalid character handling occurs in a data provider or within GBIF itself?
> Sorry to say, no. That's why I stated that characters got mangled "at some point". All we can say is that it happened upstream from GBIF's API.
> 
> Best,
> Guido
> 
>>> On Nov 23, 2015, at 3:14 PM, Guido Sautter <sautter at ipd.uka.de> wrote:
>>> 
>>> That usually happens when, at some point, UTF-8 encoded text is read as ANSI (a single-byte code page such as Windows-1252). It only affects text that contains characters above 127 (0x7F), however.
>>> 
>>> Hope that helps,
>>> Guido
>>> 
>>>> Hey y’all:
>>>> 
>>>> I am noticing some funny characters (e.g. “Wintergrün”) for species available here:
>>>> 
>>>> http://www.gbif.org/species/2882753/vernaculars
>>>> 
>>>> The same is observed using the API:
>>>> 
>>>> http://api.gbif.org/v1/species/2882753/vernacularNames
>>>> 
>>>> I am assuming that the actual common name should be something like “Wintergrün”.
>>>> 
>>>> While I was looking into this, I also noticed that no character set is specified in the HTTP response headers.
>>>> 
>>>> Please confirm that this is expected behavior.
>>>> 
>>>> thx,
>>>> -jorrit
>>>> 
>>>> _______________________________________________
>>>> API-users mailing list
>>>> API-users at lists.gbif.org
>>>> http://lists.gbif.org/mailman/listinfo/api-users
>>> 
