[API-users] funny characters in common names / vernaculars for species
jorrit poelen
jhpoelen at xs4all.nl
Tue Nov 24 00:39:56 CET 2015
Ok. Sounds like we are on the same page. What do you think would be the most effective way to document this content issue?
thx,
-jorrit
> On Nov 23, 2015, at 3:35 PM, Guido Sautter <sautter at ipd.uka.de> wrote:
>
> Hi Jorrit,
>> Thanks for your reply.
> welcome as can be.
>
>> Thanks for confirming that there’s a character conversion issue happening somewhere.
>>
>> Since the mangled characters appear in both the HTML and the JSON provided by GBIF, I’d say it is probably a GBIF issue.
> Well, what we can say at this point is that GBIF _has_ mangled characters ... which doesn't mean the mangling necessarily happened at their facilities.
>
>> Is there a way to find out whether the invalid character handling occurs in a data provider or within GBIF itself?
> Sorry to say, no. That's why I stated that characters got mangled "at some point". All we can say is that it happened upstream from GBIF's API.
>
> Best,
> Guido
>
>>> On Nov 23, 2015, at 3:14 PM, Guido Sautter <sautter at ipd.uka.de> wrote:
>>>
>>> That usually happens when, at some point, UTF-8 encoded text is read as ANSI (a single-byte encoding such as Windows-1252). It only happens if the text contains characters above 127 (0x7F), however.
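
The mechanism described above can be reproduced in a few lines of Python. This is only a sketch: the helper names `mangle` and `repair` are made up for illustration, and it assumes "ANSI" means Windows-1252 (`cp1252`), which may not be the encoding actually involved upstream of GBIF.

```python
def mangle(text: str, wrong_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode the bytes with a single-byte encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(mangled: str, wrong_encoding: str = "cp1252") -> str:
    """Reverse the mix-up; only works if no bytes were lost along the way."""
    return mangled.encode(wrong_encoding).decode("utf-8")


original = "Wintergrün"
mangled = mangle(original)    # "WintergrÃ¼n": the two UTF-8 bytes of "ü" become two characters
restored = repair(mangled)    # back to "Wintergrün"

# Pure ASCII text (all code points <= 127) survives the mix-up unchanged.
ascii_unchanged = mangle("Wintergreen") == "Wintergreen"
```

Note that `repair` is a best-effort reversal: it raises an error if the mangled text contains characters that the assumed encoding cannot represent, so it cannot tell where in the pipeline the mangling happened.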
>>>
>>> Hope that helps,
>>> Guido
>>>
>>>> Hey y’all:
>>>>
>>>> I am noticing some funny characters (e.g. “WintergrÃ¼n”) for species available here:
>>>>
>>>> http://www.gbif.org/species/2882753/vernaculars
>>>>
>>>> Same is observed using the api:
>>>>
>>>> http://api.gbif.org/v1/species/2882753/vernacularNames
>>>>
>>>> I am assuming that the actual common name should be something like “Wintergrün”.
>>>>
>>>> While I was looking into this, I also noticed that no character set is specified in the HTTP response headers.
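
One way to check whether a response declares a character set is to parse its `Content-Type` header. The sketch below uses Python's standard library. The function name `declared_charset` is hypothetical, and the header values are made-up examples, not actual GBIF responses.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and returns None when absent.
    return msg.get_content_charset()


without_charset = declared_charset("application/json")               # None
with_charset = declared_charset("application/json; charset=UTF-8")   # "utf-8"
```

When no charset is declared, a client has to fall back on a default or guess, which is one place where a UTF-8 payload can end up being read with the wrong encoding.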
>>>>
>>>> Please confirm that this is expected behavior.
>>>>
>>>> thx,
>>>> -jorrit
>>>>
>>>> _______________________________________________
>>>> API-users mailing list
>>>> API-users at lists.gbif.org
>>>> http://lists.gbif.org/mailman/listinfo/api-users