collecting a bunch of links to API responses that include mangled characters looks like a good option to me.
Ok. Sounds like we are on the same page. What do you think would be the most effective way to document this content issue?
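One way to gather such examples without eyeballing every response is a small heuristic check. What follows is only a sketch, assuming the mangling is the usual "UTF-8 bytes read as Windows-1252" kind; `looks_mangled` and `mangled_names` are made-up helper names, and the input list is illustrative rather than actual API output.

```python
from typing import Iterable, List


def looks_mangled(text: str) -> bool:
    """Heuristic: True if re-encoding as cp1252 yields valid UTF-8 that differs."""
    try:
        repaired = text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in cp1252, or the bytes are not valid UTF-8:
        # this particular kind of mangling did not happen here.
        return False
    return repaired != text


def mangled_names(names: Iterable[str]) -> List[str]:
    """Keep only the names that look mangled."""
    return [name for name in names if looks_mangled(name)]


# Illustrative input only, not real API output.
print(mangled_names(["Wintergreen", "Wintergr\u00fcn", "Wintergr\u00c3\u00bcn"]))
```

Pure ASCII names and correctly decoded names pass through unflagged; only strings that turn into different, valid UTF-8 text after the round trip are reported. Being a heuristic, it can miss other kinds of damage (for example characters replaced by "?").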
On Nov 23, 2015, at 3:35 PM, Guido Sautter <sautter@ipd.uka.de> wrote:
Hi Jorrit,
Thanks for your reply.
welcome as can be.
Well, what we can say at this point is that GBIF _has_ mangled characters ... which doesn't mean the mangling necessarily happened at their facilities.
Thanks for confirming that there’s a character conversion issue happening somewhere.
Since the mangled characters appear in both HTML and JSON provided by GBIF, I’d say it is probably a GBIF issue.
Sorry to say, no. That's why I stated that characters got mangled "at some point". All we can say is that it happened upstream from GBIF's API.
Is there a way to find out whether the invalid character handling occurs in a data provider or within GBIF itself?
Best,
Guido
On Nov 23, 2015, at 3:14 PM, Guido Sautter <sautter@ipd.uka.de> wrote:
That usually happens when, at some point, UTF-8 encoded text is read as ANSI. It only happens if the text contains characters above 127 (0x7F), however.
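The effect is easy to reproduce. A minimal sketch, assuming "ANSI" here means Windows-1252 (`cp1252` in Python); `mangle` and `unmangle` are made-up names for illustration:

```python
def mangle(text: str) -> str:
    """Encode as UTF-8, then (wrongly) decode those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def unmangle(text: str) -> str:
    """Undo one round of mangle(); raises if the text was not mangled this way."""
    return text.encode("cp1252").decode("utf-8")


print(mangle("Wintergr\u00fcn"))            # the u-umlaut turns into two characters
print(unmangle(mangle("Wintergr\u00fcn")))  # round trip restores the original
print(mangle("Wintergreen"))                # pure ASCII (<= 127) is unaffected
```

The "ü" is stored as the two UTF-8 bytes 0xC3 0xBC, and each byte is then shown as its own Windows-1252 character, which is where the extra characters come from. A few byte values are undefined in `cp1252`, so `mangle` may raise for some other characters.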
Hope that helps,
Guido
Hey y’all:
I am noticing some funny characters (e.g. “WintergrÃ¼n”) for species available here:
The same is observed using the API:
I am assuming that the actual common name should be something like “Wintergrün”.
While I was looking into this, I also noticed that no character set is specified in the HTTP response headers.
Please confirm that this is expected behavior.
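For reference, this is roughly how the character set can be read from a Content-Type header value. It is only a sketch: `charset_of` is a made-up helper, and the header values below are hypothetical examples, not what the API actually returned.

```python
from email.message import Message
from typing import Optional


def charset_of(content_type: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None if absent."""
    if content_type is None:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and returns None if missing.
    return msg.get_content_charset()


# Hypothetical header values, for illustration only.
print(charset_of("application/json;charset=UTF-8"))  # utf-8
print(charset_of("application/json"))                # None
print(charset_of(None))                              # None
```

For a live check, the header value could be taken from `urllib.request.urlopen(url).headers.get("Content-Type")`. When no charset is declared, clients fall back to their own defaults, which is one place where a wrong guess can creep in.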
thx,
-jorrit
_______________________________________________ API-users mailing list API-users@lists.gbif.org http://lists.gbif.org/mailman/listinfo/api-users