<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Hi Jorrit,<br>
</div>
<blockquote
cite="mid:A8D664E7-7024-492F-9020-4645484374A3@xs4all.nl"
type="cite">Thanks for your reply.</blockquote>
You're very welcome.<br>
<br>
<blockquote
cite="mid:A8D664E7-7024-492F-9020-4645484374A3@xs4all.nl"
type="cite">
<div class="">Thanks for confirming that there’s a character
conversion issue happening somewhere. </div>
<div class=""><br class="">
</div>
<div class="">Since the mangled characters appear in both the HTML and
the JSON provided by GBIF, I’d say it is probably a GBIF issue.</div>
</blockquote>
Well, what we can say at this point is that GBIF _has_ mangled
characters ... which doesn't mean the mangling necessarily happened
at their facilities.<br>
<br>
<blockquote
cite="mid:A8D664E7-7024-492F-9020-4645484374A3@xs4all.nl"
type="cite">
<div class="">Is there a way to find out whether the invalid
character handling occurs in a data provider or within GBIF
itself?</div>
</blockquote>
Sorry to say, no. That's why I stated that characters got mangled
"at some point". All we can say is that it happened upstream from
GBIF's API.<br>
<br>
Best,<br>
Guido<br>
<br class="">
<blockquote
cite="mid:A8D664E7-7024-492F-9020-4645484374A3@xs4all.nl"
type="cite">
<div class="">
<div>
<blockquote type="cite" class="">
<div class="">On Nov 23, 2015, at 3:14 PM, Guido Sautter
&lt;<a moz-do-not-send="true"
href="mailto:sautter@ipd.uka.de" class="">sautter@ipd.uka.de</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type" class="">
<div bgcolor="#FFFFFF" text="#000000" class="">
<div class="moz-cite-prefix">That usually happens when,
at some point, UTF-8 encoded text is read as ANSI (a
single-byte code page such as Windows-1252). It only
affects characters above 127 (0x7F), however, because
those are the ones UTF-8 encodes as multiple bytes.<br class="">
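<br class="">
For illustration, here is a minimal Python sketch of the effect. The
helper names mangle and unmangle are made up for this example, and I am
assuming Windows-1252 as the "ANSI" code page; the actual code page
involved upstream is unknown.<br class="">
<pre class="">
```python
def mangle(text: str) -> str:
    """Simulate the suspected bug: UTF-8 bytes decoded as Windows-1252.

    Assumption: the wrong decoder is Windows-1252 (Python codec "cp1252").
    """
    return text.encode("utf-8").decode("cp1252")


def unmangle(text: str) -> str:
    """Reverse the damage, assuming exactly one round of mis-decoding."""
    return text.encode("cp1252").decode("utf-8")


original = "Wintergr\u00fcn"          # "Wintergrün"
garbled = mangle(original)

print(garbled)                        # the two-character sequence replaces "ü"
print(unmangle(garbled) == original)  # True: one round trip restores the text
print(mangle("Wintergreen"))          # ASCII-only text passes through unchanged
```
</pre>
If the text went through the wrong decoding more than once, or
contains bytes that Windows-1252 leaves undefined, this simple round
trip will not recover it.<br class="">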
<br class="">
Hope that helps,<br class="">
Guido<br class="">
<br class="">
</div>
<blockquote
cite="mid:62A4D2F9-1172-4E2A-A84D-BC3929211EA5@xs4all.nl"
type="cite" class="">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252" class="">
Hey y’all:
<div class=""><br class="">
</div>
<div class="">I am noticing some funny characters
(e.g. “Wintergrün”) for species available here:</div>
<div class=""><br class="">
</div>
<div class=""><a moz-do-not-send="true"
href="http://www.gbif.org/species/2882753/vernaculars"
class="">http://www.gbif.org/species/2882753/vernaculars</a></div>
<div class=""><br class="">
</div>
<div class="">Same is observed using the api:</div>
<div class=""><br class="">
</div>
<div class=""><a moz-do-not-send="true"
href="http://api.gbif.org/v1/species/2882753/vernacularNames"
class="">http://api.gbif.org/v1/species/2882753/vernacularNames</a></div>
<div class=""><br class="">
</div>
<div class="">I am assuming that the actual common
name should be something like “Wintergrün”.</div>
<div class=""><br class="">
</div>
<div class="">While I was looking into this, I also
noticed that no character set is specified in the HTTP
response headers.</div>
<div class=""><br class="">
</div>
<div class="">Please confirm whether this is expected
behavior. </div>
<div class=""><br class="">
</div>
<div class="">thx,</div>
<div class="">-jorrit</div>
<div class=""><br class="">
</div>
<br class="">
<br class="">
<pre class="" wrap="">_______________________________________________
API-users mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:API-users@lists.gbif.org">API-users@lists.gbif.org</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://lists.gbif.org/mailman/listinfo/api-users">http://lists.gbif.org/mailman/listinfo/api-users</a>
</pre>
</blockquote>
<br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</blockquote>
<br>
</body>
</html>