<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">Ok. Sounds like we are on the same page. What do you think would be the most effective way to document this content issue?</div><div class=""><br class=""></div><div class="">thx,</div><div class="">-jorrit</div><br class=""><div><blockquote type="cite" class=""><div class="">On Nov 23, 2015, at 3:35 PM, Guido Sautter &lt;<a href="mailto:sautter@ipd.uka.de" class="">sautter@ipd.uka.de</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class="">
  
  
  <div bgcolor="#FFFFFF" text="#000000" class="">
    <div class="moz-cite-prefix">Hi Jorrit,<br class="">
    </div>
    <blockquote cite="mid:A8D664E7-7024-492F-9020-4645484374A3@xs4all.nl" type="cite" class="">Thanks for your reply.</blockquote>
    You are as welcome as can be.<br class="">
    <br class="">
    <blockquote cite="mid:A8D664E7-7024-492F-9020-4645484374A3@xs4all.nl" type="cite" class="">
      <div class="">Thanks for confirming that there’s an character
        conversion issue happening somewhere. </div>
      <div class=""><br class="">
      </div>
      <div class="">Since the mangled characters appear in both html and
        json provided by GBIF, I’d say it is probably a gbif issue.</div>
    </blockquote>
    Well, what we can say at this point is that GBIF _has_ mangled
    characters ... which doesn't mean the mangling necessarily happened
    at their facilities.<br class="">
    <br class="">
    <blockquote cite="mid:A8D664E7-7024-492F-9020-4645484374A3@xs4all.nl" type="cite" class="">
      <div class="">Is there a way to find out whether the invalid
        character handling occurs in a data provider or within GBIF
        itself?</div>
    </blockquote>
    Sorry to say, no. That's why I stated that characters got mangled
    "at some point". All we can say is that it happened upstream from
    GBIF's API.<br class="">
    <br class="">
    Best,<br class="">
    Guido<br class="">
    <br class="">
    <blockquote cite="mid:A8D664E7-7024-492F-9020-4645484374A3@xs4all.nl" type="cite" class="">
      <div class="">
        <div class="">
          <blockquote type="cite" class="">
            <div class="">On Nov 23, 2015, at 3:14 PM, Guido Sautter
              &lt;<a moz-do-not-send="true" href="mailto:sautter@ipd.uka.de" class="">sautter@ipd.uka.de</a>&gt;
              wrote:</div>
            <br class="Apple-interchange-newline">
            <div class="">
              <div bgcolor="#FFFFFF" text="#000000" class="">
                <div class="moz-cite-prefix">That usually happens when,
                  at some point, UTF-8 encoded text is read as ANSI. It
                  only happens if the text contains characters above 127
                  (0x79), however.<br class="">
                  <br class="">
                  Hope that helps,<br class="">
                  Guido<br class="">
                  <br class="">
                </div>
                <blockquote cite="mid:62A4D2F9-1172-4E2A-A84D-BC3929211EA5@xs4all.nl" type="cite" class="">
                  <meta http-equiv="Content-Type" content="text/html;
                    charset=windows-1252" class="">
                  Hey y’all:
                  <div class=""><br class="">
                  </div>
                  <div class="">I am noticing some funny characters
                    (e.g. "Wintergrün”) for species available here:</div>
                  <div class=""><br class="">
                  </div>
                  <div class=""><a moz-do-not-send="true" href="http://www.gbif.org/species/2882753/vernaculars" class="">http://www.gbif.org/species/2882753/vernaculars</a></div>
                  <div class=""><br class="">
                  </div>
                  <div class="">Same is observed using the api:</div>
                  <div class=""><br class="">
                  </div>
                  <div class=""><a moz-do-not-send="true" href="http://api.gbif.org/v1/species/2882753/vernacularNames" class="">http://api.gbif.org/v1/species/2882753/vernacularNames</a></div>
                  <div class=""><br class="">
                  </div>
                  <div class="">I am assuming that the actual common
                    name should be something like “Wintergrün”.</div>
                  <div class=""><br class="">
                  </div>
                  <div class="">While I was looking into this, I also
                    noticed that no characterset is specified in http
                    response headers.</div>
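                  <div class="">For reference, a minimal Python sketch of
                    how one might check a Content-Type header value for a
                    declared character set. The helper name and the two
                    sample header values are hypothetical and are not copied
                    from an actual GBIF response.</div>
<pre class="" wrap="">
```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset named in a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lowercases the value and returns None when
    # the header carries no charset parameter.
    return msg.get_content_charset()


# Hypothetical header values, for illustration only.
with_charset = charset_from_content_type("application/json; charset=UTF-8")
without_charset = charset_from_content_type("application/json")

print(with_charset)
print(without_charset)
```
</pre>
                  <div class="">When the parameter is missing, a client has
                    to fall back on a default or guess the encoding.</div>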
                  <div class=""><br class="">
                  </div>
                  <div class="">Please confirm that this is expected
                    behavior. </div>
                  <div class=""><br class="">
                  </div>
                  <div class="">thx,</div>
                  <div class="">-jorrit</div>
                  <div class=""><br class="">
                  </div>
                  <div class=""><br class="">
                  </div>
                  <div class=""><br class="">
                  </div>
                  <div class=""><br class="">
                  </div>
                  <br class="">
                  <fieldset class="mimeAttachmentHeader"></fieldset>
                  <br class="">
                  <pre class="" wrap="">_______________________________________________
API-users mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:API-users@lists.gbif.org">API-users@lists.gbif.org</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://lists.gbif.org/mailman/listinfo/api-users">http://lists.gbif.org/mailman/listinfo/api-users</a>
</pre>
                </blockquote>
                <br class="">
              </div>
            </div>
          </blockquote>
        </div>
        <br class="">
      </div>
    </blockquote>
    <br class="">
  </div>

</div></blockquote></div><br class=""></body></html>