<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p>Dear all,</p>
    <p><br>
    </p>
    <p>Thank you very much for your valuable feedback!</p>
    <p><br>
    </p>
    <p>I'll explain a bit about what I'm doing, just to clarify; sorry if this
      is spam to some.</p>
    <p><br>
    </p>
    <p>I want to build a model of species assemblages based on the
      co-occurrence of taxa within an arbitrary area. I'm building a 2D
      lattice in which, for each cell, I collapse the data (the
      occurrences) into a taxonomic tree. To do this I first need to
      obtain the data from the GBIF API and then, based on the ids (or
      names) at each taxonomic level (from kingdom down to occurrence), build
      a tree coupled to each cell. <br>
    </p>
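    <p>To make the idea concrete, here is a minimal Python sketch of the
      lattice step. It is only an illustration under assumptions: the
      record fields (id, lon, lat and one key per rank), the list of
      ranks, the cell size and the function names are all hypothetical
      and are not taken from my actual implementation.</p>
    <pre wrap="">
```python
import math
from collections import defaultdict

# Taxonomic ranks from most general to most specific (an assumed ordering).
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def cell_index(lon, lat, cell_size):
    """Map a coordinate to the (column, row) of a regular lattice cell."""
    return (math.floor(lon / cell_size), math.floor(lat / cell_size))


def insert_occurrence(tree, record):
    """Walk down a nested dict, creating one level per taxonomic rank."""
    node = tree
    for rank in RANKS:
        name = record.get(rank) or "unknown"
        node = node.setdefault((rank, name), {})
    # The deepest node collects the ids of the occurrences seen for that taxon.
    node.setdefault("occurrences", []).append(record["id"])


def build_cell_trees(records, cell_size=1.0):
    """Group records by lattice cell and collapse each group into a tree."""
    trees = defaultdict(dict)
    for record in records:
        key = cell_index(record["lon"], record["lat"], cell_size)
        insert_occurrence(trees[key], record)
    return dict(trees)
```
</pre>
    <p>Each key of the returned dictionary is a lattice cell and each
      value is a nested dictionary with one level per rank; records with
      a missing rank fall under an "unknown" node at that level.</p>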
    <p><br>
    </p>
    <p>The implementation uses PostgreSQL (PostGIS) for storing
      the raw GBIF data and Neo4j for storing the relation <br>
    </p>
    <p>"Being a member of the [species, genus, family, ...] [name/id]". The
      idea is to include data from different sources, similar to the
      project Matthew and Jennifer mentioned (which I'm very
      interested in and would like to hear more about), and to traverse
      the network looking for significant merged information. <br>
    </p>
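    <p>As a rough sketch of how that membership relation could be
      written to the graph, the Python snippet below turns one occurrence
      record into Cypher MERGE statements. The relationship name
      IS_MEMBER_OF, the use of rank names as node labels and the record
      layout are assumptions made for illustration only; matching nodes
      on name alone would also merge homonyms, so ids may be the safer
      key.</p>
    <pre wrap="">
```python
# Taxonomic ranks from most general to most specific (an assumed ordering).
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]

# Labels cannot be passed as Cypher parameters, so they are formatted into
# the query text; the taxon names themselves travel as parameters.
MERGE_TEMPLATE = (
    "MERGE (child:`{child_label}` {{name: $child}}) "
    "MERGE (parent:`{parent_label}` {{name: $parent}}) "
    "MERGE (child)-[:IS_MEMBER_OF]->(parent)"
)


def membership_statements(record):
    """Yield (cypher, params) pairs linking each rank to the rank above it."""
    # Keep only the ranks that are actually filled in for this record.
    present = [(rank, record[rank]) for rank in RANKS if record.get(rank)]
    for (parent_rank, parent), (child_rank, child) in zip(present, present[1:]):
        cypher = MERGE_TEMPLATE.format(
            child_label=child_rank.capitalize(),
            parent_label=parent_rank.capitalize(),
        )
        yield cypher, {"child": child, "parent": parent}
```
</pre>
    <p>Each pair could then be handed to a Neo4j session, for example
      with session.run(cypher, params) in the official Python driver.
      Using MERGE rather than CREATE keeps the import idempotent when the
      same taxon appears in many cells.</p>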
    <p><br>
    </p>
    <p>One of the immediate problems I've found is importing big chunks
      of the GBIF data into my own data model. Thanks to this thread I've
      found the tools most used by the community
      (pygbif, rgbif and python-dwca-reader). I had been using urllib2 and
      things like that. <br>
    </p>
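    <p>For what it's worth, paging through the occurrence search could
      look roughly like the Python sketch below. The pager itself is
      generic and takes any function that returns a GBIF-style page; the
      helper names, the page size of 300 and the default cap (taken from
      the ceiling Jan mentions) are my own assumptions. The pygbif call
      used is occurrences.search with its taxonKey, offset and limit
      arguments, and it needs network access.</p>
    <pre wrap="">
```python
def iter_occurrences(fetch_page, page_size=300, max_records=200000):
    """Yield records page by page until the API reports the end.

    fetch_page(offset, limit) must return a dict shaped like a GBIF
    occurrence search response: {"results": [...], "endOfRecords": bool}.
    """
    offset = 0
    while max_records > offset:
        limit = min(page_size, max_records - offset)
        page = fetch_page(offset, limit)
        results = page.get("results", [])
        yield from results
        # Stop when the server says so, or when a page comes back empty.
        if page.get("endOfRecords", True) or not results:
            break
        offset += len(results)


def gbif_fetch_page(taxon_key):
    """Build a fetch_page callable backed by pygbif (hypothetical helper)."""
    # Imported lazily so the pager above has no hard dependency on pygbif.
    from pygbif import occurrences

    def fetch(offset, limit):
        return occurrences.search(taxonKey=taxon_key, offset=offset, limit=limit)

    return fetch
```
</pre>
    <p>Something like iter_occurrences(gbif_fetch_page(some_taxon_key))
      would then stream the records for one taxon, where some_taxon_key
      is a placeholder for a real GBIF taxon key. For anything close to
      the ceiling, the download service is probably the better route.</p>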
    <p> I'll be happy to share any code or ideas with the people
      interested.</p>
    <p><br>
    </p>
    <p>By the way, I've looked at the TinkerPop project, which provides the Gremlin
      traversal language independently of the underlying DBMS. <br>
    </p>
    <p>Perhaps it's possible to use it with Spark and GUODA as well?</p>
    <p><br>
    </p>
    <p><br>
    </p>
    <p>Is GUODA working now?</p>
    <p><br>
    </p>
    <p>Best wishes</p>
    <p><br>
    </p>
    <p>Juan.<br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 31/05/16 17:02, Collins, Matthew
      wrote:<br>
    </div>
    <blockquote cite="mid:1464710544965.30033@acis.ufl.edu" type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <style type="text/css" style="display:none"><!-- p { margin-top: 0px; margin-bottom: 0px; }--></style>
      <p>Jorrit pointed out this thread to us at iDigBio. Downloading
        and importing data into a relational database will work great,
        especially if, as Jan said, you can cut the data size down to a
        reasonable amount.<br>
      </p>
      <p><span style="font-size: 12pt;"><br>
        </span></p>
      <p><span style="font-size: 12pt;">Another approach we've been
          working on in a collaboration called GUODA [1] is to build an
          Apache Spark environment with pre-formatted data frames with
          common data sets in them for researchers to use. This approach
          would offer a remote service where you could write arbitrary
          Spark code, probably in Jupyter notebooks, to iterate over
          data. Spark does a lot of cool stuff including GraphX which
          might be of interest. This is </span>definitely pre-alpha at
        this point and if anyone is interested, I'd like to hear your
        thoughts.<span style="font-size: 12pt;"> I'll also be at SPNHC
          talking about this.</span></p>
      <p><span style="font-size: 12pt;"><br>
        </span></p>
      <p>One thing we've found in working on this is that importing data
        into a structured data format isn't always easy. If you only
        want a few columns, it'll be fine. But getting the data typing,
        format standardization, and column name syntax of the whole
        width of an iDigBio record right requires some code.
        I looked to see if EcoData Retriever
        [2] had a GBIF data source and they have an eBird one
        that perhaps you might find useful as a starting point
        if you wanted to try to use someone else's code to download and
        import data.<br>
      </p>
      <p><br>
      </p>
      <p>For other data structures like BHL, we're kind of making stuff
        up since we're packaging a relational structure and not
        something nearly as flat as GBIF and DWC stuff.
        <br>
      </p>
      <p><br>
      </p>
      <p>[1] <a moz-do-not-send="true" href="http://guoda.bio/">http://guoda.bio/</a><br>
      </p>
      <p>[2] <a moz-do-not-send="true"
          href="http://www.ecodataretriever.org/">http://www.ecodataretriever.org/</a><br>
      </p>
      <p><br>
      </p>
      <div id="Signature">
        <div name="divtagdefaultwrapper"
          style="font-family:Calibri,Arial,Helvetica,sans-serif;
          font-size:; margin:0">
          <div style="font-family:Tahoma; font-size:13px">
            <div style="font-family:Tahoma; font-size:13px">Matthew
              Collins<br>
              Technical Operations Manager<br>
              Advanced Computing and Information Systems Lab, ECE<br>
              University of Florida<br>
              <span class="Object"
                id="OBJ_PREFIX_DWT1049_com_zimbra_phone"><a
                  moz-do-not-send="true" href="callto:352-392-5414"
                  tabindex="0" id="NoLP">352-392-5414</a></span></div>
          </div>
        </div>
      </div>
      <div style="word-wrap:break-word">
        <hr tabindex="-1" style="display:inline-block; width:98%">
        <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
            color="#000000" face="Calibri, sans-serif"><b>From:</b>
            jorrit poelen <a class="moz-txt-link-rfc2396E" href="mailto:jhpoelen@xs4all.nl">&lt;jhpoelen@xs4all.nl&gt;</a><br>
            <b>Sent:</b> Monday, May 30, 2016 11:16 AM<br>
            <b>To:</b> Collins, Matthew; Thompson, Alexander M; Hammock,
            Jennifer<br>
            <b>Subject:</b> Fwd: [API-users] Is there any NEO4J or
            graph-based driver for this API ?</font>
          <div> </div>
        </div>
        <div>Hey y’all:
          <div class=""><br class="">
          </div>
          <div class="">Interesting request below on the GBIF mailing
            list - sounds like a perfect fit for the GUODA use cases. </div>
          <div class=""><br class="">
          </div>
          <div class="">Would it be too early to jump onto this thread
            and share our efforts/vision?</div>
          <div class=""><br class="">
          </div>
          <div class="">thx,<br>
          </div>
          <div class="">-jorrit<br class="">
            <div><br class="">
              <blockquote type="cite" class="">
                <div class="">Begin forwarded message:</div>
                <br class="Apple-interchange-newline">
                <div class="" style="margin-top:0px; margin-right:0px;
                  margin-bottom:0px; margin-left:0px">
                  <span class="" style=""><b class="">From: </b></span><span
                    class="" style="">Jan Legind &lt;<a
                      moz-do-not-send="true"
                      href="mailto:jlegind@gbif.org" class="">jlegind@gbif.org</a>&gt;<br
                      class="">
                  </span></div>
                <div class="" style="margin-top:0px; margin-right:0px;
                  margin-bottom:0px; margin-left:0px">
                  <span class="" style=""><b class="">Subject: </b></span><span
                    class="" style=""><b class="">Re: [API-users] Is
                      there any NEO4J or graph-based driver for this API
                      ?</b><br class="">
                  </span></div>
                <div class="" style="margin-top:0px; margin-right:0px;
                  margin-bottom:0px; margin-left:0px">
                  <span class="" style=""><b class="">Date: </b></span><span
                    class="" style="">May 30, 2016 at 5:48:51 AM PDT<br
                      class="">
                  </span></div>
                <div class="" style="margin-top:0px; margin-right:0px;
                  margin-bottom:0px; margin-left:0px">
                  <span class="" style=""><b class="">To: </b></span><span
                    class="" style="">Mauro Cavalcanti &lt;<a
                      moz-do-not-send="true"
                      href="mailto:maurobio@gmail.com" class="">maurobio@gmail.com</a>&gt;,
                    "Juan M. Escamilla Molgora" &lt;<a
                      moz-do-not-send="true"
                      href="mailto:j.escamillamolgora@lancaster.ac.uk"
                      class="">j.escamillamolgora@lancaster.ac.uk</a>&gt;<br
                      class="">
                  </span></div>
                <div class="" style="margin-top:0px; margin-right:0px;
                  margin-bottom:0px; margin-left:0px">
                  <span class="" style=""><b class="">Cc: </b></span><span
                    class="">"<a moz-do-not-send="true"
                      href="mailto:api-users@lists.gbif.org" class="">api-users@lists.gbif.org</a>"
                    &lt;<a moz-do-not-send="true"
                      href="mailto:api-users@lists.gbif.org" class="">api-users@lists.gbif.org</a>&gt;<br
                      class="">
                  </span></div>
                <br class="">
                <div class="">
                  <div class="WordSection1"
                    style="font-family:Helvetica; font-size:12px;
                    font-style:normal; font-weight:normal;
                    letter-spacing:normal; orphans:auto;
                    text-align:start; text-indent:0px;
                    text-transform:none; white-space:normal;
                    widows:auto; word-spacing:0px">
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)">Dear Juan,</span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)"> </span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)">Unfortunately we have no
                        tool for creating this kind of SQL-like query
                        against the portal. I am sure you are aware that the
                        filters in the occurrence search pages can be
                        applied in combination in numerous ways. The API
                        can go even further in this regard[1], but it
                        is not well suited for retrieving occurrence
                        records since there is a 200,000-record ceiling,
                        making it unfit for species exceeding this
                        number.</span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)"> </span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)">There are going to be updates
                        to the pygbif package[2] in the near future that
                        will enable you to launch user downloads
                        programmatically, where a whole list of different
                        species can be used as a query parameter, as well
                        as adding polygons.[3]</span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)"> </span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)">In the meantime, Mauro’s
                        suggestion is excellent. If you can narrow your
                        search down until it returns a manageable
                        download (say less than 100 million records),
                        importing this into a database should be doable.
                        From there, you can refine using SQL queries.</span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)"> </span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)">Best,</span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)">Jan K. Legind, GBIF Data
                        manager   </span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)"> </span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)">[1]<span
                          class="Apple-converted-space"> </span><a
                          moz-do-not-send="true"
                          href="http://www.gbif.org/developer/occurrence#search"
                          class="" style="color:purple;
                          text-decoration:underline">http://www.gbif.org/developer/occurrence#search</a></span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)">[2]<span
                          class="Apple-converted-space"> </span><a
                          moz-do-not-send="true"
                          href="https://github.com/sckott/pygbif"
                          class="" style="color:purple;
                          text-decoration:underline">https://github.com/sckott/pygbif</a></span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)">[3]<span
                          class="Apple-converted-space"> </span><a
                          moz-do-not-send="true"
                          href="https://github.com/jlegind/GBIF-downloads"
                          class="" style="color:purple;
                          text-decoration:underline">https://github.com/jlegind/GBIF-downloads</a></span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <span class=""
                        style="font-family:Calibri,sans-serif;
                        color:rgb(31,73,125)"> </span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                      <b class=""><span class="" style="font-size:10pt;
                          font-family:Tahoma,sans-serif">From:</span></b><span
                        class="" style="font-size:10pt;
                        font-family:Tahoma,sans-serif"><span
                          class="Apple-converted-space"> </span>API-users
                        [<a moz-do-not-send="true"
                          href="mailto:api-users-bounces@lists.gbif.org"
                          class="" style="color:purple;
                          text-decoration:underline">mailto:api-users-bounces@lists.gbif.org</a>]<span
                          class="Apple-converted-space"> </span><b
                          class="">On Behalf Of<span
                            class="Apple-converted-space"> </span></b>Mauro
                        Cavalcanti<br class="">
                        <b class="">Sent:</b><span
                          class="Apple-converted-space"> </span>30. maj
                        2016 14:06<br class="">
                        <b class="">To:</b><span
                          class="Apple-converted-space"> </span>Juan M.
                        Escamilla Molgora<br class="">
                        <b class="">Cc:</b><span
                          class="Apple-converted-space"> </span><a
                          moz-do-not-send="true"
                          href="mailto:api-users@lists.gbif.org"
                          class="" style="color:purple;
                          text-decoration:underline">api-users@lists.gbif.org</a><br
                          class="">
                        <b class="">Subject:</b><span
                          class="Apple-converted-space"> </span>Re:
                        [API-users] Is there any NEO4J or graph-based
                        driver for this API ?</span></div>
                    <div class="" style="margin:0cm 0cm 0.0001pt;
                      font-size:12pt; font-family:'Times New
                      Roman',serif">
                       </div>
                    <div class="">
                      <div class="">
                        <div class="">
                          <p class="MsoNormal" style="margin:0cm 0cm
                            12pt; font-size:12pt; font-family:'Times New
                            Roman',serif">
                            Hi,</p>
                        </div>
                        <p class="MsoNormal" style="margin:0cm 0cm 12pt;
                          font-size:12pt; font-family:'Times New
                          Roman',serif">
                          One solution I have successfully adopted for
                          this is to download the records (either
                          "manually" via the browser or, better yet, with a
                          Python script using the fine pygbif library),
                          store them in a MySQL or SQLite database
                          and then perform the relational queries. I can
                          provide examples if you are interested.</p>
                      </div>
                      <div class="" style="margin:0cm 0cm 0.0001pt;
                        font-size:12pt; font-family:'Times New
                        Roman',serif">
                        Best regards,</div>
                    </div>
                    <div class="">
                      <div class="" style="margin:0cm 0cm 0.0001pt;
                        font-size:12pt; font-family:'Times New
                        Roman',serif">
                         </div>
                      <div class="">
                        <div class="" style="margin:0cm 0cm 0.0001pt;
                          font-size:12pt; font-family:'Times New
                          Roman',serif">
                          2016-05-30 8:59 GMT-03:00 Juan M. Escamilla
                          Molgora &lt;<a moz-do-not-send="true"
                            href="mailto:j.escamillamolgora@lancaster.ac.uk"
                            target="_blank" class=""
                            style="color:purple;
                            text-decoration:underline">j.escamillamolgora@lancaster.ac.uk</a>&gt;:</div>
                        <div class="" style="margin:0cm 0cm 0.0001pt;
                          font-size:12pt; font-family:'Times New
                          Roman',serif">
                          Hola,<br class="">
                          <br class="">
                          Is there any API for making relational queries
                          like taxonomy, location or timestamp?<br
                            class="">
                          <br class="">
                          Thank you and best wishes<br class="">
                          <br class="">
                          Juan<br class="">
_______________________________________________<br class="">
                          API-users mailing list<br class="">
                          <a moz-do-not-send="true"
                            href="mailto:API-users@lists.gbif.org"
                            target="_blank" class=""
                            style="color:purple;
                            text-decoration:underline">API-users@lists.gbif.org</a><br
                            class="">
                          <a moz-do-not-send="true"
                            href="http://lists.gbif.org/mailman/listinfo/api-users"
                            target="_blank" class=""
                            style="color:purple;
                            text-decoration:underline">http://lists.gbif.org/mailman/listinfo/api-users</a></div>
                      </div>
                      <div class="" style="margin:0cm 0cm 0.0001pt;
                        font-size:12pt; font-family:'Times New
                        Roman',serif">
                        <br class="">
                        <br class="" clear="all">
                        <br class="">
                        --<span class="Apple-converted-space"> </span></div>
                      <div class="">
                        <div class="" style="margin:0cm 0cm 0.0001pt;
                          font-size:12pt; font-family:'Times New
                          Roman',serif">
                          Dr. Mauro J. Cavalcanti<br class="">
                          E-mail:<span class="Apple-converted-space"> </span><a
                            moz-do-not-send="true"
                            href="mailto:maurobio@gmail.com"
                            target="_blank" class=""
                            style="color:purple;
                            text-decoration:underline">maurobio@gmail.com</a><br
                            class="">
                          Web:<span class="Apple-converted-space"> </span><a
                            moz-do-not-send="true"
                            href="http://sites.google.com/site/maurobio"
                            target="_blank" class=""
                            style="color:purple;
                            text-decoration:underline">http://sites.google.com/site/maurobio</a></div>
                      </div>
                    </div>
                  </div>
                </div>
              </blockquote>
            </div>
            <br class="">
          </div>
        </div>
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>