<div dir="ltr">Hi Nico, <div><br></div><div>Of course, I'd rather use something that works already than write my own code. I'll let you know if I have any questions. </div><div><br></div><div>Best, Scott<br></div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, Jun 3, 2016 at 12:41 AM Nicolas Noé <<a href="mailto:n.noe@biodiversity.be">n.noe@biodiversity.be</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<font size="-1"><font face="Verdana">Hey Scott,<br>
<br>
About the functionality to read Darwin Core dumps: don't
hesitate to interface with <a href="https://github.com/BelgianBiodiversityPlatform/python-dwca-reader" target="_blank">python-dwca-reader</a>
to avoid reinventing the wheel. It was written exactly for
this purpose, is now rather mature, and is used by several groups of
users.<br>
<br>
I'm also always willing to put in some effort to make its
integration with projects like pygbif easier. Just file a GitHub
issue or send me an email for any clarification/bug/missing
feature.<br>
<br>
Best,<br>
<br>
Nico<br>
</font></font><br>
<div>On 2/06/16 16:17, Scott Chamberlain
wrote:<br>
</div></div><div bgcolor="#FFFFFF" text="#000000">
<blockquote type="cite">
<div dir="ltr">Nils,
<div><br>
</div>
<div>We do have some download API functions in pygbif, and
there's a pull request now to add the methods to request
downloads. We'll add some functionality to read Darwin Core
dumps as well. </div>
<div><br>
</div>
<div>Scott</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Thu, Jun 2, 2016 at 6:06 AM Tim Robertson <<a href="mailto:trobertson@gbif.org" target="_blank">trobertson@gbif.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word;color:rgb(0,0,0);font-size:14px;font-family:Calibri,sans-serif">
<div>Hi Nils,</div>
<div><br>
</div>
<div>It’s documented here: <a href="http://www.gbif.org/developer/occurrence#download" target="_blank">http://www.gbif.org/developer/occurrence#download</a></div>
<div>You POST a JSON doc with the query using HTTP basic
authentication.</div>
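<div><br>
</div>
<div>A minimal sketch of such a request in Python, using only the standard library. The endpoint path and the JSON field names (creator, predicate, TAXON_KEY) are assumptions to be checked against the documentation linked above, and the user name and password are placeholders. The request is only built here, not sent.</div>
<pre>
```python
import base64
import json
import urllib.request

# Assumed endpoint; verify it against the GBIF developer documentation.
DOWNLOAD_URL = "http://api.gbif.org/v1/occurrence/download/request"


def build_download_request(user, password, taxon_key):
    """Build (but do not send) a POST request asking for an occurrence download."""
    # Assumed JSON layout: the requesting user plus a single "equals" predicate.
    query = {
        "creator": user,
        "predicate": {"type": "equals", "key": "TAXON_KEY", "value": str(taxon_key)},
    }
    # HTTP basic authentication: base64 of "user:password".
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        DOWNLOAD_URL,
        data=json.dumps(query).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


# Placeholder credentials; the taxon key is the one from the thread below.
request = build_download_request("some_user", "some_password", 2882316)
```
</pre>
<div>Passing the request to urllib.request.urlopen would submit it. Because downloads are asynchronous, the response would presumably identify a download to fetch later rather than contain the data itself.</div>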
<div><br>
</div>
<div>If you need any help, please say and we can provide
some examples using CURL or python.</div>
<div><br>
</div>
<div>I am actually working on a revision to the mapping API,
and will see what may be possible for returning distinct
locations. It’s tricky to do in real time though.</div>
<div>What kind of accuracy do you need please? </div>
<div><br>
</div>
<div>Thanks,</div>
<div>Tim</div>
<div><br>
</div>
<span>
<div style="font-family:Calibri;font-size:11pt;text-align:left;color:black;BORDER-BOTTOM:medium none;BORDER-LEFT:medium none;PADDING-BOTTOM:0in;PADDING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df 1pt solid;BORDER-RIGHT:medium none;PADDING-TOP:3pt">
<span style="font-weight:bold">From: </span>Nils
Hempelmann <<a href="mailto:info@nilshempelmann.de" target="_blank">info@nilshempelmann.de</a>><br>
<span style="font-weight:bold">Date: </span>Thursday 2
June 2016 at 15:00<br>
<span style="font-weight:bold">To: </span>Tim Robertson
<<a href="mailto:trobertson@gbif.org" target="_blank">trobertson@gbif.org</a>>,
"<a href="mailto:api-users@lists.gbif.org" target="_blank">api-users@lists.gbif.org</a>"
<<a href="mailto:api-users@lists.gbif.org" target="_blank">api-users@lists.gbif.org</a>>,
"<a href="mailto:wps-dev@lists.dkrz.de" target="_blank">wps-dev@lists.dkrz.de</a>"
<<a href="mailto:wps-dev@lists.dkrz.de" target="_blank">wps-dev@lists.dkrz.de</a>><br>
<span style="font-weight:bold">Subject: </span>Re:
[API-users] birdhouse meets GBIF<br>
</div>
</span></div>
<div style="word-wrap:break-word;color:rgb(0,0,0);font-size:14px;font-family:Calibri,sans-serif"><span>
<div><br>
</div>
<div>
<div bgcolor="#FFFFFF" text="#000000">Hi Tim <br>
<br>
Thanks for the quick answer. If there is no OGC service, how
can the download API be called from outside GBIF?<br>
<br>
Merci <br>
Nils <br>
<br>
<br>
<div>On 02/06/2016 14:45, Tim Robertson wrote:<br>
</div>
<blockquote type="cite">
<div>Hi Nils</div>
<div><br>
</div>
<div>We don’t have any OGC services, but there is an
asynchronous download API which can deliver CSVs.</div>
<div>Off the top of my head, the only way you can
automate this at the moment would be to
periodically issue a download (e.g. daily), process it
as you see fit, and cache the result for your app.</div>
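<div><br>
</div>
<div>A rough sketch of the caching part, assuming the app keeps the processed CSV in a local file and refreshes it once it is older than a day. The names cached_csv and download are hypothetical; download stands for whatever function requests and retrieves the CSV text.</div>
<pre>
```python
import os
import time

ONE_DAY = 24 * 60 * 60  # seconds


def cached_csv(cache_path, download, max_age=ONE_DAY):
    """Return CSV text from cache_path, refreshing it when older than max_age."""
    is_fresh = (
        os.path.exists(cache_path)
        and max_age > time.time() - os.path.getmtime(cache_path)
    )
    if not is_fresh:
        # Cache is missing or stale: fetch a new copy and store it.
        text = download()
        with open(cache_path, "w") as handle:
            handle.write(text)
        return text
    # Cache is recent enough: serve it without contacting the server.
    with open(cache_path) as handle:
        return handle.read()
```
</pre>
<div>The app would call cached_csv on every request and only trigger a new download when the file has expired.</div>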
<div><br>
</div>
<div>In the download API you can get any size, from 1
to 660 million records, which is why it is
asynchronous. It’s used a lot by various
applications and communities.</div>
<div><br>
</div>
<div>I hope this helps,</div>
<div>Tim</div>
<div><br>
</div>
<div><br>
</div>
<span>
<div style="font-family:Calibri;font-size:11pt;text-align:left;color:black;BORDER-BOTTOM:medium none;BORDER-LEFT:medium none;PADDING-BOTTOM:0in;PADDING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df 1pt solid;BORDER-RIGHT:medium none;PADDING-TOP:3pt">
<span style="font-weight:bold">From: </span>API-users
<<a href="mailto:api-users-bounces@lists.gbif.org" target="_blank">api-users-bounces@lists.gbif.org</a>>
on behalf of Nils Hempelmann <<a href="mailto:info@nilshempelmann.de" target="_blank">info@nilshempelmann.de</a>><br>
<span style="font-weight:bold">Date: </span>Thursday
2 June 2016 at 14:25<br>
<span style="font-weight:bold">To: </span>"<a href="mailto:api-users@lists.gbif.org" target="_blank">api-users@lists.gbif.org</a>"
<<a href="mailto:api-users@lists.gbif.org" target="_blank">api-users@lists.gbif.org</a>>,
"<a href="mailto:wps-dev@lists.dkrz.de" target="_blank">wps-dev@lists.dkrz.de</a>"
<<a href="mailto:wps-dev@lists.dkrz.de" target="_blank">wps-dev@lists.dkrz.de</a>><br>
<span style="font-weight:bold">Subject: </span>[API-users]
birdhouse meets GBIF<br>
</div>
<div><br>
</div>
<div>
<div bgcolor="#FFFFFF" text="#000000">Dear all <br>
<br>
Here is a current problem to solve :-) for
the birdhouse WPS and GBIF developers.<br>
There is a species distribution process in the
birdhouse WPS that processes climate data based
on occurrence coordinates from GBIF.
<br>
<br>
Currently the GBIF data are given to the
process via a URL of the GBIF CSV (which has
to be generated manually beforehand).<br>
I was implementing a direct occurrence search
from birdhouse to GBIF with pygbif. It worked
fine, BUT there is a limit of 300 records,
which is far too few to train the species
distribution model.
<br>
<br>
So here is the question: is there a way to
request the species occurrence coordinates
directly (e.g. like a WPS request)?
<br>
<br>
(The previous conversation follows.)<br>
<br>
And some links for background information:
<br>
<br>
Birdhouse:<br>
<a href="http://birdhouse.readthedocs.io/en/latest/" target="_blank">http://birdhouse.readthedocs.io/en/latest/</a><br>
GBIF: <br>
<a href="http://www.gbif.org/" target="_blank">http://www.gbif.org/</a><br>
<br>
Species distribution process: <br>
<a href="http://flyingpigeon.readthedocs.io/en/latest/tutorials/sdm.html" target="_blank">http://flyingpigeon.readthedocs.io/en/latest/tutorials/sdm.html</a><br>
<br>
Birdhouse architecture: <br>
<a href="https://github.com/bird-house/birdhouse-docs/blob/master/slides/birdhouse-architecture/birdhouse-architecture.pdf" target="_blank">https://github.com/bird-house/birdhouse-docs/blob/master/slides/birdhouse-architecture/birdhouse-architecture.pdf</a><br>
<br>
Merci <br>
Nils<br>
<br>
-------- Forwarded Message -------
<div>
<table border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<th align="RIGHT" nowrap valign="BASELINE">Subject: </th>
<td>Re: pygbif for occurrence
coordinates</td>
</tr>
<tr>
<th align="RIGHT" nowrap valign="BASELINE">Date: </th>
<td>Thu, 2 Jun 2016 09:00:49 -0300</td>
</tr>
<tr>
<th align="RIGHT" nowrap valign="BASELINE">From: </th>
<td>Mauro Cavalcanti <a href="mailto:maurobio@gmail.com" target="_blank"><maurobio@gmail.com></a></td>
</tr>
<tr>
<th align="RIGHT" nowrap valign="BASELINE">To: </th>
<td>Nils Hempelmann <a href="mailto:info@nilshempelmann.de" target="_blank"><info@nilshempelmann.de></a></td>
</tr>
<tr>
<th align="RIGHT" nowrap valign="BASELINE">CC: </th>
<td><a href="mailto:wps-dev@lists.dkrz.de" target="_blank">wps-dev@lists.dkrz.de</a>,
Carsten Ehbrecht
<a href="mailto:ehbrecht@dkrz.de" target="_blank">
<ehbrecht@dkrz.de></a>, <a href="mailto:Wolfgang.Falk@lwf.bayern.de" target="_blank">Wolfgang.Falk@lwf.bayern.de</a>,
Scott Chamberlain <a href="mailto:myrmecocystus@gmail.com" target="_blank"><myrmecocystus@gmail.com></a></td>
</tr>
</tbody>
</table>
<br>
<br>
<p dir="ltr">Nils,</p>
<p dir="ltr">No, there is no other way to
get a larger number of records at once
using the GBIF API (although one could
achieve this by sequentially
querying the server in "batches" of 200
or 300 records each). As I said, there are
good operational reasons for the GBIF
developers to have imposed such a limit.</p>
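<p dir="ltr">A minimal sketch of such sequential querying, assuming each response is a dictionary with a 'results' list and an 'endOfRecords' flag. The helper fetch_all and its parameters are hypothetical names; fetch_page is any function that accepts limit and offset.</p>
<pre>
```python
def fetch_all(fetch_page, page_size=300, max_records=None):
    """Collect records by calling fetch_page(limit, offset) page after page."""
    records = []
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        results = page["results"]
        records.extend(results)
        offset += len(results)
        # Stop on an empty page or when the server reports the last page.
        if not results or page.get("endOfRecords", False):
            break
        # Stop early once the optional cap has been reached.
        if max_records is not None and len(records) >= max_records:
            break
    return records if max_records is None else records[:max_records]
```
</pre>
<p dir="ltr">With pygbif this might be called as fetch_all(lambda limit, offset: occurrences.search(taxonKey=key, limit=limit, offset=offset), max_records=5000). A cap is advisable, since the server rejects requests beyond a certain size.</p>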
<p dir="ltr">As for your other question, it
would be better put to the GBIF
developers themselves (on the discussion
list, so that we can all benefit from the
answers! :-))</p>
<p dir="ltr">With warmest regards,</p>
<p dir="ltr">--<br>
Dr. Mauro J. Cavalcanti<br>
E-mail: <a href="mailto:maurobio@gmail.com" target="_blank">maurobio@gmail.com</a><br>
Web: <a href="http://sites.google.com/site/maurobio" target="_blank">http://sites.google.com/site/maurobio</a></p>
<div class="gmail_quote">On 02/06/2016
08:49, "Nils Hempelmann" <<a href="mailto:info@nilshempelmann.de" target="_blank">info@nilshempelmann.de</a>>
wrote:<br type="attribution">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">Hi
Mauro<br>
<br>
Oh ... <br>
Is there no way to set it to unlimited?<br>
The 'manual via browser' option, followed by
parsing the returned CSV, is the
current status.
<br>
Or is there any alternative to pygbif?<br>
<br>
If I understood correctly, the GBIF
database is organized as a web server,
so I guess there should be a way to
connect it to the birdhouse WPS, am I
right?
<br>
<br>
(put the web-dev list in copy)<br>
<br>
Merci <br>
Nils <br>
<br>
<br>
<div>On 02/06/2016 12:40, Mauro
Cavalcanti wrote:<br>
</div>
<blockquote type="cite">
<p dir="ltr">Nils,</p>
<p dir="ltr">That's a limit imposed
(for good operational reasons) by
the GBIF API. If you want a larger
number of records, you'll have to
download them "manually" (that is,
via browser) and then parse
locally the csv file returned from
the GBIF server.</p>
<p dir="ltr">Hope this helps.</p>
<p dir="ltr">Best regards,</p>
<p dir="ltr">--<br>
Dr. Mauro J. Cavalcanti<br>
E-mail: <a href="mailto:maurobio@gmail.com" target="_blank">
maurobio@gmail.com</a><br>
Web: <a href="http://sites.google.com/site/maurobio" target="_blank">
http://sites.google.com/site/maurobio</a></p>
<div class="gmail_quote">On
02/06/2016 04:48, "Nils
Hempelmann" <<a href="mailto:info@nilshempelmann.de" target="_blank">info@nilshempelmann.de</a>>
wrote:<br type="attribution">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">Hi Scott <br>
<br>
It works fine. Thanks a lot. I just
have a question about the search
limits: <br>
<br>
The maximum number of records seems to
be limited to 300. Is that on
purpose?<br>
And requesting more than
200000 gives a 'bad request'.<br>
<br>
Merci <br>
Nils <br>
<br>
<br>
In [68]: len(
occurrences.search(taxonKey=key,
limit=100)['results'])<br>
Out[68]: 100<br>
<br>
In [69]: len(
occurrences.search(taxonKey=key,
limit=300)['results'])<br>
Out[69]: 300<br>
<br>
In [70]: len(
occurrences.search(taxonKey=key,
limit=3000)['results'])<br>
Out[70]: 300<br>
<br>
In [71]: len(
occurrences.search(taxonKey=key,
limit=200000)['results'])<br>
Out[71]: 300<br>
<br>
In [72]: len(
occurrences.search(taxonKey=key,
limit=200001)['results'])<br>
---------------------------------------------------------------------------<br>
HTTPError
Traceback (most recent call
last)<br>
<ipython-input-72-2f7d7b4ccba0>
in <module>()<br>
----> 1 len(
occurrences.search(taxonKey=key,
limit=200001)['results'])<br>
<br>
/home/nils/.conda/envs/birdhouse/lib/python2.7/site-packages/pygbif/occurrences/search.pyc
in search(taxonKey,
scientificName, country,
publishingCountry,
hasCoordinate, typeStatus,
recordNumber, lastInterpreted,
continent, geometry,
recordedBy, basisOfRecord,
datasetKey, eventDate,
catalogNumber, year, month,
decimalLatitude,
decimalLongitude, elevation,
depth, institutionCode,
collectionCode,
hasGeospatialIssue, issue, q,
mediatype, limit, offset,
**kwargs)<br>
251
'collectionCode':
collectionCode,
'hasGeospatialIssue':
hasGeospatialIssue,<br>
252 'issue':
issue, 'q': q, 'mediatype':
mediatype, 'limit': limit,<br>
--> 253 'offset':
offset}, **kwargs)<br>
254 return out<br>
<br>
/home/nils/.conda/envs/birdhouse/lib/python2.7/site-packages/pygbif/gbifutils.pyc
in gbif_GET(url, args,
**kwargs)<br>
17 def gbif_GET(url,
args, **kwargs):<br>
18 out =
requests.get(url, params=args,
headers=make_ua(), **kwargs)<br>
---> 19
out.raise_for_status()<br>
20
stopifnot(out.headers['content-type'])<br>
21 return out.json()<br>
<br>
/home/nils/.conda/envs/birdhouse/lib/python2.7/site-packages/requests/models.pyc
in raise_for_status(self)<br>
842 <br>
843 if
http_error_msg:<br>
--> 844 raise
HTTPError(http_error_msg,
response=self)<br>
845 <br>
846 def close(self):<br>
<br>
HTTPError: 400 Client Error:
Bad Request for url: <a href="http://api.gbif.org/v1/occurrence/search?taxonKey=2882316&limit=200001&offset=0" target="_blank">http://api.gbif.org/v1/occurrence/search?taxonKey=2882316&limit=200001&offset=0</a><br>
<br>
In [73]: <br>
<br>
<br>
<div>On 02/06/2016 02:22,
Scott Chamberlain wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Fixes to the docs
and a fix to download_get
are up on GitHub now. I will
push a new version to PyPI
soon. Let me know if anything
still doesn't work for
you after reinstalling
from GitHub.
<div><br>
</div>
<div>S</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Wed, Jun
1, 2016 at 4:51 PM Nils
Hempelmann <<a href="mailto:info@nilshempelmann.de" target="_blank">info@nilshempelmann.de</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">Hi
Scott and Mauro <br>
<br>
Mauro sent me a
snippet of code which
worked fine: <br>
<a href="https://github.com/bird-house/flyingpigeon/blob/develop/scripts/pygbif_occurence.py" target="_blank">https://github.com/bird-house/flyingpigeon/blob/develop/scripts/pygbif_occurence.py</a><br>
<br>
I found the example
here:<br>
<a href="https://github.com/maurobio/pygbif#occurrence-data" target="_blank">https://github.com/maurobio/pygbif#occurrence-data</a><br>
<br>
and here: <br>
<a href="https://github.com/sckott/pygbif#occurrences-module" target="_blank">https://github.com/sckott/pygbif#occurrences-module</a><br>
<br>
Thanks a lot, this is
great. It enables a
lot ;-) <br>
I'll keep you posted. <br>
<br>
merci <br>
</div>
<div bgcolor="#FFFFFF" text="#000000">Nils <br>
</div>
<div bgcolor="#FFFFFF" text="#000000"><br>
<br>
<br>
<br>
<br>
<div>On 02/06/2016
01:45, Scott
Chamberlain wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">And
where are those
examples from
exactly? I don't
see them when
searching the repo
(which includes
all docs).
<div><br>
</div>
<div>`pygbif.name_suggest`
wouldn't work
because
`name_suggest`
is a method in
the `species`
module, so you'd
have to do
pygbif.species.name_suggest,
or `from pygbif
import species`,
then
`species.name_suggest`</div>
<div><br>
</div>
<div>Looks like <span>occ.get(taxonKey
= 252408386)<span>
is a
documentation
bug; that
should be
`key` instead
of `taxonKey`,
a copy-paste
error. Will
fix that. </span></span></div>
<div><font face="helvetica
neue,helvetica,
arial,sans-serif" color="#212121"><br>
</font></div>
<div><font face="helvetica
neue,helvetica,
arial,sans-serif" color="#212121">The `occ.download_get` call has a
small bug,
will fix that</font></div>
<div><font face="helvetica
neue,helvetica,
arial,sans-serif" color="#212121"><br>
</font></div>
<div><font face="helvetica
neue,helvetica,
arial,sans-serif" color="#212121">All other calls work for me<br>
</font>
<div><br>
</div>
<div>S </div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On
Wed, Jun 1, 2016
at 4:34 PM Scott
Chamberlain <<a href="mailto:myrmecocystus@gmail.com" target="_blank">myrmecocystus@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Hi,
<div><br>
</div>
<div>What
version of
pygbif are you
on? And what
version of
Python?<br>
<div><br>
</div>
<div>Best,
Scott<br>
</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On
Wed, Jun 1,
2016 at 4:02
PM Nils
Hempelmann
<<a href="mailto:info@nilshempelmann.de" target="_blank">info@nilshempelmann.de</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi Mauro and
Scott<br>
<br>
I was checking
out pygbif.
It seems to be a
very useful
tool.<br>
<br>
Can you help
me with the
syntax (or
point me to
the
appropriate
person<br>
;-) )<br>
The given
snippets of
code are
outdated.<br>
<br>
I am basically
just looking
for the
occurrence
coordinates:<br>
here is my
first try :<br>
<br>
import pygbif<br>
occ =
pygbif.occurrences.search(scientificName='Fagus
sylvatica')<br>
occ['count']<br>
<br>
... and
further? ;-)<br>
<br>
The examples
in the docs
throw
errors:<br>
<br>
key=
pygbif.name_suggest(q='Helianthus
annuus',rank='species')['key']<br>
pygbif.search(taxonKey=key[0]['key'],limit=2)<br>
<br>
from pygbif
import
occurrences as
occ<br>
occ.search(taxonKey
= 3329049)<br>
occ.get(taxonKey
= 252408386)<br>
occ.count(isGeoreferenced
= True)<br>
occ.download_list(user
= "sckott",
limit = 5)<br>
occ.download_meta(key
=
"0000099-140929101555934")<br>
occ.download_get("0000099-140929101555934")<br>
<br>
<br>
Thanks<br>
Nils<br>
<br>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
</div>
<br>
</div>
</div>
</span></blockquote>
<br>
</div>
</div>
</span></div>
_______________________________________________<br>
API-users mailing list<br>
<a href="mailto:API-users@lists.gbif.org" target="_blank">API-users@lists.gbif.org</a><br>
<a href="http://lists.gbif.org/mailman/listinfo/api-users" rel="noreferrer" target="_blank">http://lists.gbif.org/mailman/listinfo/api-users</a><br>
</blockquote>
</div>
<br>
</blockquote>
<br>
</div></blockquote></div>