[API-users] birdhouse meets GBIF
Nils Hempelmann
info at nilshempelmann.de
Thu Jun 2 15:00:03 CEST 2016
Hi Tim
Thanks for the quick answer. If not via an OGC service, how can the
download API be called from outside of GBIF?
Merci
Nils
On 02/06/2016 14:45, Tim Robertson wrote:
> Hi Nils
>
> We don’t have any OGC services, but there is an asynchronous download
> API which can deliver CSVs.
> Off the top of my head, the only way you can automate this at the
> moment would be to periodically issue a download (e.g. daily),
> process it as you see fit, and cache the result for your app.
>
> In the download API you can request anything from 1 to more than 660
> million records, which is why it is asynchronous. It’s used a lot by
> various applications and communities.
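[Editorial note: a minimal sketch of what such an automated download could look like. The predicate structure and the SIMPLE_CSV format follow GBIF's public occurrence download API; the user name, e-mail address, and taxon key below are placeholders, not real values.]

```python
# Sketch only: build the JSON body for GBIF's asynchronous download API
# (POST /v1/occurrence/download/request). User, e-mail, and taxon key
# are placeholders.

def build_download_request(user, email, taxon_key):
    """Return a download request asking for a SIMPLE_CSV of all
    georeferenced occurrences of one taxon."""
    return {
        "creator": user,
        "notificationAddresses": [email],
        "format": "SIMPLE_CSV",
        "predicate": {
            "type": "and",
            "predicates": [
                {"type": "equals", "key": "TAXON_KEY", "value": str(taxon_key)},
                {"type": "equals", "key": "HAS_COORDINATE", "value": "true"},
            ],
        },
    }

# Submitting and polling would look roughly like this (it needs a
# registered GBIF account, so it is left commented out):
#
#   import requests
#   resp = requests.post(
#       "http://api.gbif.org/v1/occurrence/download/request",
#       json=build_download_request("gbif_user", "me@example.org", 2882316),
#       auth=("gbif_user", "gbif_password"),
#   )
#   key = resp.text  # a download key such as "0000099-140929101555934"
#   # Poll http://api.gbif.org/v1/occurrence/download/<key> until the
#   # status is SUCCEEDED, then fetch the CSV from its "downloadLink".
```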
>
> I hope this helps,
> Tim
>
>
> From: API-users <api-users-bounces at lists.gbif.org
> <mailto:api-users-bounces at lists.gbif.org>> on behalf of Nils
> Hempelmann <info at nilshempelmann.de <mailto:info at nilshempelmann.de>>
> Date: Thursday 2 June 2016 at 14:25
> To: "api-users at lists.gbif.org <mailto:api-users at lists.gbif.org>"
> <api-users at lists.gbif.org <mailto:api-users at lists.gbif.org>>,
> "wps-dev at lists.dkrz.de <mailto:wps-dev at lists.dkrz.de>"
> <wps-dev at lists.dkrz.de <mailto:wps-dev at lists.dkrz.de>>
> Subject: [API-users] birdhouse meets GBIF
>
> Dear all
>
> Here is a current problem to solve :-) for the birdhouse WPS and GBIF
> developers.
> There is a species distribution process in the birdhouse WPS which
> processes climate data based on occurrence coordinates from GBIF.
>
> Currently the GBIF data are passed to the process via a URL to the
> GBIF CSV (which has to be generated manually beforehand).
> I implemented a direct occurrence search from birdhouse to GBIF
> with pygbif. It worked fine, BUT: there is a limit of 300 records
> per request, which is far too few to train the species distribution
> model.
>
> So here is the question: is there a way to request the species
> occurrence coordinates directly (e.g. like a WPS request)?
>
> (The previous conversation follows below.)
>
> And some links for background information:
>
> Birdhouse:
> http://birdhouse.readthedocs.io/en/latest/
> GBIF:
> http://www.gbif.org/
>
> Species distribution process:
> http://flyingpigeon.readthedocs.io/en/latest/tutorials/sdm.html
>
> Birdhouse architecture:
> https://github.com/bird-house/birdhouse-docs/blob/master/slides/birdhouse-architecture/birdhouse-architecture.pdf
>
> Merci
> Nils
>
> -------- Forwarded Message --------
> Subject: Re: pygbif for occurrence coordinates
> Date: Thu, 2 Jun 2016 09:00:49 -0300
> From: Mauro Cavalcanti <maurobio at gmail.com>
> To: Nils Hempelmann <info at nilshempelmann.de>
> CC: wps-dev at lists.dkrz.de, Carsten Ehbrecht <ehbrecht at dkrz.de>,
> Wolfgang.Falk at lwf.bayern.de, Scott Chamberlain <myrmecocystus at gmail.com>
>
>
>
> Nils,
>
> No, there is no other way to get a larger number of records at once
> using the GBIF API (although one could achieve this by sequentially
> querying the server in "batches" of 200 or 300 records each). As I
> said, there are good operational reasons for the GBIF developers to
> have imposed such a limit.
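[Editorial note: the batching Mauro describes could be sketched as below. The pygbif call signature (taxonKey, limit, offset, and a result dict with "results" and "endOfRecords") is taken from the session transcript later in this thread; the fetch function is injectable so the paging logic itself does not depend on network access.]

```python
def fetch_all_occurrences(taxon_key, fetch=None, page_size=300, max_records=200000):
    """Page through occurrence search results in batches of up to 300
    (the API's per-request cap), stopping at endOfRecords or at
    max_records (the search API rejects limits beyond 200000, as the
    traceback below shows)."""
    if fetch is None:
        # Assumes pygbif is installed; imported lazily so the paging
        # loop can also be driven by any other search function.
        from pygbif import occurrences
        fetch = occurrences.search
    records = []
    offset = 0
    while offset < max_records:
        page = fetch(taxonKey=taxon_key, limit=page_size, offset=offset)
        results = page.get("results", [])
        records.extend(results)
        if page.get("endOfRecords") or not results:
            break
        offset += len(results)
    return records
```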
>
> As for your other question, it would be better put to the GBIF
> developers themselves (on the discussion list, so that we can all
> benefit from the answers! :-))
>
> With warmest regards,
>
> --
> Dr. Mauro J. Cavalcanti
> E-mail: maurobio at gmail.com <mailto:maurobio at gmail.com>
> Web: http://sites.google.com/site/maurobio
>
> On 02/06/2016 08:49, "Nils Hempelmann" <info at nilshempelmann.de
> <mailto:info at nilshempelmann.de>> wrote:
>
> Hi Mauro
>
> Oh ...
> Is there no way to set it to unlimited?
> The 'manual via browser' option with parsing of the returned CSV
> is the current status.
> Or is there any alternative to pygbif?
>
> If I understood correctly, the GBIF database is exposed as a web
> service, so I guess there should be a way to connect it to the
> birdhouse WPS, am I right?
>
> (I put the wps-dev list in copy.)
>
> Merci
> Nils
>
>
> On 02/06/2016 12:40, Mauro Cavalcanti wrote:
>>
>> Nils,
>>
>> That's a limit imposed (for good operational reasons) by the GBIF
>> API. If you want a larger number of records, you'll have to
>> download them "manually" (that is, via browser) and then parse
>> the CSV file returned from the GBIF server locally.
>>
>> Hope this helps.
>>
>> Best regards,
>>
>> --
>> Dr. Mauro J. Cavalcanti
>> E-mail: maurobio at gmail.com <mailto:maurobio at gmail.com>
>> Web: http://sites.google.com/site/maurobio
>> <http://sites.google.com/site/maurobio>
>>
>> On 02/06/2016 04:48, "Nils Hempelmann" <info at nilshempelmann.de
>> <mailto:info at nilshempelmann.de>> wrote:
>>
>> Hi Scott
>>
>> It works fine, thanks a lot. I just have a question about the
>> search limits:
>>
>> The maximum number of records seems to be limited to 300. Is that
>> on purpose?
>> And requesting a limit of more than 200000 gives a 'Bad Request'
>> error.
>>
>> Merci
>> Nils
>>
>>
>> In [68]: len( occurrences.search(taxonKey=key,
>> limit=100)['results'])
>> Out[68]: 100
>>
>> In [69]: len( occurrences.search(taxonKey=key,
>> limit=300)['results'])
>> Out[69]: 300
>>
>> In [70]: len( occurrences.search(taxonKey=key,
>> limit=3000)['results'])
>> Out[70]: 300
>>
>> In [71]: len( occurrences.search(taxonKey=key,
>> limit=200000)['results'])
>> Out[71]: 300
>>
>> In [72]: len( occurrences.search(taxonKey=key,
>> limit=200001)['results'])
>> ---------------------------------------------------------------------------
>> HTTPError Traceback (most recent call last)
>> <ipython-input-72-2f7d7b4ccba0> in <module>()
>> ----> 1 len( occurrences.search(taxonKey=key,
>> limit=200001)['results'])
>>
>> /home/nils/.conda/envs/birdhouse/lib/python2.7/site-packages/pygbif/occurrences/search.pyc
>> in search(taxonKey, scientificName, country,
>> publishingCountry, hasCoordinate, typeStatus, recordNumber,
>> lastInterpreted, continent, geometry, recordedBy,
>> basisOfRecord, datasetKey, eventDate, catalogNumber, year,
>> month, decimalLatitude, decimalLongitude, elevation, depth,
>> institutionCode, collectionCode, hasGeospatialIssue, issue,
>> q, mediatype, limit, offset, **kwargs)
>> 251 'collectionCode': collectionCode,
>> 'hasGeospatialIssue': hasGeospatialIssue,
>> 252 'issue': issue, 'q': q, 'mediatype':
>> mediatype, 'limit': limit,
>> --> 253 'offset': offset}, **kwargs)
>> 254 return out
>>
>> /home/nils/.conda/envs/birdhouse/lib/python2.7/site-packages/pygbif/gbifutils.pyc
>> in gbif_GET(url, args, **kwargs)
>> 17 def gbif_GET(url, args, **kwargs):
>> 18 out = requests.get(url, params=args,
>> headers=make_ua(), **kwargs)
>> ---> 19 out.raise_for_status()
>> 20 stopifnot(out.headers['content-type'])
>> 21 return out.json()
>>
>> /home/nils/.conda/envs/birdhouse/lib/python2.7/site-packages/requests/models.pyc
>> in raise_for_status(self)
>> 842
>> 843 if http_error_msg:
>> --> 844 raise HTTPError(http_error_msg,
>> response=self)
>> 845
>> 846 def close(self):
>>
>> HTTPError: 400 Client Error: Bad Request for url:
>> http://api.gbif.org/v1/occurrence/search?taxonKey=2882316&limit=200001&offset=0
>>
>> In [73]:
>>
>>
>> On 02/06/2016 02:22, Scott Chamberlain wrote:
>>> Fixes to the docs and a fix to download_get are up on GitHub now.
>>> I will push a new version to PyPI soon. Let me know if some things
>>> still don't work for you after reinstalling from GitHub.
>>>
>>> S
>>>
>>> On Wed, Jun 1, 2016 at 4:51 PM Nils Hempelmann
>>> <info at nilshempelmann.de <mailto:info at nilshempelmann.de>> wrote:
>>>
>>> Hi Scott and Mauro
>>>
>>> Mauro sent me a snippet of code which worked fine:
>>> https://github.com/bird-house/flyingpigeon/blob/develop/scripts/pygbif_occurence.py
>>>
>>> I found the example here:
>>> https://github.com/maurobio/pygbif#occurrence-data
>>>
>>> and here:
>>> https://github.com/sckott/pygbif#occurrences-module
>>>
>>> Thanks a lot, very great. That enables a lot ;-)
>>> I'll keep you posted.
>>>
>>> merci
>>> Nils
>>>
>>>
>>>
>>>
>>>
>>> On 02/06/2016 01:45, Scott Chamberlain wrote:
>>>> And where are those examples from exactly? I don't see
>>>> those examples when searching the repo (which includes
>>>> all docs).
>>>>
>>>> `pygbif.name_suggest` wouldn't work because
>>>> `name_suggest` is a method in the `species` module, so
>>>> you'd have to do `pygbif.species.name_suggest`, or `from
>>>> pygbif import species`, then `species.name_suggest`.
>>>>
>>>> Looks like `occ.get(taxonKey = 252408386)` is a
>>>> documentation bug; that should be `key` instead of
>>>> `taxonKey`, a copy-paste error. Will fix that.
>>>>
>>>> The `occ.download_get` call has a small bug, will fix that
>>>>
>>>> All other calls work for me
>>>>
>>>> S
>>>>
>>>> On Wed, Jun 1, 2016 at 4:34 PM Scott Chamberlain
>>>> <myrmecocystus at gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> What version of pygbif are you on? And what version
>>>> of Python?
>>>>
>>>> Best, Scott
>>>>
>>>> On Wed, Jun 1, 2016 at 4:02 PM Nils Hempelmann
>>>> <info at nilshempelmann.de> wrote:
>>>>
>>>> Hi Mauro and Scott
>>>>
>>>> I was checking out pygbif; it seems to be a
>>>> very useful tool.
>>>>
>>>> Can you help me with the syntax (or forward
>>>> me to the appropriate person ;-) )?
>>>> The given snippets of code are outdated.
>>>>
>>>> I am basically just looking for the occurrence
>>>> coordinates:
>>>> here is my first try:
>>>>
>>>> import pygbif
>>>> occ =
>>>> pygbif.occurrences.search(scientificName='Fagus
>>>> sylvatica')
>>>> occ['count']
>>>>
>>>> ... and further? ;-)
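[Editorial note: one way the "and further?" step could go, sketched here as an illustration. `extract_coordinates` is a hypothetical helper, not part of pygbif; it only assumes each search result is a dict that may carry `decimalLatitude`/`decimalLongitude` fields, as GBIF occurrence records do.]

```python
def extract_coordinates(results):
    """Return (lat, lon) pairs for the records that carry both
    decimalLatitude and decimalLongitude; skip the rest."""
    coords = []
    for rec in results:
        lat = rec.get("decimalLatitude")
        lon = rec.get("decimalLongitude")
        if lat is not None and lon is not None:
            coords.append((lat, lon))
    return coords

# With the search above, roughly:
#   occ = pygbif.occurrences.search(scientificName='Fagus sylvatica',
#                                   hasCoordinate=True, limit=300)
#   points = extract_coordinates(occ['results'])
```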
>>>>
>>>> the examples in the docs are throwing errors:
>>>>
>>>> key= pygbif.name_suggest(q='Helianthus
>>>> annuus',rank='species')['key']
>>>> pygbif.search(taxonKey=key[0]['key'],limit=2)
>>>>
>>>> from pygbif import occurrences as occ
>>>> occ.search(taxonKey = 3329049)
>>>> occ.get(taxonKey = 252408386)
>>>> occ.count(isGeoreferenced = True)
>>>> occ.download_list(user = "sckott", limit = 5)
>>>> occ.download_meta(key = "0000099-140929101555934")
>>>> occ.download_get("0000099-140929101555934")
>>>>
>>>>
>>>> Thanks
>>>> Nils
>>>>
>>>
>>
>
>
>