[API-users] Requesting Occurrence Data for Large List of Species

Tim Robertson trobertson at gbif.org
Mon Apr 8 09:23:15 CEST 2019


Hi Ben,

Thanks. Apparently even 300 is too long.

For background, the issues relate to 1) limits on the length allowed for an HTTP GET (internally there is a GET call) and 2) a limit imposed by the workflow engine that manages the context for the download.
Because the service is asynchronous, the error only shows up after the request has been accepted; if you polled the API you’d also see the error.
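
To illustrate the polling mentioned above, here is a minimal sketch in Python. The status endpoint (/v1/occurrence/download/<key>), the "status" field and the list of terminal state names are assumptions based on my understanding of the public API, not something confirmed in this thread; the function names are hypothetical.

```python
import json
import time
import urllib.request

# Assumption: states after which the download will not change any more.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "KILLED", "CANCELLED"}


def fetch_status(download_key):
    """Return the status string for one download (assumed endpoint and field)."""
    url = f"https://api.gbif.org/v1/occurrence/download/{download_key}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)["status"]


def wait_for_download(download_key, fetch=fetch_status, interval=60, max_polls=120):
    """Poll until the download reaches a terminal state, or give up."""
    for _ in range(max_polls):
        status = fetch(download_key)
        if status in TERMINAL_STATES:
            return status
        time.sleep(interval)  # be polite: do not hammer the API
    raise TimeoutError(f"download {download_key} not finished after {max_polls} polls")
```

The download key would be the value returned in the body of the "201 Created" response; a result of "FAILED" would correspond to the error email.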

I’m afraid you need to either reduce the size of the list or take the approach I suggested: a wider search (e.g. a higher taxon) followed by post-filtering.
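
A rough sketch of the post-filtering step in Python follows. It assumes the download is a tab-separated text file with a header row and a "species" column; the column name, the file layout and the function name are assumptions for illustration, so adjust them to the file you actually receive.

```python
def filter_occurrences(in_path, out_path, wanted_species, column="species"):
    """Copy only the rows whose `column` value is in `wanted_species`.

    Returns the number of rows kept. Matching ignores case and
    surrounding whitespace.
    """
    wanted = {name.strip().lower() for name in wanted_species}
    kept = 0
    with open(in_path, encoding="utf-8", newline="") as src, \
            open(out_path, "w", encoding="utf-8", newline="") as dst:
        header = src.readline()
        # Locate the column once; raises ValueError if it is missing.
        index = header.rstrip("\r\n").split("\t").index(column)
        dst.write(header)
        for line in src:
            fields = line.rstrip("\r\n").split("\t")
            if len(fields) > index and fields[index].strip().lower() in wanted:
                dst.write(line)
                kept += 1
    return kept
```

The file is read line by line, so a download covering a whole higher taxon does not need to fit in memory.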

I hope this helps.

Thanks,
Tim


From: Benjamin Feinsilver <benjamin.feinsilver at gmail.com>
Date: Monday, 8 April 2019 at 05.07
To: Tim Robertson <trobertson at gbif.org>
Cc: "api-users at lists.gbif.org" <api-users at lists.gbif.org>
Subject: Re: [API-users] Requesting Occurrence Data for Large List of Species

Hi Tim,

I received an error message (via email) when attempting to post 300 taxon keys:

"We are sorry, but an error has occurred processing your download."

Please see attached query file.

Curl command:

curl --include --user username:password --header "Content-Type: application/json" --data @query_1.json http://api.gbif.org/v1/occurrence/download/request

I received an HTTP status code "201 Created."

Thanks,

Ben

On Wed, Apr 3, 2019 at 3:52 AM Tim Robertson <trobertson at gbif.org> wrote:
Hi Benjamin,

Download will be best.

However, there are limits and you will not be able to push 3000 in.
You could either split it into groups of e.g. 300, or use a higher taxon and then implement a post-filter to throw away those not in your list (the latter is how I would do it).
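
As an illustration of the splitting option, a small Python sketch that cuts a long list into groups and builds one request body per group. The JSON layout is copied from the example query in the original question; the function name and its parameters are hypothetical, and the group size is left as a parameter because the workable limit is not something this sketch can establish.

```python
import json


def build_requests(values, creator, email, key="SCIENTIFIC_NAME", group_size=300):
    """Return one download request body (a dict) per group of values."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    requests = []
    for start in range(0, len(values), group_size):
        requests.append({
            "creator": creator,
            "notification_address": [email],
            "predicate": {
                "type": "in",
                "key": key,
                "values": values[start:start + group_size],
            },
        })
    return requests


if __name__ == "__main__":
    # Placeholder names stand in for the real list of species.
    names = [f"species {i}" for i in range(3000)]
    bodies = build_requests(names, "userName", "userName@example.org")
    for number, body in enumerate(bodies, start=1):
        with open(f"query_{number}.json", "w", encoding="utf-8") as handle:
            json.dump(body, handle, indent=2)
```

Each resulting query_N.json file would then be posted as a separate download request.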

I am sorry for the nuisance; this is a known issue that we aim to address: https://github.com/gbif/portal-feedback/issues/1768

Thanks,
Tim


From: API-users <api-users-bounces at lists.gbif.org> on behalf of Benjamin Feinsilver <benjamin.feinsilver at gmail.com>
Date: Wednesday, 3 April 2019 at 09.33
To: "api-users at lists.gbif.org" <api-users at lists.gbif.org>
Subject: [API-users] Requesting Occurrence Data for Large List of Species

Hello,

If I have a list of around 3,000 species, and I would like to request occurrence data for each species, is it more efficient to use the Search or Download API?

If using the Download API, could I include the list of species in an external query file and use the "in" predicate? For example:

{
  "creator":"userName",
  "notification_address": ["userName@example.org"],
  "predicate":
  {
    "type":"in",
    "key":"SCIENTIFIC_NAME",
    "values":["cat1","cat2","cat3"]
  }
}

Thanks,

Ben