Hi Ben,

We've made some changes to our download system that raise the limit beyond 300 species. The exact limit is unclear, since it depends on the length of the query in characters. (A query that large also runs particularly slowly.)

I've rerun your two failed downloads; you should have received an email notification for each of them. I can see this is probably too late and that you've already worked on splitting the request into multiple downloads -- apologies for the delay here.

Cheers,

Matt

On 08/04/2019 21:06, Benjamin Feinsilver wrote:
Thanks, Tim. I'll take another stab at it this week if I have time. I'm hesitant to try the wider search approach because the list of plant species I have is pretty diverse and I don't think it could conveniently be split into a few taxonomic groups. I don't think it would make sense to try to download all 250M plant occurrences at the kingdom level either.

On Mon, Apr 8, 2019 at 3:23 AM Tim Robertson <trobertson@gbif.org> wrote:

Hi Ben,

 

Thanks. Apparently even 300 is too long.

 

For background, the issues relate to 1) limits on the length allowed for an HTTP GET (internally there is a GET call) and 2) a limit imposed by the workflow engine that manages the context for the download.

Because this is an asynchronous service, you would also see the error if you polled the API.
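
For illustration, polling an asynchronous download could look roughly like the sketch below. The status URL (https://api.gbif.org/v1/occurrence/download/ followed by the download key), the "status" field and the set of final status values are assumptions, not taken from this thread, and the function names are hypothetical; check them against the API documentation before relying on them.

```python
import json
import time
import urllib.request

# Assumed base URL for looking up a download by its key (not stated in the thread).
STATUS_URL = "https://api.gbif.org/v1/occurrence/download/"

# Assumed set of final states; anything else is treated as "still in progress".
FINAL_STATES = {"SUCCEEDED", "FAILED", "KILLED", "CANCELLED"}


def fetch_status(download_key):
    """Return the 'status' field of the download's metadata (assumed JSON layout)."""
    with urllib.request.urlopen(STATUS_URL + download_key) as resp:
        return json.load(resp).get("status")


def wait_for_download(download_key, fetch=fetch_status, interval=60,
                      max_polls=120, sleep=time.sleep):
    """Poll until the download reaches a final state or max_polls is exhausted.

    Returns the final status string, or None if no final state was seen.
    `fetch` and `sleep` are injectable so the loop can be tried offline.
    """
    for _ in range(max_polls):
        status = fetch(download_key)
        if status in FINAL_STATES:
            return status
        sleep(interval)
    return None
```

A call such as wait_for_download("your-download-key") would then return "FAILED" for a request like the one above rather than leaving you waiting for the email.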

 

I’m afraid you need either to reduce the size of the request or to take the approach I suggested: a wider search (e.g. a higher taxon) followed by post-filtering.
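
As a rough sketch of the post-filtering step, assuming the wider download arrives as a tab-delimited file with a speciesKey column (the column name is an assumption, and the sample rows are invented purely for illustration):

```python
import csv
import io


def filter_occurrences(rows, wanted_keys, column="speciesKey"):
    """Yield only the rows whose `column` value is in the wanted list.

    `column` is an assumed column name; adjust it to match the download file.
    """
    wanted = {str(key) for key in wanted_keys}
    for row in rows:
        if row.get(column) in wanted:
            yield row


# Invented sample standing in for a downloaded occurrence file.
sample = (
    "gbifID\tspeciesKey\tspecies\n"
    "1\t100\tAus bus\n"
    "2\t200\tCus dus\n"
    "3\t100\tAus bus\n"
)

reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
kept = list(filter_occurrences(reader, [100]))
print([row["gbifID"] for row in kept])  # prints ['1', '3']
```

With a real file you would pass an open file handle to csv.DictReader instead of the in-memory sample, which lets the filter stream through the rows without loading the whole download.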

 

I hope this helps.

 

Thanks,

Tim

 

 

From: Benjamin Feinsilver <benjamin.feinsilver@gmail.com>
Date: Monday, 8 April 2019 at 05.07
To: Tim Robertson <trobertson@gbif.org>
Cc: "api-users@lists.gbif.org" <api-users@lists.gbif.org>
Subject: Re: [API-users] Requesting Occurrence Data for Large List of Species

 

Hi Tim,

 

I received an error message (via email) when attempting to post 300 taxon keys:

 

"We are sorry, but an error has occurred processing your download."

 

Please see attached query file.

 

Curl command:

 

curl --include --user username:password --header "Content-Type: application/json" --data @query_1.json http://api.gbif.org/v1/occurrence/download/request

 

I received an HTTP status code of "201 Created".

 

Thanks,

 

Ben

 

On Wed, Apr 3, 2019 at 3:52 AM Tim Robertson <trobertson@gbif.org> wrote:

Hi Benjamin,

 

Download will be best.

 

However, there are limits and you will not be able to push 3000 in.

You could either split it into groups of e.g. 300, or use a higher taxon and then implement a post-filter to throw away those not in your list (the latter is how I would do it).
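
A minimal sketch of the splitting option, assuming you already have a list of taxon keys and that TAXON_KEY is the predicate key to use (both are assumptions, as is the workable group size; the helper names are hypothetical). The request body follows the shape of the example in the original question.

```python
import json


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_requests(taxon_keys, creator, email, size):
    """Build one download request body per group of taxon keys."""
    return [
        {
            "creator": creator,
            "notification_address": [email],
            "predicate": {
                "type": "in",
                "key": "TAXON_KEY",  # assumed predicate key
                "values": [str(key) for key in group],
            },
        }
        for group in chunked(taxon_keys, size)
    ]


# Placeholder keys for illustration only; real keys would come from your species list.
taxon_keys = list(range(1, 3001))
bodies = build_requests(taxon_keys, "userName", "userName@example.org", size=300)

# Each serialized body could be saved as its own query file and posted separately.
payloads = [json.dumps(body) for body in bodies]
print(len(payloads), "request bodies;", len(payloads[0]), "characters in the first")
```

The printed character count gives a rough feel for how long each request is; if a group is rejected, lowering `size` is the obvious knob to turn.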

 

I am sorry for the nuisance; this is a known issue that we aim to address: https://github.com/gbif/portal-feedback/issues/1768

 

Thanks,

Tim

 

 

From: API-users <api-users-bounces@lists.gbif.org> on behalf of Benjamin Feinsilver <benjamin.feinsilver@gmail.com>
Date: Wednesday, 3 April 2019 at 09.33
To: "api-users@lists.gbif.org" <api-users@lists.gbif.org>
Subject: [API-users] Requesting Occurrence Data for Large List of Species

 

Hello,

 

If I have a list of around 3,000 species, and I would like to request occurrence data for each species, is it more efficient to use the Search or Download API?

 

If using the Download API, could I include the list of species in an external query file and use the "in" predicate? For example:

 

{
  "creator":"userName",
  "notification_address": ["userName@example.org"],
  "predicate":
  {
    "type":"in",
    "key":"SCIENTIFIC_NAME",
    "values":["cat1","cat2","cat3"]
  }
}

 

Thanks,

 

Ben


_______________________________________________
API-users mailing list
API-users@lists.gbif.org
https://lists.gbif.org/mailman/listinfo/api-users