Occurrence search API limitations
Hello,
I know there is a limit of 200000 records when using the occurrence search API, and also that this limit can be avoided using the asynchronous download service instead.
The problem is that this asynchronous service sends a link to the file to an email address. That method does not seem very convenient for programmatic downloads made from an interactive application, for example.
Is there another mechanism that allows downloading more than 200000 records?
Is removal of this limitation expected?
Thank you very much.
Best regards,
E.Gracia
Hi,
we do not expect to remove or even extend the search limit, I am afraid. For programmatic asynchronous searches you need to actively poll the download request status to see when it's done and then retrieve the content.
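A minimal sketch of the polling approach described above, not an official client. It assumes that the download metadata is served as JSON at http://api.gbif.org/v1/occurrence/download/<key> and that it carries a "status" field with values such as SUCCEEDED, FAILED, KILLED or CANCELLED; neither the endpoint nor the status names are stated in this thread. The function names are hypothetical.

```shell
# Assumed metadata endpoint; override it through the environment if it differs.
GBIF_DOWNLOAD_API="${GBIF_DOWNLOAD_API:-http://api.gbif.org/v1/occurrence/download}"

# Print the raw JSON metadata for the download key given as $1.
fetch_download_metadata() {
  curl -s "$GBIF_DOWNLOAD_API/$1"
}

# Read JSON on stdin and print the value of a "status" field
# (first matching line only). This is a crude text match; a real
# client should use a proper JSON parser.
extract_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Poll until the download reaches a terminal state.
# $1 = download key, $2 = seconds between polls, $3 = maximum number of polls.
# Prints the final status; returns 0 on success, 1 on failure, 2 on timeout.
wait_for_download() {
  local key="$1" interval="${2:-60}" max_polls="${3:-60}" status i=0
  while [ "$i" -lt "$max_polls" ]; do
    status="$(fetch_download_metadata "$key" | extract_status)"
    case "$status" in
      SUCCEEDED) echo "$status"; return 0 ;;
      FAILED|KILLED|CANCELLED) echo "$status"; return 1 ;;
    esac
    # Any other status (or an empty reply) means "not ready yet": wait and retry.
    sleep "$interval"
    i=$((i + 1))
  done
  echo "TIMEOUT"
  return 2
}
```

An interactive application could call wait_for_download with the key returned by the download request, and fetch the archive only when the function returns 0.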
Alternatively, it would be fairly simple for us to add new success and failure callback URL parameters when creating a new request. We would then call these URLs once the download has completed or failed. Would that make things a lot easier than actively polling?
Markus
On 24 Apr 2015, at 11:18, ptrans2004es <ptrans2004es@yahoo.es> wrote:
> I know there is a limit of 200000 records when using the occurrence search API [...]
_______________________________________________
API-users mailing list
API-users@lists.gbif.org
http://lists.gbif.org/mailman/listinfo/api-users
Just wanted to say I think a callback URL (instead of polling a page or a mailbox) would be an amazing addition.
Some colleagues have already asked me whether such a mechanism was available; I'm pretty sure we're not alone!
Best,
Nico
On 24/04/15 11:28, Markus Döring wrote:
> we do not expect to remove or even extend the search limit I am afraid. [...]
Hi Nicolas,
If I understand you correctly, there is already such a callback URL you can use. The call to the GBIF API for an asynchronous download responds with the downloadKey and with the callback URL (as the Location header). The callback URL can be pinged to check whether the download is ready.
I have myself been using a Bash function to submit an asynchronous download request for a given taxonKey (provided as the first parameter when calling the function). I then simply waited a day or two before using wget to collect the download files.
function gbifapi {
  # $1 = taxonKey; response headers and body are appended to log_gbifapi.txt
  curl -i --user _USER_:_PASSWORD_ \
    -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    -X POST \
    -d "{\"creator\":\"_USER_\", \"notification_address\": [\"_EMAIL_\"], \"predicate\": {\"type\":\"and\", \"predicates\": [{\"type\":\"equals\",\"key\":\"HAS_COORDINATE\",\"value\":\"true\"}, {\"type\":\"equals\", \"key\":\"TAXON_KEY\", \"value\":\"$1\"}] }}" \
    http://api.gbif.org/v1/occurrence/download/request >> log_gbifapi.txt
}
############################
# Example of a logfile entry
############################
/**
----------------
HTTP/1.1 201 Created
Server: Apache-Coyote/1.1
Location: http://apps.gbif.org/b_occurrence-ws/occurrence/download/request/0006630-141...
Access-Control-Allow-Origin: *
Content-Type: application/json
x-api-url: /v1/occurrence/download/request
Transfer-Encoding: chunked
Date: Thu, 20 Nov 2014 09:32:59 GMT
X-Varnish: 1422608112
Age: 0
Via: 1.1 varnish
Connection: keep-alive

0006630-141024112412452 2704395 Cynosurus cristatus
----------------

-------To be converted into-------->
http://api.gbif.org/v1/occurrence/download/request/0006630-141024112412452.z...
**/
############################
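The "to be converted into" step in the log example could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, and the ".zip" suffix on the archive URL is an assumption, since the URL in the log entry above is truncated.

```shell
# Base URL taken from the request function above.
GBIF_REQUEST_API="http://api.gbif.org/v1/occurrence/download/request"

# Read a saved "curl -i" response on stdin and print the last path segment
# of the first "Location:" (or "location:") header, i.e. the download key.
key_from_response() {
  tr -d '\r' \
    | sed -n 's/^[Ll]ocation:[[:space:]]*//p' \
    | head -n 1 \
    | sed 's|.*/||'
}

# Print the archive URL for the download key given as $1.
# ASSUMPTION: the archive is served with a ".zip" suffix.
download_url() {
  printf '%s/%s.zip\n' "$GBIF_REQUEST_API" "$1"
}
```

With these helpers, the URL for wget can be built straight from a logged response instead of being edited by hand, for example: wget "$(download_url "$(key_from_response < response.txt)")", where response.txt is a hypothetical file holding a single response.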
On 24 April 2015 at 11:31, Nicolas Noé <n.noe@biodiversity.be> wrote:
> Just wanted to say I think a callback URL (instead of polling a page or a mailbox) would be an amazing addition. [...]
participants (4)
- Dag Endresen
- Markus Döring
- Nicolas Noé
- ptrans2004es