Downloading large numbers of datasets

I'm trying to download all the occurrence data with COUNTRY=US:
http://www.gbif.org/occurrence/search?COUNTRY=US

I've requested a download file, which is here:
http://www.gbif.org/occurrence/download/0039949-160910150852091

However, I've been having a lot of trouble getting the download to complete, since it is so large (43.2 GB). I've tried various browsers and also wget on both Linux and Mac. In a browser it typically fails with a network error. wget behaved strangely: it claimed the download was successful after fetching only 4.1 GB. This was on 64-bit Linux with ext4, so I don't think a filesystem limit was the problem.

Any ideas on how to download this data? Does it make sense to write a script that uses the API to request the datasets one by one, to split the download up?

Thanks for any help

-- Clint Coggins
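P.S. If the script route is reasonable, I imagine each request would be a POST to the occurrence download API with a predicate restricted to a single dataset -- an untested sketch, where USERNAME, PASSWORD, the notification address, and the dataset UUID are all placeholders:

    # USERNAME, PASSWORD, the e-mail address and <dataset-uuid> below are placeholders
    curl --user USERNAME:PASSWORD \
         --header "Content-Type: application/json" \
         --data '{
             "creator": "USERNAME",
             "notificationAddresses": ["someone@example.org"],
             "predicate": {
                 "type": "and",
                 "predicates": [
                     {"type": "equals", "key": "COUNTRY", "value": "US"},
                     {"type": "equals", "key": "DATASET_KEY", "value": "<dataset-uuid>"}
                 ]
             }
         }' \
         http://api.gbif.org/v1/occurrence/download/request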

Hi Clint,

You're the first to report a problem, though trying to download the file (64-bit Linux, wget) from within the GBIF network also failed for me.

I've copied it by a different method to http://download.gbif.org/2016/12/0039949-160910150852091.zip -- a plain Apache server, so wget's "--continue" option should work if necessary. The MD5 checksum is e976523c9e6c7ec0cd9d3cb30030020b and the size is exactly 43,184,530,448 bytes.

If downloading that doesn't work, I could split the file into chunks. We'll also look into why the original download failed [1].

Cheers,

Matt Blissett

[1] http://dev.gbif.org/issues/browse/POR-3199
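P.S. For what it's worth, resuming and verifying the transfer should be something like this (assuming GNU wget and coreutils md5sum):

    # resume from wherever the previous attempt stopped
    wget --continue http://download.gbif.org/2016/12/0039949-160910150852091.zip
    # verify: should print e976523c9e6c7ec0cd9d3cb30030020b
    md5sum 0039949-160910150852091.zip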