[API-users] Downloading large numbers of datasets

Coggins, Clint ccoggins at usgs.gov
Thu Dec 8 17:21:55 CET 2016


Hi Matt,
Thanks so much for your help. I was able to successfully get the file
from your link. I did notice there was no "Content-Length" header in
the original download response. Maybe that has something to do with
the wget trouble.
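
For anyone else fetching the same file: since wget reported success on a
truncated download, it seems worth checking the result against the size
and MD5 quoted below before unpacking it. Here is a minimal sketch of how
that check might look in Python. The function names and the chunk size are
my own choices, not anything provided by GBIF; only the expected size and
checksum come from Matt's message.

```python
import hashlib
import os

# Values quoted in Matt's message for 0039949-160910150852091.zip.
EXPECTED_MD5 = "e976523c9e6c7ec0cd9d3cb30030020b"
EXPECTED_SIZE = 43_184_530_448  # bytes


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 of a file, reading it in chunks so a
    multi-gigabyte archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_size=EXPECTED_SIZE, expected_md5=EXPECTED_MD5):
    """Return (ok, message). The cheap size check runs first, so a
    truncated file is reported without hashing it."""
    actual_size = os.path.getsize(path)
    if actual_size != expected_size:
        return False, f"size mismatch: got {actual_size}, expected {expected_size}"
    actual_md5 = md5_of_file(path)
    if actual_md5 != expected_md5.lower():
        return False, f"MD5 mismatch: got {actual_md5}, expected {expected_md5}"
    return True, "size and MD5 match"
```

Calling verify_download("0039949-160910150852091.zip") returns a pair such
as (False, "size mismatch: ...") for a truncated file. Hashing the full
archive takes a while, which is why the size is compared first.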

> Subject: Re: [API-users] Downloading large numbers of datasets
>
> Hi Clint,
>
> You're the first to report a problem, though trying to download the file
> (64-bit Linux, wget) within the GBIF network also failed for me.
>
> I've copied it with a different method to here:
> http://download.gbif.org/2016/12/0039949-160910150852091.zip -- on a
> plain Apache server, so wget's "--continue" option should work if necessary.
>
> The MD5 checksum is e976523c9e6c7ec0cd9d3cb30030020b and the size is
> exactly 43,184,530,448 bytes.
>
> If downloading that doesn't work, I could split the file into chunks.
> We'll also look into why the download failed [1].
>
> Cheers,
>
> Matt Blissett
>
> [1] http://dev.gbif.org/issues/browse/POR-3199
>
>
> On 07/12/2016 14.53, Coggins, Clint wrote:
>> I'm trying to download all the occurrence data with COUNTRY=US
>>
>> http://www.gbif.org/occurrence/search?COUNTRY=US
>>
>> I've requested a download file, which is here
>> http://www.gbif.org/occurrence/download/0039949-160910150852091
>>
>> However, I've been having a lot of trouble getting the download to
>> complete since it is so large (43.2 GB). I've tried various browsers and
>> also wget on both Linux and Mac. It typically fails with a network
>> error in the browser. wget displayed strange behavior in that it
>> claimed the download was successful after downloading 4.1 GB. This was
>> on 64-bit Linux with ext4, so I don't think there was a filesystem
>> limitation.
>>
>> Any ideas on how to download this data? Does it make sense to write a
>> script to use the API to request the datasets one by one to split it up?
>>
>> Thanks for any help
>>
>> --
>> Clint Coggins

