[IPT] [EXTERNAL] Re: How does one upload large datasets to GBIF?

Simpson, Annie asimpson at usgs.gov
Tue Jul 7 15:48:25 UTC 2020

Thank you, Laura, for your replies.

The datasets have been exported from databases and cleaned. They are generally UTF-8, tab-delimited files, so the GBIF Registry API seems to be the correct solution.

We currently have eight of these large datasets, only two of which will not be updated in the future. Can you give me the names of GBIF Product Team members whom my technical team should contact to begin this process? Is there "how to" documentation that they should read first?


From: Laura Anne Russell <larussell at gbif.org>
Sent: Tuesday, July 7, 2020 11:17 AM
To: Simpson, Annie <asimpson at usgs.gov>; ipt at lists.gbif.org <ipt at lists.gbif.org>
Subject: [EXTERNAL] Re: [IPT] How does one upload large datasets to GBIF?


I could also mention that it is possible to script the creation of the Darwin Core Archives and then use the GBIF Registry API to register and connect them with GBIF. Symbiota, PlutoF and some others are successfully doing this. It does require some initial coordination with our Product Team, and potentially with our Informatics Team, on how to set up the registration process.
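
As a rough illustration of the scripted approach described above, here is a minimal Python sketch that wraps one UTF-8, tab-delimited export in a Darwin Core Archive. It rests on several assumptions that are not from this thread: the first row of the file is a header whose column names are Darwin Core term names, the first column holds the record identifier, and the data are occurrence records. The names build_meta_xml and make_archive are hypothetical helpers. A real archive would normally also carry an eml.xml metadata file, and registration through the GBIF Registry API is a separate step, arranged with GBIF, that is not shown here.

```python
import zipfile
from xml.sax.saxutils import quoteattr

# Namespace for Darwin Core terms (assumes every column is a plain dwc: term).
DWC_TERMS = "http://rs.tdwg.org/dwc/terms/"


def build_meta_xml(columns, data_name="occurrence.txt"):
    """Return a meta.xml descriptor for one tab-delimited core data file."""
    fields = "\n".join(
        f'    <field index="{i}" term={quoteattr(DWC_TERMS + name)}/>'
        for i, name in enumerate(columns)
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">\n'
        '  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"\n'
        '        fieldsEnclosedBy="" ignoreHeaderLines="1"\n'
        f'        rowType="{DWC_TERMS}Occurrence">\n'
        f'    <files><location>{data_name}</location></files>\n'
        '    <id index="0"/>\n'  # assumption: column 0 is the record identifier
        f'{fields}\n'
        '  </core>\n'
        '</archive>\n'
    )


def make_archive(tsv_path, zip_path, data_name="occurrence.txt"):
    """Zip a tab-delimited file together with a generated meta.xml."""
    # Only the header row is read here, to learn the column names.
    with open(tsv_path, encoding="utf-8") as handle:
        columns = handle.readline().rstrip("\r\n").split("\t")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # write() copies the data file from disk rather than via one big string.
        archive.write(tsv_path, arcname=data_name)
        archive.writestr("meta.xml", build_meta_xml(columns, data_name))
    return columns
```

A call such as make_archive("export.tsv", "dwca.zip") could be run from a scheduled job each time a dataset is re-exported, so that the published archive stays current without passing through the IPT.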



Laura Anne Russell

Programme Officer for Participation and Engagement

Global Biodiversity Information Facility (GBIF) Secretariat

larussell at gbif.org (email)

laura.anne.russell (Skype)

@pagodarose (Twitter)



+45 35 33 35 51 (office, direct line)


Universitetsparken 15

DK-2100 Copenhagen Ø


From: IPT <ipt-bounces at lists.gbif.org> on behalf of "Simpson, Annie" <asimpson at usgs.gov>
Date: Tuesday, 7 July 2020 at 16.48
To: "ipt at lists.gbif.org" <ipt at lists.gbif.org>
Subject: [IPT] How does one upload large datasets to GBIF?


What is the easiest or most popular way to send large datasets to GBIF, ones that are too large for the IPT software (I think that means more than 100 MB zipped, or 10+ million records)? Does one modify one's IPT instance, and if so, how? Or is another process preferred?

We currently have IPT Version 2.3.6-r3985b6a installed and plan to upgrade to 2.4.0 soon.

A technical answer is what I seek (on behalf of our technical team).

Again my apologies if the answer to my question is easily found and I'm just not finding it.

Annie Simpson, BISON product owner


BioFoundational Data Team

Science Analytics & Synthesis Program

U.S. Geological Survey

12201 Sunrise Valley Dr. Mailstop 302

Reston VA   20192

asimpson at usgs.gov

+1 703-648-4281



Biodiversity Information Serving Our Nation (BISON)<https://bison.usgs.gov/>

USGS Biodiversity Information Serving Our Nation (BISON) is a unique, web-based Federal mapping resource for species occurrence data in the United States and its Territories and Canada, including marine Exclusive Economic Zones (EEZs).

