[IPT] can IPT2 handle big datasets?

John Wieczorek tuco at berkeley.edu
Fri Apr 8 21:59:53 CEST 2011


Nice! Congratulations on achieving such a critical goal.

On Thu, Apr 7, 2011 at 8:41 AM, Kyle Braak (GBIF) <kbraak at gbif.org> wrote:

> Dear IPT mailing list,
>
> This afternoon we conducted a little test to see whether the IPT2 can
> handle publishing a big dataset from a database.
>
> In the test we used a MySQL database and successfully generated an archive
> with 24,000,000 records in about 50 minutes! This was run on a Tomcat server
> with 256 MB of memory.
>
> http://ipt.gbif.org/resource.do?r=bigdbtest
>
> Previously, IPT1 had serious problems with such large datasets, but during
> IPT2 development special care was taken to ensure that they could be
> handled gracefully. The result sets from the database are now streamed to
> the file system, where they are written in batches of about 1,000 records,
> so the full dataset is never held in memory. This is one of the reasons why
> the IPT2 is not as feature-rich as the IPT1 was.
>
> Best wishes,
>
> Kyle Braak
> Programmer
> Global Biodiversity Information Facility Secretariat
> Universitetsparken 15, DK-2100 Copenhagen, Denmark
> Tel: +45-35321479 Fax: +45-35321480
> http://community.gbif.org/pg/profile/kbraak
> URL: http://www.gbif.org
>
> _______________________________________________
> IPT mailing list
> IPT at lists.gbif.org
> http://lists.gbif.org/mailman/listinfo/ipt
>
>
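For readers wondering what "streaming result sets to the file system" looks
like in practice, here is a minimal sketch of the general idea. It is not the
IPT2 code: it uses Python with the built-in sqlite3 module as a stand-in for
MySQL, and the function name stream_query_to_file and the batch_size parameter
are made up for illustration. With a real MySQL driver you would probably also
need an unbuffered (server-side) cursor so that the driver itself does not load
the whole result into memory.

```python
import csv
import sqlite3


def stream_query_to_file(conn, query, out_path, batch_size=1000):
    """Write query results to a tab-delimited file, one batch at a time.

    Hypothetical helper for illustration only (not part of the IPT).
    At most batch_size rows are held in memory at any moment.
    """
    cursor = conn.cursor()
    cursor.execute(query)
    written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter="\t")
        # Header row taken from the cursor's column metadata.
        writer.writerow([col[0] for col in cursor.description])
        while True:
            # Fetch the next batch; an empty list means the result is exhausted.
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            writer.writerows(batch)
            written += len(batch)
    cursor.close()
    return written
```

Because each batch is written out and discarded before the next one is
fetched, memory use depends on the batch size rather than on the number of
records in the dataset.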