[IPT] Publication of big dataset fails, can't find why

Menashè Eliezer menashe.eliezer at gmail.com
Thu Mar 3 12:22:44 CET 2016


Good to know. You may want to run another validation using
http://tools.gbif.org/dwca-validator/validate.do
The IPT also lets you do this from its interface.
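
For a quick local check of the "core ID is always present and unique" rule, something like the following could be used. This is only a minimal sketch, not what the IPT or the validator runs internally. It assumes the archive's core data file is named occurrence.txt, is tab-delimited and UTF-8 encoded, and has a header row containing an occurrenceID column. The function name check_core_ids and its parameters are made up for illustration.

```python
import csv
import io
import zipfile
from collections import Counter


def check_core_ids(archive_path, data_file="occurrence.txt", id_column="occurrenceID"):
    """Count rows, empty IDs, and duplicated IDs in a Darwin Core Archive.

    Assumptions (not taken from the IPT itself): the core file is
    tab-delimited UTF-8 text with a header row naming the ID column.
    Returns (total_rows, missing_count, duplicates), where duplicates
    maps each repeated ID to the number of times it occurs.
    """
    counts = Counter()
    missing = 0
    total = 0
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(data_file) as raw:
            # Stream the file straight from the zip without extracting it.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            for row in reader:
                total += 1
                value = (row.get(id_column) or "").strip()
                if not value:
                    missing += 1
                else:
                    counts[value] += 1
    duplicates = {key: n for key, n in counts.items() if n > 1}
    return total, missing, duplicates
```

Calling check_core_ids("dwca-45.4.zip") would return the row count, the number of rows with an empty ID, and a dict of any repeated IDs. All IDs are held in memory, which should be manageable for a few million records but is worth keeping in mind on a small machine.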


Menashè


2016-03-03 11:49 GMT+01:00 Peter Desmet <peter.desmet at inbo.be>:

> Increased storage, which solved the issue. Thanks!
>
> André, I'm not aware of identical records, but if they have different
> occurrenceIDs, the IPT validation will pass.
>
> On Wed, Mar 2, 2016 at 5:06 PM, André Heughebaert
> <a.heughebaert at biodiversity.be> wrote:
> > Hi Peter,
> >
> > If I remember correctly, Florabank contains some identical records
> > that nevertheless have different IDs.
> > Could that be the reason for the rejection by the IPT?
> >
> > Best regards,
> > André
> >
> > On 02/03/16 15:59, Peter Desmet wrote:
> >
> > Hi,
> >
> > We're trying to publish version 45.4 of this dataset:
> > http://data.inbo.be/ipt/resource?r=florabank1-occurrences, but the
> > validation seems to fail. Here's the publication log:
> > http://data.inbo.be/ipt/publicationlog.do?r=florabank1-occurrences
> >
> > As far as I can tell, the validation fails on "the core ID field
> > occurrenceID is always present and unique", but we have verified this
> > in the generated dwca-45.4.zip file, and all records have a unique
> > occurrenceID.
> >
> > Any idea what might be going on? Possible causes:
> >
> > 1. The dataset is quite big (3.5 million records)
> > 2. We've just solved this issue:
> > https://github.com/LifeWatchINBO/data-publication/issues/104 by
> > following Kyle Braak's instructions. The latest published version is
> > now 45.3, the current (to be published) version is 45.4, so everything
> > seems fine there.
> > 3. Even though the publication failed, the following files are created
> > in /resource/florabank1-occurrences:
> >
> > eml-45.4.xml
> > dwca-45.4.zip
> > florabank1-occurrences-45.4.rtf
> >
> > Will those files be overwritten if I try to republish, or might they
> > be causing the publication to fail?
> >
> > Thanks,
> >
> > Peter
> > _______________________________________________
> > IPT mailing list
> > IPT at lists.gbif.org
> > http://lists.gbif.org/mailman/listinfo/ipt
> >
> >
> > --
> > Ir Andre Heughebaert
> > Belgian Biodiversity Platform
> > +32(0)2238 3796
> > Av. Louise 231 Louizalaan
> > B-1050 Brussels ORCID 0000-0002-7839-5300