[IPT] Publication of big dataset fails, can't find why

Peter Desmet peter.desmet at inbo.be
Wed Mar 2 16:34:17 CET 2016


Thanks all! Should have read through the whole exception stack.
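
For the archives: a quick free-space check on the server before
republishing would probably have pointed at the problem sooner. Below is
a minimal sketch in Python, not something the IPT provides. The function
name, the "." path (standing in for wherever the IPT data directory
lives) and the 2 GiB threshold are all assumptions for illustration.

```python
import shutil


def free_space_report(path, needed_bytes):
    """Return (free_bytes, enough) for the filesystem that holds `path`.

    `needed_bytes` is a caller-supplied estimate of the room required to
    write the archive; this sketch does not try to derive it.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return usage.free, usage.free >= needed_bytes


if __name__ == "__main__":
    # Hypothetical values: "." stands in for the IPT data directory and
    # 2 GiB is an arbitrary guess at the headroom a large archive needs.
    free, enough = free_space_report(".", 2 * 1024**3)
    print(f"free: {free / 1024**3:.1f} GiB, enough: {enough}")
```

If `enough` comes back False, freeing up disk space first is likely
more useful than re-checking the occurrenceIDs.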

On Wed, Mar 2, 2016 at 4:26 PM, Menashe' Eliezer
<menashe.eliezer at gmail.com> wrote:
> Please look at the following line in the log which explains why the
> validation has stopped unexpectedly:
>
> No space left on device
>
> --
> Menashè Eliezer
>
>
> 2016-03-02 16:10 GMT+01:00 Laura Russell <larussell at vertnet.org>:
>>
>> I’ve had publishing fail before when there was a line break (or maybe a
>> vertical tab, can’t remember which) in a remarks field that then made a new
>> line causing that new line to have a null occurrenceID.  Any chance
>> something like that could be causing the failure?
>>
>> Laura Russell
>> VertNet Programmer/iDigBio Data Mobilization Specialist
>>
>> phone: +01 785 813-1496
>> email: larussell at vertnet.org
>> Skype: laura.anne.russell
>> Hangouts: larussell at vertnet.org
>>
>> url: www.vertnet.org
>> url: www.idigbio.org
>>
>>
>> From: IPT <ipt-bounces at lists.gbif.org> on behalf of Peter Desmet
>> <peter.desmet at inbo.be>
>> Date: Wednesday, March 2, 2016 at 8:59 AM
>> To: <ipt at lists.gbif.org>
>> Cc: Stijn Van Hoey <stijn.vanhoey at inbo.be>
>> Subject: [IPT] Publication of big dataset fails, can't find why
>>
>> Hi,
>>
>> We're trying to publish version 45.4 of this dataset:
>> http://data.inbo.be/ipt/resource?r=florabank1-occurrences, but the
>> validation seems to fail. Here's the publication log:
>> http://data.inbo.be/ipt/publicationlog.do?r=florabank1-occurrences
>>
>> As far as I can tell, the validation fails on "the core ID field
>> occurrenceID is always present and unique", but we have verified this
>> in the generated dwca-45.4.zip file, and all records have a unique
>> occurrenceID.
>>
>> Any idea what might be going on? Possible causes:
>>
>> 1. The dataset is quite big (3.5 million records)
>> 2. We've just solved this issue:
>> https://github.com/LifeWatchINBO/data-publication/issues/104 by
>> following Kyle Braak's instructions. The latest published version is
>> now 45.3, the current (to be published) version is 45.4, so everything
>> seems fine there.
>> 3. Even though the publication failed, the following files are created
>> in /resource/florabank1-occurrences:
>>
>> eml-45.4.xml
>> dwca-45.4.zip
>> florabank1-occurrences-45.4.rtf
>>
>> Will those files be overwritten if I try to republish or might they be
>> causing the publication to fail?
>>
>> Thanks,
>>
>> Peter
>> _______________________________________________
>> IPT mailing list
>> IPT at lists.gbif.org
>> http://lists.gbif.org/mailman/listinfo/ipt
>>
>>
>

