[Ala-portal] DwC-A loading problems

Daniel Lins daniel.lins at gmail.com
Mon May 5 07:39:35 CEST 2014


Hi Natasha,

I managed to import the DwC-A file following the steps reported in the
previous email. Thank you!
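
For reference, the unzip step from the previous email can be sketched in
Python as follows. This is only a minimal sketch: the helper name
unzip_archive is hypothetical, and it assumes the loader's -l option expects
a directory whose path is the archive path minus the ".zip" suffix, as in the
command quoted below.

```python
import zipfile
from pathlib import Path


def unzip_archive(zip_path):
    """Extract a DwC-A zip into a sibling directory named after the archive.

    Hypothetical helper: e.g. /tmp/abc.zip is extracted into /tmp/abc/.
    Returns the directory path, which could then be passed to the loader's -l option.
    """
    zip_path = Path(zip_path)
    target_dir = zip_path.with_suffix("")  # drop the trailing ".zip"
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    return target_dir
```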

However, when I tried to update some metadata of an occurrence record
(already stored in the database), the system created a new record with
the duplicated information instead of updating the existing one. As a
result, I now have several records with the same occurrenceID (I did
configure the data resource to use "occurrenceID" to uniquely identify a
record).

How can I update existing records in the database? For instance, how do I
update the location metadata of an occurrence record stored in my database?
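
In the meantime, to confirm which occurrenceID values are duplicated, a
minimal sketch like the one below can count repeated identifiers. It assumes
the records have been exported to a tab-delimited text file with an
"occurrenceID" header column; the function name, the file layout, and the
export step are my own assumptions and not part of the biocache tooling.

```python
import csv
from collections import Counter


def duplicated_occurrence_ids(path, delimiter="\t", id_column="occurrenceID"):
    """Return {occurrenceID: count} for identifiers that appear more than once.

    Assumes a delimited text file whose first row is a header containing id_column.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        counts = Counter(row[id_column] for row in reader)
    return {occ_id: n for occ_id, n in counts.items() if n > 1}
```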

I would also like to better understand the behavior of the properties
"Automatically loaded" and "Incremental Load".

Thanks!!

Regards,

Daniel Lins da Silva
(Mobile) 55 11 96144-4050
Research Center on Biodiversity and Computing (Biocomp)
University of Sao Paulo, Brazil
daniellins at usp.br
daniel.lins at gmail.com


2014-04-28 3:52 GMT-03:00 Daniel Lins <daniel.lins at gmail.com>:

> Thanks Natasha!
>
> I will try your recommendations. Once finished, I will contact you.
>
> Regards
>
> Daniel Lins da Silva
> (Mobile) 55 11 96144-4050
> Research Center on Biodiversity and Computing (Biocomp)
> University of Sao Paulo, Brazil
> daniellins at usp.br
> daniel.lins at gmail.com
>
>
>
> 2014-04-28 3:26 GMT-03:00 <Natasha.Quimby at csiro.au>:
>
>>  Hi Daniel,
>>
>>  When you specify a local DwCA load, the archive needs to be unzipped.
>> Try unzipping 2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip and then
>> running the following:
>> sudo java -cp .:biocache.jar au.org.ala.util.DwCALoader dr7 -l /data/collectory/upload/1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b
>>
>>  If you configure the collectory to provide the DwCA, the biocache
>> automatically unzips the archive for you.  You would need to configure dr7
>> with the following connection parameters:
>>
>>  "protocol":"DwCA",
>> "termsForUniqueKey":["occurrenceID"],
>> "url":"file:////data/collectory/upload/1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip"
>>
>>  You could then load the resource by:
>> sudo java -cp .:biocache.jar au.org.ala.util.DwCALoader dr7
>>
>>  If you continue to have issues please let us know.
>>
>>  Hope that this helps.
>>
>>  Regards
>> Natasha
>>
>>   From: Daniel Lins <daniel.lins at gmail.com>
>> Date: Monday, 28 April 2014 3:54 PM
>> To: "ala-portal at lists.gbif.org" <ala-portal at lists.gbif.org>,
>> "dos Remedios, Nick (CES, Black Mountain)" <Nick.Dosremedios at csiro.au>,
>> "Martin, Dave (CES, Black Mountain)" <David.Martin at csiro.au>
>> Subject: [Ala-portal] DwC-A loading problems
>>
>>   Hi Nick and Dave,
>>
>>  We are having some problems in Biocache during the upload of DwC-A
>> files.
>>
>>  As shown below, after running "au.org.ala.util.DwCALoader", our
>> system returns the error message "Exception in thread "main"
>> org.gbif.dwc.text.UnkownDelimitersException: Unable to detect field delimiter".
>>
>>  I ran tests using DwC-A files with tab-delimited text files
>> and comma-delimited text files. In both cases the same error was
>> generated.
>>
>>  What causes this problem? (The CSV loader works fine.)
>>
>>  *tab-delimited file test*
>>
>>  poliusp at poliusp-VirtualBox:~/dev/biocache$ sudo java -cp .:biocache.jar au.org.ala.util.DwCALoader dr7 -l /data/collectory/upload/1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip
>> 2014-04-28 01:44:02,837 INFO : [ConfigModule] - Loading configuration from /data/biocache/config/biocache-config.properties
>> 2014-04-28 01:44:03,090 INFO : [ConfigModule] - Initialise SOLR
>> 2014-04-28 01:44:03,103 INFO : [ConfigModule] - Initialise name matching indexes
>> 2014-04-28 01:44:03,605 INFO : [ConfigModule] - Initialise persistence manager
>> 2014-04-28 01:44:03,606 INFO : [ConfigModule] - Configure complete
>> Loading archive /data/collectory/upload/1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip for resource dr7 with unique terms List(dwc:occurrenceID) stripping spaces false incremental false testing false
>> Exception in thread "main" org.gbif.dwc.text.UnkownDelimitersException: Unable to detect field delimiter
>>         at org.gbif.file.CSVReaderFactory.buildArchiveFile(CSVReaderFactory.java:129)
>>         at org.gbif.file.CSVReaderFactory.build(CSVReaderFactory.java:46)
>>         at org.gbif.dwc.text.ArchiveFactory.readFileHeaders(ArchiveFactory.java:344)
>>         at org.gbif.dwc.text.ArchiveFactory.openArchive(ArchiveFactory.java:289)
>>         at au.org.ala.util.DwCALoader.loadArchive(DwCALoader.scala:129)
>>         at au.org.ala.util.DwCALoader.loadLocal(DwCALoader.scala:106)
>>         at au.org.ala.util.DwCALoader$.main(DwCALoader.scala:52)
>>         at au.org.ala.util.DwCALoader.main(DwCALoader.scala)
>>
>>
>>  *comma-delimited file test*
>>
>>  poliusp at poliusp-VirtualBox:~/dev/biocache$ sudo java -cp .:biocache.jar au.org.ala.util.DwCALoader dr7 -l ./dwca-teste3.zip
>> 2014-04-28 01:56:04,683 INFO : [ConfigModule] - Loading configuration from /data/biocache/config/biocache-config.properties
>> 2014-04-28 01:56:04,940 INFO : [ConfigModule] - Initialise SOLR
>> 2014-04-28 01:56:04,951 INFO : [ConfigModule] - Initialise name matching indexes
>> 2014-04-28 01:56:05,437 INFO : [ConfigModule] - Initialise persistence manager
>> 2014-04-28 01:56:05,438 INFO : [ConfigModule] - Configure complete
>> Loading archive ./dwca-teste3.zip for resource dr7 with unique terms List(dwc:occurrenceID) stripping spaces false incremental false testing false
>> Exception in thread "main" org.gbif.dwc.text.UnkownDelimitersException: Unable to detect field delimiter
>>         at org.gbif.file.CSVReaderFactory.buildArchiveFile(CSVReaderFactory.java:129)
>>         at org.gbif.file.CSVReaderFactory.build(CSVReaderFactory.java:46)
>>         at org.gbif.dwc.text.ArchiveFactory.readFileHeaders(ArchiveFactory.java:344)
>>         at org.gbif.dwc.text.ArchiveFactory.openArchive(ArchiveFactory.java:289)
>>         at au.org.ala.util.DwCALoader.loadArchive(DwCALoader.scala:129)
>>         at au.org.ala.util.DwCALoader.loadLocal(DwCALoader.scala:106)
>>         at au.org.ala.util.DwCALoader$.main(DwCALoader.scala:52)
>>         at au.org.ala.util.DwCALoader.main(DwCALoader.scala)
>>
>>
>>  Thanks!
>>
>>  Regards.
>> --
>>  Daniel Lins da Silva
>> (Mobile) 55 11 96144-4050
>>  Research Center on Biodiversity and Computing (Biocomp)
>> University of Sao Paulo, Brazil
>>  daniellins at usp.br
>> daniel.lins at gmail.com
>>
>>
>
>
> --
> Daniel Lins da Silva
> (Cel) 11 6144-4050
> daniel.lins at gmail.com
>



-- 
Daniel Lins da Silva
(Cel) 11 6144-4050
daniel.lins at gmail.com