[Ala-portal] DwC-A loading problems
Daniel Lins
daniel.lins at gmail.com
Fri Jun 27 05:54:03 CEST 2014
Hi Dave,
Did you see this mail? Do you think this issue could be related to
the configuration of the api_key property?
Thanks.
Regards,
2014-06-25 2:14 GMT-03:00 Daniel Lins <daniel.lins at gmail.com>:
> Hi Dave,
>
> Thanks for the support.
>
> The data loading in the biocache is working properly now. But the
> error continues during the update of the collectory (see below).
>
> *java.net.SocketTimeoutException: Read timed out*
> * at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)*
> * at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)*
> * at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)*
> * at java.lang.reflect.Constructor.newInstance(Constructor.java:526)*
> * at
> sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1675)*
> * at
> sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1673)*
> * at java.security.AccessController.doPrivileged(Native Method)*
> * at
> sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1671)*
> * at
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1244)*
> * at
> java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)*
> * at scalaj.http.Http$Request.liftedTree1$1(Http.scala:107)*
> * at scalaj.http.Http$Request.process(Http.scala:103)*
> * at scalaj.http.Http$Request.responseCode(Http.scala:120)*
> * at
> au.org.ala.biocache.load.DataLoader$class.updateLastChecked(DataLoader.scala:354)*
> * at
> au.org.ala.biocache.load.DwCALoader.updateLastChecked(DwCALoader.scala:74)*
> * at au.org.ala.biocache.load.DwCALoader.load(DwCALoader.scala:103)*
> * at au.org.ala.biocache.load.Loader.load(Loader.scala:75)*
> * at
> au.org.ala.biocache.cmd.CMD$$anonfun$executeCommand$7.apply(CMD.scala:69)*
> * at
> au.org.ala.biocache.cmd.CMD$$anonfun$executeCommand$7.apply(CMD.scala:69)*
> * at
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)*
> * at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)*
> * at au.org.ala.biocache.cmd.CMD$.executeCommand(CMD.scala:69)*
> * at
> au.org.ala.biocache.cmd.CommandLineTool$.main(CommandLineTool.scala:22)*
> * at au.org.ala.biocache.cmd.CommandLineTool.main(CommandLineTool.scala)*
> *Caused by: java.net.SocketTimeoutException: Read timed out*
> * at java.net.SocketInputStream.socketRead0(Native Method)*
> * at java.net.SocketInputStream.read(SocketInputStream.java:152)*
> * at java.net.SocketInputStream.read(SocketInputStream.java:122)*
> * at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)*
> * at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)*
> * at java.io.BufferedInputStream.read(BufferedInputStream.java:334)*
> * at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)*
> * at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)*
> * at
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)*
> * at
> java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)*
> * at
> scalaj.http.Http$Request$$anonfun$responseCode$1.apply(Http.scala:120)*
> * at
> scalaj.http.Http$Request$$anonfun$responseCode$1.apply(Http.scala:120)*
> * at scalaj.http.Http$Request.liftedTree1$1(Http.scala:104)*
> * ... 13 more*
>
> In the external configuration file
> (/data/biocache/config/biocache-config.properties) the property registry.url
> is correct (registry.url=http://192.168.15.132:8080/collectory/ws), indicating
> the URL of the collectory WS page.
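>
> To double-check that the collectory WS is actually reachable from the
> biocache machine (and not just configured correctly), I can probe it with
> a hard timeout. A sketch, where the dataResource path is my assumption
> about the ws layout:

```shell
# Probe the collectory WS with a hard timeout, so a hang shows up here
# rather than as a SocketTimeoutException at the end of a data load.
REGISTRY_URL="http://192.168.15.132:8080/collectory/ws"  # from biocache-config.properties
URL="${REGISTRY_URL}/dataResource/dr0"                   # ws path is an assumption
code=$(curl -sS --max-time 10 -o /dev/null -w '%{http_code}' "$URL" 2>/dev/null \
  || echo "unreachable/timed out")
echo "collectory responded: $code"
```

> Anything other than a quick 200 here would reproduce the loader's timeout
> outside of biocache and point at the collectory or the network.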
>
> Could it be related to permissions for external access? How does this
> *api_key* property work in the collectory?
>
> Thanks!
>
> Regards,
>
> Daniel Lins da Silva
> (Mobile) 55 11 96144-4050
> Research Center on Biodiversity and Computing (Biocomp)
> University of Sao Paulo, Brazil
> daniellins at usp.br
> daniel.lins at gmail.com
>
>
> 2014-06-20 2:26 GMT-03:00 <David.Martin at csiro.au>:
>
>> Hi Daniel.
>>
>> There is an updated version of biocache-store, 1.1.1, that fixed
>> some of the problems Burke spotted when loading Darwin Core archives
>> downloaded from the GBIF portal. The symptoms were similar (only one record
>> loaded for a dataset).
>>
>> The exception in point 3) indicates that the URL you have configured for
>> the collectory (registry.url in biocache.properties) is either incorrect,
>> or the collectory cannot be accessed for some reason. At the end of a data
>> load, the collectory is updated to record the last loaded date for that
>> dataset. This is done using a webservice.
>>
>> One thing to mention - if you want to remove all data from your
>> database, the easiest thing to do is use the cassandra-cli and run the
>> command:
>>
>> >> truncate occ;
>>
>> This will remove all occurrence records from the database, but not from
>> the index.
>>
>> The warnings you are seeing in the processing phase e.g.
>>
>> *2014-06-20 01:51:20,505 WARN : [ALANameSearcher] - Unable to parse
>> Abaca bunchy top (Babuvirus). Name of type virus unparsable: Abaca bunchy
>> top (Babuvirus)*
>>
>> are normal. This is referring to the sensitive species list in use.
>>
>> Cheers
>>
>> Dave
>>
>>
>> ------------------------------
>> *From:* Daniel Lins [daniel.lins at gmail.com]
>> *Sent:* 20 June 2014 15:11
>> *To:* Martin, Dave (CES, Black Mountain)
>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>> Mountain); Pedro Corrêa
>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>
>> Hi Dave, thanks for the information from the last email.
>>
>> I'm following your advice and performing the update of our test
>> environment for biocache version 1.1. But I'm having some problems and I
>> would like to know if you or anyone has already found this issue and know a
>> solution.
>>
>> To update the biocache version I did these steps below (based on the
>> Vagrant/Ansible installation process):
>>
>> 1. Cleaning of the database and index through delete-resource function
>> (delete-resource dr0 dr1 dr2 ...);
>> 2. An update of the Biocache config file
>> (/data/biocache/config/biocache-config.properties) (copied from the Vagrant
>> VM, with some configuration changes);
>> 3. An update of the biocache build file (biocache.jar) (copied from the
>> Vagrant VM - /usr/lib/biocache);
>> 4. Deployment of the new biocache-service build (copied from the Vagrant
>> VM - tomcat7/webapps/biocache-service.war)
>> 5. An update of the Solr config files (schema.xml, solrconfig.xml)
>> (copied from the Vagrant VM - /data/solr/biocache);
>> 6. Deletion of the Biocache core index folder
>> (/data/solr/biocache/data);
>>
>> Note 1: No changes were made to the Hubs-Webapp or the Collectory.
>>
>> Note 2: The import of CSV files is working (using load-local-csv
>> dr0 /<file_location>/xxx.csv).
>>
>>
>> I tried to import a Darwin Core Archive by following these steps:
>>
>> 1. Created a data resource (dr0);
>>
>> 2. Uploaded a DWC-A zip file into the DR using the "Upload File" option.
>>
>> *Protocol:DarwinCore archive*
>> *Location
>> URL:file:////data/collectory/upload/1403239521145/dwca-ocorrencias_lobo_guara_1.zip*
>> *Automatically loaded:false*
>> *DwC terms that uniquely identify a record: occurrenceID*
>> *Strip whitespaces in key: false*
>> *Incremental Load: false*
>>
>> 3. Used the Command Line Tool (Biocache) to Load (*load dr0*), Process (*process
>> dr0*) and Index (*index dr0*) data.
>>
>>
>> During the data loading phase, the system generated these errors:
>>
>> *...*
>> *2014-06-20 01:49:12,506 INFO : [DataLoader] - Finished DwC loader.
>> Records processed: 32*
>> *java.net.SocketTimeoutException: Read timed out*
>> *at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)*
>> *at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)*
>> *at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)*
>> *at java.lang.reflect.Constructor.newInstance(Constructor.java:526)*
>> *at
>> sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1675)*
>> *at
>> sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1673)*
>> *at java.security.AccessController.doPrivileged(Native Method)*
>> *at
>> sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1671)*
>> *at
>> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1244)*
>> *at
>> java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)*
>> *at scalaj.http.Http$Request.liftedTree1$1(Http.scala:107)*
>> *at scalaj.http.Http$Request.process(Http.scala:103)*
>> *at scalaj.http.Http$Request.responseCode(Http.scala:120)*
>> *at
>> au.org.ala.biocache.load.DataLoader$class.updateLastChecked(DataLoader.scala:354)*
>> *at
>> au.org.ala.biocache.load.DwCALoader.updateLastChecked(DwCALoader.scala:74)*
>> *at au.org.ala.biocache.load.DwCALoader.load(DwCALoader.scala:103)*
>> *at au.org.ala.biocache.load.Loader.load(Loader.scala:75)*
>> *at
>> au.org.ala.biocache.cmd.CMD$$anonfun$executeCommand$7.apply(CMD.scala:69)*
>> *at
>> au.org.ala.biocache.cmd.CMD$$anonfun$executeCommand$7.apply(CMD.scala:69)*
>> *at
>> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)*
>> *at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)*
>> *at au.org.ala.biocache.cmd.CMD$.executeCommand(CMD.scala:69)*
>> *at
>> au.org.ala.biocache.cmd.CommandLineTool$.main(CommandLineTool.scala:22)*
>> *at au.org.ala.biocache.cmd.CommandLineTool.main(CommandLineTool.scala)*
>> *Caused by: java.net.SocketTimeoutException: Read timed out*
>> *at java.net.SocketInputStream.socketRead0(Native Method)*
>> *at java.net.SocketInputStream.read(SocketInputStream.java:152)*
>> *at java.net.SocketInputStream.read(SocketInputStream.java:122)*
>> *at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)*
>> *at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)*
>> *at java.io.BufferedInputStream.read(BufferedInputStream.java:334)*
>> *at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)*
>> *at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)*
>> *at
>> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)*
>> *at
>> java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)*
>> *at
>> scalaj.http.Http$Request$$anonfun$responseCode$1.apply(Http.scala:120)*
>> *at
>> scalaj.http.Http$Request$$anonfun$responseCode$1.apply(Http.scala:120)*
>> *at scalaj.http.Http$Request.liftedTree1$1(Http.scala:104)*
>> *... 13 more*
>>
>>
>> And only one record was saved in Cassandra:
>>
>>
>> *cqlsh:occ> select * from occ; *
>>
>> * key | portalId | uuid*
>> *----------+----------+--------------------------------------*
>> * dr0|null | null | 1b5b21fc-594a-46e6-b8db-cf37c50b8f7b*
>>
>>
>> During the data processing phase, the system generated these additional
>> errors:
>>
>> ...
>> *Jun 20, 2014 1:51:08 AM
>> org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory
>> createDataSource*
>> *INFO: Building new data source for
>> org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory*
>> *Jun 20, 2014 1:51:08 AM
>> org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory
>> createBackingStore*
>> *INFO: Building backing store for
>> org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory*
>> *2014-06-20 01:51:20,505 WARN : [ALANameSearcher] - Unable to parse Abaca
>> bunchy top (Babuvirus). Name of type virus unparsable: Abaca bunchy top
>> (Babuvirus)*
>> *2014-06-20 01:51:20,509 WARN : [ALANameSearcher] - Unable to parse Abaca
>> mosaic, sugarcane mosaic (Potyvirus). Name of type virus unparsable: Abaca
>> mosaic, sugarcane mosaic (Potyvirus)*
>> *2014-06-20 01:51:21,210 WARN : [ALANameSearcher] - Unable to parse Acute
>> bee paralysis (Cripavirus). Name of type virus unparsable: Acute bee
>> paralysis (Cripavirus)*
>> *2014-06-20 01:51:21,255 WARN : [ALANameSearcher] - Unable to parse
>> Agropyron mosaic (Rymovirus). Name of type virus unparsable: Agropyron
>> mosaic (Rymovirus)*
>> *2014-06-20 01:51:21,289 WARN : [ALANameSearcher] - Unable to parse
>> Alphacrytovirus vicia. Name of type virus unparsable: Alphacrytovirus vicia*
>> *2014-06-20 01:51:21,334 WARN : [ALANameSearcher] - Unable to parse
>> American plum line pattern (APLPV, Ilaravirus). Name of type virus
>> unparsable: American plum line pattern (APLPV, Ilaravirus)*
>> *2014-06-20 01:51:21,525 WARN : [ALANameSearcher] - Unable to parse Apis
>> iridescent (Iridovirus). Name of type virus unparsable: Apis iridescent
>> (Iridovirus)*
>> *2014-06-20 01:51:21,546 WARN : [ALANameSearcher] - Unable to parse
>> Apricot ring pox (Unassigned). Name of type blacklisted unparsable:
>> Apricot ring pox (Unassigned)*
>> *2014-06-20 01:51:21,549 WARN : [ALANameSearcher] - Unable to parse
>> Arabis mosaic (Nepovirus). Name of type virus unparsable: Arabis mosaic
>> (Nepovirus)*
>> *2014-06-20 01:51:21,623 WARN : [ALANameSearcher] - Unable to parse
>> Artichoke Italian latent (Nepovirus). Name of type virus unparsable:
>> Artichoke Italian latent (Nepovirus)*
>> *2014-06-20 01:51:21,640 WARN : [ALANameSearcher] - Unable to parse
>> Asparagus (Ilarvirus). Name of type virus unparsable: Asparagus
>> (Ilarvirus)*
>> *2014-06-20 01:51:21,641 WARN : [ALANameSearcher] - Unable to parse
>> Asparagus (Potyvirus). Name of type virus unparsable: Asparagus
>> (Potyvirus)*
>> ...
>>
>> During the last phase there were no errors. However, only one record
>> was indexed.
>>
>> *2014-06-20 01:54:07,739 INFO : [SolrIndexDAO] - >>>>>>>>>>>>> Document
>> count of index: 1*
>> *2014-06-20 01:54:07,741 INFO : [SolrIndexDAO] - Finalise finished.*
>>
>> I attached a file with the complete messages generated by Biocache
>> during this test.
>>
>>
>> Thanks!
>>
>> Cheers.
>>
>> Daniel Lins da Silva
>> (Mobile) 55 11 96144-4050
>> Research Center on Biodiversity and Computing (Biocomp)
>> University of Sao Paulo, Brazil
>> daniellins at usp.br
>> daniel.lins at gmail.com
>>
>>
>>
>> 2014-06-18 6:15 GMT-03:00 <David.Martin at csiro.au>:
>>
>>> Hi Daniel,
>>>
>>> From what you've said, I'm not clear on what customisations you have
>>> made, so it's difficult to make a call on the impact of migrating to 1.1.
>>> We also do not know which subversion revisions you started with.
>>>
>>> We can tell you that functionally there wasn't a great deal of
>>> difference between the later snapshots of 1.0 and 1.1. The changes were
>>> largely structural, i.e. a clean-up of packages and removal of redundant
>>> code. We did this largely because we needed to (this code base is now over
>>> 5 years old) and we wanted to clean things up before other projects
>>> started to work with the software.
>>>
>>> Upgrading to biocache-service 1.1 and biocache-store shouldn't require
>>> any changes to cassandra, but it may require an upgrade of SOLR. If this
>>> is the case, you'll need to regenerate your index using the biocache
>>> commandline tool. Upgrading to 1.1 also shouldn't require any changes to
>>> hubs-webapp if you've customised this component.
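>>>
>>> If a SOLR upgrade does force a regeneration, the re-index run is roughly
>>> the sketch below (jar path and class name as in the standard 1.1 install;
>>> the dr uid is an example):

```shell
# Start the biocache commandline tool, then run "index <uid>" at the prompt.
BIOCACHE_JAR=/usr/lib/biocache/biocache-store-1.1-assembly.jar  # standard install path
if [ -f "$BIOCACHE_JAR" ]; then
  # At the "biocache>" prompt:  index dr0
  java -Xms2g -Xmx2g -cp "/usr/lib/biocache:$BIOCACHE_JAR" \
    au.org.ala.biocache.cmd.CommandLineTool
else
  echo "biocache jar not found at $BIOCACHE_JAR"
fi
```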
>>>
>>> I'd really recommend moving to 1.1 sooner rather than later, as it'll
>>> give you a stable baseline to work against.
>>>
>>> Hope this helps,
>>>
>>> Dave Martin
>>> ALA
>>>
>>> ------------------------------
>>> *From:* Daniel Lins [daniel.lins at gmail.com]
>>> *Sent:* 18 June 2014 15:54
>>>
>>> *To:* Martin, Dave (CES, Black Mountain)
>>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>>> Mountain); Pedro Corrêa; Nicholls, Miles (CES, Black Mountain)
>>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>>
>>> Hi Dave,
>>>
>>> How can I update Biocache-1.0-SNAPSHOT to version 1.1? I updated the
>>> biocache-store (biocache.jar) and the config file
>>> (/data/biocache/conf/config.properties-biocache) but I still have problems.
>>> What other steps do I need to take? Apparently the new biocache
>>> configuration file directly impacts my Biocache-Service and Solr.
>>>
>>> Will this update have any impact on other components?
>>>
>>> I cannot use the Vagrant/Ansible-based installation process because our
>>> environment is different and already has customizations, so I would like
>>> to update the biocache with minimal impact, if possible. Afterwards, we
>>> will plan the update of the other components.
>>>
>>> Can you advise me as to the best way forward?
>>>
>>> Thanks!!
>>>
>>> Regards,
>>>
>>> Daniel Lins da Silva
>>> (Mobile) 55 11 96144-4050
>>> Research Center on Biodiversity and Computing (Biocomp)
>>> University of Sao Paulo, Brazil
>>> daniellins at usp.br
>>> daniel.lins at gmail.com
>>>
>>>
>>>
>>> 2014-05-26 3:58 GMT-03:00 <David.Martin at csiro.au>:
>>>
>>>> Thanks Daniel.
>>>>
>>>> I'd recommend upgrading to 1.1, and I'd recommend installing with the
>>>> ansible scripts. This will give you a baseline configuration.
>>>> The scripts can be tested on a local machine using vagrant.
>>>> The configuration between 1.0 and 1.1 changed significantly - removal
>>>> of redundant legacy properties and adoption of a standard format for
>>>> property names.
>>>> Here's the template used for the configuration file in the ansible
>>>> scripts:
>>>>
>>>>
>>>> https://github.com/gbif/ala-install/blob/master/ansible/roles/biocache-service/templates/config/biocache-config.properties
>>>>
>>>> Cheers
>>>>
>>>> Dave Martin
>>>> ALA
>>>>
>>>> ------------------------------
>>>> *From:* Daniel Lins [daniel.lins at gmail.com]
>>>> *Sent:* 26 May 2014 15:02
>>>> *To:* Martin, Dave (CES, Black Mountain)
>>>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>>>> Mountain); Pedro Corrêa; Nicholls, Miles (CES, Black Mountain)
>>>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>>>
>>>> Hi Dave,
>>>>
>>>> When I ran the ingest command (ingest dr0), the system showed errors
>>>> like these below. However, after the error messages, I ran the index
>>>> command (index dr0), and data were published on the Portal.
>>>>
>>>> 2014-05-20 14:15:05,412 ERROR: [Grid] - cannot find GRID: /data/ala
>>>> /data/layers/ready/diva/worldclim_bio_19
>>>> 2014-05-20 14:15:05,414 ERROR: [Grid] - java.io.FileNotFoundException:
>>>> /data/ala/data/layers/ready/diva/worldclim_bio_19.gri (No such file or
>>>> directory)
>>>> at java.io.RandomAccessFile.open(Native Method)
>>>> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
>>>> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:122)
>>>> at org.ala.layers.intersect.Grid.getValues3(Grid.java:1017)
>>>> at org.ala.layers.intersect.SamplingThread.intersectGrid(
>>>> SamplingThread.java:112)
>>>> at org.ala.layers.intersect.SamplingThread.sample(SamplingThread.java:
>>>> 97)
>>>> at org.ala.layers.intersect.SamplingThread.run(SamplingThread.java:67)
>>>>
>>>> 2014-05-20 14:15:05,447 INFO : [Sampling] - ********* END - TEST
>>>> BATCH SAMPLING FROM FILE ***************
>>>> 2014-05-20 14:15:05,496 INFO : [Sampling] - Finished loading:
>>>> /tmp/sampling-dr0.txt in 49ms
>>>> 2014-05-20 14:15:05,496 INFO : [Sampling] - Removing temporary file:
>>>> /tmp/sampling-dr0.txt
>>>> 2014-05-20 14:15:05,553 INFO : [Consumer] - Initialising thread: 0
>>>> 2014-05-20 14:15:05,575 INFO : [Consumer] - Initialising thread: 1
>>>> 2014-05-20 14:15:05,575 INFO : [Consumer] - Initialising thread: 2
>>>> 2014-05-20 14:15:05,577 INFO : [Consumer] - In thread: 0
>>>> 2014-05-20 14:15:05,579 INFO : [Consumer] - Initialising thread: 3
>>>> 2014-05-20 14:15:05,579 INFO : [ProcessWithActors] - Starting with
>>>> dr0| endingwith dr0|~
>>>> 2014-05-20 14:15:05,581 INFO : [Consumer] - In thread: 2
>>>> 2014-05-20 14:15:05,581 INFO : [Consumer] - In thread: 1
>>>> 2014-05-20 14:15:05,584 INFO : [Consumer] - In thread: 3
>>>> 2014-05-20 14:15:05,592 INFO : [ProcessWithActors] - Initialised
>>>> actors...
>>>> 2014-05-20 14:15:05,647 INFO : [ProcessWithActors] - First rowKey
>>>> processed: dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:
>>>> MA120999
>>>> 2014-05-20 14:15:05,998 INFO : [ProcessWithActors] - Last row key
>>>> processed: dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA99991
>>>> 2014-05-20 14:15:06,006 INFO : [ProcessWithActors] - Finished.
>>>> 2014-05-20 14:15:06,015 INFO : [AttributionDAO] - Calling web service
>>>> for dr0
>>>> 2014-05-20 14:15:06,017 INFO : [Consumer] - Killing (Actor.act)
>>>> thread: 3
>>>> 2014-05-20 14:15:06,016 INFO : [Consumer] - Killing (Actor.act)
>>>> thread: 2
>>>> 2014-05-20 14:15:06,015 INFO : [Consumer] - Killing (Actor.act)
>>>> thread: 1
>>>> 2014-05-20 14:15:06,289 INFO : [AttributionDAO] - Looking up collectory
>>>> web service for ICMBIO|PARNASO
>>>> May 20, 2014 2:15:10 PM org.geotools.referencing.factory.epsg.ThreadedEpsgFactory
>>>> <init>
>>>> INFO: Setting the EPSG factory org.geotools.referencing.factory.epsg.DefaultFactory
>>>> to a 1800000ms timeout
>>>> May 20, 2014 2:15:10 PM org.geotools.referencing.factory.epsg.ThreadedEpsgFactory
>>>> <init>
>>>> INFO: Setting the EPSG factory org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory
>>>> to a 1800000ms timeout
>>>> May 20, 2014 2:15:10 PM org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory
>>>> createDataSource
>>>> INFO: Building new data source for org.geotools.referencing.factory.
>>>> epsg.ThreadedHsqlEpsgFactory
>>>> May 20, 2014 2:15:10 PM org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory
>>>> createBackingStore
>>>> INFO: Building backing store for org.geotools.referencing.factory.epsg.
>>>> ThreadedHsqlEpsgFactory
>>>> 2014-05-20 14:15:32,105 INFO : [Consumer] - Killing (Actor.act)
>>>> thread: 0
>>>> Indexing live with URL: null, and params: null&dataResource=dr0
>>>> java.lang.NullPointerException
>>>> at
>>>> au.org.ala.util.CMD$.au$org$ala$util$CMD$$indexDataResourceLive$1(CommandLineTool.scala:371)
>>>> at
>>>> au.org.ala.util.CMD$$anonfun$executeCommand$2.apply(CommandLineTool.scala:90)
>>>> at
>>>> au.org.ala.util.CMD$$anonfun$executeCommand$2.apply(CommandLineTool.scala:86)
>>>> at scala.collection.IndexedSeqOptimized$class.foreach(
>>>> IndexedSeqOptimized.scala:33)
>>>> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:105)
>>>> at au.org.ala.util.CMD$.executeCommand(CommandLineTool.scala:86)
>>>> at au.org.ala.util.CommandLineTool$.main(CommandLineTool.scala:26)
>>>> at au.org.ala.util.CommandLineTool.main(CommandLineTool.scala)
>>>>
>>>>
>>>>
>>>> We currently use *biocache-1.0-SNAPSHOT* in our environment, but in
>>>> your last mail you mentioned the *biocache-1.1-assembly* version.
>>>>
>>>> I downloaded this newer version, but when I ran it (ingest dr0)
>>>> in our environment the system showed many errors (see below).
>>>>
>>>>
>>>> log4j:WARN custom level class [org.ala.client.appender.RestLevel] not
>>>> found.
>>>> Exception in thread "main" java.lang.ExceptionInInitializerError
>>>> at
>>>> au.org.ala.biocache.load.DataLoader$class.$init$(DataLoader.scala:28)
>>>> at au.org.ala.biocache.load.Loader.<init>(Loader.scala:34)
>>>> at au.org.ala.biocache.cmd.CMD$.executeCommand(CMD.scala:29)
>>>> at
>>>> au.org.ala.biocache.cmd.CommandLineTool$.main(CommandLineTool.scala:22)
>>>> at au.org.ala.biocache.cmd.CommandLineTool.main(CommandLineTool.scala)
>>>> Caused by: com.google.inject.CreationException: Guice creation errors:
>>>>
>>>> 1) No implementation for java.lang.Integer annotated with @com.google.
>>>> inject.name.Named(value=cassandra.max.connections) was bound.
>>>> while locating java.lang.Integer annotated with @com.google.inject.
>>>> name.Named(value=cassandra.max.connections)
>>>> for parameter 4 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 2) No implementation for java.lang.Integer annotated with @com.google.
>>>> inject.name.Named(value=cassandra.max.retries) was bound.
>>>> while locating java.lang.Integer annotated with @com.google.inject.
>>>> name.Named(value=cassandra.max.retries)
>>>> for parameter 5 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 3) No implementation for java.lang.Integer annotated with @com.google.
>>>> inject.name.Named(value=cassandra.port) was bound.
>>>> while locating java.lang.Integer annotated with @com.google.inject.
>>>> name.Named(value=cassandra.port)
>>>> for parameter 1 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 4) No implementation for java.lang.Integer annotated with @com.google.
>>>> inject.name.Named(value=thrift.operation.timeout) was bound.
>>>> while locating java.lang.Integer annotated with @com.google.inject.
>>>> name.Named(value=thrift.operation.timeout)
>>>> for parameter 6 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 5) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=cassandra.hosts) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=cassandra.hosts)
>>>> for parameter 0 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 6) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=cassandra.keyspace) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=cassandra.keyspace)
>>>> for parameter 3 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 7) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=cassandra.pool) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=cassandra.pool)
>>>> for parameter 2 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 8) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=exclude.sensitive.values) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=exclude.sensitive.values)
>>>> for parameter 1 at au.org.ala.biocache.index.SolrIndexDAO.<init
>>>> >(SolrIndexDAO.scala:28)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:164)
>>>>
>>>> 9) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=extra.misc.fields) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=extra.misc.fields)
>>>> for parameter 2 at au.org.ala.biocache.index.SolrIndexDAO.<init
>>>> >(SolrIndexDAO.scala:28)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:164)
>>>>
>>>> 10) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=solr.home) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=solr.home)
>>>> for parameter 0 at au.org.ala.biocache.index.SolrIndexDAO.<init
>>>> >(SolrIndexDAO.scala:28)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:164)
>>>>
>>>> 10 errors
>>>> at com.google.inject.internal.Errors.
>>>> throwCreationExceptionIfErrorsExist(Errors.java:354)
>>>> at com.google.inject.InjectorBuilder.initializeStatically(
>>>> InjectorBuilder.java:152)
>>>> at com.google.inject.InjectorBuilder.build(InjectorBuilder.java:105)
>>>> at com.google.inject.Guice.createInjector(Guice.java:92)
>>>> at com.google.inject.Guice.createInjector(Guice.java:69)
>>>> at com.google.inject.Guice.createInjector(Guice.java:59)
>>>> at au.org.ala.biocache.Config$.<init>(Config.scala:24)
>>>> at au.org.ala.biocache.Config$.<clinit>(Config.scala)
>>>> ... 5 more
>>>>
>>>>
>>>> Regarding these specific issues (data update and incremental load),
>>>> will I need to upgrade the biocache version (to 1.1 or newer), or can I
>>>> work with version 1.0-SNAPSHOT? If I upgrade, will it remain compatible
>>>> with the other components? How should I proceed?
>>>>
>>>> Which layer files should I include in my environment to run these tests?
>>>>
>>>> Thanks!
>>>>
>>>> Regards,
>>>>
>>>> Daniel Lins da Silva
>>>> (Mobile) 55 11 96144-4050
>>>> Research Center on Biodiversity and Computing (Biocomp)
>>>> University of Sao Paulo, Brazil
>>>> daniellins at usp.br
>>>> daniel.lins at gmail.com
>>>>
>>>>
>>>> 2014-05-09 2:14 GMT-03:00 <David.Martin at csiro.au>:
>>>>
>>>>> Thanks Daniel.
>>>>>
>>>>> I've spotted the problem:
>>>>>
>>>>> java -cp .:biocache.jar au.org.ala.util.DwcCSVLoader dr0 -l
>>>>> dataset-updated.csv -b true
>>>>>
>>>>> this bypasses lookups against the collectory for the metadata.
>>>>>
>>>>> To load this dataset, you can use the biocache commandline tool like
>>>>> so:
>>>>>
>>>>> $ java -cp /usr/lib/biocache:/usr/lib/biocache/biocache-store-1.1-assembly.jar \
>>>>>   -Xms2g -Xmx2g au.org.ala.biocache.cmd.CommandLineTool
>>>>>
>>>>>
>>>>> ----------------------------
>>>>>
>>>>> | Biocache management tool |
>>>>>
>>>>> ----------------------------
>>>>>
>>>>> Please supply a command or hit ENTER to view command list.
>>>>>
>>>>> biocache> ingest dr8
>>>>>
>>>>> This will:
>>>>>
>>>>> 1) Retrieve the metadata from the configured instance of the
>>>>> collectory
>>>>> 2) Load, process, sample (if there are layers configured and
>>>>> available) and index
>>>>>
>>>>> Cheers
>>>>>
>>>>> Dave
>>>>>
>>>>> ------------------------------
>>>>> *From:* Daniel Lins [daniel.lins at gmail.com]
>>>>> *Sent:* 09 May 2014 14:27
>>>>>
>>>>> *To:* Martin, Dave (CES, Black Mountain)
>>>>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>>>>> Mountain); Pedro Corrêa; Nicholls, Miles (CES, Black Mountain)
>>>>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>>>>
>>>>> David,
>>>>>
>>>>> The dr0 configuration:
>>>>>
>>>>> https://www.dropbox.com/s/lsy11jadwmyghjj/collectoryConfig1.png
>>>>>
>>>>> Sorry, but this server doesn't have external access yet.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2014-05-09 1:06 GMT-03:00 <David.Martin at csiro.au>:
>>>>>
>>>>>> As an example of what it should look like, see:
>>>>>>
>>>>>>
>>>>>> http://ala-demo.gbif.org/collectory/dataResource/edit/dr8?page=contribution
>>>>>>
>>>>>>
>>>>>> ------------------------------
>>>>>> *From:* Daniel Lins [daniel.lins at gmail.com]
>>>>>>
>>>>>> *Sent:* 09 May 2014 13:59
>>>>>> *To:* Martin, Dave (CES, Black Mountain)
>>>>>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>>>>>> Mountain); Pedro Corrêa; Nicholls, Miles (CES, Black Mountain)
>>>>>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>>>>>
>>>>>> Thanks David,
>>>>>>
>>>>>> We use the DwC term "occurrenceID" to identify the records. It's a
>>>>>> unique key.
>>>>>>
>>>>>> However, when I reload a dataset to update some DwC terms of the
>>>>>> records, the system duplicates the data (it keeps the old record and
>>>>>> creates another with the changes).
>>>>>>
>>>>>> For instance (update of locality).
>>>>>>
>>>>>> Load 1 ($ java -cp .:biocache.jar au.org.ala.util.DwcCSVLoader dr0
>>>>>> -l dataset.csv -b true)
>>>>>>
>>>>>> {OccurrenceID: 1, municipality: Sao Paulo, ...},
>>>>>> {OccurrenceID: 2, municipality: Sao Paulo, ...}
>>>>>>
>>>>>> Process 1 (biocache$ process dr0)
>>>>>> Index 1 (biocache$ index dr0)
>>>>>>
>>>>>> Load 2 (updated records and new records) ($ java -cp .:biocache.jar
>>>>>> au.org.ala.util.DwcCSVLoader dr0 -l dataset-updated.csv -b true)
>>>>>>
>>>>>> {OccurrenceID: 1, municipality: Rio de Janeiro, ...},
>>>>>> {OccurrenceID: 2, municipality: Rio de Janeiro, ...},
>>>>>> {OccurrenceID: 3, municipality: Sao Paulo, ...}
>>>>>>
>>>>>> Process 2 (biocache$ process dr0)
>>>>>> Index 2 (biocache$ index dr0)
>>>>>>
>>>>>> Results shown by ALA:
>>>>>>
>>>>>> {OccurrenceID: 1, municipality: Sao Paulo, ...},
>>>>>> {OccurrenceID: 2, municipality: Sao Paulo, ...},
>>>>>> {OccurrenceID: 1, municipality: Rio de Janeiro, ...},
>>>>>> {OccurrenceID: 2, municipality: Rio de Janeiro, ...}
>>>>>> {OccurrenceID: 3, municipality: Sao Paulo, ...}
>>>>>>
>>>>>> But I expected:
>>>>>>
>>>>>> {OccurrenceID: 1, municipality: Rio de Janeiro, ...},
>>>>>> {OccurrenceID: 2, municipality: Rio de Janeiro, ...}
>>>>>> {OccurrenceID: 3, municipality: Sao Paulo, ...}
>>>>>>
>>>>>> Do I need to delete the existing data (delete-resource function) before
>>>>>> the reload? If not, what did I do wrong to cause this duplication?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Daniel Lins da Silva
>>>>>> (Mobile) 55 11 96144-4050
>>>>>> Research Center on Biodiversity and Computing (Biocomp)
>>>>>> University of Sao Paulo, Brazil
>>>>>> daniellins at usp.br
>>>>>> daniel.lins at gmail.com
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2014-05-07 0:46 GMT-03:00 <David.Martin at csiro.au>:
>>>>>>
>>>>>>> Thanks Daniel. Natasha has now left the ALA.
>>>>>>>
>>>>>>> The uniqueness of records is determined by information stored in
>>>>>>> the collectory. See screenshot [1].
>>>>>>> By default, "catalogNumber" is used but you can change this to any
>>>>>>> number of fields that should be stable in the data.
>>>>>>> Using unstable fields for the ID isn't recommended (e.g.
>>>>>>> scientificName). To update the records, the process is to just
>>>>>>> re-load the dataset.
>>>>>>>
>>>>>>> Automatically loaded - this isn't in use and we may remove it from
>>>>>>> the UI in future iterations.
>>>>>>> Incremental Load - restricts the sample/process/index steps to only
>>>>>>> run against the new records. Load is always incremental based on the
>>>>>>> key field(s), but if the incremental load box isn't checked it runs the
>>>>>>> sample/process/index steps against the whole data set. This can cause a
>>>>>>> large processing overhead when there's a minor update to a large data set.
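
[Editor's note: conceptually, key-based loading behaves like a dictionary upsert, which is why a reload with stable key values replaces records rather than duplicating them. A minimal sketch, not the actual biocache persistence code:]

```python
# Conceptual sketch only: store and load() are illustrative, not the
# real biocache persistence layer.
store = {}

def load(records, key="occurrenceID"):
    for rec in records:
        store[rec[key]] = rec   # same key -> overwrite; new key -> insert

load([{"occurrenceID": 1, "municipality": "Sao Paulo"},
      {"occurrenceID": 2, "municipality": "Sao Paulo"}])
load([{"occurrenceID": 1, "municipality": "Rio de Janeiro"},
      {"occurrenceID": 2, "municipality": "Rio de Janeiro"},
      {"occurrenceID": 3, "municipality": "Sao Paulo"}])

# Three records remain; duplicates appear only when the configured
# unique-key terms differ between the two loads.
```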
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> Dave Martin
>>>>>>> ALA
>>>>>>>
>>>>>>> [1] http://bit.ly/1g72HFN
>>>>>>>
>>>>>>> ------------------------------
>>>>>>> *From:* Daniel Lins [daniel.lins at gmail.com]
>>>>>>> *Sent:* 05 May 2014 15:39
>>>>>>> *To:* Quimby, Natasha (CES, Black Mountain)
>>>>>>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>>>>>>> Mountain); Martin, Dave (CES, Black Mountain); Pedro Corrêa
>>>>>>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>>>>>>
>>>>>>> Hi Natasha,
>>>>>>>
>>>>>>> I managed to import the DwC-A file following the steps reported in
>>>>>>> the previous email. Thank you!
>>>>>>>
>>>>>>> However, when I tried to update some metadata of an occurrence
>>>>>>> record (already stored in the database), the system created a new record
>>>>>>> with the duplicated information. So I ended up with several records
>>>>>>> with the same occurrenceID (I did set the data resource
>>>>>>> configuration to use "occurrenceID" to uniquely identify a record).
>>>>>>>
>>>>>>> How can I update existing records in the database? For instance,
>>>>>>> the location's metadata of an occurrence record stored in my database?
>>>>>>>
>>>>>>> I would also like to better understand the behavior of the
>>>>>>> properties "Automatically loaded" and "Incremental Load".
>>>>>>>
>>>>>>> Thanks!!
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Daniel Lins da Silva
>>>>>>> (Mobile) 55 11 96144-4050
>>>>>>> Research Center on Biodiversity and Computing (Biocomp)
>>>>>>> University of Sao Paulo, Brazil
>>>>>>> daniellins at usp.br
>>>>>>> daniel.lins at gmail.com
>>>>>>>
>>>>>>>
>>>>>>> 2014-04-28 3:52 GMT-03:00 Daniel Lins <daniel.lins at gmail.com>:
>>>>>>>
>>>>>>>> Thanks Natasha!
>>>>>>>>
>>>>>>>> I will try your recommendations. Once finished, I will contact
>>>>>>>> you.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> Daniel Lins da Silva
>>>>>>>> (Mobile) 55 11 96144-4050
>>>>>>>> Research Center on Biodiversity and Computing (Biocomp)
>>>>>>>> University of Sao Paulo, Brazil
>>>>>>>> daniellins at usp.br
>>>>>>>> daniel.lins at gmail.com
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2014-04-28 3:26 GMT-03:00 <Natasha.Quimby at csiro.au>:
>>>>>>>>
>>>>>>>> Hi Daniel,
>>>>>>>>>
>>>>>>>>> When you specify a local DwC-A load, the archive needs to be
>>>>>>>>> unzipped. Try unzipping *2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip*
>>>>>>>>> and then running the following:
>>>>>>>>> *sudo java -cp .:biocache.jar au.org.ala.util.DwCALoader dr7 -l
>>>>>>>>> /data/collectory/upload/1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b*
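
[Editor's note: that unzip step can be sketched as below; the paths are the ones from this thread, and the helper name is illustrative.]

```python
import zipfile

def unzip_dwca(src_zip: str) -> str:
    """Extract a DwC-A zip next to itself and return the directory
    that DwCALoader's -l option should then be pointed at."""
    dst = src_zip[:-4] if src_zip.endswith(".zip") else src_zip + "_unzipped"
    with zipfile.ZipFile(src_zip) as zf:
        zf.extractall(dst)   # meta.xml, occurrence.txt, etc. land here
    return dst

# e.g.:
# unzip_dwca("/data/collectory/upload/1398658607824/"
#            "2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip")
```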
>>>>>>>>>
>>>>>>>>> If you configure the collectory to provide the DwC-A, the biocache
>>>>>>>>> automatically unzips the archive for you. You would need to configure dr7
>>>>>>>>> with the following connection parameters:
>>>>>>>>>
>>>>>>>>> "protocol":"DwCA"
>>>>>>>>> "termsForUniqueKey":["occurrenceID"],
>>>>>>>>> "url":"file:////data/collectory/upload/
>>>>>>>>> 1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip"
>>>>>>>>>
>>>>>>>>> You could then load the resource with:
>>>>>>>>> *sudo java -cp .:biocache.jar au.org.ala.util.DwCALoader dr7*
>>>>>>>>>
>>>>>>>>> If you continue to have issues please let us know.
>>>>>>>>>
>>>>>>>>> Hope that this helps.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Natasha
>>>>>>>>>
>>>>>>>>> From: Daniel Lins <daniel.lins at gmail.com>
>>>>>>>>> Date: Monday, 28 April 2014 3:54 PM
>>>>>>>>> To: "ala-portal at lists.gbif.org" <ala-portal at lists.gbif.org>, "dos
>>>>>>>>> Remedios, Nick (CES, Black Mountain)" <Nick.Dosremedios at csiro.au>,
>>>>>>>>> "Martin, Dave (CES, Black Mountain)" <David.Martin at csiro.au>
>>>>>>>>> Subject: [Ala-portal] DwC-A loading problems
>>>>>>>>>
>>>>>>>>> Hi Nick and Dave,
>>>>>>>>>
>>>>>>>>> We are having some problems in Biocache during the upload of
>>>>>>>>> DwC-A files.
>>>>>>>>>
>>>>>>>>> As shown below, after running the method
>>>>>>>>> "au.org.ala.util.DwCALoader", our system returns the error "Exception
>>>>>>>>> in thread "main" org.gbif.dwc.text.UnkownDelimitersException:
>>>>>>>>> Unable to detect field delimiter".
>>>>>>>>>
>>>>>>>>> I ran tests using DwC-A files with tab-delimited text files and
>>>>>>>>> comma-delimited text files. In both cases the error was the same.
>>>>>>>>>
>>>>>>>>> What causes these problems? (The CSV loader works fine.)
>>>>>>>>>
>>>>>>>>> *tab-delimited file test*
>>>>>>>>>
>>>>>>>>> poliusp at poliusp-VirtualBox:~/dev/biocache$ *sudo java -cp
>>>>>>>>> .:biocache.jar au.org.ala.util.DwCALoader dr7 -l
>>>>>>>>> /data/collectory/upload/1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip*
>>>>>>>>> 2014-04-28 01:44:02,837 INFO : [ConfigModule] - Loading
>>>>>>>>> configuration from /data/biocache/config/biocache-config.properties
>>>>>>>>> 2014-04-28 01:44:03,090 INFO : [ConfigModule] - Initialise SOLR
>>>>>>>>> 2014-04-28 01:44:03,103 INFO : [ConfigModule] - Initialise name
>>>>>>>>> matching indexes
>>>>>>>>> 2014-04-28 01:44:03,605 INFO : [ConfigModule] - Initialise
>>>>>>>>> persistence manager
>>>>>>>>> 2014-04-28 01:44:03,606 INFO : [ConfigModule] - Configure complete
>>>>>>>>> Loading archive
>>>>>>>>> /data/collectory/upload/1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip
>>>>>>>>> for resource dr7 with unique terms List(dwc:occurrenceID)
>>>>>>>>> stripping spaces false incremental false testing false
>>>>>>>>> *Exception in thread "main"
>>>>>>>>> org.gbif.dwc.text.UnkownDelimitersException: Unable to detect field
>>>>>>>>> delimiter*
>>>>>>>>>         at org.gbif.file.CSVReaderFactory.buildArchiveFile(CSVReaderFactory.java:129)
>>>>>>>>>         at org.gbif.file.CSVReaderFactory.build(CSVReaderFactory.java:46)
>>>>>>>>>         at org.gbif.dwc.text.ArchiveFactory.readFileHeaders(ArchiveFactory.java:344)
>>>>>>>>>         at org.gbif.dwc.text.ArchiveFactory.openArchive(ArchiveFactory.java:289)
>>>>>>>>>         at au.org.ala.util.DwCALoader.loadArchive(DwCALoader.scala:129)
>>>>>>>>>         at au.org.ala.util.DwCALoader.loadLocal(DwCALoader.scala:106)
>>>>>>>>>         at au.org.ala.util.DwCALoader$.main(DwCALoader.scala:52)
>>>>>>>>>         at au.org.ala.util.DwCALoader.main(DwCALoader.scala)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *comma-delimited file test*
>>>>>>>>>
>>>>>>>>> poliusp at poliusp-VirtualBox:~/dev/biocache$ *sudo java -cp
>>>>>>>>> .:biocache.jar au.org.ala.util.DwCALoader dr7 -l ./dwca-teste3.zip*
>>>>>>>>> 2014-04-28 01:56:04,683 INFO : [ConfigModule] - Loading
>>>>>>>>> configuration from /data/biocache/config/biocache-config.properties
>>>>>>>>> 2014-04-28 01:56:04,940 INFO : [ConfigModule] - Initialise SOLR
>>>>>>>>> 2014-04-28 01:56:04,951 INFO : [ConfigModule] - Initialise name
>>>>>>>>> matching indexes
>>>>>>>>> 2014-04-28 01:56:05,437 INFO : [ConfigModule] - Initialise
>>>>>>>>> persistence manager
>>>>>>>>> 2014-04-28 01:56:05,438 INFO : [ConfigModule] - Configure complete
>>>>>>>>> Loading archive ./dwca-teste3.zip for resource dr7 with unique
>>>>>>>>> terms List(dwc:occurrenceID) stripping spaces false incremental
>>>>>>>>> false testing false
>>>>>>>>> *Exception in thread "main"
>>>>>>>>> org.gbif.dwc.text.UnkownDelimitersException: Unable to detect field
>>>>>>>>> delimiter*
>>>>>>>>>         at org.gbif.file.CSVReaderFactory.buildArchiveFile(CSVReaderFactory.java:129)
>>>>>>>>>         at org.gbif.file.CSVReaderFactory.build(CSVReaderFactory.java:46)
>>>>>>>>>         at org.gbif.dwc.text.ArchiveFactory.readFileHeaders(ArchiveFactory.java:344)
>>>>>>>>>         at org.gbif.dwc.text.ArchiveFactory.openArchive(ArchiveFactory.java:289)
>>>>>>>>>         at au.org.ala.util.DwCALoader.loadArchive(DwCALoader.scala:129)
>>>>>>>>>         at au.org.ala.util.DwCALoader.loadLocal(DwCALoader.scala:106)
>>>>>>>>>         at au.org.ala.util.DwCALoader$.main(DwCALoader.scala:52)
>>>>>>>>>         at au.org.ala.util.DwCALoader.main(DwCALoader.scala)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Regards.
>>>>>>>>> --
>>>>>>>>> Daniel Lins da Silva
>>>>>>>>> (Mobile) 55 11 96144-4050
>>>>>>>>> Research Center on Biodiversity and Computing (Biocomp)
>>>>>>>>> University of Sao Paulo, Brazil
>>>>>>>>> daniellins at usp.br
>>>>>>>>> daniel.lins at gmail.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Daniel Lins da Silva
>>>>>>>> (Cel) 11 6144-4050
>>>>>>>> daniel.lins at gmail.com
>>>>>>>>
--
Daniel Lins da Silva
(Cel) 11 6144-4050
daniel.lins at gmail.com