[Ala-portal] DwC-A loading problems
Daniel Lins
daniel.lins at gmail.com
Fri Jun 27 05:54:03 CEST 2014
Hi Dave,
Did you see this mail? Do you think this issue could be related to
the configuration of the api_key property?
Thanks.
Regards,
2014-06-25 2:14 GMT-03:00 Daniel Lins <daniel.lins at gmail.com>:
> Hi Dave,
>
> Thanks for the support.
>
> The data loading in the biocache is working properly now. But the
> error continues during the update of the collectory (see below).
>
> *java.net.SocketTimeoutException: Read timed out*
> * at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)*
> * at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)*
> * at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)*
> * at java.lang.reflect.Constructor.newInstance(Constructor.java:526)*
> * at
> sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1675)*
> * at
> sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1673)*
> * at java.security.AccessController.doPrivileged(Native Method)*
> * at
> sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1671)*
> * at
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1244)*
> * at
> java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)*
> * at scalaj.http.Http$Request.liftedTree1$1(Http.scala:107)*
> * at scalaj.http.Http$Request.process(Http.scala:103)*
> * at scalaj.http.Http$Request.responseCode(Http.scala:120)*
> * at
> au.org.ala.biocache.load.DataLoader$class.updateLastChecked(DataLoader.scala:354)*
> * at
> au.org.ala.biocache.load.DwCALoader.updateLastChecked(DwCALoader.scala:74)*
> * at au.org.ala.biocache.load.DwCALoader.load(DwCALoader.scala:103)*
> * at au.org.ala.biocache.load.Loader.load(Loader.scala:75)*
> * at
> au.org.ala.biocache.cmd.CMD$$anonfun$executeCommand$7.apply(CMD.scala:69)*
> * at
> au.org.ala.biocache.cmd.CMD$$anonfun$executeCommand$7.apply(CMD.scala:69)*
> * at
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)*
> * at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)*
> * at au.org.ala.biocache.cmd.CMD$.executeCommand(CMD.scala:69)*
> * at
> au.org.ala.biocache.cmd.CommandLineTool$.main(CommandLineTool.scala:22)*
> * at au.org.ala.biocache.cmd.CommandLineTool.main(CommandLineTool.scala)*
> *Caused by: java.net.SocketTimeoutException: Read timed out*
> * at java.net.SocketInputStream.socketRead0(Native Method)*
> * at java.net.SocketInputStream.read(SocketInputStream.java:152)*
> * at java.net.SocketInputStream.read(SocketInputStream.java:122)*
> * at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)*
> * at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)*
> * at java.io.BufferedInputStream.read(BufferedInputStream.java:334)*
> * at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)*
> * at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)*
> * at
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)*
> * at
> java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)*
> * at
> scalaj.http.Http$Request$$anonfun$responseCode$1.apply(Http.scala:120)*
> * at
> scalaj.http.Http$Request$$anonfun$responseCode$1.apply(Http.scala:120)*
> * at scalaj.http.Http$Request.liftedTree1$1(Http.scala:104)*
> * ... 13 more*
>
> In the external configuration file
> (/data/biocache/config/biocache-config.properties) the property registry.url
> is correct (registry.url=http://192.168.15.132:8080/collectory/ws), indicating
> the URL of the collectory WS page.
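>
> To double-check that the collectory WS is actually reachable from the
> biocache machine (and not just configured correctly), I can probe it with
> a hard timeout. A sketch, where the dataResource path is my assumption
> about the ws layout:

```shell
# Probe the collectory WS with a hard timeout, so a hang shows up here
# rather than as a SocketTimeoutException at the end of a data load.
REGISTRY_URL="http://192.168.15.132:8080/collectory/ws"  # from biocache-config.properties
URL="${REGISTRY_URL}/dataResource/dr0"                   # ws path is an assumption
code=$(curl -sS --max-time 10 -o /dev/null -w '%{http_code}' "$URL" 2>/dev/null \
  || echo "unreachable/timed out")
echo "collectory responded: $code"
```

> Anything other than a quick 200 here would reproduce the loader's timeout
> outside of biocache and point at the collectory or the network.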
>
> Could it be related to permissions for external access? How does this
> *api_key* property work in the collectory?
>
> Thanks!
>
> Regards,
>
> Daniel Lins da Silva
> (Mobile) 55 11 96144-4050
> Research Center on Biodiversity and Computing (Biocomp)
> University of Sao Paulo, Brazil
> daniellins at usp.br
> daniel.lins at gmail.com
>
>
> 2014-06-20 2:26 GMT-03:00 <David.Martin at csiro.au>:
>
>> Hi Daniel.
>>
>> There is an updated version of biocache-store, 1.1.1, that fixed
>> some of the problems Burke spotted when loading Darwin Core archives
>> downloaded from the GBIF portal. The symptoms were similar (only one record
>> loaded for a dataset).
>>
>> The exception in point 3) indicates that the URL you have configured for
>> the collectory (registry.url in biocache.properties) is either incorrect,
>> or the collectory cannot be accessed for some reason. At the end of a data
>> load, the collectory is updated to record the last loaded date for that
>> dataset. This is done using a webservice.
>>
>> One thing to mention - if you want to remove all data from your
>> database, the easiest thing to do is use the cassandra-cli and run the
>> command:
>>
>> >> truncate occ;
>>
>> This will remove all occurrence records from the database, but not from
>> the index.
>>
>> The warnings you are seeing in the processing phase e.g.
>>
>> *2014-06-20 01:51:20,505 WARN : [ALANameSearcher] - Unable to parse
>> Abaca bunchy top (Babuvirus). Name of type virus unparsable: Abaca bunchy
>> top (Babuvirus)*
>>
>> are normal. This is referring to the sensitive species list in use.
>>
>> Cheers
>>
>> Dave
>>
>>
>> ------------------------------
>> *From:* Daniel Lins [daniel.lins at gmail.com]
>> *Sent:* 20 June 2014 15:11
>> *To:* Martin, Dave (CES, Black Mountain)
>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>> Mountain); Pedro Corrêa
>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>
>> Hi Dave, thanks for the information from the last email.
>>
>> I'm following your advice and performing the update of our test
>> environment for biocache version 1.1. But I'm having some problems and I
>> would like to know if you or anyone has already found this issue and know a
>> solution.
>>
>> To update the biocache version I did these steps below (based on the
>> Vagrant/Ansible installation process):
>>
>> 1. Cleaning of the database and index through delete-resource function
>> (delete-resource dr0 dr1 dr2 ...);
>> 2. An update of the Biocache config file
>> (/data/biocache/config/biocache-config.properties) (copied from the Vagrant
>> VM, with some configuration changes);
>> 3. An update of the biocache build file (biocache.jar) (copied from the
>> Vagrant VM - /usr/lib/biocache);
>> 4. Deployment of the new biocache-service build (copied from the Vagrant
>> VM - tomcat7/webapps/biocache-service.war)
>> 5. An update of the Solr config files (schema.xml, solrconfig.xml)
>> (copied from the Vagrant VM - /data/solr/biocache);
>> 6. Deletion of the Biocache core index folder
>> (/data/solr/biocache/data);
>>
>> Note 1: No changes were made to the Hubs-Webapp or the Collectory.
>>
>> Note 2: The import of CSV files is working (using load-local-csv
>> dr0 /<file_location>/xxx.csv).
>>
>>
>> I tried to import a Darwin Core Archive by following these steps:
>>
>> 1. Created a data resource (dr0);
>>
>> 2. Uploaded a DWC-A zip file into the DR using the "Upload File" option.
>>
>> *Protocol:DarwinCore archive*
>> *Location
>> URL:file:////data/collectory/upload/1403239521145/dwca-ocorrencias_lobo_guara_1.zip*
>> *Automatically loaded:false*
>> *DwC terms that uniquely identify a record: occurrenceID*
>> *Strip whitespaces in key: false*
>> *Incremental Load: false*
>>
>> 3. Used the Command Line Tool (Biocache) to Load (*load dr0*), Process (*process
>> dr0*) and Index (*index dr0*) data.
>>
>>
>> During the data loading phase, the system generated these errors:
>>
>> *...*
>> *2014-06-20 01:49:12,506 INFO : [DataLoader] - Finished DwC loader.
>> Records processed: 32*
>> *java.net.SocketTimeoutException: Read timed out*
>> *at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)*
>> *at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)*
>> *at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)*
>> *at java.lang.reflect.Constructor.newInstance(Constructor.java:526)*
>> *at
>> sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1675)*
>> *at
>> sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1673)*
>> *at java.security.AccessController.doPrivileged(Native Method)*
>> *at
>> sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1671)*
>> *at
>> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1244)*
>> *at
>> java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)*
>> *at scalaj.http.Http$Request.liftedTree1$1(Http.scala:107)*
>> *at scalaj.http.Http$Request.process(Http.scala:103)*
>> *at scalaj.http.Http$Request.responseCode(Http.scala:120)*
>> *at
>> au.org.ala.biocache.load.DataLoader$class.updateLastChecked(DataLoader.scala:354)*
>> *at
>> au.org.ala.biocache.load.DwCALoader.updateLastChecked(DwCALoader.scala:74)*
>> *at au.org.ala.biocache.load.DwCALoader.load(DwCALoader.scala:103)*
>> *at au.org.ala.biocache.load.Loader.load(Loader.scala:75)*
>> *at
>> au.org.ala.biocache.cmd.CMD$$anonfun$executeCommand$7.apply(CMD.scala:69)*
>> *at
>> au.org.ala.biocache.cmd.CMD$$anonfun$executeCommand$7.apply(CMD.scala:69)*
>> *at
>> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)*
>> *at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)*
>> *at au.org.ala.biocache.cmd.CMD$.executeCommand(CMD.scala:69)*
>> *at
>> au.org.ala.biocache.cmd.CommandLineTool$.main(CommandLineTool.scala:22)*
>> *at au.org.ala.biocache.cmd.CommandLineTool.main(CommandLineTool.scala)*
>> *Caused by: java.net.SocketTimeoutException: Read timed out*
>> *at java.net.SocketInputStream.socketRead0(Native Method)*
>> *at java.net.SocketInputStream.read(SocketInputStream.java:152)*
>> *at java.net.SocketInputStream.read(SocketInputStream.java:122)*
>> *at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)*
>> *at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)*
>> *at java.io.BufferedInputStream.read(BufferedInputStream.java:334)*
>> *at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)*
>> *at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)*
>> *at
>> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)*
>> *at
>> java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)*
>> *at
>> scalaj.http.Http$Request$$anonfun$responseCode$1.apply(Http.scala:120)*
>> *at
>> scalaj.http.Http$Request$$anonfun$responseCode$1.apply(Http.scala:120)*
>> *at scalaj.http.Http$Request.liftedTree1$1(Http.scala:104)*
>> *... 13 more*
>>
>>
>> And only one record was saved in Cassandra:
>>
>>
>> *cqlsh:occ> select * from occ; *
>>
>> * key | portalId | uuid*
>> *----------+----------+--------------------------------------*
>> * dr0|null | null | 1b5b21fc-594a-46e6-b8db-cf37c50b8f7b*
>>
>>
>> During the data processing phase, the system generated these additional
>> errors:
>>
>> ...
>> *Jun 20, 2014 1:51:08 AM
>> org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory
>> createDataSource*
>> *INFO: Building new data source for
>> org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory*
>> *Jun 20, 2014 1:51:08 AM
>> org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory
>> createBackingStore*
>> *INFO: Building backing store for
>> org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory*
>> *2014-06-20 01:51:20,505 WARN : [ALANameSearcher] - Unable to parse Abaca
>> bunchy top (Babuvirus). Name of type virus unparsable: Abaca bunchy top
>> (Babuvirus)*
>> *2014-06-20 01:51:20,509 WARN : [ALANameSearcher] - Unable to parse Abaca
>> mosaic, sugarcane mosaic (Potyvirus). Name of type virus unparsable: Abaca
>> mosaic, sugarcane mosaic (Potyvirus)*
>> *2014-06-20 01:51:21,210 WARN : [ALANameSearcher] - Unable to parse Acute
>> bee paralysis (Cripavirus). Name of type virus unparsable: Acute bee
>> paralysis (Cripavirus)*
>> *2014-06-20 01:51:21,255 WARN : [ALANameSearcher] - Unable to parse
>> Agropyron mosaic (Rymovirus). Name of type virus unparsable: Agropyron
>> mosaic (Rymovirus)*
>> *2014-06-20 01:51:21,289 WARN : [ALANameSearcher] - Unable to parse
>> Alphacrytovirus vicia. Name of type virus unparsable: Alphacrytovirus vicia*
>> *2014-06-20 01:51:21,334 WARN : [ALANameSearcher] - Unable to parse
>> American plum line pattern (APLPV, Ilaravirus). Name of type virus
>> unparsable: American plum line pattern (APLPV, Ilaravirus)*
>> *2014-06-20 01:51:21,525 WARN : [ALANameSearcher] - Unable to parse Apis
>> iridescent (Iridovirus). Name of type virus unparsable: Apis iridescent
>> (Iridovirus)*
>> *2014-06-20 01:51:21,546 WARN : [ALANameSearcher] - Unable to parse
>> Apricot ring pox (Unassigned). Name of type blacklisted unparsable:
>> Apricot ring pox (Unassigned)*
>> *2014-06-20 01:51:21,549 WARN : [ALANameSearcher] - Unable to parse
>> Arabis mosaic (Nepovirus). Name of type virus unparsable: Arabis mosaic
>> (Nepovirus)*
>> *2014-06-20 01:51:21,623 WARN : [ALANameSearcher] - Unable to parse
>> Artichoke Italian latent (Nepovirus). Name of type virus unparsable:
>> Artichoke Italian latent (Nepovirus)*
>> *2014-06-20 01:51:21,640 WARN : [ALANameSearcher] - Unable to parse
>> Asparagus (Ilarvirus). Name of type virus unparsable: Asparagus
>> (Ilarvirus)*
>> *2014-06-20 01:51:21,641 WARN : [ALANameSearcher] - Unable to parse
>> Asparagus (Potyvirus). Name of type virus unparsable: Asparagus
>> (Potyvirus)*
>> ...
>>
>> During the last phase there were no errors. However, only one record
>> was indexed.
>>
>> *2014-06-20 01:54:07,739 INFO : [SolrIndexDAO] - >>>>>>>>>>>>> Document
>> count of index: 1*
>> *2014-06-20 01:54:07,741 INFO : [SolrIndexDAO] - Finalise finished.*
>>
>> I attached a file with the complete messages generated by Biocache
>> during this test.
>>
>>
>> Thanks!
>>
>> Cheers.
>>
>> Daniel Lins da Silva
>> (Mobile) 55 11 96144-4050
>> Research Center on Biodiversity and Computing (Biocomp)
>> University of Sao Paulo, Brazil
>> daniellins at usp.br
>> daniel.lins at gmail.com
>>
>>
>>
>> 2014-06-18 6:15 GMT-03:00 <David.Martin at csiro.au>:
>>
>>> Hi Daniel,
>>>
>>> From what you've said, I'm not clear on what customisations you have
>>> made, so it's difficult to make a call on the impact of migrating to 1.1.
>>> We also do not know which subversion revisions you started with.
>>>
>>> We can tell you that functionally there wasn't a great deal of
>>> difference between the later snapshots of 1.0 and 1.1. The changes were
>>> largely structural, i.e. a clean-up of packages and removal of redundant
>>> code. We did this largely because we needed to (this code base is now over
>>> 5 years old) and we wanted to clean things up before other projects
>>> started to work with the software.
>>>
>>> Upgrading to biocache-service 1.1 and biocache-store shouldn't require
>>> any changes to cassandra, but it may require an upgrade of SOLR. If this
>>> is the case, you'll need to regenerate your index using the biocache
>>> commandline tool. Upgrading to 1.1 also shouldn't require any changes to
>>> hubs-webapp if you've customised this component.
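>>>
>>> If a SOLR upgrade does force a regeneration, the re-index run is roughly
>>> the sketch below (jar path and class name as in the standard 1.1 install;
>>> the dr uid is an example):

```shell
# Start the biocache commandline tool, then run "index <uid>" at the prompt.
BIOCACHE_JAR=/usr/lib/biocache/biocache-store-1.1-assembly.jar  # standard install path
if [ -f "$BIOCACHE_JAR" ]; then
  # At the "biocache>" prompt:  index dr0
  java -Xms2g -Xmx2g -cp "/usr/lib/biocache:$BIOCACHE_JAR" \
    au.org.ala.biocache.cmd.CommandLineTool
else
  echo "biocache jar not found at $BIOCACHE_JAR"
fi
```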
>>>
>>> I'd really recommend moving to 1.1 sooner rather than later, as it'll
>>> give you a stable baseline to work against.
>>>
>>> Hope this helps,
>>>
>>> Dave Martin
>>> ALA
>>>
>>> ------------------------------
>>> *From:* Daniel Lins [daniel.lins at gmail.com]
>>> *Sent:* 18 June 2014 15:54
>>>
>>> *To:* Martin, Dave (CES, Black Mountain)
>>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>>> Mountain); Pedro Corrêa; Nicholls, Miles (CES, Black Mountain)
>>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>>
>>> Hi Dave,
>>>
>>> How can I update Biocache-1.0-SNAPSHOT to version 1.1? I updated the
>>> biocache-store (biocache.jar) and the config file
>>> (/data/biocache/conf/config.properties-biocache) but I still have problems.
>>> What other steps do I need to take? Apparently the new biocache
>>> configuration file directly impacts my Biocache-Service and Solr.
>>>
>>> Will this update have any impact on other components?
>>>
>>> I cannot use the Vagrant/Ansible-based installation process because our
>>> environment is different and already has customizations, so I would like
>>> to update the biocache with minimal impact, if possible. Afterwards, we
>>> will plan the update of the other components.
>>>
>>> Can you advise me as to the best way forward?
>>>
>>> Thanks!!
>>>
>>> Regards,
>>>
>>> Daniel Lins da Silva
>>> (Mobile) 55 11 96144-4050
>>> Research Center on Biodiversity and Computing (Biocomp)
>>> University of Sao Paulo, Brazil
>>> daniellins at usp.br
>>> daniel.lins at gmail.com
>>>
>>>
>>>
>>> 2014-05-26 3:58 GMT-03:00 <David.Martin at csiro.au>:
>>>
>>>> Thanks Daniel.
>>>>
>>>> I'd recommend upgrading to 1.1, and I'd recommend installing with the
>>>> ansible scripts. This will give you a baseline configuration.
>>>> The scripts can be tested on a local machine using vagrant.
>>>> The configuration between 1.0 and 1.1 changed significantly - removal
>>>> of redundant legacy properties and adoption of a standard format for
>>>> property names.
>>>> Here's the template used for the configuration file in the ansible
>>>> scripts:
>>>>
>>>>
>>>> https://github.com/gbif/ala-install/blob/master/ansible/roles/biocache-service/templates/config/biocache-config.properties
>>>>
>>>> Cheers
>>>>
>>>> Dave Martin
>>>> ALA
>>>>
>>>> ------------------------------
>>>> *From:* Daniel Lins [daniel.lins at gmail.com]
>>>> *Sent:* 26 May 2014 15:02
>>>> *To:* Martin, Dave (CES, Black Mountain)
>>>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>>>> Mountain); Pedro Corrêa; Nicholls, Miles (CES, Black Mountain)
>>>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>>>
>>>> Hi Dave,
>>>>
>>>> When I ran the ingest command (ingest dr0), the system showed errors
>>>> like these below. However, after the error messages, I ran the index
>>>> command (index dr0), and data were published on the Portal.
>>>>
>>>> 2014-05-20 14:15:05,412 ERROR: [Grid] - cannot find GRID: /data/ala
>>>> /data/layers/ready/diva/worldclim_bio_19
>>>> 2014-05-20 14:15:05,414 ERROR: [Grid] - java.io.FileNotFoundException:
>>>> /data/ala/data/layers/ready/diva/worldclim_bio_19.gri (No such file or
>>>> directory)
>>>> at java.io.RandomAccessFile.open(Native Method)
>>>> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
>>>> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:122)
>>>> at org.ala.layers.intersect.Grid.getValues3(Grid.java:1017)
>>>> at org.ala.layers.intersect.SamplingThread.intersectGrid(
>>>> SamplingThread.java:112)
>>>> at org.ala.layers.intersect.SamplingThread.sample(SamplingThread.java:
>>>> 97)
>>>> at org.ala.layers.intersect.SamplingThread.run(SamplingThread.java:67)
>>>>
>>>> 2014-05-20 14:15:05,447 INFO : [Sampling] - ********* END - TEST
>>>> BATCH SAMPLING FROM FILE ***************
>>>> 2014-05-20 14:15:05,496 INFO : [Sampling] - Finished loading:
>>>> /tmp/sampling-dr0.txt in 49ms
>>>> 2014-05-20 14:15:05,496 INFO : [Sampling] - Removing temporary file:
>>>> /tmp/sampling-dr0.txt
>>>> 2014-05-20 14:15:05,553 INFO : [Consumer] - Initialising thread: 0
>>>> 2014-05-20 14:15:05,575 INFO : [Consumer] - Initialising thread: 1
>>>> 2014-05-20 14:15:05,575 INFO : [Consumer] - Initialising thread: 2
>>>> 2014-05-20 14:15:05,577 INFO : [Consumer] - In thread: 0
>>>> 2014-05-20 14:15:05,579 INFO : [Consumer] - Initialising thread: 3
>>>> 2014-05-20 14:15:05,579 INFO : [ProcessWithActors] - Starting with
>>>> dr0| endingwith dr0|~
>>>> 2014-05-20 14:15:05,581 INFO : [Consumer] - In thread: 2
>>>> 2014-05-20 14:15:05,581 INFO : [Consumer] - In thread: 1
>>>> 2014-05-20 14:15:05,584 INFO : [Consumer] - In thread: 3
>>>> 2014-05-20 14:15:05,592 INFO : [ProcessWithActors] - Initialised
>>>> actors...
>>>> 2014-05-20 14:15:05,647 INFO : [ProcessWithActors] - First rowKey
>>>> processed: dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:
>>>> MA120999
>>>> 2014-05-20 14:15:05,998 INFO : [ProcessWithActors] - Last row key
>>>> processed: dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA99991
>>>> 2014-05-20 14:15:06,006 INFO : [ProcessWithActors] - Finished.
>>>> 2014-05-20 14:15:06,015 INFO : [AttributionDAO] - Calling web service
>>>> for dr0
>>>> 2014-05-20 14:15:06,017 INFO : [Consumer] - Killing (Actor.act)
>>>> thread: 3
>>>> 2014-05-20 14:15:06,016 INFO : [Consumer] - Killing (Actor.act)
>>>> thread: 2
>>>> 2014-05-20 14:15:06,015 INFO : [Consumer] - Killing (Actor.act)
>>>> thread: 1
>>>> 2014-05-20 14:15:06,289 INFO : [AttributionDAO] - Looking up collectory
>>>> web service for ICMBIO|PARNASO
>>>> May 20, 2014 2:15:10 PM org.geotools.referencing.factory.epsg.ThreadedEpsgFactory
>>>> <init>
>>>> INFO: Setting the EPSG factory org.geotools.referencing.factory.epsg.DefaultFactory
>>>> to a 1800000ms timeout
>>>> May 20, 2014 2:15:10 PM org.geotools.referencing.factory.epsg.ThreadedEpsgFactory
>>>> <init>
>>>> INFO: Setting the EPSG factory org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory
>>>> to a 1800000ms timeout
>>>> May 20, 2014 2:15:10 PM org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory
>>>> createDataSource
>>>> INFO: Building new data source for org.geotools.referencing.factory.
>>>> epsg.ThreadedHsqlEpsgFactory
>>>> May 20, 2014 2:15:10 PM org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory
>>>> createBackingStore
>>>> INFO: Building backing store for org.geotools.referencing.factory.epsg.
>>>> ThreadedHsqlEpsgFactory
>>>> 2014-05-20 14:15:32,105 INFO : [Consumer] - Killing (Actor.act)
>>>> thread: 0
>>>> Indexing live with URL: null, and params: null&dataResource=dr0
>>>> java.lang.NullPointerException
>>>> at
>>>> au.org.ala.util.CMD$.au$org$ala$util$CMD$$indexDataResourceLive$1(CommandLineTool.scala:371)
>>>> at
>>>> au.org.ala.util.CMD$$anonfun$executeCommand$2.apply(CommandLineTool.scala:90)
>>>> at
>>>> au.org.ala.util.CMD$$anonfun$executeCommand$2.apply(CommandLineTool.scala:86)
>>>> at scala.collection.IndexedSeqOptimized$class.foreach(
>>>> IndexedSeqOptimized.scala:33)
>>>> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:105)
>>>> at au.org.ala.util.CMD$.executeCommand(CommandLineTool.scala:86)
>>>> at au.org.ala.util.CommandLineTool$.main(CommandLineTool.scala:26)
>>>> at au.org.ala.util.CommandLineTool.main(CommandLineTool.scala)
>>>>
>>>>
>>>>
>>>> We currently use *biocache-1.0-SNAPSHOT* in our environment, but in
>>>> your last mail you mentioned the *biocache-1.1-assembly* version.
>>>>
>>>> I downloaded this newer version, but when I ran it (ingest dr0)
>>>> in our environment the system showed many errors (see below).
>>>>
>>>>
>>>> log4j:WARN custom level class [org.ala.client.appender.RestLevel] not
>>>> found.
>>>> Exception in thread "main" java.lang.ExceptionInInitializerError
>>>> at
>>>> au.org.ala.biocache.load.DataLoader$class.$init$(DataLoader.scala:28)
>>>> at au.org.ala.biocache.load.Loader.<init>(Loader.scala:34)
>>>> at au.org.ala.biocache.cmd.CMD$.executeCommand(CMD.scala:29)
>>>> at
>>>> au.org.ala.biocache.cmd.CommandLineTool$.main(CommandLineTool.scala:22)
>>>> at au.org.ala.biocache.cmd.CommandLineTool.main(CommandLineTool.scala)
>>>> Caused by: com.google.inject.CreationException: Guice creation errors:
>>>>
>>>> 1) No implementation for java.lang.Integer annotated with @com.google.
>>>> inject.name.Named(value=cassandra.max.connections) was bound.
>>>> while locating java.lang.Integer annotated with @com.google.inject.
>>>> name.Named(value=cassandra.max.connections)
>>>> for parameter 4 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 2) No implementation for java.lang.Integer annotated with @com.google.
>>>> inject.name.Named(value=cassandra.max.retries) was bound.
>>>> while locating java.lang.Integer annotated with @com.google.inject.
>>>> name.Named(value=cassandra.max.retries)
>>>> for parameter 5 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 3) No implementation for java.lang.Integer annotated with @com.google.
>>>> inject.name.Named(value=cassandra.port) was bound.
>>>> while locating java.lang.Integer annotated with @com.google.inject.
>>>> name.Named(value=cassandra.port)
>>>> for parameter 1 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 4) No implementation for java.lang.Integer annotated with @com.google.
>>>> inject.name.Named(value=thrift.operation.timeout) was bound.
>>>> while locating java.lang.Integer annotated with @com.google.inject.
>>>> name.Named(value=thrift.operation.timeout)
>>>> for parameter 6 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 5) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=cassandra.hosts) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=cassandra.hosts)
>>>> for parameter 0 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 6) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=cassandra.keyspace) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=cassandra.keyspace)
>>>> for parameter 3 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 7) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=cassandra.pool) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=cassandra.pool)
>>>> for parameter 2 at
>>>> au.org.ala.biocache.persistence.CassandraPersistenceManager.<init
>>>> >(CassandraPersistenceManager.scala:24)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:184)
>>>>
>>>> 8) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=exclude.sensitive.values) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=exclude.sensitive.values)
>>>> for parameter 1 at au.org.ala.biocache.index.SolrIndexDAO.<init
>>>> >(SolrIndexDAO.scala:28)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:164)
>>>>
>>>> 9) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=extra.misc.fields) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=extra.misc.fields)
>>>> for parameter 2 at au.org.ala.biocache.index.SolrIndexDAO.<init
>>>> >(SolrIndexDAO.scala:28)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:164)
>>>>
>>>> 10) No implementation for java.lang.String annotated with @com.google.
>>>> inject.name.Named(value=solr.home) was bound.
>>>> while locating java.lang.String annotated with @com.google.inject.
>>>> name.Named(value=solr.home)
>>>> for parameter 0 at au.org.ala.biocache.index.SolrIndexDAO.<init
>>>> >(SolrIndexDAO.scala:28)
>>>> at au.org.ala.biocache.ConfigModule.configure(Config.scala:164)
>>>>
>>>> 10 errors
>>>> at com.google.inject.internal.Errors.
>>>> throwCreationExceptionIfErrorsExist(Errors.java:354)
>>>> at com.google.inject.InjectorBuilder.initializeStatically(
>>>> InjectorBuilder.java:152)
>>>> at com.google.inject.InjectorBuilder.build(InjectorBuilder.java:105)
>>>> at com.google.inject.Guice.createInjector(Guice.java:92)
>>>> at com.google.inject.Guice.createInjector(Guice.java:69)
>>>> at com.google.inject.Guice.createInjector(Guice.java:59)
>>>> at au.org.ala.biocache.Config$.<init>(Config.scala:24)
>>>> at au.org.ala.biocache.Config$.<clinit>(Config.scala)
>>>> ... 5 more
>>>>
>>>>
>>>> Regarding these specific issues (data update and incremental load),
>>>> will I need to upgrade the biocache version (to 1.1 or newer), or can I
>>>> work with version 1.0-SNAPSHOT? If I upgrade, will it remain compatible
>>>> with the other components? How should I proceed?
>>>>
>>>> Which layer files should I include in my environment to run these tests?
>>>>
>>>> Thanks!
>>>>
>>>> Regards,
>>>>
>>>> Daniel Lins da Silva
>>>> (Mobile) 55 11 96144-4050
>>>> Research Center on Biodiversity and Computing (Biocomp)
>>>> University of Sao Paulo, Brazil
>>>> daniellins at usp.br
>>>> daniel.lins at gmail.com
>>>>
>>>>
>>>> 2014-05-09 2:14 GMT-03:00 <David.Martin at csiro.au>:
>>>>
>>>>> Thanks Daniel.
>>>>>
>>>>> I've spotted the problem:
>>>>>
>>>>> java -cp .:biocache.jar au.org.ala.util.DwcCSVLoader dr0 -l
>>>>> dataset-updated.csv -b true
>>>>>
>>>>> this bypasses lookups against the collectory for the metadata.
>>>>>
>>>>> To load this dataset, you can use the biocache commandline tool like
>>>>> so:
>>>>>
>>>>> $ java -cp /usr/lib/biocache:/usr/lib/biocache/biocache-store-1.1-assembly.jar \
>>>>>   -Xms2g -Xmx2g au.org.ala.biocache.cmd.CommandLineTool
>>>>>
>>>>>
>>>>> ----------------------------
>>>>>
>>>>> | Biocache management tool |
>>>>>
>>>>> ----------------------------
>>>>>
>>>>> Please supply a command or hit ENTER to view command list.
>>>>>
>>>>> biocache> ingest dr8
>>>>>
>>>>> This will:
>>>>>
>>>>> 1) Retrieve the metadata from the configured instance of the
>>>>> collectory
>>>>> 2) Load, process, sample (if there are layers configured and
>>>>> available) and index
>>>>>
>>>>> Cheers
>>>>>
>>>>> Dave
>>>>>
>>>>> ------------------------------
>>>>> *From:* Daniel Lins [daniel.lins at gmail.com]
>>>>> *Sent:* 09 May 2014 14:27
>>>>>
>>>>> *To:* Martin, Dave (CES, Black Mountain)
>>>>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>>>>> Mountain); Pedro Corrêa; Nicholls, Miles (CES, Black Mountain)
>>>>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>>>>
>>>>> David,
>>>>>
>>>>> The dr0 configuration:
>>>>>
>>>>> https://www.dropbox.com/s/lsy11jadwmyghjj/collectoryConfig1.png
>>>>>
>>>>> Sorry, but this server doesn't have external access yet.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2014-05-09 1:06 GMT-03:00 <David.Martin at csiro.au>:
>>>>>
>>>>>> As an example of what it should look like, see:
>>>>>>
>>>>>>
>>>>>> http://ala-demo.gbif.org/collectory/dataResource/edit/dr8?page=contribution
>>>>>>
>>>>>>
>>>>>> ------------------------------
>>>>>> *From:* Daniel Lins [daniel.lins at gmail.com]
>>>>>>
>>>>>> *Sent:* 09 May 2014 13:59
>>>>>> *To:* Martin, Dave (CES, Black Mountain)
>>>>>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>>>>>> Mountain); Pedro Corrêa; Nicholls, Miles (CES, Black Mountain)
>>>>>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>>>>>
>>>>>> Thanks David,
>>>>>>
>>>>>> We use the DwC term "occurrenceID" to identify the records. It's a
>>>>>> unique key.
>>>>>>
>>>>>> However, when I reload a dataset to update some DwC terms of the
>>>>>> records, the system duplicates the data (it keeps the old record and
>>>>>> creates another with the changes).
>>>>>>
>>>>>> For instance (update of locality).
>>>>>>
>>>>>> Load 1 ($ java -cp .:biocache.jar au.org.ala.util.DwcCSVLoader dr0
>>>>>> -l dataset.csv -b true)
>>>>>>
>>>>>> {OccurrenceID: 1, municipality: Sao Paulo, ...},
>>>>>> {OccurrenceID: 2, municipality: Sao Paulo, ...}
>>>>>>
>>>>>> Process 1 (biocache$ process dr0)
>>>>>> Index 1 (biocache$ index dr0)
>>>>>>
>>>>>> Load 2 (updated records and new records) ($ java -cp .:biocache.jar
>>>>>> au.org.ala.util.DwcCSVLoader dr0 -l dataset-updated.csv -b true)
>>>>>>
>>>>>> {OccurrenceID: 1, municipality: Rio de Janeiro, ...},
>>>>>> {OccurrenceID: 2, municipality: Rio de Janeiro, ...},
>>>>>> {OccurrenceID: 3, municipality: Sao Paulo, ...}
>>>>>>
>>>>>> Process 2 (biocache$ process dr0)
>>>>>> Index 2 (biocache$ index dr0)
>>>>>>
>>>>>> Results shown by ALA:
>>>>>>
>>>>>> {OccurrenceID: 1, municipality: Sao Paulo, ...},
>>>>>> {OccurrenceID: 2, municipality: Sao Paulo, ...},
>>>>>> {OccurrenceID: 1, municipality: Rio de Janeiro, ...},
>>>>>> {OccurrenceID: 2, municipality: Rio de Janeiro, ...}
>>>>>> {OccurrenceID: 3, municipality: Sao Paulo, ...}
>>>>>>
>>>>>> But I expected:
>>>>>>
>>>>>> {OccurrenceID: 1, municipality: Rio de Janeiro, ...},
>>>>>> {OccurrenceID: 2, municipality: Rio de Janeiro, ...}
>>>>>> {OccurrenceID: 3, municipality: Sao Paulo, ...}
>>>>>>
>>>>>> Do I need to delete the existing data (delete-resource function) before
>>>>>> the reload? If not, what did I do wrong to cause this duplication?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Daniel Lins da Silva
>>>>>> (Mobile) 55 11 96144-4050
>>>>>> Research Center on Biodiversity and Computing (Biocomp)
>>>>>> University of Sao Paulo, Brazil
>>>>>> daniellins at usp.br
>>>>>> daniel.lins at gmail.com
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2014-05-07 0:46 GMT-03:00 <David.Martin at csiro.au>:
>>>>>>
>>>>>>> Thanks Daniel. Natasha has now left the ALA.
>>>>>>>
>>>>>>> The uniqueness of records is determined by information stored in
>>>>>>> the collectory. See screenshot [1].
>>>>>>> By default, "catalogNumber" is used but you can change this to any
>>>>>>> number of fields that should be stable in the data.
>>>>>>> Using unstable fields for the ID isn't recommended (e.g.
>>>>>>> scientificName). To update the records, the process is to just
>>>>>>> re-load the dataset.
>>>>>>>
>>>>>>> Automatically loaded - this isn't in use and we may remove it from
>>>>>>> the UI in future iterations.
>>>>>>> Incremental Load - restricts the sample/process/index steps to only
>>>>>>> run against the new records. Load is always incremental based on the
>>>>>>> key field(s), but if the incremental load box isn't checked it runs the
>>>>>>> sample/process/index steps against the whole data set. This can cause a
>>>>>>> large processing overhead when there's a minor update to a large data set.
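
[Editor's note: conceptually, key-based loading behaves like a dictionary upsert, which is why a reload with stable key values replaces records rather than duplicating them. A minimal sketch, not the actual biocache persistence code:]

```python
# Conceptual sketch only: store and load() are illustrative, not the
# real biocache persistence layer.
store = {}

def load(records, key="occurrenceID"):
    for rec in records:
        store[rec[key]] = rec   # same key -> overwrite; new key -> insert

load([{"occurrenceID": 1, "municipality": "Sao Paulo"},
      {"occurrenceID": 2, "municipality": "Sao Paulo"}])
load([{"occurrenceID": 1, "municipality": "Rio de Janeiro"},
      {"occurrenceID": 2, "municipality": "Rio de Janeiro"},
      {"occurrenceID": 3, "municipality": "Sao Paulo"}])

# Three records remain; duplicates appear only when the configured
# unique-key terms differ between the two loads.
```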
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> Dave Martin
>>>>>>> ALA
>>>>>>>
>>>>>>> [1] http://bit.ly/1g72HFN
>>>>>>>
>>>>>>> ------------------------------
>>>>>>> *From:* Daniel Lins [daniel.lins at gmail.com]
>>>>>>> *Sent:* 05 May 2014 15:39
>>>>>>> *To:* Quimby, Natasha (CES, Black Mountain)
>>>>>>> *Cc:* ala-portal at lists.gbif.org; dos Remedios, Nick (CES, Black
>>>>>>> Mountain); Martin, Dave (CES, Black Mountain); Pedro Corrêa
>>>>>>> *Subject:* Re: [Ala-portal] DwC-A loading problems
>>>>>>>
>>>>>>> Hi Natasha,
>>>>>>>
>>>>>>> I managed to import the DwC-A file following the steps reported in
>>>>>>> the previous email. Thank you!
>>>>>>>
>>>>>>> However, when I tried to update some metadata of an occurrence
>>>>>>> record (already stored in the database), the system created a new record
>>>>>>> with the duplicated information. So I ended up with several records
>>>>>>> with the same occurrenceID (I did set the data resource
>>>>>>> configuration to use "occurrenceID" to uniquely identify a record).
>>>>>>>
>>>>>>> How can I update existing records in the database? For instance,
>>>>>>> the location's metadata of an occurrence record stored in my database?
>>>>>>>
>>>>>>> I would also like to better understand the behavior of the
>>>>>>> properties "Automatically loaded" and "Incremental Load".
>>>>>>>
>>>>>>> Thanks!!
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Daniel Lins da Silva
>>>>>>> (Mobile) 55 11 96144-4050
>>>>>>> Research Center on Biodiversity and Computing (Biocomp)
>>>>>>> University of Sao Paulo, Brazil
>>>>>>> daniellins at usp.br
>>>>>>> daniel.lins at gmail.com
>>>>>>>
>>>>>>>
>>>>>>> 2014-04-28 3:52 GMT-03:00 Daniel Lins <daniel.lins at gmail.com>:
>>>>>>>
>>>>>>>> Thanks Natasha!
>>>>>>>>
>>>>>>>> I will try your recommendations. Once finished, I will contact
>>>>>>>> you.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> Daniel Lins da Silva
>>>>>>>> (Mobile) 55 11 96144-4050
>>>>>>>> Research Center on Biodiversity and Computing (Biocomp)
>>>>>>>> University of Sao Paulo, Brazil
>>>>>>>> daniellins at usp.br
>>>>>>>> daniel.lins at gmail.com
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2014-04-28 3:26 GMT-03:00 <Natasha.Quimby at csiro.au>:
>>>>>>>>
>>>>>>>> Hi Daniel,
>>>>>>>>>
>>>>>>>>> When you specify a local DwC-A load, the archive needs to be
>>>>>>>>> unzipped. Try unzipping *2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip*
>>>>>>>>> and then running the following:
>>>>>>>>> *sudo java -cp .:biocache.jar au.org.ala.util.DwCALoader dr7 -l
>>>>>>>>> /data/collectory/upload/1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b*
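
[Editor's note: that unzip step can be sketched as below; the paths are the ones from this thread, and the helper name is illustrative.]

```python
import zipfile

def unzip_dwca(src_zip: str) -> str:
    """Extract a DwC-A zip next to itself and return the directory
    that DwCALoader's -l option should then be pointed at."""
    dst = src_zip[:-4] if src_zip.endswith(".zip") else src_zip + "_unzipped"
    with zipfile.ZipFile(src_zip) as zf:
        zf.extractall(dst)   # meta.xml, occurrence.txt, etc. land here
    return dst

# e.g.:
# unzip_dwca("/data/collectory/upload/1398658607824/"
#            "2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip")
```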
>>>>>>>>>
>>>>>>>>> If you configure the collectory to provide the DwC-A, the biocache
>>>>>>>>> automatically unzips the archive for you. You would need to configure dr7
>>>>>>>>> with the following connection parameters:
>>>>>>>>>
>>>>>>>>> "protocol":"DwCA"
>>>>>>>>> "termsForUniqueKey":["occurrenceID"],
>>>>>>>>> "url":"file:////data/collectory/upload/
>>>>>>>>> 1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip"
>>>>>>>>>
>>>>>>>>> You could then load the resource with:
>>>>>>>>> *sudo java -cp .:biocache.jar au.org.ala.util.DwCALoader dr7*
>>>>>>>>>
>>>>>>>>> If you continue to have issues please let us know.
>>>>>>>>>
>>>>>>>>> Hope that this helps.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Natasha
>>>>>>>>>
>>>>>>>>> From: Daniel Lins <daniel.lins at gmail.com>
>>>>>>>>> Date: Monday, 28 April 2014 3:54 PM
>>>>>>>>> To: "ala-portal at lists.gbif.org" <ala-portal at lists.gbif.org>, "dos
>>>>>>>>> Remedios, Nick (CES, Black Mountain)" <Nick.Dosremedios at csiro.au>,
>>>>>>>>> "Martin, Dave (CES, Black Mountain)" <David.Martin at csiro.au>
>>>>>>>>> Subject: [Ala-portal] DwC-A loading problems
>>>>>>>>>
>>>>>>>>> Hi Nick and Dave,
>>>>>>>>>
>>>>>>>>> We are having some problems in Biocache during the upload of
>>>>>>>>> DwC-A files.
>>>>>>>>>
>>>>>>>>> As shown below, after running the method
>>>>>>>>> "au.org.ala.util.DwCALoader", our system returns the error "Exception
>>>>>>>>> in thread "main" org.gbif.dwc.text.UnkownDelimitersException:
>>>>>>>>> Unable to detect field delimiter".
>>>>>>>>>
>>>>>>>>> I ran tests using DwC-A files with tab-delimited text files and
>>>>>>>>> comma-delimited text files. In both cases the error was the same.
>>>>>>>>>
>>>>>>>>> What causes these problems? (The CSV loader works fine.)
>>>>>>>>>
>>>>>>>>> *tab-delimited file test*
>>>>>>>>>
>>>>>>>>> poliusp at poliusp-VirtualBox:~/dev/biocache$ *sudo java -cp
>>>>>>>>> .:biocache.jar au.org.ala.util.DwCALoader dr7 -l
>>>>>>>>> /data/collectory/upload/1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip*
>>>>>>>>> 2014-04-28 01:44:02,837 INFO : [ConfigModule] - Loading
>>>>>>>>> configuration from /data/biocache/config/biocache-config.properties
>>>>>>>>> 2014-04-28 01:44:03,090 INFO : [ConfigModule] - Initialise SOLR
>>>>>>>>> 2014-04-28 01:44:03,103 INFO : [ConfigModule] - Initialise name
>>>>>>>>> matching indexes
>>>>>>>>> 2014-04-28 01:44:03,605 INFO : [ConfigModule] - Initialise
>>>>>>>>> persistence manager
>>>>>>>>> 2014-04-28 01:44:03,606 INFO : [ConfigModule] - Configure complete
>>>>>>>>> Loading archive
>>>>>>>>> /data/collectory/upload/1398658607824/2f676abc-4503-489e-8f0c-fcb6e1bc554b.zip
>>>>>>>>> for resource dr7 with unique terms List(dwc:occurrenceID)
>>>>>>>>> stripping spaces false incremental false testing false
>>>>>>>>> *Exception in thread "main"
>>>>>>>>> org.gbif.dwc.text.UnkownDelimitersException: Unable to detect field
>>>>>>>>> delimiter*
>>>>>>>>>         at org.gbif.file.CSVReaderFactory.buildArchiveFile(CSVReaderFactory.java:129)
>>>>>>>>>         at org.gbif.file.CSVReaderFactory.build(CSVReaderFactory.java:46)
>>>>>>>>>         at org.gbif.dwc.text.ArchiveFactory.readFileHeaders(ArchiveFactory.java:344)
>>>>>>>>>         at org.gbif.dwc.text.ArchiveFactory.openArchive(ArchiveFactory.java:289)
>>>>>>>>>         at au.org.ala.util.DwCALoader.loadArchive(DwCALoader.scala:129)
>>>>>>>>>         at au.org.ala.util.DwCALoader.loadLocal(DwCALoader.scala:106)
>>>>>>>>>         at au.org.ala.util.DwCALoader$.main(DwCALoader.scala:52)
>>>>>>>>>         at au.org.ala.util.DwCALoader.main(DwCALoader.scala)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *comma-delimited file test*
>>>>>>>>>
>>>>>>>>> poliusp at poliusp-VirtualBox:~/dev/biocache$ *sudo java -cp
>>>>>>>>> .:biocache.jar au.org.ala.util.DwCALoader dr7 -l ./dwca-teste3.zip*
>>>>>>>>> 2014-04-28 01:56:04,683 INFO : [ConfigModule] - Loading
>>>>>>>>> configuration from /data/biocache/config/biocache-config.properties
>>>>>>>>> 2014-04-28 01:56:04,940 INFO : [ConfigModule] - Initialise SOLR
>>>>>>>>> 2014-04-28 01:56:04,951 INFO : [ConfigModule] - Initialise name
>>>>>>>>> matching indexes
>>>>>>>>> 2014-04-28 01:56:05,437 INFO : [ConfigModule] - Initialise
>>>>>>>>> persistence manager
>>>>>>>>> 2014-04-28 01:56:05,438 INFO : [ConfigModule] - Configure complete
>>>>>>>>> Loading archive ./dwca-teste3.zip for resource dr7 with unique
>>>>>>>>> terms List(dwc:occurrenceID) stripping spaces false incremental
>>>>>>>>> false testing false
>>>>>>>>> *Exception in thread "main"
>>>>>>>>> org.gbif.dwc.text.UnkownDelimitersException: Unable to detect field
>>>>>>>>> delimiter*
>>>>>>>>>         at org.gbif.file.CSVReaderFactory.buildArchiveFile(CSVReaderFactory.java:129)
>>>>>>>>>         at org.gbif.file.CSVReaderFactory.build(CSVReaderFactory.java:46)
>>>>>>>>>         at org.gbif.dwc.text.ArchiveFactory.readFileHeaders(ArchiveFactory.java:344)
>>>>>>>>>         at org.gbif.dwc.text.ArchiveFactory.openArchive(ArchiveFactory.java:289)
>>>>>>>>>         at au.org.ala.util.DwCALoader.loadArchive(DwCALoader.scala:129)
>>>>>>>>>         at au.org.ala.util.DwCALoader.loadLocal(DwCALoader.scala:106)
>>>>>>>>>         at au.org.ala.util.DwCALoader$.main(DwCALoader.scala:52)
>>>>>>>>>         at au.org.ala.util.DwCALoader.main(DwCALoader.scala)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Regards.
>>>>>>>>> --
>>>>>>>>> Daniel Lins da Silva
>>>>>>>>> (Mobile) 55 11 96144-4050
>>>>>>>>> Research Center on Biodiversity and Computing (Biocomp)
>>>>>>>>> University of Sao Paulo, Brazil
>>>>>>>>> daniellins at usp.br
>>>>>>>>> daniel.lins at gmail.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Daniel Lins da Silva
>>>>>>>> (Cel) 11 6144-4050
>>>>>>>> daniel.lins at gmail.com
>>>>>>>>
--
Daniel Lins da Silva
(Cel) 11 6144-4050
daniel.lins at gmail.com