[Ala-portal] [ExternalEmail] Re: Using Biocache-Store and Apache Cassandra on different servers

Daniel Lins daniel.lins at gmail.com
Wed Jun 25 08:00:17 CEST 2014


Hi Dave,

I loaded the data with the biocache tool (load dr0). To view data in
Cassandra I normally use cassandra-cli or the CQL shell (cqlsh).

After loading the data on the remote server, I can see the records with
both tools (see below). The problems occur afterwards, during the
processing and indexing steps.



poliusp at poliusp-VirtualBox:~/dev/apache-cassandra-1.2.13/bin$ sudo ./cassandra-cli -h 192.168.15.199 -k occ
Connected to: "Biocache Cluster" on 192.168.15.199/9160
Welcome to Cassandra CLI version 1.2.13

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default at occ] list occ limit 1;
Using default cell limit of 100
-------------------
RowKey: dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA120999
=> (name=acceptedNameUsageID, value=MA20, timestamp=1403674832408000)
=> (name=associatedMedia,
value=/data/biocache-media/dr0/11302/4356e553-1080-4815-ba86-20f85f13394c/Chrysocyon.brachyurus.jpg,
timestamp=1403674832408000)
=> (name=basisOfRecord, value=HumanObservation, timestamp=1403674832408000)
=> (name=catalogNumber, value=MA120999, timestamp=1403674832408000)
=> (name=classs, value=Mammalia, timestamp=1403666202349000)
=> (name=collectionCode, value=PARNASO, timestamp=1403674832408000)
=> (name=continent, value=América, timestamp=1403674832408000)
=> (name=coordinatePrecision, value=30, timestamp=1403674832408000)
=> (name=country, value=Brasil, timestamp=1403674832408000)
=> (name=countryCode, value=BR, timestamp=1403674832408000)
=> (name=county, value=Anastacio, timestamp=1403674832408000)
=> (name=dataResourceUid, value=dr0, timestamp=1403674832408000)
=> (name=datasetName, value=Monitoramento, timestamp=1403674832408000)
=> (name=dateIdentified, value=2014-03-16 00:00:00.0,
timestamp=1403674832408000)
=> (name=day, value=16, timestamp=1403674832408000)
=> (name=decimalLatitude, value=-18.355717, timestamp=1403674832408000)
=> (name=decimalLongitude, value=-55.55105, timestamp=1403674832408000)
=> (name=defaultValuesUsed, value=false, timestamp=1403674832408000)
=> (name=eventDate, value=2014-03-16 00:00:00.0, timestamp=1403674832408000)
=> (name=eventID, value=100, timestamp=1403674832408000)
=> (name=eventRemarks, value=Chuvoso, timestamp=1403674832408000)
=> (name=family, value=Canidae, timestamp=1403674832408000)
=> (name=firstLoaded, value=2014-06-25T00:15:40Z,
timestamp=1403666202349000)
=> (name=genus, value=Chrysocyon, timestamp=1403674832408000)
=> (name=geodeticDatum, value=EPSG:4326, timestamp=1403674832408000)
=> (name=georeferenceProtocol, value="Guide to Best Practices for
Georeferencing" (Chapman and Wieczorek, eds. 2006),
timestamp=1403674832408000)
=> (name=georeferencedBy, value=D. Silva, timestamp=1403674832408000)
=> (name=identifiedBy, value=James L. Patton, timestamp=1403674832408000)
=> (name=institutionCode, value=ICMBIO, timestamp=1403674832408000)
=> (name=kingdom, value=Animalia, timestamp=1403674832408000)
=> (name=language, value=pt-BR, timestamp=1403674832408000)
=> (name=lastModifiedTime, value=2014-06-25T02:40:29Z,
timestamp=1403674832408000)
=> (name=locality, value=Fazenda Alegre Paraguai River,
timestamp=1403674832408000)
=> (name=locationDetermined, value=false, timestamp=1403674832408000)
=> (name=locationRemarks, value=Cerrado, timestamp=1403674832408000)
=> (name=maximumElevationInMeters, value=0, timestamp=1403674832408000)
=> (name=minimumElevationInMeters, value=0, timestamp=1403674832408000)
=> (name=miscProperties, value={"class":"Mammalia","type":"Event"},
timestamp=1403674832408000)
=> (name=modified, value=2013-03-16 18:19:39.0, timestamp=1403674832408000)
=> (name=month, value=3, timestamp=1403674832408000)
=> (name=municipality, value=Anastacio, timestamp=1403674832408000)
=> (name=occurrenceID,
value=urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA120999,
timestamp=1403674832408000)
=> (name=order, value=Carnivora, timestamp=1403674832408000)
=> (name=phylum, value=Chordata, timestamp=1403674832408000)
=> (name=recordNumber, value=444, timestamp=1403674832408000)
=> (name=recordedBy, value=S. Siemel, timestamp=1403674832408000)
=> (name=rights, value=Os termos e condições para uso estão disponíveis no
documento localizado em http://www.icmbio.gov.br/direitos/normas.pdf.,
timestamp=1403674832408000)
=> (name=rowKey,
value=dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA120999,
timestamp=1403674832408000)
=> (name=samplingEffort, value=105 homem-hora, timestamp=1403674832408000)
=> (name=samplingProtocol, value=Protocolo de Monitoramento,
timestamp=1403674832408000)
=> (name=scientificName, value=Chrysocyon brachyurus,
timestamp=1403674832408000)
=> (name=scientificNameAuthorship, value=(Illiger 1815),
timestamp=1403674832408000)
=> (name=stateProvince, value=Mato Grosso do Sul,
timestamp=1403674832408000)
=> (name=taxonRank, value=species, timestamp=1403674832408000)
=> (name=uuid, value=4356e553-1080-4815-ba86-20f85f13394c,
timestamp=1403674832408000)
=> (name=verbatimLatitude, value=-18.355717, timestamp=1403674832408000)
=> (name=verbatimLongitude, value=-55.55105, timestamp=1403674832408000)
=> (name=year, value=2014, timestamp=1403674832408000)
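
The timestamp fields in the listing above are easier to compare when read as
dates. A minimal sketch, assuming the usual Cassandra convention that column
timestamps are microseconds since the Unix epoch; the names parse_cell and
to_utc are made up for illustration and are not part of biocache or
Cassandra. Cells that the mail client wrapped over several lines would need
to be joined before parsing.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches one single-line cell as printed by cassandra-cli, e.g.
# => (name=country, value=Brasil, timestamp=1403674832408000)
CELL = re.compile(
    r"^=> \(name=(?P<name>[^,]+), value=(?P<value>.*), timestamp=(?P<ts>\d+)\)$"
)


def to_utc(micros):
    """Convert microseconds since the epoch (assumed unit) to a UTC datetime."""
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=micros)


def parse_cell(line):
    """Return (name, value, utc_datetime), or None if the line is not a cell."""
    m = CELL.match(line.strip())
    if m is None:
        return None
    return m.group("name"), m.group("value"), to_utc(int(m.group("ts")))


if __name__ == "__main__":
    print(parse_cell("=> (name=country, value=Brasil, timestamp=1403674832408000)"))
```

For the country cell this gives 2014-06-25 05:40:32 UTC, which is 02:40:32 at
GMT-03:00.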




poliusp at poliusp-VirtualBox:~/dev/biocache$ cqlsh 192.168.15.199
Connected to Biocache Cluster at 192.168.15.199:9160.
[cqlsh 3.1.8 | Cassandra 2.0.8 | CQL spec 3.0.0 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh> use occ;
cqlsh:occ> select * from occ;

 key                                                           | portalId | uuid
---------------------------------------------------------------+----------+--------------------------------------
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA120999 |     null | 4356e553-1080-4815-ba86-20f85f13394c
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA34261 |     null | ab61409f-5e42-41a6-ac81-8bf9c0b93aac
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63770 |     null | cffb70cb-f583-4e10-8ea9-600fdce36706
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA133941 |     null | 9be57c4f-e2d0-4d67-a418-cb140c113029
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63773 |     null | 9c3c0c56-9279-4e3d-8b01-a4bc9068093f
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA27101 |     null | 4e1b082d-3610-45b2-b49f-f9c72c050866
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA36962 |     null | 515b1fd5-7611-4724-a371-c2843f4b0900
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63846 |     null | a48edaee-f7ba-41bf-bef1-bef867a67faa
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63771 |     null | 352fe23c-7f29-43eb-bb82-8b5f195150a1
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA28313 |     null | 6a1d090d-dadb-4f24-8bfe-a6c24846d8c3
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA146295 |     null | 5c2500cc-6935-4caf-ad05-0fea60fc2f3c
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA42147 |     null | a9f14456-01cb-44b8-91bc-605aecd26d50
   dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA2523 |     null | c5a064fc-31b5-4952-a513-9062ad347db0
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA123393 |     null | 75a50714-1c30-4e56-a2bb-006593b6277a
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA36937 |     null | 69db738e-a98e-4f25-95e1-ba9dd52aa006
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63769 |     null | 4a8e57f0-d288-48f5-9b07-d56fe190e4e1
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA44534 |     null | 173dbf31-09e6-4672-a210-dc63dbe33f47
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA21669 |     null | b75bc5d8-0899-4baf-883a-bb0b97416a74
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA71180 |     null | 676e7b64-2303-4707-b726-63112fde031a
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA28312 |     null | 93106978-65fd-4eb6-aa5d-ca1d2e5c98f4
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA28311 |     null | 92c03fe9-20cf-4ff0-9749-85e5730e8893
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA23771 |     null | a7c18f6a-5575-48e5-b5fc-8cf3b28a8848
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA71179 |     null | 4ea9ad71-16a3-4f17-800c-ba40492c978b
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63847 |     null | 2c94f101-2fec-4aba-93a0-05d50f529300
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA464300 |     null | 0ac772ed-8e02-444d-8433-a80aa9771f25
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63772 |     null | 2ce8275c-7d36-4e9d-a15f-4e3daf6bfde9
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA16756 |     null | 253b2511-6a09-493a-9d05-2f2573f9fbd1
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA133940 |     null | 368a1412-7350-46b5-b389-07125dd28a33
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA41315 |     null | ead8efa0-2216-427b-bfaa-595a49603a09
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA198039 |     null | e7f101cd-243d-479a-98b1-043adae19cb7
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA133942 |     null | 704cf8c0-9c10-4e1b-93d1-8a5a52e62126
   dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA2524 |     null | 1c287699-75dd-4ac6-80cc-314151277366
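
All the keys listed above begin with dr0|, which is the prefix the process
and index steps log as their scan range in the quoted message below (dr0| to
dr0|~). A minimal sketch of that selection, assuming the range is a plain
lexicographic comparison on the key string; how biocache-store and Cassandra
actually evaluate the range is not shown in this thread, and in_range is a
made-up helper.

```python
def in_range(key, start, end):
    """True if key sorts between start and end (inclusive) as a plain string."""
    return start <= key <= end


keys = [
    "dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA120999",
    "dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA2524",
    "dr107|43704421",  # key from Dave's transcript, a different data resource
]

# Range bounds as logged by the process and index steps.
start, end = "dr0|", "dr0|~"
selected = [k for k in keys if in_range(k, start, end)]

if __name__ == "__main__":
    for k in selected:
        print(k)
```

Under that assumption both dr0 keys are selected and dr107|43704421 is not.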



2014-06-25 2:32 GMT-03:00 <David.Martin at csiro.au>:

>  I thought I'd just post the following shell transcript to the list in
> case it's of use to others.
> The cassandra-cli tool is part of the cassandra distribution.
> To view the data in cassandra using the cassandra-cli tool:
>
> mar759 at ala-ono:~$ cassandra-cli -h localhost -p 9160 -k occ
> Column Family assumptions read from /home/mar759/.cassandra/assumptions.json
> Connected to: "Biocache" on localhost/9160
> Welcome to Cassandra CLI version 1.2.10
>
> Type 'help;' or '?' for help.
> Type 'quit;' or 'exit;' to quit.
>
> [default at occ] list occ limit 1;
> Using default cell limit of 100
> -------------------
> RowKey: dr107|43704421
> => (name=attr.qa, value=[], timestamp=1397043413407000)
> => (name=basisOfRecord, value=UNKNOWN, timestamp=1397043412777000)
> => (name=bor.qa, value=[20002], timestamp=1397043413407000)
> => (name=catalogNumber, value=1, timestamp=1397043412777000)
> => (name=class.qa, value=[10008,10005], timestamp=1397043413407
> .....
>
>
>  ------------------------------
> *From:* ala-portal-bounces at lists.gbif.org [
> ala-portal-bounces at lists.gbif.org] on behalf of David.Martin at csiro.au
> [David.Martin at csiro.au]
> *Sent:* 25 June 2014 15:24
> *To:* daniel.lins at gmail.com; ala-portal at lists.gbif.org;
> pedro.correa at usp.br
> *Subject:* [ExternalEmail] Re: [Ala-portal] Using Biocache-Store and
> Apache Cassandra on different servers
>
>   Thanks Daniel.
>
>  How did you load the data ?
> Also can you see the records using the cassandra-cli tool ?
>
>  Dave
>
>
>  ------------------------------
> *From:* Daniel Lins [daniel.lins at gmail.com]
> *Sent:* 25 June 2014 15:21
> *To:* Martin, Dave (CES, Black Mountain); ala-portal at lists.gbif.org;
> Pedro Corrêa
> *Subject:* Using Biocache-Store and Apache Cassandra on different servers
>
>   Hi,
>
>  We installed the ALA applications on a different server from the
> database server. However, we are having problems running biocache-store.
>
>  I changed the biocache configuration file
> (/data/biocache/config/biocache-config.properties) and the Cassandra
> configuration (cassandra.yaml) to enable remote access.
>
>  (biocache-config.properties)
> # Cassandra Config
> db=cassandra
> cassandra.hosts=192.168.15.199
> cassandra.port=9160
> cassandra.pool=biocache-store-pool
> cassandra.keyspace=occ
> cassandra.max.connections=-1
> cassandra.max.retries=6
> thrift.operation.timeout=8000
>
>  (cassandra.yaml)
> listen_address: 192.168.15.199
> rpc_address: 192.168.15.199
> rpc_port: 9160
> ...
>
>  *Running biocache-store on the server 192.168.15.132*
>
>  When I ran the load step, the data was saved correctly in the Cassandra
> database. However, the process step did not retrieve any records (see the
> output below) and the index step indexed nothing.
>
>   biocache> process dr0
> Processing dr0 incremental=false
> 2014-06-25 00:17:20,228 INFO : [Consumer] - Initialising thread: 0
> 2014-06-25 00:17:20,270 INFO : [Consumer] - Initialising thread: 1
> 2014-06-25 00:17:20,271 INFO : [Consumer] - Initialising thread: 2
> 2014-06-25 00:17:20,274 INFO : [Consumer] - Initialising thread: 3
> 2014-06-25 00:17:20,275 INFO : [ProcessWithActors] - Starting with dr0|
> endingwith dr0|~
> 2014-06-25 00:17:20,275 INFO : [ProcessWithActors] - Initialised actors...
> 2014-06-25 00:17:20,274 INFO : [Consumer] - In thread: 0
> 2014-06-25 00:17:20,274 INFO : [Consumer] - In thread: 1
> 2014-06-25 00:17:20,281 INFO : [Consumer] - In thread: 2
> 2014-06-25 00:17:20,293 INFO : [Consumer] - In thread: 3
> 2014-06-25 00:17:20,326 INFO : [ProcessWithActors] - Last row key
> processed:
> 2014-06-25 00:17:20,327 INFO : [ProcessWithActors] - Finished.
> 2014-06-25 00:17:20,333 INFO : [Consumer] - Killing (Actor.act) thread: 0
> 2014-06-25 00:17:20,333 INFO : [Consumer] - Killing (Actor.act) thread: 3
> 2014-06-25 00:17:20,333 INFO : [Consumer] - Killing (Actor.act) thread: 2
> 2014-06-25 00:17:20,333 INFO : [Consumer] - Killing (Actor.act) thread: 1
>
>   biocache> index dr0
>  2014-06-25 00:17:36,848 INFO : [IndexRecords] - Starting to index dr0|
> until dr0|~
> 2014-06-25 00:17:36,858 INFO : [IndexRecords] - Total indexing time 0.005
> seconds
> 2014-06-25 00:17:36,858 INFO : [SolrIndexDAO] - Initialising the solr
> server http://192.168.15.132:8080/solr null null
> 2014-06-25 00:17:36,859 INFO : [SolrIndexDAO] - Initialising connection
> to SOLR server.....
> 2014-06-25 00:17:37,764 INFO : [SolrIndexDAO] - Initialising connection
> to SOLR server - done.
> 2014-06-25 00:17:38,040 INFO : [SolrIndexDAO] - >>>>>>>>>>>>> Document
> count of index: 0
> 2014-06-25 00:17:38,041 INFO : [SolrIndexDAO] - Finalise finished.
>
>  ** The delete_resource method also didn't work, so I had to empty the
> occurrence table with the truncate command.
>
>  Does anyone know how to solve this issue?
>
>  Thanks!!
>
>  Regards,
>
>  --
>  Daniel Lins da Silva
> (Mobile) 55 11 96144-4050
>  Research Center on Biodiversity and Computing (Biocomp)
> University of Sao Paulo, Brazil
>  daniellins at usp.br
> daniel.lins at gmail.com
>
>


-- 
Daniel Lins da Silva
(Cel) 11 6144-4050
daniel.lins at gmail.com