[Ala-portal] [ExternalEmail] Re: Using Biocache-Store and Apache Cassandra on different servers

David.Martin at csiro.au David.Martin at csiro.au
Wed Jun 25 08:18:25 CEST 2014


Thanks Daniel. Is it possible that you are running a different version of the command-line tool when you run a load compared to when you run process and index?
I can see from the cassandra-cli output that the records haven't been processed at all.
I can only think that when you run the processing and indexing, it's using configuration that points at a different server.
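
One way to test that hypothesis is to compare the Cassandra settings that each invocation actually reads. The following is a minimal Python sketch, not part of biocache-store: the function names `parse_properties` and `diff_cassandra_settings` are hypothetical, and it assumes each run reads a plain `key=value` properties file like the biocache-config.properties quoted further down. The `localhost` value in the demo is invented purely to show a mismatch.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def diff_cassandra_settings(first, second, prefix="cassandra."):
    """Return {key: (first_value, second_value)} for cassandra.* keys that differ."""
    keys = {k for k in list(first) + list(second) if k.startswith(prefix)}
    return {
        k: (first.get(k), second.get(k))
        for k in sorted(keys)
        if first.get(k) != second.get(k)
    }


if __name__ == "__main__":
    # Settings as posted in the thread (used by the load step).
    load_cfg = parse_properties(
        "cassandra.hosts=192.168.15.199\ncassandra.port=9160\n"
    )
    # Hypothetical settings picked up by a different copy of the tool.
    process_cfg = parse_properties(
        "cassandra.hosts=localhost\ncassandra.port=9160\n"
    )
    print(diff_cassandra_settings(load_cfg, process_cfg))
```

An empty result would suggest both runs point at the same Cassandra host and port, in which case the cause probably lies elsewhere.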

Dave

________________________________
From: Daniel Lins [daniel.lins at gmail.com]
Sent: 25 June 2014 16:00
To: Martin, Dave (CES, Black Mountain)
Cc: ala-portal at lists.gbif.org; Pedro Corrêa
Subject: Re: [ExternalEmail] Re: [Ala-portal] Using Biocache-Store and Apache Cassandra on different servers

Hi Dave,

I used the biocache tool (load dr0) to load these data. To view the data in Cassandra I normally use cassandra-cli or the CQL shell (cqlsh).

After loading the data on the remote server, I can see them using these tools (see below). But afterwards, when I run the processing and indexing steps, these issues occur.



poliusp at poliusp-VirtualBox:~/dev/apache-cassandra-1.2.13/bin$ sudo ./cassandra-cli -h 192.168.15.199 -k occ
Connected to: "Biocache Cluster" on 192.168.15.199/9160
Welcome to Cassandra CLI version 1.2.13

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default at occ] list occ limit 1;
Using default cell limit of 100
-------------------
RowKey: dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA120999
=> (name=acceptedNameUsageID, value=MA20, timestamp=1403674832408000)
=> (name=associatedMedia, value=/data/biocache-media/dr0/11302/4356e553-1080-4815-ba86-20f85f13394c/Chrysocyon.brachyurus.jpg, timestamp=1403674832408000)
=> (name=basisOfRecord, value=HumanObservation, timestamp=1403674832408000)
=> (name=catalogNumber, value=MA120999, timestamp=1403674832408000)
=> (name=classs, value=Mammalia, timestamp=1403666202349000)
=> (name=collectionCode, value=PARNASO, timestamp=1403674832408000)
=> (name=continent, value=América, timestamp=1403674832408000)
=> (name=coordinatePrecision, value=30, timestamp=1403674832408000)
=> (name=country, value=Brasil, timestamp=1403674832408000)
=> (name=countryCode, value=BR, timestamp=1403674832408000)
=> (name=county, value=Anastacio, timestamp=1403674832408000)
=> (name=dataResourceUid, value=dr0, timestamp=1403674832408000)
=> (name=datasetName, value=Monitoramento, timestamp=1403674832408000)
=> (name=dateIdentified, value=2014-03-16 00:00:00.0, timestamp=1403674832408000)
=> (name=day, value=16, timestamp=1403674832408000)
=> (name=decimalLatitude, value=-18.355717, timestamp=1403674832408000)
=> (name=decimalLongitude, value=-55.55105, timestamp=1403674832408000)
=> (name=defaultValuesUsed, value=false, timestamp=1403674832408000)
=> (name=eventDate, value=2014-03-16 00:00:00.0, timestamp=1403674832408000)
=> (name=eventID, value=100, timestamp=1403674832408000)
=> (name=eventRemarks, value=Chuvoso, timestamp=1403674832408000)
=> (name=family, value=Canidae, timestamp=1403674832408000)
=> (name=firstLoaded, value=2014-06-25T00:15:40Z, timestamp=1403666202349000)
=> (name=genus, value=Chrysocyon, timestamp=1403674832408000)
=> (name=geodeticDatum, value=EPSG:4326, timestamp=1403674832408000)
=> (name=georeferenceProtocol, value="Guide to Best Practices for Georeferencing" (Chapman and Wieczorek, eds. 2006), timestamp=1403674832408000)
=> (name=georeferencedBy, value=D. Silva, timestamp=1403674832408000)
=> (name=identifiedBy, value=James L. Patton, timestamp=1403674832408000)
=> (name=institutionCode, value=ICMBIO, timestamp=1403674832408000)
=> (name=kingdom, value=Animalia, timestamp=1403674832408000)
=> (name=language, value=pt-BR, timestamp=1403674832408000)
=> (name=lastModifiedTime, value=2014-06-25T02:40:29Z, timestamp=1403674832408000)
=> (name=locality, value=Fazenda Alegre Paraguai River, timestamp=1403674832408000)
=> (name=locationDetermined, value=false, timestamp=1403674832408000)
=> (name=locationRemarks, value=Cerrado, timestamp=1403674832408000)
=> (name=maximumElevationInMeters, value=0, timestamp=1403674832408000)
=> (name=minimumElevationInMeters, value=0, timestamp=1403674832408000)
=> (name=miscProperties, value={"class":"Mammalia","type":"Event"}, timestamp=1403674832408000)
=> (name=modified, value=2013-03-16 18:19:39.0, timestamp=1403674832408000)
=> (name=month, value=3, timestamp=1403674832408000)
=> (name=municipality, value=Anastacio, timestamp=1403674832408000)
=> (name=occurrenceID, value=urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA120999, timestamp=1403674832408000)
=> (name=order, value=Carnivora, timestamp=1403674832408000)
=> (name=phylum, value=Chordata, timestamp=1403674832408000)
=> (name=recordNumber, value=444, timestamp=1403674832408000)
=> (name=recordedBy, value=S. Siemel, timestamp=1403674832408000)
=> (name=rights, value=Os termos e condições para uso estão disponíveis no documento localizado em http://www.icmbio.gov.br/direitos/normas.pdf., timestamp=1403674832408000)
=> (name=rowKey, value=dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA120999, timestamp=1403674832408000)
=> (name=samplingEffort, value=105 homem-hora, timestamp=1403674832408000)
=> (name=samplingProtocol, value=Protocolo de Monitoramento, timestamp=1403674832408000)
=> (name=scientificName, value=Chrysocyon brachyurus, timestamp=1403674832408000)
=> (name=scientificNameAuthorship, value=(Illiger 1815), timestamp=1403674832408000)
=> (name=stateProvince, value=Mato Grosso do Sul, timestamp=1403674832408000)
=> (name=taxonRank, value=species, timestamp=1403674832408000)
=> (name=uuid, value=4356e553-1080-4815-ba86-20f85f13394c, timestamp=1403674832408000)
=> (name=verbatimLatitude, value=-18.355717, timestamp=1403674832408000)
=> (name=verbatimLongitude, value=-55.55105, timestamp=1403674832408000)
=> (name=year, value=2014, timestamp=1403674832408000)




poliusp at poliusp-VirtualBox:~/dev/biocache$ cqlsh 192.168.15.199
Connected to Biocache Cluster at 192.168.15.199:9160.
[cqlsh 3.1.8 | Cassandra 2.0.8 | CQL spec 3.0.0 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh> use occ;
cqlsh:occ> select * from occ;

 key                                                           | portalId | uuid
---------------------------------------------------------------+----------+--------------------------------------
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA120999 |     null | 4356e553-1080-4815-ba86-20f85f13394c
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA34261 |     null | ab61409f-5e42-41a6-ac81-8bf9c0b93aac
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63770 |     null | cffb70cb-f583-4e10-8ea9-600fdce36706
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA133941 |     null | 9be57c4f-e2d0-4d67-a418-cb140c113029
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63773 |     null | 9c3c0c56-9279-4e3d-8b01-a4bc9068093f
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA27101 |     null | 4e1b082d-3610-45b2-b49f-f9c72c050866
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA36962 |     null | 515b1fd5-7611-4724-a371-c2843f4b0900
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63846 |     null | a48edaee-f7ba-41bf-bef1-bef867a67faa
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63771 |     null | 352fe23c-7f29-43eb-bb82-8b5f195150a1
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA28313 |     null | 6a1d090d-dadb-4f24-8bfe-a6c24846d8c3
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA146295 |     null | 5c2500cc-6935-4caf-ad05-0fea60fc2f3c
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA42147 |     null | a9f14456-01cb-44b8-91bc-605aecd26d50
   dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA2523 |     null | c5a064fc-31b5-4952-a513-9062ad347db0
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA123393 |     null | 75a50714-1c30-4e56-a2bb-006593b6277a
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA36937 |     null | 69db738e-a98e-4f25-95e1-ba9dd52aa006
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63769 |     null | 4a8e57f0-d288-48f5-9b07-d56fe190e4e1
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA44534 |     null | 173dbf31-09e6-4672-a210-dc63dbe33f47
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA21669 |     null | b75bc5d8-0899-4baf-883a-bb0b97416a74
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA71180 |     null | 676e7b64-2303-4707-b726-63112fde031a
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA28312 |     null | 93106978-65fd-4eb6-aa5d-ca1d2e5c98f4
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA28311 |     null | 92c03fe9-20cf-4ff0-9749-85e5730e8893
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA23771 |     null | a7c18f6a-5575-48e5-b5fc-8cf3b28a8848
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA71179 |     null | 4ea9ad71-16a3-4f17-800c-ba40492c978b
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63847 |     null | 2c94f101-2fec-4aba-93a0-05d50f529300
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA464300 |     null | 0ac772ed-8e02-444d-8433-a80aa9771f25
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA63772 |     null | 2ce8275c-7d36-4e9d-a15f-4e3daf6bfde9
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA16756 |     null | 253b2511-6a09-493a-9d05-2f2573f9fbd1
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA133940 |     null | 368a1412-7350-46b5-b389-07125dd28a33
  dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA41315 |     null | ead8efa0-2216-427b-bfaa-595a49603a09
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA198039 |     null | e7f101cd-243d-479a-98b1-043adae19cb7
 dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA133942 |     null | 704cf8c0-9c10-4e1b-93d1-8a5a52e62126
   dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA2524 |     null | 1c287699-75dd-4ac6-80cc-314151277366



2014-06-25 2:32 GMT-03:00 <David.Martin at csiro.au>:
I thought I'd just post the following shell transcript to the list in case it's of use to others.
The cassandra-cli tool is part of the cassandra distribution.
To view the data in cassandra using the cassandra-cli tool:

mar759 at ala-ono:~$ cassandra-cli -h localhost -p 9160 -k occ
Column Family assumptions read from /home/mar759/.cassandra/assumptions.json
Connected to: "Biocache" on localhost/9160
Welcome to Cassandra CLI version 1.2.10

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default at occ] list occ limit 1;
Using default cell limit of 100
-------------------
RowKey: dr107|43704421
=> (name=attr.qa, value=[], timestamp=1397043413407000)
=> (name=basisOfRecord, value=UNKNOWN, timestamp=1397043412777000)
=> (name=bor.qa, value=[20002], timestamp=1397043413407000)
=> (name=catalogNumber, value=1, timestamp=1397043412777000)
=> (name=class.qa, value=[10008,10005], timestamp=1397043413407
.....


________________________________
From: ala-portal-bounces at lists.gbif.org [ala-portal-bounces at lists.gbif.org] on behalf of David.Martin at csiro.au [David.Martin at csiro.au]
Sent: 25 June 2014 15:24
To: daniel.lins at gmail.com; ala-portal at lists.gbif.org; pedro.correa at usp.br
Subject: [ExternalEmail] Re: [Ala-portal] Using Biocache-Store and Apache Cassandra on different servers

Thanks Daniel.

How did you load the data ?
Also can you see the records using the cassandra-cli tool ?

Dave


________________________________
From: Daniel Lins [daniel.lins at gmail.com]
Sent: 25 June 2014 15:21
To: Martin, Dave (CES, Black Mountain); ala-portal at lists.gbif.org; Pedro Corrêa
Subject: Using Biocache-Store and Apache Cassandra on different servers

Hi,

We installed the ALA applications on a different server from the database server. However, we are having problems running biocache-store.

I changed the biocache configuration file (/data/biocache/config/biocache-config.properties) and the Cassandra configuration (cassandra.yaml) to enable remote access.

(biocache-config.properties)
# Cassandra Config
db=cassandra
cassandra.hosts=192.168.15.199
cassandra.port=9160
cassandra.pool=biocache-store-pool
cassandra.keyspace=occ
cassandra.max.connections=-1
cassandra.max.retries=6
thrift.operation.timeout=8000

(cassandra.yaml)
listen_address: 192.168.15.199
rpc_address: 192.168.15.199
rpc_port: 9160
...
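
The two fragments above have to agree on the Thrift endpoint: `cassandra.hosts` and `cassandra.port` on the client side should match `rpc_address` and `rpc_port` on the Cassandra side. A minimal Python sketch of that check follows. The helper names (`read_yaml_scalars`, `thrift_endpoint`, `can_connect`) are hypothetical, and the YAML reader is a deliberate simplification that only understands top-level `key: value` lines; it is not a real YAML parser.

```python
import socket


def read_yaml_scalars(text):
    """Collect top-level 'key: value' pairs; indented, list and comment lines are skipped."""
    values = {}
    for line in text.splitlines():
        if not line or line[0] in " \t#-" or ":" not in line:
            continue
        key, _, value = line.partition(":")
        values[key.strip()] = value.strip()
    return values


def thrift_endpoint(yaml_values):
    """Return (host, port) taken from rpc_address and rpc_port."""
    return yaml_values["rpc_address"], int(yaml_values["rpc_port"])


def can_connect(host, port, timeout=3.0):
    """True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    yaml_text = (
        "listen_address: 192.168.15.199\n"
        "rpc_address: 192.168.15.199\n"
        "rpc_port: 9160\n"
    )
    host, port = thrift_endpoint(read_yaml_scalars(yaml_text))
    # Values copied from the biocache-config.properties fragment above.
    print("endpoints match:", (host, port) == ("192.168.15.199", 9160))
    print("reachable:", can_connect(host, port))
```

Running this from the biocache-store machine (192.168.15.132 here) only confirms that the port is open from that host; it says nothing about which configuration file the tool itself loads.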

I am running biocache-store on the server 192.168.15.132.

When I ran the load step, the data were saved correctly in the Cassandra database. However, the processing step did not retrieve any data (see the output below), and the indexing step did not index anything.

biocache> process dr0
Processing dr0 incremental=false
2014-06-25 00:17:20,228 INFO : [Consumer] - Initialising thread: 0
2014-06-25 00:17:20,270 INFO : [Consumer] - Initialising thread: 1
2014-06-25 00:17:20,271 INFO : [Consumer] - Initialising thread: 2
2014-06-25 00:17:20,274 INFO : [Consumer] - Initialising thread: 3
2014-06-25 00:17:20,275 INFO : [ProcessWithActors] - Starting with dr0| endingwith dr0|~
2014-06-25 00:17:20,275 INFO : [ProcessWithActors] - Initialised actors...
2014-06-25 00:17:20,274 INFO : [Consumer] - In thread: 0
2014-06-25 00:17:20,274 INFO : [Consumer] - In thread: 1
2014-06-25 00:17:20,281 INFO : [Consumer] - In thread: 2
2014-06-25 00:17:20,293 INFO : [Consumer] - In thread: 3
2014-06-25 00:17:20,326 INFO : [ProcessWithActors] - Last row key processed:
2014-06-25 00:17:20,327 INFO : [ProcessWithActors] - Finished.
2014-06-25 00:17:20,333 INFO : [Consumer] - Killing (Actor.act) thread: 0
2014-06-25 00:17:20,333 INFO : [Consumer] - Killing (Actor.act) thread: 3
2014-06-25 00:17:20,333 INFO : [Consumer] - Killing (Actor.act) thread: 2
2014-06-25 00:17:20,333 INFO : [Consumer] - Killing (Actor.act) thread: 1

biocache> index dr0
2014-06-25 00:17:36,848 INFO : [IndexRecords] - Starting to index dr0| until dr0|~
2014-06-25 00:17:36,858 INFO : [IndexRecords] - Total indexing time 0.005 seconds
2014-06-25 00:17:36,858 INFO : [SolrIndexDAO] - Initialising the solr server http://192.168.15.132:8080/solr null null
2014-06-25 00:17:36,859 INFO : [SolrIndexDAO] - Initialising connection to SOLR server.....
2014-06-25 00:17:37,764 INFO : [SolrIndexDAO] - Initialising connection to SOLR server - done.
2014-06-25 00:17:38,040 INFO : [SolrIndexDAO] - >>>>>>>>>>>>> Document count of index: 0
2014-06-25 00:17:38,041 INFO : [SolrIndexDAO] - Finalise finished.
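
Both logs report a row-key range running from `dr0|` to `dr0|~`. The Python sketch below illustrates what that range would select if keys are compared bytewise, which is an assumption: the thread does not show how the cluster orders its row keys. The function names `key_range` and `in_range` are hypothetical and are not taken from biocache-store.

```python
def key_range(data_resource_uid):
    """Build the start and end keys in the form seen in the logs, e.g. 'dr0|' and 'dr0|~'."""
    start = data_resource_uid + "|"
    return start, start + "~"


def in_range(row_key, start, end):
    """Bytewise comparison of UTF-8 encoded keys, inclusive at both ends."""
    key = row_key.encode("utf-8")
    return start.encode("utf-8") <= key <= end.encode("utf-8")


if __name__ == "__main__":
    start, end = key_range("dr0")
    # A key loaded by Daniel and a key from a different data resource.
    for key in (
        "dr0|urn:lsid:icmbio.gov.br:icmbio.parnaso.occurrence:MA120999",
        "dr107|43704421",
    ):
        print(key, in_range(key, start, end))
```

Under that assumption the loaded `dr0|urn:lsid:...` keys fall inside the range, so a scan that returns nothing points either at a different store being read or at keys not being returned in this order.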

** The delete_resource method also didn't work, so I need to empty the occurrence table using the truncate command.

Does anyone know how to solve this issue?

Thanks!!

Regards,

--
Daniel Lins da Silva
(Mobile) 55 11 96144-4050
Research Center on Biodiversity and Computing (Biocomp)
University of Sao Paulo, Brazil
daniellins at usp.br
daniel.lins at gmail.com




--
Daniel Lins da Silva
(Cel) 11 6144-4050
daniel.lins at gmail.com