[Ala-portal] Solutions for some issues

Marie Elise Lecoq melecoq at gbif.fr
Tue Jun 21 17:41:47 CEST 2016


Hello all,

Thanks again for your help.
The 10-year anniversary of GBIF France went well (I guess); nobody told us
that it was bad, so ...

I sent a lot of questions over the last few weeks, and I would like to share
the solutions that I found for my issues. They could help other people :-).

INDEXING :

1. About the wrong indexing, I found two bugs in the checklist used for
name indexing:
- some species don't have the entire classification (e.g.
http://www.gbif.org/species/4814179)
- some of them throw an exception (see the error below) when I run the
-testSearch command directly on the server.

$ sudo nameindexer -testSearch "Canis familiaris Linnaeus, 1758"
org.apache.lucene.index.IndexFormatTooNewException: Format version is not supported (resource: MMapIndexInput(path="/data/lucene/namematching/cb/segments.gen")): -3 (needs to be between -2 and -2)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:722)
    at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
    at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:65)
    at au.org.ala.names.search.ALANameSearcher.<init>(ALANameSearcher.java:117)
    at au.org.ala.names.search.DwcaNameIndexer.main(DwcaNameIndexer.java:488)
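
For the first bug (species without the entire classification), a quick check
can be sketched in Python. This is only a minimal sketch: it assumes each
taxon is available as a dict whose keys are the rank names listed below (an
assumption, adapt them to the columns of your checklist), and the function
name is hypothetical.

```python
# Rank names are an assumption; rename them to match your checklist columns.
LINNEAN_RANKS = ("kingdom", "phylum", "class", "order", "family", "genus")


def missing_ranks(record, ranks=LINNEAN_RANKS):
    """Return the ranks that are absent or empty in a taxon record (a dict)."""
    return [rank for rank in ranks if not str(record.get(rank) or "").strip()]


# Example with a made-up record, not real checklist data.
example = {"kingdom": "Animalia", "phylum": "Chordata", "genus": " "}
print(missing_ranks(example))
```

Running it over every row of the checklist before building the name index
should list the taxa that need a fix.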


2. I still have issues with punctuation or spaces in provider codes. My
future work will focus on this.

3. I have successfully uploaded my dataset with more than 20 million
occurrences by following these steps:
  a.  I uploaded a DwC-Archive with 15 occurrences in order to create the
dataset in the system. I needed to do this because the zip file library
used in biocache-store can't open a file bigger than 1 GB.
  b.  I copied the real DwC-Archive over the fake one in the
/collectory/upload/ folder.
  c.  I asked our system administrator to increase the RAM of our virtual
machine (from 4 GB to 80 GB).
  d.  I made some corrections to the collectory-plugin (see the email that
I sent on June 1st), and the load, process and index steps worked well
after this. It took ages but it worked.
  e.  Our data is now visible in our portal (
http://metadonnee.gbif.fr/public/showDataResource/dr179).
I'm not sure it's the right way to do it, but it works!
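
Step a (building the small fake archive) can be sketched in Python as below.
This is a minimal sketch under assumptions, not what I actually ran: it
assumes the core file is called occurrence.txt, has one header line and one
record per line, and the function name is hypothetical.

```python
import zipfile


def make_placeholder_archive(src_path, dst_path,
                             core_name="occurrence.txt", n_rows=15):
    """Copy a DwC-Archive, keeping only the header and the first n_rows
    records of the core file. Returns the number of records kept.

    Assumption: the core file has one header line and one record per line.
    """
    written = 0
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename != core_name:
                # Other files (meta.xml, eml.xml, ...) are copied unchanged.
                dst.writestr(info.filename, src.read(info.filename))
                continue
            kept = []
            with src.open(info) as core:
                for line_no, line in enumerate(core):
                    if line_no > n_rows:  # line 0 is the header
                        break
                    kept.append(line)
            written = max(len(kept) - 1, 0)
            dst.writestr(core_name, b"".join(kept))
    return written
```

The core file is read line by line, so the big archive is never loaded
entirely into memory.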


SPATIAL :

I removed all the tools that use environmental layers, but I would be really
interested in training on them in order to install them :-)!


DATA :

For my error with the institution UID appearing instead of the name, I just
set "caches.collections.enabled" to true in the biocache configuration file,
and it works perfectly.
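
If you want to script this change, here is a minimal Python sketch. It
assumes the configuration is a plain key=value properties text (check your
own file), and the helper name is hypothetical.

```python
def set_property(text, key, value):
    """Return properties text with key set to value, appending it if absent.

    Assumption: one key=value pair per line, comments start with '#'.
    """
    lines = text.splitlines()
    new_line = f"{key}={value}"
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # skip comments and lines that are not key=value
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        # Key not found anywhere: add it at the end.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Example on an in-memory text, not on a real configuration file.
print(set_property("caches.collections.enabled=false\n",
                   "caches.collections.enabled", "true"))
```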

Thanks again!
cheers,
Marie

