Hello all,

Thanks again for your help. 
The 10-year anniversary of GBIF France went well (I guess); nobody told us that it was bad, so ...

I sent a lot of questions over the last few weeks, and I would like to share the solutions that I found for my issues. It could help other people :-).


1. About the wrong indexing: I found two bugs in the checklist used for name indexing:
- some species don't have the complete classification (e.g. http://www.gbif.org/species/4814179)
- some of them throw a NullPointerException (see the error below) when I run the -testSearch command directly on the server.

$ sudo nameindexer -testSearch "Canis familiaris Linnaeus, 1758"
org.apache.lucene.index.IndexFormatTooNewException: Format version is not supported (resource: MMapIndexInput(path="/data/lucene/namematching/cb/segments.gen")): -3 (needs to be between -2 and -2)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:722)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:65)
at au.org.ala.names.search.ALANameSearcher.<init>(ALANameSearcher.java:117)
at au.org.ala.names.search.DwcaNameIndexer.main(DwcaNameIndexer.java:488)

2. I still have issues with punctuation or spaces in provider codes. My future work will focus on this.

3. I have successfully uploaded my dataset of more than 20 million occurrences by following these steps:
  a.   I uploaded a DwC-Archive with 15 occurrences in order to create the dataset in the system. I needed to do this because the Zip file library used in biocache store can't open a file bigger than 1 GB.
  b.  I copied the real DwC-Archive over the fake one in the /collectory/upload/ folder.
  c.  I asked our system administrator to increase the RAM of our virtual machine (from 4 GB to 80 GB).
  d.  I made some corrections to the collectory-plugin (see the email that I sent on June 1st), and the load, process and index steps worked well after this. It took ages but it worked.
  e.  Our data is now visible in our portal (http://metadonnee.gbif.fr/public/showDataResource/dr179).
I'm not sure it's the right way to do it, but it works!
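
For anyone who wants to reproduce step a, here is a minimal Python sketch of one way to build the small placeholder archive from the real one. It is not the exact procedure used above. It assumes that the core data file inside the archive is named occurrence.txt, that it has a single header line, and that each record sits on one line. The function name make_placeholder_archive and its parameters are made up for this example.

```python
import zipfile


def make_placeholder_archive(src_zip, dst_zip, core_name="occurrence.txt", n_rows=15):
    """Copy a Darwin Core Archive, keeping only the header and the
    first n_rows records of the core file (assumed name: core_name)."""
    with zipfile.ZipFile(src_zip) as src, \
         zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename == core_name:
                # Read the core file line by line so that the full
                # data set is never loaded into memory.
                lines = []
                with src.open(info) as raw:
                    for i, line in enumerate(raw):
                        if i > n_rows:  # line 0 is the header
                            break
                        lines.append(line)
                dst.writestr(core_name, b"".join(lines))
            else:
                # meta.xml, eml.xml and similar files are small,
                # so they are copied unchanged.
                dst.writestr(info.filename, src.read(info))
```

Calling make_placeholder_archive("full-archive.zip", "placeholder.zip") (both file names are placeholders) should give a small archive with the same meta.xml and the same columns as the real one, so the dataset created from it should match the real archive when that is copied over it in step b.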


I removed all the tools that use environmental layers, but I would be really interested in a training on them so that I can install them :-)!


For my error with the institution UID being shown instead of the name, I just changed
"caches.collections.enabled" to true in the biocache configuration file, and it works perfectly.

Thanks again!
