[Ala-portal] Problem generating a new name index

David.Martin at csiro.au David.Martin at csiro.au
Fri Sep 5 01:56:26 CEST 2014


Thanks guys.

There are details on how to build in the README.md, but the steps to build a full executable are in the ansible scripts (see the nameindexer role [1]).

Dave

[1] https://github.com/AtlasOfLivingAustralia/ala-install/tree/master/ansible/roles/nameindexer

________________________________
From: ala-portal-bounces at lists.gbif.org [ala-portal-bounces at lists.gbif.org] on behalf of Tim Robertson [trobertson at gbif.org]
Sent: 04 September 2014 23:25
To: Santiago Martinez de la Riva
Cc: ala-portal at lists.gbif.org
Subject: Re: [Ala-portal] Problem generating a new name index

Hi Santiago

This is likely to need ALA folks, but since they are asleep, this might give you some ideas to explore before they come online.

I’ve logged the issue with a proposed fix:
  https://github.com/AtlasOfLivingAustralia/ala-name-matching/issues/4

What it actually fails on, though, is NULL names in the input.  Perhaps you can modify your input checklist so that it never contains NULL names?
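Before regenerating the checklist, it may help to see which rows actually lack a name. The following is only a rough sketch: it assumes the core taxon file is tab-delimited with a header row and a column called scientificName, and the helper name rows_missing_name is made up for illustration. Whether the indexer treats an empty string the same way as a NULL is also an assumption here.

```python
import csv
import io


def rows_missing_name(lines, name_column="scientificName", delimiter="\t"):
    """Return 1-based data-row numbers whose name column is empty or absent.

    `lines` is any iterable of text lines (an open file works).
    The column name and delimiter are assumptions about the checklist file.
    """
    reader = csv.DictReader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    missing = []
    for row_number, row in enumerate(reader, start=1):
        value = row.get(name_column)
        # Short rows yield None for absent columns; blank cells yield "".
        if value is None or not value.strip():
            missing.append(row_number)
    return missing


if __name__ == "__main__":
    # Made-up sample data, not taken from the real archive.
    sample = io.StringIO(
        "taxonID\tscientificName\tgenus\n"
        "1\tNepeta cataria\tNepeta\n"
        "2\t\tNepeta\n"
        "3\n"
    )
    print(rows_missing_name(sample))
```

To run it against a real file, pass an open file handle instead of the sample, for example open(path, encoding="utf-8", newline="").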

You might for example use this kind of SQL or similar for whatever you are using to generate the names list:

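SELECT
  -- "order" is a reserved word (and "class" is in some databases), so quote them;
  -- MySQL uses backticks instead of double quotes
  kingdom, phylum, "class", "order", family, genus,
  -- fall back to the lowest available rank when name is NULL
  COALESCE(name, genus, family, "order", "class", phylum, kingdom) AS scientificName
FROM ...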

The COALESCE function will then set the name to the first non-NULL value in that list.
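To see the fallback in action without touching a real database, here is a small sketch using Python's built-in sqlite3 module. The table name taxa, the helper fill_missing_names and the rows passed to it are all hypothetical; the actual source of the checklist will differ.

```python
import sqlite3


def fill_missing_names(rows):
    """Load rows into an in-memory table and return the COALESCE'd names.

    Each row is (kingdom, phylum, class, order, family, genus, name).
    The table layout is an assumption made for this illustration.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'CREATE TABLE taxa (kingdom TEXT, phylum TEXT, "class" TEXT, '
            '"order" TEXT, family TEXT, genus TEXT, name TEXT)'
        )
        conn.executemany("INSERT INTO taxa VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
        cursor = conn.execute(
            'SELECT COALESCE(name, genus, family, "order", "class", phylum, kingdom) '
            "AS scientificName FROM taxa ORDER BY rowid"
        )
        # Each result row has a single column: the first non-NULL value.
        return [record[0] for record in cursor]
    finally:
        conn.close()
```

A row with only the kingdom filled in comes back with the kingdom as its name, a row with no name and no genus comes back with the family, and so on up the list. A row where every column is NULL would still yield NULL, so such rows need to be dropped separately.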

I tried to build the project with the fix myself, but “mvn assembly:single” did not produce a fat jar for me, and the project README doesn’t say how they did it… sorry.

I hope this helps,
Tim



On 04 Sep 2014, at 14:50, Santiago Martinez de la Riva <sama at gbif.es> wrote:

Hi all,


I'm trying to create our own name index. I'm following the steps of the wiki in GitHub: https://github.com/AtlasOfLivingAustralia/documentation/wiki/Creating-a-name-index

Our DwCA has the same structure as dwca-col-mammals, but the problem is that when I try to generate the name index with the command: sudo nameindexer -dwca /...

I get the following exception:

vagrant at ala:/data/lucene/sources/dwca-spe2000-plantae$ sudo nameindexer -dwca /data/lucene/sources/dwca-spe2000-plantae
2014-09-04 12:04:26,093 INFO : [DwcaNameIndexer] - Generating loading index: true
2014-09-04 12:04:26,094 INFO : [DwcaNameIndexer] - Generating searching index: true
2014-09-04 12:04:26,094 INFO : [DwcaNameIndexer] - Using the  DwCA name file: /data/lucene/sources/dwca-spe2000-plantae
2014-09-04 12:04:26,094 INFO : [DwcaNameIndexer] - Using the default IRMNG name file: /data/lucene/sources/IRMNG_DWC_HOMONYMS
2014-09-04 12:04:26,095 INFO : [DwcaNameIndexer] - Using the default common name file: /data/lucene/sources/col_vernacular.txt
2014-09-04 12:04:26,182 INFO : [DwcaNameIndexer] - Starting to create the temporary loading index.
2014-09-04 12:08:10,283 INFO : [DwcaNameIndexer] - Finished creating the temporary load index with 1070805 concepts
java.lang.NullPointerException
       at au.org.ala.names.search.ALANameIndexer.isBlacklisted(ALANameIndexer.java:778)
       at au.org.ala.names.search.ALANameIndexer.createALAIndexDocument(ALANameIndexer.java:788)
       at au.org.ala.names.search.ALANameIndexer.createALAIndexDocument(ALANameIndexer.java:757)
       at au.org.ala.names.search.DwcaNameIndexer.addIndex(DwcaNameIndexer.java:350)
       at au.org.ala.names.search.DwcaNameIndexer.generateIndex(DwcaNameIndexer.java:281)
       at au.org.ala.names.search.DwcaNameIndexer.create(DwcaNameIndexer.java:101)
       at au.org.ala.names.search.DwcaNameIndexer.main(DwcaNameIndexer.java:527)

And when I try to search for a name, I get this other exception:

vagrant at ala:/data/lucene$ sudo nameindexer -testSearch "Nepeta Catarea"
Search for name
org.apache.lucene.index.IndexNotFoundException: no segments* file found in org.apache.lucene.store.NIOFSDirectory@/data/lucene/namematching/cb lockFactory=org.apache.lucene.store.NativeFSLockFactory at c22530: files: [write.lock]
       at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:741)
       at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
       at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:65)
       at au.org.ala.names.search.ALANameSearcher.<init>(ALANameSearcher.java:122)
       at au.org.ala.names.search.DwcaNameIndexer.main(DwcaNameIndexer.java:465)


This is because the nameindexer didn't generate the necessary files.
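A quick way to confirm this is to check whether the index directory holds any segments file at all. The sketch below is only an assumption-laden helper (the function name has_lucene_segments is invented here); it relies solely on what the exception message says, namely that Lucene looks for files whose names start with "segments".

```python
import os


def has_lucene_segments(index_dir):
    """Return True if index_dir contains a file whose name starts with 'segments'.

    A directory holding only write.lock, as in the error above, returns False.
    A missing directory also returns False.
    """
    try:
        names = os.listdir(index_dir)
    except FileNotFoundError:
        return False
    return any(name.startswith("segments") for name in names)


if __name__ == "__main__":
    # Path copied from the exception message above.
    print(has_lucene_segments("/data/lucene/namematching/cb"))
```

If this reports False after an indexing run, the index was never written and the search error is just a consequence of the earlier NullPointerException.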

Help meee!! xD

Cheers,
SaMa


---------------------------------------------------------------------------------------
Santiago Martínez de la Riva
GBIF.ES, Unidad de Coordinación         Tel. +34 91 4203017 x 273
Real Jardín Botánico - CSIC             Fax +34 91 429 2405
Plaza de Murillo, 2                     sama at gbif.es
28014 Madrid, Spain                     www.gbif.es
_______________________________________________
Ala-portal mailing list
Ala-portal at lists.gbif.org
http://lists.gbif.org/mailman/listinfo/ala-portal

