[Ala-portal] Name matching issues

David.Martin at csiro.au
Wed Aug 27 08:27:20 CEST 2014

Thanks Allan.

1) This is a bug and we'll work on a fix. The biocache-hubs shouldn't have a dependency on that web service, so we need to remove it. To support autocomplete, we'll have to look at writing something that generates an autocomplete index, either from the occurrence index or separately from the DWC-A of names used to create the name matching index. If any subscribers are interested in contributing this functionality, I'd be keen to hear from them. This isn't something the ALA will use, but it is likely to be very useful to anyone else reusing the biocache.
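
As a rough illustration of the second option (an autocomplete index built from a list of names), here is a minimal Python sketch. It is not part of the biocache or the BIE: `build_index`, `suggest` and `SAMPLE_NAMES` are hypothetical, and a real version would presumably read the scientific names from the DWC-A instead of a hard-coded list.

```python
from bisect import bisect_left

# Hypothetical sample data; a real index would be built from the names
# in the DWC-A used for the name matching index.
SAMPLE_NAMES = [
    "Podocarpus lambertii",
    "Podocnemis expansa",
    "Apis mellifera",
    "Chrysocyon brachyurus",
]


def build_index(names):
    """Return a sorted, de-duplicated list of (lowercased, original) pairs."""
    cleaned = {(n.strip().lower(), n.strip()) for n in names if n.strip()}
    return sorted(cleaned)


def suggest(index, prefix, limit=10):
    """Return up to `limit` names whose lowercased form starts with `prefix`."""
    key = prefix.strip().lower()
    # (key,) sorts before every (key..., original) pair sharing that prefix,
    # so bisect_left lands on the first possible candidate.
    start = bisect_left(index, (key,))
    results = []
    for lowered, original in index[start:]:
        if not lowered.startswith(key):
            break  # sorted order: no later entry can match
        results.append(original)
        if len(results) == limit:
            break
    return results


if __name__ == "__main__":
    index = build_index(SAMPLE_NAMES)
    print(suggest(index, "podo"))
```

A sorted list with binary search keeps the sketch dependency-free; for a full checklist, a dedicated Lucene or Solr index would likely be the more practical choice.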

2) I'll have to dig a bit deeper on this one. Fuzzy matching is supported, but it could obviously be improved.


From: ala-portal-bounces at lists.gbif.org [ala-portal-bounces at lists.gbif.org] on behalf of Allan Koch [allan.kv at gmail.com]
Sent: 25 August 2014 22:33
To: ala-portal at lists.gbif.org
Subject: [Ala-portal] Name matching issues

Hi guys,

I have two issues concerning taxon names. I will start with what is probably the easier one.

(1) We create the name matching index using a merged list from CoL and a list of Brazilian species, but these names aren't used in the autocomplete on the search page. We looked at the AJAX call and it is calling a service from BIE (http://bie.ala.org.au/search/auto.json?callback=jQuery183029955692077055573_1408751293018&q=podo&limit=100&timestamp=1408751307476&_=1408751307479). Can the autocomplete feature be changed to use the names from the Lucene name matching index?

(2) When we ingest a dataset, the process tries to match the names using some fuzzy matching techniques, is that right? For some names, however, this does not happen. For example, the name "Apis melifera" is matched with "Apis (Apis) mellifera" (an example that works well), but the name "Chrysocyon brachycrus" should match "Chrysocyon brachyurus", and it does not.
Why is fuzzy matching applied to some names and not to others?
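
To make the two examples concrete, here is a minimal Python sketch that compares the misspelled and correct species epithets using plain Levenshtein edit distance. This is only an illustration: it is an assumption-free textbook measure, not the algorithm used by the name matching index, and `edit_distance` is a hypothetical helper.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions or substitutions turning `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Epithets only, taken from the two examples in this message.
    print(edit_distance("melifera", "mellifera"))     # missing "l"
    print(edit_distance("brachycrus", "brachyurus"))  # "c" instead of "u"
```

Under this measure both misspellings are a single edit away from the correct epithet, so the different outcomes are presumably due to something other than raw edit distance in the matching process.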

Allan Koch Veiga

Núcleo de Pesquisa em Biodiversidade e Computação - BioComp
Laboratório de Automação Agrícola - LAA
Depto. de Engenharia de Computação e Sistemas Digitais - PCS
Engenharia Elétrica - Escola Politécnica da USP
Celular: +55 11 8401-2277
Email: allan.kv at usp.br

"Stay hungry, stay foolish." Stewart Brand